Search Results: speech-synthesis

Found 12 Skills

dialogue-audio

Multi-speaker dialogue audio creation with Dia TTS. Covers speaker tags, emotion control, pacing, conversation flow, and post-production. Use for: podcasts, audiobooks, explainers, character dialogue, conversational content. Triggers: dialogue audio, multi speaker, conversation audio, dia tts, two speakers, podcast audio, character voices, voice acting, dialogue generation, conversation tts, multi voice, speaker tags, dialogue recording

719

AI & Machine Learning | cnemri/google-genai-skill...

speech-build

Generate and transcribe speech using Google's Gemini-TTS and Chirp 3 models. Supports Text-to-Speech (Single/Multi-speaker), Instant Custom Voice, and Speech-to-Text (Transcription/Diarization).

AI & Machine Learning | cinience/alicloud-skills

alicloud-ai-audio-tts-voice-design

Voice design workflows with Alibaba Cloud Model Studio Qwen TTS VD models. Use when creating custom synthetic voices from text descriptions and using them for speech synthesis.

1 script/Checked

AI & Machine Learning | marswaveai/skills

tts

Text-to-speech and voice narration. Triggers on: "朗读这段" (read this passage aloud), "配音" (dubbing), "TTS", "语音合成" (speech synthesis), "text to speech", "read this aloud", "convert to speech", "voice narration", "read aloud".

AI & Machine Learning | cinience/alicloud-skills

aliyun-qwen-tts-realtime

Use when real-time speech synthesis is needed with Alibaba Cloud Model Studio Qwen TTS Realtime models. Use when low-latency interactive speech is required, including instruction-controlled realtime synthesis.

1 script/Checked

AI & Machine Learning | cinience/alicloud-skills

alicloud-ai-audio-tts-realtime

Real-time speech synthesis with Alibaba Cloud Model Studio Qwen TTS Realtime models. Use when low-latency interactive speech is required, including instruction-controlled realtime synthesis.

1 script/Checked

AI & Machine Learning | qwencloud/qwencloud-ai

qwencloud-audio-tts

[QwenCloud] Synthesize speech from text with Qwen TTS models. TRIGGER when: user wants to convert text to speech, create voiceovers, generate audio narration, read text aloud, build TTS applications, mentions speech synthesis/voice generation/audio output from text, or explicitly invokes this skill by name (e.g. use qwencloud-audio-tts). DO NOT TRIGGER when: user wants speech recognition/ASR, text generation without audio, non-Qwen audio tasks.

4 scripts/Checked

AI & Machine Learning | terrylica/cc-skills

diagnose

Diagnose Kokoro TTS issues. TRIGGERS - kokoro not working, tts diagnose, kokoro error, tts troubleshoot.

AI & Machine Learning | chanjing-ai/chan-skills

chanjing-tts-voice-clone

Use the Chanjing TTS API to synthesize speech from text with a user-provided voice.

AI & Machine Learning | chanjing-ai/chan-skills

chanjing-tts

Use the Chanjing TTS API to convert text to speech.

AI & Machine Learning | aradotso/trending-skills

moss-tts-nano-speech

Expert skill for using MOSS-TTS-Nano, a 0.1B parameter multilingual real-time TTS model that runs on CPU with voice cloning support.

AI & Machine Learning | aliyun/alibabacloud-aiops...

alibabacloud-avatar-video

Use Alibaba Cloud DashScope API and LingMou to generate AI video and speech. Seven capabilities — (1) LivePortrait talking-head (image + audio → video, two-step), (2) EMO talking-head, (3) AA/AnimateAnyone full-body animation (three-step), (4) T2I text-to-image (Wan 2.x, default wan2.2-t2i-flash), (5) I2V image-to-video (Wan 2.x, default wan2.7-i2v-flash, supports T2I→I2V pipeline), (6) Qwen TTS (auto model/voice by scene, default qwen3-tts-vd-realtime-2026-01-15), (7) LingMou digital-human template video with random template, public-template copy, and script confirmation. Trigger when the user needs talking-head, portrait, full-body animation, text-to-image, text-to-video, or speech synthesis.

9 scripts/Attention