Loading...
Loading...
Found 67 Skills
Universal AI voice / text-to-speech skill supporting OpenAI TTS (gpt-4o-mini-tts, tts-1), ElevenLabs multilingual TTS with voice cloning, Bailian Qwen TTS (qwen-tts / qwen3-tts-vd with voice-design custom voices, long-text chunking built in), MiniMax speech-02-hd, SiliconFlow CosyVoice / SenseVoice, and PlayHT 2.0. Use this skill whenever the user asks to read text aloud, synthesize speech, generate narration, create voice-over, dub a script, or turn any text into audio (mp3 / wav / ogg / flac). Typical phrases include "read this aloud", "generate voice for ...", "create a narration of ...", "tts this", "把这段念出来", "做个配音", "合成语音", or mentions of voices / TTS model names like Alloy, Ash, Cherry, Rachel, CosyVoice, PlayHT. Always use this skill even if the user does not specify a provider — pick one from EXTEND.md defaults or available env keys.
Automatically announces plans, issues, and summaries out loud using TTS. Use this skill PROACTIVELY after completing major tasks like finalizing a plan, resolving an issue, or generating a summary. Each project gets a unique voice so users can identify which project is speaking from another room. Providers fallback in order (google, openai, elevenlabs, say) on rate limits.
Audio generation skill — jingles, beds, voiceover, and sound effects. Routes music requests to Suno V5 / Udio / Lyria, speech to MiniMax TTS / FishAudio / ElevenLabs V3, and SFX to ElevenLabs SFX or AudioCraft. Output is one MP3/WAV file saved to the project folder.
AI voice generation, text-to-speech, and voice synthesis via inference.sh CLI. Models: Inworld TTS-2 (100+ languages, emotion/non-verbal steering), Inworld TTS 1.5 (ultra-low latency), ElevenLabs (22+ premium voices, 32 languages), Kokoro TTS, DIA, Chatterbox, Higgs, VibeVoice for natural speech. Capabilities: multiple voices, emotions, accents, long-form narration, conversation, voice transformation, delivery mode control, character voices. Use for: voiceovers, audiobooks, podcasts, video narration, accessibility, gaming NPCs, avatar audio, UGC. Triggers: voice cloning, tts, text to speech, ai voice, voice generation, voice synthesis, voice over, narration, speech synthesis, ai narrator, elevenlabs, eleven labs, natural voice, realistic speech, voice ai, voice changer, inworld, inworld tts, character voice, npc voice
End-to-end guided workflow for creating a product demo or showcase video. Extracts brand identity from the repo, generates a polished soundtrack, and produces a feature-focused motion.dev composition rendered via Helios CLI. Use when making product demos, feature showcases, or UI walkthroughs.
End-to-end guided workflow for creating a product launch or release announcement video. Extracts brand identity from the repo, generates a cinematic soundtrack, and produces a reveal-driven motion.dev composition rendered via Helios CLI. Use when making launch announcements, release videos, or product reveals.
Create professional promotional videos using Remotion with AI voiceover and background music. Invoke with /promo-video.