Found 196 Skills
Use when the user asks to create a demo video, product walkthrough, feature showcase, animated presentation, marketing video, or GIF from screenshots or scene descriptions. Orchestrates playwright, ffmpeg, and edge-tts MCPs to produce polished video content.
Use when creating cloned voices with Alibaba Cloud Model Studio CosyVoice customization models, especially cosyvoice-v3.5-plus or cosyvoice-v3.5-flash, from reference audio and then reusing the returned voice_id in later TTS calls.
Use when generating a non-interactive cutscene clip: opening scene, story beat, character intro, or ending. Locks the look with a reference image, generates a 5-10s shot via image-to-video, optionally adds TTS dialogue, and wires it as a VideoStreamPlayer that fades in/out. Trigger on "cutscene", "intro cinematic", "opening scene", "ending cinematic", "story beat video", "character intro video", "in-engine cinematic", "non-playable scene".
Generate and transcribe speech using Google's Gemini-TTS and Chirp 3 models. Supports Text-to-Speech (Single/Multi-speaker), Instant Custom Voice, and Speech-to-Text (Transcription/Diarization).
Inworld TTS API. Covers voice cloning, audio markups, timestamps. Keywords: text-to-speech, visemes.
One-time bootstrap for Kokoro TTS engine, Telegram bot, and BotFather setup. TRIGGERS - setup tts, install kokoro, botfather, bootstrap tts-telegram-sync, configure telegram bot, full stack setup.
Generate high-quality voiceover audio with ElevenLabs. Includes word-level timestamps for video sync. Use when: creating demo narration, video voiceover, podcast intros, or any TTS need. Keywords: voiceover, TTS, text to speech, ElevenLabs, narration, audio, timestamps.
Configure TTS voices, speed, timeouts, queue depth, and bot settings. TRIGGERS - configure tts, change voice, tts speed, queue depth, tts timeout, bot config, tune settings, adjust parameters.
Tired of juggling multiple audio APIs? This skill gives you one-command access to TTS, music generation, sound effects, and voice cloning. Use when you want to generate any audio without managing multiple API keys.
Build voice apps with Sinch Voice REST API. Use for phone calls, text-to-speech (TTS), IVR menus, DTMF input, conference calling, call recording, call forwarding, answering machine detection (AMD), SIP routing, WebSocket audio streaming, and SVAML call control.
Help users integrate Runway audio APIs (TTS, sound effects, voice isolation, dubbing).
Use this skill when the user wants to convert a Wang Jianshuo-style WeChat article (article.md) into a narrated short MP4 video, featuring TTS voiceover via Volcano Engine TTS, scene-specific HyperFrames CSS/GSAP animations, subtle sound effects (SFX), abstract watercolor backgrounds, and end-to-end pipeline rendering to a 1080×1920 portrait MP4 (30-90 seconds). Triggers: "把这篇文章做成视频" ("turn this article into a video"), "做一个解说视频" ("make a narrated explainer video"), "讲解视频" ("explainer video"), "/wjs-converting-text-to-video".