Loading...
Loading...
Found 121 Skills
ElevenLabs integration. Manage data, records, and automate workflows. Use when the user wants to interact with ElevenLabs data.
[QwenCloud] Synthesize speech from text with Qwen TTS models. TRIGGER when: user wants to convert text to speech, create voiceovers, generate audio narration, read text aloud, build TTS applications, mentions speech synthesis/voice generation/audio output from text, or explicitly invokes this skill by name (e.g. use qwencloud-audio-tts). DO NOT TRIGGER when: user wants speech recognition/ASR, text generation without audio, non-Qwen audio tasks.
Convert text to speech using ElevenLabs API. Use when you need to generate voice audio for messages, narrations, or accessibility.
Generate and manage persona-aware voice assets for short-form video production, including voice design, script-specific audio takes, and future reusable voice identities. Use this when persona registries and scripts already exist and you need local audio assets, voice manifests, and reviewable voice iterations without losing continuity across many videos.
Generate spoken audio from text using OpenAI's API with built-in voices. Useful for narrated explainers, lecture audio, and quick voiceover tracks.
Find focused, runnable Deepgram recipes for a specific feature × language. Use whenever someone wants a minimal working code snippet for ONE feature (transcribe URL, diarize, smart-format, voice agent connect, etc.) rather than a full starter app. Recipes are under 50 lines, read DEEPGRAM_API_KEY from env, and ship with a runnable example_test. Covers Python, JavaScript, Go, .NET, Java, Rust, and the Deepgram CLI.
Use this skill when the user requests to generate, create, or produce podcasts from text content. Converts written content into a two-host conversational podcast audio format with natural dialogue.
Generate character voices using TTS, voice cloning, and lip-sync tools. Supports Chatterbox, F5-TTS, TTS Audio Suite, RVC, and ElevenLabs. Use when creating speech audio for characters or syncing audio to video.
Use when creating cloned voices with Alibaba Cloud Model Studio CosyVoice customization models, especially cosyvoice-v3.5-plus or cosyvoice-v3.5-flash, from reference audio and then reusing the returned voice_id in later TTS calls.
Use when generating a non-interactive cutscene clip — opening scene, story beat, character intro, ending. Locks the look with a reference image, image-to-videos a 5-10s shot, optionally adds TTS dialogue, and wires it as a VideoStreamPlayer that fades in/out. Trigger on "cutscene", "intro cinematic", "opening scene", "ending cinematic", "story beat video", "character intro video", "in-engine cinematic", "non-playable scene".
Generate high-quality voiceover audio with ElevenLabs. Includes word-level timestamps for video sync. Use when: creating demo narration, video voiceover, podcast intros, or any TTS need. Keywords: voiceover, TTS, text to speech, ElevenLabs, narration, audio, timestamps.
Tired of juggling multiple audio APIs? This skill gives you one-command access to TTS, music generation, sound effects, and voice cloning. Use when you want to generate any audio without managing multiple API keys.