Loading...
Loading...
Found 121 Skills
Inworld TTS API. Covers voice cloning, audio markups, timestamps. Keywords: text-to-speech, visemes.
Generate speech, music, and sound effects using ModelsLab's v7 Voice API. Supports text-to-speech, speech-to-text, speech-to-speech, music generation, sound effects, dubbing, song extension, and song inpainting via ElevenLabs and Inworld models.
Play audio files, use text-to-speech, and record calls. Use when building IVR systems, playing announcements, or recording conversations. This skill provides Python SDK examples.
Play audio files, use text-to-speech, and record calls. Use when building IVR systems, playing announcements, or recording conversations. This skill provides Go SDK examples.
MiniMax API via curl. Use this skill for Chinese LLM chat, text-to-speech, and AI video generation.
Expert in voice synthesis, TTS, voice cloning, podcast production, speech processing, and voice UI design via ElevenLabs integration. Specializes in vocal clarity, loudness standards (LUFS), de-essing, dialogue mixing, and voice transformation. Activate on 'TTS', 'text-to-speech', 'voice clone', 'voice synthesis', 'ElevenLabs', 'podcast', 'voice recording', 'speech-to-speech', 'voice UI', 'audiobook', 'dialogue'. NOT for spatial audio (use sound-engineer), music production (use DAW tools), game audio middleware (use sound-engineer), sound effects generation (use sound-engineer with ElevenLabs SFX), or live concert audio.
Text-to-Speech using Doubao (Volcano Engine) API. Use when converting text to natural-sounding speech, generating audio files from text, listing available TTS voices, or synthesizing speech with customizable speed/volume parameters.
Generate AI-powered podcast-style audio narratives using Azure OpenAI's GPT Realtime Mini model via WebSocket. Use when building text-to-speech features, audio narrative generation, podcast creatio...
Profile-aware speech workflow for narrated notes, spoken drafts, audio summaries, accessibility reads, and other text-to-speech tasks. Use when one front-door workflow should resolve voice profiles, enforce disclosure, and apply manifest tracking before delegating to built-in `$speech` or a deterministic local CLI path.
Use when the user wants to generate speech, voiceover, or text-to-audio. Converts text to AI voice via Giggle.pro TTS API. Triggers: generate speech, text-to-speech, TTS, voiceover, read this text aloud, synthesize speech.
Text-to-speech conversion using `uvx edge-tts` for generating audio from text. Use when (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Find the right Deepgram documentation for any task. Use whenever someone needs help locating docs, understanding which API to use, or wants to ask questions about Deepgram. Covers all product areas: speech-to-text, text-to-speech, voice agents, audio intelligence, and self-hosted deployments.