Loading...
Loading...
Found 23 Skills
ElevenLabs text-to-speech with mac-style say UX.
Deprecated legacy TalkToMePy TTS skill retained for backward compatibility. Prefer successor speech workflows in [gaelic-ghost/a11y-skills](https://github.com/gaelic-ghost/a11y-skills).
Text-to-speech synthesis via Google Cloud Text-to-Speech API — MP3 output, configurable language and voice, voice listing.
MiMo V2.5 TTS Text-to-Speech. Generate speech using Xiaomi MiMo V2.5 TTS series models. This skill is activated when text needs to be converted to speech, voice messages need to be sent, content needs to be read aloud, or when users request 'speak it out' or 'voice reply'. It supports three modes: preset voice, voice design, and voice cloning, as well as natural language control and director mode. It also supports style tag control for tone, emotion, and dialect, and preset voices support singing.
Convert text to speech (TTS) using the ListenHub API. Three modes are supported: Quick Synthesis (/v1/tts), Multi-role Script (/v1/speech), and Long Text Streaming Synthesis (/v1/flow-speech/episodes). If no voice is specified, automatically retrieve the voice list for user selection, with chat-girl-105-cn (Xiaoman) as the default. Use when user says: "tts", "text to speech", "语音合成", "文字转语音", "朗读", "生成语音", "生成音频", "转音频", "text to audio"
Amazon Polly integration. Manage data, records, and automate workflows. Use when the user wants to interact with Amazon Polly data.
Expert in voice synthesis, TTS, voice cloning, podcast production, speech processing, and voice UI design via ElevenLabs integration. Specializes in vocal clarity, loudness standards (LUFS), de-essing, dialogue mixing, and voice transformation. Activate on 'TTS', 'text-to-speech', 'voice clone', 'voice synthesis', 'ElevenLabs', 'podcast', 'voice recording', 'speech-to-speech', 'voice UI', 'audiobook', 'dialogue'. NOT for spatial audio (use sound-engineer), music production (use DAW tools), game audio middleware (use sound-engineer), sound effects generation (use sound-engineer with ElevenLabs SFX), or live concert audio.
Text-to-speech using edge-tts or macOS say. Use when user says "speak", "say", "read aloud", or wants text spoken.
Use when the user wants to generate speech, voiceover, or text-to-audio. Converts text to AI voice via Giggle.pro TTS API. Triggers: generate speech, text-to-speech, TTS, voiceover, read this text aloud, synthesize speech.
Find and use media assets — GIF search via Giphy, stock photos, movie data from OMDB/Letterboxd, voice synthesis, and media uploads.
AI voice generation, text-to-speech, and voice synthesis via inference.sh CLI. Models: Inworld TTS-2 (100+ languages, emotion/non-verbal steering), Inworld TTS 1.5 (ultra-low latency), ElevenLabs (22+ premium voices, 32 languages), Kokoro TTS, DIA, Chatterbox, Higgs, VibeVoice for natural speech. Capabilities: multiple voices, emotions, accents, long-form narration, conversation, voice transformation, delivery mode control, character voices. Use for: voiceovers, audiobooks, podcasts, video narration, accessibility, gaming NPCs, avatar audio, UGC. Triggers: voice cloning, tts, text to speech, ai voice, voice generation, voice synthesis, voice over, narration, speech synthesis, ai narrator, elevenlabs, eleven labs, natural voice, realistic speech, voice ai, voice changer, inworld, inworld tts, character voice, npc voice