Loading...
Loading...
Found 156 Skills
Expert in voice synthesis, TTS, voice cloning, podcast production, speech processing, and voice UI design via ElevenLabs integration. Specializes in vocal clarity, loudness standards (LUFS), de-essing, dialogue mixing, and voice transformation. Activate on 'TTS', 'text-to-speech', 'voice clone', 'voice synthesis', 'ElevenLabs', 'podcast', 'voice recording', 'speech-to-speech', 'voice UI', 'audiobook', 'dialogue'. NOT for spatial audio (use sound-engineer), music production (use DAW tools), game audio middleware (use sound-engineer), sound effects generation (use sound-engineer with ElevenLabs SFX), or live concert audio.
Generate audio replies using TTS. Trigger with "read it to me [URL]" to fetch and read content aloud, or "talk to me [topic]" to generate a spoken response. Also responds to "speak", "say it", "voice reply".
Text-to-Speech using Doubao (Volcano Engine) API. Use when converting text to natural-sounding speech, generating audio files from text, listing available TTS voices, or synthesizing speech with customizable speed/volume parameters.
Minimal TTS smoke test for Model Studio Qwen TTS.
本地 TTS 语音生成(macOS say + afconvert),输出 m4a 文件。
Use Chanjing TTS API to convert text to speech
Text-to-speech conversion using `uvx edge-tts` for generating audio from text. Use when (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Expert skill for OmniVoice, a massively multilingual zero-shot TTS model supporting 600+ languages with voice cloning and voice design capabilities.
Generate multi-person talking head podcast videos from scratch using AI — character creation, TTS, avatar animation, and video stitching. Use when the user wants to create a podcast, talking head video, or multi-speaker conversation video.
Use when the user wants to teach / learn an English word as a video — turn a single English word into a self-contained HLS "supercut" lesson built from the mira video base. Stitches every season2 clip where the word is spoken (via the search-app API) into one .m3u8, prepended with a Claude-written bilingual word-intro card (word + IPA + 中文 gloss + usage, Volcano TTS) and appended with a 关注王建硕 CTA card. No MP4 burn. Triggers — "teach <word>", "讲讲 <word>", "学英语 <word>", "把 <word> 做成视频", "/wjs-teaching-english <word>".
Control audio generation requests before execution. Use this when the user asks for TTS, persona voice, voice change, translated dub, cloned voice take, podcast audio, or lip-sync audio handoff and the skill must classify the request before handing execution to voice-batch-runner or a video workflow.
Generate Chinese broadcast audio from text files via the MiniMax TTS API, which automatically handles common pronunciation errors such as polyphonic characters, English abbreviations, mixed model names, and number pronunciations. Triggered when the user says "Generate broadcast audio using MiniMax".