Search Results: text-to-speech

Found 71 Skills

hyperframes-media

Asset preprocessing for HyperFrames compositions: text-to-speech narration (Kokoro), audio/video transcription (Whisper), and background removal for transparent overlays (u2net). Use when generating voiceover from text, transcribing speech for captions, removing the background from a video or image to use as a transparent overlay, choosing a TTS voice or Whisper model, or chaining these (TTS → transcribe → captions). Each command downloads its own model on first run.

🇺🇸 | English | Translated

921

AI & Machine Learning | skill-zero/s

ai-voice-cloning

AI voice generation, text-to-speech, and voice synthesis via inference.sh CLI. Models: Kokoro TTS, DIA, Chatterbox, Higgs, VibeVoice for natural speech. Capabilities: multiple voices, emotions, accents, long-form narration, conversation. Use for: voiceovers, audiobooks, podcasts, video narration, accessibility. Triggers: voice cloning, tts, text to speech, ai voice, voice generation, voice synthesis, voice over, narration, speech synthesis, ai narrator, elevenlabs alternative, natural voice, realistic speech, voice ai

🇺🇸 | English | Translated

819

AI & Machine Learning | skill-zero/s

ai-podcast-creation

Create AI-powered podcasts with text-to-speech, music, and audio editing. Tools: Kokoro TTS, DIA TTS, Chatterbox, AI music generation, media merger. Capabilities: multi-voice conversations, background music, intro/outro, full episodes. Use for: podcast production, audiobooks, voice content, audio newsletters. Triggers: podcast, ai podcast, text to speech podcast, audio content, voice over, ai audiobook, multi voice, conversation ai, notebooklm alternative, audio generation, podcast automation, ai narrator, voice content, audio newsletter, podcast maker

🇺🇸 | English | Translated

677

AI & Machine Learning | microsoft/agent-skills

podcast-generation

Generate AI-powered podcast-style audio narratives using Azure OpenAI's GPT Realtime Mini model via WebSocket. Use when building text-to-speech features, audio narrative generation, podcast creation from content, or integrating with Azure OpenAI Realtime API for real audio output. Covers full-stack implementation from React frontend to Python FastAPI backend with WebSocket streaming.

🇺🇸 | English | Translated

1 scripts/Checked

AI & Machine Learning | notedit/happy-skills

tts-skill

MiniMax TTS API - Text-to-Speech, Voice Cloning, Voice Design

🇨🇳 | Chinese | Translated

1 scripts/Checked

AI & Machine Learning | qodex-ai/ai-agent-skills

voice-ai-integration

Build voice-enabled AI applications with speech recognition, text-to-speech, and voice-based interactions. Supports multiple voice providers and real-time processing. Use when creating voice assistants, voice-controlled applications, audio interfaces, or hands-free AI systems.

🇺🇸 | English | Translated

4 scripts/Checked

AI & Machine Learning | akrindev/google-studio-sk...

gemini-tts

Generate speech from text using Google Gemini TTS models via scripts/. Use for text-to-speech, audio generation, voice synthesis, multi-speaker conversations, and creating audio content. Supports multiple voices and streaming. Triggers on "text to speech", "TTS", "generate audio", "voice synthesis", "speak this text".

🇺🇸 | English | Translated

1 scripts/Checked

AI & Machine Learning | glebis/claude-skills

elevenlabs-tts

This skill converts text to high-quality audio files using ElevenLabs API. Use this skill when users request text-to-speech generation, audio narration, or voice synthesis with customizable voice parameters (stability, similarity boost) and voice presets (rachel, adam, bella, elli, josh, arnold, ava).

🇺🇸 | English | Translated

1 scripts/Checked

Frontend Development | daffy0208/ai-dev-standard...

voice-interface-builder

Expert in building voice interfaces, speech recognition, and text-to-speech systems

🇺🇸 | English | Translated

AI & Machine Learning | vm0-ai/vm0-skills

minimax

MiniMax API via curl. Use this skill for Chinese LLM chat, text-to-speech, and AI video generation.

🇺🇸 | English | Translated

AI & Machine Learning | itechmeat/llm-code

inworld

Inworld TTS API. Covers voice cloning, audio markups, timestamps. Keywords: text-to-speech, visemes.

🇺🇸 | English | Translated

AI & Machine Learning | boomsystel-code/openclaw-...

audio-cog

AI audio generation powered by CellCog. Text-to-speech, voice synthesis, voiceovers, podcast audio, narration, music generation, background music, sound design. Professional audio creation with AI.

🇺🇸 | English | Translated