Loading...
Loading...
Convert text to natural speech with DIA TTS, Kokoro, Chatterbox, and more via inference.sh CLI. Models: DIA TTS (conversational), Kokoro TTS, Chatterbox, Higgs Audio, VibeVoice (podcasts). Capabilities: text-to-speech, voice cloning, multi-speaker dialogue, podcast generation, expressive speech. Use for: voiceovers, audiobooks, podcasts, accessibility, video narration, IVR, voice assistants. Triggers: text to speech, tts, voice generation, ai voice, speech synthesis, voice over, generate speech, ai narrator, voice cloning, text to audio, elevenlabs alternative, voice ai, ai voiceover, speech generator, natural voice
npx skill4agent add inferen-sh/skills text-to-speech
Requires inference.sh CLI (). Get installation instructions:infshnpx skills add inference-sh/skills@agent-tools
infsh login
# Generate speech
infsh app run infsh/kokoro-tts --input '{"text": "Hello, welcome to our product demo."}'| Model | App ID | Best For |
|---|---|---|
| DIA TTS | | Conversational, expressive |
| Kokoro TTS | | Fast, natural |
| Chatterbox | | General purpose |
| Higgs Audio | | Emotional control |
| VibeVoice | | Podcasts, long-form |
infsh app list --category audioinfsh app run infsh/kokoro-tts --input '{"text": "Welcome to our tutorial."}'infsh app sample infsh/dia-tts --save input.json
# Edit input.json:
# {
# "text": "Hey! How are you doing today? I'm really excited to share this with you.",
# "voice": "conversational"
# }
infsh app run infsh/dia-tts --input input.jsoninfsh app sample infsh/vibevoice --save input.json
# Edit input.json with your podcast script
infsh app run infsh/vibevoice --input input.jsoninfsh app sample infsh/higgs-audio --save input.json
# {
# "text": "This is absolutely incredible!",
# "emotion": "excited"
# }
infsh app run infsh/higgs-audio --input input.json# 1. Generate speech
infsh app run infsh/kokoro-tts --input '{"text": "Your script here"}' > speech.json
# 2. Use the audio URL with OmniHuman for avatar video
infsh app run bytedance/omnihuman-1-5 --input '{
"image_url": "https://portrait.jpg",
"audio_url": "<audio-url-from-step-1>"
}'# Full platform skill (all 150+ apps)
npx skills add inference-sh/skills@agent-tools
# AI avatars (combine TTS with talking heads)
npx skills add inference-sh/skills@ai-avatar-video
# AI music generation
npx skills add inference-sh/skills@ai-music-generation
# Speech-to-text (transcription)
npx skills add inference-sh/skills@speech-to-text
# Video generation
npx skills add inference-sh/skills@ai-video-generationinfsh app list