elevenlabs-tts
ElevenLabs text-to-speech with 22+ premium voices, multilingual support, and voice tuning via inference.sh CLI. Models: eleven_multilingual_v2 (highest quality), eleven_turbo_v2_5 (low latency), eleven_flash_v2_5 (ultra-fast). Capabilities: text-to-speech, voice selection, stability/style control, 32 languages. Use for: voiceovers, audiobooks, video narration, podcasts, accessibility, IVR. Triggers: elevenlabs, eleven labs, elevenlabs tts, premium tts, professional voice, ai voice, high quality tts, multilingual tts, eleven labs voice, voice generation, natural speech, realistic voice, voice over, speech synthesis
52.0k installs
Source: inferen-sh/skills
NPX Install
npx skill4agent add inferen-sh/skills elevenlabs-tts
SKILL.md Content
ElevenLabs Text-to-Speech
Premium text-to-speech with 22+ voices via inference.sh CLI.

Quick Start
Requires the inference.sh CLI (infsh). See the CLI install instructions.
bash
infsh login
# Generate speech with ElevenLabs
infsh app run elevenlabs/tts --input '{"text": "Hello, welcome to our product demo.", "voice": "aria"}'
Available Models
| Model | ID | Best For | Latency |
|---|---|---|---|
| Multilingual v2 | eleven_multilingual_v2 | Highest quality, 32 languages | ~250ms |
| Turbo v2.5 | eleven_turbo_v2_5 | Balance of speed & quality | ~150ms |
| Flash v2.5 | eleven_flash_v2_5 | Ultra-low latency | ~75ms |
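The latency column suggests a simple way to choose a model from a latency budget. The following is a minimal sketch, not part of the CLI: `pick_model` is a hypothetical helper, and its thresholds simply reuse the approximate latencies listed in the table. The model IDs are the ones named elsewhere in this document.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not an infsh feature): map a latency budget in
# milliseconds to a model ID, using the approximate latencies from the table.
pick_model() {
  local budget_ms="$1"
  if [ "$budget_ms" -ge 250 ]; then
    echo "eleven_multilingual_v2"   # highest quality, ~250ms
  elif [ "$budget_ms" -ge 150 ]; then
    echo "eleven_turbo_v2_5"        # balance of speed and quality, ~150ms
  else
    echo "eleven_flash_v2_5"        # lowest latency, ~75ms
  fi
}

# Build the JSON input for a 100 ms budget.
model="$(pick_model 100)"
printf '{"text": "Hello.", "voice": "aria", "model": "%s"}\n' "$model"
# To run it: infsh app run elevenlabs/tts --input '<the JSON printed above>'
```

The thresholds are a judgment call; real latency depends on text length and network conditions, so treat them as starting points.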
Voice Library
Female Voices
| Voice | Style |
|---|---|
| | American, conversational |
| | British, confident |
| | American, warm |
| | American, expressive |
| | American, professional |
| | British, soft |
| | American, friendly |
Male Voices
| Voice | Style |
|---|---|
| | British, authoritative |
| | American, deep |
| | American, mature |
| | American, conversational |
| | Transatlantic, intense |
| | Australian, natural |
| | American, casual |
| | British, commanding |
| | American, friendly |
| | American, young |
| | American, articulate |
| | American, warm |
| | American, confident |
| | American, authoritative |
| | American, bright |
Examples
Basic Speech
bash
infsh app run elevenlabs/tts --input '{"text": "Welcome to our quarterly earnings presentation.", "voice": "george"}'
Choose a Model
bash
# Highest quality
infsh app run elevenlabs/tts --input '{
"text": "This is our premium multilingual model with the best quality.",
"voice": "aria",
"model": "eleven_multilingual_v2"
}'
# Ultra-fast for real-time applications
infsh app run elevenlabs/tts --input '{
"text": "Flash model for low-latency applications.",
"voice": "brian",
"model": "eleven_flash_v2_5"
}'
Voice Tuning
bash
infsh app run elevenlabs/tts --input '{
"text": "Fine-tune the voice characteristics for your use case.",
"voice": "bella",
"stability": 0.3,
"similarity_boost": 0.9,
"style": 0.4
}'
| Parameter | Range | Effect |
|---|---|---|
| stability | 0-1 | Higher = more consistent, lower = more expressive |
| similarity_boost | 0-1 | Higher = closer to original voice character |
| style | 0-1 | Higher = more style exaggeration |
| | true/false | Enhances speaker clarity |
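Because the numeric parameters are limited to the 0-1 range, it can help to check values before sending a request. The sketch below is illustrative only: `in_unit_range` and `build_tuned_input` are hypothetical helpers, and the payload keys are the ones used in the Voice Tuning example. It does not escape quotes inside the text, so it suits simple strings only.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only for a decimal number between 0 and 1.
in_unit_range() {
  printf '%s\n' "$1" | grep -Eq '^(0(\.[0-9]+)?|1(\.0+)?)$'
}

# Hypothetical helper: validate the three tuning values, then print the JSON input.
# Note: the text is inserted as-is; embedded double quotes are not escaped.
build_tuned_input() {
  local text="$1" voice="$2" stability="$3" similarity="$4" style="$5"
  local v
  for v in "$stability" "$similarity" "$style"; do
    in_unit_range "$v" || { echo "value out of range 0-1: $v" >&2; return 1; }
  done
  printf '{"text": "%s", "voice": "%s", "stability": %s, "similarity_boost": %s, "style": %s}\n' \
    "$text" "$voice" "$stability" "$similarity" "$style"
}

# Example: lower stability for a more expressive read.
input="$(build_tuned_input "Fine-tune the voice." "bella" 0.3 0.9 0.4)" && echo "$input"
# To run it: infsh app run elevenlabs/tts --input "$input"
```

Validating locally is a design choice for faster feedback; the service may apply its own checks as well.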
Output Formats
bash
# High-quality MP3
infsh app run elevenlabs/tts --input '{
"text": "High quality audio output.",
"voice": "daniel",
"output_format": "mp3_44100_192"
}'
| Format | Description |
|---|---|
| | MP3 at 44.1kHz, 128kbps (default) |
| mp3_44100_192 | MP3 at 44.1kHz, 192kbps |
| | Raw PCM at 16kHz |
| | Raw PCM at 22.05kHz |
| | Raw PCM at 24kHz |
| | Raw PCM at 44.1kHz |
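Raw PCM has no header, so its duration has to be computed from the byte count and sample rate. The following is a rough sketch under an assumption this document does not state: that the PCM output is 16-bit mono (2 bytes per sample, one channel). `pcm_duration_ms` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Assumption (not confirmed by this document): raw PCM output is 16-bit mono,
# so each second of audio takes sample_rate * 2 bytes.
pcm_duration_ms() {
  local bytes="$1" sample_rate="$2"
  echo $(( bytes * 1000 / (sample_rate * 2) ))
}

# Example: 96000 bytes at 24 kHz is 96000 * 1000 / 48000 = 2000 ms.
pcm_duration_ms 96000 24000
# For a real file (hypothetical name): pcm_duration_ms "$(wc -c < speech.pcm)" 24000
```

If the sample width or channel count differs, adjust the bytes-per-second term accordingly.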
Multilingual
ElevenLabs supports 32 languages including English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Hindi, Russian, and more.
bash
# Spanish
infsh app run elevenlabs/tts --input '{
"text": "Hola, bienvenidos a nuestra presentación.",
"voice": "aria",
"model": "eleven_multilingual_v2"
}'
# French
infsh app run elevenlabs/tts --input '{
"text": "Bonjour, bienvenue à notre démonstration.",
"voice": "alice",
"model": "eleven_multilingual_v2"
}'
Voice + Video Workflow
bash
# 1. Generate voiceover
infsh app run elevenlabs/tts --input '{
"text": "Introducing the future of AI-powered content creation.",
"voice": "george"
}' > voiceover.json
# 2. Create talking head video
infsh app run bytedance/omnihuman-1-5 --input '{
"image_url": "https://portrait.jpg",
"audio_url": "<audio-url-from-step-1>"
}'
Use Cases
- Voiceovers: Product demos, explainer videos, commercials
- Audiobooks: Long-form narration with consistent voices
- Podcasts: AI hosts with natural delivery
- E-learning: Course narration in multiple languages
- Accessibility: High-quality screen reader content
- IVR: Professional phone system messages
- Video Narration: Documentary and social media content
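For workflows that chain apps, such as the Voice + Video Workflow above, step 2 needs the audio URL produced by step 1. The exact shape of the saved JSON is not documented here, so this sketch makes only one assumption: the file contains the audio URL as a quoted http(s) string. `first_url` is a hypothetical helper, and the sample file is invented for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first http(s) URL found in a file.
# No field name is assumed, since the output schema is not documented here.
first_url() {
  grep -oE 'https?://[^"[:space:]]+' "$1" | head -n 1
}

# Sample data for illustration only (not real infsh output).
sample="$(mktemp)"
printf '{"audio": "https://example.com/voiceover.mp3"}\n' > "$sample"
first_url "$sample"
rm -f "$sample"

# With the file from step 1:
#   audio_url="$(first_url voiceover.json)"
# then substitute "$audio_url" for <audio-url-from-step-1> in step 2.
```

If the output contains several URLs, inspect the file and select the right field explicitly instead of taking the first match.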
Related Skills
bash
# ElevenLabs multi-speaker dialogue
npx skills add inference-sh/skills@elevenlabs-dialogue
# ElevenLabs voice changer
npx skills add inference-sh/skills@elevenlabs-voice-changer
# ElevenLabs sound effects
npx skills add inference-sh/skills@elevenlabs-sound-effects
# All TTS models (Kokoro, DIA, Chatterbox, and more)
npx skills add inference-sh/skills@text-to-speech
# Full platform skill (all 150+ apps)
npx skills add inference-sh/skills@infsh-cli
Browse all audio apps:
infsh app list --category audio