ElevenLabs AI Audio Platform
Complete guide to ElevenLabs' audio AI capabilities: speech synthesis, transcription, voice cloning, sound effects, music generation, dubbing, and conversational voice agents.
Quick Reference
| Capability | API/Tool | Use Case |
|---|---|---|
| Text-to-Speech | | Generate lifelike speech from text |
| Speech-to-Text | | Transcribe audio with Scribe v2 |
| Voice Cloning | | Clone voices from audio samples |
| Voice Design | | Create voices from text descriptions |
| Sound Effects | | Generate SFX from prompts |
| Music | | Generate studio-grade music |
| Dubbing | Dubbing API | Translate video/audio (32 languages) |
| Voice Changer | | Transform voice while preserving emotion |
| Voice Isolator | | Remove background noise |
| Voice Agents | Agents CLI/API | Build conversational AI agents |
Setup
API Key
bash
# Environment variable
export ELEVENLABS_API_KEY="your-api-key"
# Or in .env file
ELEVENLABS_API_KEY=your-api-key
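In Python, the same variable can be read at startup so the key never appears in source code. The following is a minimal sketch; `load_api_key` is a hypothetical helper name, not part of the SDK.

```python
import os


def load_api_key(env=None):
    """Return the ElevenLabs API key from the environment, or raise if it is unset."""
    # `env` defaults to the process environment; a plain dict can be passed for testing.
    env = os.environ if env is None else env
    key = env.get("ELEVENLABS_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ELEVENLABS_API_KEY is not set")
    return key


# Hypothetical usage, assuming the variable was exported as shown above:
# client = ElevenLabs(api_key=load_api_key())
```

Failing early with a clear message tends to be easier to debug than an authentication error on the first request.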
SDK Installation
bash
# Python
pip install elevenlabs
# TypeScript/Node
npm install elevenlabs
MCP Server (for Claude Code, Cursor, etc.)
json
{
  "mcpServers": {
    "ElevenLabs": {
      "command": "uvx",
      "args": ["elevenlabs-mcp"],
      "env": {
        "ELEVENLABS_API_KEY": "your-api-key"
      }
    }
  }
}
Text-to-Speech (TTS)
Convert text to lifelike speech. See references/tts-models.md for model details.
Python SDK
python
from elevenlabs.client import ElevenLabs
from elevenlabs import play

client = ElevenLabs(api_key="your-api-key")
audio = client.text_to_speech.convert(
    text="Hello world!",
    voice_id="JBFqnCBsd6RMkjVDRZzb",  # George
    model_id="eleven_multilingual_v2",
    output_format="mp3_44100_128",
)
play(audio)
MCP Tool
mcp__ElevenLabs__text_to_speech
- text: "Your text here"
- voice_name: "Rachel" (or voice_id)
- model_id: "eleven_multilingual_v2"
- stability: 0.5, similarity_boost: 0.75
- speed: 1.0 (range: 0.7-1.2)
Model Selection
| Model | Latency | Languages | Best For |
|---|---|---|---|
| | ~500ms | 29 | High quality, long-form |
| | ~75ms | 32 | Real-time, agents |
| | ~250ms | 32 | Balanced quality/speed |
| | Higher | 70+ | Emotional, dramatic |
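The text_to_speech parameters listed under MCP Tool have bounded ranges, so it can help to validate them before a request goes out. This is a sketch only: `tts_settings` is a hypothetical helper, and the 0 to 1 range for stability and similarity_boost is an assumption; only the 0.7-1.2 speed range is stated in this guide.

```python
def tts_settings(stability=0.5, similarity_boost=0.75, speed=1.0):
    """Validate voice settings and return them as a plain dict."""
    # Assumption: stability and similarity_boost are fractions in [0, 1].
    for name, value in (("stability", stability), ("similarity_boost", similarity_boost)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    # The 0.7-1.2 speed range comes from the MCP tool listing.
    if not 0.7 <= speed <= 1.2:
        raise ValueError(f"speed must be between 0.7 and 1.2, got {speed}")
    return {"stability": stability, "similarity_boost": similarity_boost, "speed": speed}


print(tts_settings(speed=1.1))
```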
Speech-to-Text (Scribe)
Transcribe audio with 90+ language support. See references/stt-scribe.md for details.
Python SDK
python
# Assumes `client` is the ElevenLabs client created in the Text-to-Speech example
with open("audio.mp3", "rb") as audio_file:
    result = client.speech_to_text.convert(
        file=audio_file,
        model_id="scribe_v2",
        diarize=True,  # Speaker detection
    )
print(result.text)
MCP Tool
mcp__ElevenLabs__speech_to_text
- input_file_path: "/path/to/audio.mp3"
- diarize: true (speaker detection)
- language_code: "eng" (or auto-detect)
Features
- 90+ languages with word-level timestamps
- Speaker diarization (up to 48 speakers)
- Keyterm prompting (bias toward specific words)
- Entity detection (names, numbers, dates)
- Realtime mode (~150ms latency)
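Word-level timestamps combined with diarization make it possible to rebuild a transcript as speaker turns. The sketch below works on plain dicts; the field names (`text`, `start`, `end`, `speaker_id`) and the sample values are assumptions for illustration, not confirmed API output.

```python
def speaker_turns(words):
    """Merge consecutive words from the same speaker into turns."""
    turns = []
    for w in words:
        if turns and turns[-1]["speaker"] == w["speaker_id"]:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["end"] = w["end"]
            turns[-1]["text"] += " " + w["text"]
        else:
            # Speaker changed (or first word): start a new turn.
            turns.append({
                "speaker": w["speaker_id"],
                "start": w["start"],
                "end": w["end"],
                "text": w["text"],
            })
    return turns


# Hypothetical sample data, not real API output.
sample = [
    {"text": "Hi", "start": 0.0, "end": 0.3, "speaker_id": "speaker_0"},
    {"text": "there", "start": 0.3, "end": 0.6, "speaker_id": "speaker_0"},
    {"text": "Hello", "start": 0.9, "end": 1.3, "speaker_id": "speaker_1"},
]
for turn in speaker_turns(sample):
    print(f'{turn["speaker"]} [{turn["start"]:.1f}-{turn["end"]:.1f}s]: {turn["text"]}')
```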
Voice Cloning
Instant Voice Clone (MCP)
mcp__ElevenLabs__voice_clone
- name: "My Voice"
- files: ["/path/to/sample1.mp3", "/path/to/sample2.mp3"]
- description: "Professional male voice"
Requirements
- Instant: 30+ seconds of clean audio
- Professional: 30+ minutes for hyper-realistic clones
- Creator plan or higher required
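The duration thresholds above can be checked before uploading samples. A minimal sketch, assuming the requirement applies to the combined length of all samples (the guide does not say whether it is per file or total); `clone_options` is a hypothetical helper and does not check audio cleanliness or plan level.

```python
INSTANT_MIN_SECONDS = 30            # 30+ seconds
PROFESSIONAL_MIN_SECONDS = 30 * 60  # 30+ minutes


def clone_options(sample_durations):
    """Return the total sample length and the clone types it qualifies for."""
    total = sum(sample_durations)
    options = []
    if total >= INSTANT_MIN_SECONDS:
        options.append("instant")
    if total >= PROFESSIONAL_MIN_SECONDS:
        options.append("professional")
    return total, options


print(clone_options([20.0, 25.5]))
```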
Voice Design
Create entirely new voices from text descriptions.
MCP Tool
mcp__ElevenLabs__text_to_voice
- voice_description: "A warm, friendly male voice with a slight British accent, perfect for audiobook narration"
Creates 3 voice previews to choose from. Use create_voice_from_preview to save.
Sound Effects
Generate cinematic sound effects from text. See references/sound-effects.md.
MCP Tool
mcp__ElevenLabs__text_to_sound_effects
- text: "Heavy wooden door creaking open slowly"
- duration_seconds: 3.0 (0.5-30 seconds)
- loop: false
Prompting Tips
- Simple: "Glass shattering on concrete"
- Sequences: "Footsteps on gravel, then a metallic door opens"
- Musical: "90s hip-hop drum loop, 90 BPM"
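The "Sequences" tip and the duration range can be combined in a small argument builder. This is an illustrative sketch: `sfx_request` is a hypothetical helper, and treating duration_seconds as optional is an assumption.

```python
def sfx_request(steps, duration_seconds=None, loop=False):
    """Build text_to_sound_effects arguments from one or more described sounds."""
    steps = [s.strip() for s in steps if s.strip()]
    if not steps:
        raise ValueError("at least one sound description is required")
    # Join a sequence the way the "Sequences" tip phrases it: "A, then b".
    text = steps[0] + "".join(", then " + s[0].lower() + s[1:] for s in steps[1:])
    args = {"text": text, "loop": loop}
    if duration_seconds is not None:
        # Range taken from the MCP tool listing (0.5-30 seconds).
        if not 0.5 <= duration_seconds <= 30:
            raise ValueError("duration_seconds must be between 0.5 and 30")
        args["duration_seconds"] = duration_seconds
    return args


print(sfx_request(["Footsteps on gravel", "A metallic door opens"], duration_seconds=4))
```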
Music Generation
Generate studio-grade music. See references/music-generation.md.
MCP Tool
mcp__ElevenLabs__compose_music
- prompt: "Upbeat electronic track with driving synths, 120 BPM"
- music_length_ms: 60000 (10s-5min)
Features
- Complete control over genre, style, structure
- Vocals or instrumental
- Multilingual lyrics
- Edit sections individually
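Because music_length_ms is given in milliseconds, converting from minutes and seconds is an easy place to slip by a factor of 1000. A small sketch, assuming the 10s-5min bounds are inclusive; `music_length_ms` here is a hypothetical helper named after the parameter.

```python
def music_length_ms(minutes=0, seconds=0):
    """Convert a track length to milliseconds, enforcing the 10 s to 5 min range."""
    total_ms = int(round((minutes * 60 + seconds) * 1000))
    if not 10_000 <= total_ms <= 300_000:
        raise ValueError(f"length must be between 10s and 5min, got {total_ms} ms")
    return total_ms


print(music_length_ms(minutes=1))
```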
Dubbing
Translate audio/video while preserving speaker identity. See references/dubbing.md.
- 32 languages supported
- Preserves emotion, timing, tone
- Speaker separation (up to 9 speakers)
- Files up to 1GB / 2.5 hours via API
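The API limits listed above can be checked locally before an upload is attempted. A minimal sketch, assuming "1GB" means decimal gigabytes; `dubbing_problems` is a hypothetical helper, and the caller is expected to supply the file size and duration.

```python
MAX_DUB_BYTES = 1_000_000_000   # "1GB", assuming decimal gigabytes
MAX_DUB_SECONDS = 2.5 * 3600    # 2.5 hours
MAX_DUB_SPEAKERS = 9


def dubbing_problems(size_bytes, duration_seconds, speakers=1):
    """Return the reasons a file would exceed the dubbing limits (empty if none)."""
    problems = []
    if size_bytes > MAX_DUB_BYTES:
        problems.append("file is larger than 1GB")
    if duration_seconds > MAX_DUB_SECONDS:
        problems.append("audio is longer than 2.5 hours")
    if speakers > MAX_DUB_SPEAKERS:
        problems.append("more than 9 speakers")
    return problems


print(dubbing_problems(size_bytes=750_000_000, duration_seconds=3 * 3600))
```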
Voice Changer (Speech-to-Speech)
Transform any voice while preserving performance nuances.
MCP Tool
mcp__ElevenLabs__speech_to_speech
- input_file_path: "/path/to/recording.mp3"
- voice_id: "target_voice_id"

- Preserves whispers, laughs, emotional cues
- 29 languages supported
- Billed at 1,000 characters per minute of audio
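The per-minute billing rate translates into a quick estimate for a given recording. This sketch assumes billing is proportional to duration and rounds up to a whole character; the actual rounding rule is not stated in this guide, and `voice_changer_chars` is a hypothetical helper.

```python
import math

CHARS_PER_MINUTE = 1000


def voice_changer_chars(duration_seconds):
    """Estimate the characters billed for a recording of the given length."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must not be negative")
    # Assumption: proportional billing, rounded up to the next whole character.
    return math.ceil(duration_seconds * CHARS_PER_MINUTE / 60)


print(voice_changer_chars(90))
```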
Voice Isolator
Remove background noise from recordings.
MCP Tool
mcp__ElevenLabs__isolate_audio
- input_file_path: "/path/to/noisy_audio.mp3"

- Supports audio and video files
- Files up to 500MB / 1 hour
Conversational Voice Agents
Build and deploy voice-enabled AI agents. See references/voice-agents.md for a comprehensive guide.
CLI Quick Start
bash
# Install
npm install -g @elevenlabs/cli
# Initialize and authenticate
elevenlabs agents init
elevenlabs auth login
# Create agent
elevenlabs agents add "Support Bot" --template customer-service
# Deploy
elevenlabs agents push
Templates
| Template | Use Case |
|---|---|
| | Professional support, low temperature |
| | General purpose, balanced |
| | Voice interactions only |
| | Text conversations only |
| | Quick prototyping |
Agent Tools
- Server Tools: Webhook API calls
- Client Tools: Frontend events
- MCP Tools: Model Context Protocol servers
- System Tools: transfer_to_number, agent_transfer, end_call
Voice Library
Search Voices (MCP)
mcp__ElevenLabs__search_voices
- search: "professional narrator"
- sort: "name" | "created_at_unix"
Search Public Library
mcp__ElevenLabs__search_voice_library
- search: "deep male"
- page_size: 10
Popular Voice IDs
| Voice | ID | Style |
|---|---|---|
| Rachel | 21m00Tcm4TlvDq8ikWAM | Neutral, professional |
| Adam | pNInz6obpgDQGcFmaJgB | Deep, warm |
| Bella | EXAVITQu4vr4xnSDxMaL | Soft, gentle |
Browse: elevenlabs.io/voice-library
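Since the SDK call shown earlier takes a voice_id rather than a name, a small lookup table keeps the IDs in one place. The IDs below are copied from the table above; `voice_id_for` is a hypothetical helper.

```python
VOICE_IDS = {
    "rachel": "21m00Tcm4TlvDq8ikWAM",
    "adam": "pNInz6obpgDQGcFmaJgB",
    "bella": "EXAVITQu4vr4xnSDxMaL",
}


def voice_id_for(name):
    """Look up a voice ID by name, ignoring case and surrounding whitespace."""
    try:
        return VOICE_IDS[name.strip().lower()]
    except KeyError:
        raise KeyError(f"unknown voice {name!r}; known: {sorted(VOICE_IDS)}") from None


print(voice_id_for("Rachel"))
```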
Account & Billing
Check Subscription
mcp__ElevenLabs__check_subscription
List Models
mcp__ElevenLabs__list_models
Reference Documentation
| Topic | File |
|---|---|
| TTS Models & Parameters | references/tts-models.md |
| Speech-to-Text (Scribe) | references/stt-scribe.md |
| Sound Effects Prompting | references/sound-effects.md |
| Music Generation | references/music-generation.md |
| Voice Agents (CLI/API) | references/voice-agents.md |
| Agent Prompting Guide | references/agent-prompting.md |
| Dubbing Guide | references/dubbing.md |
Pricing & Limits
- TTS: Per character (Flash models 50% cheaper)
- STT: Per hour of audio
- Sound Effects: 40 credits/second when duration specified
- Music: Per generation
- See: elevenlabs.io/pricing
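Two of the rates above are simple enough to estimate in code. This is a rough sketch only: it assumes a baseline of one unit per character for TTS (the guide gives only the relative Flash discount), and both function names are hypothetical. Actual charges depend on the plan; see the pricing page.

```python
SFX_CREDITS_PER_SECOND = 40
FLASH_DISCOUNT = 0.5  # "Flash models 50% cheaper"


def sfx_credits(duration_seconds):
    """Credits for a sound effect generated with an explicit duration."""
    return SFX_CREDITS_PER_SECOND * duration_seconds


def tts_character_units(text, flash=False):
    """Relative TTS cost: one unit per character (assumed), halved for Flash models."""
    units = len(text)
    return units * FLASH_DISCOUNT if flash else units


print(sfx_credits(3.0))
print(tts_character_units("Hello world!", flash=True))
```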
Concurrency Limits (by plan)
| Plan | Multilingual v2 | Flash/Turbo | STT |
|---|---|---|---|
| Free | 2 | 4 | 8 |
| Starter | 3 | 6 | 12 |
| Creator | 5 | 10 | 20 |
| Pro | 10 | 20 | 40 |
| Scale | 15 | 30 | 60 |
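One way to stay under these limits on the client side is to cap in-flight requests with a semaphore. The sketch below uses `asyncio` with a stand-in coroutine instead of a real API call; the dictionary keys are hypothetical labels, and the numbers are copied from the table above.

```python
import asyncio

# Limits copied from the table; keys are illustrative labels only.
CONCURRENCY = {
    "free": {"multilingual_v2": 2, "flash_turbo": 4, "stt": 8},
    "starter": {"multilingual_v2": 3, "flash_turbo": 6, "stt": 12},
    "creator": {"multilingual_v2": 5, "flash_turbo": 10, "stt": 20},
    "pro": {"multilingual_v2": 10, "flash_turbo": 20, "stt": 40},
    "scale": {"multilingual_v2": 15, "flash_turbo": 30, "stt": 60},
}


async def run_limited(jobs, limit):
    """Run coroutine functions with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        async with semaphore:
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))


async def demo():
    in_flight = peak = 0

    async def fake_request():
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stands in for a real API call
        in_flight -= 1

    limit = CONCURRENCY["creator"]["multilingual_v2"]
    await run_limited([fake_request] * 12, limit)
    return peak, limit


peak, limit = asyncio.run(demo())
print(f"peak concurrent requests: {peak} (limit {limit})")
```

Requests beyond the cap simply wait for a free slot, which avoids relying on server-side rejection and retries.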