elevenlabs

ElevenLabs AI Audio Platform

Complete guide to ElevenLabs' audio AI capabilities: speech synthesis, transcription, voice cloning, sound effects, music generation, dubbing, and conversational voice agents.

Quick Reference

| Capability | API/Tool | Use Case |
| --- | --- | --- |
| Text-to-Speech | text_to_speech | Generate lifelike speech from text |
| Speech-to-Text | speech_to_text | Transcribe audio with Scribe v2 |
| Voice Cloning | voice_clone | Clone voices from audio samples |
| Voice Design | text_to_voice | Create voices from text descriptions |
| Sound Effects | text_to_sound_effects | Generate SFX from prompts |
| Music | compose_music | Generate studio-grade music |
| Dubbing | Dubbing API | Translate video/audio (32 languages) |
| Voice Changer | speech_to_speech | Transform voice while preserving emotion |
| Voice Isolator | isolate_audio | Remove background noise |
| Voice Agents | Agents CLI/API | Build conversational AI agents |

Setup

API Key

bash
# Environment variable
export ELEVENLABS_API_KEY="your-api-key"

# Or in .env file
ELEVENLABS_API_KEY=your-api-key
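To show how the two options above might be combined in application code, here is a minimal sketch that reads ELEVENLABS_API_KEY from the environment and falls back to a .env file. The helper names `parse_dotenv` and `load_api_key` are hypothetical and not part of the ElevenLabs SDK, and the parser only handles simple KEY=value lines.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_api_key(env: Optional[Mapping[str, str]] = None,
                 dotenv_path: str = ".env") -> str:
    """Return ELEVENLABS_API_KEY from the environment, else from a .env file."""
    env = os.environ if env is None else env
    key = env.get("ELEVENLABS_API_KEY")
    if key:
        return key
    path = Path(dotenv_path)
    if path.is_file():
        key = parse_dotenv(path.read_text(encoding="utf-8")).get("ELEVENLABS_API_KEY")
    if not key:
        raise RuntimeError("ELEVENLABS_API_KEY is not set")
    return key
```

Keeping the key out of source files this way means the same code can run locally and in CI without edits.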

SDK Installation

bash
# Python
pip install elevenlabs

# TypeScript/Node
npm install elevenlabs

MCP Server (for Claude Code, Cursor, etc.)

json
{
  "mcpServers": {
    "ElevenLabs": {
      "command": "uvx",
      "args": ["elevenlabs-mcp"],
      "env": {
        "ELEVENLABS_API_KEY": "your-api-key"
      }
    }
  }
}

Text-to-Speech (TTS)

Convert text to lifelike speech. See references/tts-models.md for model details.

Python SDK

python
from elevenlabs.client import ElevenLabs
from elevenlabs import play

client = ElevenLabs(api_key="your-api-key")

# Synthesize speech for the given text with a specific voice and model
audio = client.text_to_speech.convert(
    text="Hello world!",
    voice_id="JBFqnCBsd6RMkjVDRZzb",  # George
    model_id="eleven_multilingual_v2",
    output_format="mp3_44100_128"
)
# Play the generated audio locally
play(audio)

MCP Tool

mcp__ElevenLabs__text_to_speech
- text: "Your text here"
- voice_name: "Rachel" (or voice_id)
- model_id: "eleven_multilingual_v2"
- stability: 0.5, similarity_boost: 0.75
- speed: 1.0 (range: 0.7-1.2)
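Out-of-range values are easier to catch before a request is sent. The sketch below validates the three numeric settings from the parameter list above. The `TTSSettings` class is hypothetical, and the 0.0 to 1.0 range for stability and similarity_boost is an assumption; only the speed range is stated in this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TTSSettings:
    """Hypothetical container for the voice settings listed above."""
    stability: float = 0.5
    similarity_boost: float = 0.75
    speed: float = 1.0

    def __post_init__(self):
        # Assumption: both values are fractions between 0.0 and 1.0.
        for name in ("stability", "similarity_boost"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
        # Range taken from the parameter list: 0.7-1.2.
        if not 0.7 <= self.speed <= 1.2:
            raise ValueError(f"speed must be between 0.7 and 1.2, got {self.speed}")
```

Constructing `TTSSettings(speed=1.5)` raises a ValueError, while the defaults mirror the values shown in the tool call.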

Model Selection

| Model | Latency | Languages | Best For |
| --- | --- | --- | --- |
| eleven_multilingual_v2 | ~500ms | 29 | High quality, long-form |
| eleven_flash_v2_5 | ~75ms | 32 | Real-time, agents |
| eleven_turbo_v2_5 | ~250ms | 32 | Balanced quality/speed |
| eleven_v3 (alpha) | Higher | 70+ | Emotional, dramatic |
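One way to apply the table is to pick a model from a latency budget. The sketch below uses the approximate latencies listed above and assumes that, among models that fit the budget, the slower one is the higher-quality choice. The `pick_model` helper is hypothetical, and eleven_v3 is left out because the table gives no numeric latency for it.

```python
# (model_id, approximate latency in ms), copied from the table above
MODELS = [
    ("eleven_flash_v2_5", 75),
    ("eleven_turbo_v2_5", 250),
    ("eleven_multilingual_v2", 500),
]


def pick_model(max_latency_ms: int) -> str:
    """Return the slowest listed model whose latency still fits the budget."""
    candidates = [m for m in MODELS if m[1] <= max_latency_ms]
    if not candidates:
        raise ValueError(f"no listed model fits a {max_latency_ms} ms budget")
    # Assumption: higher latency correlates with higher output quality.
    return max(candidates, key=lambda m: m[1])[0]
```

For example, a 300 ms budget selects eleven_turbo_v2_5, while a 100 ms budget leaves only eleven_flash_v2_5.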

Speech-to-Text (Scribe)

Transcribe audio with 90+ language support. See references/stt-scribe.md for details.

Python SDK

python
from elevenlabs.client import ElevenLabs

client = ElevenLabs(api_key="your-api-key")

# Open the file in a context manager so the handle is closed after the upload
with open("audio.mp3", "rb") as audio_file:
    result = client.speech_to_text.convert(
        file=audio_file,
        model_id="scribe_v2",
        diarize=True  # Speaker detection
    )
print(result.text)

MCP Tool

mcp__ElevenLabs__speech_to_text
- input_file_path: "/path/to/audio.mp3"
- diarize: true (speaker detection)
- language_code: "eng" (or auto-detect)

Features

  • 90+ languages with word-level timestamps
  • Speaker diarization (up to 48 speakers)
  • Keyterm prompting (bias toward specific words)
  • Entity detection (names, numbers, dates)
  • Realtime mode (~150ms latency)
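Diarized, word-level output is usually easier to read once it is merged into speaker turns. The sketch below works on plain dictionaries and assumes each entry has "text" and "speaker_id" keys plus an optional "type" key that marks spacing entries. That shape is an assumption for illustration, so check references/stt-scribe.md for the actual response format. The `speaker_turns` helper is hypothetical.

```python
from itertools import groupby


def speaker_turns(words):
    """Merge consecutive words from the same speaker into (speaker_id, text) turns."""
    # Keep only real words; entries without a "type" key are treated as words.
    spoken = [w for w in words if w.get("type", "word") == "word"]
    turns = []
    for speaker, group in groupby(spoken, key=lambda w: w["speaker_id"]):
        turns.append((speaker, " ".join(w["text"] for w in group)))
    return turns


words = [
    {"text": "Hi", "speaker_id": "speaker_0", "type": "word"},
    {"text": " ", "speaker_id": "speaker_0", "type": "spacing"},
    {"text": "there", "speaker_id": "speaker_0", "type": "word"},
    {"text": "Hello", "speaker_id": "speaker_1", "type": "word"},
]
print(speaker_turns(words))
```

Because `groupby` only merges adjacent items, a speaker who talks twice with someone else in between produces two separate turns, which preserves the conversation order.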

Voice Cloning

Instant Voice Clone (MCP)

mcp__ElevenLabs__voice_clone
- name: "My Voice"
- files: ["/path/to/sample1.mp3", "/path/to/sample2.mp3"]
- description: "Professional male voice"

Requirements

  • Instant: 30+ seconds of clean audio
  • Professional: 30+ minutes for hyper-realistic clones
  • Creator plan or higher required

Voice Design

Create entirely new voices from text descriptions.

MCP Tool

mcp__ElevenLabs__text_to_voice
- voice_description: "A warm, friendly male voice with a slight British accent,
  perfect for audiobook narration"

Creates 3 voice previews to choose from. Use create_voice_from_preview to save.

Sound Effects

Generate cinematic sound effects from text. See references/sound-effects.md.

MCP Tool

mcp__ElevenLabs__text_to_sound_effects
- text: "Heavy wooden door creaking open slowly"
- duration_seconds: 3.0 (0.5-30 seconds)
- loop: false
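The duration bounds above and the rate of 40 credits per second listed under Pricing & Limits allow a rough cost estimate before generating. In the sketch below, the `sfx_credits` helper is hypothetical, and rounding up to a whole credit is an assumption rather than documented billing behaviour.

```python
import math

SFX_MIN_SECONDS = 0.5
SFX_MAX_SECONDS = 30.0
SFX_CREDITS_PER_SECOND = 40  # rate quoted under "Pricing & Limits"


def sfx_credits(duration_seconds: float) -> int:
    """Estimate credits for a sound effect with an explicit duration."""
    if not SFX_MIN_SECONDS <= duration_seconds <= SFX_MAX_SECONDS:
        raise ValueError(
            f"duration_seconds must be between {SFX_MIN_SECONDS} and {SFX_MAX_SECONDS}"
        )
    # Assumption: partial credits are rounded up.
    return math.ceil(duration_seconds * SFX_CREDITS_PER_SECOND)
```

Under these assumptions, the 3-second door creak in the example above would cost about 120 credits.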

Prompting Tips

  • Simple: "Glass shattering on concrete"
  • Sequences: "Footsteps on gravel, then a metallic door opens"
  • Musical: "90s hip-hop drum loop, 90 BPM"

Music Generation

Generate studio-grade music. See references/music-generation.md.

MCP Tool

mcp__ElevenLabs__compose_music
- prompt: "Upbeat electronic track with driving synths, 120 BPM"
- music_length_ms: 60000 (10s-5min)
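Because music_length_ms is given in milliseconds while the limits are quoted in seconds and minutes, a small conversion helper can prevent unit mistakes. This is a sketch; the `music_length_ms` function below is a hypothetical helper that just mirrors the parameter name and enforces the 10s-5min range shown above.

```python
MIN_MS = 10 * 1000        # 10 seconds
MAX_MS = 5 * 60 * 1000    # 5 minutes


def music_length_ms(minutes: int = 0, seconds: float = 0) -> int:
    """Convert a track length to milliseconds and check it against the allowed range."""
    total_ms = round((minutes * 60 + seconds) * 1000)
    if not MIN_MS <= total_ms <= MAX_MS:
        raise ValueError(f"length must be between {MIN_MS} and {MAX_MS} ms, got {total_ms}")
    return total_ms
```

For instance, `music_length_ms(minutes=1, seconds=30)` gives 90000, and a 5-second request is rejected before any call is made.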

Features

  • Complete control over genre, style, structure
  • Vocals or instrumental
  • Multilingual lyrics
  • Edit sections individually

Dubbing

Translate audio/video while preserving speaker identity. See references/dubbing.md.
  • 32 languages supported
  • Preserves emotion, timing, tone
  • Speaker separation (up to 9 speakers)
  • Files up to 1GB / 2.5 hours via API

Voice Changer (Speech-to-Speech)

Transform any voice while preserving performance nuances.

MCP Tool

mcp__ElevenLabs__speech_to_speech
- input_file_path: "/path/to/recording.mp3"
- voice_id: "target_voice_id"

  • Preserves whispers, laughs, emotional cues
  • 29 languages supported
  • Billed at 1,000 characters per minute of audio
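The billing rule above can be turned into a quick estimate of how many characters a recording will consume. The sketch assumes usage is prorated by the second and rounded up, which is not confirmed here, and `voice_changer_characters` is a hypothetical helper name.

```python
import math

CHARS_PER_MINUTE = 1000  # billing rate listed above


def voice_changer_characters(duration_seconds: float) -> int:
    """Estimate the characters billed for a recording of the given length."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must not be negative")
    # Assumption: billing is prorated per second and rounded up.
    return math.ceil(duration_seconds * CHARS_PER_MINUTE / 60)
```

A 90-second recording would be estimated at 1,500 characters under these assumptions.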

Voice Isolator

Remove background noise from recordings.

MCP Tool

mcp__ElevenLabs__isolate_audio
- input_file_path: "/path/to/noisy_audio.mp3"

  • Supports audio and video files
  • Files up to 500MB / 1 hour
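Checking size and duration locally avoids uploading a file that will be rejected. The sketch below encodes the Voice Isolator limit above and the Dubbing limit from the Dubbing section. It assumes binary megabytes, and `limit_problems` is a hypothetical helper rather than an SDK function.

```python
MB = 1024 * 1024  # assumption: limits are counted in binary megabytes

# tool -> (max size in bytes, max duration in seconds), from this guide
LIMITS = {
    "isolate_audio": (500 * MB, 1 * 3600),
    "dubbing": (1024 * MB, int(2.5 * 3600)),
}


def limit_problems(tool: str, size_bytes: int, duration_seconds: float) -> list:
    """Return human-readable problems; an empty list means the file fits."""
    max_bytes, max_seconds = LIMITS[tool]
    problems = []
    if size_bytes > max_bytes:
        problems.append(
            f"file is {size_bytes / MB:.0f} MB, limit is {max_bytes // MB} MB"
        )
    if duration_seconds > max_seconds:
        problems.append(
            f"audio is {duration_seconds / 60:.0f} min, limit is {max_seconds / 60:.0f} min"
        )
    return problems
```

Returning a list rather than raising lets a caller report both the size and the duration problem in one pass.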

Conversational Voice Agents

Build and deploy voice-enabled AI agents. See references/voice-agents.md for a comprehensive guide.

CLI Quick Start

bash
# Install
npm install -g @elevenlabs/cli

# Initialize and authenticate
elevenlabs agents init
elevenlabs auth login

# Create agent
elevenlabs agents add "Support Bot" --template customer-service

# Deploy
elevenlabs agents push

Templates

| Template | Use Case |
| --- | --- |
| customer-service | Professional support, low temperature |
| assistant | General purpose, balanced |
| voice-only | Voice interactions only |
| text-only | Text conversations only |
| minimal | Quick prototyping |

Agent Tools

  • Server Tools: Webhook API calls
  • Client Tools: Frontend events
  • MCP Tools: Model Context Protocol servers
  • System Tools: transfer_to_number, agent_transfer, end_call

Voice Library

Search Voices (MCP)

mcp__ElevenLabs__search_voices
- search: "professional narrator"
- sort: "name" | "created_at_unix"

Search Public Library

mcp__ElevenLabs__search_voice_library
- search: "deep male"
- page_size: 10

Popular Voice IDs

| Voice | ID | Style |
| --- | --- | --- |
| Rachel | 21m00Tcm4TlvDq8ikWAM | Neutral, professional |
| Adam | pNInz6obpgDQGcFmaJgB | Deep, warm |
| Bella | EXAVITQu4vr4xnSDxMaL | Soft, gentle |

Account & Billing

Check Subscription

mcp__ElevenLabs__check_subscription

List Models

mcp__ElevenLabs__list_models

Reference Documentation

| Topic | File |
| --- | --- |
| TTS Models & Parameters | references/tts-models.md |
| Speech-to-Text (Scribe) | references/stt-scribe.md |
| Sound Effects Prompting | references/sound-effects.md |
| Music Generation | references/music-generation.md |
| Voice Agents (CLI/API) | references/voice-agents.md |
| Agent Prompting Guide | references/agent-prompting.md |
| Dubbing Guide | references/dubbing.md |

Pricing & Limits

  • TTS: Per character (Flash models 50% cheaper)
  • STT: Per hour of audio
  • Sound Effects: 40 credits/second when duration specified
  • Music: Per generation
  • See: elevenlabs.io/pricing

Concurrency Limits (by plan)

| Plan | Multilingual v2 | Flash/Turbo | STT |
| --- | --- | --- | --- |
| Free | 2 | 4 | 8 |
| Starter | 3 | 6 | 12 |
| Creator | 5 | 10 | 20 |
| Pro | 10 | 20 | 40 |
| Scale | 15 | 30 | 60 |
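When sending many requests in a batch, a client-side cap helps stay within the plan's concurrency limit. The sketch below copies the limits from the table above and uses asyncio.Semaphore to bound the number of jobs in flight. The `run_limited` helper and the `CONCURRENCY` mapping are hypothetical names, and each job stands in for whatever coroutine actually performs the API call.

```python
import asyncio

# plan -> (Multilingual v2, Flash/Turbo, STT), copied from the table above
CONCURRENCY = {
    "free": (2, 4, 8),
    "starter": (3, 6, 12),
    "creator": (5, 10, 20),
    "pro": (10, 20, 40),
    "scale": (15, 30, 60),
}


async def run_limited(jobs, limit: int):
    """Run zero-argument coroutine functions with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        # Wait for a free slot before starting the job.
        async with semaphore:
            return await job()

    # gather preserves the order of the input jobs in its results.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

A caller on the Free plan using Multilingual v2 would pass `CONCURRENCY["free"][0]` as the limit, so no more than two requests run at once.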