ElevenLabs - Text-to-Speech & Podcast Skill
Overview
This skill converts text and documents into high-quality audio using the ElevenLabs TTS API. It supports two modes: single-voice narration and two-host conversational podcast generation.
When to Use This Skill
Activate when the user mentions:
- "create podcast", "generate podcast", "podcast from document"
- "narrate document", "narrate this file", "read aloud"
- "text to speech", "TTS", "convert to audio"
- "audio from document", "audio version of"
Setup
Config at `skills/elevenlabs/config.json`:

```json
{
  "api_key": "your-elevenlabs-api-key",
  "default_voice": "JBFqnCBsd6RMkjVDRZzb",
  "default_model": "eleven_multilingual_v2",
  "podcast_voice1": "JBFqnCBsd6RMkjVDRZzb",
  "podcast_voice2": "EXAVITQu4vr4xnSDxMaL"
}
```

Only `api_key` is required. Alternatively, set the `ELEVENLABS_API_KEY` env var.

Dependencies: `pip install PyPDF2 python-docx` (only needed for PDF/DOCX files).

Requires `ffmpeg` for multi-chunk narration and podcasts.
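To make the configuration rules concrete, here is a minimal sketch of how a loader could resolve settings. It is not taken from the skill's script: `load_config` and `DEFAULTS` are hypothetical names, the default values are simply copied from the example config, and the precedence (environment variable over file) is an assumption.

```python
import json
import os
from pathlib import Path

# Assumption: these defaults mirror the example config; the real script may differ.
DEFAULTS = {
    "default_voice": "JBFqnCBsd6RMkjVDRZzb",
    "default_model": "eleven_multilingual_v2",
    "podcast_voice1": "JBFqnCBsd6RMkjVDRZzb",
    "podcast_voice2": "EXAVITQu4vr4xnSDxMaL",
}


def load_config(path="skills/elevenlabs/config.json", env=None):
    """Hypothetical helper: merge defaults, the config file, and the environment."""
    env = os.environ if env is None else env
    config = dict(DEFAULTS)
    config_path = Path(path)
    if config_path.is_file():
        config.update(json.loads(config_path.read_text(encoding="utf-8")))
    # Assumption: the environment variable wins when both sources provide a key.
    if env.get("ELEVENLABS_API_KEY"):
        config["api_key"] = env["ELEVENLABS_API_KEY"]
    if not config.get("api_key"):
        raise ValueError("No API key: set api_key in config.json or ELEVENLABS_API_KEY")
    return config
```

Passing `env` explicitly keeps the function easy to exercise without touching the real environment.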
Commands
List Voices
```bash
python skills/elevenlabs/scripts/elevenlabs.py voices
python skills/elevenlabs/scripts/elevenlabs.py voices --json
```

Use this to find voice IDs for the user.
Single-Voice TTS
```bash
# From text
python skills/elevenlabs/scripts/elevenlabs.py tts --text "Hello world" --output ~/Downloads/hello.mp3

# From document
python skills/elevenlabs/scripts/elevenlabs.py tts --file /path/to/doc.pdf --output ~/Downloads/narration.mp3

# With specific voice
python skills/elevenlabs/scripts/elevenlabs.py tts --file doc.md --voice VOICE_ID --output out.mp3
```

The script automatically handles text extraction, chunking at sentence boundaries (~4000 chars), TTS per chunk with voice continuity, and ffmpeg concatenation.
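The chunking step can be illustrated with a short sketch. This is not the script's actual code: `chunk_text` is a hypothetical function, the sentence splitter is a naive regular expression, and the hard split for over-long sentences is an assumption about how such cases might be handled.

```python
import re


def chunk_text(text, max_chars=4000):
    """Split text into chunks of at most max_chars, breaking after sentences."""
    # Naive sentence split: break after ., !, or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        # Assumption: a single sentence longer than the limit is cut by length.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be sent to the TTS API in order and the resulting audio files joined with ffmpeg.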
Podcast Generation
Podcast mode requires a JSON script file with conversation segments:

```json
[
  {"speaker": "host1", "text": "Welcome to our podcast! Today we're diving into..."},
  {"speaker": "host2", "text": "That's right! I found the section on..."},
  {"speaker": "host1", "text": "Let's break that down..."}
]
```

```bash
python skills/elevenlabs/scripts/elevenlabs.py podcast --script /tmp/script.json --voice1 ID1 --voice2 ID2 --output ~/Downloads/podcast.mp3
```
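Because a malformed script file only fails once audio generation has started, it may help to check the JSON first. The sketch below is a hypothetical pre-flight check, not part of the skill: `validate_script` is an invented name, the speaker set is inferred from the example above, and the 3000-character limit comes from the turn-length guideline in the podcast workflow.

```python
VALID_SPEAKERS = {"host1", "host2"}  # inferred from the example script


def validate_script(segments, max_turn_chars=3000):
    """Return a list of problems; an empty list means the script looks usable."""
    if not isinstance(segments, list) or not segments:
        return ["script must be a non-empty JSON array"]
    problems = []
    for index, segment in enumerate(segments):
        if not isinstance(segment, dict):
            problems.append(f"segment {index}: not an object")
            continue
        if segment.get("speaker") not in VALID_SPEAKERS:
            problems.append(f"segment {index}: speaker must be host1 or host2")
        text = segment.get("text")
        if not isinstance(text, str) or not text.strip():
            problems.append(f"segment {index}: text must be a non-empty string")
        elif len(text) > max_turn_chars:
            problems.append(f"segment {index}: text exceeds {max_turn_chars} characters")
    return problems
```

Load the file with `json.load` and pass the result in; an empty return value means it is reasonable to proceed.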
Podcast Workflow (for Claude)
When the user asks to create a podcast from a document:

1. Extract the document text:
   ```bash
   python skills/elevenlabs/scripts/extract.py /path/to/document.pdf
   ```
2. Generate a two-host conversation script from the extracted text. Follow these guidelines:
   - Write as a natural, engaging discussion between two hosts
   - Host 1 typically leads/introduces topics, Host 2 adds analysis and reactions
   - Start with a brief intro welcoming listeners and stating the topic
   - End with a summary/outro
   - Keep each turn under 3000 characters
   - Vary turn lengths - mix short reactions with longer explanations
   - Use conversational language: "That's a great point", "What I found interesting was..."
   - Reference specific details from the source document
   - Avoid reading the document verbatim - discuss and interpret it
3. Write the script as a JSON array to a temp file:
   ```python
   # Write to /tmp/podcast_script.json
   [
     {"speaker": "host1", "text": "Welcome to today's episode..."},
     {"speaker": "host2", "text": "Thanks for having me..."},
     ...
   ]
   ```
4. Generate the podcast:
   ```bash
   python skills/elevenlabs/scripts/elevenlabs.py podcast --script /tmp/podcast_script.json --output ~/Downloads/podcast.mp3
   ```
5. Clean up the temp script file.
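Steps 3 to 5 can be sketched as one helper so the temp file is removed even if generation fails. This is an illustration under assumptions, not the skill's own code: `write_temp_script`, `build_podcast_command`, and `generate_podcast` are hypothetical names, and the command line simply mirrors the one shown in step 4.

```python
import json
import os
import subprocess
import tempfile

SCRIPT = "skills/elevenlabs/scripts/elevenlabs.py"


def write_temp_script(segments):
    """Step 3: write the conversation segments to a temp JSON file."""
    fd, path = tempfile.mkstemp(prefix="podcast_script_", suffix=".json")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(segments, handle, ensure_ascii=False, indent=2)
    return path


def build_podcast_command(script_path, output_path):
    """Build the step 4 command as an argument list (no shell involved)."""
    return ["python", SCRIPT, "podcast", "--script", script_path, "--output", output_path]


def generate_podcast(segments, output_path, run=subprocess.run):
    """Steps 3-5: write the script, run the generator, always clean up."""
    # Expand ~ ourselves because no shell is involved to do it.
    output_path = os.path.expanduser(output_path)
    script_path = write_temp_script(segments)
    try:
        run(build_podcast_command(script_path, output_path), check=True)
    finally:
        os.remove(script_path)  # Step 5: remove the temp script file.
    return output_path
```

The `run` parameter exists only so the command can be inspected or replaced in a dry run; by default it calls `subprocess.run`.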
Tips
- Run `voices` first to let the user pick voices they like
- For podcasts, suggest voice pairs with contrasting qualities (e.g., one deep, one bright)
- Default output to `~/Downloads/` unless the user specifies otherwise
- For large documents, warn the user about character usage on their ElevenLabs plan
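For the last tip, a rough check might look like the sketch below. `usage_warning` is a hypothetical helper, the remaining allowance has to be supplied by the user, and it assumes usage is counted one unit per character of input text, which may not match every plan or model.

```python
def usage_warning(text, remaining_characters, label="document"):
    """Return a warning string if the text exceeds the remaining allowance."""
    # Assumption: usage is counted as one unit per character of input text.
    needed = len(text)
    if needed <= remaining_characters:
        return None
    return (
        f"The {label} has {needed:,} characters but only "
        f"{remaining_characters:,} remain on the plan."
    )
```

A `None` result means the text fits within the stated allowance.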