byted-text-to-speech
Byted-Text-to-Speech Skill
Converts text to speech and saves the result as an audio file, using VolcEngine Doubao Text-to-Speech (HTTP Chunked/SSE one-way streaming, V3).
When to Use
Prefer this skill when the user has any of the following needs:
- Converting a piece of text to speech or read-aloud audio
- Generating dubbing, narration, broadcasts, or audiobook clips
- Converting code comments, documents, articles, and similar content to audio for listening
- Generating multilingual speech (Chinese, English, etc.)
- The user mentions terms such as "text-to-speech", "TTS", "speech synthesis", "reading aloud", "dubbing", "read it out", or "read to me"
- The user does not explicitly say "speech synthesis", but the task essentially requires turning text into playable audio
Pre-Use Checks
First check whether the following credential is configured:
- `MODEL_SPEECH_API_KEY`
If the credential is missing, open `references/setup-guide.md` for how to activate the service, apply for a key, and configure it, then advise the user on activation.
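The credential check can be sketched in Python. This is a minimal illustration, assuming the key is read from the process environment; the helper name `check_speech_credentials` is hypothetical and is not part of the skill's script.

```python
import os

SETUP_GUIDE = "references/setup-guide.md"  # path named in this document


def check_speech_credentials(env=None):
    """Return (ok, message) describing whether MODEL_SPEECH_API_KEY is set.

    Hypothetical helper for illustration; not part of scripts/text_to_speech.py.
    """
    env = os.environ if env is None else env
    # Treat an empty or whitespace-only value as missing.
    key = env.get("MODEL_SPEECH_API_KEY", "").strip()
    if key:
        return True, "MODEL_SPEECH_API_KEY is configured."
    return False, f"MODEL_SPEECH_API_KEY is missing; see {SETUP_GUIDE} for setup steps."
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.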
Script Parameters
| Parameter | Shortcut | Required | Description |
|---|---|---|---|
| | `-t` | Yes | Text content to synthesize |
| | `-o` | No | Output audio file path (auto-generated by default) |
| | `-s` | No | Speaker (voice); a default is used when omitted |
| `--format` | | No | Audio format (e.g., mp3, ogg_opus) |
| `--sample-rate` | | No | Sample rate in Hz, e.g., 16000, 24000 (default 24000) |
| `--speech-rate` | | No | Speech rate in [-50, 100]; 100 is 2.0x speed, -50 is 0.5x speed (default 0) |
| | | No | Pitch in [-12, 12] (default 0) |
| | | No | Loudness in [-50, 100]; 100 is 2.0x volume, -50 is 0.5x volume (default 0) |
| | | No | Bit rate, applies to the mp3 and ogg_opus formats (e.g., 64000, 128000); default 64000 |
| | | No | Filter Markdown syntax |
| | | No | Enable LaTeX formula reading (uses latex_parser v2 and automatically enables Markdown filtering); disabled by default |
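The speech-rate and loudness ranges give two anchor points (-50 is 0.5x, 100 is 2.0x) plus a default of 0. A linear map `1 + value / 100` passes through both anchors and gives 1.0x at 0; the sketch below assumes that linear relationship, which this document does not confirm for intermediate values. Both helper names are hypothetical.

```python
def rate_to_multiplier(value):
    """Map a speech-rate or loudness setting in [-50, 100] to a multiplier.

    Assumption: the mapping is linear (1 + value / 100), which matches the
    documented anchors -50 -> 0.5x and 100 -> 2.0x. Hypothetical helper.
    """
    if not -50 <= value <= 100:
        raise ValueError(f"value must be in [-50, 100], got {value}")
    return 1 + value / 100


def validate_pitch(value):
    """Return value unchanged if it lies in the documented pitch range [-12, 12]."""
    if not -12 <= value <= 12:
        raise ValueError(f"pitch must be in [-12, 12], got {value}")
    return value
```

Under this assumption, a speech rate of 10 would correspond to roughly 1.1x speed.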
Return Value Description
The script outputs JSON containing:
- `status`: `"success"` or `"error"`
- `local_path`: local audio file path
- `format`: audio format
- `error`: error message on failure
Return the `local_path` or an accessible audio URL to the user for playback or download.
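A caller might consume the JSON output roughly as follows. This is a sketch, assuming the JSON object is the entire text passed in; the function name `extract_audio_path` is hypothetical, and only the field names (`status`, `local_path`, `error`) come from this document.

```python
import json


def extract_audio_path(stdout_text):
    """Parse the script's JSON output and return the local audio path.

    Assumes stdout_text holds exactly one JSON object. Hypothetical helper.
    """
    result = json.loads(stdout_text)
    if result.get("status") != "success":
        # Surface the script's own error message when one is provided.
        raise RuntimeError(result.get("error", "unknown error"))
    return result["local_path"]
```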
Error Handling
- If the error `PermissionError: MODEL_SPEECH_API_KEY ...` appears (the variable must be configured in the environment): prompt the user to obtain a `MODEL_SPEECH_API_KEY` in API Key Management, write it to the environment variable file under the workspace, and retry.
- If a 4xx/5xx status or a business error code is returned: use the error message to prompt the user to check the text content, the speaker ID, and whether the account has activated the Doubao Speech Service.
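The two error cases can be turned into user-facing hints with a small dispatcher. This is an illustrative sketch only: the function name `suggest_fix` is hypothetical, and matching on the substring `MODEL_SPEECH_API_KEY` is an assumption about how the permission error can be recognized.

```python
def suggest_fix(error_message):
    """Return a hint for the user based on the script's error message.

    Hypothetical helper; the substring match is an assumption.
    """
    if "MODEL_SPEECH_API_KEY" in error_message:
        return (
            "Obtain MODEL_SPEECH_API_KEY in API Key Management, write it to "
            "the workspace environment variable file, and retry."
        )
    # Fallback for 4xx/5xx responses and business error codes.
    return (
        "Check the text content, the speaker ID, and whether the account "
        "has activated the Doubao Speech Service."
    )
```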
Troubleshooting
- Missing credentials: open `references/setup-guide.md`
- Need to look up API parameters, fields, or error codes: open `references/docs-index.md`
- If the script returns a permission error, first check whether the service is activated and the credentials are valid, then give the user clear steps to follow
Reference Materials
Open the following files as needed; there is no need to load them all by default:
- `references/setup-guide.md`: service activation, credential application, environment variable configuration
- `references/docs-index.md`: API documentation index, parameter descriptions, voice list, error code quick reference
Examples
```bash
# Basic usage
python scripts/text_to_speech.py -t "Welcome to the VolcEngine Speech Synthesis Service."

# Specify speaker and output format
python scripts/text_to_speech.py -t "This is a test speech." -s zh_female_vv_uranus_bigtts -o output.mp3 --format mp3

# Specify speech rate and sample rate
python scripts/text_to_speech.py -t "Speech rate and pitch are adjustable." --speech-rate 10 --sample-rate 16000
```
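When the script is invoked from Python rather than a shell, building the argument list explicitly avoids quoting problems with arbitrary text. The sketch below uses only the flags that appear in the examples above (`-t`, `-s`, `-o`, `--format`, `--speech-rate`, `--sample-rate`); the function name `build_tts_command` and its keyword names are hypothetical.

```python
SCRIPT = "scripts/text_to_speech.py"  # script path used in the examples


def build_tts_command(text, speaker=None, output=None, audio_format=None,
                      speech_rate=None, sample_rate=None):
    """Build an argv list for the text-to-speech script.

    Hypothetical helper; options left as None are omitted so that the
    script's own defaults apply.
    """
    cmd = ["python", SCRIPT, "-t", text]
    options = [
        ("-s", speaker),
        ("-o", output),
        ("--format", audio_format),
        ("--speech-rate", speech_rate),
        ("--sample-rate", sample_rate),
    ]
    for flag, value in options:
        # Compare against None so that a legitimate value of 0 is still passed.
        if value is not None:
            cmd.extend([flag, str(value)])
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`, after which the captured standard output holds the JSON result.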