Convert text to speech (TTS). Powered by the VolcEngine Doubao Text-to-Speech API, this skill supports streaming synthesis, multiple voice timbres, adjustable speech rate, pitch, and loudness, Markdown syntax filtering, and LaTeX formula broadcasting. Use this skill when users need to convert text to speech or generate read-aloud audio, dubbing, narration, or broadcasts, or when they mention terms like 'text-to-speech', 'TTS', 'speech synthesis', 'reading aloud', or 'dubbing'.
npx skill4agent add bytedance/agentkit-samples byted-text-to-speech

Configure the `MODEL_SPEECH_API_KEY` environment variable before use; see `references/setup-guide.md`.

| Parameter | Shortcut | Required | Description |
|---|---|---|---|
| | `-t` | Yes | The text content to be synthesized |
| | `-o` | No | Output audio file path (auto-generated by default) |
| | `-s` | No | Speaker (a default voice is used when omitted) |
| `--format` | | No | Audio format (e.g., mp3, ogg_opus) |
| `--sample-rate` | | No | Sample rate in Hz, e.g., 16000, 24000 (default 24000) |
| `--speech-rate` | | No | Speech rate in [-50, 100]: 100 is 2.0x speed, -50 is 0.5x speed (default 0) |
| | | No | Pitch in [-12, 12] (default 0) |
| | | No | Loudness in [-50, 100]: 100 is 2.0x volume, -50 is 0.5x volume (default 0) |
| | | No | Bit rate, valid for the mp3 and ogg_opus formats, e.g., 64000, 128000 (default 64000) |
| | | No | Filter Markdown syntax |
| | | No | Enable LaTeX formula broadcasting (uses latex_parser v2 and automatically enables Markdown filtering); disabled by default |
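
The speech rate and loudness ranges above can be related to plain multipliers. The following is a minimal sketch that assumes the scale is linear between the documented points (-50 is 0.5x, 0 is 1.0x, 100 is 2.0x). The table states only those three values, so the linear interpolation and the helper names `multiplier_to_param` and `param_to_multiplier` are assumptions for illustration, not part of the skill.

```python
def multiplier_to_param(multiplier: float) -> int:
    """Map a multiplier in [0.5, 2.0] to the [-50, 100] parameter scale.

    Assumes a linear scale: 0.5x -> -50, 1.0x -> 0, 2.0x -> 100.
    """
    if not 0.5 <= multiplier <= 2.0:
        raise ValueError("multiplier must be between 0.5 and 2.0")
    return round((multiplier - 1.0) * 100)


def param_to_multiplier(value: int) -> float:
    """Inverse mapping: a parameter in [-50, 100] back to a multiplier."""
    if not -50 <= value <= 100:
        raise ValueError("value must be between -50 and 100")
    return 1.0 + value / 100
```

Under this assumption, a speech rate of 10 would correspond to roughly 1.1x speed.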
The result contains the fields `status` (`"success"` or `"error"`), `local_path`, `format`, and `error`.

If the script fails with `PermissionError: MODEL_SPEECH_API_KEY ... needs to be configured in environment variables`, set `MODEL_SPEECH_API_KEY` as described in `references/setup-guide.md`.

References: `references/setup-guide.md`, `references/docs-index.md`

# Basic usage
python scripts/text_to_speech.py -t "Welcome to the VolcEngine Speech Synthesis Service."
# Specify speaker and output format
python scripts/text_to_speech.py -t "This is a test speech." -s zh_female_vv_uranus_bigtts -o output.mp3 --format mp3
# Specify speech rate and sample rate
python scripts/text_to_speech.py -t "Speech rate and pitch are adjustable." --speech-rate 10 --sample-rate 16000
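
When the script is called from other Python code, the argument list can be assembled programmatically. The sketch below uses only the flags that appear in the commands above (`-t`, `-s`, `-o`, `--format`, `--speech-rate`, `--sample-rate`). The function name `build_tts_command` and its keyword names are hypothetical, and the range check mirrors the documented [-50, 100] speech rate range.

```python
from typing import List, Optional

# Script path as used in the usage examples.
SCRIPT = "scripts/text_to_speech.py"


def build_tts_command(
    text: str,
    speaker: Optional[str] = None,
    output: Optional[str] = None,
    audio_format: Optional[str] = None,
    speech_rate: Optional[int] = None,
    sample_rate: Optional[int] = None,
) -> List[str]:
    """Build an argv list for the TTS script (hypothetical helper)."""
    if not text:
        raise ValueError("text is required")
    cmd = ["python", SCRIPT, "-t", text]
    if speaker is not None:
        cmd += ["-s", speaker]
    if output is not None:
        cmd += ["-o", output]
    if audio_format is not None:
        cmd += ["--format", audio_format]
    if speech_rate is not None:
        # Documented range for speech rate is [-50, 100].
        if not -50 <= speech_rate <= 100:
            raise ValueError("speech_rate must be in [-50, 100]")
        cmd += ["--speech-rate", str(speech_rate)]
    if sample_rate is not None:
        cmd += ["--sample-rate", str(sample_rate)]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a shell string avoids quoting problems when the text contains spaces or quotes.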