byted-text-to-speech

Original：🇨🇳 Chinese

2 scripts. Checked / no sensitive code detected

Convert text to speech (TTS). Powered by the VolcEngine Doubao Text-to-Speech API, this skill supports streaming synthesis, multiple voices (timbres), adjustable speech rate, pitch, and loudness, Markdown syntax filtering, and reading LaTeX formulas aloud. Use it when users need to convert text to speech or to generate read-aloud audio, dubbing, narration, or broadcasts, or when they mention terms like 'text-to-speech', 'TTS', 'speech synthesis', 'reading aloud', or 'dubbing'.

9 installs

Source: bytedance/agentkit-samples

Added on 2026-03-28

NPX Install

npx skill4agent add bytedance/agentkit-samples byted-text-to-speech

SKILL.md Content (Chinese)

Byted-Text-to-Speech Skill

Converts text to speech and saves the result as an audio file, using VolcEngine Doubao Text-to-Speech (HTTP Chunked/SSE one-way streaming, V3).

When to Use

Prefer this skill when the user's request matches any of the following:

Need to convert a piece of text to speech or reading audio
Need to generate dubbing, narration, broadcasts, or audiobook clips
Need to convert code comments, documents, articles, etc. to audio for easy listening
Need to generate multilingual speech (Chinese, English, etc.)
Users mention terms like "text-to-speech", "TTS", "speech synthesis", "reading aloud", "dubbing", "read it out", or "read to me"
Users don't explicitly mention "speech synthesis", but the task essentially requires converting text content to playable audio

Pre-Use Checks

First check if the following credentials have been configured:

```
MODEL_SPEECH_API_KEY
```

If the credential is missing, open `references/setup-guide.md` for the service activation, application, and configuration steps, and advise the user on how to activate the service.
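The pre-use check can be sketched in Python as follows. This is a minimal illustration, assuming the key is supplied as an ordinary environment variable; the helper name `has_speech_credentials` is hypothetical and not part of the skill's scripts.

```python
import os


def has_speech_credentials(env=None):
    """Return True if MODEL_SPEECH_API_KEY is set to a non-empty value.

    `env` defaults to the process environment; passing a plain dict
    makes the check easy to exercise without touching os.environ.
    """
    env = os.environ if env is None else env
    return bool(env.get("MODEL_SPEECH_API_KEY", "").strip())


if __name__ == "__main__":
    if has_speech_credentials():
        print("MODEL_SPEECH_API_KEY is configured.")
    else:
        # Point the user at the setup guide named in this document.
        print("MODEL_SPEECH_API_KEY is missing; see references/setup-guide.md.")
```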

Script Parameters

| Parameter | Shortcut | Required | Description |
| --- | --- | --- | --- |
| `--text` | `-t` | Yes | The text to synthesize |
| `--output` | `-o` | No | Output audio file path (auto-generated by default) |
| `--speaker` | `-s` | No | Speaker (voice) ID; default `zh_female_vv_uranus_bigtts`. See the Voice Timbre List |
| `--format` | | No | Audio format: `mp3` (default), `pcm`, `ogg_opus` |
| `--sample-rate` | | No | Sample rate in Hz, e.g., 16000 or 24000; default 24000 |
| `--speech-rate` | | No | Speech rate in [-50, 100]; 100 is 2.0x speed, -50 is 0.5x speed; default 0 |
| `--pitch-rate` | | No | Pitch in [-12, 12]; default 0 |
| `--loudness-rate` | | No | Loudness in [-50, 100]; 100 is 2.0x volume, -50 is 0.5x volume; default 0 |
| `--bit-rate` | | No | Bit rate, applies to the `mp3` and `ogg_opus` formats (e.g., 64000, 128000); default 64000 |
| `--filter-markdown` | | No | Filter Markdown syntax (e.g., `**Hello**` is read as "Hello"); disabled by default |
| `--enable-latex` | | No | Read LaTeX formulas aloud (uses `latex_parser` v2 and automatically enables Markdown filtering); disabled by default |
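When the script is driven from other Python code, it can help to validate the rate parameters before starting a process. The sketch below builds an argument list using only the flags and ranges documented in the table; the helper name `build_tts_command` is hypothetical, and whether the script performs the same range checks itself is not stated here.

```python
import sys

# Valid ranges copied from the parameter table.
RANGES = {
    "speech_rate": (-50, 100),
    "pitch_rate": (-12, 12),
    "loudness_rate": (-50, 100),
}


def build_tts_command(text, output=None, speaker=None, audio_format="mp3", **rates):
    """Build an argv list for scripts/text_to_speech.py, validating rate ranges."""
    if not text:
        raise ValueError("text must be non-empty")
    cmd = [sys.executable, "scripts/text_to_speech.py",
           "--text", text, "--format", audio_format]
    if output:
        cmd += ["--output", output]
    if speaker:
        cmd += ["--speaker", speaker]
    for name, value in rates.items():
        if name not in RANGES:
            raise ValueError(f"unknown rate option: {name}")
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
        # speech_rate -> --speech-rate, and so on.
        cmd += ["--" + name.replace("_", "-"), str(value)]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` from the skill's root directory.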

Return Value Description

The script outputs JSON containing:

`status`: `"success"` or `"error"`
`local_path`: Local audio file path
`format`: Audio format
`error`: Error message when failed

Return the `local_path` or an accessible audio URL to the user so they can play or download the audio.
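A caller might read the result as in the sketch below. It assumes the script prints a single JSON object with the fields listed above; the function name `extract_audio_path` is hypothetical.

```python
import json


def extract_audio_path(stdout_text):
    """Parse the script's JSON output and return local_path.

    Raises RuntimeError carrying the script's error message when
    status is anything other than "success".
    """
    result = json.loads(stdout_text)
    if result.get("status") != "success":
        raise RuntimeError(result.get("error", "unknown error"))
    return result["local_path"]
```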

Error Handling

If the error `PermissionError: MODEL_SPEECH_API_KEY ... needs to be configured in environment variables` occurs: Prompt the user to obtain `MODEL_SPEECH_API_KEY` in API Key Management, write it to the environment variable file under the workspace, and try again.
If 4xx/5xx or business error codes are returned: Based on the error message, prompt the user to check the text content, the speaker ID, and whether the account has activated the Doubao Speech Service.
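The two cases above can be turned into a small dispatch on the error text. This is only a sketch: matching on the substring `MODEL_SPEECH_API_KEY` is an assumption about how the error message is worded, and `suggest_next_step` is a hypothetical helper.

```python
def suggest_next_step(error_message):
    """Map a script error message to the guidance given in this section."""
    if "MODEL_SPEECH_API_KEY" in error_message:
        # Credential problem: send the user to the setup guide.
        return ("Configure MODEL_SPEECH_API_KEY "
                "(see references/setup-guide.md), then retry.")
    # Anything else (4xx/5xx, business error codes): check the inputs and account.
    return ("Check the text content, the speaker ID, and whether the "
            "Doubao Speech Service is activated for this account.")
```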

Troubleshooting

Missing credentials: Open `references/setup-guide.md`
Need to check API parameters, fields, or error codes: Open `references/docs-index.md`
If the script returns a permission error, first check whether the service is activated and the credentials are valid, then give the user clear step-by-step instructions.

Reference Materials

Open the following files as needed; there's no need to load all by default:

`references/setup-guide.md`: Service activation, credential application, environment variable configuration
`references/docs-index.md`: API documentation index, parameter descriptions, voice timbre list, quick reference for error codes

Examples

```bash
# Basic usage
python scripts/text_to_speech.py -t "Welcome to the VolcEngine Speech Synthesis Service."

# Specify speaker and output format
python scripts/text_to_speech.py -t "This is a test speech." -s zh_female_vv_uranus_bigtts -o output.mp3 --format mp3

# Specify speech rate and sample rate
python scripts/text_to_speech.py -t "Speech rate and pitch are adjustable." --speech-rate 10 --sample-rate 16000
```