alicloud-ai-audio-tts

Category: provider

Model Studio Qwen TTS

Critical model name

Use the recommended model:

```
qwen3-tts-flash
```

Prerequisites

Install the SDK (a venv is recommended to avoid PEP 668 restrictions):

```bash
python3 -m venv .venv
. .venv/bin/activate
python -m pip install dashscope
```

Set `DASHSCOPE_API_KEY` in your environment, or add `dashscope_api_key` to `~/.alibabacloud/credentials` (the environment variable takes precedence).
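The lookup order described above (environment first, credentials file second) can be sketched as follows. This is a minimal illustration, not part of the DashScope SDK: the helper name `resolve_api_key` is hypothetical, and it assumes the credentials file is INI-formatted with a `[default]` section.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(credentials_path: Optional[Path] = None) -> Optional[str]:
    """Return the key from the environment, else from the credentials file."""
    # The environment variable takes precedence over the file.
    env_key = os.environ.get("DASHSCOPE_API_KEY")
    if env_key:
        return env_key

    # Assumption: the file is INI-formatted with a [default] section.
    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file is skipped without raising
    if parser.has_option("default", "dashscope_api_key"):
        return parser.get("default", "dashscope_api_key").strip()
    return None
```

The returned value can be passed as `api_key=` to the SDK call; `None` means neither source provided a key.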

Normalized interface (tts.generate)

Request

`text` (string, required)
`voice` (string, required)
`language_type` (string, optional; default `Auto`)
`stream` (bool, optional; default `false`)

Response

`audio_url` (string, returned when `stream=false`)
`audio_base64_pcm` (string, returned when `stream=true`)
`sample_rate` (int, 24000)
`format` (string, `wav` or `pcm` depending on mode)
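One way to pin down the request and response shapes listed above is a pair of small dataclasses. This is only a sketch: the class names `TTSRequest` and `TTSResponse` and the helper `build_response` are hypothetical, and the pairing of `wav` with non-streaming and `pcm` with streaming is an assumption drawn from the field descriptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TTSRequest:
    text: str
    voice: str
    language_type: str = "Auto"
    stream: bool = False

    def __post_init__(self) -> None:
        # Both text and voice are required by the interface.
        if not self.text:
            raise ValueError("text is required")
        if not self.voice:
            raise ValueError("voice is required")


@dataclass(frozen=True)
class TTSResponse:
    audio_url: Optional[str] = None
    audio_base64_pcm: Optional[str] = None
    sample_rate: int = 24000
    format: str = "wav"


def build_response(request: TTSRequest, payload: str) -> TTSResponse:
    """Place the payload in the field that matches the request mode."""
    if request.stream:
        # Assumption: streaming mode carries Base64 PCM.
        return TTSResponse(audio_base64_pcm=payload, format="pcm")
    # Assumption: non-streaming mode carries a URL to a wav file.
    return TTSResponse(audio_url=payload, format="wav")
```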

Quick start (Python + DashScope SDK)

```python
import os
import dashscope

# Prefer env var for auth: export DASHSCOPE_API_KEY=...
# Or use ~/.alibabacloud/credentials with dashscope_api_key under [default].

# Beijing region; for Singapore use: https://dashscope-intl.aliyuncs.com/api/v1
dashscope.base_http_api_url = "https://dashscope.aliyuncs.com/api/v1"

text = "Hello, this is a short voice line."
response = dashscope.MultiModalConversation.call(
    model="qwen3-tts-flash",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    text=text,
    voice="Cherry",
    language_type="English",
    stream=False,
)

audio_url = response.output.audio.url
print(audio_url)
```

Streaming notes

`stream=True` returns Base64-encoded PCM chunks at 24 kHz.
Decode the chunks and play them, or concatenate them into a PCM buffer.
The response contains `finish_reason == "stop"` when the stream ends.
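The decode-and-concatenate step can be sketched with the standard library alone. The helper names below are hypothetical, and the mono, 16-bit sample layout is an assumption (only the 24 kHz rate is stated above); extracting the Base64 string from each SDK chunk is left out because the chunk layout is not described here.

```python
import base64
import io
import wave
from typing import Iterable

SAMPLE_RATE = 24000  # stated in the response description


def pcm_from_chunks(chunks: Iterable[str]) -> bytes:
    """Decode Base64 chunks and concatenate them into one PCM buffer."""
    return b"".join(base64.b64decode(chunk) for chunk in chunks if chunk)


def pcm_to_wav(pcm: bytes, sample_rate: int = SAMPLE_RATE,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM in a WAV container (assumes mono, 16-bit samples)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

Wrapping the buffer as WAV is optional; it only makes the result playable in tools that do not accept headerless PCM.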

Operational guidance

Keep requests concise; split long text into multiple calls if you hit size or timeout errors.
Use a `language_type` consistent with the text to improve pronunciation.
Cache by `(text, voice, language_type)` to avoid repeat costs.
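The caching and splitting advice can be illustrated with two small helpers. Both names are hypothetical, the `max_chars=300` default is an arbitrary placeholder rather than a documented service limit, and the splitter assumes sentences end with `.`, `!`, or `?` followed by whitespace.

```python
import hashlib
import json
import re
from typing import List


def cache_key(text: str, voice: str, language_type: str = "Auto") -> str:
    """Build a stable key from the (text, voice, language_type) tuple."""
    payload = json.dumps([text, voice, language_type], ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def split_text(text: str, max_chars: int = 300) -> List[str]:
    """Greedily pack sentences into pieces of up to max_chars characters.

    A single sentence longer than max_chars is kept whole.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pieces: List[str] = []
    current = ""
    for sentence in filter(None, sentences):
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            pieces.append(current)
            current = sentence
    if current:
        pieces.append(current)
    return pieces
```

Each piece returned by `split_text` would go out as its own call, and `cache_key` can name the stored audio so an identical request is served from disk.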

Output location

Default output: `output/ai-audio-tts/audio/`
Override the base directory with `OUTPUT_DIR`.
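A minimal sketch of resolving the output path follows. The function name is hypothetical, and it assumes `OUTPUT_DIR` replaces the leading `output` component while the `ai-audio-tts/audio` suffix stays fixed; the text above does not spell this out.

```python
import os
from pathlib import Path


def audio_output_dir() -> Path:
    """Return the audio output directory, honoring OUTPUT_DIR if set."""
    # Assumption: OUTPUT_DIR stands in for the default "output" base directory.
    base = Path(os.environ.get("OUTPUT_DIR", "output"))
    return base / "ai-audio-tts" / "audio"
```

Callers would typically run `audio_output_dir().mkdir(parents=True, exist_ok=True)` before writing a file.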

References

`references/api_reference.md` for parameter mapping and a streaming example.
Source list: `references/sources.md`
