aliyun-cosyvoice-voice-design

Category: provider

Model Studio CosyVoice Voice Design

Use the CosyVoice voice enrollment API to create designed voices from a natural-language voice description.

Critical model names

Use `model="voice-enrollment"` and one of these `target_model` values:

```
cosyvoice-v3.5-plus
cosyvoice-v3.5-flash
cosyvoice-v3-plus
cosyvoice-v3-flash
```

Recommended default in this repo:

```
target_model="cosyvoice-v3.5-plus"
```

Region and compatibility

- `cosyvoice-v3.5-plus` and `cosyvoice-v3.5-flash` are available only in China mainland deployment mode (Beijing endpoint).
- In international deployment mode (Singapore endpoint), `cosyvoice-v3-plus` and `cosyvoice-v3-flash` do not support voice cloning or voice design.
- The `target_model` must match the model used later for speech synthesis.

Endpoint

Domestic:

https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization

International:

https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization
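
The endpoint and compatibility rules above can be sketched as a small lookup. This is a minimal sketch that encodes only what this document states; the `deployment` labels (`"domestic"`, `"international"`) and the function name are hypothetical, not part of the API.

```python
# Endpoints as listed in this document.
ENDPOINTS = {
    "domestic": "https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization",
    "international": "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization",
}

# Model groups as described under "Region and compatibility".
MAINLAND_ONLY_MODELS = {"cosyvoice-v3.5-plus", "cosyvoice-v3.5-flash"}
V3_MODELS = {"cosyvoice-v3-plus", "cosyvoice-v3-flash"}


def resolve_voice_design_endpoint(deployment: str, target_model: str) -> str:
    """Return the endpoint for a voice design call, or raise if unsupported.

    `deployment` is a hypothetical label: "domestic" or "international".
    """
    if deployment not in ENDPOINTS:
        raise ValueError(f"unknown deployment: {deployment}")
    if target_model not in MAINLAND_ONLY_MODELS | V3_MODELS:
        raise ValueError(f"unknown target_model: {target_model}")
    if deployment == "international":
        if target_model in MAINLAND_ONLY_MODELS:
            raise ValueError(f"{target_model} is available only on the Beijing endpoint")
        # Per this document, v3 models do not support voice design here.
        raise ValueError(f"{target_model} does not support voice design on the Singapore endpoint")
    return ENDPOINTS[deployment]
```

Failing early on an unsupported pairing avoids enrolling a voice against a model that cannot be used for the later synthesis call.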

Prerequisites

Set `DASHSCOPE_API_KEY` in your environment, or add `dashscope_api_key` to `~/.alibabacloud/credentials`.
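
A minimal sketch of that lookup order follows. The format of `~/.alibabacloud/credentials` is not specified in this document; the sketch assumes an INI-style file with at least one section header (for example `[default]`), and the function name is hypothetical.

```python
import configparser
import os
from pathlib import Path
from typing import Mapping, Optional


def load_api_key(
    env: Mapping[str, str] = os.environ,
    credentials_path: Path = Path.home() / ".alibabacloud" / "credentials",
) -> Optional[str]:
    """Return the API key from the environment, else from the credentials file."""
    key = env.get("DASHSCOPE_API_KEY")
    if key:
        return key
    if not credentials_path.is_file():
        return None
    # Assumption: the file is INI-style; interpolation is disabled so that
    # a '%' character inside a key is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(credentials_path, encoding="utf-8")
    for section in parser.sections():
        value = parser.get(section, "dashscope_api_key", fallback=None)
        if value:
            return value
    return None
```

The environment variable is checked first so that a per-shell override wins over the shared credentials file.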

Normalized interface (cosyvoice.voice_design)

Request

- `model` (string, optional): fixed to `voice-enrollment`
- `target_model` (string, optional): default `cosyvoice-v3.5-plus`
- `prefix` (string, required): letters/digits only, max 10 chars
- `voice_prompt` (string, required): max 500 chars, Chinese or English only
- `preview_text` (string, required): max 200 chars, Chinese or English
- `language_hints` (array[string], optional): `zh` or `en`, and should match `preview_text`
- `sample_rate` (int, optional): e.g. `24000`
- `response_format` (string, optional): e.g. `wav`
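
The request constraints above can be checked before anything is sent. The following is a sketch only: the function name is hypothetical, "letters/digits" is assumed to mean ASCII letters and digits, and length limits are assumed to count Unicode code points. It does not verify that the text is Chinese or English.

```python
import re
from typing import Any, Dict, List, Optional

ALLOWED_LANGUAGE_HINTS = {"zh", "en"}


def build_voice_design_request(
    prefix: str,
    voice_prompt: str,
    preview_text: str,
    target_model: str = "cosyvoice-v3.5-plus",
    language_hints: Optional[List[str]] = None,
    sample_rate: Optional[int] = None,
    response_format: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate the documented limits and return a normalized request dict."""
    # Assumption: "letters/digits only" means ASCII letters and digits.
    if not re.fullmatch(r"[A-Za-z0-9]{1,10}", prefix):
        raise ValueError("prefix must be 1-10 letters or digits")
    if not 0 < len(voice_prompt) <= 500:
        raise ValueError("voice_prompt must be 1-500 characters")
    if not 0 < len(preview_text) <= 200:
        raise ValueError("preview_text must be 1-200 characters")
    if language_hints is not None:
        invalid = [hint for hint in language_hints if hint not in ALLOWED_LANGUAGE_HINTS]
        if invalid:
            raise ValueError(f"unsupported language_hints: {invalid}")

    request: Dict[str, Any] = {
        "model": "voice-enrollment",  # fixed value
        "target_model": target_model,
        "prefix": prefix,
        "voice_prompt": voice_prompt,
        "preview_text": preview_text,
    }
    # Optional fields are included only when the caller sets them.
    if language_hints is not None:
        request["language_hints"] = list(language_hints)
    if sample_rate is not None:
        request["sample_rate"] = sample_rate
    if response_format is not None:
        request["response_format"] = response_format
    return request
```

For example, `build_voice_design_request("announcer", "Calm middle-aged male news anchor.", "Good evening, and welcome.", language_hints=["en"])` returns a dict with the five core fields plus `language_hints`.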

Response

- `voice_id` (string)
- `request_id` (string)
- `status` (string, optional)
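
One way to consume the normalized response is a small typed container. This is an illustrative sketch; the class and function names are hypothetical, and it covers only the three fields listed above, not the raw API payload.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class VoiceDesignResult:
    voice_id: str
    request_id: str
    status: Optional[str] = None  # optional in the normalized response


def parse_voice_design_response(payload: Mapping[str, Any]) -> VoiceDesignResult:
    """Build a VoiceDesignResult, raising if a required field is absent or empty."""
    missing = [name for name in ("voice_id", "request_id") if not payload.get(name)]
    if missing:
        raise ValueError(f"response is missing required fields: {', '.join(missing)}")
    return VoiceDesignResult(
        voice_id=str(payload["voice_id"]),
        request_id=str(payload["request_id"]),
        status=payload.get("status"),
    )
```

Treating `status` as optional mirrors the field list above, so a response without it still parses.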

Operational guidance

- Keep `voice_prompt` concrete: timbre, age range, pace, emotion, articulation, and scenario.
- If `language_hints` is used, it should match the language of `preview_text`.
- Designed voice names include a `-vd-` marker in the generated backend naming convention.
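
Two of these points lend themselves to quick local checks. The sketch below is a heuristic, not part of the API: it guesses a language hint by looking for CJK Unified Ideographs in `preview_text`, and it tests a voice name for the `-vd-` marker. Both function names are hypothetical.

```python
def suggest_language_hint(preview_text: str) -> str:
    """Return "zh" if the text contains any CJK Unified Ideograph, else "en".

    Heuristic only: mixed-language text is classified as "zh".
    """
    has_cjk = any("\u4e00" <= ch <= "\u9fff" for ch in preview_text)
    return "zh" if has_cjk else "en"


def has_designed_voice_marker(voice_name: str) -> bool:
    """Check for the `-vd-` marker that designed voice names include."""
    return "-vd-" in voice_name
```

A caller could compare `suggest_language_hint(preview_text)` against the chosen `language_hints` value and warn on a mismatch before submitting the request.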

Local helper script

Prepare a normalized request JSON:

```bash
python skills/ai/audio/aliyun-cosyvoice-voice-design/scripts/prepare_cosyvoice_design_request.py \
  --target-model cosyvoice-v3.5-plus \
  --prefix announcer \
  --voice-prompt "沉稳的中年男性播音员，低沉有磁性，语速平稳，吐字清晰。" \
  --preview-text "各位听众朋友，大家好，欢迎收听晚间新闻。" \
  --language-hint zh
```

Validation

```bash
mkdir -p output/aliyun-cosyvoice-voice-design
for f in skills/ai/audio/aliyun-cosyvoice-voice-design/scripts/*.py; do
  python3 -m py_compile "$f"
done
echo "py_compile_ok" > output/aliyun-cosyvoice-voice-design/validate.txt
```

Pass criteria: command exits 0 and `output/aliyun-cosyvoice-voice-design/validate.txt` is generated.

Output And Evidence

Save artifacts, command outputs, and API response summaries under `output/aliyun-cosyvoice-voice-design/`.

Include `target_model`, `prefix`, `voice_prompt`, and `preview_text` in the evidence file.
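
A sketch of writing such an evidence file is shown below. The file name `evidence.json`, the JSON layout, and the `response_summary` key are assumptions for illustration; this document only fixes the output directory and the four required fields.

```python
import json
from pathlib import Path
from typing import Any, Mapping

REQUIRED_EVIDENCE_FIELDS = ("target_model", "prefix", "voice_prompt", "preview_text")


def write_evidence(
    request: Mapping[str, Any],
    response_summary: Mapping[str, Any],
    output_dir: Path = Path("output/aliyun-cosyvoice-voice-design"),
    filename: str = "evidence.json",  # hypothetical file name
) -> Path:
    """Write the required request fields and a response summary as JSON."""
    missing = [name for name in REQUIRED_EVIDENCE_FIELDS if name not in request]
    if missing:
        raise ValueError(f"request is missing evidence fields: {', '.join(missing)}")
    output_dir.mkdir(parents=True, exist_ok=True)
    evidence = {name: request[name] for name in REQUIRED_EVIDENCE_FIELDS}
    evidence["response_summary"] = dict(response_summary)
    path = output_dir / filename
    # ensure_ascii=False keeps Chinese prompts readable in the file.
    path.write_text(json.dumps(evidence, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```

Writing with `ensure_ascii=False` keeps a Chinese `voice_prompt` or `preview_text` human-readable when the evidence is reviewed later.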

References

- `references/api_reference.md`
- `references/sources.md`
