alicloud-ai-audio-cosyvoice-voice-design

Category: provider

Model Studio CosyVoice Voice Design

Use the CosyVoice voice enrollment API to create designed voices from a natural-language voice description.

Critical model names

Use `model="voice-enrollment"` and one of these `target_model` values:

- `cosyvoice-v3.5-plus`
- `cosyvoice-v3.5-flash`
- `cosyvoice-v3-plus`
- `cosyvoice-v3-flash`

Recommended default in this repo: `target_model="cosyvoice-v3.5-plus"`

Region and compatibility

- `cosyvoice-v3.5-plus` and `cosyvoice-v3.5-flash` are available only in China mainland deployment mode (Beijing endpoint).
- In international deployment mode (Singapore endpoint), `cosyvoice-v3-plus` and `cosyvoice-v3-flash` do not support voice clone/design.
- The `target_model` must match the later speech synthesis model.
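
The rules above can be checked before a request is sent. The following is a minimal sketch, not part of this skill's scripts: the function name, the mode labels `domestic` and `international`, and the table layout are hypothetical. It also assumes that `cosyvoice-v3-plus` and `cosyvoice-v3-flash` support voice design in China mainland mode, which the text implies but does not state outright.

```python
# Hypothetical helper: encodes the region and compatibility rules as a lookup table.
DESIGN_MODELS_BY_MODE = {
    # China mainland deployment mode (Beijing endpoint).
    # Assumption: all four listed target models can be used for voice design here.
    "domestic": {
        "cosyvoice-v3.5-plus",
        "cosyvoice-v3.5-flash",
        "cosyvoice-v3-plus",
        "cosyvoice-v3-flash",
    },
    # International deployment mode (Singapore endpoint): the v3.5 models are
    # unavailable and the v3 models do not support voice clone/design.
    "international": set(),
}


def check_target_model(mode: str, target_model: str, synthesis_model: str) -> None:
    """Raise ValueError if the combination violates the documented rules."""
    if mode not in DESIGN_MODELS_BY_MODE:
        raise ValueError(f"unknown deployment mode: {mode!r}")
    if target_model not in DESIGN_MODELS_BY_MODE[mode]:
        raise ValueError(
            f"{target_model!r} cannot be used for voice design in {mode} mode"
        )
    if target_model != synthesis_model:
        raise ValueError(
            "target_model must match the model used later for speech synthesis"
        )
```

Calling `check_target_model("domestic", "cosyvoice-v3.5-plus", "cosyvoice-v3.5-plus")` returns without error, while any call in `international` mode raises `ValueError`.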

Endpoint

Domestic:

https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization

International:

https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization
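
One way to keep the two URLs in a single place is a small lookup. This is a sketch only: the mode labels and the function name are hypothetical, and only the URLs come from this document.

```python
# Endpoint URLs as listed above; the dictionary keys are illustrative labels.
CUSTOMIZATION_ENDPOINTS = {
    "domestic": "https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization",
    "international": "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization",
}


def customization_endpoint(mode: str) -> str:
    """Return the customization endpoint for a deployment mode."""
    try:
        return CUSTOMIZATION_ENDPOINTS[mode]
    except KeyError:
        raise ValueError(f"unknown deployment mode: {mode!r}") from None
```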

Prerequisites

Set `DASHSCOPE_API_KEY` in your environment, or add `dashscope_api_key` to `~/.alibabacloud/credentials`.
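
A script can fail early when the key is missing. The sketch below reads only the environment variable; it does not parse `~/.alibabacloud/credentials`, because the file's layout is not described here. The function names are hypothetical, and the `Authorization: Bearer` header form is an assumption to confirm against the API reference.

```python
import os
from typing import Dict, Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Read DASHSCOPE_API_KEY from the given mapping (defaults to os.environ)."""
    source = os.environ if env is None else env
    key = source.get("DASHSCOPE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DASHSCOPE_API_KEY is not set; export it or add dashscope_api_key "
            "to ~/.alibabacloud/credentials"
        )
    return key


def auth_headers(api_key: str) -> Dict[str, str]:
    """Build request headers (assumed bearer-token scheme)."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```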

Normalized interface (cosyvoice.voice_design)

Request

- `model` (string, optional): fixed to `voice-enrollment`
- `target_model` (string, optional): default `cosyvoice-v3.5-plus`
- `prefix` (string, required): letters/digits only, max 10 chars
- `voice_prompt` (string, required): max 500 chars, Chinese or English only
- `preview_text` (string, required): max 200 chars, Chinese or English
- `language_hints` (array[string], optional): `zh` or `en`, and should match `preview_text`
- `sample_rate` (int, optional): e.g. `24000`
- `response_format` (string, optional): e.g. `wav`
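
The field constraints above can be enforced locally before any network call. The following is a minimal sketch with a hypothetical function name. It assumes "letters/digits" means ASCII letters and digits, and it produces the flat normalized shape described here, which may differ from the wire format the API expects.

```python
import re
from typing import Any, Dict, List, Optional

ALLOWED_LANGUAGE_HINTS = {"zh", "en"}


def build_design_request(
    prefix: str,
    voice_prompt: str,
    preview_text: str,
    target_model: str = "cosyvoice-v3.5-plus",
    language_hints: Optional[List[str]] = None,
    sample_rate: Optional[int] = None,
    response_format: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate the documented limits and return a normalized request dict."""
    # Assumption: only ASCII letters and digits are accepted in the prefix.
    if not re.fullmatch(r"[A-Za-z0-9]{1,10}", prefix):
        raise ValueError("prefix must be 1-10 letters or digits")
    if not voice_prompt or len(voice_prompt) > 500:
        raise ValueError("voice_prompt is required and limited to 500 chars")
    if not preview_text or len(preview_text) > 200:
        raise ValueError("preview_text is required and limited to 200 chars")

    request: Dict[str, Any] = {
        "model": "voice-enrollment",  # fixed value
        "target_model": target_model,
        "prefix": prefix,
        "voice_prompt": voice_prompt,
        "preview_text": preview_text,
    }
    if language_hints is not None:
        unknown = set(language_hints) - ALLOWED_LANGUAGE_HINTS
        if unknown:
            raise ValueError(f"unsupported language_hints: {sorted(unknown)}")
        request["language_hints"] = list(language_hints)
    # Optional fields are included only when the caller sets them.
    if sample_rate is not None:
        request["sample_rate"] = sample_rate
    if response_format is not None:
        request["response_format"] = response_format
    return request
```

Length limits are counted in Python characters (code points); whether the service counts characters the same way is not stated in this document.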

Response

- `voice_id` (string)
- `request_id` (string)
- `status` (string, optional)
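
A typed container makes the normalized response easier to pass around. This is a sketch under the assumption that the payload has already been flattened to the three fields listed above; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class VoiceDesignResult:
    voice_id: str
    request_id: str
    status: Optional[str] = None


def parse_design_response(payload: Mapping[str, Any]) -> VoiceDesignResult:
    """Check the two required string fields and keep status if present."""
    missing = [
        key for key in ("voice_id", "request_id")
        if not isinstance(payload.get(key), str)
    ]
    if missing:
        raise ValueError(f"response is missing string fields: {missing}")
    return VoiceDesignResult(
        voice_id=payload["voice_id"],
        request_id=payload["request_id"],
        status=payload.get("status"),
    )
```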

Operational guidance

- Keep `voice_prompt` concrete: timbre, age range, pace, emotion, articulation, and scenario.
- If `language_hints` is used, it should match the language of `preview_text`.
- Designed voice names include a `-vd-` marker in the generated backend naming convention.
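
As an illustration of this guidance, a prompt can be assembled from the six attributes so that none is forgotten, and a returned voice name can be checked for the `-vd-` marker. Both helpers are hypothetical, and the comma-joined prompt format is only one possible phrasing, not a required one.

```python
def compose_voice_prompt(
    timbre: str,
    age_range: str,
    pace: str,
    emotion: str,
    articulation: str,
    scenario: str,
) -> str:
    """Join the six recommended attributes into one concrete description."""
    parts = [timbre, age_range, pace, emotion, articulation, scenario]
    if any(not part.strip() for part in parts):
        raise ValueError("every attribute should be filled in")
    prompt = ", ".join(part.strip() for part in parts)
    if len(prompt) > 500:
        raise ValueError("voice_prompt is limited to 500 chars")
    return prompt


def has_design_marker(voice_name: str) -> bool:
    """True if the name carries the '-vd-' marker used for designed voices."""
    return "-vd-" in voice_name
```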

Local helper script

Prepare a normalized request JSON:

bash

python skills/ai/audio/alicloud-ai-audio-cosyvoice-voice-design/scripts/prepare_cosyvoice_design_request.py \
  --target-model cosyvoice-v3.5-plus \
  --prefix announcer \
  --voice-prompt "沉稳的中年男性播音员，低沉有磁性，语速平稳，吐字清晰。" \
  --preview-text "各位听众朋友，大家好，欢迎收听晚间新闻。" \
  --language-hint zh

Validation

bash

mkdir -p output/alicloud-ai-audio-cosyvoice-voice-design
for f in skills/ai/audio/alicloud-ai-audio-cosyvoice-voice-design/scripts/*.py; do
  python3 -m py_compile "$f"
done
echo "py_compile_ok" > output/alicloud-ai-audio-cosyvoice-voice-design/validate.txt

Pass criteria: command exits 0 and `output/alicloud-ai-audio-cosyvoice-voice-design/validate.txt` is generated.
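
Where a shell is not available, the same check can be approximated in Python with the standard-library `py_compile` module. This is a sketch of an equivalent, not a replacement for the command above; the function name is hypothetical.

```python
import py_compile
from pathlib import Path


def compile_scripts(scripts_dir: Path, marker: Path) -> int:
    """Byte-compile every *.py file in scripts_dir, then write the marker file.

    Raises py_compile.PyCompileError on the first file that fails to compile,
    in which case the marker is not written.
    """
    compiled = 0
    for script in sorted(scripts_dir.glob("*.py")):
        py_compile.compile(str(script), doraise=True)
        compiled += 1
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text("py_compile_ok\n", encoding="utf-8")
    return compiled
```

For this skill, `scripts_dir` would be `skills/ai/audio/alicloud-ai-audio-cosyvoice-voice-design/scripts` and `marker` would be `output/alicloud-ai-audio-cosyvoice-voice-design/validate.txt`.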

Output And Evidence

Save artifacts, command outputs, and API response summaries under `output/alicloud-ai-audio-cosyvoice-voice-design/`.

Include `target_model`, `prefix`, `voice_prompt`, and `preview_text` in the evidence file.
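
A small writer can make sure the four required fields are always present. The sketch below is illustrative: the file name `evidence.json`, the JSON layout, and the function name are assumptions, since this document does not prescribe a format for the evidence file.

```python
import json
from pathlib import Path
from typing import Any, Dict, Mapping

REQUIRED_EVIDENCE_FIELDS = ("target_model", "prefix", "voice_prompt", "preview_text")


def write_evidence(
    request: Mapping[str, Any],
    response_summary: Mapping[str, Any],
    out_dir: Path,
    filename: str = "evidence.json",  # hypothetical file name
) -> Path:
    """Write the required request fields plus a response summary as JSON."""
    missing = [key for key in REQUIRED_EVIDENCE_FIELDS if key not in request]
    if missing:
        raise ValueError(f"request is missing evidence fields: {missing}")
    evidence: Dict[str, Any] = {key: request[key] for key in REQUIRED_EVIDENCE_FIELDS}
    evidence["response_summary"] = dict(response_summary)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / filename
    # ensure_ascii=False keeps Chinese prompt text readable in the file.
    path.write_text(json.dumps(evidence, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```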

References

- `references/api_reference.md`
- `references/sources.md`
