alicloud-ai-audio-cosyvoice-voice-design

Category: provider

Model Studio CosyVoice Voice Design

Use the CosyVoice voice enrollment API to create designed voices from a natural-language voice description.

Critical model names

Use model="voice-enrollment" and one of these target_model values:
  • cosyvoice-v3.5-plus
  • cosyvoice-v3.5-flash
  • cosyvoice-v3-plus
  • cosyvoice-v3-flash
Recommended default in this repo:
  • target_model="cosyvoice-v3.5-plus"

Region and compatibility

  • cosyvoice-v3.5-plus and cosyvoice-v3.5-flash are available only in China mainland deployment mode (Beijing endpoint).
  • In international deployment mode (Singapore endpoint), cosyvoice-v3-plus and cosyvoice-v3-flash do not support voice clone/design.
  • The target_model must match the model used later for speech synthesis.
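
The constraints above can be checked before any request is sent. The following is a minimal sketch, not part of the DashScope SDK: the function names and the "cn"/"intl" region labels are hypothetical. Read literally, the notes rule out voice design for all four listed models on the Singapore endpoint, so the sketch rejects every "intl" combination.

```python
# Hypothetical pre-flight checks; names and region labels are illustrative.
V35_MODELS = {"cosyvoice-v3.5-plus", "cosyvoice-v3.5-flash"}
V3_MODELS = {"cosyvoice-v3-plus", "cosyvoice-v3-flash"}


def check_voice_design_support(target_model: str, region: str) -> None:
    """Raise ValueError if the model/region pair is ruled out by the notes above.

    region is "cn" (Beijing endpoint) or "intl" (Singapore endpoint).
    """
    if region not in ("cn", "intl"):
        raise ValueError(f"unknown region: {region!r}")
    if target_model not in V35_MODELS | V3_MODELS:
        raise ValueError(f"unsupported target_model: {target_model!r}")
    if region == "intl":
        if target_model in V35_MODELS:
            raise ValueError(f"{target_model} is available only on the Beijing endpoint")
        raise ValueError(
            f"{target_model} does not support voice clone/design on the Singapore endpoint"
        )


def check_synthesis_model(target_model: str, synthesis_model: str) -> None:
    """Raise ValueError if a designed voice would be used with a different model."""
    if target_model != synthesis_model:
        raise ValueError(
            f"voice was designed for {target_model}, not {synthesis_model}"
        )
```

Calling check_voice_design_support("cosyvoice-v3.5-plus", "cn") returns silently, while any call with region "intl" raises ValueError.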

Endpoint

  • Domestic: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
  • International: https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization
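
One way to keep the two URLs in a single place is a small lookup. This is a sketch only; the "cn" and "intl" keys and the function name are hypothetical labels, and only the URLs come from the list above.

```python
# Endpoint URLs as listed above; the "cn"/"intl" keys are illustrative labels.
ENDPOINTS = {
    "cn": "https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization",
    "intl": "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization",
}


def customization_endpoint(region: str = "cn") -> str:
    """Return the customization endpoint for a region label."""
    try:
        return ENDPOINTS[region]
    except KeyError:
        raise ValueError(f"unknown region: {region!r}") from None
```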

Prerequisites

  • Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
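
The lookup order described above could be implemented roughly as follows. This sketch assumes the credentials file is INI-formatted with at least one section header (for example [default]); that layout is an assumption, not something this document states. The function name and its parameters are hypothetical.

```python
import configparser
import os
from pathlib import Path


def resolve_api_key(env=None, credentials_path=None):
    """Return the API key from the environment, else from the credentials file.

    Returns None when neither source provides a key.
    """
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY")
    if key:
        return key

    path = (
        Path(credentials_path)
        if credentials_path
        else Path.home() / ".alibabacloud" / "credentials"
    )
    # Assumption: INI layout with section headers. Interpolation is disabled
    # so a "%" inside a key is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path, encoding="utf-8")  # a missing file is silently skipped
    for section in parser.sections():
        value = parser.get(section, "dashscope_api_key", fallback=None)
        if value:
            return value
    return None
```

The environment variable takes precedence here, which is a design choice of the sketch rather than documented behavior.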

Normalized interface (cosyvoice.voice_design)

Request

  • model (string, optional): fixed to voice-enrollment
  • target_model (string, optional): default cosyvoice-v3.5-plus
  • prefix (string, required): letters/digits only, max 10 chars
  • voice_prompt (string, required): max 500 chars, Chinese or English only
  • preview_text (string, required): max 200 chars, Chinese or English
  • language_hints (array[string], optional): zh or en; should match the language of preview_text
  • sample_rate (int, optional): e.g. 24000
  • response_format (string, optional): e.g. wav
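
The field rules above can be enforced when assembling the normalized request. The sketch below is illustrative and is not the repo's helper script: the function name is hypothetical, "letters/digits" is assumed to mean ASCII letters and digits, and lengths are counted in Unicode code points. The resulting dict follows the normalized interface described here, which may differ from the raw DashScope wire format.

```python
import re


def build_voice_design_request(
    prefix,
    voice_prompt,
    preview_text,
    target_model="cosyvoice-v3.5-plus",
    language_hints=None,
    sample_rate=None,
    response_format=None,
):
    """Validate the inputs and return a normalized request dict."""
    # Assumption: "letters/digits only" means ASCII letters and digits.
    if not re.fullmatch(r"[A-Za-z0-9]{1,10}", prefix):
        raise ValueError("prefix must be 1-10 letters or digits")
    if not 0 < len(voice_prompt) <= 500:
        raise ValueError("voice_prompt must be 1-500 characters")
    if not 0 < len(preview_text) <= 200:
        raise ValueError("preview_text must be 1-200 characters")
    if language_hints is not None:
        bad = [hint for hint in language_hints if hint not in ("zh", "en")]
        if bad:
            raise ValueError(f"unsupported language_hints: {bad}")

    request = {
        "model": "voice-enrollment",  # fixed value
        "target_model": target_model,
        "prefix": prefix,
        "voice_prompt": voice_prompt,
        "preview_text": preview_text,
    }
    # Optional fields are included only when the caller supplies them.
    if language_hints is not None:
        request["language_hints"] = list(language_hints)
    if sample_rate is not None:
        request["sample_rate"] = int(sample_rate)
    if response_format is not None:
        request["response_format"] = response_format
    return request
```

For example, build_voice_design_request("announcer", "Calm middle-aged male newsreader", "Good evening.", language_hints=["en"]) returns a dict with model set to voice-enrollment and the default target_model.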

Response

  • voice_id (string)
  • request_id (string)
  • status (string, optional)
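
A typed container makes the normalized response easier to pass around. The sketch below assumes the payload has already been reduced to the three fields listed above; the class and function names are hypothetical, and the raw API response may nest these values differently.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class VoiceDesignResult:
    voice_id: str
    request_id: str
    status: Optional[str] = None


def parse_voice_design_response(payload: Mapping[str, Any]) -> VoiceDesignResult:
    """Build a VoiceDesignResult, rejecting payloads without the required fields."""
    missing = [key for key in ("voice_id", "request_id") if not payload.get(key)]
    if missing:
        raise ValueError(f"response is missing required fields: {missing}")
    return VoiceDesignResult(
        voice_id=str(payload["voice_id"]),
        request_id=str(payload["request_id"]),
        status=payload.get("status"),
    )
```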

Operational guidance

  • Keep voice_prompt concrete: timbre, age range, pace, emotion, articulation, and scenario.
  • If language_hints is used, it should match the language of preview_text.
  • The backend's generated names for designed voices include a -vd- marker.
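
Two of these points lend themselves to quick checks. The sketch below uses a deliberately crude heuristic (any CJK Unified Ideograph means "zh", otherwise "en") to suggest a hint that matches preview_text, and a substring test for the -vd- marker. Both function names are hypothetical, and neither check is part of the API.

```python
def guess_language_hint(preview_text: str) -> str:
    """Return "zh" if the text contains a CJK Unified Ideograph, else "en".

    Heuristic only: mixed-language text is treated as "zh".
    """
    has_cjk = any("\u4e00" <= ch <= "\u9fff" for ch in preview_text)
    return "zh" if has_cjk else "en"


def looks_like_designed_voice(voice_name: str) -> bool:
    """Return True when the name carries the -vd- marker described above."""
    return "-vd-" in voice_name
```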

Local helper script

Prepare a normalized request JSON:
bash
python skills/ai/audio/alicloud-ai-audio-cosyvoice-voice-design/scripts/prepare_cosyvoice_design_request.py \
  --target-model cosyvoice-v3.5-plus \
  --prefix announcer \
  --voice-prompt "沉稳的中年男性播音员,低沉有磁性,语速平稳,吐字清晰。" \
  --preview-text "各位听众朋友,大家好,欢迎收听晚间新闻。" \
  --language-hint zh

Validation

bash
# Stop at the first compile failure so validate.txt is written only on success.
set -e
mkdir -p output/alicloud-ai-audio-cosyvoice-voice-design
for f in skills/ai/audio/alicloud-ai-audio-cosyvoice-voice-design/scripts/*.py; do
  python3 -m py_compile "$f"
done
echo "py_compile_ok" > output/alicloud-ai-audio-cosyvoice-voice-design/validate.txt
Pass criteria: the command exits 0 and output/alicloud-ai-audio-cosyvoice-voice-design/validate.txt is generated.

Output And Evidence

  • Save artifacts, command outputs, and API response summaries under output/alicloud-ai-audio-cosyvoice-voice-design/.
  • Include target_model, prefix, voice_prompt, and preview_text in the evidence file.
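
An evidence file meeting these requirements could be written as sketched below. The evidence.json file name, the function name, and the "response" key are assumptions made for illustration; only the output directory and the four required fields come from the list above.

```python
import json
from pathlib import Path

DEFAULT_OUTPUT_DIR = Path("output/alicloud-ai-audio-cosyvoice-voice-design")
REQUIRED_FIELDS = ("target_model", "prefix", "voice_prompt", "preview_text")


def write_evidence(request, response_summary, output_dir=DEFAULT_OUTPUT_DIR):
    """Write the required request fields and a response summary as JSON."""
    missing = [field for field in REQUIRED_FIELDS if field not in request]
    if missing:
        raise ValueError(f"request is missing evidence fields: {missing}")

    evidence = {field: request[field] for field in REQUIRED_FIELDS}
    evidence["response"] = dict(response_summary)  # hypothetical key name

    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / "evidence.json"  # hypothetical file name
    # ensure_ascii=False keeps Chinese prompts readable in the file.
    path.write_text(
        json.dumps(evidence, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return path
```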

References

  • references/api_reference.md
  • references/sources.md