aliyun-qwen-tts-realtime

Category: provider

Model Studio Qwen TTS Realtime

Use realtime TTS models for low-latency streaming speech output.

Critical model names

Use one of these exact model strings:
  • qwen3-tts-flash-realtime
  • qwen3-tts-instruct-flash-realtime
  • qwen3-tts-instruct-flash-realtime-2026-01-22
  • qwen3-tts-vd-realtime-2026-01-15
  • qwen3-tts-vc-realtime-2026-01-15
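
Because the strings must match exactly, a small guard can catch typos before a request is sent. The following is a minimal sketch; `SUPPORTED_MODELS` and `require_supported_model` are hypothetical names for illustration, not part of the DashScope SDK.

```python
# The five model strings listed above, copied verbatim.
SUPPORTED_MODELS = frozenset({
    "qwen3-tts-flash-realtime",
    "qwen3-tts-instruct-flash-realtime",
    "qwen3-tts-instruct-flash-realtime-2026-01-22",
    "qwen3-tts-vd-realtime-2026-01-15",
    "qwen3-tts-vc-realtime-2026-01-15",
})


def require_supported_model(model: str) -> str:
    """Return `model` unchanged, or raise ValueError if it is not an exact match."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unknown realtime TTS model: {model!r}")
    return model
```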

Prerequisites

  • Install the SDK in a virtual environment:
```bash
python3 -m venv .venv
. .venv/bin/activate
python -m pip install dashscope
```
  • Set `DASHSCOPE_API_KEY` in your environment, or add `dashscope_api_key` to `~/.alibabacloud/credentials`.
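
The lookup order described above (environment variable first, then the credentials file) might be sketched as follows. `resolve_api_key` is a hypothetical helper, and the INI-style layout of the credentials file is an assumption; this document does not specify the file format, so adjust the parsing if your file differs.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(credentials_path: Optional[Path] = None) -> Optional[str]:
    """Return the API key from the environment, else from the credentials file."""
    key = os.environ.get("DASHSCOPE_API_KEY")
    if key:
        return key

    if credentials_path is None:
        credentials_path = Path.home() / ".alibabacloud" / "credentials"
    if not credentials_path.is_file():
        return None

    # Assumption: the file is INI-style, e.g. a [default] section that
    # contains a `dashscope_api_key = ...` line.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read(credentials_path, encoding="utf-8")
    except configparser.Error:
        return None  # not parseable as INI; treat as "no key found"

    for section in parser.sections():
        value = parser.get(section, "dashscope_api_key", fallback=None)
        if value:
            return value.strip()
    return None
```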

Normalized interface (tts.realtime)

Request

  • `text` (string, required)
  • `voice` (string, required)
  • `instruction` (string, optional)
  • `sample_rate` (int, optional)
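
One way to hold these fields in code is a small validated dataclass. This is only a sketch of the normalized request shape: `RealtimeTtsRequest` and `to_payload` are hypothetical names, and the positive-integer check on `sample_rate` is an assumption rather than a documented constraint.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class RealtimeTtsRequest:
    text: str                          # required
    voice: str                         # required
    instruction: Optional[str] = None  # optional
    sample_rate: Optional[int] = None  # optional

    def __post_init__(self) -> None:
        if not self.text.strip():
            raise ValueError("text is required")
        if not self.voice.strip():
            raise ValueError("voice is required")
        # Assumption: a sample rate, when given, must be positive.
        if self.sample_rate is not None and self.sample_rate <= 0:
            raise ValueError("sample_rate must be a positive integer")

    def to_payload(self) -> Dict[str, Any]:
        """Return a dict containing only the fields that were set."""
        payload: Dict[str, Any] = {"text": self.text, "voice": self.voice}
        if self.instruction is not None:
            payload["instruction"] = self.instruction
        if self.sample_rate is not None:
            payload["sample_rate"] = self.sample_rate
        return payload
```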

Response

  • `audio_base64_pcm_chunks` (array<string>)
  • `sample_rate` (int)
  • `finish_reason` (string)
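
To listen to the result, the base64 chunks can be decoded and wrapped in a WAV container using only the standard library. The sketch below assumes 16-bit mono PCM, which this document does not state; check the channel count and sample width for your model before relying on it. `write_wav` is a hypothetical helper name.

```python
import base64
import wave
from typing import Iterable


def write_wav(chunks: Iterable[str], sample_rate: int, path: str) -> int:
    """Decode base64 PCM chunks into one WAV file; return the PCM byte count."""
    pcm = b"".join(base64.b64decode(chunk) for chunk in chunks)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)  # assumption: mono audio
        wav.setsampwidth(2)  # assumption: 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return len(pcm)
```

Pass `audio_base64_pcm_chunks` and `sample_rate` from the response directly, for example `write_wav(resp["audio_base64_pcm_chunks"], resp["sample_rate"], "out.wav")`, where `resp` stands for a response dict in the normalized shape above.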

Operational guidance

  • Use a WebSocket or streaming endpoint for realtime mode.
  • Keep each utterance short to reduce latency.
  • For instruction models, keep the instruction explicit and concise.
  • Some SDK/runtime combinations may reject realtime model calls made through `MultiModalConversation`; use the probe script below to verify compatibility.
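
Keeping utterances short can be done by splitting long input at sentence boundaries before synthesis. The following is a rough sketch: `split_utterances` and the 120-character default are illustrative choices, not limits from the service.

```python
import re
from typing import List

# Split after Latin sentence punctuation followed by whitespace, or directly
# after full-width CJK sentence punctuation (which is not followed by a space).
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+|(?<=[。!?])")


def split_utterances(text: str, max_chars: int = 120) -> List[str]:
    """Pack whole sentences into chunks of at most max_chars where possible."""
    sentences = [s.strip() for s in _SENTENCE_BOUNDARY.split(text) if s.strip()]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    # Note: a single sentence longer than max_chars is kept whole.
    return chunks
```

Each returned chunk can then be sent as its own `text` value, so playback of the first chunk can begin while later chunks are still being synthesized.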

Local demo script

Use the probe script to verify realtime compatibility in your current SDK/runtime, and optionally fall back to a non-realtime model for immediate output:
```bash
.venv/bin/python skills/ai/audio/aliyun-qwen-tts-realtime/scripts/realtime_tts_demo.py \
  --text "This is a realtime speech demo." \
  --fallback \
  --output output/ai-audio-tts-realtime/audio/fallback-demo.wav
```
Strict mode (for CI / gating):
```bash
.venv/bin/python skills/ai/audio/aliyun-qwen-tts-realtime/scripts/realtime_tts_demo.py \
  --text "realtime health check" \
  --strict
```
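
The probe-then-fall-back pattern these flags suggest can be illustrated in a few lines. This is not the implementation of `realtime_tts_demo.py`; it is a sketch that assumes `--fallback` means "use a non-realtime path when the realtime call fails" and `--strict` means "treat any realtime failure as an error". `probe` and the `Synthesize` callables are hypothetical stand-ins for real SDK calls.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signature: takes the text, returns raw audio bytes.
Synthesize = Callable[[str], bytes]


def probe(
    text: str,
    realtime: Synthesize,
    fallback: Optional[Synthesize] = None,
    strict: bool = False,
) -> Tuple[str, bytes]:
    """Try the realtime path first; return (path_used, audio_bytes)."""
    try:
        return "realtime", realtime(text)
    except Exception as exc:
        # In strict mode, or with no fallback configured, surface the failure
        # so a CI job can fail on it.
        if strict or fallback is None:
            raise RuntimeError(f"realtime probe failed: {exc}") from exc
        return "fallback", fallback(text)
```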

Output location

  • Default output: `output/ai-audio-tts-realtime/audio/`
  • Override the base directory with `OUTPUT_DIR`.
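
A script might resolve this location as sketched below. The interpretation that `OUTPUT_DIR` replaces only the leading `output` component is an assumption; the document says it overrides the base directory without defining the exact layout. `resolve_output_dir` is a hypothetical name.

```python
import os
from pathlib import Path


def resolve_output_dir() -> Path:
    """Return the audio output directory, honoring OUTPUT_DIR if it is set."""
    # Assumption: OUTPUT_DIR stands in for the default "output" base directory.
    base = Path(os.environ.get("OUTPUT_DIR", "output"))
    return base / "ai-audio-tts-realtime" / "audio"
```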

Validation

```bash
set -e  # stop at the first script that fails to compile
mkdir -p output/aliyun-qwen-tts-realtime
for f in skills/ai/audio/aliyun-qwen-tts-realtime/scripts/*.py; do
  python3 -m py_compile "$f"
done
echo "py_compile_ok" > output/aliyun-qwen-tts-realtime/validate.txt
```
Pass criteria: the command exits 0 and `output/aliyun-qwen-tts-realtime/validate.txt` is generated.
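
The same check can be expressed in Python, which may be convenient where a POSIX shell is not available. This is a sketch with a hypothetical `validate_scripts` helper; it raises on the first script that fails to compile, so the report file is written only when every script passes.

```python
import py_compile
from pathlib import Path


def validate_scripts(scripts_dir: str, report_path: str) -> int:
    """Byte-compile every *.py in scripts_dir; return how many were checked."""
    scripts = sorted(Path(scripts_dir).glob("*.py"))
    for script in scripts:
        # doraise=True turns a syntax error into py_compile.PyCompileError.
        py_compile.compile(str(script), doraise=True)
    report = Path(report_path)
    report.parent.mkdir(parents=True, exist_ok=True)
    report.write_text("py_compile_ok\n", encoding="utf-8")
    return len(scripts)
```

For this skill the call would be `validate_scripts("skills/ai/audio/aliyun-qwen-tts-realtime/scripts", "output/aliyun-qwen-tts-realtime/validate.txt")`.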

Output and evidence

  • Save artifacts, command outputs, and API response summaries under `output/aliyun-qwen-tts-realtime/`.
  • Include key parameters (region/resource id/time range) in evidence files for reproducibility.
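
An evidence file could be as simple as one JSON record per run. The sketch below uses a hypothetical `write_evidence` helper, and the field names (`recorded_at`, `parameters`, `response_summary`) are illustrative rather than a required schema.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict


def write_evidence(
    name: str,
    params: Dict[str, Any],
    summary: Dict[str, Any],
    base_dir: str = "output/aliyun-qwen-tts-realtime",
) -> Path:
    """Write one JSON evidence file under base_dir and return its path."""
    out_dir = Path(base_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "parameters": params,         # e.g. model, voice, region
        "response_summary": summary,  # e.g. finish_reason, chunk count
    }
    path = out_dir / f"{name}.json"
    path.write_text(json.dumps(record, indent=2, ensure_ascii=False), encoding="utf-8")
    return path
```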

Workflow

  1. Confirm user intent, region, identifiers, and whether the operation is read-only or mutating.
  2. Run one minimal read-only query first to verify connectivity and permissions.
  3. Execute the target operation with explicit parameters and bounded scope.
  4. Verify results and save output/evidence files.

References

  • references/sources.md