alicloud-ai-audio-tts-realtime

Category: provider

Model Studio Qwen TTS Realtime

Use realtime TTS models for low-latency streaming speech output.

Critical model names

Use one of these exact model strings:
  • qwen3-tts-flash-realtime
  • qwen3-tts-instruct-flash-realtime
  • qwen3-tts-instruct-flash-realtime-2026-01-22
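
Because the model string must match exactly, a small guard can catch typos before any request is sent. The following is a minimal sketch: the constant and the function name `require_realtime_model` are illustrative and not part of the DashScope SDK.

```python
# Model strings copied verbatim from the list above.
REALTIME_TTS_MODELS = frozenset({
    "qwen3-tts-flash-realtime",
    "qwen3-tts-instruct-flash-realtime",
    "qwen3-tts-instruct-flash-realtime-2026-01-22",
})


def require_realtime_model(name: str) -> str:
    """Return `name` unchanged if it is an exact match, else raise ValueError.

    Hypothetical helper for illustration; not an SDK function.
    """
    if name not in REALTIME_TTS_MODELS:
        allowed = ", ".join(sorted(REALTIME_TTS_MODELS))
        raise ValueError(f"unknown realtime TTS model {name!r}; expected one of: {allowed}")
    return name
```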

Prerequisites

  • Install the SDK in a virtual environment:
bash
python3 -m venv .venv
. .venv/bin/activate
python -m pip install dashscope
  • Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
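
The lookup order described above (environment variable first, then the credentials file) might be sketched as follows. The layout of ~/.alibabacloud/credentials is an assumption here: the sketch treats it as an INI-style file with a named section such as [default]. The function name `resolve_api_key` is illustrative and not part of the SDK.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(credentials_path: Optional[Path] = None) -> Optional[str]:
    """Return the API key from the environment, else from the credentials file.

    Assumption: the credentials file is INI-style and stores the key under
    `dashscope_api_key` in a named section. Adjust if your file differs.
    """
    key = os.environ.get("DASHSCOPE_API_KEY")
    if key:
        return key

    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    parser = configparser.ConfigParser()
    try:
        # ConfigParser.read() silently skips files that do not exist.
        parser.read(path)
    except configparser.Error:
        # The file exists but is not valid INI; treat it as "no key found".
        return None

    for section in parser.sections():
        if parser.has_option(section, "dashscope_api_key"):
            return parser.get(section, "dashscope_api_key")
    return None
```

A caller can fail fast when this returns None instead of letting the first request fail with an authentication error.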

Normalized interface (tts.realtime)

Request

  • text (string, required)
  • voice (string, required)
  • instruction (string, optional)
  • sample_rate (int, optional)
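
The request fields above can be modelled as a small validated container. This is a sketch of the normalized interface only: the class name `RealtimeTTSRequest` and the `to_payload` method are illustrative, and the voice value in the usage note is a placeholder rather than a real voice name.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class RealtimeTTSRequest:
    """Normalized tts.realtime request (hypothetical container, not an SDK type)."""

    text: str
    voice: str
    instruction: Optional[str] = None
    sample_rate: Optional[int] = None

    def __post_init__(self) -> None:
        # text and voice are the two required fields.
        if not self.text.strip():
            raise ValueError("text is required")
        if not self.voice.strip():
            raise ValueError("voice is required")
        if self.sample_rate is not None and self.sample_rate <= 0:
            raise ValueError("sample_rate must be a positive integer")

    def to_payload(self) -> Dict[str, Any]:
        """Return a dict that omits optional fields that were not set."""
        payload: Dict[str, Any] = {"text": self.text, "voice": self.voice}
        if self.instruction is not None:
            payload["instruction"] = self.instruction
        if self.sample_rate is not None:
            payload["sample_rate"] = self.sample_rate
        return payload
```

For example, `RealtimeTTSRequest(text="hello", voice="example-voice").to_payload()` yields a dict containing only the two required keys.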

Response

  • audio_base64_pcm_chunks (array<string>)
  • sample_rate (int)
  • finish_reason (string)
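
To play or store the response, the base64 chunks need to be decoded and wrapped in a container. The sketch below joins the chunks and writes a WAV file in memory with the standard library. It assumes the PCM is 16-bit mono, which the interface above does not state, so verify this against your actual output. The function name `pcm_chunks_to_wav` is illustrative.

```python
import base64
import io
import wave
from typing import Iterable


def pcm_chunks_to_wav(chunks: Iterable[str], sample_rate: int) -> bytes:
    """Decode base64 PCM chunks and return the bytes of a WAV file.

    Assumption: samples are 16-bit mono PCM. Adjust the channel count and
    sample width if the model returns a different layout.
    """
    pcm = b"".join(base64.b64decode(chunk) for chunk in chunks)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)        # mono (assumed)
        wav.setsampwidth(2)        # 2 bytes per sample = 16-bit (assumed)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()
```

Pass the `sample_rate` from the response rather than the one from the request, since the response reports the rate that was actually used.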

Operational guidance

  • Use a WebSocket or streaming endpoint for realtime mode.
  • Keep each utterance short for lower latency.
  • For instruction models, keep the instruction explicit and concise.
  • Some SDK/runtime combinations may reject realtime model calls over MultiModalConversation; use the probe script below to verify compatibility.
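
One way to keep utterances short is to split the input at sentence boundaries before sending it, and to hard-wrap any sentence that is still too long. The sketch below is deliberately naive: it also splits on periods inside numbers or abbreviations. The 80-character default is an arbitrary illustration, not a documented limit, and `split_utterances` is a hypothetical helper.

```python
import re
from typing import List

# A run of non-terminator characters followed by any trailing terminators.
# Covers ASCII and full-width (CJK) sentence punctuation.
_SENTENCE = re.compile(r"[^.!?。!?]+[.!?。!?]*")


def split_utterances(text: str, max_chars: int = 80) -> List[str]:
    """Split text into sentence-sized pieces of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    utterances: List[str] = []
    for sentence in _SENTENCE.findall(text):
        sentence = sentence.strip()
        # Hard-wrap sentences longer than max_chars into fixed-size slices.
        for start in range(0, len(sentence), max_chars):
            piece = sentence[start:start + max_chars].strip()
            if piece:
                utterances.append(piece)
    return utterances
```

Each returned piece can then be sent as its own utterance, so audio for the first sentence can start streaming while later ones are still queued.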

Local demo script

Use the probe script to verify realtime compatibility in your current SDK/runtime, and optionally fall back to a non-realtime model for immediate output:
bash
.venv/bin/python skills/ai/audio/alicloud-ai-audio-tts-realtime/scripts/realtime_tts_demo.py \
  --text "这是一个 realtime 语音演示。" \
  --fallback \
  --output output/ai-audio-tts-realtime/audio/fallback-demo.wav
Strict mode (for CI / gating):
bash
.venv/bin/python skills/ai/audio/alicloud-ai-audio-tts-realtime/scripts/realtime_tts_demo.py \
  --text "realtime health check" \
  --strict

Output location

  • Default output: output/ai-audio-tts-realtime/audio/
  • Override the base directory with OUTPUT_DIR.
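
The override rule above might resolve to a path as in the sketch below. It assumes that OUTPUT_DIR replaces only the leading output/ component and that the ai-audio-tts-realtime/audio/ suffix is kept. The demo script may behave differently, so treat this as an illustration of the convention rather than the script's actual logic.

```python
import os
from pathlib import Path


def resolve_audio_dir() -> Path:
    """Return the directory where audio files are expected to be written.

    Assumption: OUTPUT_DIR, when set, replaces the default base directory
    "output" and the skill-specific suffix stays the same.
    """
    base = Path(os.environ.get("OUTPUT_DIR", "output"))
    return base / "ai-audio-tts-realtime" / "audio"
```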

References

  • references/sources.md