alicloud-ai-audio-asr-realtime
Category: provider
Model Studio Qwen ASR Realtime
Validation
```bash
mkdir -p output/alicloud-ai-audio-asr-realtime
python -m py_compile skills/ai/audio/alicloud-ai-audio-asr-realtime/scripts/prepare_realtime_asr_request.py && echo "py_compile_ok" > output/alicloud-ai-audio-asr-realtime/validate.txt
```
Pass criteria: the command exits 0 and `output/alicloud-ai-audio-asr-realtime/validate.txt` is generated.
Output And Evidence
- Save session payloads and response samples under `output/alicloud-ai-audio-asr-realtime/`.
Critical model names
Use one of these exact model strings:
- `qwen3-asr-flash-realtime`
- `qwen3-asr-flash-realtime-2026-02-10`
Use cases
- Realtime subtitles and captions
- Voice-agent duplex input
- Streaming speech-to-text in browser or terminal clients
Prerequisites
- Set `DASHSCOPE_API_KEY` in your environment, or add `dashscope_api_key` to `~/.alibabacloud/credentials`.
- Realtime sessions generally require WebSocket or streaming session handling in the client.
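The lookup order in the first prerequisite can be sketched as below. This is a minimal illustration, not the skill's own loader: `resolve_api_key` is a hypothetical helper, and the credentials file is assumed to use an INI layout with `dashscope_api_key` inside some section (for example `[default]`), which the source does not confirm.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(credentials_path: Optional[Path] = None) -> Optional[str]:
    """Return the API key from the environment, else from the credentials file."""
    key = os.environ.get("DASHSCOPE_API_KEY")
    if key:
        return key

    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    if not path.is_file():
        return None

    # Assumption: the file is INI-formatted. Interpolation is disabled so
    # that a '%' inside the key is not treated as a placeholder.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read(path, encoding="utf-8")
    except configparser.Error:
        return None

    # Check every section, since the section name is not documented here.
    for section in parser.sections():
        value = parser.get(section, "dashscope_api_key", fallback=None)
        if value:
            return value.strip()
    return None
```

Returning `None` rather than raising lets the caller decide how to report a missing key.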
Normalized interface (asr.realtime)
Request
- `model` (string, optional): default `qwen3-asr-flash-realtime`
- `language_hints` (array<string>, optional)
- `format` (string, optional): e.g. `pcm`, `wav`
- `sample_rate` (int, optional): e.g. `16000`
- `chunk_ms` (int, optional): frame size in milliseconds
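As an illustration of the request fields above, the following sketch builds a normalized request dictionary. `build_request` is a hypothetical helper, not the bundled `prepare_realtime_asr_request.py` script; only `model` gets a default because that is the only default the field list states, and the language codes in the usage note are illustrative values.

```python
from typing import Any, Dict, Optional, Sequence

# Exact model strings listed under "Critical model names".
ALLOWED_MODELS = (
    "qwen3-asr-flash-realtime",
    "qwen3-asr-flash-realtime-2026-02-10",
)


def build_request(
    model: str = ALLOWED_MODELS[0],
    language_hints: Optional[Sequence[str]] = None,
    audio_format: Optional[str] = None,
    sample_rate: Optional[int] = None,
    chunk_ms: Optional[int] = None,
) -> Dict[str, Any]:
    """Build an asr.realtime request dict, omitting unset optional fields."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unknown model string: {model!r}")

    request: Dict[str, Any] = {"model": model}
    if language_hints is not None:
        request["language_hints"] = list(language_hints)
    if audio_format is not None:
        request["format"] = audio_format
    if sample_rate is not None:
        if sample_rate <= 0:
            raise ValueError("sample_rate must be positive")
        request["sample_rate"] = sample_rate
    if chunk_ms is not None:
        if chunk_ms <= 0:
            raise ValueError("chunk_ms must be positive")
        request["chunk_ms"] = chunk_ms
    return request
```

For example, `build_request(language_hints=["zh", "en"], audio_format="pcm", sample_rate=16000, chunk_ms=100)` yields a dictionary with all five fields, while `build_request()` contains only `model`.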
Response
- `text` (string): recognized transcript fragment
- `is_final` (bool): finalization marker
- `usage` (object, optional)
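One way a client might consume these response fields is sketched below. It assumes that each non-final fragment supersedes the previous partial for the same segment and that the event with `is_final` set carries the complete text of that segment; the source does not spell out these semantics, so treat `assemble_transcript` as a hypothetical helper to adapt once the actual behavior is confirmed.

```python
from typing import Any, Iterable, List, Mapping


def assemble_transcript(
    events: Iterable[Mapping[str, Any]], separator: str = " "
) -> str:
    """Join finalized segments, keeping only the latest unfinished partial."""
    finalized: List[str] = []
    pending = ""
    for event in events:
        text = event.get("text", "")
        if event.get("is_final"):
            # Assumption: the final event holds the full segment text.
            finalized.append(text)
            pending = ""
        else:
            # Assumption: a newer partial replaces the older one.
            pending = text
    if pending:
        finalized.append(pending)
    return separator.join(part for part in finalized if part)
```

Pass `separator=""` for languages written without spaces between words.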
Quick start
Generate a request template:
```bash
python skills/ai/audio/alicloud-ai-audio-asr-realtime/scripts/prepare_realtime_asr_request.py \
  --output output/alicloud-ai-audio-asr-realtime/request.json
```
Operational guidance
- Prefer 16kHz mono PCM unless your client stack requires another format.
- Keep chunks small enough for responsive partial results.
- If you only have recorded files, use `skills/ai/audio/alicloud-ai-audio-asr/` instead.
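To make the chunk-size guidance concrete, here is a small sketch of the arithmetic for raw PCM. It assumes 16-bit samples (2 bytes each), which the guidance above does not state; `chunk_size_bytes` and `iter_chunks` are hypothetical helper names.

```python
from typing import Iterator


def chunk_size_bytes(
    sample_rate: int = 16000,
    chunk_ms: int = 100,
    channels: int = 1,
    sample_width: int = 2,  # assumption: 16-bit PCM
) -> int:
    """Number of bytes in one chunk of chunk_ms milliseconds."""
    frames = sample_rate * chunk_ms // 1000
    return frames * channels * sample_width


def iter_chunks(pcm: bytes, size: int) -> Iterator[bytes]:
    """Yield consecutive slices of at most `size` bytes; the last may be shorter."""
    if size <= 0:
        raise ValueError("size must be positive")
    for offset in range(0, len(pcm), size):
        yield pcm[offset:offset + size]
```

Under these assumptions, 100 ms of 16 kHz mono audio is 1600 frames, or 3200 bytes per chunk. Smaller `chunk_ms` values send audio more often, which is what keeps partial results responsive.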
References
references/sources.md