alicloud-ai-audio-asr
Category: provider
Model Studio Qwen ASR (Non-Realtime)
Use Qwen ASR for recorded audio transcription (non-realtime), including short audio sync calls and long audio async jobs.
Critical model names
Use one of these exact model strings:
- `qwen3-asr-flash`
- `qwen-audio-asr`
- `qwen3-asr-flash-filetrans`
Selection guidance:
- Use `qwen3-asr-flash` or `qwen-audio-asr` for short/normal recordings (sync).
- Use `qwen3-asr-flash-filetrans` for long-file transcription (async task workflow).
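The selection guidance above can be captured in a small lookup. This is a minimal sketch: the names `SYNC_MODELS`, `ASYNC_MODELS`, and `call_mode` are hypothetical and are not taken from the bundled script.

```python
# Hypothetical helper: maps each documented model string to its call style.
SYNC_MODELS = {"qwen3-asr-flash", "qwen-audio-asr"}
ASYNC_MODELS = {"qwen3-asr-flash-filetrans"}


def call_mode(model: str) -> str:
    """Return "sync" or "async" for a documented model, else raise ValueError."""
    if model in SYNC_MODELS:
        return "sync"
    if model in ASYNC_MODELS:
        return "async"
    # Reject near-miss spellings early, since the strings must match exactly.
    raise ValueError(f"unknown model string: {model!r}")
```

Failing fast on an unknown string is a design choice here; it surfaces typos before any request is sent.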
Prerequisites
- Create and activate a virtual environment (the script uses only the Python standard library, so no SDK packages need to be installed):
```bash
python3 -m venv .venv
. .venv/bin/activate
```
- Set `DASHSCOPE_API_KEY` in the environment, or add `dashscope_api_key` to `~/.alibabacloud/credentials`.
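The lookup order described above (environment first, then the credentials file) might look like the following sketch. The function name `load_api_key` is hypothetical, and the file layout is an assumption: the code treats `~/.alibabacloud/credentials` as an INI file with a `dashscope_api_key` entry inside some section. The bundled script may read it differently.

```python
import configparser
import os
from pathlib import Path
from typing import Mapping, Optional


def load_api_key(
    env: Mapping[str, str] = os.environ,
    credentials_path: Optional[Path] = None,
) -> Optional[str]:
    """Return the key from the environment, else from the credentials file."""
    key = env.get("DASHSCOPE_API_KEY")
    if key:
        return key
    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    if not path.is_file():
        return None
    # Assumption: INI layout, e.g. "[default]" followed by "dashscope_api_key = ...".
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path, encoding="utf-8")
    for section in parser.sections():
        value = parser.get(section, "dashscope_api_key", fallback=None)
        if value:
            return value
    return None
```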
Normalized interface (asr.transcribe)
Request
- `audio` (string, required): public URL or local file path.
- `model` (string, optional): default `qwen3-asr-flash`.
- `language_hints` (array<string>, optional): e.g. `zh`, `en`.
- `sample_rate` (number, optional)
- `vocabulary_id` (string, optional)
- `disfluency_removal_enabled` (bool, optional)
- `timestamp_granularities` (array<string>, optional): e.g. `sentence`.
- `async` (bool, optional): default false for sync models, true for `qwen3-asr-flash-filetrans`.
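As an illustration of the request fields and their defaults, here is a minimal sketch of a request container. The class name `TranscribeRequest` is hypothetical. Because `async` is a reserved word in Python, the sketch stores that field as `run_async`, which is a naming assumption and not part of the interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional

ASYNC_MODEL = "qwen3-asr-flash-filetrans"


@dataclass
class TranscribeRequest:
    audio: str  # public URL or local file path
    model: str = "qwen3-asr-flash"
    language_hints: List[str] = field(default_factory=list)
    sample_rate: Optional[int] = None
    vocabulary_id: Optional[str] = None
    disfluency_removal_enabled: Optional[bool] = None
    timestamp_granularities: List[str] = field(default_factory=list)
    # None means "not set by the caller"; resolved in __post_init__.
    run_async: Optional[bool] = None

    def __post_init__(self) -> None:
        if not self.audio:
            raise ValueError("audio is required")
        if self.run_async is None:
            # Documented default: true only for the long-file model.
            self.run_async = self.model == ASYNC_MODEL
```

For example, `TranscribeRequest(audio="call.mp3")` resolves to the sync default, while passing `model="qwen3-asr-flash-filetrans"` switches `run_async` to true unless the caller overrides it.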
Response
- `text` (string): normalized transcript text.
- `task_id` (string, optional): present for async submission.
- `status` (string): `SUCCEEDED` or submission status.
- `raw` (object): original API response.
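The mapping from a raw API body to the normalized response could be sketched as below. Both function names are hypothetical. The field paths are assumptions about the upstream responses: the sketch expects the sync transcript at `choices[0].message.content` (the usual OpenAI-compatible shape) and the async submission fields at `output.task_id` and `output.task_status`. Check them against `references/api_reference.md` before relying on them.

```python
from typing import Any, Dict


def normalize_sync_response(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map a sync chat-completion body to the normalized response shape."""
    choices = raw.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    # Assumption: the transcript is a plain string in message.content.
    text = choices[0].get("message", {}).get("content") or ""
    return {"text": text, "status": "SUCCEEDED", "raw": raw}


def normalize_submission(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map an async submission body to the normalized response shape."""
    output = raw.get("output", {})
    return {
        "text": "",  # no transcript yet; the task has only been submitted
        "task_id": output.get("task_id"),
        "status": output.get("task_status"),
        "raw": raw,
    }
```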
Quick start (official HTTP API)
Sync transcription (OpenAI-compatible protocol):
```bash
curl -sS --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "qwen3-asr-flash",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_audio",
            "input_audio": {
              "data": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
            }
          }
        ]
      }
    ],
    "stream": false,
    "asr_options": {
      "enable_itn": false
    }
  }'
```
Async long-file transcription (DashScope protocol):
```bash
curl -sS --location 'https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header 'X-DashScope-Async: enable' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "qwen3-asr-flash-filetrans",
    "input": {
      "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
    }
  }'
```
Poll task result:
```bash
curl -sS --location "https://dashscope.aliyuncs.com/api/v1/tasks/<task_id>" \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY"
```
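For scripted use, the polling step can be wrapped in a bounded loop. The sketch below uses only the standard library; `fetch_task` and `wait_for_task` are hypothetical names. It assumes the task body reports its state at `output.task_status` and that `FAILED` and `CANCELED` are terminal states. Only `SUCCEEDED` is confirmed by this document, and the 5-second default interval and 60-poll cap are illustrative values.

```python
import json
import os
import time
import urllib.request
from typing import Any, Callable, Dict

TASKS_URL = "https://dashscope.aliyuncs.com/api/v1/tasks/"


def fetch_task(task_id: str) -> Dict[str, Any]:
    """GET the task endpoint and decode the JSON body."""
    req = urllib.request.Request(
        TASKS_URL + task_id,
        headers={"Authorization": "Bearer " + os.environ["DASHSCOPE_API_KEY"]},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_task(
    task_id: str,
    fetch: Callable[[str], Dict[str, Any]] = fetch_task,
    interval_s: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the task succeeds, fails, or the poll cap is reached."""
    for _ in range(max_polls):
        body = fetch(task_id)
        # Assumption: status is reported at output.task_status.
        status = body.get("output", {}).get("task_status")
        if status == "SUCCEEDED":
            return body
        if status in ("FAILED", "CANCELED"):
            raise RuntimeError(f"task {task_id} ended with status {status}")
        sleep(interval_s)
    raise TimeoutError(f"task {task_id} not finished after {max_polls} polls")
```

Passing `fetch` and `sleep` as parameters keeps the loop testable without network access or real delays.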
Local helper script
Use the bundled script for URL/local-file input and optional async polling:
```bash
python skills/ai/audio/alicloud-ai-audio-asr/scripts/transcribe_audio.py \
  --audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
  --model qwen3-asr-flash \
  --language-hints zh,en \
  --print-response
```
Long-file mode:
```bash
python skills/ai/audio/alicloud-ai-audio-asr/scripts/transcribe_audio.py \
  --audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
  --model qwen3-asr-flash-filetrans \
  --async \
  --wait
```
Operational guidance
- For local files, pass the audio in `input_audio.data` as a data URI when a direct URL is unavailable.
- Keep `language_hints` minimal to reduce recognition ambiguity.
- For async tasks, poll every 5-20 s and cap the number of retries.
- Save normalized outputs under `output/ai-audio-asr/transcripts/`.
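The data-URI route for local files might be implemented along these lines. Both helper names are hypothetical. The MIME type is guessed from the file extension with a generic fallback; which types the service accepts is not stated in this document, so treat that part as an assumption.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_uri(path: str) -> str:
    """Encode a local audio file as a base64 data URI."""
    mime, _ = mimetypes.guess_type(path)
    # Fall back to a generic type when the extension is not recognized.
    mime = mime or "application/octet-stream"
    payload = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{payload}"


def input_audio_part(audio: str) -> dict:
    """Build the input_audio content part for a URL or a local file path."""
    is_url = audio.startswith(("http://", "https://"))
    return {
        "type": "input_audio",
        "input_audio": {"data": audio if is_url else to_data_uri(audio)},
    }
```

Base64 inflates the payload by roughly a third, which is one reason to prefer a direct URL when one is available.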
Output location
- Default output: `output/ai-audio-asr/transcripts/`
- Override base dir with `OUTPUT_DIR`.
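One way to resolve the output location is sketched below. The function names are hypothetical, and the override semantics are an assumption: the sketch treats `OUTPUT_DIR` as a replacement for the leading `output` component only, keeping the `ai-audio-asr/transcripts` suffix. The bundled script may interpret the variable differently.

```python
import json
import os
from pathlib import Path
from typing import Any, Dict, Mapping


def transcripts_dir(env: Mapping[str, str] = os.environ) -> Path:
    """Resolve the transcript directory, honoring OUTPUT_DIR when set."""
    # Assumption: OUTPUT_DIR replaces only the leading "output" component.
    base = Path(env.get("OUTPUT_DIR") or "output")
    return base / "ai-audio-asr" / "transcripts"


def save_transcript(
    result: Dict[str, Any],
    name: str,
    env: Mapping[str, str] = os.environ,
) -> Path:
    """Write one normalized result as <name>.json and return its path."""
    target = transcripts_dir(env)
    target.mkdir(parents=True, exist_ok=True)
    path = target / f"{name}.json"
    # ensure_ascii=False keeps non-ASCII transcripts (e.g. Chinese) readable.
    path.write_text(
        json.dumps(result, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return path
```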
References
- `references/api_reference.md`
- `references/sources.md`
- Realtime synthesis is provided by `skills/ai/audio/alicloud-ai-audio-tts-realtime/`.