aliyun-qwen-asr

Category: provider

Model Studio Qwen ASR (Non-Realtime)

Validation

bash
mkdir -p output/aliyun-qwen-asr
python -m py_compile skills/ai/audio/aliyun-qwen-asr/scripts/transcribe_audio.py && echo "py_compile_ok" > output/aliyun-qwen-asr/validate.txt
Pass criteria: the command exits 0 and output/aliyun-qwen-asr/validate.txt is generated.

Output And Evidence

  • Store transcripts and API responses under output/aliyun-qwen-asr/.
  • Keep one command log or sample response per run.
Use Qwen ASR for recorded audio transcription (non-realtime), including short audio sync calls and long audio async jobs.

Critical model names

Use one of these exact model strings:
  • qwen3-asr-flash
  • qwen3-asr-flash-2026-02-10
  • qwen-audio-asr
  • qwen3-asr-flash-filetrans
  • qwen3-asr-flash-filetrans-2025-11-17
Selection guidance:
  • Use qwen3-asr-flash, qwen3-asr-flash-2026-02-10, or qwen-audio-asr for short/normal recordings (sync).
  • Use qwen3-asr-flash-filetrans or qwen3-asr-flash-filetrans-2025-11-17 for long-file transcription (async task workflow).
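
The selection guidance above can be sketched as a small lookup. This is an illustrative helper, not part of the bundled script: the names SYNC_MODELS, ASYNC_MODELS, and transcription_mode are hypothetical, and only the model strings themselves come from the list above.

```python
# Model strings copied from the list above; the sync/async grouping
# follows the selection guidance. All names here are illustrative.
SYNC_MODELS = {
    "qwen3-asr-flash",
    "qwen3-asr-flash-2026-02-10",
    "qwen-audio-asr",
}
ASYNC_MODELS = {
    "qwen3-asr-flash-filetrans",
    "qwen3-asr-flash-filetrans-2025-11-17",
}


def transcription_mode(model: str) -> str:
    """Return "sync" or "async" for a known model string; reject anything else."""
    if model in SYNC_MODELS:
        return "sync"
    if model in ASYNC_MODELS:
        return "async"
    raise ValueError(f"unknown Qwen ASR model: {model!r}")
```

Rejecting unknown strings up front is a design choice here: the model names must match exactly, so a typo is cheaper to catch locally than after an API round trip.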

Prerequisites

  • Create and activate a virtual environment. The script uses only the Python standard library, so there are no SDK packages to install:
bash
python3 -m venv .venv
. .venv/bin/activate
  • Set DASHSCOPE_API_KEY in environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
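
One possible way to resolve the key in the order described above (environment first, then the credentials file) is sketched below. The file format is an assumption: the source does not say how ~/.alibabacloud/credentials is structured, so this sketch treats it as an INI file with at least one section (for example [default]). The function name resolve_api_key is hypothetical.

```python
import configparser
import os
from pathlib import Path


def resolve_api_key(credentials_path=None, env=None):
    """Return the API key from the environment, else from the credentials file.

    ASSUMPTION: the credentials file is INI-formatted with named sections;
    the real file layout may differ.
    """
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY")
    if key:
        return key

    path = (Path(credentials_path) if credentials_path
            else Path.home() / ".alibabacloud" / "credentials")
    if not path.is_file():
        return None

    # Disable interpolation so a "%" inside a key is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read(path, encoding="utf-8")
    except configparser.Error:
        return None  # not parseable as INI under the stated assumption

    # Take the first section that defines dashscope_api_key.
    for section in parser.sections():
        value = parser.get(section, "dashscope_api_key", fallback=None)
        if value:
            return value
    return None
```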

Normalized interface (asr.transcribe)

Request

  • audio (string, required): public URL or local file path.
  • model (string, optional): default qwen3-asr-flash.
  • language_hints (array<string>, optional): e.g. zh, en.
  • sample_rate (number, optional)
  • vocabulary_id (string, optional)
  • disfluency_removal_enabled (bool, optional)
  • timestamp_granularities (array<string>, optional): e.g. sentence.
  • async (bool, optional): default false for sync models, true for qwen3-asr-flash-filetrans.
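
The request fields above could be modeled as a dataclass along the following lines. This is a sketch, not the script's actual type: the class name TranscribeRequest is hypothetical, the async field is renamed use_async because async is a reserved word in Python, and applying the async default to the dated filetrans snapshot as well is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TranscribeRequest:
    """Illustrative container for the asr.transcribe request fields."""
    audio: str                                   # public URL or local file path
    model: str = "qwen3-asr-flash"
    language_hints: List[str] = field(default_factory=list)
    sample_rate: Optional[int] = None
    vocabulary_id: Optional[str] = None
    disfluency_removal_enabled: Optional[bool] = None
    timestamp_granularities: List[str] = field(default_factory=list)
    use_async: Optional[bool] = None             # stands in for "async"

    def __post_init__(self):
        if not self.audio:
            raise ValueError("audio is required")
        if self.use_async is None:
            # Default: true for filetrans models, false otherwise.
            # ASSUMPTION: the dated filetrans snapshot behaves the same way.
            self.use_async = self.model.startswith("qwen3-asr-flash-filetrans")
```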

Response

  • text (string): normalized transcript text.
  • task_id (string, optional): present for async submission.
  • status (string): SUCCEEDED or submission status.
  • raw (object): original API response.
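
A rough sketch of how raw responses might be mapped onto these fields follows. The response shapes are assumptions and should be checked against references/api_reference.md: the sync case assumes the usual OpenAI-style choices[0].message.content layout, and the async case assumes the task ID and status sit under output.task_id and output.task_status. Both function names and the "UNKNOWN" placeholder status are hypothetical.

```python
def normalize_sync_response(raw: dict) -> dict:
    """Map an OpenAI-style chat completion (assumed shape) to the normalized fields."""
    choices = raw.get("choices") or []
    content = choices[0].get("message", {}).get("content", "") if choices else ""
    if isinstance(content, list):
        # Some compatible APIs return content as a list of parts; join their text.
        content = "".join(
            part.get("text", "") for part in content if isinstance(part, dict)
        )
    return {
        "text": (content or "").strip(),
        "status": "SUCCEEDED" if choices else "UNKNOWN",  # "UNKNOWN" is a placeholder
        "raw": raw,
    }


def normalize_async_submission(raw: dict) -> dict:
    """Map an async submission response (assumed shape) to the normalized fields."""
    output = raw.get("output") or {}
    return {
        "text": "",                                   # no transcript at submission time
        "task_id": output.get("task_id"),
        "status": output.get("task_status", "UNKNOWN"),
        "raw": raw,
    }
```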

Quick start (official HTTP API)

Sync transcription (OpenAI-compatible protocol):
bash
curl -sS --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "qwen3-asr-flash",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_audio",
            "input_audio": {
              "data": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
            }
          }
        ]
      }
    ],
    "stream": false,
    "asr_options": {
      "enable_itn": false
    }
  }'
Async long-file transcription (DashScope protocol):
bash
curl -sS --location 'https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header 'X-DashScope-Async: enable' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "qwen3-asr-flash-filetrans",
    "input": {
      "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
    }
  }'
Poll task result:
bash
curl -sS --location "https://dashscope.aliyuncs.com/api/v1/tasks/<task_id>" \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY"
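
For readers who prefer Python to curl, the sync call above can be approximated with the standard library alone. This is a sketch that mirrors the curl payload; the function names build_sync_request and send are hypothetical, and whether the bundled script builds its request this way is not confirmed by the source.

```python
import json
import urllib.request

SYNC_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions"


def build_sync_request(audio_url, api_key, model="qwen3-asr-flash"):
    """Build (but do not send) a POST request mirroring the sync curl example."""
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "input_audio", "input_audio": {"data": audio_url}}
                ],
            }
        ],
        "stream": False,
        "asr_options": {"enable_itn": False},
    }
    return urllib.request.Request(
        SYNC_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Keeping the request construction separate from the network call makes the payload easy to inspect or log before anything is sent, which fits the one-log-per-run guidance.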

Local helper script

Use the bundled script for URL/local-file input and optional async polling:
bash
python skills/ai/audio/aliyun-qwen-asr/scripts/transcribe_audio.py \
  --audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
  --model qwen3-asr-flash \
  --language-hints zh,en \
  --print-response
Long-file mode:
bash
python skills/ai/audio/aliyun-qwen-asr/scripts/transcribe_audio.py \
  --audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
  --model qwen3-asr-flash-filetrans \
  --async \
  --wait

Operational guidance

  • For local files with no public URL, pass the audio as a data URI in input_audio.data.
  • Keep language_hints minimal to reduce recognition ambiguity.
  • For async tasks, poll every 5 to 20 seconds and cap the number of retries.
  • Save normalized outputs under output/aliyun-qwen-asr/transcripts/.
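
Two of the points above, the data URI for local files and the bounded polling loop, can be sketched as follows. Both helpers are hypothetical and not taken from the bundled script. fetch_status stands for any caller-supplied function that queries the task endpoint and returns a status string; treating FAILED as a terminal state alongside SUCCEEDED is an assumption.

```python
import base64
import mimetypes
import time
from pathlib import Path


def to_data_uri(path):
    """Encode a local audio file as a base64 data URI for input_audio.data."""
    mime, _ = mimetypes.guess_type(str(path))
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime or 'application/octet-stream'};base64,{encoded}"


def poll_until_done(fetch_status, interval=5.0, max_attempts=60, sleep=time.sleep):
    """Call fetch_status() until it reports a terminal state or attempts run out.

    ASSUMPTION: "FAILED" is a terminal status; only "SUCCEEDED" is named in
    this document.
    """
    terminal = {"SUCCEEDED", "FAILED"}
    for attempt in range(1, max_attempts + 1):
        status = fetch_status()
        if status in terminal:
            return status
        if attempt < max_attempts:
            sleep(interval)  # stay within the 5 to 20 second guidance
    raise TimeoutError(f"task not finished after {max_attempts} attempts")
```

Passing sleep as a parameter keeps the loop testable without real delays; in normal use the default time.sleep applies.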

Output location

  • Default output: output/aliyun-qwen-asr/transcripts/
  • Override base dir with OUTPUT_DIR.
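
The override could be resolved along these lines. This sketch assumes OUTPUT_DIR replaces only the leading output component and that the aliyun-qwen-asr/transcripts suffix is kept; the source does not spell this out, and the function name transcripts_dir is hypothetical.

```python
import os
from pathlib import Path


def transcripts_dir(env=None):
    """Return the transcript directory, honoring OUTPUT_DIR when it is set.

    ASSUMPTION: OUTPUT_DIR replaces the base "output" directory only.
    """
    env = os.environ if env is None else env
    base = Path(env.get("OUTPUT_DIR") or "output")
    return base / "aliyun-qwen-asr" / "transcripts"
```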

Workflow

  1. Confirm user intent, region, identifiers, and whether the operation is read-only or mutating.
  2. Run one minimal read-only query first to verify connectivity and permissions.
  3. Execute the target operation with explicit parameters and bounded scope.
  4. Verify results and save output/evidence files.

References

  • references/api_reference.md
  • references/sources.md
  • Realtime synthesis is provided by skills/ai/audio/aliyun-qwen-tts-realtime/.