qwencloud-vision

Agent setup: If your agent doesn't auto-load skills (e.g. Claude Code), see agent-compatibility.md once per session.

Qwen Vision (Image & Video Understanding)

Analyze images and videos using Qwen VL and QVQ models. This skill is part of qwencloud/qwencloud-ai.

Skill directory

Use this skill's internal files to execute and learn. Load reference files on demand when the default path fails or you need details.

| Location | Purpose |
|----------|---------|
| `scripts/analyze.py` | Image/video understanding, multi-image, thinking mode |
| `scripts/reason.py` | Visual reasoning (QVQ, chain-of-thought, streaming) |
| `scripts/ocr.py` | OCR text extraction |
| `scripts/vision_lib.py` | Shared helpers (base64, upload, streaming) |
| `references/execution-guide.md` | Fallback: curl, code generation |
| `references/curl-examples.md` | Curl for base64, multi-image, video, OCR |
| `references/visual-reasoning.md` | QVQ and thinking mode details |
| `references/prompt-guide.md` | Query prompt templates by task, thinking mode decision |
| `references/ocr.md` | OCR parameters and examples |
| `references/sources.md` | Official documentation URLs |
| `references/agent-compatibility.md` | Agent self-check: register skills in project config for agents that don't auto-load |

Security

NEVER output any API key or credential in plaintext. Always use variable references (`$DASHSCOPE_API_KEY` in shell, `os.environ["DASHSCOPE_API_KEY"]` in Python). Any check or detection of credentials must be non-plaintext: report only status (e.g. "set" / "not set", "valid" / "invalid"), never the value. Never display contents of `.env` or config files that may contain secrets.
When the API key is not configured, NEVER ask the user to provide it directly. Instead, help create a `.env` file with a placeholder (`DASHSCOPE_API_KEY=sk-your-key-here`) and instruct the user to replace it with their actual key from the QwenCloud Console. Only write the actual key value if the user explicitly requests it.
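A non-plaintext check could look like the sketch below. The helper name `key_status` is hypothetical and not part of this skill's scripts; it only reports whether one of the two documented variables is set and never returns or prints the value.

```python
import os

def key_status(env=None, names=("DASHSCOPE_API_KEY", "QWEN_API_KEY")):
    """Return 'set' or 'not set' without ever exposing the key value."""
    env = os.environ if env is None else env
    for name in names:
        # Only the truthiness of the value is inspected; it is never echoed.
        if env.get(name, "").strip():
            return "set"
    return "not set"

if __name__ == "__main__":
    print(f"API key: {key_status()}")
```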

Key Compatibility

Scripts require a standard QwenCloud API key (`sk-...`). Coding Plan keys (`sk-sp-...`) cannot be used for direct API calls and do not support dedicated vision models (qwen3-vl-plus, qvq-max, etc.). The scripts detect `sk-sp-` keys at startup and print a warning. If qwencloud-ops-auth is installed, see its `references/codingplan.md` for full details.
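The prefix rule above can be expressed as a small classifier. This is an illustrative sketch, not the scripts' actual startup check; `key_kind` is a hypothetical name, and it returns only a label so the key itself is never shown.

```python
def key_kind(key):
    """Classify a key by prefix; returns a label only, never the key."""
    # Check the longer prefix first: "sk-sp-..." also starts with "sk-".
    if key.startswith("sk-sp-"):
        return "coding-plan"
    if key.startswith("sk-"):
        return "standard"
    return "unknown"
```

A caller would warn and stop when the label is `"coding-plan"`, since such keys cannot be used for direct API calls.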

Model Selection

| Model | Use Case |
|-------|----------|
| qwen3.5-plus | Preferred: unified multimodal (text+image+video). Thinking on by default. |
| qwen3.5-flash | Fast multimodal: cheaper, faster. Thinking on by default. |
| qwen3-vl-plus | High-precision: object localization (2D/3D), document/webpage parsing. |
| qwen3-vl-flash | Fast vision: lower latency, 33 languages. |
| qvq-max | Visual reasoning: chain-of-thought for math, charts. Streaming only. |
| qwen-vl-ocr | OCR: text extraction, table parsing, document scanning. |
| qwen-vl-max | Qwen2.5-VL: best-performing in 2.5 series. |
| qwen-vl-plus | Qwen2.5-VL: faster, good balance of performance and cost, 11 languages. |

  1. User specified a model → use directly.
  2. Consult the qwencloud-model-selector skill when model choice depends on requirement, scenario, or pricing.
  3. No signal, clear task → `qwen3.5-plus`. Use `qwen3-vl-plus` for precise localization or 3D detection.
⚠️ Important: The model list above is a point-in-time snapshot and may be outdated. Model availability changes frequently. Always check the official model list for the authoritative, up-to-date catalog before making model decisions.
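The selection rules can be sketched as a small function. `pick_model` and its `needs_localization` flag are hypothetical names used for illustration, and rule 2 (consulting qwencloud-model-selector) is deliberately not modeled here.

```python
DEFAULT_MODEL = "qwen3.5-plus"
LOCALIZATION_MODEL = "qwen3-vl-plus"

def pick_model(user_model=None, needs_localization=False):
    """Apply rules 1 and 3 from the list above."""
    if user_model:
        # Rule 1: an explicit user choice always wins.
        return user_model
    if needs_localization:
        # Rule 3, special case: precise localization or 3D detection.
        return LOCALIZATION_MODEL
    # Rule 3, default: no signal and a clear task.
    return DEFAULT_MODEL
```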

Execution

Prerequisites

  • API Key: Check that `DASHSCOPE_API_KEY` (or `QWEN_API_KEY`) is set using a non-plaintext check only (e.g. in shell: `[ -n "$DASHSCOPE_API_KEY" ]`; report only "set" or "not set", never the key value). If not set: run the qwencloud-ops-auth skill if available; otherwise guide the user to obtain a key from QwenCloud Console and set it via `.env` file (`echo 'DASHSCOPE_API_KEY=sk-your-key-here' >> .env` in project root or current directory) or environment variable. The script searches for `.env` in the current working directory and the project root. Skills may be installed independently, so do not assume qwencloud-ops-auth is present.
  • Python 3.9+ (stdlib only, no pip install needed)

Environment Check

Before first execution, verify Python is available:

```bash
python3 --version  # must be 3.9+
```

If `python3` is not found, try `python --version` or `py -3 --version`. If Python is unavailable or below 3.9, skip to Path 2 (curl) in execution-guide.md.
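Once an interpreter is found, the 3.9+ requirement can also be checked from inside Python. This is a minimal sketch; `meets_minimum` is a hypothetical helper, not the check the scripts themselves perform.

```python
import sys

MIN_VERSION = (3, 9)

def meets_minimum(version_info=None, minimum=MIN_VERSION):
    """True when the (major, minor) version is at least the minimum."""
    version_info = sys.version_info if version_info is None else version_info
    # Compare only major and minor; patch level does not matter here.
    return tuple(version_info[:2]) >= minimum

if __name__ == "__main__":
    print("ok" if meets_minimum() else "Python 3.9+ required")
```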

Default: Run Script

Script path: Scripts are in the `scripts/` subdirectory of this skill's directory (the directory containing this SKILL.md). You MUST first locate this skill's installation directory, then ALWAYS use the full absolute path to execute scripts. Do NOT assume scripts are in the current working directory. Do NOT use `cd` to switch directories before execution. Shared infrastructure lives in `scripts/vision_lib.py`.
Execution note: Run all scripts in the foreground: wait for stdout; do not background.
Discovery: Run `python3 <this-skill-dir>/scripts/analyze.py --help` (or `reason.py`, `ocr.py`) first to see all available arguments.

| Script | Purpose | Default Model |
|--------|---------|---------------|
| `scripts/analyze.py` | Image understanding, multi-image, video, thinking mode, high-res | `qwen3.5-plus` |
| `scripts/reason.py` | Visual reasoning with chain-of-thought, video reasoning (always streaming) | `qvq-max` |
| `scripts/ocr.py` | OCR text extraction from documents, receipts, tables | `qwen-vl-ocr` |

Input type fields (use exactly one in `--request` JSON):

| Field | Use for | Example |
|-------|---------|---------|
| `"image"` | Single image (URL or local path) | `"image": "photo.jpg"` |
| `"images"` | Multi-image comparison (array) | `"images": ["a.jpg", "b.jpg"]` |
| `"video"` | Video file (URL or local path) | `"video": "clip.mp4"` |
| `"video_frames"` | Video as frame array | `"video_frames": ["f1.jpg", "f2.jpg"]` |

⚠️ Common mistake: Do NOT use `"image"` for video files; use `"video"` instead.
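The "exactly one input field" rule can be enforced before the request JSON is handed to a script. The sketch below is illustrative only: `build_request` is a hypothetical helper, and it assumes `prompt` is optional because the OCR example in this document omits it.

```python
import json

INPUT_FIELDS = ("image", "images", "video", "video_frames")

def build_request(prompt=None, **inputs):
    """Serialize a --request body containing exactly one input field."""
    unknown = sorted(set(inputs) - set(INPUT_FIELDS))
    if unknown:
        raise ValueError(f"unknown input field(s): {unknown}")
    if len(inputs) != 1:
        raise ValueError("use exactly one of: " + ", ".join(INPUT_FIELDS))
    request = dict(inputs)
    if prompt is not None:
        request["prompt"] = prompt
    return json.dumps(request)
```

For example, `build_request("Describe what happens in this video", video="clip.mp4")` yields a string suitable for the `--request` argument.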

```bash
# Image analysis
python3 <this-skill-dir>/scripts/analyze.py \
  --request '{"prompt":"What is in this image?","image":"https://example.com/photo.jpg"}' \
  --output output/qwencloud-vision/result.json --print-response

# Video analysis (local file; add --upload-files for files >= 7 MB)
python3 <this-skill-dir>/scripts/analyze.py \
  --request '{"prompt":"Describe what happens in this video","video":"clip.mp4"}' \
  --upload-files --print-response

# Visual reasoning (QVQ, chain-of-thought)
python3 <this-skill-dir>/scripts/reason.py \
  --request '{"prompt":"Solve this math problem step by step","image":"problem.png"}' \
  --print-response

# OCR text extraction
python3 <this-skill-dir>/scripts/ocr.py \
  --request '{"image":"invoice.jpg"}' \
  --print-response
```

| Argument | Description |
|----------|-------------|
| `--request '{...}'` | JSON request body |
| `--file path.json` | Load request from file |
| `--output path` | Save response JSON to path |
| `--print-response` | Print response to stdout |
| `--stream` | Enable streaming (auto for thinking/QVQ) |
| `--upload-files` | Upload local files to temp storage (for files >= 7 MB) |
| `--schema path.json` | JSON Schema for structured extraction |

Verify Result

  • Exit code `0` + output contains valid JSON with `choices` field → success
  • Non-zero exit, HTTP error, empty response, or JSON with `"code"` / `"message"` error → fail
  • Post-execution check: When `--output` is used, verify the response JSON file exists and contains expected content
  • MANDATORY stderr signal check: After confirming the result, scan the command's stderr output for `[ACTION_REQUIRED]` or `[UPDATE_AVAILABLE]`. If either signal is present, you MUST follow the instructions in Update Check below before responding to the user.
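The checks above can be combined into one function, as in this sketch. `verify_result` is a hypothetical name, and the code assumes that stdout holds nothing but the response JSON, which may not match what `--print-response` actually prints.

```python
import json

SIGNALS = ("[ACTION_REQUIRED]", "[UPDATE_AVAILABLE]")

def verify_result(exit_code, stdout, stderr=""):
    """Return (ok, signals): success flag plus any stderr signals found."""
    signals = [s for s in SIGNALS if s in stderr]
    if exit_code != 0 or not stdout.strip():
        return False, signals
    try:
        payload = json.loads(stdout)
    except json.JSONDecodeError:
        return False, signals
    # An error body ("code"/"message") has no "choices" key and fails here.
    ok = isinstance(payload, dict) and "choices" in payload
    return ok, signals
```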

On Failure

If scripts fail, match the error output against the diagnostic table below to determine the resolution. If no match, read execution-guide.md for alternative paths: curl commands (Path 2), code generation (Path 3), and autonomous resolution (Path 5).
If Python is not available at all → skip directly to Path 2 (curl) in execution-guide.md.

| Error Pattern | Diagnosis | Resolution |
|---------------|-----------|------------|
| `command not found: python3` | Python not on PATH | Try `python` or `py -3`; install Python 3.9+ if missing |
| `Python 3.9+ required` | Script version check failed | Upgrade Python to 3.9+ |
| `SyntaxError` near type hints | Python < 3.9 | Upgrade Python to 3.9+ |
| `QWEN_API_KEY/DASHSCOPE_API_KEY not found` | Missing API key | Obtain key from QwenCloud Console; add to `.env`: `echo 'DASHSCOPE_API_KEY=sk-...' >> .env`; or run qwencloud-ops-auth if available |
| `HTTP 401` | Invalid or mismatched key | Run qwencloud-ops-auth (non-plaintext check only); verify key is valid |
| `SSL: CERTIFICATE_VERIFY_FAILED` | SSL cert issue (proxy/corporate) | macOS: run `Install Certificates.command`; else set `SSL_CERT_FILE` env var |
| `URLError` / `ConnectionError` | Network unreachable | Check internet; set `HTTPS_PROXY` if behind proxy |
| `HTTP 429` | Rate limited | Wait and retry with backoff |
| `HTTP 5xx` | Server error | Retry with backoff |
| `PermissionError` | Can't write output | Use `--output` to specify writable directory |
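
Matching error output against the table can be sketched as an ordered pattern lookup. The regular expressions below are assumptions derived from the table's wording; the scripts' real error text may differ, and `diagnose` is a hypothetical helper.

```python
import re

# (pattern, diagnosis) pairs in the same order as the table above.
DIAGNOSTICS = [
    (r"command not found: python3", "Python not on PATH"),
    (r"Python 3\.9\+ required", "Script version check failed"),
    (r"SyntaxError", "Python < 3.9"),
    (r"(QWEN_API_KEY|DASHSCOPE_API_KEY).*not found", "Missing API key"),
    (r"HTTP 401", "Invalid or mismatched key"),
    (r"CERTIFICATE_VERIFY_FAILED", "SSL cert issue (proxy/corporate)"),
    (r"URLError|ConnectionError", "Network unreachable"),
    (r"HTTP 429", "Rate limited"),
    (r"HTTP 5\d\d", "Server error"),
    (r"PermissionError", "Can't write output"),
]

def diagnose(error_output):
    """Return the first matching diagnosis, or None when nothing matches."""
    for pattern, diagnosis in DIAGNOSTICS:
        if re.search(pattern, error_output):
            return diagnosis
    return None
```

A `None` result corresponds to the "no match" case, where execution-guide.md lists the alternative paths.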

File Input

The API accepts: HTTP/HTTPS URL, Base64 data URI, and `oss://` URL. Local file paths are NOT directly supported; the scripts handle conversion automatically. Pass local paths directly; no manual upload step is needed.
Large file rule: If the local file is >= 7 MB, always add `--upload-files`. Base64 encoding inflates size by ~33% (a 7 MB file becomes roughly 9.3 MB), so larger files exceed the 10 MB API limit. Small files (including short video clips < 7 MB) can use the default base64 path.

| Method | When to use | How |
|--------|-------------|-----|
| Online URL | File already hosted | Pass URL directly; preferred for large files |
| Base64 (default) | Local files < 7 MB (images or short video clips) | Script auto-converts to `data:` URI |
| Temp upload | Local files >= 7 MB | Add `--upload-files` flag → uploads to DashScope temp storage (`oss://` URL, 48h TTL) |

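
The decision in the table can be sketched as follows. `choose_method` and `base64_size` are hypothetical helpers, and treating 1 MB as 1024 × 1024 bytes is an assumption; the scripts may count differently.

```python
import math

UPLOAD_THRESHOLD = 7 * 1024 * 1024  # assumed: 1 MB = 1024 * 1024 bytes
URL_PREFIXES = ("http://", "https://", "oss://", "data:")

def base64_size(raw_bytes):
    """Length of the base64 text for raw_bytes of input (with padding)."""
    return 4 * math.ceil(raw_bytes / 3)

def choose_method(source, size_bytes=None):
    """Return 'url', 'base64', or 'upload' for a given input source."""
    if source.startswith(URL_PREFIXES):
        return "url"  # already hosted or already encoded: pass through
    if size_bytes is None:
        raise ValueError("size_bytes is required for local files")
    return "upload" if size_bytes >= UPLOAD_THRESHOLD else "base64"
```

With these numbers, a file at the 7 MB threshold encodes to just under 9.4 MB of base64 text, which shows how little headroom remains below the 10 MB limit.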
Production: Default temp storage has 48h TTL and 100 QPS upload limit, so it is not suitable for production, high-concurrency, or load-testing. To use your own OSS bucket, set `QWEN_TMP_OSS_BUCKET` and `QWEN_TMP_OSS_REGION` in `.env`, install the SDK (`pip install alibabacloud-oss-v2`), and provide credentials via `QWEN_TMP_OSS_AK_ID` / `QWEN_TMP_OSS_AK_SECRET` or the standard `OSS_ACCESS_KEY_ID` / `OSS_ACCESS_KEY_SECRET`. Use a RAM user with least-privilege (`oss:PutObject` + `oss:GetObject` on target bucket only). The `--upload-files` flag is still required for vision scripts to trigger upload. If qwencloud-ops-auth is installed, see its `references/custom-oss.md` for the full setup guide.

Input from Other Skills

When the input file comes from another skill's output (e.g., image-gen, video-gen):
  • Pass the URL directly (e.g., `"image": "<image_url from image-gen>"`); do NOT download the URL first
  • Downloading and re-passing as a local path wastes bandwidth and triggers unnecessary base64 encoding or OSS upload
  • All URL types are supported: `https://`, `oss://`, `data:`

Thinking Mode

| Model | Thinking Default | Notes |
|-------|------------------|-------|
| `qwen3.5-plus` / `qwen3.5-flash` | On | Disable with `enable_thinking: false` for simple tasks. |
| `qwen3-vl-plus` / `qwen3-vl-flash` | Off | Enable with `enable_thinking: true`. |
| `qvq-max` | Always on | Streaming output required. |

See visual-reasoning.md for details.
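The defaults in the table can be captured in a lookup, sketched below. `thinking_enabled` is a hypothetical helper, and falling back to "off" for models not listed in the table is an assumption.

```python
THINKING_DEFAULTS = {
    "qwen3.5-plus": True,
    "qwen3.5-flash": True,
    "qwen3-vl-plus": False,
    "qwen3-vl-flash": False,
}
ALWAYS_THINKING = {"qvq-max"}

def thinking_enabled(model, enable_thinking=None):
    """Effective thinking state for a model and an optional override."""
    if model in ALWAYS_THINKING:
        return True  # cannot be switched off
    if enable_thinking is not None:
        return enable_thinking  # explicit enable_thinking value wins
    return THINKING_DEFAULTS.get(model, False)  # assumed default: off
```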

OCR (qwen-vl-ocr)

Optimized for text extraction. Supports multi-language, skewed images, tables, formulas. See ocr.md for parameters and examples.

Input Limits

Images: BMP/JPEG/PNG/TIFF/WEBP/HEIC. Min 10px sides, aspect ratio <= 200:1. Max 20 MB (URL, Qwen3.5) / 10 MB (others).
Videos: MP4/AVI/MKV/MOV/FLV/WMV. Duration 2s–2h (Qwen3.5) / 2s–10min (others). Max 2 GB (URL) / 10 MB (base64). fps range [0.1, 10], default 2.0.
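A few of these limits can be pre-checked locally before calling the API. The sketch below covers only image dimensions and the fps range; the helper names are hypothetical, and reading "Min 10px sides" as "each side >= 10 px" is an assumption.

```python
def check_image_size(width, height):
    """Return a list of problems; an empty list means the size is acceptable."""
    problems = []
    if min(width, height) < 10:
        problems.append("each side must be at least 10px")
    # Multiplication avoids dividing by zero for degenerate sizes.
    if max(width, height) > 200 * min(width, height):
        problems.append("aspect ratio must be <= 200:1")
    return problems

def check_fps(fps):
    """True when fps lies in the documented range [0.1, 10]."""
    return 0.1 <= fps <= 10
```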

Error Handling

| HTTP | Meaning | Action |
|------|---------|--------|
| 401 | Invalid or missing API key | Run qwencloud-ops-auth if available; else prompt user to set key (non-plaintext check only) |
| 400 | Bad request (invalid format) | Verify messages format and image URL/format |
| 429 | Rate limited | Retry with exponential backoff |
| 5xx | Server error | Retry with exponential backoff |
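
Exponential backoff for 429 and 5xx responses can be sketched as below. The retry count, base delay, cap, and jitter are illustrative choices rather than values taken from the scripts, and `send` is a hypothetical callable that returns `(status, body)`.

```python
import random
import time

def is_retryable(status):
    """429 and 5xx are retried; 400 and 401 are returned immediately."""
    return status == 429 or 500 <= status <= 599

def call_with_retry(send, max_retries=4, base=1.0, cap=30.0, sleep=time.sleep):
    """Call send() until it succeeds, fails permanently, or retries run out."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if not is_retryable(status) or attempt == max_retries:
            return status, body
        delay = min(cap, base * 2 ** attempt)  # 1s, 2s, 4s, ... up to cap
        sleep(delay + random.uniform(0, delay / 2))  # jitter spreads retries
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.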

Output Location

Prefer the current working directory. Default subdirectory: `./output/qwencloud-vision/`.
Write prohibition: NEVER write output files into this skill's installation directory or any `skills/` hierarchy. All generated content must go to `output/` under the current working directory or a user-specified path.
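The write prohibition can be enforced when an output path is resolved. This is a sketch under stated assumptions: `resolve_output` is a hypothetical helper, and detecting a `skills/` hierarchy by looking for a path component named `skills` is a simplification.

```python
from pathlib import Path

def resolve_output(filename, skill_dir, cwd=None, user_path=None):
    """Return an absolute output path, refusing skill directories."""
    cwd = Path.cwd() if cwd is None else Path(cwd)
    if user_path:
        target = Path(user_path)
        if not target.is_absolute():
            target = cwd / target
    else:
        # Default location from this document.
        target = cwd / "output" / "qwencloud-vision" / filename
    target = target.resolve()
    skill_root = Path(skill_dir).resolve()
    if target == skill_root or skill_root in target.parents:
        raise ValueError("refusing to write inside the skill's installation directory")
    if "skills" in target.parts[:-1]:
        raise ValueError("refusing to write inside a skills/ hierarchy")
    return target
```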

Update Check (MANDATORY Post-Execution)

After script execution, check stderr for action signals. The script automatically detects update status and emits structured signals. You MUST act on them. This section also applies when the user explicitly asks to check for updates.

Signal: `[ACTION_REQUIRED] qwencloud-update-check skill is not installed`

When stderr contains this signal, you MUST present the following options to the user and wait for their response before proceeding. Do NOT silently skip this.
The update-check skill is not installed. It keeps you informed about new qwencloud/qwencloud-ai features and bug fixes.
  1. Install now: I'll run the install command for you
  2. Skip for now: do nothing this time
  3. Never remind me: permanently suppress this prompt
After the user responds:
  • User chooses 1 → run: `npx skills add QwenCloud/qwencloud-ai --skill qwencloud-update-check -y`, then run: `python3 <this-skill-scripts-dir>/gossamer.py --dismiss`
  • User chooses 2 → run: `python3 <this-skill-scripts-dir>/gossamer.py --dismiss`, then continue normally
  • User chooses 3 → run: `python3 <this-skill-scripts-dir>/gossamer.py --never-install`, then continue normally

Signal: `[UPDATE_AVAILABLE]`

When stderr contains this signal, you MUST append a brief update notice to your response, including the version info and the update command shown in the stderr output.

No signal in stderr

If stderr contains neither `[ACTION_REQUIRED]` nor `[UPDATE_AVAILABLE]`, no action is needed: the skill is installed and up to date (or cached within 24h).
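The three stderr cases and the follow-up commands can be sketched as a small dispatcher. Function names are hypothetical; the command strings are copied from this section, and giving `[ACTION_REQUIRED]` precedence when both signals appear is an assumption the document does not state.

```python
INSTALL_CMD = "npx skills add QwenCloud/qwencloud-ai --skill qwencloud-update-check -y"

def update_action(stderr):
    """Map stderr content to 'ask_user', 'append_notice', or 'none'."""
    if "[ACTION_REQUIRED]" in stderr:
        return "ask_user"  # assumed to take precedence over an update notice
    if "[UPDATE_AVAILABLE]" in stderr:
        return "append_notice"
    return "none"

def commands_for_choice(choice, scripts_dir):
    """Commands to run after the user picks option 1, 2, or 3."""
    gossamer = f"python3 {scripts_dir}/gossamer.py"
    table = {
        1: [INSTALL_CMD, f"{gossamer} --dismiss"],
        2: [f"{gossamer} --dismiss"],
        3: [f"{gossamer} --never-install"],
    }
    if choice not in table:
        raise ValueError("choice must be 1, 2, or 3")
    return table[choice]
```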

Explicit user request

When the user explicitly asks to check for updates (e.g. "check for updates", "check version"):
  1. Look for `qwencloud-update-check/SKILL.md` in sibling skill directories.
  2. If found, run: `python3 <qwencloud-update-check-dir>/scripts/check_update.py --print-response` and report the result.
  3. If not found, present the install options above.

References

  • execution-guide.md — Fallback paths (curl, code generation, autonomous)
  • curl-examples.md — Curl templates (base64, multi-image, video, OCR)
  • api-guide.md — API supplementary guide
  • visual-reasoning.md — QVQ visual reasoning guide
  • ocr.md — Qwen-VL-OCR text extraction guide
  • sources.md — Official documentation URLs