qwencloud-model-selector

Agent setup: If your agent doesn't auto-load skills (e.g. Claude Code), see agent-compatibility.md once per session.

Qwen Model Selector (Advisor)

This skill operates in two modes:
  1. Interactive advisory — asks diagnostic questions to recommend the right model (see Diagnostic Flow).
  2. Cross-skill resolution — provides a fast-path model lookup for execution skills that need a model decision without user interaction (see Cross-Skill Model Resolution).
Do not fabricate model names — only recommend models listed in this skill. This skill is part of qwencloud/qwencloud-ai.

Skill directory

Use this skill's reference files for data and learning. Load on demand — do not fetch external URLs unless the user explicitly asks for latest data.
| Location | Purpose |
| --- | --- |
| references/pricing.md | Pricing overview: model categories, billing units, and link to official pricing page |
| references/model-list.md | Model catalog (point-in-time snapshot) |
| references/sources.md | Official documentation URLs (manual lookup only) |
| references/agent-compatibility.md | Agent self-check: register skills in project config for agents that don't auto-load |

Security

NEVER output any API key or credential in plaintext. Always use variable references (`$DASHSCOPE_API_KEY` in shell, `os.environ["QWEN_API_KEY"]` in Python). Any check or detection of credentials must be non-plaintext: report only status (e.g. "set" / "not set", "valid" / "invalid"), never the value. Never display contents of `.env` or config files that may contain secrets.
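A minimal sketch of a status-only credential check, assuming the key lives in an environment variable. The helper name `credential_status` is hypothetical and not part of this skill; the variable name is passed in by the caller rather than fixed here.

```python
import os


def credential_status(var_name: str = "DASHSCOPE_API_KEY") -> str:
    """Report whether an environment variable is set, never its value.

    `credential_status` is an illustrative helper, not part of the skill.
    """
    value = os.environ.get(var_name)
    # Only the status string leaves this function; the value is never returned or printed.
    return "set" if value else "not set"
```

Calling `credential_status()` returns either `"set"` or `"not set"`, so the result can be shown to the user without exposing the secret.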

Coding Plan Models

Users with a Coding Plan subscription have access to a limited set of models through their coding tools only:
| Model | Context | Thinking |
| --- | --- | --- |
| qwen3.5-plus | 1M | Yes (budget: 81,920) |
| kimi-k2.5 | 256K | Yes (budget: 81,920) |
| glm-5 | 198K | Yes (budget: 32,768) |
| MiniMax-M2.5 | 192K | Yes (budget: 32,768) |
| qwen3-max-2026-01-23 | 256K | Yes (budget: 81,920) |
| qwen3-coder-next | 256K | No |
| qwen3-coder-plus | 1M | No |
| glm-4.7 | 198K | Yes (budget: 32,768) |
Coding Plan does not include image, video, TTS, or specialized vision models. When recommending models, note if the user's chosen model falls outside this list and they are using a Coding Plan key (`sk-sp-...`). If qwencloud-ops-auth is installed, see its `references/codingplan.md` for the full model list and error codes.
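The Coding Plan check above could be sketched roughly as follows. The function names are hypothetical; the model set is copied from the table in this section, and the only thing inspected on the key is its `sk-sp-` prefix, which is never echoed back.

```python
from typing import Optional

# Models listed in the Coding Plan table of this skill (point-in-time snapshot).
CODING_PLAN_MODELS = frozenset({
    "qwen3.5-plus", "kimi-k2.5", "glm-5", "MiniMax-M2.5",
    "qwen3-max-2026-01-23", "qwen3-coder-next", "qwen3-coder-plus", "glm-4.7",
})


def is_coding_plan_key(api_key: str) -> bool:
    """True if the key carries the Coding Plan prefix; the key itself is not logged."""
    return api_key.startswith("sk-sp-")


def coding_plan_warning(model: str, uses_coding_plan_key: bool) -> Optional[str]:
    """Return a note for the user when the model is outside the Coding Plan list."""
    if uses_coding_plan_key and model not in CODING_PLAN_MODELS:
        return f"{model} is not included in the Coding Plan model list."
    return None
```

A caller would pass `is_coding_plan_key(...)` as the second argument and surface the returned note alongside the recommendation.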

Diagnostic Flow

Ask the user (in order):
  1. Content type? — text / image / video / audio / vision
  2. Primary task? — generation / understanding / coding / reasoning / translation
  3. Priority? — quality vs speed vs cost
  4. Input size? — short / medium / long context
  5. Structured output? — JSON / function calling needed?

Cross-Skill Model Resolution

When an execution skill needs to choose a model, evaluate across three dimensions: Requirement → Scenario → Pricing. If the user explicitly specified a model, use it as given — but still verify availability; if restricted, warn the user and suggest an alternative.

Dimension 1 · Requirement (select)

Match task capability to the right model. Use when the user's need points to a specialized model, or when the task is ambiguous and you need to compare capabilities.
| Signal | Keywords | Model |
| --- | --- | --- |
| Reasoning | "think step by step", "reason", "analyze" | qwq-plus (text) · qvq-max (vision) |
| Coding | "write code", "implement", "debug" | qwen3-coder-plus |
| OCR / document | "extract text", "OCR", "scan" | qwen-vl-ocr |
| Long context | "long document", "large file" | qwen3.5-plus (1M context) |
| Multimodal (text+image+video) | "analyze image", "understand video" + text | qwen3.5-plus (unified multimodal) |
| Voice interaction / omni | "voice chat", "speak", "listen" | qwen3-omni-flash |
| Built-in tools | "search the web", "run code", "use tools" | qwen3-max (web search, code interpreter) |
| Image editing / style transfer | "edit image", "style transfer", "reference image" | wan2.6-image (preferred) · wan2.5-i2i-preview |
| Image-to-image fusion | "place object", "combine images", "fuse images" | wan2.6-image · wan2.5-i2i-preview |
| Style TTS | "emotion", "tone", "pace" | qwen3-tts-instruct-flash |
| Ambiguous | task doesn't clearly map to one model | compare Recommendation Matrix; ask user to clarify if needed |
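As an illustration of the signal table, here is a rough sketch of keyword matching. The skill does not prescribe a matching algorithm; naive case-insensitive substring matching and the names `REQUIREMENT_SIGNALS` and `match_requirement` are assumptions made for this example.

```python
# (signal, keywords, model) rows copied from the Requirement table above.
REQUIREMENT_SIGNALS = [
    ("Reasoning", ["think step by step", "reason", "analyze"], "qwq-plus (text) / qvq-max (vision)"),
    ("Coding", ["write code", "implement", "debug"], "qwen3-coder-plus"),
    ("OCR / document", ["extract text", "ocr", "scan"], "qwen-vl-ocr"),
    ("Long context", ["long document", "large file"], "qwen3.5-plus"),
    ("Multimodal", ["analyze image", "understand video"], "qwen3.5-plus"),
    ("Voice interaction / omni", ["voice chat", "speak", "listen"], "qwen3-omni-flash"),
    ("Built-in tools", ["search the web", "run code", "use tools"], "qwen3-max"),
    ("Image editing / style transfer", ["edit image", "style transfer", "reference image"], "wan2.6-image"),
    ("Image-to-image fusion", ["place object", "combine images", "fuse images"], "wan2.6-image"),
    ("Style TTS", ["emotion", "tone", "pace"], "qwen3-tts-instruct-flash"),
]


def match_requirement(request: str) -> list:
    """Return every (signal, model) pair whose keywords appear in the request.

    Zero or several matches means the task is ambiguous: compare the
    Recommendation Matrix or ask the user to clarify.
    """
    text = request.lower()
    return [
        (signal, model)
        for signal, keywords, model in REQUIREMENT_SIGNALS
        if any(keyword in text for keyword in keywords)
    ]
```

Substring matching is deliberately crude (for instance, "analyze image" also triggers the Reasoning row through "analyze"), so multiple hits should be treated as the Ambiguous case rather than resolved silently.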

Dimension 2 · Scenario (tune)

Adjust model tier based on how the model will be used.
| Pattern | Signals | Guidance |
| --- | --- | --- |
| Interactive / real-time | "chat", "real-time", "interactive" | Prefer flash/turbo variants; enable streaming |
| Batch / offline | "batch", "offline", "background" | Quality model + Batch API (50% off) |
| One-off trial | "try", "test", "experiment" | Quality model; check if free quota is still available in user's console |
| High-volume production | "production", "at scale", "high volume" | Cost-optimize: flash/turbo + context cache |
| Repeated context | "template", "same prompt", "repeated" | Enable context caching for input token discount |

Dimension 3 · Pricing (optimize)

Given the candidates from dimensions 1–2, compare costs and apply modifiers.
  • Pricing reference: pricing.md. For the latest rates, check the official pricing page.
  • Free quota: Some models offer a limited free quota after activation. However, quotas may have been consumed, expired, or changed. Never assume remaining free quota — always present the paid unit price.
  • Batch API: 50% off both input and output tokens for non-realtime workloads.
  • Context cache: Input token discount for repeated/templated contexts.
  • Tiered pricing: Some models charge more per token as input length increases — check pricing tables for breakpoints.
  • When cost is the user's primary concern, explicitly recommend the cheapest viable model and cite the price.

Default

No signals detected, clear task → use the Canonical Default for the domain.
| Domain | Default | Quality | Speed | Cost |
| --- | --- | --- | --- | --- |
| text.chat | qwen3.5-plus | qwen3-max | qwen3.5-flash | qwen-turbo |
| vision.analyze | qwen3-vl-plus | qwen3-vl-plus | qwen3-vl-flash | qwen3-vl-flash |
| omni (voice+vision) | qwen3-omni-flash | qwen3-omni-flash | qwen3-omni-flash | |
| image.generate | wan2.6-t2i | wan2.6-t2i | wan2.2-t2i-flash | wan2.2-t2i-flash |
| image.edit | wan2.6-image | wan2.6-image | wan2.5-i2i-preview | wan2.5-i2i-preview |
| video.t2v | wan2.6-t2v | wan2.6-t2v | | |
| video.i2v | wan2.6-i2v-flash | wan2.6-i2v | wan2.6-i2v-flash | |
| audio.tts | qwen3-tts-flash | qwen3-tts-flash | | |
Degradation: If this skill is not loaded or not available, each execution skill falls back to its own built-in default. This protocol is purely additive — it enhances model selection but never blocks execution.
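The default lookup and its degradation behavior might look something like the sketch below. `CANONICAL_DEFAULTS` and `resolve_model` are hypothetical names; only the domains that list all four tiers are included, and returning `None` stands in for "the execution skill falls back to its own built-in default".

```python
from typing import Optional

# Domains from the Default table that list all four tiers.
CANONICAL_DEFAULTS = {
    "text.chat": {"default": "qwen3.5-plus", "quality": "qwen3-max",
                  "speed": "qwen3.5-flash", "cost": "qwen-turbo"},
    "vision.analyze": {"default": "qwen3-vl-plus", "quality": "qwen3-vl-plus",
                       "speed": "qwen3-vl-flash", "cost": "qwen3-vl-flash"},
    "image.generate": {"default": "wan2.6-t2i", "quality": "wan2.6-t2i",
                       "speed": "wan2.2-t2i-flash", "cost": "wan2.2-t2i-flash"},
    "image.edit": {"default": "wan2.6-image", "quality": "wan2.6-image",
                   "speed": "wan2.5-i2i-preview", "cost": "wan2.5-i2i-preview"},
}


def resolve_model(domain: str, priority: Optional[str] = None,
                  user_model: Optional[str] = None) -> Optional[str]:
    """Pick a model for a domain; None means 'use the execution skill's own default'."""
    if user_model:
        # An explicit user choice wins; availability still has to be verified elsewhere.
        return user_model
    tiers = CANONICAL_DEFAULTS.get(domain)
    if tiers is None:
        return None  # never block execution: the caller degrades to its built-in default
    return tiers.get(priority or "default", tiers["default"])
```

Because an unknown domain or priority never raises, the lookup stays purely additive, in line with the degradation note above.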

Model Recommendation Matrix

Text Models

| Use Case | Recommended | Why |
| --- | --- | --- |
| General chat/assistant | qwen3.5-plus | Best balance of quality, speed, cost. Also accepts image/video input (multimodal). Thinking enabled by default. |
| Fast responses, low cost | qwen3.5-flash | 3x faster, 70% cheaper than Plus. Thinking enabled by default. |
| Highest quality | qwen3-max | Strongest reasoning. Built-in tools (web search, code interpreter). Supports thinking mode. |
| Code generation | qwen3-coder-next | Best balance of code quality, speed, cost. Agentic coding. `qwen3-coder-plus` for highest quality. |
| Complex reasoning | qwq-plus | Chain-of-thought reasoning specialist |
| Long documents | qwen3.5-plus | Up to 1M context. For >1M needs, see model-list.md. |
| Budget/high volume | qwen-turbo | Cheapest per-token cost |

Image Models

| Use Case | Recommended | Why |
| --- | --- | --- |
| Best quality text-to-image | wan2.6-t2i | Latest model, sync support |
| Image editing / style transfer (1–4 refs) | wan2.6-image | Multi-image composition, subject consistency, 2K output, interleaved text-image |
| Image editing / multi-image fusion (1–3 refs) | wan2.5-i2i-preview | Simpler prompt-based editing, subject consistency, multi-image fusion |
| Interleaved text-image output (tutorials) | wan2.6-image | Mixed text+image generation |
| Fast iteration | wan2.2-t2i-flash | 50% faster generation |
| Flexible resolution | wan2.5-t2i-preview | Custom aspect ratios |

Video Models

| Use Case | Recommended | Why |
| --- | --- | --- |
| Quick video creation | wan2.6-i2v-flash | Fast, multi-shot narrative |
| High quality | wan2.6-i2v | Best visual quality |
| With audio | wan2.5-i2v-preview | Auto-dubbing support |

Audio Models

| Use Case | Recommended | Why |
| --- | --- | --- |
| Highest quality | cosyvoice-v3-plus | Best naturalness, emotional expression, professional scenarios |
| High quality + speed | cosyvoice-v3-flash | Good balance of quality and performance |
| Standard TTS | qwen3-tts-flash | Fast, reliable, multi-language, cost-effective |
| Controlled style | qwen3-tts-instruct-flash | Instruction-guided voice style (tone/emotion) |

Vision Models

| Use Case | Recommended | Why |
| --- | --- | --- |
| Best accuracy | qwen3-vl-plus | Highest vision understanding. Thinking mode supported. 256K context. |
| Fast analysis | qwen3-vl-flash | Quick image understanding. Thinking mode supported. |
| Unified text+vision | qwen3.5-plus | Multimodal (text + image + video). Surpasses qwen3-vl series on many benchmarks. Use when both text quality and vision matter. |

Omni Models

| Use Case | Recommended | Why |
| --- | --- | --- |
| Voice + vision chat | qwen3-omni-flash | Text/image/audio/video → text or speech. 49 voices, 10 languages. Thinking supported. |
| Real-time voice | qwen3-omni-flash-realtime | Streaming audio input + built-in VAD. 49 voices. |

Pricing Guidance

  • Default pricing: pricing.md — International, USD. For the latest rates, check the official pricing page.
  • Latest prices: When the user explicitly asks for exact/latest pricing, see sources.md for official URLs.
  • Cost formula: `Cost = Tokens ÷ 1,000,000 × Unit price`. 1K Chinese chars ≈ 1,200-1,500 tokens.
  • Free quota: Some models offer a limited free quota after activation — but quotas may have been consumed, expired, or changed without notice. Always present the paid unit price first. Mention free quota only as something the user should verify in their QwenCloud console.
  • Cost tips:
    • Use Batch calling for 50% off in non-realtime scenarios
    • Enable context cache for repeated contexts
    • Use flash/turbo series for non-critical tasks
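The cost formula and the Batch discount can be expressed as a small sketch. The function names are hypothetical, and the unit price is always a caller-supplied argument taken from `references/pricing.md` or the official pricing page; no real price appears in the code.

```python
from typing import Tuple


def estimate_cost(tokens: int, unit_price_per_million: float, batch: bool = False) -> float:
    """Cost = Tokens / 1,000,000 x Unit price, with 50% off when the Batch API is used.

    `unit_price_per_million` must come from a confirmed pricing source;
    this function never supplies a price of its own.
    """
    cost = tokens / 1_000_000 * unit_price_per_million
    return cost * 0.5 if batch else cost


def estimate_tokens_for_chinese(chars: int) -> Tuple[int, int]:
    """Rough (low, high) token range, using 1K Chinese chars ~ 1,200-1,500 tokens."""
    return (chars * 1200 // 1000, chars * 1500 // 1000)
```

Any figure produced this way is an estimate: present it with hedging language, state the assumptions used (for example the tokens-per-character ratio), and follow it with the mandatory disclaimer.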

Cost Estimation Disclaimer (MANDATORY)

🚨 CRITICAL, NO EXCEPTIONS: NEVER fabricate, invent, or guess any price figure. If you do not have a confirmed price from `references/pricing.md` or the official pricing page, you MUST NOT output any number. Instead, direct the user to the official pricing page. Outputting a made-up price is a critical failure, worse than saying "I don't know."
When responding to any cost-related query — including but not limited to price evaluation, usage estimation, budget forecasting, or cost comparison — you MUST append a professional disclaimer. This applies regardless of language or response format.
Required disclaimer (Chinese response):
⚠️ 费用说明:以上费用为基于官方公示单价的预估价格,仅供参考。实际费用受 Token 消耗量、上下文长度阶梯定价、Batch/缓存折扣及计费策略调整等因素影响,请以QwenCloud控制台的实际账单为准。部分模型可能提供限时免费额度,但免费额度的可用性、额度量及有效期随时可能调整,请在控制台确认您的账户是否仍有剩余额度,切勿假设本次调用免费。最新定价详见 模型定价页
Required disclaimer (English response):
⚠️ Pricing Notice: The cost figures above are estimates calculated from officially published unit prices and are provided for reference only. Actual charges depend on token consumption, tiered context-length pricing, Batch/cache discounts, and billing policy updates. Some models may offer a time-limited free quota, but quota availability, amounts, and validity periods are subject to change — do not assume this call is free. Please verify your remaining quota in the QwenCloud console and refer to the actual bill for definitive costs. See Model Pricing for the latest rates.
Rules:
  • The disclaimer must appear at the end of every cost-related response, clearly separated from the main content.
  • When the estimate involves assumptions (e.g., average tokens per character, assumed context length tier), explicitly state each assumption used in the calculation.
  • Never present estimated costs as exact or guaranteed amounts. Use hedging language such as "approximately", "estimated at", "roughly" (or Chinese equivalents "约", "预估", "约合") throughout the cost breakdown.
  • Never tell the user a call will be free or cost $0/¥0. Even if a free quota exists, the user may have already consumed it. Always present the paid price and note that a free quota may apply — subject to the user verifying in their console.
  • If pricing data is unavailable or uncertain, say so explicitly and link to the official pricing page. Never fill the gap with a guess.

Available Models

All standard text, vision, image, video, audio, and coding models are available. Some models offer free quota (verify in console).
  • Text: qwen3-max, qwen3.5-plus, qwen3.5-flash, qwen-turbo, qwq-plus, qwen3-coder-next/plus/flash, qwen-plus-character, qwen-plus-character-ja, qwen-flash-character
  • Vision: qwen3-vl-plus, qwen3-vl-flash, qvq-max, qwen-vl-ocr, qwen-vl-max, qwen-vl-plus
  • Omni: qwen3-omni-flash (+ realtime), qwen-omni-turbo (+ realtime)
  • Image generation (text-to-image): wan2.6-t2i, wan2.5-t2i-preview, wan2.2-t2i-flash, z-image-turbo
  • Image editing (requires reference images): wan2.6-image, wan2.5-i2i-preview
  • Video generation: wan2.6 series (t2v, i2v, i2v-flash, r2v, r2v-flash), wan2.5/2.2 series, vace
  • TTS: qwen3-tts-flash, qwen3-tts-instruct-flash, cosyvoice-v3 series
  • ASR: qwen3-asr-flash, fun-asr
  • Embedding/Rerank: text-embedding-v4, qwen3-rerank
  • Translation: qwen-mt-plus/flash/lite/turbo
⚠️ Important: The model list above is a point-in-time snapshot and may be outdated. Model availability changes frequently. Always check the official model list for the authoritative, up-to-date catalog before making model decisions. See model-list.md for a more detailed local reference.

Thinking Mode

Several models support hybrid thinking/non-thinking modes:
| Model | Thinking Default | Notes |
| --- | --- | --- |
| qwen3.5-plus | On | Thinking enabled by default. Use `enable_thinking: false` to disable. |
| qwen3.5-flash | On | Thinking enabled by default. |
| qwen3-max | Off | Use `enable_thinking: true` for complex reasoning. Built-in tools available in thinking mode. |
| qwen-plus / qwen-flash / qwen-turbo | Off | Hybrid; enable for deeper reasoning at higher output cost. |
| qwen3-vl-plus / qwen3-vl-flash | Off | Vision + thinking for complex visual analysis. |
| qwen3-omni-flash | Off | Thinking supported; audio output not available in thinking mode. |
| qwq-plus / qvq-max | Always on | Pure reasoning models; CoT always active. |
Guidance: Do not enable thinking by default for simple or conversational tasks — it increases latency and output token cost. Enable only when the user explicitly asks for deep reasoning or the task requires multi-step analysis.
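One way to apply this guidance is to send `enable_thinking` only when the desired behavior differs from the model's default. The sketch below assumes the flag works the same way on every hybrid model in the table and leaves open how it is passed in the actual request; `thinking_override` is a hypothetical helper.

```python
# Default thinking state per hybrid model, copied from the table above.
THINKING_DEFAULT = {
    "qwen3.5-plus": True, "qwen3.5-flash": True, "qwen3-max": False,
    "qwen-plus": False, "qwen-flash": False, "qwen-turbo": False,
    "qwen3-vl-plus": False, "qwen3-vl-flash": False, "qwen3-omni-flash": False,
}
# Pure reasoning models: chain-of-thought cannot be switched off.
ALWAYS_ON = {"qwq-plus", "qvq-max"}


def thinking_override(model: str, needs_deep_reasoning: bool) -> dict:
    """Return extra request parameters, or {} when the model default already fits."""
    if model in ALWAYS_ON or model not in THINKING_DEFAULT:
        return {}
    if THINKING_DEFAULT[model] == needs_deep_reasoning:
        return {}
    return {"enable_thinking": needs_deep_reasoning}
```

For a simple conversational task on `qwen3.5-plus` this yields `{"enable_thinking": False}`, which avoids the extra latency and output-token cost described above.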

Anti-Patterns

  • Only recommend models listed in this skill — never fabricate model names.
  • When unsure, use `qwen3.5-plus` as a safe default for text tasks.
  • 🚨 NEVER invent or guess any price figure: only use pricing from `references/pricing.md` or the official pricing page. If the data is not available, say so and link to the official page. Fabricating a price is a critical failure.
  • Always cite data source when providing pricing info.
  • Default currency is USD. Check the official pricing page for the latest rates. Writing in Chinese does NOT imply CNY.
  • All cost estimates must include the mandatory disclaimer — use hedging language ("approximately", "estimated"). Omitting the disclaimer is a critical violation.
  • Never assume free quota is available — free quotas may have been consumed, expired, or removed. Never tell the user a call will cost $0/¥0. Always present the paid unit price first; mention free quota only as a possibility the user should verify in their console.
  • Do NOT proactively fetch URLs or trigger web searches — only access online sources when the user explicitly asks for "latest" data AND reference files cannot answer.
  • Never output API keys in plaintext — see Security section above.

Online Information Sources

Reference files in this skill are point-in-time snapshots. For official documentation URLs, see sources.md.
Do NOT proactively fetch or search online. Only access online sources when ALL of these conditions are met:
  1. The user explicitly asks for "latest", "real-time", or "current" data.
  2. The reference files in this skill cannot answer the question.
  3. The user confirms they want an online lookup.
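The three conditions above combine with a logical AND, which can be sketched as a single gate. The function and parameter names are hypothetical.

```python
def may_access_online(user_asked_for_latest: bool,
                      references_can_answer: bool,
                      user_confirmed_lookup: bool) -> bool:
    """Allow an online lookup only when all three conditions hold."""
    return (
        user_asked_for_latest          # 1. explicit request for latest/real-time/current data
        and not references_can_answer  # 2. local reference files cannot answer
        and user_confirmed_lookup      # 3. user confirmed the online lookup
    )
```

If any one input is unfavorable, the gate stays closed and the answer should come from the local reference files.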

Update Check (MANDATORY Post-Execution)

When the user asks to check for updates (e.g. "check for updates", "check version", "is there a new version", "update skills"):
  1. Find qwencloud-update-check: Look for `qwencloud-update-check/SKILL.md` in sibling skill directories.
  2. If found, run `python3 <qwencloud-update-check-dir>/scripts/check_update.py --print-response` and report the result. Use `--force` if the user asks to force-check.
  3. If not found — you MUST present the following options and wait for the user's response:
The update-check skill is not installed. It keeps you informed about new qwencloud/qwencloud-ai features and bug fixes.
  1. Install now — I'll run the install command for you
  2. Skip for now — do nothing this time
  3. Never remind me — permanently suppress this prompt
After the user responds:
  • User chooses 1 → run `npx skills add QwenCloud/qwencloud-ai --skill qwencloud-update-check -y`, then run `python3 <this-skill-scripts-dir>/gossamer.py --dismiss`
  • User chooses 2 → run `python3 <this-skill-scripts-dir>/gossamer.py --dismiss`, then continue normally
  • User chooses 3 → run `python3 <this-skill-scripts-dir>/gossamer.py --never-install`, then continue normally
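The mapping from the user's choice to follow-up commands can be summarized in a short sketch. The command strings are copied from the steps above with their `<...>` placeholders left unresolved; `commands_for_choice` is a hypothetical helper that only builds the list and does not execute anything.

```python
INSTALL_CMD = "npx skills add QwenCloud/qwencloud-ai --skill qwencloud-update-check -y"
# Placeholder path kept exactly as written in this skill; resolve it before running.
GOSSAMER = "python3 <this-skill-scripts-dir>/gossamer.py"


def commands_for_choice(choice: int) -> list:
    """Return the commands to run, in order, for menu option 1, 2, or 3."""
    if choice == 1:
        return [INSTALL_CMD, f"{GOSSAMER} --dismiss"]
    if choice == 2:
        return [f"{GOSSAMER} --dismiss"]
    if choice == 3:
        return [f"{GOSSAMER} --never-install"]
    raise ValueError("choice must be 1, 2, or 3")
```

An invalid choice raises instead of guessing, since the flow requires waiting for an explicit user response.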

References

  • pricing.md — Pricing overview: model categories, billing units, and link to official pricing page
  • model-list.md — Model catalog (2026-03 snapshot; check official model list for latest)
  • sources.md — Official documentation URLs (for manual lookup only)