qwencloud-model-selector
Agent setup: If your agent doesn't auto-load skills (e.g. Claude Code), see agent-compatibility.md once per session.
Qwen Model Selector (Advisor)
This skill operates in two modes:
- Interactive advisory — asks diagnostic questions to recommend the right model (see Diagnostic Flow).
- Cross-skill resolution — provides a fast-path model lookup for execution skills that need a model decision without user interaction (see Cross-Skill Model Resolution).
Do not fabricate model names — only recommend models listed in this skill.
This skill is part of qwencloud/qwencloud-ai.
Skill directory
Use this skill's reference files for data and learning. Load on demand — do not fetch external URLs unless the user
explicitly asks for latest data.
| Location | Purpose |
|---|---|
| pricing.md | Pricing overview: model categories, billing units, and link to official pricing page |
| model-list.md | Model catalog (point-in-time snapshot) |
| sources.md | Official documentation URLs (manual lookup only) |
| agent-compatibility.md | Agent self-check: register skills in project config for agents that don't auto-load |
Security
NEVER output any API key or credential in plaintext. Always use variable references (`$DASHSCOPE_API_KEY` in shell,
`os.environ["QWEN_API_KEY"]` in Python). Any check or detection of credentials must be non-plaintext: report
only status (e.g. "set" / "not set", "valid" / "invalid"), never the value. Never display contents of `.env` or config
files that may contain secrets.
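A minimal sketch of the status-only check described above, assuming the key is stored in an environment variable. The helper name `credential_status` is illustrative and not part of this skill.

```python
import os


def credential_status(var_name: str = "DASHSCOPE_API_KEY") -> str:
    """Return "set" or "not set" for an environment variable.

    The value itself is never returned, logged, or printed.
    """
    value = os.environ.get(var_name, "")
    return "set" if value.strip() else "not set"


if __name__ == "__main__":
    # Only the status string reaches the output, never the secret.
    print(f"DASHSCOPE_API_KEY: {credential_status()}")
```

Treating a whitespace-only value as "not set" is a design choice of this sketch, not a rule stated by the skill.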
Coding Plan Models
Users with a Coding Plan subscription have access to a
limited set of models through their coding tools only:
| Model | Context | Thinking |
|---|---|---|
| qwen3.5-plus | 1M | Yes (budget: 81,920) |
| kimi-k2.5 | 256K | Yes (budget: 81,920) |
| glm-5 | 198K | Yes (budget: 32,768) |
| MiniMax-M2.5 | 192K | Yes (budget: 32,768) |
| qwen3-max-2026-01-23 | 256K | Yes (budget: 81,920) |
| qwen3-coder-next | 256K | No |
| qwen3-coder-plus | 1M | No |
| glm-4.7 | 198K | Yes (budget: 32,768) |
Coding Plan does not include image, video, TTS, or specialized vision models. When recommending models, note if the
user's chosen model falls outside this list and they are using a Coding Plan key (`sk-sp-...`). If qwencloud-ops-auth is
installed, see its `references/codingplan.md` for the full model list and error codes.
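The check described above could be sketched as follows. The function name `coding_plan_warning` and the warning wording are assumptions made for illustration; the model names are copied from the Coding Plan table and the `sk-sp-` prefix from the text.

```python
from typing import Optional

# Model names copied from the Coding Plan table above.
CODING_PLAN_MODELS = frozenset({
    "qwen3.5-plus", "kimi-k2.5", "glm-5", "MiniMax-M2.5",
    "qwen3-max-2026-01-23", "qwen3-coder-next", "qwen3-coder-plus", "glm-4.7",
})


def coding_plan_warning(model: str, api_key: str) -> Optional[str]:
    """Return a warning when a Coding Plan key is paired with a model outside the plan."""
    uses_coding_plan_key = api_key.startswith("sk-sp-")
    if uses_coding_plan_key and model not in CODING_PLAN_MODELS:
        # The key itself is deliberately left out of the message.
        return f"{model} is not included in the Coding Plan; suggest a listed model."
    return None
```

Only the key prefix is inspected, so the check stays consistent with the Security rule of never echoing credential values.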
Diagnostic Flow
Ask the user (in order):
- Content type? — text / image / video / audio / vision
- Primary task? — generation / understanding / coding / reasoning / translation
- Priority? — quality vs speed vs cost
- Input size? — short / medium / long context
- Structured output? — JSON / function calling needed?
Cross-Skill Model Resolution
When an execution skill needs to choose a model, evaluate across three dimensions: Requirement → Scenario →
Pricing. If the user explicitly specified a model, use it as given — but still verify availability; if
restricted, warn the user and suggest an alternative.
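A rough sketch of the explicit-model rule above. The helper `resolve_explicit_model`, the shape of its return value, and the warning text are assumptions for illustration; how availability is actually determined is left to the caller.

```python
from typing import Iterable, Optional, Tuple


def resolve_explicit_model(
    requested: str, available: Iterable[str], alternative: str
) -> Tuple[str, Optional[str]]:
    """Return (model, warning).

    If the requested model is available it is used as given and warning is None.
    Otherwise the suggested alternative is returned together with a warning that
    the caller should show to the user before proceeding.
    """
    if requested in set(available):
        return requested, None
    warning = f"{requested} is restricted here; suggested alternative: {alternative}."
    return alternative, warning
```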
Dimension 1 · Requirement (select)
Match task capability to the right model. Use when the user's need points to a specialized model, or when the task is
ambiguous and you need to compare capabilities.
| Signal | Keywords | Model |
|---|---|---|
| Reasoning | "think step by step", "reason", "analyze" | qwq-plus (text) · qvq-max (vision) |
| Coding | "write code", "implement", "debug" | qwen3-coder-plus |
| OCR / document | "extract text", "OCR", "scan" | qwen-vl-ocr |
| Long context | "long document", "large file" | qwen3.5-plus (1M context) |
| Multimodal (text+image+video) | "analyze image", "understand video" + text | qwen3.5-plus (unified multimodal) |
| Voice interaction / omni | "voice chat", "speak", "listen" | qwen3-omni-flash |
| Built-in tools | "search the web", "run code", "use tools" | qwen3-max (web search, code interpreter) |
| Image editing / style transfer | "edit image", "style transfer", "reference image" | wan2.6-image (preferred) · wan2.5-i2i-preview |
| Image-to-image fusion | "place object", "combine images", "fuse images" | wan2.6-image · wan2.5-i2i-preview |
| Style TTS | "emotion", "tone", "pace" | qwen3-tts-instruct-flash |
| Ambiguous | task doesn't clearly map to one model | compare Recommendation Matrix; ask user to clarify if needed |
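One way to picture the keyword matching in the table above is the sketch below. It covers only a subset of rows, and whole-word matching with a regular expression is an assumption of this example rather than how the skill itself matches signals. The name `match_requirements` is hypothetical.

```python
import re
from typing import Dict, List, Tuple

# signal -> (keywords, model), copied from a subset of the table above.
REQUIREMENT_RULES: Dict[str, Tuple[Tuple[str, ...], str]] = {
    "Reasoning": (("think step by step", "reason", "analyze"), "qwq-plus"),
    "Coding": (("write code", "implement", "debug"), "qwen3-coder-plus"),
    "OCR / document": (("extract text", "ocr", "scan"), "qwen-vl-ocr"),
    "Long context": (("long document", "large file"), "qwen3.5-plus"),
    "Voice interaction / omni": (("voice chat", "speak", "listen"), "qwen3-omni-flash"),
    "Style TTS": (("emotion", "tone", "pace"), "qwen3-tts-instruct-flash"),
}


def match_requirements(request: str) -> List[Tuple[str, str]]:
    """Return (signal, model) pairs whose keywords appear as whole words in the request."""
    text = request.lower()
    hits = []
    for signal, (keywords, model) in REQUIREMENT_RULES.items():
        if any(re.search(rf"\b{re.escape(kw)}\b", text) for kw in keywords):
            hits.append((signal, model))
    return hits
```

Zero hits or more than one hit corresponds to the "Ambiguous" row: compare the Recommendation Matrix or ask the user to clarify.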
Dimension 2 · Scenario (tune)
Adjust model tier based on how the model will be used.
| Pattern | Signals | Guidance |
|---|---|---|
| Interactive / real-time | "chat", "real-time", "interactive" | Prefer flash/turbo variants; enable streaming |
| Batch / offline | "batch", "offline", "background" | Quality model + Batch API (50% off) |
| One-off trial | "try", "test", "experiment" | Quality model; check if free quota is still available in user's console |
| High-volume production | "production", "at scale", "high volume" | Cost-optimize: flash/turbo + context cache |
| Repeated context | "template", "same prompt", "repeated" | Enable context caching for input token discount |
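The guidance column above can be read as a small set of tuning flags per usage pattern. The sketch below encodes it that way; the `ScenarioTuning` type, its field names, and the dictionary keys are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScenarioTuning:
    tier: Optional[str]  # "speed" (flash/turbo), "quality", or None to keep the current tier
    streaming: bool = False
    batch_api: bool = False
    context_cache: bool = False


# One entry per row of the table above.
SCENARIO_TUNING = {
    "interactive": ScenarioTuning(tier="speed", streaming=True),
    "batch": ScenarioTuning(tier="quality", batch_api=True),
    "trial": ScenarioTuning(tier="quality"),
    "production": ScenarioTuning(tier="speed", context_cache=True),
    "repeated_context": ScenarioTuning(tier=None, context_cache=True),
}
```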
Dimension 3 · Pricing (optimize)
Given the candidates from dimensions 1–2, compare costs and apply modifiers.
- Pricing reference: pricing.md. For the latest rates, check the official pricing page.
- Free quota: Some models offer a limited free quota after activation. However, quotas may have been consumed, expired, or changed. Never assume remaining free quota — always present the paid unit price.
- Batch API: 50% off both input and output tokens for non-realtime workloads.
- Context cache: Input token discount for repeated/templated contexts.
- Tiered pricing: Some models charge more per token as input length increases — check pricing tables for breakpoints.
- When cost is the user's primary concern, explicitly recommend the cheapest viable model and cite the price.
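As an illustration of the comparison step above, the sketch below applies the Batch API modifier and picks the cheapest candidate for a given token mix. Unit prices are supplied by the caller (for example from pricing.md); none are hard-coded here. Function names are hypothetical, and tiered pricing and cache discounts are left out for brevity.

```python
from typing import Dict, Tuple

BATCH_DISCOUNT = 0.5  # Batch API: 50% off both input and output tokens


def effective_prices(input_price: float, output_price: float, use_batch: bool) -> Tuple[float, float]:
    """Apply the Batch API modifier to (input, output) unit prices."""
    factor = BATCH_DISCOUNT if use_batch else 1.0
    return input_price * factor, output_price * factor


def cheapest_candidate(
    prices: Dict[str, Tuple[float, float]], input_tokens: int, output_tokens: int
) -> str:
    """Pick the model with the lowest estimated cost.

    prices maps model name -> (input, output) unit price per 1M tokens.
    """
    def estimated_cost(model: str) -> float:
        inp, out = prices[model]
        return (input_tokens * inp + output_tokens * out) / 1_000_000

    return min(prices, key=estimated_cost)
```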
Default
No signals detected, clear task → use the Canonical Default for the domain.
| Domain | Default | Quality | Speed | Cost |
|---|---|---|---|---|
| text.chat | qwen3.5-plus | qwen3-max | qwen3.5-flash | qwen-turbo |
| vision.analyze | qwen3-vl-plus | qwen3-vl-plus | qwen3-vl-flash | qwen3-vl-flash |
| omni (voice+vision) | qwen3-omni-flash | qwen3-omni-flash | qwen3-omni-flash | — |
| image.generate | wan2.6-t2i | wan2.6-t2i | wan2.2-t2i-flash | wan2.2-t2i-flash |
| image.edit | wan2.6-image | wan2.6-image | wan2.5-i2i-preview | wan2.5-i2i-preview |
| video.t2v | wan2.6-t2v | wan2.6-t2v | — | — |
| video.i2v | wan2.6-i2v-flash | wan2.6-i2v | wan2.6-i2v-flash | — |
| audio.tts | qwen3-tts-flash | — | qwen3-tts-flash | — |
Degradation: If this skill is not loaded or not available, each execution skill falls back to its own built-in default. This protocol is purely additive — it enhances model selection but never blocks execution.
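A possible lookup over the table above, with the additive fallback behaviour. Two details are assumptions of this sketch: an empty cell falls back to the domain's Default column, and an unknown domain returns whatever built-in default the execution skill passes in. The name `resolve_default` is hypothetical.

```python
from typing import Optional

# Rows copied from the Canonical Default table above; None marks an empty cell.
DEFAULTS = {
    "text.chat": {"default": "qwen3.5-plus", "quality": "qwen3-max", "speed": "qwen3.5-flash", "cost": "qwen-turbo"},
    "vision.analyze": {"default": "qwen3-vl-plus", "quality": "qwen3-vl-plus", "speed": "qwen3-vl-flash", "cost": "qwen3-vl-flash"},
    "omni": {"default": "qwen3-omni-flash", "quality": "qwen3-omni-flash", "speed": "qwen3-omni-flash", "cost": None},
    "image.generate": {"default": "wan2.6-t2i", "quality": "wan2.6-t2i", "speed": "wan2.2-t2i-flash", "cost": "wan2.2-t2i-flash"},
    "image.edit": {"default": "wan2.6-image", "quality": "wan2.6-image", "speed": "wan2.5-i2i-preview", "cost": "wan2.5-i2i-preview"},
    "video.t2v": {"default": "wan2.6-t2v", "quality": "wan2.6-t2v", "speed": None, "cost": None},
    "video.i2v": {"default": "wan2.6-i2v-flash", "quality": "wan2.6-i2v", "speed": "wan2.6-i2v-flash", "cost": None},
    "audio.tts": {"default": "qwen3-tts-flash", "quality": None, "speed": "qwen3-tts-flash", "cost": None},
}


def resolve_default(domain: str, priority: str = "default",
                    builtin_fallback: Optional[str] = None) -> Optional[str]:
    """Look up the model for a domain and priority, never blocking execution."""
    row = DEFAULTS.get(domain)
    if row is None:
        # Unknown domain: the execution skill keeps its own built-in default.
        return builtin_fallback
    return row.get(priority) or row["default"]
```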
Model Recommendation Matrix
Text Models
| Use Case | Recommended | Why |
|---|---|---|
| General chat/assistant | qwen3.5-plus | Best balance of quality, speed, cost. Also accepts image/video input (multimodal). Thinking enabled by default. |
| Fast responses, low cost | qwen3.5-flash | 3x faster, 70% cheaper than Plus. Thinking enabled by default. |
| Highest quality | qwen3-max | Strongest reasoning. Built-in tools (web search, code interpreter). Supports thinking mode. |
| Code generation | qwen3-coder-next | Best balance of code quality, speed, cost. Agentic coding. |
| Complex reasoning | qwq-plus | Chain-of-thought reasoning specialist |
| Long documents | qwen3.5-plus | Up to 1M context. For >1M needs, see model-list.md. |
| Budget/high volume | qwen-turbo | Cheapest per-token cost |
Image Models
| Use Case | Recommended | Why |
|---|---|---|
| Best quality text-to-image | wan2.6-t2i | Latest model, sync support |
| Image editing / style transfer (1–4 refs) | wan2.6-image | Multi-image composition, subject consistency, 2K output, interleaved text-image |
| Image editing / multi-image fusion (1–3 refs) | wan2.5-i2i-preview | Simpler prompt-based editing, subject consistency, multi-image fusion |
| Interleaved text-image output (tutorials) | wan2.6-image | Mixed text+image generation |
| Fast iteration | wan2.2-t2i-flash | 50% faster generation |
| Flexible resolution | wan2.5-t2i-preview | Custom aspect ratios |
Video Models
| Use Case | Recommended | Why |
|---|---|---|
| Quick video creation | wan2.6-i2v-flash | Fast, multi-shot narrative |
| High quality | wan2.6-i2v | Best visual quality |
| With audio | wan2.5-i2v-preview | Auto-dubbing support |
Audio Models
| Use Case | Recommended | Why |
|---|---|---|
| Highest quality | | Best naturalness, emotional expression, professional scenarios |
| High quality + speed | | Good balance of quality and performance |
| Standard TTS | | Fast, reliable, multi-language, cost-effective |
| Controlled style | | Instruction-guided voice style (tone/emotion) |
Vision Models
| Use Case | Recommended | Why |
|---|---|---|
| Best accuracy | qwen3-vl-plus | Highest vision understanding. Thinking mode supported. 256K context. |
| Fast analysis | qwen3-vl-flash | Quick image understanding. Thinking mode supported. |
| Unified text+vision | qwen3.5-plus | Multimodal (text + image + video). Surpasses qwen3-vl series on many benchmarks. Use when both text quality and vision matter. |
Omni Models
| Use Case | Recommended | Why |
|---|---|---|
| Voice + vision chat | qwen3-omni-flash | Text/image/audio/video → text or speech. 49 voices, 10 languages. Thinking supported. |
| Real-time voice | qwen3-omni-flash-realtime | Streaming audio input + built-in VAD. 49 voices. |
Pricing Guidance
- Default pricing: pricing.md — International, USD. For the latest rates, check the official pricing page.
- Latest prices: When the user explicitly asks for exact/latest pricing, see sources.md for official URLs.
- Cost formula: `Cost = Tokens ÷ 1,000,000 × Unit price`. 1K Chinese chars ≈ 1,200-1,500 tokens.
- Free quota: Some models offer a limited free quota after activation, but quotas may have been consumed, expired, or changed without notice. Always present the paid unit price first. Mention free quota only as something the user should verify in their QwenCloud console.
- Cost tips:
  - Use Batch calling for 50% off in non-realtime scenarios
  - Enable context cache for repeated contexts
  - Use flash/turbo series for non-critical tasks
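The cost formula and the character-to-token rule of thumb above translate directly into code. This is a sketch with hypothetical function names; the unit price must come from pricing.md or the official pricing page and is passed in by the caller.

```python
from typing import Tuple


def estimate_cost(tokens: int, unit_price_per_million: float) -> float:
    """Cost = Tokens / 1,000,000 x Unit price."""
    return tokens / 1_000_000 * unit_price_per_million


def estimate_chinese_tokens(char_count: int) -> Tuple[int, int]:
    """Return a (low, high) token estimate using 1K Chinese chars = 1,200-1,500 tokens."""
    return char_count * 1200 // 1000, char_count * 1500 // 1000
```

Because the token estimate is a range, any cost derived from it should also be reported as a range and labelled as an estimate.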
Cost Estimation Disclaimer (MANDATORY)
🚨 CRITICAL, NO EXCEPTIONS: NEVER fabricate, invent, or guess any price figure. If you do not have a confirmed price from `references/pricing.md` or the official pricing page, you MUST NOT output any number. Instead, direct the user to the official pricing page. Outputting a made-up price is a critical failure, worse than saying "I don't know."
When responding to any cost-related query — including but not limited to price evaluation, usage estimation, budget
forecasting, or cost comparison — you MUST append a professional disclaimer. This applies regardless of language or
response format.
Required disclaimer (Chinese response):
⚠️ 费用说明:以上费用为基于官方公示单价的预估价格,仅供参考。实际费用受 Token 消耗量、上下文长度阶梯定价、Batch/缓存折扣及计费策略调整等因素影响,请以QwenCloud控制台的实际账单为准。部分模型可能提供限时免费额度,但免费额度的可用性、额度量及有效期随时可能调整,请在控制台确认您的账户是否仍有剩余额度,切勿假设本次调用免费。最新定价详见 模型定价页。
Required disclaimer (English response):
⚠️ Pricing Notice: The cost figures above are estimates calculated from officially published unit prices and are provided for reference only. Actual charges depend on token consumption, tiered context-length pricing, Batch/cache discounts, and billing policy updates. Some models may offer a time-limited free quota, but quota availability, amounts, and validity periods are subject to change — do not assume this call is free. Please verify your remaining quota in the QwenCloud console and refer to the actual bill for definitive costs. See Model Pricing for the latest rates.
Rules:
- The disclaimer must appear at the end of every cost-related response, clearly separated from the main content.
- When the estimate involves assumptions (e.g., average tokens per character, assumed context length tier), explicitly state each assumption used in the calculation.
- Never present estimated costs as exact or guaranteed amounts. Use hedging language such as "approximately", "estimated at", "roughly" (or Chinese equivalents "约", "预估", "约合") throughout the cost breakdown.
- Never tell the user a call will be free or cost $0/¥0. Even if a free quota exists, the user may have already consumed it. Always present the paid price and note that a free quota may apply — subject to the user verifying in their console.
- If pricing data is unavailable or uncertain, say so explicitly and link to the official pricing page. Never fill the gap with a guess.
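The rules above could be enforced with two small helpers, sketched here under stated assumptions: the function names, the `---` separator, and the exact wording of the fallback message are illustrative, and the disclaimer text itself is passed in unchanged from the required wording.

```python
from typing import Optional

NO_PRICE_MESSAGE = "pricing data is unavailable; please check the official pricing page"


def format_cost_line(label: str, amount_usd: Optional[float]) -> str:
    """Format one line of a cost breakdown with hedged wording."""
    if amount_usd is None:
        # No confirmed price: say so instead of guessing a number.
        return f"{label}: {NO_PRICE_MESSAGE}"
    if amount_usd <= 0:
        # A call is never presented as free or costing $0.
        raise ValueError("refusing to present a zero or negative cost")
    return f"{label}: approximately ${amount_usd:.4f} (estimated)"


def with_disclaimer(body: str, disclaimer: str) -> str:
    """Append the disclaimer at the end, clearly separated from the main content."""
    return f"{body.rstrip()}\n\n---\n\n{disclaimer.strip()}"
```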
Available Models
All standard text, vision, image, video, audio, and coding models are available. Some models offer free
quota (verify in console).
- Text: qwen3-max, qwen3.5-plus, qwen3.5-flash, qwen-turbo, qwq-plus, qwen3-coder-next/plus/flash, qwen-plus-character, qwen-plus-character-ja, qwen-flash-character
- Vision: qwen3-vl-plus, qwen3-vl-flash, qvq-max, qwen-vl-ocr, qwen-vl-max, qwen-vl-plus
- Omni: qwen3-omni-flash (+ realtime), qwen-omni-turbo (+ realtime)
- Image generation (text-to-image): wan2.6-t2i, wan2.5-t2i-preview, wan2.2-t2i-flash, z-image-turbo
- Image editing (requires reference images): wan2.6-image, wan2.5-i2i-preview
- Video generation: wan2.6 series (t2v, i2v, i2v-flash, r2v, r2v-flash), wan2.5/2.2 series, vace
- TTS: qwen3-tts-flash, qwen3-tts-instruct-flash, cosyvoice-v3 series
- ASR: qwen3-asr-flash, fun-asr
- Embedding/Rerank: text-embedding-v4, qwen3-rerank
- Translation: qwen-mt-plus/flash/lite/turbo
⚠️ Important: The model list above is a point-in-time snapshot and may be outdated. Model availability changes frequently. Always check the official model list for the authoritative, up-to-date catalog before making model decisions. See model-list.md for a more detailed local reference.
Thinking Mode
Several models support hybrid thinking/non-thinking modes:
| Model | Thinking Default | Notes |
|---|---|---|
| qwen3.5-plus | On | Thinking enabled by default. Use |
| qwen3.5-flash | On | Thinking enabled by default. |
| qwen3-max | Off | Use |
| qwen-plus / qwen-flash / qwen-turbo | Off | Hybrid; enable for deeper reasoning at higher output cost. |
| qwen3-vl-plus / qwen3-vl-flash | Off | Vision + thinking for complex visual analysis. |
| qwen3-omni-flash | Off | Thinking supported; audio output not available in thinking mode. |
| qwq-plus / qvq-max | Always on | Pure reasoning models; CoT always active. |
Guidance: Do not enable thinking by default for simple or conversational tasks — it increases latency and output
token cost. Enable only when the user explicitly asks for deep reasoning or the task requires multi-step analysis.
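The guidance above amounts to a small decision rule, sketched below. The function name and the "on"/"off"/"always" labels are assumptions; how the resulting flag is passed to the API is outside this sketch.

```python
# Thinking defaults copied from the table above.
THINKING_DEFAULTS = {
    "qwen3.5-plus": "on", "qwen3.5-flash": "on",
    "qwen3-max": "off",
    "qwen-plus": "off", "qwen-flash": "off", "qwen-turbo": "off",
    "qwen3-vl-plus": "off", "qwen3-vl-flash": "off",
    "qwen3-omni-flash": "off",
    "qwq-plus": "always", "qvq-max": "always",
}


def should_enable_thinking(model: str, user_asked_for_deep_reasoning: bool,
                           needs_multi_step_analysis: bool) -> bool:
    """Decide whether thinking should be on for this request."""
    default = THINKING_DEFAULTS.get(model)
    if default == "always":
        return True  # pure reasoning models: chain of thought is always active
    if default is None:
        return False  # model not listed as supporting thinking
    # Enable only on explicit request or when the task needs multi-step analysis.
    return user_asked_for_deep_reasoning or needs_multi_step_analysis
```

For models whose default is "on", a `False` result means the caller would need to switch thinking off explicitly for simple or conversational tasks.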
Anti-Patterns
- Only recommend models listed in this skill — never fabricate model names.
- When unsure, use `qwen3.5-plus` as a safe default for text tasks.
- 🚨 NEVER invent or guess any price figure: only use pricing from `references/pricing.md` or the official pricing page. If the data is not available, say so and link to the official page. Fabricating a price is a critical failure.
- Always cite the data source when providing pricing info.
- Default currency is USD. Check the official pricing page for the latest rates. Writing in Chinese does NOT imply CNY.
- All cost estimates must include the mandatory disclaimer — use hedging language ("approximately", "estimated"). Omitting the disclaimer is a critical violation.
- Never assume free quota is available — free quotas may have been consumed, expired, or removed. Never tell the user a call will cost $0/¥0. Always present the paid unit price first; mention free quota only as a possibility the user should verify in their console.
- Do NOT proactively fetch URLs or trigger web searches — only access online sources when the user explicitly asks for "latest" data AND reference files cannot answer.
- Never output API keys in plaintext — see Security section above.
Online Information Sources
Reference files in this skill are point-in-time snapshots. For official documentation URLs,
see sources.md.
Do NOT proactively fetch or search online. Only access online sources when ALL of these conditions are met:
- The user explicitly asks for "latest", "real-time", or "current" data.
- The reference files in this skill cannot answer the question.
- The user confirms they want an online lookup.
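The three conditions above combine with a logical AND. A minimal sketch, with a hypothetical function name:

```python
def may_access_online(user_asked_for_latest: bool,
                      references_can_answer: bool,
                      user_confirmed_lookup: bool) -> bool:
    """Allow an online lookup only when all three conditions hold."""
    return (
        user_asked_for_latest
        and not references_can_answer
        and user_confirmed_lookup
    )
```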
Update Check (MANDATORY Post-Execution)
When the user asks to check for updates (e.g. "check for updates", "check version", "is there a new version", "update
skills"):
- Find qwencloud-update-check: Look for `qwencloud-update-check/SKILL.md` in sibling skill directories.
- If found, run `python3 <qwencloud-update-check-dir>/scripts/check_update.py --print-response` and report the result. Use `--force` if the user asks to force-check.
- If not found, you MUST present the following options and wait for the user's response:

The update-check skill is not installed. It keeps you informed about new qwencloud/qwencloud-ai features and bug fixes.
1. Install now: I'll run the install command for you
2. Skip for now: do nothing this time
3. Never remind me: permanently suppress this prompt

After the user responds:
- User chooses 1 → run `npx skills add QwenCloud/qwencloud-ai --skill qwencloud-update-check -y`, then run `python3 <this-skill-scripts-dir>/gossamer.py --dismiss`
- User chooses 2 → run `python3 <this-skill-scripts-dir>/gossamer.py --dismiss`, then continue normally
- User chooses 3 → run `python3 <this-skill-scripts-dir>/gossamer.py --never-install`, then continue normally
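The choice handling above can be summarised as a mapping from the user's answer to the commands to run. The sketch below only builds the command strings and does not execute anything; the function name `commands_for_choice` is hypothetical, and `scripts_dir` stands in for `<this-skill-scripts-dir>`.

```python
from typing import List

INSTALL_CMD = "npx skills add QwenCloud/qwencloud-ai --skill qwencloud-update-check -y"


def commands_for_choice(choice: int, scripts_dir: str) -> List[str]:
    """Return the commands to run, in order, for the user's choice (1, 2, or 3)."""
    gossamer = f"python3 {scripts_dir}/gossamer.py"
    if choice == 1:
        return [INSTALL_CMD, f"{gossamer} --dismiss"]
    if choice == 2:
        return [f"{gossamer} --dismiss"]
    if choice == 3:
        return [f"{gossamer} --never-install"]
    raise ValueError("choice must be 1, 2, or 3")
```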
References
- pricing.md — Pricing overview: model categories, billing units, and link to official pricing page
- model-list.md — Model catalog (2026-03 snapshot; check official model list for latest)
- sources.md — Official documentation URLs (for manual lookup only)