
Down-Skilling: Opus → Haiku Distillation


Translate your reasoning capabilities into explicit, structured instructions that Haiku 4.5 can execute reliably. You are a compiler: your input is context, intent, and domain knowledge; your output is a Haiku-ready prompt with decision procedures and diverse examples.

Core Principle


Opus infers from WHY. Haiku executes from WHAT and HOW.
Your job: convert implicit reasoning, contextual judgment, and domain expertise into explicit procedures, concrete decision trees, and demonstrative examples. Every inference you would make silently, Haiku needs stated explicitly.

Economics: Why Examples Are Free


Opus input costs ~6× Haiku input. A task that costs $1.00 on Opus costs ~$0.17 on Haiku — but only if Haiku gets it right on the first try. One retry wipes the savings; two retries make Haiku more expensive than Opus.
The math that matters:
  • Input tokens are cheap (Haiku: $0.80/MTok input vs $4.00/MTok output)
  • Adding 2,000 tokens of examples costs ~$0.0016 per call
  • A single failed-then-retried call costs ~$0.008+ in wasted output
  • Examples pay for themselves if they prevent even 1-in-5 retries
What this means for prompt design:
  • If you're sending an 8K token document, you can afford 3-4K tokens of examples — the examples cost less than the document itself
  • Lengthy input prompts don't inflate output costs — output pricing is independent of input length
  • The constraint is not token cost but diminishing returns: after 5-7 examples, additional examples rarely improve performance
Bottom line: Every example that prevents a Haiku misfire saves 5-25× its input cost in wasted output tokens. Under-investing in examples is the most expensive mistake in down-skilling.
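The break-even arithmetic above can be sketched in a few lines. The prices and token counts are the illustrative figures from this section, not live pricing:

```python
# Break-even sketch for adding few-shot examples to a Haiku prompt.
# Figures are the illustrative numbers used above, not live pricing.
HAIKU_INPUT_PER_TOK = 0.80 / 1_000_000   # $/input token
HAIKU_OUTPUT_PER_TOK = 4.00 / 1_000_000  # $/output token

def example_overhead(example_tokens: int) -> float:
    """Extra input cost per call from shipping few-shot examples."""
    return example_tokens * HAIKU_INPUT_PER_TOK

def retry_waste(wasted_output_tokens: int) -> float:
    """Output cost burned by one failed-then-retried call."""
    return wasted_output_tokens * HAIKU_OUTPUT_PER_TOK

overhead = example_overhead(2_000)   # ~$0.0016 per call
waste = retry_waste(2_000)           # ~$0.008 per failure
break_even = overhead / waste        # 0.2 → examples pay off if they
                                     # prevent 1-in-5 retries
```

At these figures, 2,000 tokens of examples recoup their cost if they prevent one retry in five.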

Activation


When triggered, perform these steps:
  1. Extract task context from the conversation: what is the user trying to accomplish? What domain knowledge applies? What quality criteria matter?
  2. Identify the reasoning gaps — what would Opus infer automatically that Haiku needs spelled out? Common gaps:
    • Ambiguity resolution (Opus picks the sensible interpretation; Haiku needs a decision rule)
    • Quality judgment (Opus knows "good enough"; Haiku needs explicit criteria)
    • Edge case handling (Opus reasons through novel situations; Haiku needs enumerated cases)
    • Output calibration (Opus matches tone/length intuitively; Haiku needs explicit constraints)
  3. Generate the distilled prompt following the structure in Prompt Architecture
  4. Generate 4-7 diverse examples following the principles in Example Design — this is the highest-leverage step
  5. Deliver the complete Haiku-ready prompt as a copyable artifact or file, including system prompt and user prompt components as appropriate

Prompt Architecture


Structure every distilled prompt with these components in this order. Haiku responds best to this specific sequencing:
<role>
[Single sentence: who Haiku is and what it does]
</role>

<task>
[2-3 sentences: the specific task, its purpose, and the deliverable]
</task>

<rules>
[Numbered list of explicit constraints. Be precise about:]
- Output format (JSON schema, markdown structure, etc.)
- Length bounds (word/token counts, not vague "brief"/"detailed")
- Required elements (must-include fields or sections)
- Prohibited behaviors (specific failure modes to avoid)
- Decision rules for ambiguous cases
</rules>

<process>
[Numbered steps. Maximum 7 steps. Each step is one action.]
[Include validation checkpoints: "Before proceeding, verify X"]
[Include decision points: "If X, do Y. If Z, do W."]
</process>

<examples>
[4-7 diverse examples showing input → output pairs]
[This section should be the LARGEST part of the prompt]
[See Example Design section for distribution requirements]
</examples>

<context>
[Task-specific data, reference material, or domain knowledge]
[Use labels: [Context], [Policy], [Reference]]
</context>
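The ordering can be kept mechanical by assembling the sections programmatically. A minimal sketch, with placeholder section bodies:

```python
# Assemble a Haiku-ready prompt in the fixed section order above.
# The section bodies below are placeholders, not a real prompt.
SECTION_ORDER = ["role", "task", "rules", "process", "examples", "context"]

def assemble_prompt(sections: dict[str, str]) -> str:
    missing = [name for name in SECTION_ORDER if name not in sections]
    if missing:
        raise ValueError(f"missing sections: {missing}")
    return "\n\n".join(
        f"<{name}>\n{sections[name].strip()}\n</{name}>" for name in SECTION_ORDER
    )

prompt = assemble_prompt({
    "role": "You are a support-ticket classifier.",
    "task": "Label each ticket with exactly one category and output JSON.",
    "rules": "1. Output JSON only.\n2. Use only the listed categories.",
    "process": "1. Read the ticket.\n2. Apply rules 1-2.\n3. Output the label.",
    "examples": "<example>...</example>",
    "context": "[Policy] Categories: billing, bug, feature-request.",
})
```

Raising on missing sections keeps an incomplete distillation from silently shipping without, say, its examples block.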

Haiku Optimization Rules


Apply these when generating any Haiku-targeted prompt:

Structure & Syntax


  • Use XML tags to delimit every section — Haiku respects labeled boundaries
  • Keep sentences under 25 words where possible
  • One instruction per sentence; split compound instructions
  • Use numbered steps, not prose paragraphs, for procedures
  • Specify token/word budgets explicitly: "respond in 80-120 words"

Reasoning Support


  • Replace open-ended judgment with decision rubrics: BAD: "Assess whether the code is production-ready" GOOD: "Check: (a) no TODO comments, (b) all functions have error handling, (c) no hardcoded secrets. Score pass/fail per item."
  • Bound reasoning depth: "Think in 3-5 steps, then give your answer"
  • Provide a fallback for uncertainty: "If you cannot determine X, respond with: 'UNCERTAIN: [brief reason]'"
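The GOOD rubric above translates directly into code. A hypothetical checker, with deliberately crude string tests standing in for real static analysis:

```python
# The GOOD rubric as an explicit pass/fail checker. The string checks
# are crude, hypothetical stand-ins for real static analysis.
def production_readiness_rubric(code: str) -> dict[str, bool]:
    return {
        "no_todo_comments": "TODO" not in code,
        "has_error_handling": ("try" in code) or ("except" in code),
        "no_hardcoded_secrets": "api_key=" not in code.lower(),
    }

report = production_readiness_rubric("def f():\n    # TODO: handle errors\n    pass")
# no_todo_comments and has_error_handling fail here; no_hardcoded_secrets passes
```

Each criterion is a checkable predicate scored pass/fail, which is exactly the shape Haiku executes reliably.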

Context Management


  • Front-load critical instructions (Haiku attends strongly to position)
  • Budget rule of thumb: instructions + rules ≤ 800 tokens, examples get the rest. For a task processing an 8K document, 3-4K tokens of examples is well within budget and pays for itself in reliability
  • Pass only the 1-3 most relevant context snippets, not full documents
  • Use explicit delimiters between context and instructions

Output Control


  • Require structured output (JSON, labeled sections) for extractable results
  • Provide an output template Haiku can fill in
  • Specify what comes first in the response: "Begin your response with..."
  • For classification tasks, enumerate all valid categories
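A fill-in template pairs naturally with a caller-side completeness check. A minimal sketch, with illustrative field names:

```python
# An output template Haiku can fill in, plus a completeness check.
# Field names are illustrative.
import re

FIELDS = ["SUMMARY", "CATEGORY", "CONFIDENCE"]
TEMPLATE = "\n".join(f"{field}: <{field.lower()}>" for field in FIELDS)

def response_complete(response: str) -> bool:
    """True if every templated field appears with a non-empty value."""
    return all(
        re.search(rf"^{field}: \S", response, re.MULTILINE) for field in FIELDS
    )

response_complete("SUMMARY: ok\nCATEGORY: bug\nCONFIDENCE: high")  # → True
response_complete("SUMMARY: ok")                                   # → False
```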

Failure Prevention


  • Anticipate Haiku's common failure modes and add guardrails:
    • Hallucination: "Use ONLY information from the provided context. If the answer is not in the context, say 'Not found in sources.'"
    • Verbosity: "Maximum 150 words. Do not add preamble or caveats."
    • Format drift: Include the output schema in both rules and examples
    • Instruction skipping: Number all constraints; reference them in the process steps: "Apply rules 2-4 from <rules>"
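Several of these guardrails can also be enforced by the caller after the response arrives. A minimal sketch, with an illustrative category set and word bound:

```python
# Caller-side guardrail checks mirroring the failure modes above.
# VALID_CATEGORIES and the 150-word bound are illustrative.
VALID_CATEGORIES = {"billing", "bug", "feature-request"}

def check_response(text: str, category: str) -> list[str]:
    problems = []
    if len(text.split()) > 150:
        problems.append("verbosity: over 150 words")
    if category not in VALID_CATEGORIES:
        problems.append(f"format drift: unknown category {category!r}")
    return problems
```

An empty list means the response passed; anything else can trigger a retry with the problem fed back.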

Example Design


Examples are the single highest-leverage investment in a Haiku prompt. Rules tell Haiku what to do; examples show it what "done right" looks like. When rules and examples conflict, Haiku follows the examples. When rules are ambiguous, Haiku extrapolates from examples. This makes examples the primary steering mechanism — not a supplement to rules, but the dominant signal.
Given the economics (see Economics), you should invest heavily here. A prompt with 800 tokens of rules and 3,000 tokens of examples will outperform one with 2,000 tokens of rules and 500 tokens of examples almost every time.

Minimum Example Count: 4


Generate 4-7 diverse examples per distilled prompt. Fewer than 4 is under-investing. The marginal cost of each example is negligible compared to the reliability improvement. Use this distribution:
  1. Typical case — the most common, straightforward input. Establishes the baseline pattern.
  2. Second typical variant — a different but common input that varies length, domain, or structure from #1. Prevents Haiku from over-fitting to a single pattern.
  3. Edge case — unusual but valid input: empty fields, very long text, special characters, boundary conditions, ambiguous phrasing.
  4. Negative/rejection case — input that should be rejected, handled differently, or produce an empty/default output. Shows Haiku what NOT to do.
  5+. Tricky/boundary cases — inputs near decision boundaries where Haiku is most likely to fail. The cases you'd use for a test suite.
Why the second typical case matters: With only one typical example, Haiku may latch onto incidental features of that example (its length, word choice, domain). A second typical case from a different angle shows Haiku which features are task-relevant and which are coincidental.
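The distribution above is easy to verify mechanically when assembling an example set. A sketch, with hypothetical role labels:

```python
# Validate that a few-shot example set matches the distribution above:
# 4-7 examples, with >=2 typical, >=1 edge, >=1 negative case.
from collections import Counter

def check_distribution(roles: list[str]) -> list[str]:
    counts = Counter(roles)
    problems = []
    if not 4 <= len(roles) <= 7:
        problems.append("need 4-7 examples")
    if counts["typical"] < 2:
        problems.append("need 2 typical cases from different angles")
    if counts["edge"] < 1:
        problems.append("need at least one edge case")
    if counts["negative"] < 1:
        problems.append("need at least one negative/rejection case")
    return problems

check_distribution(["typical", "typical", "edge", "negative", "boundary"])  # → []
```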

Example Format


<example>
<input>
[Realistic input data — use real-world length and complexity]
</input>
<output>
[Exact format Haiku should produce — not a description, the actual output]
</output>
<reasoning>
[1-2 sentences: WHY this output is correct. Which rules applied.
 What Haiku might have gotten wrong without this example.]
</reasoning>
</example>
The <reasoning> tag is not optional for complex tasks. It acts as a chain-of-thought anchor, showing Haiku the reasoning pattern to follow. For classification and extraction tasks, reasoning should reference the specific rule numbers that drive the decision.

Example Sizing Guidance


Recommended example budget by the size of inputs the task processes:
  • Short inputs (<500 tokens): 1,500-2,500 tokens of examples (4-5 examples)
  • Medium inputs (500-4K tokens): 2,500-4,000 tokens of examples (4-6 examples)
  • Long inputs (4K-8K tokens): 3,000-5,000 tokens of examples (5-7 examples)
These budgets assume Haiku's 200K context window. The constraint is diminishing returns, not cost — after 7 examples the marginal benefit drops sharply unless the task has a very large classification space.
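The table reduces to a small lookup. A sketch using the budget ranges as written; behavior above 8K tokens is not specified in this section and falls through to the long-input bucket here:

```python
# Look up the recommended example budget for a task's input size.
# Inputs above 8K tokens fall through to the long-input bucket (an
# assumption; the table above stops at 8K).
def example_budget(input_tokens: int) -> tuple[tuple[int, int], str]:
    if input_tokens < 500:
        return (1_500, 2_500), "4-5 examples"
    if input_tokens <= 4_000:
        return (2_500, 4_000), "4-6 examples"
    return (3_000, 5_000), "5-7 examples"

budget, count = example_budget(3_000)   # → (2500, 4000), "4-6 examples"
```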

Example Quality Criteria


  • Examples must be realistic, not toy data — match the complexity and messiness of real inputs
  • Output format must be identical across all examples — Haiku treats format inconsistency as a signal that format doesn't matter
  • Include the hardest case you expect Haiku to handle
  • Vary input characteristics: length, complexity, domain, tone
  • Never include an example that contradicts your rules
  • Order examples from simplest to most complex — this progressive difficulty helps Haiku build up its understanding

Delivery Format


Present the distilled prompt in a code block or artifact with clear section markers. Include:
  1. System prompt (if applicable): role + persistent rules
  2. User prompt template: with {{placeholders}} for variable content
  3. Examples: embedded in the prompt or as a separate few-shot section
  4. Usage notes: any caveats about when this prompt may fail and what to watch for
When generating for API use, include the model parameter and recommended settings:
model: claude-haiku-4-5-20251001
max_tokens: [appropriate for task]
temperature: 0 (for deterministic tasks) or 0.3 (for creative tasks)
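These settings map onto a Messages API request payload. A sketch with placeholder prompt strings:

```python
# Sketch of a Messages API request payload using the settings above.
# The system/user strings are placeholders, not a real prompt.
request = {
    "model": "claude-haiku-4-5-20251001",
    "max_tokens": 512,   # size to the task's expected output
    "temperature": 0,    # deterministic task; use 0.3 for creative tasks
    "system": "<role>...</role>\n<rules>...</rules>",
    "messages": [
        {
            "role": "user",
            "content": "<examples>...</examples>\n<context>...</context>\n{{input}}",
        }
    ],
}
```

Persistent material (role, rules) goes in the system prompt; per-call material fills the user message via the {{placeholders}}.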

Agentic Resource Selection


The skill includes two directories of granular reference files. Do NOT read them all. Scan the index below, then read only the files relevant to the current task.

gaps/


Each file documents one reasoning pattern where Haiku diverges from Opus, with a tested mitigation strategy. Read 2-4 per task.
File — use when the task involves:
  • ambiguity-resolution.md — Input that has multiple valid interpretations; vague user requests
  • code-generation.md — Generating code, scripts, or queries; style matching to existing code
  • comparative-analysis.md — Comparing options, pros/cons, tradeoff analysis
  • conditional-logic.md — Decision trees, branching rules, nested if/then logic
  • context-utilization.md — Long context windows, documents >2K tokens, position-sensitive info
  • counting-enumeration.md — "Generate exactly N items", counting occurrences, list lengths
  • creative-generation.md — Writing, tone adaptation, persona consistency, style matching
  • implicit-constraints.md — Tasks where tone, audience, or format norms are assumed, not stated
  • instruction-density.md — Tasks requiring 8+ simultaneous constraints; complex rule sets
  • multi-hop-reasoning.md — 3+ step inference chains; cause-effect-consequence analysis
  • multi-turn-consistency.md — Chatbot behavior, stateful conversations, persona maintenance
  • negation-handling.md — Constraints phrased as "don't", "never", "avoid"; prohibitions
  • nuanced-classification.md — Borderline cases, multi-label classification, overlapping categories
  • output-calibration.md — Length control, format precision, verbosity management (ALWAYS read)
  • parallel-consistency.md — Generating multiple similar items; lists where format must be uniform
  • partial-information.md — Missing fields, incomplete input, optional data, error states
  • schema-adherence.md — Structured output (JSON, tables) that must survive edge-case inputs
  • self-correction.md — Tasks needing verification; quality checks before output
  • summarization-fidelity.md — Summarizing documents without distortion, position bias, or fabrication
  • tool-use-planning.md — Multi-tool workflows, API orchestration, dependency ordering

examples/


Complete before/after distillations. Each shows an Opus-level task → Haiku-optimized prompt with annotated examples. Read 1-2 closest to the current task domain.
File — use when distilling:
  • api-orchestration.md — Multi-step tool/API workflows with dependencies and branching
  • code-review-triage.md — Analysis tasks with severity classification and structured JSON output
  • content-moderation.md — Safety-critical classification with "when uncertain" defaults
  • creative-rewriting.md — Tone adaptation, audience-aware rewriting, style transfer
  • data-extraction.md — Schema-bound extraction from unstructured text to JSON
  • document-qa.md — RAG / retrieval-grounded QA with citation and "not found" handling
  • email-summarization.md — Information extraction from conversations/threads into sections
  • meeting-notes.md — Transcript processing into decisions, actions, and next steps
  • resume-screening.md — Multi-criteria evaluation with parallel scoring structure
  • sql-generation.md — Natural language to code with schema constraints and error handling
  • step-by-step-analysis.md — Multi-step analytical reasoning with explicit decision rubrics
  • text-classification.md — Multi-label classification with confidence and ambiguity handling

Self-Check

Before delivering, verify the distilled prompt against these criteria:
  • Every Opus inference is made explicit
  • All constraints are numbered and cross-referenced
  • 4+ diverse examples with consistent output format
  • Examples include: 2 typical, 1+ edge case, 1+ negative/rejection case
  • Example tokens ≥ 2× rule tokens (examples should be the bulk of the prompt)
  • No instruction assumes Haiku will "figure it out"
  • Decision points have explicit branches, not open-ended judgment
  • Output format is demonstrated, not just described
  • <reasoning> tags explain WHY each example output is correct
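The example-tokens ≥ 2× rule-tokens check can be approximated mechanically. A rough sketch using whitespace token counts as a stand-in for real tokenization:

```python
# Rough automation of the token-ratio self-check: examples should carry
# at least 2x the tokens of the rules. Whitespace splitting is a crude
# stand-in for a real tokenizer.
def example_to_rule_ratio(rules: str, examples: str) -> float:
    rule_tokens = max(len(rules.split()), 1)
    return len(examples.split()) / rule_tokens

ratio = example_to_rule_ratio("rule " * 100, "example " * 300)  # → 3.0
```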