improve

Improve


Iteratively improve any output until all criteria are met — with scored evaluation at every step so "better" is never subjective.


Mode Selection


Choose the mode based on what the user is asking for. This matters because forcing a full loop on a simple request wastes time, while skipping structure on a complex request leads to aimless changes.


Quick Mode


Use when the user has a single, clear goal — e.g., "make this function faster", "shorten this paragraph", "fix the formatting". No need to establish multiple criteria or run a full scoring loop.

1. Acknowledge the goal
2. Make the improvement
3. Show before/after comparison
4. Ask: "Does this hit the mark, or should I keep going?"

If the user wants more iterations, escalate to Full Mode with proper criteria.


Full Mode


Use when the task involves multiple dimensions of quality, competing tradeoffs, or the user explicitly asks for iterative refinement. This is the default for complex or ambiguous requests.

1. Receive input
2. Establish 1-3 ranked, measurable criteria
3. Score baseline (Iteration 0), then iteratively improve, with one focused goal per iteration
4. Each iteration: improve → score → user checkpoint → decide (continue/stop)
5. Deliver final output with full iteration history
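
The choice between Quick Mode and Full Mode can be sketched as a small decision function. This is only an illustration: the `Request` fields and the `choose_mode` name are hypothetical and are not defined anywhere in this skill.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical summary of a user request (not part of the skill)."""
    goal_count: int                # number of distinct goals in the request
    has_tradeoffs: bool = False    # quality dimensions that compete
    wants_iteration: bool = False  # user explicitly asked for iterative refinement
    ambiguous: bool = False        # request is complex or unclear


def choose_mode(req: Request) -> str:
    """Return "quick" for a single clear goal, otherwise "full"."""
    if req.wants_iteration or req.has_tradeoffs or req.ambiguous:
        return "full"
    return "quick" if req.goal_count == 1 else "full"


print(choose_mode(Request(goal_count=1)))                      # quick
print(choose_mode(Request(goal_count=2, has_tradeoffs=True)))  # full
```

Full Mode is the fallback whenever any signal of complexity is present, which matches its role as the default for complex or ambiguous requests.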


Domain Hints


Different output types have different improvement patterns. Use these as starting points — not rigid rules — when thinking about what "better" means in context.

Code — readability, performance, correctness, test coverage, idiomatic style
Prose — clarity, conciseness, tone, structure, audience fit
Data/Config — correctness, completeness, consistency, schema compliance
Design/Visual — alignment, hierarchy, accessibility, responsiveness

When establishing criteria, suggest domain-relevant dimensions the user might not have thought of.


Asking Questions


Use `ask_user` (with `choices` when the answer space is predictable) if available. If `ask_user` is not available, ask questions in plain text; the important thing is to ask, not how you ask.


Core Rules


Never improve without at least one criterion. Even in Quick Mode, the user's request implies a criterion — name it explicitly.
Always score before AND after (Full Mode). Every iteration has a scorecard.
One focused goal per iteration. Closely related changes may go together, but they must serve a single improvement goal. Multiple unrelated changes make it impossible to know what helped.
Target the largest gap first. Never polish a met criterion while others are unmet.
Detect regressions. If a previously-met criterion drops, revert or fix before continuing.
Stop when appropriate. All criteria met, 3 stalled iterations, 10 total iterations (soft max), or user says stop.
User checkpoints matter. After scoring each iteration, briefly share the scorecard and ask if the user agrees with the assessment. Self-scoring has blind spots — the user's perspective is the ground truth.
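
Two of these rules, targeting the largest gap and detecting regressions, lend themselves to a short sketch. The scoring scale and priority rules live in references/EVALUATION-GUIDE.md, which is not reproduced here, so the 0-10 scale, the `target` field, and the tie-break by rank below are assumptions made for illustration; all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CriterionScore:
    name: str
    rank: int    # 1 = highest priority
    score: int   # assumed 0-10 scale
    target: int  # assumed score at which the criterion counts as met

    @property
    def met(self) -> bool:
        return self.score >= self.target

    @property
    def gap(self) -> int:
        return max(0, self.target - self.score)


def next_target(scorecard):
    """Pick the unmet criterion with the largest gap; ties go to the higher rank."""
    unmet = [c for c in scorecard if not c.met]
    if not unmet:
        return None
    return min(unmet, key=lambda c: (-c.gap, c.rank))


def regressions(previous, current):
    """Names of criteria that were met before and now score lower."""
    before = {c.name: c for c in previous}
    return [
        c.name
        for c in current
        if c.name in before and before[c.name].met and c.score < before[c.name].score
    ]


baseline = [CriterionScore("clarity", 1, 4, 8), CriterionScore("conciseness", 2, 8, 8)]
iteration_1 = [CriterionScore("clarity", 1, 7, 8), CriterionScore("conciseness", 2, 6, 8)]

print(next_target(baseline).name)          # clarity
print(regressions(baseline, iteration_1))  # ['conciseness']
```

In this example the clarity rewrite cost two points of conciseness, so the regression would be reverted or fixed before the next iteration.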


Workflow (Full Mode)


For the detailed scoring rubric, scorecard formats, priority rules, and completion conditions, see references/EVALUATION-GUIDE.md.

Receive — Accept the input (existing work or raw request)
Establish Criteria — Extract or ask for measurable evaluation criteria (1-3, ranked)
Capture Baseline — Produce first output, score it (Iteration 0)
Improve — Make one focused change targeting the largest gap
Evaluate & Checkpoint — Score the new output, share scorecard with user, confirm alignment
Decide — Continue, stop, or adjust criteria
Deliver — Present final output with full iteration history


Step 1 — Receive


Accept the input. It can be:

Existing work — User provides something they want improved. Ask for criteria (Step 2).
Raw request — User describes what they want. Produce the first version, then ask for criteria.

Acknowledge what was received. Do NOT start improving yet.


Step 2 — Establish Criteria


Ask for or confirm ranked, specific, observable criteria. A single criterion is fine for focused tasks — don't force the user to invent extra criteria when one is enough. If more than 3, force-rank and pick the top 3.

Suggest domain-relevant criteria the user may not have mentioned (see Domain Hints above), but let them decide.
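
As a rough sketch of the cap described above, the helper below keeps the top three criteria from a list the user has already ranked and reports what was set aside. `select_criteria` and `MAX_CRITERIA` are hypothetical names, and the sketch assumes the ranking itself came from the user.

```python
MAX_CRITERIA = 3


def select_criteria(ranked):
    """Keep the top MAX_CRITERIA entries from a list ordered by priority.

    Returns (kept, dropped). Raises ValueError when no criterion is given,
    since improving without at least one criterion is not allowed.
    """
    cleaned = [c.strip() for c in ranked if c.strip()]
    if not cleaned:
        raise ValueError("at least one criterion is required")
    return cleaned[:MAX_CRITERIA], cleaned[MAX_CRITERIA:]


kept, dropped = select_criteria(["clarity", "conciseness", "tone", "formatting"])
print(kept)     # ['clarity', 'conciseness', 'tone']
print(dropped)  # ['formatting']
```

Returning the dropped criteria makes it easy to tell the user which ones were left out so they can re-rank if they disagree.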


Step 3 — Capture Baseline


Produce the first output (or use existing work as-is), score it against all criteria, and present to the user with key gaps identified.


Step 4 — Improve


Make one focused change targeting the largest gap. Explain what you're changing and why before making the change.


Step 5 — Evaluate & Checkpoint


Score the new output against ALL criteria, compare to previous iteration, and check for regressions. Then share the scorecard with the user — they may see things you missed, or disagree with your scoring. Adjust if needed before continuing.


Step 6 — Decide


Continue if gaps remain. Stop when all criteria are met, after 3 stalled iterations, after 10 total iterations (soft max), or when the user says stop.
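
The decision can be sketched as a single function over the history of total scores. Two points are assumptions, because the completion conditions are defined in references/EVALUATION-GUIDE.md rather than here: a "stalled" iteration is taken to mean one whose total score did not rise above the previous one, and the three stalled iterations are taken to be consecutive. The `decide` name and its parameters are hypothetical.

```python
MAX_STALLED = 3           # stalled iterations before stopping
SOFT_MAX_ITERATIONS = 10  # soft cap on total iterations


def decide(totals, all_met, user_said_stop=False):
    """Return (action, reason). totals[0] is the baseline (Iteration 0)."""
    if user_said_stop:
        return "stop", "user said stop"
    if all_met:
        return "stop", "all criteria met"
    iterations = len(totals) - 1
    if iterations >= SOFT_MAX_ITERATIONS:
        return "stop", f"{SOFT_MAX_ITERATIONS} total iterations"
    # Count trailing iterations whose total did not beat the one before it.
    stalled = 0
    for prev, cur in zip(reversed(totals[:-1]), reversed(totals[1:])):
        if cur > prev:
            break
        stalled += 1
    if stalled >= MAX_STALLED:
        return "stop", f"{MAX_STALLED} stalled iterations"
    return "continue", "gaps remain"


print(decide([10, 12, 14], all_met=False))          # ('continue', 'gaps remain')
print(decide([10, 12, 12, 12, 12], all_met=False))  # ('stop', '3 stalled iterations')
```

The user's stop request is checked first so that it always wins over the automatic conditions.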


Step 7 — Deliver


Present the final output with a summary:


Final Output

[The improved result]

Iteration History

Iteration	Change	Total Score	Delta
0 (Baseline)	—	X / Y	—
1	[Change description]	X / Y	+W
2	[Change description]	X / Y	+W

Final Score: X / Y
Criteria met: [List]
Criteria not met (if any): [List with reason]
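
The Delta column in the iteration history is the change in total score from the previous row. A minimal sketch of how the rows could be built follows; `history_rows` is a hypothetical helper, and it uses a plain hyphen as the empty-cell placeholder.

```python
def history_rows(entries, max_total):
    """Build (iteration, change, total, delta) rows.

    entries is a list of (change_description, total_score) pairs;
    entries[0] is the baseline, whose description is ignored.
    """
    rows = []
    previous = None
    for i, (change, total) in enumerate(entries):
        label = "0 (Baseline)" if i == 0 else str(i)
        delta = "-" if previous is None else f"{total - previous:+d}"
        rows.append((label, "-" if i == 0 else change, f"{total} / {max_total}", delta))
        previous = total
    return rows


entries = [("", 9), ("Restructure for clarity", 14), ("Trim redundant content", 17)]
for row in history_rows(entries, max_total=20):
    print(" | ".join(row))
```

A negative delta in any row is a quick visual cue that an iteration regressed and should have been reverted or fixed.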


Examples


Quick Mode Example


User: "Make this function faster"

1. Goal: improve execution speed
2. Read the function, identify bottleneck
3. Apply optimization
4. Show before/after with explanation
5. "Does this hit the mark, or should I keep going?"


Full Mode Example


User: "Improve this README — it's confusing and too long"

1. Receive: README.md contents
2. Criteria (confirmed with user):
   - #1 Clarity: a new team member understands the project in under 2 minutes
   - #2 Conciseness: under 200 lines, no redundant sections
3. Baseline: score both criteria, identify key gaps
4. Iteration 1: restructure for clarity (target biggest gap)
5. Checkpoint: share scorecard, ask user if they agree
6. Iteration 2: trim redundant content (next gap)
7. Deliver: final README + iteration history


Handling Edge Cases


User changes criteria mid-loop: Accept the change. Re-score the current output against the new criteria and continue from Step 4.

User provides feedback instead of letting the loop run: Treat feedback as a new criterion or a signal to re-prioritize. Adjust and continue.

Output is already perfect (all criteria met at baseline): Declare success immediately. Suggest the user raise the bar if they want further improvement.

Criteria conflict with each other: Flag the conflict: "Criterion A and B seem to pull in opposite directions. Which should win when they conflict?"

User just wants one quick fix: Use Quick Mode — don't force the full loop.


What This Skill Does NOT Do


Does not discover goals or explore intent; it requires clear input and criteria (use `/brainstorm` for discovery)
Does not guess what "better" means — requires explicit criteria
Does not make unlimited iterations — has clear stopping conditions
Does not trade met criteria for unmet ones without user approval
