Evaluator-Optimizer

Iterative refinement workflow that takes existing code, documentation, or designs and polishes them through rigorous cycles of evaluation and improvement until they meet production-grade quality standards.

When to Use This Skill

  • Refining a rough draft of code into production quality
  • Polishing documentation for clarity, completeness, and accuracy
  • Iteratively improving a design or architecture proposal
  • Systematic quality improvement where "good enough" is not sufficient
  • When you need to converge on high quality through structured iteration

Quick Reference

| Task | Load reference |
| --- | --- |
| Evaluation criteria and quality rubrics | skills/evaluator-optimizer/references/evaluation-criteria.md |

Workflow: The Loop

For any given artifact (code, text, design):
  1. Accept: Take the current version of the artifact.
  2. Evaluate: Act as a harsh critic. Rate the artifact on correctness, clarity, efficiency, style, and safety. Assign a score out of 100.
  3. Decide:
    • Score >= 90: Stop and present the result.
    • Score < 90: Refine.
  4. Refine: Rewrite the artifact, specifically addressing the critique from step 2. List what changed and why.
  5. Repeat: Return to step 2 with the new version.
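
The five steps above can be sketched as a small driver loop. This is only an illustration under stated assumptions: `Evaluation`, `run_loop`, `toy_evaluate`, `toy_refine`, and the `max_refinements` safety cap are hypothetical names that do not come from the skill itself, and the toy critic stands in for a real evaluation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

PASS_SCORE = 90  # threshold taken from the Decide step


@dataclass
class Evaluation:
    score: int                                       # 0-100
    issues: List[str] = field(default_factory=list)  # specific flaws found


def run_loop(
    artifact: str,
    evaluate: Callable[[str], Evaluation],
    refine: Callable[[str, Evaluation], str],
    max_refinements: int = 10,  # assumption: safety cap, not part of the workflow text
) -> Tuple[str, List[Evaluation]]:
    """Evaluate, then refine and re-evaluate until the score reaches PASS_SCORE."""
    evaluation = evaluate(artifact)                  # step 2: Evaluate
    history = [evaluation]
    while evaluation.score < PASS_SCORE and len(history) <= max_refinements:
        artifact = refine(artifact, evaluation)      # step 4: Refine
        evaluation = evaluate(artifact)              # step 5: back to step 2
        history.append(evaluation)
    return artifact, history                         # step 3: stop and present


# Toy stand-ins for a real critic and a real rewriter (hypothetical).
def toy_evaluate(text: str) -> Evaluation:
    if "TODO" in text:
        return Evaluation(60, ["unfinished TODO marker"])
    return Evaluation(95)


def toy_refine(text: str, evaluation: Evaluation) -> str:
    return text.replace("TODO", "done")


final, history = run_loop("TODO: parse input", toy_evaluate, toy_refine)
print(final, [e.score for e in history])  # done: parse input [60, 95]
```

Passing `evaluate` and `refine` in as callables keeps the loop itself independent of how the critique and the rewrite are actually produced.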

Behavioral Rules

  • Do not settle: "Good enough" is not good enough. You are here to polish.
  • Be explicit: When evaluating, list specific flaws. "The function process_data is O(n^2) but could be O(n)."
  • Show your work: Summarize changes in each iteration.
  • Self-correct: If a refinement breaks something, revert and try a different approach.
  • Converge: Each iteration must improve the score. If two consecutive iterations do not improve the score, stop and present the best version.
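
The Converge and Self-correct rules can be sketched as a check over the score history. This is one possible reading, assuming that "improve" means beating the best score seen so far; the helper names `non_improving_streak`, `should_stop`, and `best_version` are hypothetical.

```python
from typing import List


def non_improving_streak(scores: List[int]) -> int:
    """Count the trailing iterations that failed to beat the best earlier score."""
    if not scores:
        return 0
    best = scores[0]
    streak = 0
    for score in scores[1:]:
        if score > best:
            best = score
            streak = 0
        else:
            streak += 1
    return streak


def should_stop(scores: List[int], patience: int = 2) -> bool:
    """Stop once `patience` consecutive iterations brought no improvement."""
    return non_improving_streak(scores) >= patience


def best_version(versions: List[str], scores: List[int]) -> str:
    """Return the earliest version with the highest score, i.e. the one to present."""
    return versions[scores.index(max(scores))]


scores = [60, 72, 72, 70]
print(should_stop(scores))                               # True
print(best_version(["v1", "v2", "v3", "v4"], scores))    # v2
```

Keeping every version alongside its score also makes the revert in the Self-correct rule trivial: a refinement that lowers the score is simply not the one selected.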

Iteration Output Template

Iteration [N] Evaluation

| Criterion | Score (1-10) | Notes |
| --- | --- | --- |
| Correctness | | |
| Clarity | | |
| Efficiency | | |
| Style | | |
| Safety | | |
| Total | /50 | [x100/50] |

Issues Found

  1. [Specific issue with location]
  2. [Specific issue with location]

Refinements Applied

  • [Change 1 and rationale]
  • [Change 2 and rationale]
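
The template scores five criteria from 1 to 10, totals them out of 50, and scales the total by 100/50 so it can be compared with the 90-point threshold. A minimal sketch of that arithmetic follows; the `CRITERIA` tuple and the `total_score` function are hypothetical names, and the sample scores are made up for illustration.

```python
from typing import Dict

CRITERIA = ("correctness", "clarity", "efficiency", "style", "safety")


def total_score(scores: Dict[str, int]) -> int:
    """Sum the five 1-10 criterion scores (max 50) and scale to a 0-100 total."""
    for criterion in CRITERIA:
        if criterion not in scores:
            raise ValueError(f"missing criterion: {criterion}")
        if not 1 <= scores[criterion] <= 10:
            raise ValueError(f"{criterion} must be between 1 and 10")
    raw = sum(scores[c] for c in CRITERIA)  # out of 50
    return raw * 100 // 50                  # the "x100/50" step in the template


example = {"correctness": 9, "clarity": 8, "efficiency": 9, "style": 9, "safety": 10}
print(total_score(example))  # 90
```

With this scaling, an artifact needs a raw total of at least 45 out of 50 to reach the 90-point stopping threshold.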

Example Interaction

Input: "Refine this Python script."
Iteration 1 Evaluation:
  • Functionality: Good
  • Efficiency: Poor - uses nested loops for matching
  • Style: Variable names a and b are unclear
  • Score: 60/100
Refinements applied:
  • Flattened loops using a set lookup (O(n))
  • Renamed a to users, b to active_ids
  • Added type hints
Iteration 2 Evaluation:
  • Functionality: Good
  • Efficiency: Excellent
  • Style: Good
  • Score: 95/100
Result: Present the refined script.
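
The script in this example is not shown, so the following is only a guess at what the refinement might look like, assuming `users` is a list of dicts with an `"id"` key and `active_ids` is a list of integers. The function names `active_users_slow` and `active_users` are hypothetical.

```python
from typing import Dict, List


# Before: nested loops compare every element of a with every element of b.
def active_users_slow(a, b):
    result = []
    for x in a:
        for y in b:
            if x["id"] == y:
                result.append(x)
                break
    return result


# After: one pass over users with constant-time set membership tests,
# clearer names, and type hints.
def active_users(users: List[Dict], active_ids: List[int]) -> List[Dict]:
    active = set(active_ids)
    return [user for user in users if user["id"] in active]


users = [{"id": 1}, {"id": 2}, {"id": 3}]
active_ids = [3, 1, 3]
print(active_users(users, active_ids))  # [{'id': 1}, {'id': 3}]
```

Both versions return the same result, which is the kind of check worth running before a refinement is accepted.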