output-critic

Output Critic Protocol

Evaluate the output by type, score each criterion, make an accept/reject decision, and suggest concrete improvements. Goal: prevent weak output from reaching the next step.

Workflow

1. Detect output type
2. Apply type-specific criteria
3. Score each criterion
4. Calculate overall score
5. Make accept / conditional / reject decision
6. Suggest improvements

Acceptance Threshold

| Overall Score | Decision | Action |
|---|---|---|
| 8-10 | ACCEPT | Proceed |
| 6-7 | CONDITIONAL | Apply minor fixes, then proceed |
| 0-5 | REJECT | Apply improvements, re-evaluate |
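
The threshold table can be sketched as a small lookup. This is a minimal illustration, not part of the protocol itself: the function name `decide` is hypothetical, and because the table only lists integer bands, treating each band's lower bound as the cutoff for fractional scores is an assumption.

```python
def decide(overall_score: float) -> tuple[str, str]:
    """Map an overall score (0-10) to a (decision, action) pair.

    Assumption: fractional scores fall into the band whose lower
    bound they meet, e.g. 7.9 is CONDITIONAL and 8.0 is ACCEPT.
    """
    if not 0 <= overall_score <= 10:
        raise ValueError(f"score out of range: {overall_score}")
    if overall_score >= 8:
        return ("ACCEPT", "Proceed")
    if overall_score >= 6:
        return ("CONDITIONAL", "Apply minor fixes, then proceed")
    return ("REJECT", "Apply improvements, re-evaluate")
```

For example, `decide(7)` yields the CONDITIONAL decision with the action "Apply minor fixes, then proceed".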

Type-Specific Criteria

Code

| Criterion | Weight | Question |
|---|---|---|
| Correctness | 30% | Produces expected output? Handles edge cases? |
| Readability | 20% | Meaningful names? Clean indentation? |
| Security | 20% | SQL injection? Hardcoded secrets? Unsafe input? |
| Performance | 15% | Unnecessary loops? N+1 queries? Memory leaks? |
| Testability | 15% | Functions independently testable? |
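
The protocol does not spell out how the overall score is calculated. One plausible reading, shown here as a sketch, is a weighted average of the per-criterion scores using the Code weights. The names `CODE_WEIGHTS` and `overall_score` are hypothetical, and the weighted-average formula itself is an assumption.

```python
# Weights for the Code criteria, as whole percentages.
CODE_WEIGHTS = {
    "Correctness": 30,
    "Readability": 20,
    "Security": 20,
    "Performance": 15,
    "Testability": 15,
}


def overall_score(scores: dict[str, float], weights: dict[str, int]) -> float:
    """Weighted average of 0-10 criterion scores (assumed formula)."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    # Integer percentages keep the arithmetic exact until the final division.
    return sum(scores[name] * pct for name, pct in weights.items()) / 100


example = {
    "Correctness": 9,
    "Readability": 8,
    "Security": 6,
    "Performance": 8,
    "Testability": 8,
}
print(overall_score(example, CODE_WEIGHTS))  # 7.9
```

Under this reading, a weak Security score pulls an otherwise strong submission just below the ACCEPT band.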

Report / Written Content

| Criterion | Weight | Question |
|---|---|---|
| Accuracy | 30% | Claims supported? Misleading statements? |
| Coverage | 25% | All requested topics addressed? Missing sections? |
| Clarity | 20% | Target audience can understand? Jargon explained? |
| Structure | 15% | Logical flow? Consistent headings? |
| Actionability | 10% | Reader knows what to do next? |

Plan / Task List

| Criterion | Weight | Question |
|---|---|---|
| Completeness | 30% | All necessary steps present? Critical step missing? |
| Atomicity | 25% | Each step does one thing? Overly broad steps? |
| Dependency accuracy | 20% | Order makes sense? Circular dependencies? |
| Verifiability | 15% | Each step has clear "done" criteria? |
| Realism | 10% | Steps are achievable? Overly optimistic estimates? |

Data / Table

| Criterion | Weight | Question |
|---|---|---|
| Accuracy | 35% | Numbers consistent? Calculations correct? |
| Completeness | 25% | Missing rows/columns? Nulls explained? |
| Format consistency | 20% | Units, date formats, currency consistent? |
| Readability | 20% | Meaningful headers? Proper sorting? |
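
The four criteria tables can be kept in one registry keyed by output type, which also makes it easy to check that each table's weights add up to 100%. The sketch below is illustrative only: the type keys (`"code"`, `"report"`, `"plan"`, `"data"`) and the helper names are hypothetical, while the criterion names and weights are copied from the tables.

```python
# Criterion weights per output type, as whole percentages.
CRITERIA = {
    "code": {"Correctness": 30, "Readability": 20, "Security": 20,
             "Performance": 15, "Testability": 15},
    "report": {"Accuracy": 30, "Coverage": 25, "Clarity": 20,
               "Structure": 15, "Actionability": 10},
    "plan": {"Completeness": 30, "Atomicity": 25, "Dependency accuracy": 20,
             "Verifiability": 15, "Realism": 10},
    "data": {"Accuracy": 35, "Completeness": 25, "Format consistency": 20,
             "Readability": 20},
}


def criteria_for(output_type: str) -> dict[str, int]:
    """Return the weight table for a detected output type."""
    try:
        return CRITERIA[output_type]
    except KeyError:
        raise ValueError(f"unknown output type: {output_type!r}") from None


def invalid_weight_tables() -> list[str]:
    """List the output types whose weights do not sum to 100."""
    return [name for name, weights in CRITERIA.items()
            if sum(weights.values()) != 100]
```

With the weights as listed, `invalid_weight_tables()` returns an empty list, so every table is a complete 100% split.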

Output Format

OUTPUT CRITIC
Type     : [output type]
Decision : ACCEPT / CONDITIONAL / REJECT
Score    : [X/10]

Criterion Scores

| Criterion | Score | Note |
|---|---|---|
| [Criterion 1] | X/10 | [short note] |
| [Criterion 2] | X/10 | [short note] |
| Overall | X/10 | |

Strengths

  • [What was done well — specific]

Weaknesses

  • [What is missing / wrong — specific]

Improvement Suggestions

  1. [Concrete action — what to do, where]
  2. [Concrete action]

Next Step

[Accept -> proceed | Conditional -> fix X | Reject -> apply suggestions, resubmit]

---

Re-Evaluation

When improved output is resubmitted:
RE-EVALUATION
Previous score: X/10
New score     : Y/10
Change        : +N points
Improved      : [which criteria]
Still open    : [remaining issues if any]
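
The re-evaluation block can be generated from the two scores so that the change line is always consistent with them. This is a minimal sketch; the function name `re_evaluation` and its parameters are hypothetical, and printing `none` for an empty list is an assumption.

```python
def re_evaluation(previous: float, new: float,
                  improved: list[str], still_open: list[str]) -> str:
    """Render the RE-EVALUATION block for a resubmitted output."""
    change = new - previous
    lines = [
        "RE-EVALUATION",
        f"Previous score: {previous}/10",
        f"New score     : {new}/10",
        # "+g" always shows the sign, so a regression reads as "-N points".
        f"Change        : {change:+g} points",
        f"Improved      : {', '.join(improved) or 'none'}",
        f"Still open    : {', '.join(still_open) or 'none'}",
    ]
    return "\n".join(lines)


print(re_evaluation(5, 8, ["Security", "Readability"], []))
```

Computing the change rather than typing it by hand also surfaces regressions, where the new score is lower than the previous one.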

When to Skip

  • User said "quick and dirty, doesn't need to be perfect"
  • Prototype / draft stage (user explicitly stated)
  • Single-line simple output

Guardrails

  • Never accept security issues — hardcoded secrets = automatic REJECT regardless of other scores.
  • Be specific in suggestions — "improve code" is useless; "move API key to env var at line 12" is actionable.
  • Cross-skill: works with task-decomposer (validates plan quality); output-critic is the quality gate before task completion.
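
The first guardrail acts as an override on the score-based decision. A minimal sketch of that ordering follows; the function name `final_decision` and the `has_hardcoded_secret` flag are hypothetical, and how a secret is actually detected is outside the scope of this example.

```python
def final_decision(overall_score: float, has_hardcoded_secret: bool) -> str:
    """Apply the security guardrail before the score thresholds."""
    # Guardrail: a hardcoded secret rejects the output whatever the score.
    if has_hardcoded_secret:
        return "REJECT"
    if overall_score >= 8:
        return "ACCEPT"
    if overall_score >= 6:
        return "CONDITIONAL"
    return "REJECT"
```

Checking the guardrail first means a 9/10 submission that contains an API key is still rejected, which matches the "regardless of other scores" wording.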