code-review
Value: Feedback and communication -- structured review catches defects that
the author cannot see, and separating review into stages prevents thoroughness
in one area from crowding out another.
Purpose
目的
Teaches a systematic three-stage code review that evaluates spec compliance,
code quality, and domain integrity as separate passes. Prevents combined
reviews from letting issues slip through by ensuring each dimension gets
focused attention.
Practices
实践方法
Three Stages, In Order
按顺序执行的三个阶段
Review code in three sequential stages. Do not combine them. Each stage has a
single focus. A failure in an earlier stage blocks later stages -- there is no
point reviewing code quality on code that does not meet the spec.
Stage 1: Spec Compliance. Does the code do what was asked? Not more, not
less.
For each acceptance criterion or requirement:
- Find the code that implements it
- Find the test that verifies it
- Confirm the implementation matches the spec exactly
Mark each criterion: PASS, FAIL (missing/incomplete/divergent), or CONCERN
(implemented but potentially incorrect). Flag anything built beyond
requirements as OVER-BUILT.
If any criterion is FAIL, stop. Return to implementation before continuing.
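The Stage 1 bookkeeping can be sketched as a small table mapping each acceptance criterion to its implementing code, its verifying test, and a verdict. The criterion and file names below are hypothetical, purely for illustration:

```python
# Hypothetical Stage 1 ledger: one entry per acceptance criterion.
criteria = {
    "AC-1: user can reset password": {
        "implementation": "src/auth/reset.py",
        "test": "tests/test_reset.py::test_reset_flow",
        "verdict": "PASS",
    },
    "AC-2: reset link expires after 1h": {
        "implementation": None,  # no implementing code found in the diff
        "test": None,
        "verdict": "FAIL",
    },
}

# Any FAIL blocks the review: return to implementation before Stage 2.
blocked = any(c["verdict"] == "FAIL" for c in criteria.values())
assert blocked
```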
Vertical Slice Layer Coverage
垂直切片层覆盖检查
For tasks that implement a vertical slice (adding user-observable behavior), perform the following checks in order:
- Entry-point wiring check (diff-based): Examine whether the changeset includes modifications to the application's entry point or its wiring/routing layer. If the slice claims to add new user-observable behavior but the diff does not touch any wiring or entry-point code, the review fails unless the author explicitly documents why existing wiring already routes to the new behavior.
- End-to-end traceability: Verify that a path can be traced from the application's external entry point, through any infrastructure or integration layer, to the new domain logic, and back to observable output. If any segment of this path is missing from the changeset and not already present in the codebase, flag the gap.
- Boundary-level test coverage: Confirm that at least one test exercises the new behavior through the application's external boundary (e.g., an HTTP request, a CLI invocation, a message on a queue) rather than calling internal functions directly. Where the application architecture makes automated boundary tests feasible, their absence is a review concern.
- Test-level smell check: If every test in the changeset is a unit test of isolated internal functions with no integration or acceptance-level test, flag this as a concern. The slice may be implementing domain logic without proving it is reachable through the running application.
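The difference between unit-only and boundary-level coverage can be sketched with a toy application; the function and CLI entry point here are hypothetical, not part of any real codebase:

```python
# Hypothetical minimal app: internal domain logic plus a CLI entry point.
def normalize_username(raw: str) -> str:
    """Internal domain logic."""
    return raw.strip().lower()

def main(argv: list[str]) -> str:
    """External boundary: the CLI entry point wired to the domain logic."""
    if not argv:
        return "usage: app <username>"
    return normalize_username(argv[0])

# Unit test of an isolated internal function -- necessary but not sufficient:
assert normalize_username("  Alice ") == "alice"

# Boundary-level test: exercises the behavior through the entry point,
# proving the new logic is actually reachable from outside the app.
assert main(["  Alice "]) == "alice"
```

If only the first assertion existed, the slice could pass review while `main` never routed to `normalize_username` at all, which is exactly the gap the wiring and smell checks above are meant to catch.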
Stage 2: Code Quality. Is the code clear, maintainable, and well-tested?
Review each changed file for:
- Clarity: Can you understand what the code does without extra context? Are names descriptive? Is the structure obvious?
- Domain types: Are semantic types used where primitives appear? You MUST follow
the domain-modeling skill for primitive obsession detection.
- Error handling: Are errors handled with typed errors? Are all paths covered?
- Test quality: Do tests verify behavior, not implementation? Is coverage adequate for the changed code?
- YAGNI: Is there unused code, speculative features, or premature abstraction?
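The domain-types and error-handling checks can be illustrated with a minimal parse-don't-validate sketch; the `EmailAddress` type is a hypothetical example, not drawn from any codebase under review:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EmailAddress:
    """Semantic type: validated at construction, so downstream code
    never needs to re-check (parse-don't-validate)."""
    value: str

    def __post_init__(self):
        if "@" not in self.value:
            raise ValueError(f"not an email address: {self.value!r}")

def send_welcome(to: EmailAddress) -> str:
    # A signature taking a raw `str` here would be flagged as primitive obsession.
    return f"welcome mail queued for {to.value}"

assert send_welcome(EmailAddress("a@example.com")).startswith("welcome")
```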
Categorize findings by severity:
- CRITICAL: Bug risk, likely to cause defects
- IMPORTANT: Maintainability concern, should fix before merge
- SUGGESTION: Style or minor improvement, optional
If any CRITICAL issue exists, stop. Return to implementation.
Stage 3: Domain Integrity. Final gate -- does the code respect domain
boundaries?
Check for:
- Compile-time enforcement opportunities: Are tests checking things the type system could enforce instead?
- Domain type consistency: Are semantic types used at all boundaries, or do primitives leak through?
- Validation placement: Is validation at construction (parse-don't-validate), not scattered through business logic?
- State representation: Can the types represent invalid states?
Flag issues but do not block on suggestions. Domain integrity flags are
strongly recommended but not required for merge.
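The gating rule across the three stages -- blocking failures stop the review, while Stage 3 flags do not -- can be sketched as follows (stage names and check functions are illustrative stand-ins):

```python
def run_review(stages):
    """stages: ordered list of (name, check_fn, blocking) tuples."""
    results = {}
    for name, check, blocking in stages:
        verdict = check()
        results[name] = verdict
        if blocking and verdict == "FAIL":
            break  # a failed blocking stage halts all later stages
    return results

results = run_review([
    ("spec-compliance", lambda: "FAIL", True),   # Stage 1: blocking
    ("code-quality", lambda: "PASS", True),      # Stage 2: blocking
    ("domain-integrity", lambda: "PASS", False), # Stage 3: advisory
])

# Stage 1 failed, so Stages 2 and 3 never ran.
assert results == {"spec-compliance": "FAIL"}
```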
Review Output
Produce a structured summary after all three stages:
REVIEW SUMMARY
Stage 1 (Spec Compliance): PASS/FAIL
Stage 2 (Code Quality): PASS/FAIL/PASS with suggestions
Stage 3 (Domain Integrity): PASS/FAIL/PASS with flags
Overall: APPROVED / CHANGES REQUIRED
If CHANGES REQUIRED:
1. [specific required change]
2. [specific required change]
Structured Review Evidence
After completing all three stages, produce a REVIEW_RESULT evidence packet
containing: per-stage verdicts {stage, verdict (PASS/FAIL), findings
[{severity, description, file, line?, required_change?}]}, overall_verdict,
required_changes_count, blocking_findings_count.
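One possible shape for the REVIEW_RESULT packet, following the field names listed above; the exact schema, file path, and finding text are assumptions for illustration:

```python
import json

# Hypothetical REVIEW_RESULT evidence packet.
review_result = {
    "stages": [
        {
            "stage": "spec-compliance",
            "verdict": "FAIL",
            "findings": [
                {
                    "severity": "CRITICAL",
                    "description": "AC-2 has no implementing code",
                    "file": "src/orders.py",
                    "line": None,              # optional field
                    "required_change": "Implement AC-2 before re-review",
                }
            ],
        }
    ],
    "overall_verdict": "CHANGES_REQUIRED",
    "required_changes_count": 1,
    "blocking_findings_count": 1,
}

print(json.dumps(review_result, indent=2))
```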
When pipeline-state is provided in context metadata, the code-review skill
operates in pipeline mode and stores the evidence to
.factory/audit-trail/slices/<slice-id>/review.json. When running
standalone, the evidence is informational only (not stored).
In factory mode, the full team reviews before the pipeline pushes code --
this is the quality checkpoint that replaces consensus-during-build. All
blocking review feedback must be addressed before push. See
references/mob-review.md for the factory mode review subsection.
Handling Disagreements
When your review finding conflicts with the implementation approach:
- State the concern with specific code references
- Explain the risk -- what could go wrong
- Propose an alternative
- If no agreement after one round, escalate to the user
You exist to catch what the author missed, not to block progress.
Business Value and UX Awareness
During Stage 1, also consider:
- Does this slice deliver visible user value?
- Are acceptance criteria specific and testable (not vague)?
- Does the user journey remain coherent after this change?
- Are edge cases and error states handled from the user's perspective?
These are not blocking concerns but should be noted when relevant.
Enforcement Note
This skill provides advisory guidance. It instructs the agent on correct
review procedure but cannot mechanically prevent skipping stages or merging
without review. When used with the tdd skill in automated mode, the
orchestrator can gate PR creation on review completion. In guided mode or
standalone, the agent follows these practices by convention. If you observe
stages being skipped, point it out.
Verification
After completing a review guided by this skill, verify:
- All three stages were performed separately, in order
- Every acceptance criterion was mapped to code and tests in Stage 1
- Each changed file was assessed for clarity and domain type usage in Stage 2
- Domain integrity was checked for compile-time enforcement opportunities in Stage 3
- A structured summary was produced with clear PASS/FAIL per stage
- Any CHANGES REQUIRED items list specific, actionable fixes
If any criterion is not met, revisit the relevant stage before finalizing.
Dependencies
This skill works standalone. For enhanced workflows, it integrates with:
- domain-modeling: Provides the primitive obsession and parse-don't-validate principles referenced in Stage 2 and Stage 3
- tdd: Reviews often follow a TDD cycle; this skill validates the output of that cycle
- mutation-testing: Can follow code review as an additional quality gate
Missing a dependency? Install with:
npx skills add jwilger/agent-skills --skill domain-modeling