honest-review
Honest Review
Research-driven code review. Every finding validated with evidence.
Dispatch
| $ARGUMENTS | Mode |
|---|---|
| Empty + changes in session (git diff) | Session review of changed files |
| Empty + no changes (first message) | Full codebase audit |
| File or directory path | Scoped review of that path |
| "audit" | Force full codebase audit |
| PR number/URL | Review PR changes (gh pr diff) |
| Git range (HEAD~3..HEAD) | Review changes in that range |
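The dispatch rules above can be sketched as a small classifier. This is a hypothetical illustration, not the skill's actual implementation; the mode names and the session-change flag are made up for the example:

```python
import re

def classify(arguments: str, has_session_changes: bool) -> str:
    """Map $ARGUMENTS to a review mode, following the dispatch table."""
    if not arguments:
        return "session-review" if has_session_changes else "full-audit"
    if arguments == "audit":
        return "full-audit"
    if re.fullmatch(r"\d+", arguments) or arguments.startswith("http"):
        return "pr-review"        # reviewed via gh pr diff
    if ".." in arguments:
        return "range-review"     # e.g. HEAD~3..HEAD
    return "scoped-review"        # file or directory path
```

Note the ordering: the PR check runs before the range check so a PR URL is never misread as a git range.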
Review Levels (Both Modes)
Every review covers three abstraction levels, each examining both
defects and unnecessary complexity:
Surface (lines, expressions, functions):
Correctness, error handling, security, readability.
Simplify: dead code, complex conditionals to early returns, hand-rolled to stdlib.
Structural (modules, classes, boundaries):
Test coverage, coupling, interface contracts, cognitive complexity.
Simplify: 1:1 wrappers, single-use abstractions, pass-through plumbing.
Algorithmic (algorithms, data structures, system design):
Complexity class, N+1, resource leaks, concurrency.
Simplify: O(n^2) to O(n), wrong data structure, unnecessary serialization.
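A minimal illustration of the algorithmic-level "wrong data structure" item: repeated membership tests against a list cost O(n²) overall, while hashing into a set makes each lookup O(1). Function names are hypothetical:

```python
def shared_items_slow(a, b):
    # O(len(a) * len(b)): each `in` test scans the list b from the start.
    return [x for x in a if x in b]

def shared_items_fast(a, b):
    # O(len(a) + len(b)): build a set once, then each lookup is O(1).
    b_set = set(b)
    return [x for x in a if x in b_set]
```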
Context-dependent: add security checks for auth/payment/user-data code.
Add observability checks for services/APIs.
Full checklists: read references/checklists.md
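As a sketch of the surface-level "complex conditionals to early returns" simplification, here is a hypothetical before/after pair with identical behavior:

```python
def ship_cost_nested(order):
    # Before: three levels of nesting obscure the three actual cases.
    if order is not None:
        if order.get("express"):
            return 15.0
        else:
            if order.get("total", 0) >= 100:
                return 0.0
            else:
                return 5.0
    else:
        raise ValueError("no order")

def ship_cost_flat(order):
    # After: guard clause first, then one decision per line.
    if order is None:
        raise ValueError("no order")
    if order.get("express"):
        return 15.0
    return 0.0 if order.get("total", 0) >= 100 else 5.0
```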
Research Validation
THIS IS THE CORE DIFFERENTIATOR. Do not report findings based solely
on LLM knowledge. For every non-trivial finding, validate with research:
Two-phase review per scope:
- Flag phase: Analyze code, generate hypotheses ("this API may be deprecated", "this SQL pattern may be injectable", "this dependency has a known CVE")
- Validate phase: For each flag, spawn research subagent(s) to confirm:
- Context7: look up current library docs for API correctness
- WebSearch: check current best practices, security advisories
- WebFetch: query package registries (npm, PyPI, crates.io)
- gh: check open issues, security advisories for dependencies
- Only report findings with evidence. Cite sources.
Research playbook: read references/research-playbook.md
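The flag-then-validate contract can be summarized as a finding record that only becomes reportable once research attaches evidence. This schema is a hypothetical sketch, not part of the skill's reference files:

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    hypothesis: str                 # e.g. "this API may be deprecated"
    level: str                      # surface | structural | algorithmic
    severity: str                   # P0-P2 or S0-S2
    evidence: list = field(default_factory=list)  # citations from research

    def reportable(self) -> bool:
        # The core rule above: no evidence, no report.
        return bool(self.evidence)
```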
Mode 1: Session Review
Step 1: Identify Changes
Run `git diff --name-only HEAD` to capture both staged and unstaged changes.
Collect `git diff HEAD` for full context.
Identify original task intent from session history.
Step 2: Scale and Launch
| Scope | Strategy |
|---|---|
| 1-2 files | Inline review at all 3 levels. Spawn research subagents for flagged findings. |
| 3-5 files | Spawn 3 parallel reviewer subagents (Surface/Structural/Algorithmic). Each flags then researches within their level. |
| 6+ files or 3+ modules | Spawn a team. See below. |
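The thresholds in the scaling table translate directly into a dispatch rule (a sketch; the strategy names are illustrative):

```python
def review_strategy(n_files: int, n_modules: int = 1) -> str:
    """Pick a session-review strategy from the scaling table."""
    if n_files >= 6 or n_modules >= 3:
        return "team"                # full team structure
    if n_files >= 3:
        return "parallel-reviewers"  # Surface / Structural / Algorithmic
    return "inline"                  # all 3 levels reviewed inline
```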
Team structure for large session reviews (6+ files):
[Lead: reconcile findings, produce final report]
|-- Surface Reviewer
| Wave 1: subagents analyzing files (1 per file)
| Wave 2: subagents researching flagged findings
|-- Structural Reviewer
| Wave 1: subagents analyzing module boundaries
| Wave 2: subagents researching flagged findings
|-- Algorithmic Reviewer
| Wave 1: subagents analyzing performance/complexity
| Wave 2: subagents researching flagged findings
|-- Verification Runner
Wave 1: subagents running build, lint, tests
Wave 2: subagents spot-checking behavior
Each teammate operates independently. Each runs internal waves of
massively parallelized subagents. No overlapping file ownership.
Step 3: Reconcile (5 Steps)
- Question: For each finding, ask: (a) Is this actually broken or just unfamiliar? (b) Is there research evidence? (c) Would fixing this genuinely improve the code? Discard unvalidated findings.
- Deduplicate: Same issue at different levels — keep deepest root cause
- Resolve conflicts: When levels disagree, choose most net simplification
- Elevate: Surface patterns across files to structural/algorithmic root
- Prioritize: P0/S0 (must fix), P1/S1 (should fix), P2/S2 (report but do not implement)
Severity calibration: P0 = will cause production incident. Not "ugly code."
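A condensed sketch of steps 1, 2, and 5 (question, deduplicate, prioritize) over plain finding dicts; the field names are hypothetical:

```python
def reconcile(findings):
    depth = {"surface": 0, "structural": 1, "algorithmic": 2}
    rank = {"P0": 0, "S0": 0, "P1": 1, "S1": 1, "P2": 2, "S2": 2}
    # Step 1 (question): discard anything without research evidence.
    kept = [f for f in findings if f["evidence"]]
    # Step 2 (deduplicate): same root cause -> keep the deepest level.
    best = {}
    for f in kept:
        cause = f["root_cause"]
        if cause not in best or depth[f["level"]] > depth[best[cause]["level"]]:
            best[cause] = f
    # Step 5 (prioritize): P0/S0 first, then P1/S1, then P2/S2.
    return sorted(best.values(), key=lambda f: rank[f["severity"]])
```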
Step 4: Present and Execute
Present all P0/P1/S0/S1 findings with evidence and citations.
Ask: "Implement fixes? [all / select / skip]"
If approved: parallel subagents by file (no overlapping ownership).
Then verify: build/lint, tests, behavior spot-check.
Output format: read references/output-formats.md
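The approval gate maps the user's answer to a fix set before any subagent touches a file. A hypothetical sketch, with `id` and `selected_ids` invented for the example:

```python
def fixes_to_apply(findings, answer, selected_ids=()):
    """answer is one of 'all', 'select', 'skip' from the prompt above."""
    if answer == "all":
        return list(findings)
    if answer == "skip":
        return []
    # 'select': the user picked a subset of finding ids.
    return [f for f in findings if f["id"] in set(selected_ids)]
```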
Mode 2: Full Codebase Audit
Step 1: Discover
Explore: language(s), framework(s), build system, directory structure,
entry points, dependency manifest, approximate size.
For 500+ files: prioritize recently modified, entry points, public API,
high-complexity areas. State scope in report.
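For the 500+ file case, scoping reduces to an ordered merge: entry points first, then recently modified files, capped at a budget. A hypothetical helper (inputs would come from the manifest and git history):

```python
def audit_scope(entry_points, recently_modified, cap=500):
    """Deduplicated priority list of files for a large audit."""
    ordered, seen = [], set()
    for f in list(entry_points) + list(recently_modified):
        if len(ordered) >= cap:
            break
        if f not in seen:
            seen.add(f)
            ordered.append(f)
    return ordered
```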
Step 2: Design and Launch Team
Spawn a team with domain-based ownership. Each teammate runs all 3
review levels + research validation on their owned files.
[Lead: cross-domain analysis, reconciliation, final report]
|-- Domain A Reviewer — e.g., Backend
| Wave 1: parallel subagents scanning all owned files
| Wave 2: parallel subagents deep-diving flagged files
| Wave 3: parallel subagents researching flagged assumptions
|-- Domain B Reviewer — e.g., Frontend
| [same wave pattern]
|-- Domain C Reviewer — e.g., Tests/Infra
| [same wave pattern]
|-- Dependency and Security Researcher
Wave 1: subagents auditing each dependency (version, CVEs, license)
Wave 2: subagents checking security patterns against current docs
Wave 3: subagents verifying API usage against library docs (Context7)
Adapt team composition to project type.
Team archetypes + scaling: read references/team-templates.md
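Each domain reviewer's three waves reduce to a filter, map, validate pipeline. In this sketch the callables stand in for subagent dispatch; the function and its parameters are hypothetical:

```python
def run_domain_waves(owned_files, scan, deep_dive, research):
    flagged = [f for f in owned_files if scan(f)]   # Wave 1: scan all owned files
    findings = [deep_dive(f) for f in flagged]      # Wave 2: deep-dive flagged files
    validated = []                                  # Wave 3: research each assumption
    for finding in findings:
        evidence = research(finding)
        if evidence:
            validated.append((finding, evidence))
    return validated
```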
Step 3: Teammate Instructions
步骤3:团队成员指令
Each teammate receives: role, owned files, project context, all 3 review
levels, instruction to run two-phase (flag then research-validate), and
findings format. Full template: references/team-templates.md
Step 4: Cross-Domain Analysis (Lead)
步骤4:跨领域分析(负责人)
While teammates review, lead spawns parallel subagents for:
- Architecture: module boundaries, dependency graph
- Data flow: trace key paths end-to-end
- Error propagation: consistency across system
- Shared patterns: duplication vs. necessary abstraction
Step 5: Reconcile Across Domains
步骤5:跨领域结果整合
Same 5-step reconciliation. Cross-domain deduplication and elevation.
Step 6: Report
步骤6:生成报告
Output format: references/output-formats.md
Required sections: Critical, Significant, Cross-Domain, Health Summary,
Top 3 Recommendations. All findings include evidence + citations.
Step 7: Execute (If Approved)
步骤7:执行修复(若获得批准)
Ask: "Implement fixes? [all / select / skip]"
If approved: parallel subagents by file (no overlapping ownership).
Then verify: build/lint, tests, behavior spot-check.
Healthy Codebase
If no P0/P1 or S0 findings: state this explicitly. Acknowledge health.
Do not inflate minor issues. A short report is a good report.
Reference Files
参考文件
| File | When to Read |
|---|---|
| references/checklists.md | During analysis or building teammate prompts |
| references/research-playbook.md | When setting up research validation subagents |
| references/output-formats.md | When producing final output |
| references/team-templates.md | When designing teams (Mode 2 or large Mode 1) |
Critical Rules
核心规则
- Every non-trivial finding must have research evidence or be discarded
- Do not police style — follow the codebase's conventions
- Do not report phantom bugs requiring impossible conditions
- More than 15 findings means re-prioritize — 5 validated findings beat 50 speculative
- Never skip reconciliation
- Always present before implementing (approval gate)
- Always verify after implementing (build, tests, behavior)
- Never assign overlapping file ownership