ce-code-review

Code Review

Reviews code changes using dynamically selected reviewer personas. Spawns parallel sub-agents that return structured JSON, then merges and deduplicates findings into a single report.

When to Use

  • Before creating a PR
  • After completing a task during iterative implementation
  • When feedback is needed on any code changes
  • Can be invoked standalone
  • Can run as a read-only or autofix review step inside larger workflows

Argument Parsing

Parse `$ARGUMENTS` for the following optional tokens. Strip each recognized token before interpreting the remainder as the PR number, GitHub URL, or branch name.

| Token | Example | Effect |
|---|---|---|
| `mode:autofix` | `mode:autofix` | Select autofix mode (see Mode Detection below) |
| `mode:report-only` | `mode:report-only` | Select report-only mode |
| `mode:headless` | `mode:headless` | Select headless mode for programmatic callers (see Mode Detection below) |
| `base:<sha-or-ref>` | `base:abc1234` or `base:origin/main` | Skip scope detection and use this as the diff base directly |
| `plan:<path>` | `plan:docs/plans/2026-03-25-001-feat-foo-plan.md` | Load this plan for requirements verification |

All tokens are optional. Each one present means one less thing to infer. When absent, fall back to existing behavior for that stage.
Conflicting mode flags: If multiple mode tokens appear in arguments, stop and do not dispatch agents. If `mode:headless` is one of the conflicting tokens, emit the headless error envelope:
Review failed (headless mode). Reason: conflicting mode flags — <mode_a> and <mode_b> cannot be combined.
Otherwise emit the generic form:
Review failed. Reason: conflicting mode flags — <mode_a> and <mode_b> cannot be combined.
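The token rules above can be sketched as a small shell function. This is a minimal illustration under stated assumptions, not the skill's implementation: `parse_review_args` and the variables it sets are hypothetical names, filling `<mode_a>` and `<mode_b>` with `mode:<name>` is an assumption, and so is treating a repeated identical mode token as harmless.

```bash
# Hypothetical helper (not part of the skill). Sets the global variables
# MODE, BASE_REF, PLAN_PATH, TARGET, and PARSE_ERROR from its arguments.
parse_review_args() {
  MODE="" BASE_REF="" PLAN_PATH="" TARGET="" PARSE_ERROR=""
  for tok in "$@"; do
    case "$tok" in
      mode:autofix|mode:report-only|mode:headless)
        new_mode=${tok#mode:}
        # Assumption: the same mode repeated twice is not a conflict.
        if [ -n "$MODE" ] && [ "$MODE" != "$new_mode" ]; then
          if [ "$MODE" = "headless" ] || [ "$new_mode" = "headless" ]; then
            PARSE_ERROR="Review failed (headless mode). Reason: conflicting mode flags — mode:$MODE and mode:$new_mode cannot be combined."
          else
            PARSE_ERROR="Review failed. Reason: conflicting mode flags — mode:$MODE and mode:$new_mode cannot be combined."
          fi
          return 1
        fi
        MODE=$new_mode ;;
      base:*) BASE_REF=${tok#base:} ;;
      plan:*) PLAN_PATH=${tok#plan:} ;;
      # Anything unrecognized is kept as the PR number, URL, or branch name.
      *) TARGET="${TARGET:+$TARGET }$tok" ;;
    esac
  done
  # No mode token present means the default, interactive mode.
  [ -n "$MODE" ] || MODE="interactive"
  return 0
}
```

Calling `parse_review_args mode:autofix 348` would leave `MODE=autofix` and `TARGET=348`; a caller would print `PARSE_ERROR` and stop when the function returns non-zero.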

Mode Detection

| Mode | When | Behavior |
|---|---|---|
| Interactive (default) | No mode token present | Review, apply `safe_auto` fixes automatically, present findings, ask for policy decisions on gated/manual findings, and optionally continue into fix/push/PR next steps |
| Autofix | `mode:autofix` in arguments | No user interaction. Review, apply only policy-allowed `safe_auto` fixes, re-review in bounded rounds, write a run artifact, and emit residual downstream work when needed |
| Report-only | `mode:report-only` in arguments | Strictly read-only. Review and report only, then stop with no edits, artifacts, todos, commits, pushes, or PR actions |
| Headless | `mode:headless` in arguments | Programmatic mode for skill-to-skill invocation. Apply `safe_auto` fixes silently (single pass), return all other findings as structured text output, write run artifacts, skip todos, and return "Review complete" signal. No interactive prompts. |
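Two of the distinctions in the table (whether a mode may edit files, and whether it may prompt the user) can be read off as simple lookups. The function names below are hypothetical and the sketch covers only what the table states.

```bash
# Hypothetical helpers; the answers are taken from the mode table.
# Returns 0 when the mode is allowed to edit files in the checkout.
mode_may_edit_files() {
  case "$1" in
    report-only) return 1 ;;                       # strictly read-only
    interactive|autofix|headless) return 0 ;;      # all apply safe_auto fixes
    *) return 2 ;;                                 # unknown mode
  esac
}

# Returns 0 when the mode is allowed to ask the user questions.
mode_asks_user() {
  case "$1" in
    interactive) return 0 ;;
    *) return 1 ;;
  esac
}
```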

Autofix mode rules

  • Skip all user questions. Never pause for approval or clarification once scope has been established.
  • Apply only `safe_auto -> review-fixer` findings. Leave `gated_auto`, `manual`, `human`, and `release` work unresolved.
  • Write a run artifact under `.context/compound-engineering/ce-code-review/<run-id>/` summarizing findings, applied fixes, residual actionable work, and advisory outputs.
  • Create durable todo files only for unresolved actionable findings whose final owner is `downstream-resolver`. Load the `ce-todo-create` skill for the canonical directory path and naming convention.
  • Never commit, push, or create a PR from autofix mode. Parent workflows own those decisions.

Report-only mode rules

  • Skip all user questions. Infer intent conservatively if the diff metadata is thin.
  • Never edit files or externalize work. Do not write `.context/compound-engineering/ce-code-review/<run-id>/`, do not create todo files, and do not commit, push, or create a PR.
  • Safe for parallel read-only verification. `mode:report-only` is the only mode that is safe to run concurrently with browser testing on the same checkout.
  • Do not switch the shared checkout. If the caller passes an explicit PR or branch target, `mode:report-only` must run in an isolated checkout/worktree or stop instead of running `gh pr checkout` / `git checkout`.
  • Do not overlap mutating review with browser testing on the same checkout. If a future orchestrator wants fixes, run the mutating review phase after browser testing or in an isolated checkout/worktree.

Headless mode rules

  • Skip all user questions. Never use the platform question tool (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini) or other interactive prompts. Infer intent conservatively if the diff metadata is thin.
  • Require a determinable diff scope. If headless mode cannot determine a diff scope (no branch, PR, or `base:` ref determinable without user interaction), emit `Review failed (headless mode). Reason: no diff scope detected. Re-invoke with a branch name, PR number, or base:<ref>.` and stop without dispatching agents.
  • Apply only `safe_auto -> review-fixer` findings in a single pass. No bounded re-review rounds. Leave `gated_auto`, `manual`, `human`, and `release` work unresolved and return them in the structured output.
  • Return all non-auto findings as structured text output. Use the headless output envelope format (see Stage 6 below) preserving severity, autofix_class, owner, requires_verification, confidence, pre_existing, and suggested_fix per finding. Enrich with detail-tier fields (why_it_matters, evidence[]) from the per-agent artifact files on disk (see Detail enrichment in Stage 6).
  • Write a run artifact under `.context/compound-engineering/ce-code-review/<run-id>/` summarizing findings, applied fixes, and advisory outputs. Include the artifact path in the structured output.
  • Do not create todo files. The caller receives structured findings and routes downstream work itself.
  • Do not switch the shared checkout. If the caller passes an explicit PR or branch target, `mode:headless` must run in an isolated checkout/worktree or stop instead of running `gh pr checkout` / `git checkout`. When stopping, emit `Review failed (headless mode). Reason: cannot switch shared checkout. Re-invoke with base:<ref> to review the current checkout, or run from an isolated worktree.`
  • Not safe for concurrent use on a shared checkout. Unlike `mode:report-only`, headless mutates files (applies `safe_auto` fixes). Callers must not run headless concurrently with other mutating operations on the same checkout.
  • Never commit, push, or create a PR from headless mode. The caller owns those decisions.
  • End with "Review complete" as the terminal signal so callers can detect completion. If all reviewers fail or time out, emit `Code review degraded (headless mode). Reason: 0 of N reviewers returned results.` followed by "Review complete".

Interactive mode rules

  • Pre-load the platform question tool before any question fires. In Claude Code, `AskUserQuestion` is a deferred tool: its schema is not available at session start. At the start of Interactive-mode work (before Stage 2 intent-ambiguity questions, the After-Review routing question, walk-through per-finding questions, bulk-preview Proceed/Cancel, and tracker-defer failure sub-questions), call `ToolSearch` with query `select:AskUserQuestion` to load the schema. Load it once, eagerly, at the top of the Interactive flow; do not wait for the first question site and do not decide it on a per-site basis. On Codex (`request_user_input`) and Gemini (`ask_user`) this step is not required; the tools are loaded by default.
  • The numbered-list fallback only applies on confirmed load failure. The skill's fallback pattern ("present the options as a numbered list and wait for the user's reply") is valid only when `ToolSearch` returns no match or the tool call explicitly fails. Rendering a question as narrative text because the tool feels inconvenient, because the model is in report-formatting mode, or because the instruction was buried in a long skill is a bug. A question that calls for a user decision must either fire the tool or fail loudly.

Severity Scale

All reviewers use P0-P3:

| Level | Meaning | Action |
|---|---|---|
| P0 | Critical breakage, exploitable vulnerability, data loss/corruption | Must fix before merge |
| P1 | High-impact defect likely hit in normal usage, breaking contract | Should fix |
| P2 | Moderate issue with meaningful downside (edge case, perf regression, maintainability trap) | Fix if straightforward |
| P3 | Low-impact, narrow scope, minor improvement | User's discretion |

Action Routing

Severity answers urgency. Routing answers who acts next and whether this skill may mutate the checkout.

| `autofix_class` | Default owner | Meaning |
|---|---|---|
| `safe_auto` | `review-fixer` | Local, deterministic fix suitable for the in-skill fixer when the current mode allows mutation |
| `gated_auto` | `downstream-resolver` or `human` | Concrete fix exists, but it changes behavior, contracts, permissions, or another sensitive boundary that should not be auto-applied by default |
| `manual` | `downstream-resolver` or `human` | Actionable work that should be handed off rather than fixed in-skill |
| `advisory` | `human` or `release` | Report-only output such as learnings, rollout notes, or residual risk |

Routing rules:
  • Synthesis owns the final route. Persona-provided routing metadata is input, not the last word.
  • Choose the more conservative route on disagreement. A merged finding may move from `safe_auto` to `gated_auto` or `manual`, but never the other way without stronger evidence.
  • Only `safe_auto -> review-fixer` enters the in-skill fixer queue automatically.
  • `requires_verification: true` means a fix is not complete without targeted tests, a focused re-review, or operational validation.
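The "more conservative route wins" rule can be sketched as a rank comparison. This is only an illustration: `route_rank` and `merge_route` are hypothetical names, and ranking `manual` above `gated_auto` is an assumption (the rules above only say a finding may move away from `safe_auto`, not how the other two compare). `advisory` is deliberately left out.

```bash
# Hypothetical ranking: higher number = more conservative (assumed order).
route_rank() {
  case "$1" in
    safe_auto)  echo 0 ;;
    gated_auto) echo 1 ;;
    manual)     echo 2 ;;
    *)          return 1 ;;   # advisory and unknown values are out of scope here
  esac
}

# Print the more conservative of two autofix_class values reported
# by different reviewers for the same merged finding.
merge_route() {
  rank_a=$(route_rank "$1") || return 1
  rank_b=$(route_rank "$2") || return 1
  if [ "$rank_a" -ge "$rank_b" ]; then echo "$1"; else echo "$2"; fi
}
```

With this ordering, `merge_route safe_auto gated_auto` prints `gated_auto`, so a finding never drifts back toward auto-application just because one reviewer was more permissive.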

Reviewers

17 reviewer personas in layered conditionals, plus CE-specific agents. See the persona catalog included below for the full catalog.
Always-on (every review):

| Agent | Focus |
|---|---|
| `review:ce-correctness-reviewer` | Logic errors, edge cases, state bugs, error propagation |
| `review:ce-testing-reviewer` | Coverage gaps, weak assertions, brittle tests |
| `review:ce-maintainability-reviewer` | Coupling, complexity, naming, dead code, abstraction debt |
| `review:ce-project-standards-reviewer` | CLAUDE.md and AGENTS.md compliance -- frontmatter, references, naming, portability |
| `review:ce-agent-native-reviewer` | Verify new features are agent-accessible |
| `research:ce-learnings-researcher` | Search docs/solutions/ for past issues related to this PR |

Cross-cutting conditional (selected per diff):

| Agent | Select when diff touches... |
|---|---|
| `review:ce-security-reviewer` | Auth, public endpoints, user input, permissions |
| `review:ce-performance-reviewer` | DB queries, data transforms, caching, async |
| `review:ce-api-contract-reviewer` | Routes, serializers, type signatures, versioning |
| `review:ce-data-migrations-reviewer` | Migrations, schema changes, backfills |
| `review:ce-reliability-reviewer` | Error handling, retries, timeouts, background jobs |
| `review:ce-adversarial-reviewer` | Diff >=50 changed non-test/non-generated/non-lockfile lines, or auth, payments, data mutations, external APIs |
| `review:ce-cli-readiness-reviewer` | CLI command definitions, argument parsing, CLI framework usage, command handler implementations |
| `review:ce-previous-comments-reviewer` | Reviewing a PR that has existing review comments or threads |

Stack-specific conditional (selected per diff):

| Agent | Select when diff touches... |
|---|---|
| `review:ce-dhh-rails-reviewer` | Rails architecture, service objects, session/auth choices, or Hotwire-vs-SPA boundaries |
| `review:ce-kieran-rails-reviewer` | Rails application code where conventions, naming, and maintainability are in play |
| `review:ce-kieran-python-reviewer` | Python modules, endpoints, scripts, or services |
| `review:ce-kieran-typescript-reviewer` | TypeScript components, services, hooks, utilities, or shared types |
| `review:ce-julik-frontend-races-reviewer` | Stimulus/Turbo controllers, DOM events, timers, animations, or async UI flows |

CE conditional (migration-specific):

| Agent | Select when diff includes migration files |
|---|---|
| `review:ce-schema-drift-detector` | Cross-references schema.rb against included migrations |
| `review:ce-deployment-verification-agent` | Produces deployment checklist with SQL verification queries |

Review Scope

Every review spawns all 4 always-on personas plus the 2 CE always-on agents, then adds whichever cross-cutting and stack-specific conditionals fit the diff. The model naturally right-sizes: a small config change triggers 0 conditionals = 6 reviewers. A Rails auth feature might trigger security + reliability + kieran-rails + dhh-rails = 10 reviewers.

Protected Artifacts

The following paths are compound-engineering pipeline artifacts and must never be flagged for deletion, removal, or gitignore by any reviewer:
  • `docs/brainstorms/*` -- requirements documents created by ce-brainstorm
  • `docs/plans/*.md` -- plan files created by ce-plan (living documents with progress checkboxes)
  • `docs/solutions/*.md` -- solution documents created during the pipeline
If a reviewer flags any file in these directories for cleanup or removal, discard that finding during synthesis.
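During synthesis, the discard rule above amounts to a path check on each cleanup or removal finding. A minimal sketch, assuming a hypothetical helper name `is_protected_path` and that the finding's file path is available as a plain string:

```bash
# Hypothetical helper: returns 0 when the path is a protected pipeline artifact.
# Note: in a shell `case` pattern, `*` also matches `/`, so nested files
# under these directories match as well.
is_protected_path() {
  case "$1" in
    docs/brainstorms/*)  return 0 ;;
    docs/plans/*.md)     return 0 ;;
    docs/solutions/*.md) return 0 ;;
    *)                   return 1 ;;
  esac
}
```

A synthesis step could call `is_protected_path "$file" && continue` while looping over removal findings, so those findings never reach the report.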

How to Run

Stage 1: Determine scope

Compute the diff range, file list, and diff. Minimize permission prompts by combining into as few commands as possible.
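If `base:` argument is provided (fast path):
The caller already knows the diff base. Skip all base-branch detection and remote resolution. Use the provided value directly. The commands below take the merge-base of HEAD and the provided ref when Git can compute one, and otherwise use the ref as given: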
BASE_ARG="{base_arg}"
BASE=$(git merge-base HEAD "$BASE_ARG" 2>/dev/null) || BASE="$BASE_ARG"
Then produce the same output as the other paths:
echo "BASE:$BASE" && echo "FILES:" && git diff --name-only $BASE && echo "DIFF:" && git diff -U10 $BASE && echo "UNTRACKED:" && git ls-files --others --exclude-standard
This path works with any ref: a SHA, `origin/main`, a branch name. Automated callers (ce-work, lfg, slfg) should prefer this to avoid the detection overhead. Do not combine `base:` with a PR number or branch target. If both are present, stop with an error: "Cannot use `base:` with a PR number or branch target — `base:` implies the current checkout is already the correct branch. Pass `base:` alone, or pass the target alone and let scope detection resolve the base." This avoids scope/intent mismatches where the diff base comes from one source but the code and metadata come from another.
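The `base:` plus target conflict can be caught with a guard that runs before any Git command. A minimal sketch; `check_base_conflict` is a hypothetical name, and its two arguments are assumed to be the parsed `base:` value and the remaining target (PR number, URL, or branch), either of which may be empty.

```bash
# Hypothetical guard: $1 = value of the base: token, $2 = remaining target.
# Prints the error from the text above and returns 1 when both are present.
check_base_conflict() {
  if [ -n "$1" ] && [ -n "$2" ]; then
    echo 'Cannot use base: with a PR number or branch target — base: implies the current checkout is already the correct branch. Pass base: alone, or pass the target alone and let scope detection resolve the base.' >&2
    return 1
  fi
  return 0
}
```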
If a PR number or GitHub URL is provided as an argument:
If `mode:report-only` or `mode:headless` is active, do not run `gh pr checkout <number-or-url>` on the shared checkout. For `mode:report-only`, tell the caller: "mode:report-only cannot switch the shared checkout to review a PR target. Run it from an isolated worktree/checkout for that PR, or run report-only with no target argument on the already checked out branch." For `mode:headless`, emit `Review failed (headless mode). Reason: cannot switch shared checkout. Re-invoke with base:<ref> to review the current checkout, or run from an isolated worktree.` Stop here unless the review is already running in an isolated checkout.
First, verify the worktree is clean before switching branches:
git status --porcelain
If the output is non-empty, inform the user: "You have uncommitted changes on the current branch. Stash or commit them before reviewing a PR, or use standalone mode (no argument) to review the current branch as-is." Do not proceed with checkout until the worktree is clean.
Then check out the PR branch so persona agents can read the actual code (not the current checkout):
gh pr checkout <number-or-url>
Then fetch PR metadata. Capture the base branch name and the PR base repository identity, not just the branch name:
gh pr view <number-or-url> --json title,body,baseRefName,headRefName,url
Use the repository portion of the returned PR URL as `<base-repo>` (for example, `EveryInc/compound-engineering-plugin` from `https://github.com/EveryInc/compound-engineering-plugin/pull/348`).
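Deriving `<base-repo>` from the PR URL needs only shell parameter expansion. The helper below is a hypothetical sketch (`pr_url_to_repo` is not part of the skill) and assumes the `https://github.com/OWNER/REPO/pull/N` URL shape shown in the example.

```bash
# Hypothetical helper: print OWNER/REPO for a github.com pull-request URL,
# or return 1 when the URL does not have the expected shape.
pr_url_to_repo() {
  url=${1#https://github.com/}
  [ "$url" != "$1" ] || return 1     # prefix was not removed: not a github.com URL
  case "$url" in
    */*/pull/*) ;;                   # expect OWNER/REPO/pull/N
    *) return 1 ;;
  esac
  echo "${url%%/pull/*}"             # drop "/pull/N" and anything after it
}
```

For the example URL above, `pr_url_to_repo` prints `EveryInc/compound-engineering-plugin`.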
Then compute a local diff against the PR's base branch so re-reviews also include local fix commits and uncommitted edits. Substitute the PR base branch from metadata (shown here as `<base>`) and the PR base repository identity derived from the PR URL (shown here as `<base-repo>`). Resolve the base ref from the PR's actual base repository, not by assuming `origin` points at that repo:
PR_BASE_REMOTE=$(git remote -v | awk 'index($2, "github.com:<base-repo>") || index($2, "github.com/<base-repo>") {print $1; exit}')
if [ -n "$PR_BASE_REMOTE" ]; then PR_BASE_REMOTE_REF="$PR_BASE_REMOTE/<base>"; else PR_BASE_REMOTE_REF=""; fi
PR_BASE_REF=$(git rev-parse --verify "$PR_BASE_REMOTE_REF" 2>/dev/null || git rev-parse --verify <base> 2>/dev/null || true)
if [ -z "$PR_BASE_REF" ]; then
  if [ -n "$PR_BASE_REMOTE_REF" ]; then
    git fetch --no-tags "$PR_BASE_REMOTE" <base>:refs/remotes/"$PR_BASE_REMOTE"/<base> 2>/dev/null || git fetch --no-tags "$PR_BASE_REMOTE" <base> 2>/dev/null || true
    PR_BASE_REF=$(git rev-parse --verify "$PR_BASE_REMOTE_REF" 2>/dev/null || git rev-parse --verify <base> 2>/dev/null || true)
  else
    if git fetch --no-tags https://github.com/<base-repo>.git <base> 2>/dev/null; then
      PR_BASE_REF=$(git rev-parse --verify FETCH_HEAD 2>/dev/null || true)
    fi
    if [ -z "$PR_BASE_REF" ]; then PR_BASE_REF=$(git rev-parse --verify <base> 2>/dev/null || true); fi
  fi
fi
if [ -n "$PR_BASE_REF" ]; then BASE=$(git merge-base HEAD "$PR_BASE_REF" 2>/dev/null) || BASE=""; else BASE=""; fi
if [ -n "$BASE" ]; then echo "BASE:$BASE" && echo "FILES:" && git diff --name-only $BASE && echo "DIFF:" && git diff -U10 $BASE && echo "UNTRACKED:" && git ls-files --others --exclude-standard; else echo "ERROR: Unable to resolve PR base branch <base> locally. Fetch the base branch and rerun so the review scope stays aligned with the PR."; fi
Extract PR title/body, base branch, and PR URL from `gh pr view`, then extract the base marker, file list, diff content, and `UNTRACKED:` list from the local command. Do not use `gh pr diff` as the review scope after checkout; it only reflects the remote PR state and will miss local fix commits until they are pushed. If the base ref still cannot be resolved from the PR's actual base repository after the fetch attempt, stop instead of falling back to `git diff HEAD`; a PR review without the PR base branch is incomplete.
If a branch name is provided as an argument:
Check out the named branch, then diff it against the base branch. Substitute the provided branch name (shown here as `<branch>`).
If `mode:report-only` or `mode:headless` is active, do not run `git checkout <branch>` on the shared checkout. For `mode:report-only`, tell the caller: "mode:report-only cannot switch the shared checkout to review another branch. Run it from an isolated worktree/checkout for `<branch>`, or run report-only on the current checkout with no target argument." For `mode:headless`, emit `Review failed (headless mode). Reason: cannot switch shared checkout. Re-invoke with base:<ref> to review the current checkout, or run from an isolated worktree.` Stop here unless the review is already running in an isolated checkout.
First, verify the worktree is clean before switching branches:
git status --porcelain
If the output is non-empty, inform the user: "You have uncommitted changes on the current branch. Stash or commit them before reviewing another branch, or provide a PR number instead." Do not proceed with checkout until the worktree is clean.
git checkout <branch>
Then detect the review base branch and compute the merge-base. Run the `references/resolve-base.sh` script, which handles fork-safe remote resolution with multi-fallback detection (PR metadata -> `origin/HEAD` -> `gh repo view` -> common branch names):
RESOLVE_OUT=$(bash references/resolve-base.sh) || { echo "ERROR: resolve-base.sh failed"; exit 1; }
if [ -z "$RESOLVE_OUT" ] || echo "$RESOLVE_OUT" | grep -q '^ERROR:'; then echo "${RESOLVE_OUT:-ERROR: resolve-base.sh produced no output}"; exit 1; fi
BASE=$(echo "$RESOLVE_OUT" | sed 's/^BASE://')
If the script outputs an error, stop instead of falling back to `git diff HEAD`; a branch review without the base branch would only show uncommitted changes and silently miss all committed work.
On success, produce the diff:
echo "BASE:$BASE" && echo "FILES:" && git diff --name-only $BASE && echo "DIFF:" && git diff -U10 $BASE && echo "UNTRACKED:" && git ls-files --others --exclude-standard
You may still fetch additional PR metadata with `gh pr view` for title, body, and linked issues, but do not fail if no PR exists.
If no argument (standalone on current branch):
Detect the review base branch and compute the merge-base using the same `references/resolve-base.sh` script as branch mode:
RESOLVE_OUT=$(bash references/resolve-base.sh) || { echo "ERROR: resolve-base.sh failed"; exit 1; }
if [ -z "$RESOLVE_OUT" ] || echo "$RESOLVE_OUT" | grep -q '^ERROR:'; then echo "${RESOLVE_OUT:-ERROR: resolve-base.sh produced no output}"; exit 1; fi
BASE=$(echo "$RESOLVE_OUT" | sed 's/^BASE://')
If the script outputs an error, stop instead of falling back to `git diff HEAD`; a standalone review without the base branch would only show uncommitted changes and silently miss all committed work on the branch.
On success, produce the diff:
echo "BASE:$BASE" && echo "FILES:" && git diff --name-only $BASE && echo "DIFF:" && git diff -U10 $BASE && echo "UNTRACKED:" && git ls-files --others --exclude-standard
Using `git diff $BASE` (without `..HEAD`) diffs the merge-base against the working tree, which includes committed, staged, and unstaged changes together.
Untracked file handling: Always inspect the `UNTRACKED:` list, even when `FILES:` / `DIFF:` are non-empty. Untracked files are outside review scope until staged. If the list is non-empty, tell the user which files are excluded. If any of them should be reviewed, stop and tell the user to `git add` them first and rerun. Only continue when the user is intentionally reviewing tracked changes only. In `mode:headless` or `mode:autofix`, do not stop to ask; proceed with tracked changes only and note the excluded untracked files in the Coverage section of the output.
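Because every scope command prints the same `BASE:` / `FILES:` / `DIFF:` / `UNTRACKED:` markers, the untracked list can be pulled out of the combined output with a short filter. This is a sketch under stated assumptions: both function names are hypothetical, the wording of the note is invented for illustration, and it assumes `UNTRACKED:` appears on a line of its own, as the `echo` in the commands above produces.

```bash
# Hypothetical helper: read the combined scope output on stdin and print
# only the lines that follow the UNTRACKED: marker.
untracked_from_output() {
  awk 'found { print } /^UNTRACKED:$/ { found = 1 }'
}

# Hypothetical helper: print a short note listing the excluded files,
# or nothing when the untracked list is empty.
note_excluded_untracked() {
  files=$(untracked_from_output)
  if [ -n "$files" ]; then
    echo "Excluded untracked files (not staged):"
    echo "$files" | sed 's/^/  - /'
  fi
}
```

In an interactive run the same list would drive the prompt to `git add` the files; in `mode:headless` or `mode:autofix` the note would simply be carried into the Coverage section.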
计算差异范围、文件列表和差异内容。通过合并为尽可能少的命令来最小化权限提示。
如果提供了
base:
参数(快速路径):
调用者已明确差异基准。跳过所有基准分支检测、远程解析和合并基准计算。直接使用提供的值:
BASE_ARG="{base_arg}"
BASE=$(git merge-base HEAD "$BASE_ARG" 2>/dev/null) || BASE="$BASE_ARG"
然后生成与其他路径相同的输出:
echo "BASE:$BASE" && echo "FILES:" && git diff --name-only $BASE && echo "DIFF:" && git diff -U10 $BASE && echo "UNTRACKED:" && git ls-files --others --exclude-standard
该路径支持任何引用——SHA、
origin/main
、分支名称。自动化调用者(ce-work、lfg、slfg)应优先选择此路径以避免检测开销。请勿同时使用
base:
与PR编号或分支目标
。如果两者都存在,停止并输出错误:"Cannot use
base:
with a PR number or branch target —
base:
implies the current checkout is already the correct branch. Pass
base:
alone, or pass the target alone and let scope detection resolve the base." 避免出现差异基准来自一个源,而代码和元数据来自另一个源的范围/意图不匹配。
如果参数提供了PR编号或GitHub URL:
如果
mode:report-only
mode:headless
启用,不要在共享检出目录上运行
gh pr checkout <number-or-url>
。对于
mode:report-only
,告知调用者:"mode:report-only cannot switch the shared checkout to review a PR target. Run it from an isolated worktree/checkout for that PR, or run report-only with no target argument on the already checked out branch." 对于
mode:headless
,输出
Review failed (headless mode). Reason: cannot switch shared checkout. Re-invoke with base:<ref> to review the current checkout, or run from an isolated worktree.
除非审查已在独立检出目录中运行,否则停止。
首先,在切换分支前验证工作树是否干净:
git status --porcelain
如果输出非空,告知用户:"You have uncommitted changes on the current branch. Stash or commit them before reviewing a PR, or use standalone mode (no argument) to review the current branch as-is." 在工作树干净前不要继续检出。
然后检出PR分支,以便角色代理可以读取实际代码(而非当前检出的代码):
gh pr checkout <number-or-url>
然后获取PR元数据。捕获基准分支名称和PR基准仓库标识,而非仅分支名称:
gh pr view <number-or-url> --json title,body,baseRefName,headRefName,url
从返回的PR URL中提取仓库部分作为
<base-repo>
(例如,从
https://github.com/EveryInc/compound-engineering-plugin/pull/348
中提取
EveryInc/compound-engineering-plugin
)。
然后针对PR的基准分支计算本地差异,以便重新审查时也包含本地修复提交和未提交修改。替换元数据中的PR基准分支(此处为
<base>
)和从PR URL派生的PR基准仓库标识(此处为
<base-repo>
)。从PR的实际基准仓库解析基准引用,而非假设
origin
指向该仓库:
PR_BASE_REMOTE=$(git remote -v | awk 'index($2, "github.com:<base-repo>") || index($2, "github.com/<base-repo>") {print $1; exit}')
if [ -n "$PR_BASE_REMOTE" ]; then PR_BASE_REMOTE_REF="$PR_BASE_REMOTE/<base>"; else PR_BASE_REMOTE_REF=""; fi
PR_BASE_REF=$(git rev-parse --verify "$PR_BASE_REMOTE_REF" 2>/dev/null || git rev-parse --verify <base> 2>/dev/null || true)
if [ -z "$PR_BASE_REF" ]; then
  if [ -n "$PR_BASE_REMOTE_REF" ]; then
    git fetch --no-tags "$PR_BASE_REMOTE" <base>:refs/remotes/"$PR_BASE_REMOTE"/<base> 2>/dev/null || git fetch --no-tags "$PR_BASE_REMOTE" <base> 2>/dev/null || true
    PR_BASE_REF=$(git rev-parse --verify "$PR_BASE_REMOTE_REF" 2>/dev/null || git rev-parse --verify <base> 2>/dev/null || true)
  else
    if git fetch --no-tags https://github.com/<base-repo>.git <base> 2>/dev/null; then
      PR_BASE_REF=$(git rev-parse --verify FETCH_HEAD 2>/dev/null || true)
    fi
    if [ -z "$PR_BASE_REF" ]; then PR_BASE_REF=$(git rev-parse --verify <base> 2>/dev/null || true); fi
  fi
fi
if [ -n "$PR_BASE_REF" ]; then BASE=$(git merge-base HEAD "$PR_BASE_REF" 2>/dev/null) || BASE=""; else BASE=""; fi
if [ -n "$BASE" ]; then echo "BASE:$BASE" && echo "FILES:" && git diff --name-only $BASE && echo "DIFF:" && git diff -U10 $BASE && echo "UNTRACKED:" && git ls-files --others --exclude-standard; else echo "ERROR: Unable to resolve PR base branch <base> locally. Fetch the base branch and rerun so the review scope stays aligned with the PR."; fi
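Extract the PR title/body, base branch, and PR URL from `gh pr view`, then extract the base marker, file list, diff content, and `UNTRACKED:` list from the local command. Do not use `gh pr diff` as the review scope after checkout: it only reflects the remote PR state and misses local fix commits until they are pushed. If the base ref still cannot be resolved from the PR's actual base repository after the fetch attempt, stop instead of falling back to `git diff HEAD`; a PR review without the PR base branch is incomplete.
If a branch name is provided as an argument:
Check out the named branch, then diff it against the base branch. Substitute the provided branch name (shown here as `<branch>`).
If `mode:report-only` or `mode:headless` is active, do not run `git checkout <branch>` on the shared checkout. For `mode:report-only`, tell the caller: "mode:report-only cannot switch the shared checkout to review another branch. Run it from an isolated worktree/checkout for `<branch>`, or run report-only on the current checkout with no target argument." For `mode:headless`, emit `Review failed (headless mode). Reason: cannot switch shared checkout. Re-invoke with base:<ref> to review the current checkout, or run from an isolated worktree.` Stop here unless the review is already running in an isolated checkout.
First, verify the worktree is clean before switching branches: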
git status --porcelain
If the output is non-empty, inform the user: "You have uncommitted changes on the current branch. Stash or commit them before reviewing another branch, or provide a PR number instead." Do not proceed with checkout until the working tree is clean.
git checkout <branch>
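Then detect the review base branch and compute the merge-base. Run the `references/resolve-base.sh` script, which handles fork-safe remote resolution with multi-fallback detection (PR metadata -> `origin/HEAD` -> `gh repo view` -> common branch names):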
RESOLVE_OUT=$(bash references/resolve-base.sh) || { echo "ERROR: resolve-base.sh failed"; exit 1; }
if [ -z "$RESOLVE_OUT" ] || echo "$RESOLVE_OUT" | grep -q '^ERROR:'; then echo "${RESOLVE_OUT:-ERROR: resolve-base.sh produced no output}"; exit 1; fi
BASE=$(echo "$RESOLVE_OUT" | sed 's/^BASE://')
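If the script outputs an error, stop instead of falling back to `git diff HEAD`; a branch review without the base branch would only show uncommitted changes and would silently miss all committed work on the branch.
On success, produce the diff: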
echo "BASE:$BASE" && echo "FILES:" && git diff --name-only $BASE && echo "DIFF:" && git diff -U10 $BASE && echo "UNTRACKED:" && git ls-files --others --exclude-standard
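You may still fetch additional PR metadata with `gh pr view` (title, body, linked issues), but do not fail if no PR exists.
If no argument (standalone on current branch):
Detect the review base branch and compute the merge-base using the same `references/resolve-base.sh` script as branch mode: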
RESOLVE_OUT=$(bash references/resolve-base.sh) || { echo "ERROR: resolve-base.sh failed"; exit 1; }
if [ -z "$RESOLVE_OUT" ] || echo "$RESOLVE_OUT" | grep -q '^ERROR:'; then echo "${RESOLVE_OUT:-ERROR: resolve-base.sh produced no output}"; exit 1; fi
BASE=$(echo "$RESOLVE_OUT" | sed 's/^BASE://')
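If the script outputs an error, stop instead of falling back to `git diff HEAD`; a standalone review without the base branch would only show uncommitted changes and would silently miss all committed work on the branch.
On success, produce the diff: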
echo "BASE:$BASE" && echo "FILES:" && git diff --name-only $BASE && echo "DIFF:" && git diff -U10 $BASE && echo "UNTRACKED:" && git ls-files --others --exclude-standard
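Using `git diff $BASE` (without `..HEAD`) diffs the merge-base against the working tree, which includes committed, staged, and unstaged changes together.
Untracked file handling: Always inspect the `UNTRACKED:` list, even when `FILES:`/`DIFF:` are non-empty. Untracked files are outside review scope until staged. If the list is non-empty, tell the user which files are excluded. If any of them should be reviewed, stop and tell the user to `git add` them first and rerun. Only continue when the user is intentionally reviewing tracked changes only. In `mode:headless` or `mode:autofix`, do not stop to ask: proceed with tracked changes only and note the excluded untracked files in the Coverage section of the output.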
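As an illustration of how an orchestrator might consume the marker-delimited output of the diff commands above, here is a minimal Python sketch. The function name `parse_scope_output` and the returned dictionary shape are hypothetical, not part of the skill; the sketch assumes each marker appears once, in the order `BASE:`, `FILES:`, `DIFF:`, `UNTRACKED:`.

```python
def parse_scope_output(text: str) -> dict:
    """Split marker-delimited scope output into base, files, diff, and untracked."""
    if text.startswith("ERROR:"):
        # The commands print an ERROR: line instead of markers when no base resolves.
        raise ValueError(text.strip())
    scope = {"base": None, "files": [], "diff": [], "untracked": []}
    section = None
    for line in text.splitlines():
        if section is None and line.startswith("BASE:"):
            scope["base"] = line[len("BASE:"):].strip()
        elif section is None and line == "FILES:":
            section = "files"
        elif section == "files" and line == "DIFF:":
            section = "diff"
        elif section == "diff" and line == "UNTRACKED:":
            # Unified-diff lines start with '+', '-', ' ', or a header keyword,
            # so a bare "UNTRACKED:" line is assumed to be the marker.
            section = "untracked"
        elif section == "diff":
            scope["diff"].append(line)  # keep blank lines inside the diff
        elif section in ("files", "untracked") and line:
            scope[section].append(line)
    scope["diff"] = "\n".join(scope["diff"])
    return scope
```

A non-empty `untracked` list is the signal to apply the untracked-file handling rule before any reviewers are spawned.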

Stage 2: Intent discovery

Understand what the change is trying to accomplish. The source of intent depends on which Stage 1 path was taken:
PR/URL mode: Use the PR title, body, and linked issues from `gh pr view` metadata. Supplement with commit messages from the PR if the body is sparse.
Branch mode: Run `git log --oneline ${BASE}..<branch>` using the resolved merge-base from Stage 1.
Standalone (current branch): Run:
echo "BRANCH:" && git rev-parse --abbrev-ref HEAD && echo "COMMITS:" && git log --oneline ${BASE}..HEAD
Combined with conversation context (plan section summary, PR description), write a 2-3 line intent summary:
Intent: Simplify tax calculation by replacing the multi-tier rate lookup
with a flat-rate computation. Must not regress edge cases in tax-exempt handling.
Pass this to every reviewer in their spawn prompt. Intent shapes how hard each reviewer looks, not which reviewers are selected.
When intent is ambiguous:
  • Interactive mode: Ask one question using the platform's interactive question tool (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini): "What is the primary goal of these changes?" Do not spawn reviewers until intent is established. Claude Code only: if `AskUserQuestion` has not yet been loaded this session (per the Interactive mode rules pre-load), call `ToolSearch` with query `select:AskUserQuestion` first before asking. On Codex (`request_user_input`) and Gemini (`ask_user`) this preload step does not apply; the platform-native question tool is loaded by default.
  • Autofix/report-only/headless modes: Infer intent conservatively from the branch name, diff, PR metadata, and caller context. Note the uncertainty in Coverage or Verdict reasoning instead of blocking.

Stage 2b: Plan discovery (requirements verification)

Locate the plan document so Stage 6 can verify requirements completeness. Check these sources in priority order — stop at the first hit:
  1. `plan:` argument. If the caller passed a plan path, use it directly. Read the file to confirm it exists.
  2. PR body. If PR metadata was fetched in Stage 1, scan the body for paths matching `docs/plans/*.md`. If exactly one match is found and the file exists, use it as `plan_source: explicit`. If multiple plan paths appear, treat as ambiguous: demote to `plan_source: inferred` for the most recent match that exists on disk, or skip if none exist or none clearly relate to the PR title/intent. Always verify the selected file exists before using it; stale or copied plan links in PR descriptions are common.
  3. Auto-discover. Extract 2-3 keywords from the branch name (e.g., `feat/onboarding-skill` -> `onboarding`, `skill`). Glob `docs/plans/*` and filter filenames containing those keywords. If exactly one match, use it. If multiple matches or the match looks ambiguous (e.g., generic keywords like `review`, `fix`, `update` that could hit many plans), skip auto-discovery; a wrong plan is worse than no plan. If zero matches, skip.
Confidence tagging: Record how the plan was found:
  • `plan:` argument -> `plan_source: explicit` (high confidence)
  • Single unambiguous PR body match -> `plan_source: explicit` (high confidence)
  • Multiple/ambiguous PR body matches -> `plan_source: inferred` (lower confidence)
  • Auto-discover with single unambiguous match -> `plan_source: inferred` (lower confidence)
If a plan is found, read its Requirements Trace (R1, R2, etc.) and Implementation Units (checkbox items). Store the extracted requirements list and `plan_source` for Stage 6. Do not block the review if no plan is found; requirements verification is additive, not required.
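The auto-discover step (source 3 above) can be sketched in Python as follows. This is one possible reading, not the skill's implementation: the helper names are hypothetical, the text before the first `/` is assumed to be a branch-type prefix, generic keywords are assumed to be ignored, and a filename is assumed to match only when it contains every remaining keyword.

```python
from typing import List, Optional

# Assumption: keywords the text calls too generic to identify a plan.
GENERIC_KEYWORDS = {"review", "fix", "update"}


def branch_keywords(branch: str) -> List[str]:
    """Return up to three keywords from a branch name, dropping the type prefix."""
    slug = branch.split("/", 1)[-1]  # "feat/onboarding-skill" -> "onboarding-skill"
    words = [w for w in slug.lower().replace("_", "-").split("-") if w]
    return words[:3]


def auto_discover_plan(branch: str, plan_files: List[str]) -> Optional[str]:
    """Return the single plan whose filename contains every specific keyword."""
    keywords = [k for k in branch_keywords(branch) if k not in GENERIC_KEYWORDS]
    if not keywords:
        return None  # only generic keywords: a wrong plan is worse than no plan
    matches = [
        path for path in plan_files
        if all(k in path.rsplit("/", 1)[-1].lower() for k in keywords)
    ]
    # Zero or multiple matches both mean "skip auto-discovery".
    return matches[0] if len(matches) == 1 else None
```

A result from this path would be recorded as `plan_source: inferred`, since auto-discovery never yields an explicit match.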

Stage 3: Select reviewers

Read the diff and file list from Stage 1. The 4 always-on personas and 2 CE always-on agents are automatic. For each cross-cutting and stack-specific conditional persona in the persona catalog included below, decide whether the diff warrants it. This is agent judgment, not keyword matching.
File-type awareness for conditional selection: Instruction-prose files (Markdown skill definitions, JSON schemas, config files) are product code but do not benefit from runtime-focused reviewers. The adversarial reviewer's techniques (race conditions, cascade failures, abuse cases) target executable code behavior. For diffs that only change instruction-prose files, skip adversarial unless the prose describes auth, payment, or data-mutation behavior. Count only executable code lines toward line-count thresholds.
`previous-comments` is PR-only. Only select this persona when Stage 1 gathered PR metadata (PR number or URL was provided as an argument, or `gh pr view` returned metadata for the current branch). Skip it entirely for standalone branch reviews with no associated PR -- there are no prior comments to check.
Stack-specific personas are additive. A Rails UI change may warrant `kieran-rails` plus `julik-frontend-races`; a TypeScript API diff may warrant `kieran-typescript` plus `api-contract` and `reliability`.
For CE conditional agents, check if the diff includes files matching `db/migrate/*.rb`, `db/schema.rb`, or data backfill scripts.
Announce the team before spawning:
Review team:
- correctness (always)
- testing (always)
- maintainability (always)
- project-standards (always)
- ce-agent-native-reviewer (always)
- ce-learnings-researcher (always)
- security -- new endpoint in routes.rb accepts user-provided redirect URL
- kieran-rails -- controller and Turbo flow changed in app/controllers and app/views
- dhh-rails -- diff adds service objects around ordinary Rails CRUD
- data-migrations -- adds migration 20260303_add_index_to_orders
- ce-schema-drift-detector -- migration files present
This is progress reporting, not a blocking confirmation.
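Unlike persona selection, the CE conditional-agent check described above is a plain file-pattern test, so it can be sketched mechanically. The function name `migration_files` is hypothetical; data backfill scripts are left out because the text gives no pattern for them, so they remain a judgment call.

```python
from fnmatch import fnmatchcase

# Patterns quoted from the text; backfill scripts have no stated pattern.
MIGRATION_PATTERNS = ("db/migrate/*.rb", "db/schema.rb")


def migration_files(changed_files):
    """Return the changed files that match a migration or schema pattern."""
    # Note: fnmatch's "*" also matches "/", so nested paths under db/migrate/ match too.
    return [
        path for path in changed_files
        if any(fnmatchcase(path, pattern) for pattern in MIGRATION_PATTERNS)
    ]
```

A non-empty result is also the applicability reason that would be passed to the conditional agent and shown in the team announcement.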

Stage 3b: Discover project standards paths

Before spawning sub-agents, find the file paths (not contents) of all relevant standards files for the `project-standards` persona:
  1. Use the native file-search tool (e.g., Glob in Claude Code) to find all `**/CLAUDE.md` and `**/AGENTS.md` in the repo.
  2. Filter to those whose directory is an ancestor of at least one changed file. A standards file governs all files below it (e.g., `plugins/compound-engineering/AGENTS.md` applies to everything under `plugins/compound-engineering/`).
Pass the resulting path list to the `project-standards` persona inside a `<standards-paths>` block in its review context (see Stage 4). The persona reads the files itself, targeting only the sections relevant to the changed file types. This keeps the orchestrator's work cheap (path discovery only) and avoids bloating the subagent prompt with content the reviewer may not fully need.
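The ancestor filter in step 2 can be sketched with `pathlib`. The function name `governing_standards` is hypothetical, and the sketch assumes all paths are repo-relative and use forward slashes.

```python
from pathlib import PurePosixPath


def governing_standards(standards_files, changed_files):
    """Keep standards files whose directory is an ancestor of a changed file."""
    selected = []
    for standards in standards_files:
        scope = PurePosixPath(standards).parent  # "AGENTS.md" -> "." (repo root)
        if any(scope in PurePosixPath(changed).parents for changed in changed_files):
            selected.append(standards)
    return selected
```

Only the resulting paths go into the `<standards-paths>` block; the file contents are never read at this stage.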

Stage 4: Spawn sub-agents

Model tiering

Persona sub-agents do focused, scoped work and should use a fast mid-tier model to reduce cost and latency without sacrificing review quality. The orchestrator itself stays on the default (most capable) model.
Use the platform's mid-tier model for all persona and CE sub-agents. In Claude Code, pass `model: "sonnet"` in the Agent tool call. On other platforms, use the equivalent mid-tier (e.g., `gpt-4o` in Codex). If the platform has no model override mechanism or the available model names are unknown, omit the model parameter and let agents inherit the default -- a working review on the parent model is better than a broken dispatch from an unrecognized model name.
CE always-on agents (ce-agent-native-reviewer, ce-learnings-researcher) and CE conditional agents (ce-schema-drift-detector, ce-deployment-verification-agent) also use the mid-tier model since they perform scoped, focused work.
The orchestrator (this skill) stays on the default model because it handles intent discovery, reviewer selection, finding merge/dedup, and synthesis -- tasks that benefit from stronger reasoning.

Run ID

Generate a unique run identifier before dispatching any agents. This ID scopes all agent artifact files and the post-review run artifact to the same directory.
bash
RUN_ID=$(date +%Y%m%d-%H%M%S)-$(head -c4 /dev/urandom | od -An -tx1 | tr -d ' ')
mkdir -p ".context/compound-engineering/ce-code-review/$RUN_ID"
Pass `{run_id}` to every persona sub-agent so they can write their full analysis to `.context/compound-engineering/ce-code-review/{run_id}/{reviewer_name}.json`.
Report-only mode: Skip run-id generation and directory creation. Do not pass `{run_id}` to agents. Agents return compact JSON only with no file write, consistent with report-only's no-write contract.

Spawning

Omit the `mode` parameter when dispatching sub-agents so the user's configured permission settings apply. Do not pass `mode: "auto"`.
Spawn each selected persona reviewer as a parallel sub-agent using the subagent template included below. Each persona sub-agent receives:
  1. Their persona file content (identity, failure modes, calibration, suppress conditions)
  2. Shared diff-scope rules from the diff-scope reference included below
  3. The JSON output contract from the findings schema included below
  4. PR metadata: title, body, and URL when reviewing a PR (empty string otherwise). Passed in a `<pr-context>` block so reviewers can verify code against stated intent
  5. Review context: intent summary, file list, diff
  6. Run ID and reviewer name for the artifact file path
  7. For `project-standards` only: the standards file path list from Stage 3b, wrapped in a `<standards-paths>` block appended to the review context
Persona sub-agents are read-only with respect to the project: they review and return structured JSON. They do not edit project files or propose refactors. The one permitted write is saving their full analysis to the `.context/` artifact path specified in the output contract.
Read-only here means non-mutating, not "no shell access." Reviewer sub-agents may use non-mutating inspection commands when needed to gather evidence or verify scope, including read-oriented `git`/`gh` usage such as `git diff`, `git show`, `git blame`, `git log`, and `gh pr view`. They must not edit project files, change branches, commit, push, create PRs, or otherwise mutate the checkout or repository state.
Each persona sub-agent writes full JSON (all schema fields) to `.context/compound-engineering/ce-code-review/{run_id}/{reviewer_name}.json` and returns compact JSON with merge-tier fields only:
json
{
  "reviewer": "security",
  "findings": [
    {
      "title": "User-supplied ID in account lookup without ownership check",
      "severity": "P0",
      "file": "orders_controller.rb",
      "line": 42,
      "confidence": 0.92,
      "autofix_class": "gated_auto",
      "owner": "downstream-resolver",
      "requires_verification": true,
      "pre_existing": false,
      "suggested_fix": "Add current_user.owns?(account) guard before lookup"
    }
  ],
  "residual_risks": [...],
  "testing_gaps": [...]
}
Detail-tier fields (`why_it_matters`, `evidence`) are in the artifact file only. `suggested_fix` is optional in both tiers -- included in compact returns when present so the orchestrator has fix context for auto-apply decisions. If the file write fails, the compact return still provides everything the merge needs.
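The relationship between the two tiers can be pictured as a projection of the full finding onto the merge-tier fields. A minimal Python sketch, assuming the field names listed in the text; the function name `compact_finding` is hypothetical and not part of the skill.

```python
# Merge-tier fields named in the text; detail-tier fields stay in the artifact file.
MERGE_TIER_FIELDS = (
    "title", "severity", "file", "line", "confidence",
    "autofix_class", "owner", "requires_verification", "pre_existing",
)


def compact_finding(full_finding: dict) -> dict:
    """Project a full finding onto the fields returned to the orchestrator."""
    compact = {field: full_finding[field] for field in MERGE_TIER_FIELDS}
    if "suggested_fix" in full_finding:  # optional in both tiers
        compact["suggested_fix"] = full_finding["suggested_fix"]
    return compact
```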
CE always-on agents (ce-agent-native-reviewer, ce-learnings-researcher) are dispatched as standard Agent calls in parallel with the persona agents. Give them the same review context bundle the personas receive: entry mode, any PR metadata gathered in Stage 1, intent summary, review base branch name when known, `BASE:` marker, file list, diff, and `UNTRACKED:` scope notes. Do not invoke them with a generic "review this" prompt. Their output is unstructured and synthesized separately in Stage 6.
CE conditional agents (ce-schema-drift-detector, ce-deployment-verification-agent) are also dispatched as standard Agent calls when applicable. Pass the same review context bundle plus the applicability reason (for example, which migration files triggered the agent). For ce-schema-drift-detector specifically, pass the resolved review base branch explicitly so it never assumes `main`. Their output is unstructured and must be preserved for Stage 6 synthesis just like the CE always-on agents.

Stage 5: Merge findings

Convert multiple reviewer compact JSON returns into one deduplicated, confidence-gated finding set. The compact returns contain merge-tier fields (title, severity, file, line, confidence, autofix_class, owner, requires_verification, pre_existing) plus the optional suggested_fix. Detail-tier fields (why_it_matters, evidence) are on disk in the per-agent artifact files and are not loaded at this stage.
  1. Validate. Check each compact return for required top-level and per-finding fields, plus value constraints. Drop malformed returns or findings. Record the drop count.
    • Top-level required: reviewer (string), findings (array), residual_risks (array), testing_gaps (array). Drop the entire return if any are missing or wrong type.
    • Per-finding required: title, severity, file, line, confidence, autofix_class, owner, requires_verification, pre_existing
    • Value constraints:
      • severity: P0 | P1 | P2 | P3
      • autofix_class: safe_auto | gated_auto | manual | advisory
      • owner: review-fixer | downstream-resolver | human | release
      • confidence: numeric, 0.0-1.0
      • line: positive integer
      • pre_existing, requires_verification: boolean
    • Do not validate against the full schema here -- the full schema (including why_it_matters and evidence) applies to the artifact files on disk, not the compact returns.
  2. Confidence gate. Suppress findings below 0.60 confidence. Exception: P0 findings at 0.50+ confidence survive the gate -- critical-but-uncertain issues must not be silently dropped. Record the suppressed count. This matches the persona instructions and the schema's confidence thresholds.
  3. Deduplicate. Compute fingerprint: `normalize(file) + line_bucket(line, +/-3) + normalize(title)`. When fingerprints match, merge: keep highest severity, keep highest confidence, note which reviewers flagged it.
  4. Cross-reviewer agreement. When 2+ independent reviewers flag the same issue (same fingerprint), boost the merged confidence by 0.10 (capped at 1.0). Cross-reviewer agreement is strong signal -- independent reviewers converging on the same issue is more reliable than any single reviewer's confidence. Note the agreement in the Reviewer column of the output (e.g., "security, correctness").
  5. Separate pre-existing. Pull out findings with `pre_existing: true` into a separate list.
  6. Resolve disagreements. When reviewers flag the same code region but disagree on severity, autofix_class, or owner, annotate the Reviewer column with the disagreement (e.g., "security (P0), correctness (P1) -- kept P0"). This transparency helps the user understand why a finding was routed the way it was.
  7. Normalize routing. For each merged finding, set the final `autofix_class`, `owner`, and `requires_verification`. If reviewers disagree, keep the most conservative route. Synthesis may narrow a finding from `safe_auto` to `gated_auto` or `manual`, but must not widen it without new evidence.
  7b. Tie-break the recommended action. Interactive mode's walk-through and LFG paths present a per-finding recommended action (Apply / Defer / Skip / Acknowledge) derived from the normalized `autofix_class` and `suggested_fix`. When contributing reviewers implied different actions for the same merged finding, synthesis picks the most conservative using the order `Skip > Defer > Apply > Acknowledge`. This guarantees that identical review artifacts produce the same recommendation deterministically, so LFG results are auditable after the fact and the walk-through's recommendation is stable across re-runs. The user may still override per finding via the walk-through's options; this rule only determines what gets labeled "recommended."
  8. Partition the work. Build three sets:
    • in-skill fixer queue: only `safe_auto -> review-fixer`
    • residual actionable queue: unresolved `gated_auto` or `manual` findings whose owner is `downstream-resolver`
    • report-only queue: `advisory` findings plus anything owned by `human` or `release`
  9. Sort. Order by severity (P0 first) -> confidence (descending) -> file path -> line number.
  10. Collect coverage data. Union residual_risks and testing_gaps across reviewers.
  11. Preserve CE agent artifacts. Keep the learnings, agent-native, schema-drift, and deployment-verification outputs alongside the merged finding set. Do not drop unstructured agent output just because it does not match the persona JSON schema.
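The confidence gate, deduplication, agreement boost, and sort (steps 2, 3, 4, and 9 above) can be sketched in Python. This is an illustrative reading under stated assumptions, not the skill's implementation: all function names are hypothetical, `normalize` is assumed to mean lowercasing and collapsing punctuation, and `line_bucket(line, +/-3)` is interpreted as "within 3 lines of the first finding in the group", since the text does not define it.

```python
import re

SEVERITY_RANK = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}


def passes_gate(finding):
    """Step 2: suppress below 0.60, except P0 findings at 0.50 or higher."""
    floor = 0.50 if finding["severity"] == "P0" else 0.60
    return finding["confidence"] >= floor


def normalize_title(title):
    """Assumed normalization: lowercase, punctuation collapsed to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_findings(returns):
    """Gate, deduplicate, boost, and sort compact returns; returns (merged, suppressed)."""
    groups, suppressed = [], 0
    for ret in returns:
        for finding in ret["findings"]:
            if not passes_gate(finding):
                suppressed += 1
                continue
            key = (finding["file"].strip(), normalize_title(finding["title"]))
            group = next(
                (g for g in groups
                 if g["key"] == key and abs(g["line"] - finding["line"]) <= 3),
                None,
            )
            if group is None:
                # The first reviewer's title and line are kept for the merged finding.
                groups.append({"key": key, "line": finding["line"],
                               "finding": dict(finding),
                               "reviewers": [ret["reviewer"]]})
                continue
            merged = group["finding"]
            if SEVERITY_RANK[finding["severity"]] < SEVERITY_RANK[merged["severity"]]:
                merged["severity"] = finding["severity"]  # keep highest severity
            merged["confidence"] = max(merged["confidence"], finding["confidence"])
            if ret["reviewer"] not in group["reviewers"]:
                group["reviewers"].append(ret["reviewer"])
    results = []
    for group in groups:
        merged = group["finding"]
        merged["reviewers"] = group["reviewers"]
        if len(group["reviewers"]) >= 2:  # step 4: cross-reviewer agreement boost
            merged["confidence"] = min(1.0, round(merged["confidence"] + 0.10, 2))
        results.append(merged)
    # Step 9: severity (P0 first), confidence descending, file path, line number.
    results.sort(key=lambda m: (SEVERITY_RANK[m["severity"]], -m["confidence"],
                                m["file"], m["line"]))
    return results, suppressed
```

The sketch deliberately leaves out validation, routing normalization, and partitioning; those steps depend on fields and policies that are easier to follow in the list itself.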

Stage 6: Synthesize and present

Assemble the final report using pipe-delimited markdown tables for findings from the review output template included below. The table format is mandatory for finding rows in interactive mode; do not render findings as freeform text blocks or horizontal-rule-separated prose. Other report sections (Applied Fixes, Learnings, Coverage, etc.) use bullet lists and the `---` separator before the verdict, as shown in the template.
  1. Header. Scope, intent, mode, reviewer team with per-conditional justifications.
  2. Findings. Rendered as pipe-delimited tables grouped by severity (`### P0 -- Critical`, `### P1 -- High`, `### P2 -- Moderate`, `### P3 -- Low`). Each finding row shows `#`, file, issue, reviewer(s), confidence, and synthesized route. Omit empty severity levels. Never render findings as freeform text blocks or numbered lists.
  3. Requirements Completeness. Include only when a plan was found in Stage 2b. For each requirement (R1, R2, etc.) and implementation unit in the plan, report whether corresponding work appears in the diff. Use a simple checklist: met / not addressed / partially addressed. Routing depends on `plan_source`:
    • `explicit` (caller-provided or PR body): Flag unaddressed requirements as P1 findings with `autofix_class: manual`, `owner: downstream-resolver`. These enter the residual actionable queue and can become todos.
    • `inferred` (auto-discovered): Flag unaddressed requirements as P3 findings with `autofix_class: advisory`, `owner: human`. These stay in the report only: no todos, no autonomous follow-up. An inferred plan match is a hint, not a contract.
    Omit this section entirely when no plan was found; do not mention the absence of a plan.
  4. Applied Fixes. Include only if a fix phase ran in this invocation.
  5. Residual Actionable Work. Include when unresolved actionable findings were handed off or should be handed off.
  6. Pre-existing. Separate section, does not count toward verdict.
  7. Learnings & Past Solutions. Surface ce-learnings-researcher results: if past solutions are relevant, flag them as "Known Pattern" with links to docs/solutions/ files.
  8. Agent-Native Gaps. Surface ce-agent-native-reviewer results. Omit section if no gaps found.
  9. Schema Drift Check. If ce-schema-drift-detector ran, summarize whether drift was found. If drift exists, list the unrelated schema objects and the required cleanup command. If clean, say so briefly.
  10. Deployment Notes. If ce-deployment-verification-agent ran, surface the key Go/No-Go items: blocking pre-deploy checks, the most important verification queries, rollback caveats, and monitoring focus areas. Keep the checklist actionable rather than dropping it into Coverage.
  11. Coverage. Suppressed count, residual risks, testing gaps, failed/timed-out reviewers, and any intent uncertainty carried by non-interactive modes.
  12. Verdict. Ready to merge / Ready with fixes / Not ready. Fix order if applicable. When an `explicit` plan has unaddressed requirements, the verdict must reflect it: a PR that's code-clean but missing planned requirements is "Not ready" unless the omission is intentional. When an `inferred` plan has unaddressed requirements, note it in the verdict reasoning but do not block on it alone.
Do not include time estimates.
Format verification: Before delivering the report, verify the findings sections use pipe-delimited table rows (`| # | File | Issue | ... |`) not freeform text. If you catch yourself rendering findings as prose blocks separated by horizontal rules or bullet points, stop and reformat into tables.
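To make the required table shape concrete, here is a minimal Python sketch of a findings renderer. It is an assumption-laden illustration: the function name `render_findings`, the column titles after `Issue`, the `file:line` cell format, and the `autofix_class -> owner` route format are guesses, because the review output template itself is not shown here.

```python
SEVERITY_HEADINGS = {
    "P0": "### P0 -- Critical",
    "P1": "### P1 -- High",
    "P2": "### P2 -- Moderate",
    "P3": "### P3 -- Low",
}


def render_findings(findings):
    """Render merged findings as pipe-delimited tables grouped by severity."""
    lines, number = [], 1
    for severity, heading in SEVERITY_HEADINGS.items():
        rows = [f for f in findings if f["severity"] == severity]
        if not rows:
            continue  # omit empty severity levels
        lines += [heading, "",
                  "| # | File | Issue | Reviewer | Confidence | Route |",
                  "|---|------|-------|----------|------------|-------|"]
        for f in rows:
            title = f["title"].replace("|", "\\|")  # keep the row's cell count intact
            route = f"{f['autofix_class']} -> {f['owner']}"
            lines.append(
                f"| {number} | {f['file']}:{f['line']} | {title} | "
                f"{', '.join(f['reviewers'])} | {f['confidence']:.2f} | {route} |"
            )
            number += 1
        lines.append("")
    return "\n".join(lines).rstrip()
```

Numbering runs continuously across severity groups in this sketch, so each finding can be referred to by a single number.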

Headless output format

In `mode:headless`, replace the interactive pipe-delimited table report with a structured text envelope. The envelope follows the same structural pattern as document-review's headless output (completion header, metadata block, findings grouped by autofix_class, trailing sections) while using ce-code-review's own section headings and per-finding fields.
Code review complete (headless mode).

Scope: <scope-line>
Intent: <intent-summary>
Reviewers: <reviewer-list with conditional justifications>
Verdict: <Ready to merge | Ready with fixes | Not ready>
Artifact: .context/compound-engineering/ce-code-review/<run-id>/

Applied N safe_auto fixes.

Gated-auto findings (concrete fix, changes behavior/contracts):

[P1][gated_auto -> downstream-resolver][needs-verification] File: <file:line> -- <title> (<reviewer>, confidence <N>)
  Why: <why_it_matters>
  Suggested fix: <suggested_fix or "none">
  Evidence: <evidence[0]>
  Evidence: <evidence[1]>

Manual findings (actionable, needs handoff):

[P1][manual -> downstream-resolver] File: <file:line> -- <title> (<reviewer>, confidence <N>)
  Why: <why_it_matters>
  Evidence: <evidence[0]>

Advisory findings (report-only):

[P2][advisory -> human] File: <file:line> -- <title> (<reviewer>, confidence <N>)
  Why: <why_it_matters>

Pre-existing issues:
[P2][gated_auto -> downstream-resolver] File: <file:line> -- <title> (<reviewer>, confidence <N>)
  Why: <why_it_matters>

Residual risks:
- <risk>

Learnings & Past Solutions:
- <learning>

Agent-Native Gaps:
- <gap description>

Schema Drift Check:
- <drift status>

Deployment Notes:
- <deployment note>

Testing gaps:
- <gap>

Coverage:
- Suppressed: <N> findings below 0.60 confidence (P0 at 0.50+ retained)
- Untracked files excluded: <file1>, <file2>
- Failed reviewers: <reviewer>

Review complete
Detail enrichment (headless only): The headless envelope includes `Why:`, `Evidence:`, and `Suggested fix:` lines. After merge (Stage 5), read the per-agent artifact files from `.context/compound-engineering/ce-code-review/{run_id}/` for only the findings that survived dedup and confidence gating.
  • Field tiers: `Why:` and `Evidence:` are detail-tier; load them from the per-agent artifact files. `Suggested fix:` is merge-tier; use it directly from the compact return without artifact lookup.
  • Artifact matching: For each surviving finding, look up its detail-tier fields in the artifact files of the contributing reviewers. Match on `file + line_bucket(line, +/-3)` (the same tolerance used in Stage 5 dedup) within each contributing reviewer's artifact. When multiple artifact entries fall within the line bucket, apply `normalize(title)` to both the merged finding's title and each candidate entry's title as a tie-breaker.
  • Reviewer order: Try contributing reviewers in the order they appear in the merged finding's reviewer list; use the first match.
  • No-match fallback: If no artifact file contains a match (all writes failed, or the finding was synthesized during merge), omit the `Why:` and `Evidence:` lines for that finding and note the gap in Coverage. The `Suggested fix:` line can still be populated from the compact return since it is merge-tier.
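The artifact lookup above can be sketched as follows. This is an illustration only: the in-memory shape of `artifacts`, the body of `normalize`, and the reading of the line bucket as an absolute difference of at most 3 are assumptions, since the real definitions live in Stage 5 and the findings schema. Falling through to the next reviewer when a tie cannot be broken is also an assumption.

```python
import re
from typing import Optional


def normalize(title: str) -> str:
    # Assumed normalization: lowercase and collapse non-alphanumeric runs.
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def find_detail(finding: dict, artifacts: dict) -> Optional[dict]:
    """Return the first artifact entry matching `finding`, or None.

    `artifacts` maps reviewer name -> entries loaded from that reviewer's
    artifact file (hypothetical in-memory shape).
    """
    for reviewer in finding["reviewers"]:  # contributing reviewers, in listed order
        candidates = [
            entry
            for entry in artifacts.get(reviewer, [])
            if entry["file"] == finding["file"]
            and abs(entry["line"] - finding["line"]) <= 3  # +/-3 line bucket
        ]
        if len(candidates) == 1:
            return candidates[0]
        if len(candidates) > 1:
            # Several entries share the bucket: tie-break on normalized title.
            wanted = normalize(finding["title"])
            for entry in candidates:
                if normalize(entry["title"]) == wanted:
                    return entry
    # No match: the caller omits Why:/Evidence: and notes the gap in Coverage.
    return None
```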
Formatting rules:
  • The `[needs-verification]` marker appears only on findings where `requires_verification: true`.
  • The `Artifact:` line gives callers the path to the full run artifact for machine-readable access to the complete findings schema. The text envelope is the primary handoff; the artifact is for debugging and full-fidelity access.
  • Findings with `owner: release` appear in the Advisory section (they are operational/rollout items, not code fixes).
  • Findings with `pre_existing: true` appear in the Pre-existing section regardless of autofix_class.
  • The Verdict appears in the metadata header (deliberately reordered from the interactive format where it appears at the bottom) so programmatic callers get the verdict first.
  • Omit any section with zero items.
  • If all reviewers fail or time out, emit `Code review degraded (headless mode). Reason: 0 of N reviewers returned results.` followed by "Review complete".
  • End with "Review complete" as the terminal signal so callers can detect completion.
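From the caller's side, the envelope can be consumed with plain string checks. The sketch below is a hypothetical consumer, not part of the skill: the function name and returned keys are invented for illustration, and it relies only on the literal markers described above (`Verdict:`, `[needs-verification]`, the degraded header, and the terminal "Review complete" line).

```python
from typing import Optional


def parse_envelope(text: str) -> dict:
    """Extract completion status and verdict from headless output."""
    lines = [line.strip() for line in text.strip().splitlines()]
    # The terminal signal must be the last non-empty line.
    complete = bool(lines) and lines[-1] == "Review complete"
    degraded = any(
        line.startswith("Code review degraded (headless mode).") for line in lines
    )
    verdict: Optional[str] = None
    for line in lines:
        if line.startswith("Verdict:"):
            verdict = line[len("Verdict:"):].strip()
            break
    needs_verification = sum("[needs-verification]" in line for line in lines)
    return {
        "complete": complete,
        "degraded": degraded,
        "verdict": verdict,
        "needs_verification": needs_verification,
    }
```

Because the verdict sits in the metadata header, a caller can stop scanning at the first `Verdict:` line without reading the findings.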

Quality Gates

Before delivering the review, verify:
  1. Every finding is actionable. Re-read each finding. If it says "consider", "might want to", or "could be improved" without a concrete fix, rewrite it with a specific action. Vague findings waste engineering time.
  2. No false positives from skimming. For each finding, verify the surrounding code was actually read. Check that the "bug" isn't handled elsewhere in the same function, that the "unused import" isn't used in a type annotation, that the "missing null check" isn't guarded by the caller.
  3. Severity is calibrated. A style nit is never P0. A SQL injection is never P3. Re-check every severity assignment.
  4. Line numbers are accurate. Verify each cited line number against the file content. A finding pointing to the wrong line is worse than no finding.
  5. Protected artifacts are respected. Discard any findings that recommend deleting or gitignoring files in `docs/brainstorms/`, `docs/plans/`, or `docs/solutions/`.
  6. Findings don't duplicate linter output. Don't flag things the project's linter/formatter would catch (missing semicolons, wrong indentation). Focus on semantic issues.
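The protected-artifact gate (item 5) lends itself to a mechanical check. The following is a rough sketch: the `recommends_removal` field is hypothetical (the findings schema is not shown here), and only the three directory names come from the list above.

```python
from pathlib import PurePosixPath

PROTECTED = {("docs", "brainstorms"), ("docs", "plans"), ("docs", "solutions")}


def is_protected(path: str) -> bool:
    """True when `path` lives under one of the protected docs directories."""
    return PurePosixPath(path).parts[:2] in PROTECTED


def drop_protected_cleanup(findings: list) -> list:
    """Discard findings that recommend deleting or gitignoring protected files."""
    return [
        f for f in findings
        # `recommends_removal` is an assumed flag, not a documented schema field.
        if not (f.get("recommends_removal") and is_protected(f["file"]))
    ]
```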

Language-Aware Conditionals

This skill uses stack-specific reviewer agents when the diff clearly warrants them. Keep those agents opinionated. They are not generic language checkers; they add a distinct review lens on top of the always-on and cross-cutting personas.
Do not spawn them mechanically from file extensions alone. The trigger is meaningful changed behavior, architecture, or UI state in that stack.

After Review

Mode-Driven Post-Review Flow

After presenting findings and verdict (Stage 6), route the next steps by mode. Review and synthesis stay the same in every mode; only mutation and handoff behavior changes.

Step 1: Build the action sets

  • Clean review means zero findings after suppression and pre-existing separation. Skip the fix/handoff phase when the review is clean.
  • Fixer queue: final findings routed to `safe_auto -> review-fixer`.
  • Residual actionable queue: unresolved `gated_auto` or `manual` findings whose final owner is `downstream-resolver`.
  • Report-only queue: `advisory` findings and any outputs owned by `human` or `release`.
  • Never convert advisory-only outputs into fix work or todos. Deployment notes, residual risks, and release-owned items stay in the report.
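The three queues can be built with a single pass over the final findings. This is a sketch under assumptions: findings are modelled as dicts with `autofix_class` and `owner` keys, and any combination not named in the rules above (for example a `safe_auto` finding with an unexpected owner) is conservatively left in the report-only set.

```python
def build_action_sets(findings: list) -> dict:
    """Partition post-suppression, non-pre-existing findings into action queues."""
    sets = {"fixer": [], "residual": [], "report_only": []}
    for finding in findings:
        cls, owner = finding["autofix_class"], finding["owner"]
        if cls == "safe_auto" and owner == "review-fixer":
            sets["fixer"].append(finding)
        elif cls in ("gated_auto", "manual") and owner == "downstream-resolver":
            sets["residual"].append(finding)
        else:
            # advisory findings and anything owned by human or release;
            # never converted into fix work or todos.
            sets["report_only"].append(finding)
    return sets


def is_clean(findings: list) -> bool:
    """A clean review has zero findings left, so the fix/handoff phase is skipped."""
    return len(findings) == 0
```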

Step 2: Choose policy by mode

Interactive mode
  • Apply `safe_auto -> review-fixer` findings automatically without asking. These are safe by definition.
  • Zero-remaining case: if no `gated_auto` or `manual` findings remain after the `safe_auto` pass, skip the routing question entirely. Emit a one-line completion summary phrased so advisory and pre-existing findings (which are not handled by this flow) are not implied to be cleared. When no advisory or pre-existing findings remain in the report, `All findings resolved — N safe_auto fixes applied.` is accurate. When advisory and/or pre-existing findings do remain, use the qualified form `All actionable findings resolved — N safe_auto fixes applied. (K advisory, J pre-existing findings remain in the report.)`, omitting any zero-count clause. Follow the summary with the existing end-of-review verdict, then proceed to Step 5 per the gating rule there.
  • Tracker pre-detection: before rendering the routing question, consult `references/tracker-defer.md` for the session's tracker tuple `{ tracker_name, confidence, named_sink_available, any_sink_available }`. The probe runs at most once per session and is cached for the rest of the run. `named_sink_available` drives the option C label (inline tracker name only when the named sink can actually be invoked). `any_sink_available` drives whether option C is offered at all (it can still be offered when the named tracker is unreachable but `gh` or the harness task primitive works).
  • Verify question-tool pre-load (checklist, Claude Code only). Before firing the routing question in Claude Code, confirm `AskUserQuestion` is loaded (per Interactive mode rules at the top of this skill). If not yet loaded this session, call `ToolSearch` with query `select:AskUserQuestion` now. Do not proceed to the routing question without this verification. Rendering the question as narrative text is a bug, not a valid fallback. On Codex (`request_user_input`) and Gemini (`ask_user`) this checklist does not apply: the platform-native question tool is loaded by default and there is no `ToolSearch` preload step to perform.
  • Routing question. Ask using the platform's blocking question tool (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini). Stem: `What should the agent do with the remaining N findings?` (use third-person voice referring to "the agent", not first-person "me" / "I"). Options:
    (A) Review each finding one by one — accept the recommendation or choose another action
    (B) LFG. Apply the agent's best-judgment action per finding
    (C) File a [TRACKER] ticket per finding without applying fixes
    (D) Report only — take no further action
    Render option C per `references/tracker-defer.md`: when `confidence = high` AND `named_sink_available = true`, replace `[TRACKER]` with the concrete name and keep the full label (e.g., `File a Linear ticket per finding without applying fixes`). When `any_sink_available = true` but either `confidence = low` or `named_sink_available = false` (a fallback tier like GitHub Issues or the harness task primitive is working instead), use the generic label `File an issue per finding without applying fixes`; this is a whole-label substitution, not a `[TRACKER]` token swap. When `any_sink_available = false`, omit option C entirely and add one line to the stem explaining why (e.g., `Defer unavailable — no tracker or task-tracking primitive detected on this platform.`). The three remaining options (A, B, D) survive.
    The numbered-list text fallback applies only when `ToolSearch` explicitly returns no match for the platform's question tool or the tool call errors. It does not apply when the agent simply hasn't loaded the tool yet; in that case, load it now (see the verification checklist above). On platforms genuinely without a blocking question tool, present the applicable options as a numbered list and wait for the user's reply.
  • Dispatch on selection. Route by the option letter (A / B / C / D), not by the rendered label string. The option-C label varies by tracker-detection confidence (`File a [TRACKER] ticket per finding without applying fixes` for a named tracker, `File an issue per finding without applying fixes` as the generic fallback, or omitted entirely when no sink is available; see `references/tracker-defer.md`), and options A / B / D have a single canonical label each. The letter is the stable dispatch signal; the canonical labels below are shown for documentation only. A low-confidence run that rendered option C as the generic label routes to the same branch as a high-confidence run that rendered it with the named tracker.
    • (A) `Review each finding one by one`: load `references/walkthrough.md` and enter the per-finding walk-through loop. The walk-through accumulates Apply decisions in memory; Defer decisions execute inline via `references/tracker-defer.md`; Skip / Acknowledge decisions are recorded as no-action; `LFG the rest` routes through `references/bulk-preview.md`. At end of the loop, dispatch one fixer subagent for the accumulated Apply set (Step 3). Emit the unified completion report.
    • (B) `LFG. Apply the agent's best-judgment action per finding`: load `references/bulk-preview.md` scoped to every pending `gated_auto` / `manual` finding. On `Proceed`, execute the plan: Apply set → Step 3 fixer dispatch; Defer set → `references/tracker-defer.md`; Skip / Acknowledge → no-op. On `Cancel`, return to this routing question. Emit the unified completion report after execution.
    • (C) `File a [TRACKER] ticket per finding without applying fixes` (or the generic `File an issue per finding without applying fixes` when the named-tracker label is not used): load `references/bulk-preview.md` with every pending finding in the file-tickets bucket (regardless of the agent's natural recommendation). On `Proceed`, route every finding through `references/tracker-defer.md`; no fixes are applied. On `Cancel`, return to this routing question. Emit the unified completion report.
    • (D) `Report only — take no further action`: do not enter any dispatch phase. Emit the completion report, then proceed to Step 5 per its gating rule (`fixes_applied_count > 0` from earlier `safe_auto` passes). If no fixes were applied this run, stop after the report.
  • The walk-through's completion report, the LFG / File-tickets completion report, and the zero-remaining completion summary all follow the unified completion-report structure documented in `references/walkthrough.md`. Use the same structure across every terminal path.
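The option C rendering rule reduces to a small decision function over the tracker tuple. The sketch below is illustrative only: the function name is hypothetical, and the authoritative logic is whatever `references/tracker-defer.md` specifies. Returning `None` stands for "omit option C and explain why in the stem".

```python
from typing import Optional

GENERIC_LABEL = "File an issue per finding without applying fixes"


def option_c_label(tracker_name: Optional[str], confidence: str,
                   named_sink_available: bool,
                   any_sink_available: bool) -> Optional[str]:
    """Return the label for option C, or None when the option is omitted."""
    if not any_sink_available:
        return None  # only options A, B, and D are offered
    if confidence == "high" and named_sink_available:
        # Token swap: [TRACKER] is replaced by the concrete tracker name.
        return f"File a {tracker_name} ticket per finding without applying fixes"
    # Whole-label substitution for low confidence or an unreachable named sink.
    return GENERIC_LABEL
```

Whatever label is rendered, dispatch still keys on the letter C, so both labels reach the same branch.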
Autofix mode
  • Ask no questions.
  • Apply only the `safe_auto -> review-fixer` queue.
  • Leave `gated_auto`, `manual`, `human`, and `release` items unresolved.
  • Prepare residual work only for unresolved actionable findings whose final owner is `downstream-resolver`.
Report-only mode
  • Ask no questions.
  • Do not build a fixer queue.
  • Do not create residual todos or `.context` artifacts.
  • Stop after Stage 6. Everything remains in the report.
Headless mode
  • Ask no questions.
  • Apply only the `safe_auto -> review-fixer` queue in a single pass. Do not enter the bounded re-review loop (Step 3). Spawn one fixer subagent, apply fixes, then proceed directly to Step 4.
  • Leave `gated_auto`, `manual`, `human`, and `release` items unresolved; they appear in the structured text output.
  • Output the headless output envelope (see Stage 6) instead of the interactive report.
  • Write a run artifact (Step 4) but do not create todo files.
  • Stop after the structured text output and "Review complete" signal. No commit/push/PR.
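The per-mode policies above can be summarized as a lookup table. This is a condensed sketch, not a definitive encoding: the `ModePolicy` fields and names are invented for illustration, `creates_todos` covers only automatic todo creation (interactive mode may still offer to externalize residual work), and `should_apply` models only the automatic `safe_auto` pass, not fixes the user approves interactively.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModePolicy:
    asks_questions: bool
    applies_safe_auto: bool
    writes_artifact: bool
    creates_todos: bool


MODE_POLICY = {
    "interactive": ModePolicy(True, True, True, False),
    "autofix": ModePolicy(False, True, True, True),
    "report-only": ModePolicy(False, False, False, False),
    "headless": ModePolicy(False, True, True, False),
}


def should_apply(mode: str, finding: dict) -> bool:
    """Automatic application: only the safe_auto -> review-fixer queue, in mutating modes."""
    return (
        MODE_POLICY[mode].applies_safe_auto
        and finding["autofix_class"] == "safe_auto"
        and finding["owner"] == "review-fixer"
    )
```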

Step 3: Apply fixes with one fixer and bounded rounds

  • Spawn exactly one fixer subagent for the current fixer queue in the current checkout. That fixer applies all approved changes and runs the relevant targeted tests in one pass against a consistent tree.
  • Do not fan out multiple fixers against the same checkout. Parallel fixers require isolated worktrees/branches and deliberate mergeback.
  • Re-review only the changed scope after fixes land.
  • Bound the loop with `max_rounds: 2`. If issues remain after the second round, stop and hand them off as residual work or report them as unresolved.
  • If any applied finding has `requires_verification: true`, the round is incomplete until the targeted verification runs.
  • Do not start a mutating review round concurrently with browser testing on the same checkout. Future orchestrators that want both must either run `mode:report-only` during the parallel phase or isolate the mutating review in its own checkout/worktree.
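The bounded fix/re-review loop might look like the sketch below. It is a simplification under assumptions: `apply_fixes` and `re_review` are hypothetical callbacks standing in for the single fixer subagent and the scoped re-review, and the targeted verification required by `requires_verification: true` is assumed to happen inside `apply_fixes`.

```python
from typing import Callable, List, Tuple

MAX_ROUNDS = 2  # mirrors max_rounds: 2


def run_fix_rounds(queue: List, apply_fixes: Callable[[List], None],
                   re_review: Callable[[], List]) -> Tuple[int, List]:
    """Run at most MAX_ROUNDS fix rounds; return (rounds used, leftover findings)."""
    rounds = 0
    while queue and rounds < MAX_ROUNDS:
        apply_fixes(queue)   # one fixer, one pass, against a consistent tree
        queue = re_review()  # re-review only the changed scope
        rounds += 1
    # Anything left is handed off as residual work or reported as unresolved.
    return rounds, queue
```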

Step 4: Emit artifacts and downstream handoff

  • In interactive, autofix, and headless modes, write a per-run artifact under `.context/compound-engineering/ce-code-review/<run-id>/` containing:
    • synthesized findings (merged output from Stage 5)
    • applied fixes
    • residual actionable work
    • advisory-only outputs
    Per-agent full-detail JSON files (`{reviewer_name}.json`) are already present in this directory from Stage 4 dispatch.
  • Also write `metadata.json` alongside the findings so downstream skills (e.g., `ce-polish-beta`) can verify the artifact matches the current branch and HEAD. Minimum fields:
    json
    {
      "run_id": "<run-id>",
      "branch": "<git branch --show-current at dispatch time>",
      "head_sha": "<git rev-parse HEAD at dispatch time>",
      "verdict": "<Ready to merge | Ready with fixes | Not ready>",
      "completed_at": "<ISO 8601 UTC timestamp>"
    }
    Capture `branch` and `head_sha` at dispatch time (before any autofixes land), and write the file after the verdict is finalized. This file is additive: pre-existing artifacts that predate this file are still valid, and downstream skills fall back to file mtime when it is missing.
  • In autofix mode, create durable todo files only for unresolved actionable findings whose final owner is `downstream-resolver`. Load the `ce-todo-create` skill for the canonical directory path, naming convention, YAML frontmatter structure, and template. Each todo should map the finding's severity to the todo priority (`P0` / `P1` -> `p1`, `P2` -> `p2`, `P3` -> `p3`) and set `status: ready` since these findings have already been triaged by synthesis.
  • Do not create todos for `advisory` findings, `owner: human`, `owner: release`, or protected-artifact cleanup suggestions.
  • If only advisory outputs remain, create no todos.
  • Interactive mode may offer to externalize residual actionable work after fixes, but it is not required to finish the review.
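The metadata file and the severity-to-priority mapping can be sketched together. This is an illustration, not the skill's code: the helper names are hypothetical, `branch` and `head_sha` are assumed to have been captured by the caller at dispatch time, and the protected-artifact exclusion is not modelled.

```python
import json
from datetime import datetime, timezone

# Finding severity -> todo priority, as listed above.
PRIORITY = {"P0": "p1", "P1": "p1", "P2": "p2", "P3": "p3"}


def needs_todo(finding: dict) -> bool:
    """Only unresolved actionable findings owned by downstream-resolver become todos."""
    return (
        finding["autofix_class"] in ("gated_auto", "manual")
        and finding["owner"] == "downstream-resolver"
    )


def todo_fields(finding: dict) -> dict:
    """Frontmatter values derived from the finding; synthesis already triaged it."""
    return {"priority": PRIORITY[finding["severity"]], "status": "ready"}


def build_metadata(run_id: str, branch: str, head_sha: str, verdict: str) -> str:
    """Serialize the minimum metadata.json fields once the verdict is final."""
    return json.dumps(
        {
            "run_id": run_id,
            "branch": branch,      # captured at dispatch time
            "head_sha": head_sha,  # captured at dispatch time
            "verdict": verdict,
            "completed_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        },
        indent=2,
    )
```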

Step 5: Final next steps

Interactive mode only. After the fix-review cycle completes (clean verdict or the user chose to stop), offer next steps based on the entry mode. Reuse the resolved review base/default branch from Stage 1 when known; do not hard-code only `main` / `master`.
The gate is total fixes applied this run, not routing option. Track `fixes_applied_count` across the whole Interactive invocation. This counter includes both the `safe_auto` fixes applied automatically before the routing question (see Step 2 Interactive mode) AND any Apply decisions executed by routing option A (walk-through) or option B (LFG). Routing options C (File tickets) and D (Report only) add zero to this counter; neither does a walk-through that ends with only Skip / Defer / Acknowledge, and neither does an LFG whose recommendations were all Defer / Skip / Acknowledge.
Step 5 runs only when `fixes_applied_count > 0`. If the counter is zero (no `safe_auto` fixes were applied AND the routing path produced no additional Apply), skip Step 5 entirely and exit after the completion report. Asking "push fixes?" when nothing changed in the working tree is incoherent.
Common outcomes:
  • `safe_auto` produced fixes AND the user picked any routing option → Step 5 runs (counter > 0 from the safe_auto pass alone).
  • No `safe_auto` fixes AND the user picked option C or D → Step 5 skipped.
  • No `safe_auto` fixes AND walk-through / LFG finished with zero Applies → Step 5 skipped.
  • Zero-remaining case (no `gated_auto` / `manual` after `safe_auto`) with at least one `safe_auto` fix → Step 5 runs; the routing question was never asked but the counter is > 0.
  • PR mode (entered via PR number/URL):
    • Push fixes -- push commits to the existing PR branch
    • Exit -- done for now
  • Branch mode (feature branch with no PR, and not the resolved review base/default branch):
    • Create a PR (Recommended) -- push and open a pull request
    • Continue without PR -- stay on the branch
    • Exit -- done for now
  • On the resolved review base/default branch:
    • Continue -- proceed with next steps
    • Exit -- done for now
If "Create a PR": first publish the branch with `git push --set-upstream origin HEAD`, then use `gh pr create` with a title and summary derived from the branch changes. If "Push fixes": push the branch with `git push` to update the existing PR.
Autofix, report-only, and headless modes: stop after the report, artifact emission, and residual-work handoff. Do not commit, push, or create a PR.
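The Step 5 gate and the per-entry-mode menus can be expressed compactly. In this sketch the entry-mode keys (`"pr"`, `"branch"`, `"base"`) and the function name are hypothetical labels chosen for illustration; the option texts are the ones listed above, and an empty list means Step 5 is skipped.

```python
NEXT_STEPS = {
    "pr": ["Push fixes", "Exit"],
    "branch": ["Create a PR (Recommended)", "Continue without PR", "Exit"],
    "base": ["Continue", "Exit"],
}


def next_step_options(entry_mode: str, fixes_applied_count: int) -> list:
    """Return the options to offer, or [] when nothing changed in the working tree."""
    if fixes_applied_count <= 0:
        return []  # gate: total fixes applied this run, not the routing option
    return list(NEXT_STEPS[entry_mode])
```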

Fallback

If the platform doesn't support parallel sub-agents, run reviewers sequentially. Everything else (stages, output format, merge pipeline) stays the same.

Included References

Persona Catalog

@./references/persona-catalog.md

Subagent Template

@./references/subagent-template.md

Diff Scope Rules

@./references/diff-scope.md

Findings Schema

@./references/findings-schema.json

Review Output Template

@./references/review-output-template.md