review-all

Review All - two clean-agent code review

What this does, and why it has this shape

Two independent reviewers examine the same change, then their findings are reconciled. The value is not "two reviews"; it is the agreement signal: when both clean agents independently flag the same issue, confidence is high. When only one flags something, it deserves scrutiny.
Four properties make the signal useful, and the procedure exists to protect them:
  1. Clean context per reviewer. Each reviewer must start fresh, with no memory of this orchestration or of each other, or they stop being independent. Spawn two agents with clean context; if the agent API supports a `fork_context` flag, set it to `false`. Do not paste either reviewer's output into the other reviewer.
  2. Same target. Agreement only means something if both looked at the exact same diff. Resolve the target once and hand the identical `base...HEAD` range to both.
  3. Two complementary review paths. One clean agent runs Codex's native `codex review --base "$base"` command. The other clean agent performs a direct review using `references/review-guide.md`, which is based on Google's "What to look for in a code review": https://google.github.io/eng-practices/review/reviewer/looking-for.html
  4. Genuine parallelism. Spawn both agents before waiting for either result. Do not serialize the reviews.
This skill is review-only. Never pass `--fix` / `--comment`, never apply patches, and never tell the user you are about to change code.

Not a PR-finishing loop

When the user asks to update code until review comments are handled and checks pass, use the `pr-finish` workflow instead. This skill should be run once after a batched fix, or at most once more after a second batched fix if valid P0-P2 findings remain. Do not rerun `review-all` after each individual fix.

Procedure

Step 1 — Resolve the shared target (one base, one range)

The target is `base...HEAD` (merge-base diff of the current branch), so both reviewers see exactly the commits this branch adds.

```bash
current=$(git rev-parse --abbrev-ref HEAD)

# Base precedence: user-provided --base > origin's default branch > main > master
base="$ARG_BASE"   # whatever the user passed, may be empty
if [ -z "$base" ]; then
  base=$(git symbolic-ref --quiet refs/remotes/origin/HEAD 2>/dev/null \
    | sed 's@^refs/remotes/origin/@@')
fi
[ -z "$base" ] && git show-ref --verify --quiet refs/heads/main && base=main
[ -z "$base" ] && git show-ref --verify --quiet refs/heads/master && base=master

echo "current=$current base=$base"
git diff --shortstat "$base"...HEAD
git diff --name-only "$base"...HEAD
```

Step 2 — Pre-flight (fail fast, don't waste a review)

Stop and tell the user plainly if any of these hold:
  • Not inside a git repo.
  • No base branch could be resolved → ask the user which base to diff against.
  • `current` is the base branch → there is nothing to compare; ask for a base.
  • The diff is empty (`git diff --shortstat "$base"...HEAD` prints nothing) → there are no committed changes to review. Remind the user this mode reviews committed changes only; if their work is uncommitted, they should commit first.
  • Codex is not ready: `codex login status` does not report a logged-in account. Report it and offer to run the Google-rubric review alone.
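The git-side checks in this list can be sketched as a single shell function. This is a minimal sketch, not the skill's actual implementation: `preflight` is a hypothetical name, it assumes the base has already been resolved, and it leaves out the Codex readiness check because how `codex login status` reports its state is not assumed here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: preflight BASE
# Prints the reason and returns non-zero when the review should not start.
preflight() {
  local base="$1" current

  # Not inside a git repo (also trips if git itself is unavailable).
  git rev-parse --is-inside-work-tree >/dev/null 2>&1 \
    || { echo "not inside a git repo"; return 1; }

  # No base branch could be resolved.
  [ -n "$base" ] \
    || { echo "no base branch resolved; ask the user for one"; return 1; }

  # The current branch is the base branch.
  current=$(git rev-parse --abbrev-ref HEAD)
  [ "$current" != "$base" ] \
    || { echo "current branch is the base; nothing to compare"; return 1; }

  # Empty --shortstat output means no committed changes in base...HEAD.
  [ -n "$(git diff --shortstat "$base"...HEAD 2>/dev/null)" ] \
    || { echo "diff is empty; commit changes first"; return 1; }

  return 0
}

# Usage: preflight "$base" || exit 1
```

Checks run in the same order as the list, so the first failing condition is the one reported to the user.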

Step 3 - Spawn BOTH clean agents in parallel

Use the agent/subagent facility available in the current environment. Start both review agents before waiting. If the API exposes `fork_context`, set it to `false` for each agent. Give each agent only the repo path, resolved base, and its task.
Agent A: Codex review-command agent
Task prompt:
You are a clean-context review-command runner. In the repo at `<repo path>`, run Codex's native review command against the committed branch diff: `codex review --base <base>`
This is review-only. Do not pass `--fix` or `--comment`, do not post anything to GitHub, and do not modify files. Return the native command output and, if possible, a normalized JSON array of findings: `{"file":"...","line":<int or null>,"priority":"P0|P1|P2|P3","category":"design|functionality|complexity|tests|naming|comments|consistency|documentation|security|other","title":"<one line>","description":"<evidence and impact>"}`. If the command finds nothing, return `[]` after the raw output summary. If the command fails, return the exact failure and stop.
Agent B: Google-rubric review agent
Read `references/review-guide.md` next to this `SKILL.md`, then give the agent this task with the full rubric pasted in:
You are an independent code reviewer with clean context. In the repo at `<repo path>`, review only the committed changes in `git diff <base>...HEAD`. Apply this review rubric, based on Google's "What to look for in a code review":
<paste the full contents of references/review-guide.md here>
Constraints: This is review-only. Do not pass `--comment` or `--fix`, do not post anything to GitHub, and do not modify any files. Use system context as a lens to judge the changed lines, but anchor every finding to the diff (a changed line, or something the change should have touched but didn't, like a missing test). Skip nitpicks a linter, formatter, typechecker, or compiler would catch.
Return your findings to me as a JSON array and nothing else. Each finding: `{"file": "...", "line": <int or null>, "priority": "P0|P1|P2|P3", "category": "design|functionality|complexity|tests|naming|comments|consistency|documentation|security|other", "title": "<one line>", "description": "<why it's a problem, with evidence>"}`. Assign `priority` per the rubric's P0–P3 scale. If you find nothing, return `[]`. If the change does something notably well, you may add one finding with priority `P3` and category `other` titled "Good: …".
After both agents have been spawned, wait for their results.
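The ordering that matters here (start both, then wait) can be illustrated with shell background jobs. This is only a sketch of the pattern: real agents are spawned through the environment's agent facility, not through a shell, and `run_both` with its output-file arguments is a hypothetical helper introduced for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run_both OUT_A OUT_B CMD_A CMD_B
# Starts both commands before waiting on either, then reports both exit codes.
run_both() {
  local out_a="$1" out_b="$2" cmd_a="$3" cmd_b="$4"
  local pid_a pid_b rc_a=0 rc_b=0

  # Launch both jobs first; neither launch blocks on the other.
  bash -c "$cmd_a" >"$out_a" 2>&1 &
  pid_a=$!
  bash -c "$cmd_b" >"$out_b" 2>&1 &
  pid_b=$!

  # Only now wait. A failure on one side does not discard the other side.
  wait "$pid_a" || rc_a=$?
  wait "$pid_b" || rc_b=$?

  echo "agent_a_exit=$rc_a agent_b_exit=$rc_b"
}

# Usage (stand-in commands):
#   run_both a.out b.out 'echo review A' 'echo review B'
```

Capturing each exit code separately keeps a failed side visible instead of aborting the whole run, which matches the later instruction to continue with whatever result is available.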

Step 4 — Collect both results

  • Await the Google-rubric review agent's JSON.
  • Await the Codex review-command agent's raw output and/or normalized JSON. Parse the native Codex output semantically; don't rely on a rigid regex. What the native reviewer's output often looks like:
    • A preamble block (Codex version, workdir, model, and a dump of the diff and the shell commands it ran). Skip all of it.
    • The findings appear after a `codex` marker as a summary line followed by `Full review comments:` and a list of entries shaped like `- [P2] <title> — <path>:<start>-<end>` with a description paragraph under each. Each entry is one finding.
    • The findings block is often printed twice (streamed, then repeated as the final message). Dedupe: it's the same findings, not new ones.
    • Codex already tags each finding `[P0]` through `[P3]`; keep those labels as-is. It's the same scale the Google-rubric reviewer uses, so no remapping is needed.
    • Harmless `git: warning: confstr()` / `xcrun_db` lines come from the read-only sandbox; ignore them.
If one side fails (Codex errored, an agent returned nothing usable), continue with whatever you have and say so explicitly in the report: a half review clearly labeled beats a silent gap.

Step 5 — Merge, dedupe, rank

Normalize both sides into the same finding shape, then reconcile:
  • Dedupe by same file + overlapping/adjacent lines + same underlying issue (semantic match, not string match; the two agents will word things differently).
  • Tag the source of every finding: `both`, `google`, or `codex`.
  • Resolve priority for each merged finding: if both reviewers flagged it but assigned different priorities, take the higher (more severe) one and note the split.
  • Rank primarily by priority (P0 → P3). Within a priority tier, list findings both agents agree on first; independent agreement is the strongest confidence signal this skill produces.
  • Surface disagreement rather than hiding it: if the two agents conflict on whether something is a bug, show both positions briefly. That tension is often the most useful part of the report.
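The ranking rule alone (priority first, agreed findings first within a tier) can be sketched with standard text tools. This assumes findings have already been deduplicated and tagged, and that they sit in a tab-separated layout of priority, source, location, title; both that layout and the `rank_findings` name are assumptions made for illustration, not something the skill prescribes. The semantic dedupe step is deliberately not attempted here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads TSV findings on stdin (priority, source, location, title)
# and prints them ordered P0 -> P3, with source "both" first inside each tier.
rank_findings() {
  # Prefix each row with two sort keys: the priority and a 0/1 agreement flag.
  awk -F'\t' -v OFS='\t' '{ print $1, ($2 == "both" ? 0 : 1), $0 }' \
    | LC_ALL=C sort -s -t $'\t' -k1,1 -k2,2n \
    | cut -f3-   # drop the two helper keys, leaving the original row
}

# Usage:
#   printf 'P2\tcodex\ta.py:3\tx\nP0\tboth\tc.py:9\tz\n' | rank_findings
```

The stable sort (`-s`) keeps the input order for findings that tie on both keys, so any earlier ordering decision survives.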

Step 6 — Present one unified report

Lead with a verdict, then a priority overview table, then findings grouped by priority tier. Tag every finding with its priority, its source (`both` / `google` / `codex`), and its rubric dimension.

Review-all: <current> vs <base> (<N> files, +<adds>/-<dels>)

Verdict: APPROVE / REQUEST CHANGES / COMMENT
Overall correctness: patch is correct / patch is incorrect
Codex review and Google-rubric review examined the same diff independently; <X> findings agreed.

Findings overview

| Priority | Count | Where | Summary |
| --- | --- | --- | --- |
| P0 | <n> | <file:line or —> | <short or "none"> |
| P1 | <n> | | |
| P2 | <n> | | |
| P3 | <n> | | |

P0 ← show only tiers that have findings

  1. [P0] <title>
    file:line · both · functionality
    <merged description>

P1

...

P2

...

P3

...

Derive the verdict from the priorities (same logic the kelos reviewer uses):

- **Overall correctness** is "patch is incorrect" if there's any P0 or P1
  finding; otherwise "patch is correct". Ignore P2/P3 nits for this call.
- **REQUEST CHANGES** when there's a P0/P1; **APPROVE** when only P2/P3 (or
  nothing); **COMMENT** when you genuinely need the author's input before
  deciding.
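
The mechanical part of these two rules can be sketched as a small function over the P0 and P1 counts. `derive_verdict` is a hypothetical name, and the sketch deliberately never returns COMMENT: that outcome depends on judgment about needing the author's input, which a count cannot express.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive_verdict P0_COUNT P1_COUNT
# Prints "<verdict> | <overall correctness>". P2/P3 counts are ignored by design.
derive_verdict() {
  local p0="$1" p1="$2"
  if [ "$p0" -gt 0 ] || [ "$p1" -gt 0 ]; then
    echo "REQUEST CHANGES | patch is incorrect"
  else
    echo "APPROVE | patch is correct"
  fi
}

# Usage: derive_verdict 0 2
```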

Keep it tight: no emojis, cite `file:line`, mark agreed (`both`) findings clearly
since that's the highest-confidence signal, and don't pad single-model findings to
look like consensus. If both agents found nothing, say so and stop. A notable
strength may be a one-line "Good:" note under the lowest tier — matter-of-fact,
not flattery.

Notes & edge cases

  • Committed changes only. `codex review --base` and `base...HEAD` both ignore uncommitted/untracked files. If the user wants those reviewed, they must commit first (a future `--working-tree` mode could cover that case).
  • Large diffs. Codex may take a while; that's exactly why it runs in its own clean agent. Don't kill it early.
  • Arguments. Accept an optional base override (e.g. `review-all --base develop` or `review-all develop`). If none is given, auto-resolve per Step 1.
  • Don't double-review. Both reviewers must get the identical range; never let one drift to working-tree and the other to branch scope.
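The two argument forms mentioned under "Arguments" could be accepted with a small parser like the one below. This is a sketch under assumptions: `parse_base_arg` is a hypothetical name, and rejecting any other dash-prefixed option is a choice made here for safety, not something the skill specifies.

```bash
#!/usr/bin/env bash
# Hypothetical helper: parse_base_arg [--base BASE | BASE]
# Prints the base override, or an empty line when none was given.
parse_base_arg() {
  local base=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --base)
        [ "$#" -ge 2 ] || { echo "--base needs a value" >&2; return 2; }
        base="$2"
        shift 2
        ;;
      -*)
        # Anything else that looks like an option is rejected, not guessed at.
        echo "unknown option: $1" >&2
        return 2
        ;;
      *)
        # Bare positional form: review-all develop
        base="$1"
        shift
        ;;
    esac
  done
  printf '%s\n' "$base"
}

# Usage: ARG_BASE=$(parse_base_arg "$@") || exit 2
```

An empty result leaves `ARG_BASE` empty, so the Step 1 precedence chain falls through to auto-resolution.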