pr-review
PR Review
Review a PR/MR diff by dispatching independent role-based subagents in parallel, then publish findings as one sticky summary comment + per-finding inline comments. The main session never reviews — it ingests, dispatches, merges, emits, publishes.
<HARD-GATE>
You MUST dispatch independent subagents — NEVER review the diff yourself in the main session. The main session accumulates context bias from prior conversation. Only an isolated subagent can deliver an unbiased finding.
Dispatch in PARALLEL using a single message with multiple Agent tool calls. If one subagent fails, proceed with the rest BUT surface the failure in the sticky comment header (never silent). If ALL fail, report failure — do NOT fall back to self-review.
Publishing happens in the main session (post-merge) — not in subagents.
Rationalization Prevention
| Thought | Reality |
|---|---|
| "The diff is small, I can review it myself" | Self-review is biased by what you saw in the conversation. Small ≠ unbiased. |
| "I already saw this code earlier" | That's exactly why you can't review it. Familiarity hides issues. |
| "Dispatching 3-4 subagents is overkill" | Each persona uses a different mental model. A single agent dilutes all of them. |
| "Sequential is fine, I'll save tokens" | Parallel is faster wall-clock and prevents one report from biasing the next. |
| "spec-auditor isn't needed, the spec is short" | If has_spec is true, dispatch. The check is whether spec exists, not whether it's verbose. |
| "I'll just check the obvious bug myself" | Even one self-checked finding contaminates the report — readers can't tell which findings are biased. |
| "1 subagent failed, just hide it" | Hiding partial review = pretending coverage existed. Surface it in sticky header. |
| "Prior finding line moved, mark fixed" | Line moving ≠ behaviour fixed. Require subagent verification, then hedge as `Likely fixed`. |
Red Flags — STOP if you catch yourself:
- Reviewing any category yourself instead of dispatching
- Dispatching subagents sequentially instead of in parallel
- Skipping a subagent because "the diff doesn't look like it has X"
- Claiming review passed without reading subagent findings
- Editing code during review (review reads, doesn't write)
- Falling back to self-review because a subagent failed
- Hiding subagent failures in output (must surface in sticky header)
- Marking prior findings as `✅ Fixed` without a verification note from subagent
- Publishing inline comments before merging findings (dispatch → merge → publish)
<decision_boundary>
Use for:
- Reviewing a PR/MR diff and producing structured findings
- Security / logic / performance scans on code changes
- Spec compliance verification when spec exists
- Organizing findings with severity + blast radius + confidence
- Posting findings to GitHub PR as sticky summary + inline comments
NOT for:
- Writing or improving PR descriptions
- Design review requiring business judgment about scope or direction
- Writing the actual code/diff
- Deep CVE / supply-chain / OWASP sweep
- Writing release notes / CHANGELOG
- End-user / UX review (use /qa or /design-review)
- Auto-approving or auto-merging (produce findings only; humans merge)
</decision_boundary>
Flow
```dot
digraph pr_review {
  "Receive inputs" [shape=doublecircle];
  "Resolve mode" [shape=box];
  "Compute capability flags" [shape=box];
  "Parallel dispatch" [shape=box, style=bold];
  "security-reviewer" [shape=box];
  "staff-engineer" [shape=box];
  "sdet" [shape=box];
  "has_spec?" [shape=diamond];
  "spec-auditor" [shape=box];
  "Skip spec-auditor" [shape=box];
  "Collect + Dedup" [shape=box];
  "Apply severity merge rule" [shape=box];
  "Build sticky + inline" [shape=box];
  "dry-run?" [shape=diamond];
  "Print to console" [shape=box];
  "Publish to PR" [shape=box];
  "Emit findings JSON" [shape=box];
  "Done" [shape=doublecircle];
  "Receive inputs" -> "Resolve mode";
  "Resolve mode" -> "Compute capability flags";
  "Compute capability flags" -> "Parallel dispatch";
  "Parallel dispatch" -> "security-reviewer";
  "Parallel dispatch" -> "staff-engineer";
  "Parallel dispatch" -> "sdet";
  "Parallel dispatch" -> "has_spec?";
  "has_spec?" -> "spec-auditor" [label="yes"];
  "has_spec?" -> "Skip spec-auditor" [label="no"];
  "security-reviewer" -> "Collect + Dedup";
  "staff-engineer" -> "Collect + Dedup";
  "sdet" -> "Collect + Dedup";
  "spec-auditor" -> "Collect + Dedup";
  "Skip spec-auditor" -> "Collect + Dedup";
  "Collect + Dedup" -> "Apply severity merge rule";
  "Apply severity merge rule" -> "Build sticky + inline";
  "Build sticky + inline" -> "dry-run?";
  "dry-run?" -> "Print to console" [label="yes"];
  "dry-run?" -> "Publish to PR" [label="no"];
  "Print to console" -> "Done";
  "Publish to PR" -> "Done";
  "Apply severity merge rule" -> "Emit findings JSON" [label="mode=local"];
  "Emit findings JSON" -> "Done";
}
```
Inputs
Required
- `pr`: `<owner>/<repo>#<N>`
Optional
- `dry-run`: `true` / `false`; default `false`
- `base`: default `origin/main`
- `last_sha`
- `spec` (sets `has_spec`). Accepted forms:
  - Path: `spec: docs/specs/payment-v2.md`
  - URL: `spec: https://confluence.example.com/payment-v2`
  - Inline: `spec: this PR implements PCI DSS v4.0 logical isolation`
  - Multiple: `spec: design doc at X, acceptance criteria in Jira ABC-123`
  If absent → spec-auditor not dispatched. Other subagents do not reference spec.
- `test direction`:
  - Approach: `unit only` / `integration required` / `e2e required` / `no test needed`
  - Location: expected test file path
  - Focus: scenario or case the test should cover
  Missing → sdet uses heuristic from diff nature.
- `context`:
  - Business risk: "this endpoint is internal-admin only"
  - Domain rules: "tenant_id is required"
  - Known trade-offs: "we're aware of the N+1, will fix next sprint"
  - Environment constraints: "CDE service, security findings cannot be downgraded"
  - Hotfix narrowing: "hotfix, only check critical security"
  - Cross-PR coupling: "ships together with PR #1234"
  Context can adjust severity at merge time (see Severity Merge Rule).
Mode Detection
Resolve before dispatch. The mode controls diff scope and output sections.
```dot
digraph mode {
  "mode input" [shape=box];
  "Has sticky?" [shape=diamond];
  "last_sha reachable?" [shape=diamond];
  "Same as HEAD?" [shape=diamond];
  "incremental" [shape=box, style=bold];
  "full" [shape=box, style=bold];
  "local" [shape=box, style=bold];
  "noop (report no-change)" [shape=box, style=bold];
  "mode input" -> "Has sticky?" [label="auto"];
  "mode input" -> "incremental" [label="incremental (forced)"];
  "mode input" -> "full" [label="full (forced)"];
  "mode input" -> "local" [label="local (forced)"];
  "Has sticky?" -> "last_sha reachable?" [label="yes"];
  "Has sticky?" -> "full" [label="no"];
  "last_sha reachable?" -> "Same as HEAD?" [label="yes"];
  "last_sha reachable?" -> "full" [label="no\n(force-push?)"];
  "Same as HEAD?" -> "noop (report no-change)" [label="yes"];
  "Same as HEAD?" -> "incremental" [label="no"];
}
```
Sticky discovery:
```bash
gh api repos/<owner>/<repo>/issues/<N>/comments \
  --jq '.[] | select(.body | contains("<!-- pr-review:sticky -->")) | {id, body}'
```
Markers embedded in sticky body:
- `<!-- pr-review:sticky -->`: locator
- `<!-- pr-review:sha=<commit> -->`: last reviewed HEAD
SHA reachability:
```bash
git cat-file -e <last_sha> 2>/dev/null && echo reachable || echo unreachable
```
If unreachable (force-push / squash-merge of older PR / branch rebased): fall back to `full` AND prepend to sticky body:
```markdown
> ⚠️ Prior review base `<last_sha>` is not reachable (force-push?). This iteration is a full re-review.
```
Noop case (`last_sha == HEAD`)
When the sticky exists AND `last_sha == HEAD`: skip dispatch + publish. Print to console:
```
pr-review: nothing new since <last_sha>. Skipping. Use mode=full to force a re-review.
```
The sticky is already current; do not touch it.
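The auto-mode decision above can be sketched in a few lines of Python. This is an illustration, not the skill's implementation: `parse_last_sha`, `resolve_mode` and the `is_reachable` callback (a stand-in for the `git cat-file -e` check) are hypothetical names. It assumes the SHA marker holds a hex commit id that is compared verbatim against HEAD, and that a sticky without a SHA marker is treated like a missing sticky.

```python
import re
from typing import Callable, Optional

STICKY_MARKER = "<!-- pr-review:sticky -->"
SHA_MARKER = re.compile(r"<!-- pr-review:sha=([0-9a-fA-F]+) -->")


def parse_last_sha(sticky_body: Optional[str]) -> Optional[str]:
    """Return the commit recorded in the sticky body, or None if unavailable."""
    if not sticky_body or STICKY_MARKER not in sticky_body:
        return None
    match = SHA_MARKER.search(sticky_body)
    return match.group(1) if match else None


def resolve_mode(requested: str, sticky_body: Optional[str], head_sha: str,
                 is_reachable: Callable[[str], bool]) -> str:
    """Follow the mode graph: forced modes win, auto inspects the sticky."""
    if requested != "auto":
        return requested  # incremental / full / local were forced by the caller
    last_sha = parse_last_sha(sticky_body)
    if last_sha is None:
        return "full"  # no sticky (assumption: or no SHA marker inside it)
    if not is_reachable(last_sha):
        return "full"  # force-push or rebase: prior base is gone
    if last_sha == head_sha:
        return "noop"  # nothing new since the last review
    return "incremental"
```

Passing reachability in as a callback keeps the decision logic testable without a git checkout.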
Local Mode
Use when the caller is another skill or supervisor session that needs unbiased multi-role review of a diff but has no PR open yet (e.g. a supervisor session's verify phase doing pre-PR critique). The HARD-GATE still applies — local mode is about output target, not about who reviews.
Caveat when calling from the same dev session that wrote the code (author-as-reviewer bias): pr-review's 4-subagent dispatch is isolated by design, so finding generation is robust even when called from the author's session. But the downstream `modify / wontfix / defer` verdict on each finding is NOT covered by this isolation. If the same session that wrote the code also reasons about which findings to wontfix, author-narrative bias compounds: framing a diff as "bug-free" produces the strongest detection drop among framing conditions tested across 6 LLMs (Mitropoulos et al., Measuring and Exploiting Contextual Bias in LLM-Assisted Security Code Review, arXiv:2603.18740). Treat local-mode findings as advisory in dev sessions; do NOT auto-execute verdicts in main session. A proper dev-stage verdict loop needs a separate Deriver-pattern verdict-subagent (not built yet; see `pr-babysit/SKILL.md` § 4.6 "When NOT to use" for the equivalent caveat on Wontfix Template).
Inputs
- `mode: local` (required to enter this mode)
- `base: <ref>` (required, e.g. `origin/main`)
- `last_sha: <sha>` (optional; if provided, runs incremental on `<last_sha>..HEAD` and still reads `<base>...HEAD` for cumulative context that subagents need for prior-finding verification)
- `spec`, `test direction`, `context`: same semantics as default mode
Diff scope
- No `last_sha` → full diff: `git diff <base>...HEAD` (three-dot, topic-only changes)
- With `last_sha` → incremental: subagents see both `<base>...HEAD` and `<last_sha>..HEAD`; they report findings only inside the incremental window plus verification status for prior findings (caller must pass prior findings too; see below)
Caller responsibilities (incremental local mode)
The sticky normally carries prior findings between iterations. In local mode the caller owns that state and must pass to pr-review on each invocation:
- `prior_findings`: array of objects with `{id, slug, file, line, category, severity, justification, summary}`, the same shape as findings JSON output (see below)
- `prior_fix_range`: `<first-fix-sha>^..<last-fix-sha>`, the commits that addressed iter (N-1) findings, used by the threshold's drop signal (B)
If `last_sha` is set but `prior_findings` is missing → ESCALATE to caller; do not fabricate.
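As a rough illustration of the local-mode diff scope and the escalation rule, the sketch below picks the git ranges and refuses to proceed when prior state is missing. `local_diff_ranges` and `EscalateToCaller` are hypothetical names; the returned dictionary layout is an assumption made for this example only.

```python
from typing import Dict, List, Optional


class EscalateToCaller(Exception):
    """Raised when the caller must supply missing state; nothing is fabricated."""


def local_diff_ranges(base: str, last_sha: Optional[str] = None,
                      prior_findings: Optional[List[dict]] = None) -> Dict[str, str]:
    """Return the git ranges subagents should read in local mode."""
    if not base:
        raise EscalateToCaller("base is required in local mode")
    # Three-dot range: only the topic branch's changes relative to base.
    ranges = {"scope": "full", "cumulative": f"{base}...HEAD"}
    if last_sha is None:
        return ranges
    if prior_findings is None:
        # last_sha without prior findings cannot be verified against anything.
        raise EscalateToCaller("last_sha is set but prior_findings is missing")
    ranges["scope"] = "incremental"
    ranges["incremental"] = f"{last_sha}..HEAD"
    return ranges
```

An empty `prior_findings` list is accepted here (the previous iteration may have been clean); only an absent value escalates.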
Output
Skip Publishing. Skip sticky/inline markdown construction. Emit one JSON document to stdout:
```json
{
  "mode": "local",
  "base": "origin/main",
  "head": "<HEAD sha>",
  "last_sha": "<sha or null>",
  "status": "blocking | review-before-merge | approved-with-notes | approved | noop | partial-failure",
  "subagent_failures": [],
  "summary_line": "<same wording as sticky summary line>",
  "findings": [
    {
      "id": "#1",
      "p_code": "P0 | P1 | P2 | Q",
      "severity_emoji": "🚨 | ⚠️ | 💡 | ❓",
      "slug": "kebab-case-slug",
      "category": "Original [code name] from subagent",
      "file": "path/to/file",
      "line_start": 42,
      "line_end": 42,
      "confidence": "high | medium | low",
      "blast": "Local | Module | Cross-service | Data layer",
      "justification": "Reachable | Precedent | Asymmetric | Historical",
      "failure_mode": "one-line",
      "mitigation": "one-line",
      "evidence": "verbatim diff line(s)",
      "details": "optional multi-line",
      "severity_adjustment": null | { "from": "💡 P2", "to": "⚠️ P1", "reason": "..." }
    }
  ],
  "spec_gaps": [
    {
      "id": "#7",
      "section": "spec section or decision id",
      "title": "one-line",
      "spec_quote": "verbatim",
      "code_quote": "verbatim",
      "questions": ["..."]
    }
  ],
  "prior_verifications": [
    {
      "prior_id": "#1",
      "verification": "yes | unclear | no",
      "note": "what evidence"
    }
  ],
  "checked_and_clean": [
    { "slug": "...", "evidence": "one-line" }
  ]
}
```
`prior_verifications` is `[]` when `last_sha` is not provided.
What local mode keeps from default mode
- HARD-GATE: still dispatch 4 parallel subagents; main session never reviews
- Capability flags (has_spec, has_repo, is_trivial)
- Finding Inclusion Threshold (Reachable / Precedent / Asymmetric / Historical + drop signals A/B/C/D)
- Severity Merge Rule (4 steps + P-code mapping)
- Dedup between subagent findings
- Subagent failure → if all 4 fail, report failure to caller; never self-review
What local mode drops
- Sticky comment build / markdown rendering
- Inline comment markdown / GitHub Review API call
- Sticky discovery via `gh api`
- last_sha derivation from sticky body (caller passes it)
- Noop case (caller decides whether to re-invoke; if `last_sha == HEAD` and caller still invokes, return `findings: []` + a `status: "noop"`)
Capability Flags
Compute before dispatch:
| Flag | Default | Set when | Effect |
|---|---|---|---|
| `has_spec` | false | spec input present OR PR description has goal/requirement section | dispatch spec-auditor |
| `has_repo` | true | repo access available (grep / index / LSP) | enable cross-file checks |
| `is_trivial` | false | <50 LOC AND (docs-only OR pure rename OR pure type-only) | skip staff-engineer |
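The flag table can be read as three boolean expressions. The sketch below is one possible encoding, assuming the caller has already classified the diff; `CapabilityFlags`, `compute_flags` and every parameter name are hypothetical, chosen only to mirror the "Set when" column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityFlags:
    has_spec: bool = False
    has_repo: bool = True
    is_trivial: bool = False


def compute_flags(spec_input: str, description_has_goal_section: bool,
                  repo_access: bool, changed_loc: int, docs_only: bool,
                  pure_rename: bool, pure_type_only: bool) -> CapabilityFlags:
    """Derive the three capability flags before any subagent is dispatched."""
    return CapabilityFlags(
        # The existence of a spec matters, not its length.
        has_spec=bool(spec_input) or description_has_goal_section,
        has_repo=repo_access,
        # Trivial requires BOTH a small diff and a purely mechanical change.
        is_trivial=changed_loc < 50 and (docs_only or pure_rename or pure_type_only),
    )
```

Note that a 10-line logic change is not trivial under this rule, because it is none of docs-only, pure rename, or pure type-only.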
Dispatch
Default dispatch (4 subagents in parallel via a single message):
| Subagent | Prompt file | When dispatched |
|---|---|---|
| security-reviewer | `security-reviewer-prompt.md` | always |
| staff-engineer | `staff-engineer-prompt.md` | always (skip if is_trivial) |
| sdet | `sdet-prompt.md` | always |
| spec-auditor | `spec-auditor-prompt.md` | only if has_spec |
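The "When dispatched" column reduces to a small selection function. A minimal sketch, with `select_subagents` as a hypothetical helper name; the actual dispatch happens through parallel Agent tool calls, which this code does not model.

```python
from typing import List


def select_subagents(has_spec: bool, is_trivial: bool) -> List[str]:
    """Return the subagents to dispatch in one parallel batch."""
    selected = ["security-reviewer"]  # always dispatched
    if not is_trivial:
        selected.append("staff-engineer")  # skipped only for trivial diffs
    selected.append("sdet")  # always dispatched
    if has_spec:
        selected.append("spec-auditor")  # only when a spec exists
    return selected
```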
Each subagent receives:
- Diff (full in `full` mode; `<last_sha>..HEAD` in `incremental` mode)
- Capability flags (has_spec, has_repo, is_trivial)
- Mode (`full` / `incremental`)
- Their relevant inputs only (spec content for spec-auditor, test direction for sdet)
- In `incremental` mode (dispatcher MUST provide all three):
  - Prior findings JSON (subagent's own category scope only)
  - Prior `Checked & clean` slugs for drift spot-check
  - `prior_fix_range`: `<first-fix-sha>^..<last-fix-sha>`, the git range covering the commits that addressed iter (N-1) findings. Subagent uses this to apply drop signal (B) self-introduced surface. In single-commit-per-iter cases this collapses to `<last_sha>..HEAD`. If the dispatcher cannot determine the range (e.g. force-push, squash-merge of iter N-1 commits) → fall back to `full` mode and announce in sticky; do NOT invoke incremental mode without `prior_fix_range`
- NO conversation history, NO session context, NO prior subagent findings from this run
Threshold inlining: the Finding Inclusion Threshold is inlined directly in each subagent prompt (`security-reviewer-prompt.md` / `staff-engineer-prompt.md` / `sdet-prompt.md` / `spec-auditor-prompt.md`). Dispatcher does NOT need to prepend threshold text; subagents apply it from their baked-in section. This avoids relying on dispatcher's "good behavior" to inject the gate on every invocation.
Incremental-mode subagent additions
In incremental mode, each subagent ALSO emits for every prior finding within its scope:
```
Prior finding status: <id>
verification: yes | unclear | no
note: <one-line — what evidence supports the verification>
```
Mapping to display status (in `## 🔄 Changes since last review` table):
| `verification` | Display |
|---|---|
| `yes` | ✅ Likely fixed |
| `unclear` | ⏸️ Untouched — <note: "file segment not in diff"> |
| `no` | 🔄 Still present — <note: "evidence still observable at <file>:<line>"> |
Never emit `✅ Fixed` without `verification: yes`. Default hedge is always `Likely fixed`; finality belongs to the human reviewer.
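To make the verification-to-display mapping concrete, here is a small sketch. It assumes the three verification values map in order to the three display rows (`yes` to Likely fixed, `unclear` to Untouched, `no` to Still present); `display_status` is a hypothetical helper.

```python
DISPLAY = {
    "yes": "✅ Likely fixed",
    "unclear": "⏸️ Untouched",
    "no": "🔄 Still present",
}
SEPARATOR = " — "


def display_status(verification: str, note: str = "") -> str:
    """Render one prior finding's status for the changes-since-last-review table."""
    if verification not in DISPLAY:
        raise ValueError(f"unknown verification value: {verification!r}")
    label = DISPLAY[verification]
    # "yes" stays hedged as "Likely fixed"; the other rows carry the note.
    if verification == "yes" or not note:
        return label
    return label + SEPARATOR + note
```

There is deliberately no code path that produces a bare "Fixed": the strongest status available is the hedged one.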
Fallback rules
- 1 subagent fails → continue with rest; sticky header shows `⚠️ Partial — <subagent> failed`
- 2+ fail → continue with surviving findings; sticky header shows `⚠️ Partial — N/4 subagents failed: <names>`
- ALL fail → report failure to user, do not publish, never self-review
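The fallback rules could be encoded roughly as follows. `partial_header` and `AllSubagentsFailed` are hypothetical names, and using the number of dispatched subagents as the denominator (rather than a fixed 4) is an assumption of this sketch.

```python
from typing import List, Optional

PARTIAL_PREFIX = "⚠️ Partial — "


class AllSubagentsFailed(Exception):
    """Every dispatched subagent failed: report failure, never self-review."""


def partial_header(dispatched: List[str], failed: List[str]) -> Optional[str]:
    """Return the sticky header line for partial coverage, or None if all succeeded."""
    if not failed:
        return None
    if len(failed) >= len(dispatched):
        raise AllSubagentsFailed("no surviving findings; do not publish")
    if len(failed) == 1:
        return PARTIAL_PREFIX + f"{failed[0]} failed"
    names = ", ".join(failed)
    return PARTIAL_PREFIX + f"{len(failed)}/{len(dispatched)} subagents failed: {names}"
```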
Subagent Finding Contract
Each subagent emits findings in this shape:
```
[<category-id> <category-name>] <file>:<line_start>-<line_end>
Severity: 🚨 | ⚠️ | 💡 | ❓
Confidence: high | medium | low
Blast: Local | Module | Cross-service | Data layer
Justification: Reachable | Precedent | Asymmetric | Historical
Evidence: <verbatim diff line(s) — cite-or-drop rule>
Failure mode: <one-line — what breaks if shipped as-is>
Mitigation: <one-line — fix action; cite test path when test coverage is part of the fix>
Details: <optional — multi-line narrative, repro steps, code patch. Use only when Failure mode genuinely needs more than one line>
Notes: <optional — only if severity differs from default>
```
Field semantics:
- `Failure mode`: concrete bug / breach / drift consequence. Forces severity calibration: if you cannot describe what goes wrong in one line, you do not have a finding.
- `Mitigation`: actionable fix. When the finding's resolution involves test coverage, name the test file and case (e.g. `add assert in foo_test.py:42 'rejects empty input' case`).
- `Details`: escape hatch for findings whose explanation cannot fit one line (e.g. multi-step race, cross-file impact chain). Keep `Failure mode` and `Mitigation` as one-liners regardless; put narrative here.
- `Justification`: required class declaring why the finding is worth emitting. See Finding Inclusion Threshold below. Findings that cannot commit to one of the four classes MUST NOT be emitted as standalone findings; batch into a Q-class hygiene followup instead.
After findings, each subagent emits `N/A categories: [<list>]` declaring which of its owned categories were reviewed and clean. This distinguishes "checked, found nothing" from "skipped".
spec-auditor uses `Spec quote:` + `Code quote:` instead of single `Evidence:`; both must be verbatim quotes. `Failure mode` for spec findings = "what spec contract gets violated if shipped".
Drop rule: any finding without `Evidence:` (or both quotes for spec-auditor) is fabrication; discard before merge.
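The drop rule is mechanical enough to sketch in code. The example below assumes findings have already been parsed into dictionaries with hypothetical keys (`subagent`, `evidence`, `spec_quote`, `code_quote`); the parsing step and the key names are not part of the original contract.

```python
from typing import List, Tuple


def has_required_quotes(finding: dict) -> bool:
    """Cite-or-drop: a finding must carry its verbatim evidence."""
    def filled(key: str) -> bool:
        return bool(str(finding.get(key) or "").strip())

    if finding.get("subagent") == "spec-auditor":
        # Spec findings need BOTH the spec quote and the code quote.
        return filled("spec_quote") and filled("code_quote")
    return filled("evidence")


def apply_drop_rule(findings: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Split findings into (kept, dropped) before the merge step."""
    kept, dropped = [], []
    for finding in findings:
        (kept if has_required_quotes(finding) else dropped).append(finding)
    return kept, dropped
```

Returning the dropped list as well keeps the discard visible to the dispatcher instead of silently shrinking the report.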
Finding Inclusion Threshold
This gate is applied by each subagent inline before emitting a finding. Canonical definition lives in the subagent prompts, not here — see any of:
- `security-reviewer-prompt.md` § Finding Inclusion Threshold
- `staff-engineer-prompt.md` § Finding Inclusion Threshold
- `sdet-prompt.md` § Finding Inclusion Threshold
- `spec-auditor-prompt.md` § Finding Inclusion Threshold
All four contain the same Justification classes (Reachable / Precedent / Asymmetric / Historical), the same drop signals (A / B / C / D), and the same Asymmetric escape hatch. Per-prompt variations only add category-specific guidance (e.g. "most S1–S5 are Asymmetric" for security, "rare for T-class" for SDET).
Why duplicated across four prompts rather than referenced from one source: see Design note: prompt inlining.
Full vs incremental mode: full mode applies the threshold but drop signal (B) self-introduced surface never fires (no `prior_fix_range` on iter 1). Incremental mode applies all four signals.
Spec ambiguity rule (applies only to spec-auditor's C-class findings, kept in this SKILL.md as cross-cutting): if a candidate finding's mitigation offers "add a code comment" / "document the limitation in a comment" as an equal-weight resolution (phrasing "either X or document Y"), the finding is a Q-class spec gap addressed to the spec author, not P-class actionable. A comment-as-last-resort fallback ("do X; if impractical, document Y") keeps the finding actionable, because the primary mitigation is what gets judged.
Severity Merge Rule (deterministic precedence)
Apply in fixed order to each finding. Lower number wins on conflict.
1. Base severity: assigned by subagent at finding emission (emoji form)
2. Confidence demote: `confidence: low` → demote to ❓ Question (terminal, no further escalation)
3. Blast escalate: `blast: cross-service` or `blast: data-layer` → escalate one level (max 🚨 Blocker). Skipped if step 2 already demoted.
4. Context adjust: overrides from context input (e.g. "CDE service: security cannot downgrade") applied last
5. Final severity: result after all four steps
6. Map to P-code for output (dispatcher does this; subagents emit emoji severity only):
| Emoji | P-code | Label |
|---|---|---|
| 🚨 Blocker | P0 | must fix; blocks merge |
| ⚠️ Factual | P1 | should fix |
| 💡 Suggestion | P2 | consider |
| ❓ Question | Q | clarify; not a priority tier |
Severity ordering (for sort): P0 > P1 > P2. Q is orthogonal.
Downgrades (step 4 lowering a tier) MUST appear in the `Severity adjustments` section. Never silent. Never collapsed behind `<details>` — render as plain section when any adjustment exists.
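The precedence above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual dispatcher code: the `Finding` dataclass, the `merge_severity` function, and the `context_adjust` callback are hypothetical names, and reading "terminal" in step 2 as "return immediately" is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

QUESTION = "❓ Question"
# Escalation ladder, lowest to highest; Question is orthogonal and not on it.
LADDER = ["💡 Suggestion", "⚠️ Factual", "🚨 Blocker"]
P_CODE = {LADDER[2]: "P0", LADDER[1]: "P1", LADDER[0]: "P2", QUESTION: "Q"}
ESCALATING_BLAST = {"cross-service", "data-layer"}


@dataclass
class Finding:  # hypothetical shape, for illustration only
    base: str        # emoji severity assigned by the subagent (step 1)
    confidence: str  # "high" | "medium" | "low"
    blast: str       # e.g. "local", "module", "cross-service", "data-layer"


def merge_severity(
    finding: Finding,
    context_adjust: Optional[Callable[[Finding, str], str]] = None,
) -> Tuple[str, str, List[str]]:
    """Return (final emoji severity, P-code, list of recorded adjustments)."""
    severity = finding.base
    adjustments: List[str] = []

    def move(new: str, reason: str) -> None:
        # Record every change so no adjustment is ever silent.
        nonlocal severity
        if new != severity:
            adjustments.append(f"{severity} → {new} ({reason})")
            severity = new

    # Step 2: low confidence demotes to Question and stops (assumed terminal).
    if finding.confidence == "low":
        move(QUESTION, "confidence: low")
        return severity, P_CODE[severity], adjustments

    # Step 3: wide blast radius escalates one level, capped at Blocker.
    if finding.blast in ESCALATING_BLAST and severity in LADDER:
        bumped = LADDER[min(LADDER.index(severity) + 1, len(LADDER) - 1)]
        move(bumped, f"blast: {finding.blast}")

    # Step 4: context overrides are applied last.
    if context_adjust is not None:
        move(context_adjust(finding, severity), "context override")

    # Steps 5-6: final severity, mapped to its P-code.
    return severity, P_CODE[severity], adjustments
```

The returned adjustment list is what a dispatcher could render in the flat `Severity adjustments` section.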
Dedup (between subagent findings)
- Same `file:line` + same category → keep highest base severity, attribute to all reporting subagents
- Same issue described differently across subagents → merge into one finding with combined notes
- Cross-cutting (e.g. staff-eng AND sdet both flag missing test for SQL injection) → keep both, dispatcher cross-references
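The first dedup rule is mechanical enough to sketch in Python. The dictionary keys (`file`, `line`, `category`, `severity`, `subagent`) and the `dedup` function are hypothetical, and ranking ❓ Question below every tier is an assumption made only so the comparison is total. The second and third rules need judgment about whether two descriptions mean the same issue, so they are not modelled here.

```python
# Rank used only to pick the "highest base severity" among exact duplicates.
RANK = {"🚨 Blocker": 3, "⚠️ Factual": 2, "💡 Suggestion": 1, "❓ Question": 0}


def dedup(findings):
    """Collapse findings that share file, line and category (rule 1 only)."""
    merged = {}
    for finding in findings:
        key = (finding["file"], finding["line"], finding["category"])
        if key not in merged:
            # First report for this location/category: start the attribution list.
            merged[key] = {**finding, "reported_by": [finding["subagent"]]}
            continue
        kept = merged[key]
        if finding["subagent"] not in kept["reported_by"]:
            kept["reported_by"].append(finding["subagent"])
        # Keep the highest base severity seen so far.
        if RANK[finding["severity"]] > RANK[kept["severity"]]:
            kept["severity"] = finding["severity"]
    return list(merged.values())
```

Findings at the same location but in different categories stay separate, which matches the cross-cutting rule of keeping both.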
Output Language
PR-published prose (sticky shape narrative, `Failure mode` / `Mitigation` / `Details` content, spec gap question body, verification notes, framing text around code refs) renders in the PR description's primary language. Everything else — markers, section titles, field labels, kebab-case slugs, P-codes, severity / justification / status tokens, the race meta tag — stays English.
Fallback when the PR description lacks substantive prose: linked issue body, then English.
Terminal / JSON output (`mode=local` JSON, dry-run console, noop message) stays English regardless of PR language — those go to callers, not the PR.
Output Format
Two artifacts produced post-merge:
- Sticky comment — single issue comment on the PR; updated in place across iterations (PATCH same comment id)
- Inline review comments — one GitHub review submission (`event=COMMENT`) containing one comment per P0 / P1 / P2 finding; Q findings stay in the sticky
Status tier (drives sticky header wording, NOT the actual GitHub review event):
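| Condition | Wording |
|---|---|
| Any P0 | `🔴 Blocking issues found` |
| No P0, any P1 | `⚠️ Review before merge` |
| Only P2 / Q | `✅ Approved with notes` |
| Zero findings | `✅ Approved` |
| Any subagent failed | prepend `⚠️ Partial — <subagent> failed ·` |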
The skill does not submit `APPROVE` or `REQUEST_CHANGES` reviews. Status wording lives inside the sticky comment only. Auto-approve / auto-merge is forbidden.
Summary line (top of sticky)
Show only non-zero P-buckets + total + clean count. Zero suppression keeps the line scannable; incremental drift is communicated via the `Changes since last review` table, not by comparing zero counts.
**Review: <status>** · <total> finding(s) (<P0×N, P1×N, P2×N, Q×N — only non-zero>) · ✅ <N> clean
Examples:
**Review: ✅ Approved** · 0 findings · ✅ 11 clean
**Review: ✅ Approved with notes** · 1 finding (P2×1) · ✅ 11 clean
**Review: ⚠️ Review before merge** · 6 findings (P1×2, P2×3, Q×1) · ✅ 11 clean
**Review: 🔴 Blocking issues found** · 3 findings (P0×1, P1×2) · ✅ 11 clean
**Review: ⚠️ Partial — security-reviewer failed · ⚠️ Review before merge** · 2 findings (P1×2)
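A small Python sketch of how the status tier and the zero-suppressed summary line could be derived from bucket counts. The function names `status_wording` and `summary_line` are hypothetical, the status strings are taken from the example lines above, and joining several failed subagents with commas is an assumption (the examples only show a single failure).

```python
ORDER = ("P0", "P1", "P2", "Q")


def status_wording(counts, failed=()):
    """Pick the sticky header wording from bucket counts and failed subagents."""
    if counts.get("P0", 0):
        status = "🔴 Blocking issues found"
    elif counts.get("P1", 0):
        status = "⚠️ Review before merge"
    elif counts.get("P2", 0) or counts.get("Q", 0):
        status = "✅ Approved with notes"
    else:
        status = "✅ Approved"
    if failed:
        # Partial-mode warning is prepended, never silent.
        status = f"⚠️ Partial — {', '.join(failed)} failed · {status}"
    return status


def summary_line(counts, clean=None, failed=()):
    """Render the summary line, listing only non-zero buckets."""
    total = sum(counts.get(code, 0) for code in ORDER)
    buckets = ", ".join(
        f"{code}×{counts[code]}" for code in ORDER if counts.get(code, 0)
    )
    noun = "finding" if total == 1 else "findings"
    line = f"**Review: {status_wording(counts, failed)}** · {total} {noun}"
    if buckets:
        line += f" ({buckets})"
    if clean is not None:
        line += f" · ✅ {clean} clean"
    return line
```

For instance, `summary_line({"P1": 2, "P2": 3, "Q": 1}, clean=11)` reproduces the third example line.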
Category slugs
Convert each subagent's `[<code> <name>]` to a kebab-case slug for output. Drop the subagent-owned code (S/E/T/C). Examples:
- `[E3 Conditional side effects]` → `state-consistency` (use semantic slug, not the literal name when one is more reviewer-meaningful)
- `[S3 Secret / credential]` → `secrets-handling`
- `[T1 Test coverage gaps]` → `missing-coverage`
- `[C4 Business rule alignment]` → `decision-conflict`
When semantic slug differs from the literal category name, prefer semantic. The slug is the navigation handle reviewers see; pick the term that conveys "what kind of problem" most directly.
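One way to implement the conversion is a lookup table of semantic slugs with a literal kebab-case fallback. This is a sketch under assumptions: `SEMANTIC_SLUGS` holds only the four mappings shown in the examples, `category_slug` is a hypothetical name, and the fallback for unlisted categories is not specified by the skill.

```python
import re

# Semantic slugs from the examples; a real table would cover every category.
SEMANTIC_SLUGS = {
    "Conditional side effects": "state-consistency",
    "Secret / credential": "secrets-handling",
    "Test coverage gaps": "missing-coverage",
    "Business rule alignment": "decision-conflict",
}


def category_slug(category: str) -> str:
    """Turn '[<code> <name>]' into a slug, dropping the S/E/T/C code."""
    match = re.fullmatch(r"\[[SETC]\d+\s+(.+)\]", category.strip())
    name = match.group(1) if match else category.strip()
    if name in SEMANTIC_SLUGS:
        return SEMANTIC_SLUGS[name]
    # Assumed fallback: kebab-case the literal category name.
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
```

With this table, `category_slug("[S3 Secret / credential]")` yields `secrets-handling`, while an unlisted category such as the made-up `[S2 Input validation]` falls back to `input-validation`.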
Sticky comment template
markdown
<!-- pr-review:sticky -->
<!-- pr-review:sha=<HEAD> -->
> 🤖 Automated review by `pr-review` skill
**Review: <status>** · <total> finding(s) (<non-zero buckets>) · ✅ <N> clean
> <one-line shape narrative — what's the issue cluster; render in PR description language. English example: "observability + state-consistency form two P1 clusters; security clean">
📋 Currently open (<N>)
- <id> <P-code> `<slug>` — <file>:<line>
- ...
📍 Inline comments: <N> findings pinned to source lines (see the Files changed tab) — render this locator line in PR description language
⚖️ Severity adjustments
<rendered only when ≥1 adjustment exists; NOT inside <details>; see template below>
🔄 Last iteration changes (`<last_sha>..<HEAD>`)
<rendered only in incremental mode; ONLY this iter's verifications, not cumulative; see template below>
<details><summary>📊 Overview by category</summary>
| Category | P0 | P1 | P2 | Q | Files |
|---|---|---|---|---|---|
| `<slug>` | N | N | N | N | <file paths, comma-separated> |
</details>
<details><summary>Spec gap questions</summary>
<rendered only when spec-auditor emits gap items>
</details>
<details><summary>✅ Checked & clean (<N>)</summary>
- `<slug>`: <one-line evidence — what specific patterns were verified clean, or which grep / file-read confirmed>
- ...
</details>
🤖 `pr-review` skill · reviewed `<base>..<HEAD>` <· last reviewed `<last_sha>` — incremental only>
Rules:
- Shape narrative mandatory when ≥2 findings; optional for 0-1
- `📋 Currently open` rendered **flat** (no `<details>`) when ≥1 finding is not yet `Likely fixed`; one bullet per finding, sorted P0→P1→P2→Q then by file path. Omit the section entirely when all findings are closed (avoid empty heading)
- `📊 Overview by category` always in `<details>` (collapsed); rows omitted where P0/P1/P2/Q are all zero. Collapsed by default — summary line already conveys totals; the table is for drill-down only
- `📍 Inline comments` line shown when ≥1 P0/P1/P2 finding posted inline; omit otherwise
- `Severity adjustments` rendered **flat** (no `<details>`) when any adjustment exists — discipline requirement, never silent
- `🔄 Last iteration changes` rendered **flat** in incremental mode; shows ONLY this iter's verifications (`<last_sha>..<HEAD>`), never cumulative across older iterations. Audit trail for older iters lives in git history (commits + prior inline comment threads), not in the sticky
- `Spec gap questions` always in `<details>` (collapsed) — verbose; secondary to actionable findings
- `Checked & clean` always in `<details>` (collapsed) — count is the load-bearing signal; expand for trust calibration
Severity adjustments section
markdown
⚖️ Severity adjustments
| # | Category | Adjustment | Reason |
|---|---|---|---|
| #<n> | `<slug>` | <original-emoji + P-code> → <final-emoji + P-code> | <reason — one line> |
Last iteration changes section (incremental only)
markdown
🔄 Last iteration changes (`<last_sha>..<HEAD>`)
| Prior | Status |
|---|---|
| #<n> <P-code> <slug> (<file>:<line>) | ✅ Likely fixed |
| #<n> <P-code> <slug> (<file>:<line>) | 🔄 Still present — <note> |
| #<n> <P-code> <slug> (<file>:<line>) | ⏸️ Untouched — <note> |
| #<n> Q <slug> | ⏸️ Awaiting spec author |
Scope: **only findings whose status changed (or was re-confirmed) in this iteration's `<last_sha>..<HEAD>` diff**. Untouched findings carrying over from before `<last_sha>` belong in `📋 Currently open`, not here. The table is the delta, not the inventory.
Status legend (hedged on purpose — line-moved ≠ behaviour-fixed):
- `✅ Likely fixed <sha>` — subagent emitted `verification: yes` + note explaining what changed
- `🔄 Still present` — subagent emitted `verification: no` + note pointing to remaining evidence
- `⏸️ Untouched` — subagent emitted `verification: unclear` (file segment not in diff)
Inline comment body template
One per P0 / P1 / P2 finding. Posted via GitHub Review API (single review, `event=COMMENT`).
markdown
** <slug>**
**Failure mode**: <one-line>
**Mitigation**: <one-line; cite test path when applicable>
<details><summary>Evidence</summary>
```diff
<verbatim diff line(s)>
```
</details>
<sub>blast: <Local|Module|Cross-service|Data layer> · <reversible|not reversible> · confidence: <high|medium|low> · justification: <Reachable|Precedent|Asymmetric|Historical></sub>
<!-- pr-review:finding-id=#<n> -->
<!-- pr-review:justification=<Reachable|Precedent|Asymmetric|Historical|Hygiene> -->
The `justification` HTML marker is consumed by `pr-babysit`'s diminishing-returns gate to decide whether to keep looping or hand back to the user. The `Hygiene` value is reserved for batched Q-class hygiene followups; never emit `Hygiene` on a P0/P1/P2 finding.
Badge colors:
- `P0` → `red`
- `P1` → `orange`
- `P2` → `yellow`
`reversible`:
- `reversible` — code-only change, additive feature, refactor without state migration
- `not reversible` — destructive migration, breaking contract change, irreversible side effect (sent message, deleted data)
- omit if ambiguous (don't guess)
Spec gap questions (in sticky `<details>`)
markdown
❓ #<n> <spec-section-or-decision-id> — <one-line title>
`<Blast>`
Spec quote: <verbatim>
Code quote: <verbatim>
Question for spec author:
1. <numbered question>
2. ...
<closing line, in PR description language — e.g. "not blocking the PR; want to clarify X">
Q findings do **not** become inline comments: they are often cross-file conceptual questions, and pinning them to a line misleads.
What to drop from output
| Drop | Why |
|---|---|
| … | dispatch logic; reviewer doesn't care |
| … | which bots ran is process metadata — UNLESS one failed (then surface) |
| Per-finding … | reviewer wants "what kind of issue" (already in category), not "who found it" |
| … in slug | leaks subagent identity; bare slug reads cleaner |
| … grouped by subagent | same — flat topic list |
| Empty … | render section only when content exists |
Publishing
Runs after merge step. Skipped entirely when `dry-run: true` (print sticky + inline payloads to console instead) or when `mode: local` (emit findings JSON to stdout — see Local Mode).
dot
digraph publish {
"Findings merged" [shape=doublecircle];
"Build sticky body" [shape=box];
"Build inline payload" [shape=box];
"Find sticky comment" [shape=box];
"Found?" [shape=diamond];
"PATCH sticky body" [shape=box];
"POST new sticky" [shape=box];
"POST review with inline comments" [shape=box];
"Done" [shape=doublecircle];
"Findings merged" -> "Build sticky body";
"Findings merged" -> "Build inline payload";
"Build sticky body" -> "Find sticky comment";
"Find sticky comment" -> "Found?";
"Found?" -> "PATCH sticky body" [label="yes"];
"Found?" -> "POST new sticky" [label="no"];
"PATCH sticky body" -> "POST review with inline comments";
"POST new sticky" -> "POST review with inline comments";
"Build inline payload" -> "POST review with inline comments";
"POST review with inline comments" -> "Done";
}
Commands
bash
# 1. find sticky id (may be empty); --paginate so the marker is found beyond the first page of comments
STICKY_ID=$(gh api --paginate "repos/$OWNER/$REPO/issues/$N/comments" \
  --jq '.[] | select(.body | contains("<!-- pr-review:sticky -->")) | .id' | head -1)
# 2. write sticky body to a temp file (skill builds the markdown)
# sticky.md should embed both markers: <!-- pr-review:sticky --> and <!-- pr-review:sha=$HEAD -->
# 3a. create new sticky
[ -z "$STICKY_ID" ] && gh api -X POST "repos/$OWNER/$REPO/issues/$N/comments" \
  -F body=@sticky.md
# 3b. or patch existing sticky
[ -n "$STICKY_ID" ] && gh api -X PATCH "repos/$OWNER/$REPO/issues/comments/$STICKY_ID" \
  -F body=@sticky.md
# 4. post inline comments as one review (single API call)
# inline-comments.json shape: [{"path": "...", "line": N, "side": "RIGHT", "body": "..."}, ...]
# -F key=@file sends the file contents as a string, so the request body is assembled with jq
# (assumed to be installed) and passed via --input to keep `comments` a JSON array
jq -n --slurpfile comments inline-comments.json \
  '{event: "COMMENT", body: "See sticky summary above.", comments: $comments[0]}' \
  | gh api -X POST "repos/$OWNER/$REPO/pulls/$N/reviews" --input -
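The data shaping behind these commands can be sketched in Python. The helper names (`find_sticky_id`, `build_review_payload`) and the finding keys (`p_code`, `file`, `line`, `body`) are hypothetical; the sketch only shows how the sticky marker lookup and the inline payload could be built before being handed to `gh api`.

```python
import json

STICKY_MARKER = "<!-- pr-review:sticky -->"
INLINE_CODES = {"P0", "P1", "P2"}  # Q findings stay in the sticky comment


def find_sticky_id(comments):
    """Return the id of the first comment carrying the sticky marker, else None."""
    return next(
        (c["id"] for c in comments if STICKY_MARKER in c.get("body", "")),
        None,
    )


def build_review_payload(findings):
    """Build the single review submission holding one comment per P0/P1/P2 finding."""
    comments = [
        {"path": f["file"], "line": f["line"], "side": "RIGHT", "body": f["body"]}
        for f in findings
        if f["p_code"] in INLINE_CODES
    ]
    return {
        "event": "COMMENT",  # never APPROVE or REQUEST_CHANGES
        "body": "See sticky summary above.",
        "comments": comments,
    }


if __name__ == "__main__":
    demo = [
        {"p_code": "P1", "file": "app/db.py", "line": 42, "body": "example body"},
        {"p_code": "Q", "file": "app/db.py", "line": 7, "body": "spec question"},
    ]
    # Dry-run style output: print the payload instead of posting it.
    print(json.dumps(build_review_payload(demo), indent=2, ensure_ascii=False))
```

A `None` result from `find_sticky_id` corresponds to the "POST new sticky" branch of the graph; any other value selects the PATCH branch.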
Old inline comments
Do not delete or resolve old inline review comments from prior iterations. GitHub auto-marks them `outdated` when the underlying line moves; the UI collapses outdated threads. This is the chosen trade-off (vs. graphql resolve) for operational simplicity.
Dry-run mode
When `dry-run: true`:
- Print sticky body markdown to console (with markers)
- Print inline payload as JSON
- Skip all `gh api` writes
- Useful for first-time use, debugging, or auditing output before publishing
Design note: prompt inlining over reference indirection
Subagents (security-reviewer / staff-engineer / sdet / spec-auditor) operate as isolated `Agent` dispatches with no shared loader. They cannot follow a "see `references/X.md`" pointer at dispatch time — references load via the main session, not the subagent's. This inverts skill-creator's default duplication-avoidance principle: any policy a subagent MUST apply is inlined verbatim into each subagent prompt, not stored once in `references/` and pointed to.
Current inlined-duplicated content (intentional, not drift):
- Finding Inclusion Threshold (Justification classes + drop signals A/B/C/D) — identical wording across all 4 subagent prompts; SKILL.md only points to them
- Race-class Finding Metadata (`[window=..., damage=..., recovery=...]` meta tag spec) — identical across `staff-engineer-prompt.md` + `security-reviewer-prompt.md`; pr-babysit's Gate B parser depends on identical syntax
Cross-prompt sync is maintained via `<!-- keep-in-sync: ... -->` HTML comments at each duplicated section header. When editing one, grep for the keep-in-sync marker to find paired sections.
Do NOT refactor inlined content into a shared `references/` file — the alternative regresses to the exact failure mode that motivated inlining. SKILL.md previously claimed "dispatcher prepends threshold at dispatch time" and that contract was never enforced (commit `328b73b8` fixed it by baking threshold into prompts; this design note exists to prevent a future editor from re-introducing the same gap).
Notes
- Don't auto-approve or auto-merge — produce findings; merge belongs to humans
- Lean conservative — low-confidence findings always demote to ❓ Question (Q)
- Spec gaps don't block review — mark Q for spec author, proceed with code findings
- Severity downgrades must be visible — flat section in sticky, never `<details>`
- Don't auto-grep for spec location — use only what the user provides
- Subagent reports are advisory — dispatcher applies merge rule and dedup, not subagents
- Subagent failure must be surfaced — sticky header carries the partial-mode warning; never silent
- Prior findings: hedge on "fixed" — always `Likely fixed`, never bare `Fixed`; line-moved ≠ behaviour-fixed
- Force-push aware — when last_sha is unreachable, fall back to full + announce in sticky
- Output language is adaptive — PR-published prose follows the PR description's language; markers / titles / field labels / keywords / terms stay English. See Output Language
- Local mode is JSON-only — no markdown, no sticky, no inline; caller (e.g. a supervisor session) consumes findings JSON and drives its own follow-up loop