code-ultrareview

Code ultrareview

Critical — Label hygiene

Internal planning labels are author coordinates, not reader coordinates. Strip them from every shipped artifact this skill emits — code, comments, commit subjects/bodies, PR titles/descriptions, release notes, doc paragraphs, non-trivial comments.

Workstream and task labels: `WS-N`, `Phase-A`, `Step-3`, issue or ticket numbers, plan phase names from the source spec, issue body, or planning artifact. Translate to the domain noun (`Runs the battery script (WS-2)` → `Runs the battery script`).
Process language: "the rebuild", "the prior `<file>`", "carried verbatim from", "the cleanup pass", "the audit", "spec AC" standalone. Replace with the concrete fact (`carries the routing from the prior aggregation` → `routes via the merge keys in the synthesis module`).
Plan-internal references: "as the brief says", "per the workstream", "from the forge artifact". Drop the reference; state the fact directly.

Carve-outs: literal `WS-N` is legitimate where the skill IS the format authority (forge templates, apex rule documentation). Reviewer-facing dev docs (e.g. `MIGRATION.md` under `tests/<skill>/`) may reference deleted artifacts by their author-time names.
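The label rules above lend themselves to a mechanical pre-check. The following is a minimal sketch, not part of the skill: the regular expression only covers the three label shapes named above (`WS-<digits>`, `Phase-<letter>`, `Step-<digits>`), and the function names `find_planning_labels` and `strip_parenthesized_labels` are hypothetical.

```python
import re

# Assumed shapes for the labels named above; real planning labels may differ.
LABEL = r"(?:WS-\d+|Phase-[A-Z]|Step-\d+)"
LABEL_PATTERN = re.compile(rf"\b{LABEL}\b")
PAREN_LABEL_PATTERN = re.compile(rf"\s*\({LABEL}\)")


def find_planning_labels(text: str) -> list[tuple[int, str]]:
    """Return (line_number, label) pairs for every planning label in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in LABEL_PATTERN.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def strip_parenthesized_labels(text: str) -> str:
    """Drop '(WS-2)'-style tags while keeping the domain sentence intact."""
    return PAREN_LABEL_PATTERN.sub("", text)
```

The literal placeholder `WS-N` does not match, which is consistent with the carve-out for format-authority documents. Labels used mid-sentence still need a human rewrite to the domain noun; the scanner only locates them.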

Eight-axis judgment code review. Five-phase pipeline: scope → tool battery → 8 parallel axis reviewers → Haiku validators → synthesis. Always runs at full strength. Distinct from Anthropic's remote `/ultrareview`: same goal, but runs in-session on the user's subscription.

Important — Writing rules

These rules govern every prose artifact this skill emits — READMEs, CHANGELOGs, commit messages, PR bodies, release notes, doc paragraphs, non-trivial comments. Apply them at draft time, verify before output.

Match the surrounding style — punctuation, capitalization, backtick conventions, em-dash vs parens, bullet style.
Every sentence changes the reader's understanding. Cut it otherwise.
Front-load the verb — "Creates", not "This helps you create".
Concrete over abstract. Lists for ≥3 enumerable items.
Assert positively. Reserve negation for real constraints (`NEVER commit secrets`).
No marketing words: powerful, robust, seamlessly, leverage, unlock, comprehensive, delightful.
No AI tells: delve, tapestry, intricate, pivotal, testament, underscore, crucial, garner, showcase, additionally, moreover, furthermore, indeed.
After drafting English prose, invoke `/humanize-en` if installed.
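The two word lists above can be checked before output. This is a rough sketch under the assumption that exact, case-insensitive word matches are enough; the helper name `flag_banned_words` is hypothetical and inflected forms such as "leverages" are deliberately not caught.

```python
import re

# Word lists copied from the rules above.
MARKETING_WORDS = {
    "powerful", "robust", "seamlessly", "leverage",
    "unlock", "comprehensive", "delightful",
}
AI_TELLS = {
    "delve", "tapestry", "intricate", "pivotal", "testament", "underscore",
    "crucial", "garner", "showcase", "additionally", "moreover",
    "furthermore", "indeed",
}


def flag_banned_words(prose: str) -> list[str]:
    """Return the banned words present in prose, sorted alphabetically."""
    words = set(re.findall(r"[a-z]+", prose.lower()))
    return sorted(words & (MARKETING_WORDS | AI_TELLS))
```

A non-empty result is a prompt to rewrite the sentence, not to swap in a synonym.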

Objective

Run the 8 axes (Correctness, Simplification, Tests, Documentation, Style, Intent, Design/API, Performance) as 8 parallel LLM subagents fed by deterministic tool findings from `scripts/run_battery.sh`. Coherence joins as a 9th axis when metadata files change. Sub-80 axis findings get re-scored by Haiku validators against the verbatim rubric in `references/anthropic-verbatim.md`. Findings synthesize into one report with deterministic dedup, inter-axis precedence, A2 no-silent-drop, and a verdict (Ship / Fix-then-ship / Needs work). The report ends with "What I did NOT check" so the coverage limits are explicit.

Parameters

| Flag | Behavior |
| --- | --- |
| `-s` | Save the report + JSONL to `~/.claude/output/{project}/code-ultrareview/code-ultrareview-{slug}.{md,jsonl}` |
| `-S` | Force no-save (overrides any ambient save mode) |
| `-b <ref>` | Override the review base (skip auto-detection via `scripts/resolve_base.sh`) |
| `--repo-kind <kind>` | Override the scope classifier. Values: `skills`, `app`, `library`, `docs`, `monorepo`, `python`, `rust`, `go`, `unknown`. Persistent per-repo override at `.code-ultrareview.yaml` (`repo_kind: <kind>`); the flag wins on conflict. Invalid value exits 2 |
| `--reconcile <input>` | Activate the Intent-axis derivation sub-mode. `<input>` may be `@auto`, `@pr`, an explicit path or directory, `gh:pr:<N>`, `gh:issue:<owner>/<repo>#<N>`, or a GitHub issue URL. Findings classify as GAP / SCOPE-ADD / DECISION-OVERRIDE / CONSISTENT |
| `--verify-build` | Run build verification on sub-80 axis findings BEFORE Haiku validators (Phase 3.5). Builds + runs the test command detected by `scripts/build_detect.py`; confirmed findings get promoted (+30 confidence) and skip the validator phase |
| `--mutation-test` | Run Stryker (JS/TS), Pitest (JVM), or mutmut (Python) on changed files only. Surviving mutants route to the Tests axis as 🟠 Medium |
| `--apply-safe` | Opt-in writers: auto-apply low-risk fixes (manifest version sync, structured-field description sync with full-agreement guard, one failing test per confirmed bug). Diff preview + per-file confirmation before any write |
| `--include-prose` | Coherence axis compares README freeform paragraphs as well (default: structured fields only) |
| `--axes <list>` | Comma-separated subset of axes to run (e.g. `correctness,tests`). Default: all 8 + Coherence when triggered |
| `--preflight` | List detected tools per repo_kind + print install commands for missing ones. Informational only, no install |

Lowercase enables, uppercase disables. There is no `-f` flag: this skill is a producer, not a consumer.

```bash
/code-ultrareview                              # full 8-axis review, print report
/code-ultrareview -s                           # save the report + JSONL for /apex -f
/code-ultrareview -b origin/main               # review HEAD against an explicit base
/code-ultrareview --verify-build               # promote sub-80 findings via real build verification
/code-ultrareview --reconcile @auto            # add Intent derivation sub-mode with auto-detect
/code-ultrareview --apply-safe                 # full review + gated low-risk fixes
/code-ultrareview --preflight                  # list tools the battery would run, no review
/code-ultrareview --axes correctness,tests     # subset of axes
```
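The `--repo-kind` precedence described in the parameter table (flag over `.code-ultrareview.yaml` over auto-detection, invalid flag value exits 2) could be sketched as below. This is an illustration only: `resolve_repo_kind` is a hypothetical name, and silently ignoring an invalid value in the config file is an assumption, since the source only specifies the exit code for the flag.

```python
import sys

# Values listed for --repo-kind in the parameter table.
VALID_REPO_KINDS = (
    "skills", "app", "library", "docs", "monorepo",
    "python", "rust", "go", "unknown",
)


def resolve_repo_kind(flag_value, config_value, detected):
    """Pick the repo kind: flag first, then per-repo config, then detection."""
    if flag_value is not None:
        if flag_value not in VALID_REPO_KINDS:
            print(f"invalid --repo-kind: {flag_value}", file=sys.stderr)
            raise SystemExit(2)  # documented behavior for an invalid flag value
        return flag_value
    # Assumption: an invalid or absent config value falls through to detection.
    if config_value in VALID_REPO_KINDS:
        return config_value
    return detected
```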

The five phases

Phase 1 — Scope

Runs `scripts/scope.py`. Deterministic, no LLM. Outputs `scope.json`:

Diff resolution: clean tree → `scripts/resolve_base.sh` ladder; dirty tree → `git diff HEAD` + every untracked file inlined as added lines.

Repo-kind classification: 8 kinds (`skills`, `app`, `library`, `docs`, `monorepo`, `python`, `rust`, `go`) + `unknown`. Override via `--repo-kind` or `.code-ultrareview.yaml`.

CLAUDE.md chain: root `CLAUDE.md` + nested `CLAUDE.md` in changed directories + `.claude/rules/*.md` + `~/.claude/rules/*.md`. Ordered root-to-deepest. Read by axis reviewers and validators.

Coherence activation: any of `package.json`, `.claude-plugin/marketplace.json`, `marketplace.json`, `SKILL.md`, root `README.md`, `tsconfig.json`, `pyproject.toml`, `Cargo.toml`, `go.mod` in the diff → `scope.json["activates_coherence"] = true`.

Languages detection: from changed-file extensions; drives Phase 2 dispatch.

The output also feeds the report header lines `Repo: <kind>`, `Base: <ref>`, `Files: <N>`.
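The Coherence activation rule above is a simple membership test over the changed paths. A minimal sketch follows, assuming that the listed files match by basename at any depth while `README.md` counts only at the repository root; the real matching logic in `scripts/scope.py` is not shown in this document, and `activates_coherence` as a function name is hypothetical.

```python
from pathlib import PurePosixPath

# Basenames from the activation list above. Matching at any depth is an
# assumption; ".claude-plugin/marketplace.json" is covered by its basename.
METADATA_BASENAMES = {
    "package.json", "marketplace.json", "SKILL.md", "tsconfig.json",
    "pyproject.toml", "Cargo.toml", "go.mod",
}


def activates_coherence(changed_paths):
    """Return True when any changed path is one of the metadata files."""
    for raw in changed_paths:
        path = PurePosixPath(raw)
        if path.name in METADATA_BASENAMES:
            return True
        # README.md only counts at the repository root.
        if str(path) == "README.md":
            return True
    return False
```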

Phase 2 — Tool battery

Runs `scripts/run_battery.sh`. Deterministic CLIs feed `tool-findings.jsonl`, tagged by axis with `confidence: 100`. Tools dispatch per `scope.json["languages"]`: npx-wrapped (`knip`, `jscpd`, `markdownlint-cli2`, `@microsoft/api-extractor`) and uvx-wrapped (`lizard`, `vulture`, `semgrep`, `vale`) tools run zero-install; native binaries (`oasdiff`, `atlas`, the Go tools `deadcode`, `gocyclo`, `dupl`, and `cargo-machete`) fall back to PATH. Bundled `references/perf-rules/` carries the universal N+1 and sync-I/O semgrep rules. Per-tool axis routing lives in `scripts/battery_ingest.py`. Full Tool → Axis → Install table in `README.md`.

Graceful skip. Missing tools emit `WARN: <tool> not found — install: <command>` to stderr and append to `scope.json["tools_skipped"]`; the skill continues. The battery NEVER auto-installs: no `brew`, `cargo`, `go`, `pip`, or `npm` install runs.

Phase 2 extension: `--mutation-test`. `scripts/run_mutation.sh` dispatches Stryker (JS/TS), mutmut (Python), or pitest-maven (JVM) scoped to changed files only. Surviving mutants route to the Tests axis as 🟠 Medium with `confidence: 100` (skips Phase 4 validators). Runtime can exceed 10 minutes per language; the default 600 s timeout is overridable via `MUTATION_TIMEOUT`. Graceful skip on missing tool or config. Details: `references/ultra-execution.md`.
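The graceful-skip behavior for PATH binaries can be pictured with a short sketch. The battery itself is a shell script, so this Python version is only an illustration: `check_tool` is a hypothetical helper, and storing the bare tool name in `tools_skipped` is an assumption about the entry format.

```python
import shutil
import sys


def check_tool(tool, install_command, scope):
    """Return True if tool is on PATH; otherwise warn, record it, and move on."""
    if shutil.which(tool) is not None:
        return True
    # Same message shape as the WARN line documented above; nothing is installed.
    print(f"WARN: {tool} not found — install: {install_command}", file=sys.stderr)
    scope.setdefault("tools_skipped", []).append(tool)
    return False
```

The function never raises on a missing tool, so the caller keeps iterating over the remaining tools and the skipped names surface later in the report.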

Phase 3 — Axis review

The orchestrator prepares 8 per-axis bundles (+ Coherence when active) via `scripts/axis_dispatch.py prepare`, then launches every bundle as a parallel `Explore` `Task` in one message. Each subagent reads its axis brief, the rubric in `references/anthropic-verbatim.md`, the diff, and its filtered tool findings. Each emits canonical-schema JSONL on stdout. Subagents cannot spawn other subagents, so the main thread launches both axis reviewers AND validators.

The 8 always-on axes: Correctness · Simplification · Tests · Documentation · Style · Intent · Design/API · Performance. Each maps to `references/axes/<name>.md` for scope + repo-kind branches. Coherence is the conditional 9th, added when `scope.json["activates_coherence"]` is true; when inactive, the header surfaces `Coherence axis: inactive` so the absence is visible. Full axis map, inter-axis precedence, and orchestration details (prepare CLI, bundle schema, no-silent-failure contract): `references/axes-overview.md`, `references/orchestration.md`.
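Which axes launch in a given run follows from the scope output and the optional `--axes` subset. The sketch below is illustrative only: the lowercase axis slugs other than `correctness`, `tests`, `design-api`, and `performance` are guesses, `select_axes` is a hypothetical name, and treating Coherence as excluded from an explicit subset unless listed is an assumption.

```python
# Slugs for the 8 always-on axes; most spellings here are assumed.
ALWAYS_ON_AXES = [
    "correctness", "simplification", "tests", "documentation",
    "style", "intent", "design-api", "performance",
]


def select_axes(scope, subset=None):
    """Return the axes to launch, in the documented order."""
    axes = list(ALWAYS_ON_AXES)
    if scope.get("activates_coherence"):
        axes.append("coherence")  # conditional 9th axis
    if subset is not None:
        wanted = {name.strip().lower() for name in subset.split(",")}
        axes = [axis for axis in axes if axis in wanted]
    return axes
```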

Phase 4 — Validation

The orchestrator prepares per-finding validator bundles via `scripts/run_validators.py prepare`, then launches one Haiku `Task` per finding in the same message, batched ≤10 parallel. Each validator receives the finding + diff context + the deepest matching CLAUDE.md snippet + the verbatim rubric, re-scores 0-100, and re-checks that the cited CLAUDE.md rule actually exists in `claude_md_chain` (demotes with `CLAUDE.md rule not found at <path>` if not).

Confidence threshold = 80 (`scripts/synthesis_core.py:CONFIDENCE_THRESHOLD`). Tool-battery findings (confidence 100) skip the validator phase because they are deterministic. Validators stay read-only: no Write / Edit / Bash, no nested subagent spawn.

Typical runtime. 5-15 sub-80 findings → one batch → ~30-60s. 25+ findings spread over 2-3 batches stay under ~2 min. Latency is dominated by Haiku launch overhead, not inference.

A2 contract. No sub-80 finding silently dropped. Each one is promoted to ≥80, demoted with reason, or surfaced in `### ⚠️ Unverified` with the validator's reason text.

Validator prepare CLI, bundle schema, ingest pass details: `references/orchestration.md`.
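The selection and batching step can be sketched as follows. This assumes each finding is a dict with a numeric `confidence` field; the helper names `needs_validation` and `batch_findings` are hypothetical and do not describe the actual `scripts/run_validators.py` interface.

```python
CONFIDENCE_THRESHOLD = 80  # value documented above
MAX_PARALLEL = 10          # validators launch in batches of at most 10


def needs_validation(finding):
    """Sub-80 findings go to validators; tool findings at 100 skip them."""
    return finding["confidence"] < CONFIDENCE_THRESHOLD


def batch_findings(findings, size=MAX_PARALLEL):
    """Split the sub-80 findings into launch batches of at most `size`."""
    pending = [f for f in findings if needs_validation(f)]
    return [pending[i:i + size] for i in range(0, len(pending), size)]
```

With 25 sub-80 findings this yields batches of 10, 10, and 5, which lines up with the 2-3 batch estimate in the runtime note.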

Phase 3.5: `--verify-build`. Build verification runs BEFORE validators via `scripts/run_build_verify.py` (composing `scripts/build_detect.py` and `synthesis_core.iterate_unverified`; +30 confidence, cap 95, floor 80). Sub-80 findings on `correctness` / `tests` / `design-api` / `performance` get promoted past the validator phase when the build fails. Other axes pass through unchanged. Details: `references/orchestration.md`.
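The promotion arithmetic (+30, cap 95, floor 80) can be made concrete with a small sketch. How the cap and floor compose is an assumption here (floor applied first, then cap), and `promote_on_build_failure` is a hypothetical name rather than the actual `synthesis_core` API.

```python
BUILD_VERIFIED_AXES = {"correctness", "tests", "design-api", "performance"}


def promote_on_build_failure(finding, build_failed):
    """Return the finding with boosted confidence when the build confirms it."""
    eligible = (
        build_failed
        and finding["axis"] in BUILD_VERIFIED_AXES
        and finding["confidence"] < 80
    )
    if not eligible:
        return finding  # other axes and passing builds pass through unchanged
    boosted = min(95, max(80, finding["confidence"] + 30))
    return {**finding, "confidence": boosted}
```

Under these assumptions a finding at 40 lands on the floor (80), one at 55 becomes 85, and one at 70 is capped at 95.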

Phase 5 — Synthesis

Runs `scripts/synthesize.py` on top of `scripts/synthesis_core.py` primitives:

Dedup by `(location, finding-text)`.
Inter-axis precedence: when 2+ axes flag the same `file:line` with the same finding wording, highest severity wins; ties resolve via `Correctness > Design/API > Simplification > Tests > Documentation > Style > Intent > Performance > Coherence` (`scripts/synthesis_core.py:AXIS_PRIORITY`). Distinct findings at coincident lines (a Correctness null-deref and a Tests missing-assert on the same line) survive as separate entries.
A2 routing: sub-80 stays in Unverified with the validator's reason.

Verdict: `Ship` / `Fix-then-ship` / `Needs work` (`scripts/synthesis_core.py:compute_verdict`).

Report emission: markdown to terminal, plus `~/.claude/output/{project}/code-ultrareview/code-ultrareview-{slug}.md` when `-s` is set. JSONL alongside with Conventional Comments labels (`issue` / `suggestion` / `nitpick` / `question`).

The closing "What I did NOT check" section is mandatory and always present, even when nothing was skipped. It lists security (defers to `/security-review`), runtime performance / benchmarks (explicit non-goal), flaky test detection (explicit non-goal), and any tools from `scope.json["tools_skipped"]`.
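The dedup and inter-axis precedence rules can be sketched in a few lines. This is a hedged illustration, not the contents of `scripts/synthesis_core.py`: the finding fields (`location`, `text`, `axis`, `severity`), the lowercase axis slugs, and the severity names are assumptions.

```python
# Axis order from the precedence rule above, highest priority first.
AXIS_PRIORITY = [
    "correctness", "design-api", "simplification", "tests", "documentation",
    "style", "intent", "performance", "coherence",
]
SEVERITY_RANK = {"high": 3, "medium": 2, "low": 1}


def _rank(finding):
    """Higher tuple wins: severity first, then axis priority as tiebreak."""
    return (
        SEVERITY_RANK[finding["severity"]],
        -AXIS_PRIORITY.index(finding["axis"]),
    )


def dedup(findings):
    """Keep one finding per (location, text); distinct texts both survive."""
    best = {}
    for finding in findings:
        key = (finding["location"], finding["text"])
        current = best.get(key)
        if current is None or _rank(finding) > _rank(current):
            best[key] = finding
    return list(best.values())
```

Keying on the finding text as well as the location is what lets a null-deref and a missing-assert on the same line remain separate entries.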

Final report layout

`templates/code-ultrareview.md` is the canonical wire format: every `##` section renders verbatim in template order with its emoji prefix, with no renaming, merging, reordering, or improvising. Terminal echo is mandatory: the full canonical report prints to the chat-terminal on every invocation; `-s` is purely additive (writes the same bytes to `~/.claude/output/{project}/code-ultrareview/code-ultrareview-{slug}.md`, byte-for-byte identical to terminal output). Severity marker mapping (🔴 High blocks ship · 🟠 Medium fix-soon · 🟢 Low nit · ⚠️ Unverified sub-80) lives in `scripts/synthesis_core.py:SEVERITY_MARKERS`.

Trust model

The skill ingests third-party content (CLAUDE.md files, PR bodies, planning artifacts via `--reconcile`, GitHub issue bodies), any of which can carry indirect prompt-injection. Axis reviewers and validators are read-only (no Write / Edit / Bash mutation). User review of the report is the trust boundary before any `--apply-safe` write; `--apply-safe` itself gates writes behind diff preview + per-file confirmation.

Rules

Only new findings. Issues the diff introduces. Pre-existing findings carry the `Pre-existing` tier for context and never flip the verdict.
No silent drop (A2). Sub-80 findings surface in `### ⚠️ Unverified` with rationale `Sub-80 confidence ({score}) — verify locally before action.`
Fail loud. A phase that cannot run (unresolvable base, missing tool with no skip path, dependency failure) appears in the header or as a finding. Never silent.
Cite precisely. Every finding carries `file:line`; CLAUDE.md findings quote the violated rule verbatim; permalinks use `https://github.com/<owner>/<repo>/blob/<full-sha>/<path>#L<n>-L<m>` (full SHA via `git rev-parse HEAD`).
Full report in chat every time. The complete report prints to the terminal on every invocation. `-s` writes the same bytes to disk; it never gates or summarises chat output.
NEVER auto-install tools. Missing tools surface install commands in the report and `scope.json["tools_skipped"]`. The user installs them.
NEVER modify code without `--apply-safe`. Default is read-only review. `--apply-safe` writers are surgical and per-file confirmed.
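The permalink shape from the "Cite precisely" rule can be assembled mechanically. A minimal sketch, assuming the caller already knows the owner and repository name; `head_sha` and `permalink` are hypothetical helper names, and the length check (40 hex characters for SHA-1, 64 for SHA-256 repositories) is an added safeguard rather than something the source specifies.

```python
import subprocess


def head_sha(repo_dir="."):
    """Return the full commit SHA of HEAD via `git rev-parse HEAD`."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def permalink(owner, repo, sha, path, start, end):
    """Build a line-range permalink pinned to a full commit SHA."""
    if len(sha) not in (40, 64):
        raise ValueError("permalinks need the full SHA, not an abbreviation")
    return f"https://github.com/{owner}/{repo}/blob/{sha}/{path}#L{start}-L{end}"
```

Pinning to the full SHA keeps the link pointing at the reviewed lines even after the branch moves.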

Deferrals

The closing "What I did NOT check" section always names these — explicit user-facing calibration of coverage:

Security → `/security-review` or Anthropic's `claude-code-security-review`. Security is a distinct concern with its own deeper review pattern.
Runtime performance / benchmarks → not covered. The Performance axis catches static patterns (N+1, sync I/O) but not runtime profiling.
Flaky test detection → not covered. The Tests axis catches structural smells, not flake.
Tools from `scope.json["tools_skipped"]` → listed explicitly so the user sees what they sacrificed by not installing the native binaries.

Graceful degradation

No CLAUDE.md / no `.claude/rules`: Style axis runs without baseline; the report says `Style axis: skipped — no rules baseline found`.
No `npx` / no `uvx`: every wrappable tool skips; only PATH binaries run.
Missing native binary (`oasdiff`, `atlas`, `cargo-machete`, Go tools): emits to stderr + `scope.json["tools_skipped"]`. The relevant axis loses its tool input but still runs LLM judgment.
Unresolvable base: fail loud with the resolver's hint line. Do not guess.
Unknown repo_kind: axes run with their `unknown` branch (no specialization).
Coherence inactive: when no metadata files change, the 9th axis simply does not launch. The report header says `Coherence axis: inactive` so the absence is visible.

Composition

Bridge to the fix pass after the report ships:

`/apex -f ~/.claude/output/{project}/code-ultrareview/code-ultrareview-{slug}.md`: structured fix pass (requires `-s`; pass the absolute path the report prints).

`/oneshot "<finding>"`: single-finding quick fix (takes a description, not a file).

Opt-in flag composition

The four opt-in flags layer orthogonally on the always-on pipeline: mutation tests join Phase 2 tool findings; `--verify-build` adds Phase 3.5; `--reconcile` enriches the Intent axis in Phase 3; `--apply-safe` writers run post-synthesis with diff preview + per-file confirmation. Without a flag, its feature is off. Full composition matrix and per-flag details: `references/ultra-execution.md`.

What this skill is NOT

Not a security audit. Defers to `/security-review`. The closing section makes this explicit on every report.
Not a linter or formatter. The deterministic tool battery (npx/uvx-wrapped CLIs) handles linting and dead-code detection. The skill layers LLM judgment on top of those signals.
Not Anthropic's remote `/ultrareview`. Distinct surface: this skill runs in-session on the user's subscription; `/ultrareview` runs in a remote sandbox and bills per run.
Not a fix tool. Report-only by default. `--apply-safe` covers three surgical writers; everything else routes to `/apex` or `/oneshot`.

References

Reviewer primitives: `references/anthropic-verbatim.md` (rubric + HIGH SIGNAL + false-positive taxonomy), `references/axes-overview.md` (8 axes + Coherence + inter-axis precedence), `references/axes/<name>.md` (per-axis briefs), `references/orchestration.md` (Phase 3 + 4 + 3.5 prepare CLIs and bundle schemas).

Opt-in flags: `references/ultra-execution.md` covers `--verify-build`, `--mutation-test`, `--reconcile`, and `--apply-safe` in full.

Scripts: `scope.py` (Phase 1), `run_battery.sh` and `battery_ingest.py` (Phase 2), `axis_dispatch.py` (Phase 3), `run_validators.py` (Phase 4), `synthesize.py`, `synthesis_core.py`, and `findings_to_jsonl.py` (Phase 5). Opt-in: `run_build_verify.py`, `run_mutation.sh`, `derivation/run.py`, `apply_safe/{version_sync,description_sync,failing_test_writer}.py`.

Gotchas

Sub-80 findings can be dropped instead of surfaced in `### ⚠️ Unverified`. The A2 contract (`scripts/synthesis_core.py:apply_a2`) is no-silent-drop. The model sometimes treats a sub-80 score as a rejection signal and omits the finding entirely. Fix: scan the `### ⚠️ Unverified` section explicitly on every report; compare finding count to axis output to catch drops.
First-run `npx` / `uvx` downloads add latency. Cold start adds ~5s per tool the first time the battery touches it; subsequent runs are fast (cached at `~/.npm/_npx` / `~/.cache/uv`). The README install table documents this so users don't fear repeated downloads.
Coherence activates silently on metadata changes. A single `package.json` touch triggers the 9th subagent automatically. Watch for the `Coherence axis: active` line in the report header: it tells you the axis ran without you asking.
`--reconcile @auto` skips silently on malformed planning artifacts. A forge or apex file with broken YAML frontmatter (unclosed `---`, tab indentation, unquoted colons) is dropped from the auto-detect list. Verify with `head -20 ~/.claude/output/{project}/forge/forge-*.md` before relying on `@auto`.
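The drop check from the first gotcha (compare the axis output against what the report contains) could be automated along these lines. This is a sketch under assumptions: the JSONL field names `file`, `line`, and `text` are hypothetical, since the canonical schema is not reproduced in this document, and `dropped_findings` is not an existing script.

```python
import json


def finding_key(record):
    """Identity used for comparison; field names are assumed, not canonical."""
    return (record["file"], record["line"], record["text"])


def _parse(jsonl):
    """Parse JSONL text into a list of records, skipping blank lines."""
    return [json.loads(line) for line in jsonl.splitlines() if line.strip()]


def dropped_findings(axis_jsonl, report_jsonl):
    """Return axis findings that appear nowhere in the report output."""
    reported = {finding_key(record) for record in _parse(report_jsonl)}
    return [r for r in _parse(axis_jsonl) if finding_key(r) not in reported]
```

Because duplicates removed by dedup share a key with the surviving entry, they are not reported as dropped; only findings missing from every report section show up.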
