autistic-code-review

Autistic Code Review

Goal

Audit an implementation end-to-end, with or without a formal plan, and produce a defensible review with evidence from code, diffs, tests, and manual UI verification.

When to use

Use this skill when the user asks for a broad post-implementation review such as:
  • comparing implementation to an attached plan or handoff
  • reviewing uncommitted or committed changes for regressions and bugs
  • manually verifying front-end behavior with Playwright and/or agent-browser
  • assessing strategic implementation quality, not only local correctness
  • identifying test coverage gaps, adding tests, and running suites across application and database layers

Entry criteria

Check these preconditions before deep review:
  • repo scope is clear (`cwd`, target project, and base branch/range known)
  • change scope is available (`git status`/`git diff` or explicit commit range)
  • runnable environment exists for intended checks (tests/build/dev server as needed)
  • UI verification prerequisites are known (auth path, test user/role, seed state)
  • DB review prerequisites are known when relevant (local DB state, migration order, reset/test commands)
  • test command set is known (`npm test`/`vitest`, `supabase test db`, and any targeted commands)
If any criterion fails, continue with available lanes and clearly report blocked coverage.
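
One way to make the "continue with available lanes" rule concrete is to record which review lanes each precondition gates. The sketch below is illustrative only: the criterion keys, the lane names, and the criterion-to-lane mapping are assumptions, not something this skill defines.

```python
from __future__ import annotations

# Hypothetical mapping from entry criteria to the review lanes they gate.
# Keys and lane names are illustrative assumptions.
CRITERION_LANES = {
    "repo_scope": ["alignment", "technical", "strategic", "ui", "tests"],
    "change_scope": ["alignment", "technical", "strategic"],
    "runnable_env": ["ui", "tests"],
    "ui_prereqs": ["ui"],
    "db_prereqs": ["db"],
    "test_commands": ["tests"],
}


def plan_lanes(criteria_met: dict[str, bool]) -> tuple[list[str], dict[str, list[str]]]:
    """Return (available lanes, blocked lanes with the criteria that block them)."""
    all_lanes = sorted({lane for lanes in CRITERION_LANES.values() for lane in lanes})
    blocked: dict[str, list[str]] = {}
    for criterion, lanes in CRITERION_LANES.items():
        # A criterion missing from the input is treated as unmet.
        if not criteria_met.get(criterion, False):
            for lane in lanes:
                blocked.setdefault(lane, []).append(criterion)
    available = [lane for lane in all_lanes if lane not in blocked]
    return available, blocked
```

The `blocked` mapping can be copied into the report as the blocked-coverage statement, so a skipped lane is never silent.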

Inputs

Gather the following before review:
  1. Intention source (preferred in this order):
    • `.plan.md` file path, or
    • pasted implementation/handoff text in the prompt, or
    • no-plan mode (derive expected behavior from changed files, tests, docs, and commit/diff context)
  2. Change scope:
    • uncommitted (`git status`, `git diff`), or
    • committed range (`git diff <base>...HEAD`)
  3. UI scope:
    • routes/pages to verify, pulled from plan, tests, docs, and changed files
  4. Test scope:
    • app-layer test framework/commands
    • DB-layer test framework/commands (for example pgTAP via `supabase test db`)
If any item is missing and blocks execution, ask one short question. Otherwise, state assumptions and proceed.

Review modes

Select one base mode explicitly at the start of the review, and add `self-review` on top of it when it applies:
  1. `plan` mode
    • Use when a `.plan.md` is available.
    • Evaluate strict plan-to-implementation alignment.
  2. `handoff` mode
    • Use when only prompt/handoff intent is available.
    • Evaluate claim-to-implementation alignment.
  3. `no-plan` mode
    • Use when no plan/handoff is provided.
    • Skip strict alignment claims and focus on correctness, regressions, UX behavior, coverage, and strategy quality.
  4. `self-review` mode
    • Use when the same agent that implemented changes performs the review.
    • Treat prior assumptions as untrusted and require diff/test/UI evidence for every claim.
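
The selection rule can be written as a small function. This is a minimal sketch: the function name and parameters are hypothetical, and it assumes the preference order given under Inputs (plan file, then handoff text, then no-plan).

```python
from typing import Optional


def select_review_mode(plan_path: Optional[str], handoff_text: Optional[str],
                       same_agent_implemented: bool = False) -> str:
    """Pick the base mode from the best available intention source.

    Self-review is appended as an overlay rather than replacing the base mode.
    """
    if plan_path:
        mode = "plan"
    elif handoff_text and handoff_text.strip():
        mode = "handoff"
    else:
        mode = "no-plan"
    return f"{mode} + self-review" if same_agent_implemented else mode
```

For example, `select_review_mode(None, "handoff notes", True)` yields `handoff + self-review`, matching the `Review mode` line of the output template.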

Parallel subagents

Run parallel subagents with explicit, non-overlapping responsibilities:
  1. `plan-alignment-reviewer`
    • Build an intention-to-evidence matrix from plan/handoff claims.
    • Verify each claim against actual file diffs.
    • Flag missing, partial, or extra implementation.
  2. `ui-verification-reviewer`
    • Perform manual UI checks using Playwright or agent-browser.
    • Validate key user paths and permissions/role gating.
    • Record pass/fail with exact route and observed behavior.
  3. `technical-risk-reviewer`
    • Perform code review on changed files.
    • Prioritize bugs, regressions, data/permission risks, and design-level defects.
    • Include file references and concrete failure modes.
  4. `strategic-reviewer`
    • Evaluate architecture and implementation strategy.
    • Identify coupling, migration safety gaps, maintainability risks, and scalability concerns.
    • Suggest alternatives only when they materially reduce risk.
  5. `test-coverage-reviewer`
    • Determine test coverage for changed behavior across app and DB layers.
    • Identify missing tests and high-risk untested paths.
    • Suggest and/or create targeted tests to close gaps.
    • Run relevant suites and report results with command evidence.

Subagent output contract

Require each subagent to return this exact structure:
  • `findings`: severity-ranked items with file references when applicable
  • `evidence`: concrete observations (diff snippet summary, command result, UI observation)
  • `confidence`: `high | medium | low` per finding
  • `unverified_assumptions`: assumptions that could change conclusions
  • `blocked_items`: what could not be validated and why
Reject subagent output that is opinion-only or lacks evidence.
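
A contract like this is easy to check mechanically before consolidation. The sketch below assumes each subagent returns a dictionary whose fields are lists, with one confidence label per finding in the same order; that shape and the function name are assumptions made for illustration.

```python
from __future__ import annotations

REQUIRED_KEYS = ("findings", "evidence", "confidence",
                 "unverified_assumptions", "blocked_items")
CONFIDENCE_LEVELS = {"high", "medium", "low"}


def contract_violations(output: dict) -> list[str]:
    """Return the reasons to reject a subagent output (empty list means accept)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in output]
    if problems:
        return problems
    findings = output["findings"]
    # Opinion-only output: findings are claimed but nothing was observed.
    if findings and not output["evidence"]:
        problems.append("findings present without evidence")
    # Assumed shape: one confidence label per finding.
    if len(output["confidence"]) != len(findings):
        problems.append("confidence must be given per finding")
    for level in output["confidence"]:
        if level not in CONFIDENCE_LEVELS:
            problems.append(f"invalid confidence: {level}")
    return problems
```

A non-empty result is the message to send back with the retry request.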

UI coverage matrix

Build and execute a minimal matrix:
  • persona/role x route/page x key action x expected result
  • include at least one happy path and one negative/permission-boundary path per protected area
  • include a navigation/gating check (route guard, menu visibility, or access denial behavior)
  • record each matrix row as `pass`, `fail`, or `blocked`
When blocked, capture exact blocker and the attempted step.
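
The minimum-coverage rules above can be checked against the matrix before it is executed. This sketch is one possible shape: the `UiRow` fields, the `kind` labels, and the idea of identifying a protected area by route prefix are all assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class UiRow:
    role: str
    route: str
    action: str
    expected: str
    kind: str        # "happy", "negative", or "gating" (labels are assumptions)
    status: str      # "pass", "fail", or "blocked"
    note: str = ""   # observed result, or blocker plus attempted step


def matrix_gaps(rows: list[UiRow], protected_areas: dict[str, str]) -> list[str]:
    """List the minimum-coverage rules that the matrix does not yet satisfy.

    protected_areas maps an area name to the route prefix that identifies it.
    """
    gaps = []
    for area, prefix in protected_areas.items():
        kinds = {row.kind for row in rows if row.route.startswith(prefix)}
        for needed in ("happy", "negative"):
            if needed not in kinds:
                gaps.append(f"{area}: no {needed} path")
    if not any(row.kind == "gating" for row in rows):
        gaps.append("no navigation/gating check")
    for row in rows:
        if row.status == "blocked" and not row.note:
            gaps.append(f"{row.route}: blocked without blocker detail")
    return gaps
```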

Test coverage matrix

Build and execute a minimal matrix:
  • changed component/module/function/table/DB function/RPC x existing tests x gap x action
  • app layer: unit/integration tests for changed behavior and boundary cases
  • DB layer: pgTAP (or equivalent) coverage for changed tables, policies, functions, and permissions
  • include at least one negative path for each changed permission-sensitive behavior
Action values:
  • `covered` (existing tests already sufficient)
  • `add-tests` (write targeted tests)
  • `deferred` (cannot safely add in scope; justify)
When `add-tests` is chosen, create focused tests and run affected suites.
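
The matrix rows and action values can be kept as structured data so that inconsistent rows are caught early. This is a sketch under assumptions: the `CoverageRow` fields and the consistency rules (a `covered` row must name tests, a `deferred` row must carry a justification) are an interpretation of the text above, not a defined format.

```python
from __future__ import annotations

from dataclasses import dataclass

ACTIONS = {"covered", "add-tests", "deferred"}


@dataclass
class CoverageRow:
    changed_item: str          # component, module, table, policy, RPC, ...
    layer: str                 # "app" or "db"
    existing_tests: list[str]
    gap: str
    action: str
    justification: str = ""    # expected when action is "deferred"


def coverage_problems(rows: list[CoverageRow]) -> list[str]:
    """Flag rows whose action contradicts the rest of the row."""
    problems = []
    for row in rows:
        if row.action not in ACTIONS:
            problems.append(f"{row.changed_item}: unknown action {row.action!r}")
        elif row.action == "covered" and not row.existing_tests:
            problems.append(f"{row.changed_item}: marked covered but lists no tests")
        elif row.action == "deferred" and not row.justification:
            problems.append(f"{row.changed_item}: deferred without justification")
    return problems
```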

Workflow

  1. Establish scope and evidence
    • Determine review mode (`plan`, `handoff`, or `no-plan`) and whether review is `self-review`.
    • Read plan/handoff text when provided.
    • Enumerate changed files and classify by area (DB/schema, server, client, tests/docs).
    • Derive expected outcomes from the best available intention source for the selected mode.
  2. Validate entry criteria and set timebox
    • Confirm entry criteria; note any missing prerequisites.
    • Set a review timebox and prioritize critical paths first (permissions, data integrity, primary UI flows, high-risk untested changes).
  3. Dispatch the five subagents in parallel
    • Provide each subagent only the context needed for its lane.
    • Require each subagent to return contract-compliant output.
  4. Run UI verification explicitly
    • Start from user-visible flows (routes, nav, forms, role-conditional UI).
    • Verify both happy path and at least one negative/permission boundary path.
    • When blocked (auth, env, seed data), report blocker and partial coverage.
  5. Run DB/migration checklist when schema or SQL changed
    • check RLS/policy behavior against intended access model
    • check migration safety (ordering, idempotency where relevant, rollback feasibility)
    • check grants/privileges drift and RPC exposure changes
    • check seed/test/type-generation consistency with schema changes
  6. Close test coverage gaps
    • map changed behaviors to existing tests (app + DB)
    • create targeted tests for high-risk uncovered behavior where feasible
    • run relevant app-layer and DB-layer suites
    • capture exact commands and pass/fail output summary
  7. Consolidate findings
    • de-duplicate overlaps across subagents
    • convert raw notes into severity-ranked findings
    • separate confirmed defects from open questions
  8. Deliver review result
    • findings first (highest severity first)
    • then alignment/reconstruction matrix, UI status, coverage status, technical analysis, strategic analysis, artifacts, and verdict
    • if timebox expires or blockers remain, provide partial verdict with explicit coverage gaps
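
The dispatch step (run five lanes in parallel, require contract-compliant output, retry otherwise) might be orchestrated roughly as follows. This is a sketch, not the skill's implementation: `run_subagent` and `is_valid` are hypothetical callables supplied by the caller, and the retry limit of two attempts is an arbitrary assumption.

```python
import asyncio

LANES = ["plan-alignment-reviewer", "ui-verification-reviewer",
         "technical-risk-reviewer", "strategic-reviewer", "test-coverage-reviewer"]


async def run_lane(lane, run_subagent, is_valid, max_attempts=2):
    """Run one lane, retrying when the output fails the contract check."""
    for attempt in range(1, max_attempts + 1):
        output = await run_subagent(lane, attempt)
        if is_valid(output):
            return lane, output
    # Keep the lane visible in the report instead of dropping it silently.
    return lane, {"blocked_items": [f"{lane}: no contract-compliant output"]}


async def dispatch(run_subagent, is_valid):
    """Run all lanes concurrently and return a lane -> output mapping."""
    results = await asyncio.gather(
        *(run_lane(lane, run_subagent, is_valid) for lane in LANES)
    )
    return dict(results)
```

A lane that never produces valid output still appears in the result with a `blocked_items` entry, which feeds the partial-verdict rule in the last workflow step.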

Severity model

Use this priority scale:
  • `P0`: release-blocking correctness or security issue
  • `P1`: high-risk bug/regression likely to affect production behavior
  • `P2`: meaningful correctness/maintainability/test gap
  • `P3`: minor issue or improvement opportunity
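
When consolidating, this scale gives a direct sort order. The sketch below assumes findings are dictionaries with `severity`, `title`, and `file` keys and that two findings are duplicates when file and title match; both choices are illustrative assumptions.

```python
from __future__ import annotations

SEVERITY_ORDER = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}


def rank_findings(findings: list[dict]) -> list[dict]:
    """Drop duplicate findings and sort the rest from P0 down to P3."""
    unique: dict[tuple, dict] = {}
    for finding in findings:
        # Assumed de-duplication key: same location and same title.
        key = (finding["file"], finding["title"])
        kept = unique.get(key)
        # When lanes disagree on severity, keep the more severe rating.
        if kept is None or SEVERITY_ORDER[finding["severity"]] < SEVERITY_ORDER[kept["severity"]]:
            unique[key] = finding
    return sorted(unique.values(), key=lambda f: SEVERITY_ORDER[f["severity"]])
```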

Sign-off gates

Apply these gates before issuing the final verdict:
  • do not return `aligned` if any open `P0` or `P1` exists
  • do not return `aligned` when critical UI flows are `blocked` without mitigation evidence
  • do not return `aligned` when DB/migration changes were made but DB checklist was skipped
  • do not return `aligned` when high-risk changed behavior has unresolved coverage gaps or failing tests
  • in `no-plan` mode, return `no-plan reviewed` (never strict `aligned`)
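
The gates translate into a short decision function. This is a minimal sketch with hypothetical field names. The gates only say when `aligned` is forbidden, so the split between `partially aligned` and `not aligned` used here (an open `P0` means `not aligned`) is an assumption of the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ReviewState:
    mode: str                           # "plan", "handoff", or "no-plan"
    open_severities: list[str]          # severities of unresolved findings
    critical_ui_blocked: bool = False   # blocked with no mitigation evidence
    db_changed: bool = False
    db_checklist_done: bool = False
    high_risk_gaps_or_failures: bool = False
    all_claims_evidenced: bool = True


def verdict(state: ReviewState) -> str:
    """Apply the sign-off gates and return one of the allowed verdict values."""
    if state.mode == "no-plan":
        return "no-plan reviewed"
    gate_failed = (
        any(sev in ("P0", "P1") for sev in state.open_severities)
        or state.critical_ui_blocked
        or (state.db_changed and not state.db_checklist_done)
        or state.high_risk_gaps_or_failures
    )
    if not gate_failed and state.all_claims_evidenced:
        return "aligned"
    # Assumption: an open P0 maps to "not aligned", anything else to "partially aligned".
    return "not aligned" if "P0" in state.open_severities else "partially aligned"
```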

Output template

```markdown
Review target: `<plan path or prompt summary>`
Review mode: `<plan | handoff | no-plan>` (+ `self-review` when applicable)
Change scope: `<uncommitted | commit range>`

Findings:
1. [P1] <title> `<file:line>`
   Evidence: <what was observed>
   Impact: <user/system impact>
   Recommendation: <concrete fix>
1. [P2] <title> `<file:line>`
   Evidence: <what was observed>
   Impact: <user/system impact>
   Recommendation: <concrete fix>

Plan alignment matrix (for `plan`/`handoff` modes):
1. `<planned item>` -> `<implemented evidence>` -> `<aligned | partial | missing | extra>`
1. `<planned item>` -> `<implemented evidence>` -> `<aligned | partial | missing | extra>`

Intent reconstruction matrix (for `no-plan` mode):
1. `<inferred expected behavior>` -> `<implemented evidence>` -> `<confirmed | partial | contradicted>`
1. `<inferred expected behavior>` -> `<implemented evidence>` -> `<confirmed | partial | contradicted>`

UI verification:
1. `<route + area + action>` -> `<pass/fail/blocked>` -> `<observed result>`
1. `<route + area + action>` -> `<pass/fail/blocked>` -> `<observed result>`
Blockers: <none or list>

Test coverage:
1. `<changed behavior>` -> `<existing coverage>` -> `<gap>` -> `<covered | add-tests | deferred>`
1. `<changed behavior>` -> `<existing coverage>` -> `<gap>` -> `<covered | add-tests | deferred>`
Test execution:
- `<command>` -> `<pass/fail>` -> `<key result>`
- `<command>` -> `<pass/fail>` -> `<key result>`

Technical analysis:
- `<top technical risk or confirmation>`
- `<top technical risk or confirmation>`

Strategic analysis:
- `<strategy strength/weakness>`
- `<strategy strength/weakness>`

Review artifacts:
- `<commands run and key outcomes>`
- `<ui evidence: screenshots/log notes or blocker proof>`
- `<coverage summary: tested vs blocked vs deferred>`

Verdict: `<aligned | partially aligned | not aligned | no-plan reviewed>`
Recommended next steps:
1. <step>
1. <step>
```

Guardrails

  • do not mark `aligned` unless plan claims are evidenced in diffs/tests/UI checks
  • in `no-plan` mode, do not claim strict alignment; use verdict `no-plan reviewed`
  • do not bury critical defects under summary text; findings must appear first
  • if UI cannot be fully executed, provide exact blocker and what was still validated
  • if tests cannot be executed, list exact missing prerequisites and impacted confidence
  • prefer concrete, falsifiable statements over broad judgments
  • in `self-review` mode, call out reviewer/implementer overlap and keep evidence thresholds strict
  • enforce subagent output contract; request retries for incomplete outputs
  • if review is partial due to blockers/timebox, say so explicitly in verdict context

Subagent prompt pack

Use these prompts as-is, replacing placeholders.

Parent orchestration prompt

```text
Run autistic-code-review.

Context:
- Review target: <plan path OR handoff summary OR "none">
- Review mode: <plan | handoff | no-plan>
- Self-review: <yes | no>
- Change scope: <uncommitted | commit range>
- Repo/project path: <path>
- UI routes in scope: <route list>
- Test commands in scope: <app commands + DB commands>
- Timebox: <minutes>

Execution requirements:
1) Spawn five parallel subagents:
   - plan-alignment-reviewer
   - ui-verification-reviewer
   - technical-risk-reviewer
   - strategic-reviewer
   - test-coverage-reviewer
2) Enforce this output contract for every subagent:
   - findings
   - evidence
   - confidence
   - unverified_assumptions
   - blocked_items
3) Reject and retry any subagent output that lacks evidence.
4) Require the test-coverage-reviewer to suggest/create tests for uncovered high-risk changes and run relevant suites.
5) Consolidate results into one findings-first report with severity ordering.
6) Apply sign-off gates from the skill and produce a final verdict.
```

Prompt: `plan-alignment-reviewer`

```text
You are the plan-alignment-reviewer.

Inputs:
- Review mode: <plan | handoff | no-plan>
- Intention source: <plan path or handoff text; can be empty in no-plan mode>
- Change scope: <uncommitted | commit range>
- Changed file list/diff summary: <insert>

Tasks:
1) Build an intention-to-evidence matrix from intention claims and actual diffs.
2) For each claim, classify as aligned, partial, missing, or extra.
3) In no-plan mode, produce an intent reconstruction matrix:
   - inferred expected behavior -> implemented evidence -> confirmed/partial/contradicted
4) Flag any claimed work not evidenced in code/tests/docs.

Return exactly:
- findings: severity-ranked issues with file refs
- evidence: specific diff/test/doc observations
- confidence: high/medium/low per finding
- unverified_assumptions: assumptions and why
- blocked_items: what prevented validation
```

Prompt: `ui-verification-reviewer`

```text
You are the ui-verification-reviewer.

Inputs:
- UI scope routes/pages: <insert>
- Personas/roles: <insert>
- Environment/access constraints: <insert>
- Change scope summary: <insert>

Tasks:
1) Use Playwright and/or agent-browser to manually verify UI behavior.
2) Build and execute a coverage matrix:
   - role x route/page x key action x expected result
3) Include at least:
   - one happy path per protected area
   - one negative/permission-boundary path per protected area
   - one gating/navigation check (route guard/menu visibility/access denial)
4) Record each row as pass/fail/blocked with observed result.
5) Capture evidence artifacts (screenshots/log notes) for failures or blockers.

Return exactly:
- findings: severity-ranked UI defects/regressions
- evidence: route-level observations and artifact references
- confidence: high/medium/low per finding
- unverified_assumptions: missing env/auth/data assumptions
- blocked_items: exact blocker + attempted step
```

Prompt: `technical-risk-reviewer`

```text
You are the technical-risk-reviewer.

Inputs:
- Changed files and diff: <insert>
- Related tests/docs/commands run: <insert>
- Review mode and constraints: <insert>

Tasks:
1) Perform a code review focused on:
   - correctness bugs
   - behavioral regressions
   - data integrity and permission risks
   - missing or weak tests
2) If SQL/schema changed, run DB/migration checklist:
   - RLS/policy behavior vs intended access model
   - migration safety, ordering, rollback feasibility
   - grants/privileges/RPC exposure drift
   - seed/test/type-generation consistency with schema changes
3) Prioritize findings by P0-P3 and include file references.

Return exactly:
- findings: severity-ranked technical issues with file refs
- evidence: concrete code/diff/test command observations
- confidence: high/medium/low per finding
- unverified_assumptions: what is assumed but unproven
- blocked_items: checks that could not be completed
```

Prompt: `strategic-reviewer`

```text
You are the strategic-reviewer.

Inputs:
- Implementation summary: <insert>
- Changed areas by layer (db/server/client/tests/docs): <insert>
- Review mode: <insert>

Tasks:
1) Evaluate implementation strategy quality:
   - architecture cohesion and coupling
   - migration/cutover safety and operability
   - maintainability and future change cost
   - scalability and team workflow implications
2) Identify strategic weaknesses and practical alternatives.
3) Recommend only changes that materially reduce risk or complexity.

Return exactly:
- findings: severity-ranked strategic risks/anti-patterns
- evidence: concrete repo or diff observations
- confidence: high/medium/low per finding
- unverified_assumptions: strategic assumptions needing confirmation
- blocked_items: missing context that limits confidence
```

Prompt: `test-coverage-reviewer`

```text
You are the test-coverage-reviewer.

Inputs:
- Changed files and diff: <insert>
- Existing tests in scope: <insert>
- Test commands:
  - app layer: <insert>
  - DB layer (pgTAP or equivalent): <insert>
- Review mode and constraints: <insert>

Tasks:
1) Build a coverage matrix:
   - changed behavior -> existing tests -> gap -> action
2) Identify high-risk untested behavior in app and DB layers.
3) Suggest and create targeted tests to close feasible gaps.
   - app layer: unit/integration tests for changed behavior and boundaries
   - DB layer: pgTAP tests for changed tables/functions/policies/permissions
4) Run relevant test suites after test additions/updates.
5) Report pass/fail and any remaining uncovered high-risk behavior.

Return exactly:
- findings: severity-ranked coverage and test-quality issues
- evidence: coverage matrix + test diffs + command results
- confidence: high/medium/low per finding
- unverified_assumptions: assumptions about environment/data/setup
- blocked_items: tests not run or not creatable and why
```

Consolidation prompt (optional)

```text
Consolidate five subagent outputs into one final review.

Rules:
1) Findings first, highest severity first, deduplicated across lanes.
2) Keep only evidence-backed findings.
3) Include mode-appropriate matrix:
   - plan/handoff -> plan alignment matrix
   - no-plan -> intent reconstruction matrix
4) Include UI verification status, blockers, and coverage summary.
5) Include test coverage matrix, tests added/suggested, and execution results.
6) Apply sign-off gates before verdict.
7) Verdict allowed values:
   - aligned
   - partially aligned
   - not aligned
   - no-plan reviewed
```