Run a full-scale implementation review with parallel subagents for plan alignment, UI verification, technical and strategic analysis, and test coverage gap closure across app and database layers.
npx skill4agent add dolesshq/self-obsolescence autistic-code-review

Terms and commands referenced by this skill:
- Working directory: `cwd`
- Commands: `git status`, `git diff`, `git diff <base>...HEAD`, `npm test`, `vitest`, `supabase test db`
- Plan files: `.plan.md`
- Review modes: `plan`, `handoff`, `no-plan` (+ `self-review`)
- Subagents: `plan-alignment-reviewer`, `ui-verification-reviewer`, `technical-risk-reviewer`, `strategic-reviewer`, `test-coverage-reviewer`
- Subagent output fields: `findings`, `evidence`, `confidence` (`high | medium | low`), `unverified_assumptions`, `blocked_items`
- UI verification status: `pass`, `fail`, `blocked`
- Test coverage status: `covered`, `add-tests`, `deferred`
- Severity levels: `P0`, `P1`, `P2`, `P3`

Review target: `<plan path or prompt summary>`
Review mode: `<plan | handoff | no-plan>` (+ `self-review` when applicable)
Change scope: `<uncommitted | commit range>`
Findings:
1. [P1] <title> — `<file:line>`
   Evidence: <what was observed>
   Impact: <user/system impact>
   Recommendation: <concrete fix>
1. [P2] <title> — `<file:line>`
   Evidence: <what was observed>
   Impact: <user/system impact>
   Recommendation: <concrete fix>
Plan alignment matrix (for `plan`/`handoff` modes):
1. `<planned item>` -> `<implemented evidence>` -> `<aligned | partial | missing | extra>`
1. `<planned item>` -> `<implemented evidence>` -> `<aligned | partial | missing | extra>`
Intent reconstruction matrix (for `no-plan` mode):
1. `<inferred expected behavior>` -> `<implemented evidence>` -> `<confirmed | partial | contradicted>`
1. `<inferred expected behavior>` -> `<implemented evidence>` -> `<confirmed | partial | contradicted>`
UI verification:
1. `<route + area + action>` -> `<pass/fail/blocked>` -> `<observed result>`
1. `<route + area + action>` -> `<pass/fail/blocked>` -> `<observed result>`
Blockers: <none or list>
Test coverage:
1. `<changed behavior>` -> `<existing coverage>` -> `<gap>` -> `<covered | add-tests | deferred>`
1. `<changed behavior>` -> `<existing coverage>` -> `<gap>` -> `<covered | add-tests | deferred>`
Test execution:
- `<command>` -> `<pass/fail>` -> `<key result>`
- `<command>` -> `<pass/fail>` -> `<key result>`
Technical analysis:
- `<top technical risk or confirmation>`
- `<top technical risk or confirmation>`
Strategic analysis:
- `<strategy strength/weakness>`
- `<strategy strength/weakness>`
Review artifacts:
- `<commands run and key outcomes>`
- `<ui evidence: screenshots/log notes or blocker proof>`
- `<coverage summary: tested vs blocked vs deferred>`
Verdict: `<aligned | partially aligned | not aligned | no-plan reviewed>`
Recommended next steps:
1. <step>
1. <step>

Run autistic-code-review.
Context:
- Review target: <plan path OR handoff summary OR "none">
- Review mode: <plan | handoff | no-plan>
- Self-review: <yes | no>
- Change scope: <uncommitted | commit range>
- Repo/project path: <path>
- UI routes in scope: <route list>
- Test commands in scope: <app commands + DB commands>
- Timebox: <minutes>
Execution requirements:
1) Spawn five parallel subagents:
- plan-alignment-reviewer
- ui-verification-reviewer
- technical-risk-reviewer
- strategic-reviewer
- test-coverage-reviewer
2) Enforce this output contract for every subagent:
- findings
- evidence
- confidence
- unverified_assumptions
- blocked_items
3) Reject and retry any subagent output that lacks evidence.
4) Require the test-coverage-reviewer to suggest/create tests for uncovered high-risk changes and run relevant suites.
5) Consolidate results into one findings-first report with severity ordering.
6) Apply sign-off gates from the skill and produce a final verdict.

plan-alignment-reviewer
You are the plan-alignment-reviewer.
Inputs:
- Review mode: <plan | handoff | no-plan>
- Intention source: <plan path or handoff text; can be empty in no-plan mode>
- Change scope: <uncommitted | commit range>
- Changed file list/diff summary: <insert>
Tasks:
1) Build an intention-to-evidence matrix from intention claims and actual diffs.
2) For each claim, classify as aligned, partial, missing, or extra.
3) In no-plan mode, produce an intent reconstruction matrix:
   - inferred expected behavior -> implemented evidence -> confirmed/partial/contradicted
4) Flag any claimed work not evidenced in code/tests/docs.
Return exactly:
- findings: severity-ranked issues with file refs
- evidence: specific diff/test/doc observations
- confidence: high/medium/low per finding
- unverified_assumptions: assumptions and why
- blocked_items: what prevented validation

ui-verification-reviewer
You are the ui-verification-reviewer.
Inputs:
- UI scope routes/pages: <insert>
- Personas/roles: <insert>
- Environment/access constraints: <insert>
- Change scope summary: <insert>
Tasks:
1) Use Playwright and/or agent-browser to manually verify UI behavior.
2) Build and execute a coverage matrix:
   - role x route/page x key action x expected result
3) Include at least:
   - one happy path per protected area
   - one negative/permission-boundary path per protected area
   - one gating/navigation check (route guard/menu visibility/access denial)
4) Record each row as pass/fail/blocked with observed result.
5) Capture evidence artifacts (screenshots/log notes) for failures or blockers.
Return exactly:
- findings: severity-ranked UI defects/regressions
- evidence: route-level observations and artifact references
- confidence: high/medium/low per finding
- unverified_assumptions: missing env/auth/data assumptions
- blocked_items: exact blocker + attempted step

technical-risk-reviewer
You are the technical-risk-reviewer.
Inputs:
- Changed files and diff: <insert>
- Related tests/docs/commands run: <insert>
- Review mode and constraints: <insert>
Tasks:
1) Perform a code review focused on:
   - correctness bugs
   - behavioral regressions
   - data integrity and permission risks
   - missing or weak tests
2) If SQL/schema changed, run DB/migration checklist:
   - RLS/policy behavior vs intended access model
   - migration safety, ordering, rollback feasibility
   - grants/privileges/RPC exposure drift
   - seed/test/type-generation consistency
3) Prioritize findings by P0-P3 and include file references.
Return exactly:
- findings: severity-ranked technical issues with file refs
- evidence: concrete code/diff/test command observations
- confidence: high/medium/low per finding
- unverified_assumptions: what is assumed but unproven
- blocked_items: checks that could not be completed

strategic-reviewer
You are the strategic-reviewer.
Inputs:
- Implementation summary: <insert>
- Changed areas by layer (db/server/client/tests/docs): <insert>
- Review mode: <insert>
Tasks:
1) Evaluate implementation strategy quality:
   - architecture cohesion and coupling
   - migration/cutover safety and operability
   - maintainability and future change cost
   - scalability and team workflow implications
2) Identify strategic weaknesses and practical alternatives.
3) Recommend only changes that materially reduce risk or complexity.
Return exactly:
- findings: severity-ranked strategic risks/anti-patterns
- evidence: concrete repo or diff observations
- confidence: high/medium/low per finding
- unverified_assumptions: strategic assumptions needing confirmation
- blocked_items: missing context that limits confidence

test-coverage-reviewer
You are the test-coverage-reviewer.
Inputs:
- Changed files and diff: <insert>
- Existing tests in scope: <insert>
- Test commands:
  - app layer: <insert>
  - DB layer (pgTAP or equivalent): <insert>
- Review mode and constraints: <insert>
Tasks:
1) Build a coverage matrix:
   - changed behavior -> existing tests -> gap -> action
2) Identify high-risk untested behavior in app and DB layers.
3) Suggest and create targeted tests to close feasible gaps.
   - app layer: unit/integration tests for changed behavior and boundaries
   - DB layer: pgTAP tests for changed tables/functions/policies/permissions
4) Run relevant test suites after test additions/updates.
5) Report pass/fail and any remaining uncovered high-risk behavior.
Return exactly:
- findings: severity-ranked coverage and test-quality issues
- evidence: coverage matrix + test diffs + command results
- confidence: high/medium/low per finding
- unverified_assumptions: assumptions about environment/data/setup
- blocked_items: tests not run or not creatable and why

Consolidate five subagent outputs into one final review.
Rules:
1) Findings first, highest severity first, deduplicated across lanes.
2) Keep only evidence-backed findings.
3) Include mode-appropriate matrix:
   - plan/handoff -> plan alignment matrix
   - no-plan -> intent reconstruction matrix
4) Include UI verification status, blockers, and coverage summary.
5) Include test coverage matrix, tests added/suggested, and execution results.
6) Apply sign-off gates before verdict.
7) Verdict allowed values:
   - aligned
   - partially aligned
   - not aligned
   - no-plan reviewed
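
Consolidation rules 1, 2, and 7 above can be sketched in a few lines of Python. This is only an illustration, not part of the skill itself: the `Finding` fields, the `lane` attribute, and the choice to treat two findings as duplicates when they share a title and a `<file:line>` location are all assumptions made for the example. The sign-off gates are not modeled here.

```python
from dataclasses import dataclass

# Severity labels from the report template; lower rank sorts first.
SEVERITY_RANK = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}

# Verdict values allowed by rule 7.
ALLOWED_VERDICTS = {"aligned", "partially aligned", "not aligned", "no-plan reviewed"}


@dataclass(frozen=True)
class Finding:
    """Hypothetical shape of one subagent finding (field names are assumptions)."""
    severity: str  # "P0" .. "P3"
    title: str
    location: str  # "<file:line>"
    evidence: str
    lane: str      # name of the subagent that reported it


def consolidate(findings):
    """Keep evidence-backed findings, dedupe across lanes, sort by severity."""
    best = {}
    for finding in findings:
        if not finding.evidence.strip():
            continue  # rule 2: drop findings without evidence
        # Assumed duplicate key: same title (case-insensitive) at the same location.
        key = (finding.title.strip().lower(), finding.location)
        kept = best.get(key)
        # When lanes disagree on severity, keep the more severe report.
        if kept is None or SEVERITY_RANK[finding.severity] < SEVERITY_RANK[kept.severity]:
            best[key] = finding
    # Rule 1: highest severity first; location and title break ties deterministically.
    return sorted(
        best.values(),
        key=lambda f: (SEVERITY_RANK[f.severity], f.location, f.title),
    )


def check_verdict(verdict):
    """Return the verdict unchanged, or raise if it is not an allowed value."""
    if verdict not in ALLOWED_VERDICTS:
        raise ValueError(f"verdict not allowed: {verdict!r}")
    return verdict
```

Keeping the more severe duplicate is one possible policy; a stricter consolidator might instead merge the evidence from both lanes into a single entry.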