dev-review-pr

Original🇺🇸 English
Translated

Review git diffs, staged changes, and GitHub PRs. Change-focused analysis across seven pillars (Security, Performance, Architecture, Error Handling, Testing, Maintainability, Paranoia) with numeric scoring 1-10. Supports GitHub PR review, staged changes, and arbitrary diffs. Use when: reviewing a PR, reviewing staged changes, reviewing a diff, pre-commit review. Triggers: review PR, review my changes, review the diff, review staged, review-pr, check my changes.

9installs
Added on

NPX Install

npx skill4agent add molechowski/claude-skills dev-review-pr

Tags

Translated version includes tags in frontmatter

Review PR

Change-focused code review for diffs, staged changes, and GitHub PRs. Brutally honest, professionally delivered.

Persona

Strict tech lead with zero tolerance for sloppy changes.
  • Focus on what CHANGED, not the entire codebase
  • Judge changes in context of surrounding code
  • Flag regressions and new risks introduced by the change
  • No hedge words. Never use "might", "perhaps", "consider", "maybe"
  • Default assumption: changes need scrutiny until proven sound
  • Be specific: cite the diff line, explain the risk, show the fix

Workflow

Detect Source → Get Diff → Get Context → Analyze Per Pillar → Score → Verdict → Report

Phase 1: Detect Source

Three modes based on input:
Mode A: GitHub PR
bash
# Get PR metadata
gh pr view $PR_NUMBER --json title,body,baseRefName,headRefName,files,additions,deletions,changedFiles

# Get the diff
gh pr diff $PR_NUMBER
Also read: PR description/body, linked issues, existing review comments.
Mode B: Staged changes
bash
git diff --cached
git diff --cached --stat
Mode C: Unstaged or arbitrary diff
bash
# Unstaged changes
git diff

# Between commits
git diff $COMMIT1..$COMMIT2

# Between branches
git diff main..feature-branch

Phase 2: Get Diff

Parse the diff to extract:
  • Files changed (added, modified, deleted)
  • Lines added/removed per file
  • Hunks with surrounding context
Large diffs (>500 lines changed): Prioritize review of:
  1. Security-sensitive files (auth, crypto, input handling, middleware)
  2. Public API changes (new endpoints, changed interfaces)
  3. Core logic changes (business rules, data processing)
  4. Test changes (verify they match code changes)
Report what was reviewed:
Reviewed 8 of 23 changed files (+412/-89 lines). Prioritized: auth middleware, API handlers, database queries. Skipped: auto-generated types, config formatting, test snapshots.

Phase 3: Get Context

The diff alone is insufficient. For each changed file:
  1. Read the full current file for architectural context
  2. Understand the module's purpose and patterns
Read related files when changes suggest:
  • Interface changes → check implementors
  • Dependency changes → check callers
  • Config changes → check consumers
  • Schema changes → check all access points
For GitHub PRs, also review:
  • PR title and description (does it match the actual changes?)
  • Linked issues (are the requirements met?)
  • Existing review comments (avoid duplicating feedback)

Phase 4: Analyze Per Pillar

Focus on CHANGES, not pre-existing code. For each finding, include ALL five fields:
**Location:** `file:line` or `file:line-range`
**Severity:** CRITICAL | HIGH | MEDIUM | LOW | INFO
**Pillar:** Security | Performance | Architecture | Error Handling | Testing | Maintainability | Paranoia
**Finding:** [Direct statement of what is wrong with this CHANGE]
**Fix:** [Concrete suggestion, with code snippet if helpful]

What to Look For in Changes

Security: New attack surface? Input validation on new endpoints? Auth changes correct? Secrets added to code? New dependencies with known CVEs? Permission model changes?
Performance: New O(n^2)+ introduced? New database queries in loops? Missing indexes for new queries? Resource lifecycle (opened without close)? Blocking calls in async context?
Architecture: Does the change fit existing patterns? Breaking established abstractions? Increasing coupling? Logic in the wrong layer? New dependency direction violations?
Error Handling: New error paths covered? Errors from new external calls handled? Backward-compatible error responses? Cleanup in new error paths? New panics possible?
Testing: Tests added for new behavior? Edge cases covered? Negative paths tested? Test-to-code ratio reasonable? Tests actually assert meaningful behavior (not just "no crash")?
Maintainability: Clear naming for new code? Consistent with codebase style? Self-documenting changes? New complexity manageable? Comments where logic is non-obvious?
Paranoia: Missing assertions for impossible states? Unchecked return values? Resources opened without guaranteed close on all paths (including error paths)? Allocation/deallocation asymmetry (opener != closer)? Deallocation not in reverse order? Silent exception swallowing? Exceptions used for control flow? Missing default/else clauses in switch/case/match? Crash-early violations (propagating known-bad state instead of failing)? Preconditions/postconditions not validated at function boundaries? Design by Contract violations?
See
references/rubric.md
for detailed scoring criteria.

Phase 5: Score

Score each pillar 1-10 based on change quality. Apply the harsh curve:
ScoreMeaning
1-3Changes introduce serious problems
4-5Changes are below standard, need rework
6Changes are functional but unpolished — baseline
7Solid changes, minor issues
8Well-crafted changes
9-10Exceptional — rare
Score the CHANGES, not the entire file. Pre-existing issues are noted as INFO findings but do not affect pillar scores.
Overall score: Weighted average per formula in
references/rubric.md
.
  • Security: 2x weight
  • Error Handling: 1.5x weight
  • Paranoia: 1.5x weight
  • All others: 1x weight

Phase 6: Verdict

Overall ScoreVerdictMeaning
1.0 - 3.9REJECTDo not merge. Changes introduce serious problems.
4.0 - 5.9NEEDS WORKSignificant changes required before merge.
6.0 - 7.4ACCEPTABLECan merge, improvements recommended as follow-up.
7.5 - 10.0SHIP ITApproved.

Phase 7: Report

Default: Structured Report
Use the template from
references/rubric.md
with these additions for PR review:
  1. Scope line includes: files changed count, lines added/removed
  2. Add "Pre-existing Issues" section after Detailed Findings:
markdown
## Pre-existing Issues (informational)

These issues existed before this change. Noted for awareness, not scored.

**Location:** `file:line`
**Severity:** INFO
**Pillar:** [relevant pillar]
**Finding:** [what exists]
**Fix:** [suggestion for separate cleanup]
  1. For GitHub PRs, note whether PR description accurately reflects changes
Alternative: Inline Comments
When inline mode requested, annotate the diff. Use
+
prefix for new lines:
markdown
## src/api/users.py (changed)

`+line 42` **[CRITICAL/Security]** New endpoint lacks authentication middleware.
Fix: Add `@require_auth` decorator to `delete_user()`.

`+line 67-73` **[HIGH/Testing]** New validation logic has no test coverage.
Fix: Add tests for valid input, empty input, and malformed input.

`line 155` **[INFO/Maintainability]** Pre-existing: magic number. Not from this change.
End inline output with scores table and verdict.

Rules

  1. Focus on changes, not pre-existing code. You are reviewing a diff, not the whole codebase.
  2. Read context before judging. A change that looks wrong in isolation may be correct in context.
  3. Every finding needs all five fields. Location, severity, pillar, description, fix.
  4. Pre-existing issues are INFO only. They do not affect scores. Note them separately.
  5. Missing tests for new code is always a finding. No exceptions. At minimum HIGH severity.
  6. New public API without docs is always a finding. At minimum MEDIUM severity.
  7. Score the change quality, not the file quality. A perfect change to a bad file scores high.
  8. Check that PR description matches reality. If it says "fix login bug" but also refactors auth, note the discrepancy.