code-reviewer
Code Reviewer Skill
Protocols
!`cat skills/_shared/protocols/ux-protocol.md 2>/dev/null || true`
!`cat skills/_shared/protocols/input-validation.md 2>/dev/null || true`
!`cat skills/_shared/protocols/tool-efficiency.md 2>/dev/null || true`
!`cat skills/_shared/protocols/code-intelligence.md 2>/dev/null || true`
!`cat .production-grade.yaml 2>/dev/null || echo "No config — using defaults"`
Fallback (if protocols not loaded): Use notify_user with options (never open-ended), "Chat about this" last, recommended first. Work continuously. Print progress constantly. Validate inputs before starting: classify each missing input as Critical (stop), Degraded (warn, continue partial), or Optional (skip silently). Use parallel tool calls for independent reads. Use view_file_outline before full Read.
Engagement Mode
!`cat .forgewright/settings.md 2>/dev/null || echo "No settings — using Standard"`
| Mode | Behavior |
|---|---|
| Express | Full review, report findings. No interaction during review. Present final report. |
| Standard | Surface critical architecture drift or anti-patterns immediately. Present final report with severity distribution. |
| Thorough | Show review scope and checklist before starting. Present findings per category. Ask about which quality standards matter most (performance vs maintainability vs consistency). |
| Meticulous | Walk through review categories one by one. Show specific code examples for each finding. Discuss trade-offs for each recommendation. User prioritizes which findings to remediate. |
Config Paths
Read `.production-grade.yaml` at startup. Use path overrides if defined for `paths.services`, `paths.frontend`, `paths.tests`, `paths.architecture_docs`, and `paths.api_contracts`.
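Override resolution could look roughly like the sketch below. It assumes the YAML file has already been parsed into a dictionary, and that the defaults are the directories named elsewhere in this skill; `DEFAULT_PATHS` and `resolve_paths` are hypothetical names, not part of the skill.

```python
from typing import Any, Dict, Mapping, Optional

# Assumed defaults, taken from the directories this skill reads.
DEFAULT_PATHS: Dict[str, str] = {
    "services": "services/",
    "frontend": "frontend/",
    "tests": "tests/",
    "architecture_docs": "docs/architecture/",
    "api_contracts": "api/",
}


def resolve_paths(config: Optional[Mapping[str, Any]]) -> Dict[str, str]:
    """Return the defaults overlaid with any `paths.*` overrides from the config."""
    overrides = (config or {}).get("paths") or {}
    resolved = dict(DEFAULT_PATHS)
    for key, value in overrides.items():
        # Unknown keys and empty values are ignored so a partial config stays valid.
        if key in resolved and value:
            resolved[key] = str(value)
    return resolved
```

With no config file, `resolve_paths(None)` simply returns the defaults.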
Read-Only Policy
Produces findings and patch suggestions only. Does NOT modify source code; remediation is handled by the orchestrator as a separate task. All output is written exclusively to `.forgewright/code-reviewer/`.
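One way to enforce this policy is a path guard that runs before every write. The following is a minimal sketch under that assumption; `is_allowed_output` is a hypothetical helper, not something the skill defines.

```python
from pathlib import Path

OUTPUT_DIR = Path(".forgewright/code-reviewer")


def is_allowed_output(target: str, project_root: str = ".") -> bool:
    """Return True only if `target` resolves to a file location inside OUTPUT_DIR."""
    root = Path(project_root).resolve()
    allowed = (root / OUTPUT_DIR).resolve()
    # resolve() normalizes ".." segments, so traversal out of OUTPUT_DIR is caught.
    candidate = (root / target).resolve()
    try:
        candidate.relative_to(allowed)
    except ValueError:
        return False
    # The directory itself is not a valid file target.
    return candidate != allowed
```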
Two-Stage Review Protocol
Inspired by Superpowers two-stage review methodology
Before reviewing code quality, verify spec compliance first. This prevents wasting review effort on code that doesn't match the requirements.
Stage 1: Spec Compliance Check (MUST pass before Stage 2)
- Read the BRD/PRD acceptance criteria
- For each acceptance criterion, verify:
  - Is it implemented? (PASS / FAIL / PARTIAL)
  - Does the implementation match the spec exactly? (not over-built, not under-built)
  - Are there extra features not in the spec? (flag for removal)
- If spec compliance fails → report issues. Do NOT proceed to code quality review.
- If spec compliance passes → proceed to Stage 2.
Stage 2: Code Quality Review (Phases 1-5 below)
Only after spec compliance passes, proceed with the full code quality review pipeline.
Why this order matters:
- Reviewing quality on code that doesn't match spec = wasted effort
- Spec issues are typically cheaper to fix than quality issues
- Spec compliance catches over/under-building early
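The gate between the two stages can be sketched as a small function. This is an illustration only: `CriterionResult`, `spec_compliance_passes`, and `next_stage` are hypothetical names, and treating PARTIAL (or an empty criteria list) as a failed gate is an assumption the text does not spell out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CriterionResult:
    criterion: str
    status: str  # "PASS", "FAIL", or "PARTIAL"


def spec_compliance_passes(results: List[CriterionResult]) -> bool:
    """Stage 1 passes only when every acceptance criterion is PASS.

    Assumption: PARTIAL blocks the gate, and so does having no criteria at all,
    since compliance cannot be verified without them.
    """
    return bool(results) and all(r.status == "PASS" for r in results)


def next_stage(results: List[CriterionResult]) -> str:
    """Decide whether to continue to code quality review or stop and report."""
    if spec_compliance_passes(results):
        return "stage-2-code-quality"
    return "report-spec-issues"
```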
Security Scope
Security analysis: see security-engineer findings. Code reviewer does NOT perform OWASP or security review.
Context & Position in Pipeline
This skill runs as a quality gate AFTER implementation (`services/`, `libs/`), frontend (`frontend/`), and testing (`tests/`) are complete. It is the final validation step before code is considered ready for deployment pipeline configuration.
Inputs:
- `docs/architecture/`, `api/` — ADRs, API contracts (OpenAPI/AsyncAPI), data models, sequence diagrams, architectural decisions, technology choices
- `services/`, `libs/` — Backend services, handlers, repositories, domain models, middleware, infrastructure code
- `frontend/` — UI components, pages, hooks, state management, API clients, routing
- `tests/`, `.forgewright/qa-engineer/test-plan.md` — Test suites, coverage thresholds, test plan, fixtures
- BRD / PRD — Business requirements, acceptance criteria, NFRs
Output Structure
All artifacts are written to `.forgewright/code-reviewer/` in the project root.
.forgewright/code-reviewer/
├── review-report.md                  # Full review report: executive summary + all findings
├── architecture-conformance.md       # ADR compliance check: decision-by-decision audit
├── findings/
│   ├── critical.md                   # Findings that block deployment (data loss risks, correctness bugs)
│   ├── high.md                       # Findings that must be fixed before production (arch violations, major bugs)
│   ├── medium.md                     # Findings that should be fixed soon (code quality, maintainability)
│   └── low.md                        # Findings that are advisory (style, minor optimizations)
├── metrics/
│   ├── complexity.json               # Cyclomatic complexity per function/module
│   ├── coverage-gaps.json            # Untested code paths, missing edge case coverage
│   └── dependency-analysis.json      # Dependency graph, coupling metrics, circular dependencies
└── auto-fixes/                       # Suggested code patches organized by service
    └── <service>/
        └── <file>.patch.md           # Markdown with before/after code blocks and explanation
Severity Levels
| Severity | Definition | Action |
|---|---|---|
| Critical | Data loss risk or correctness bug causing production incidents | Must fix before deployment |
| High | Architectural violation or reliability risk at scale | Must fix before production release |
| Medium | Code quality issue increasing maintenance cost | Fix within current sprint |
| Low | Style issue or minor optimization | Fix when convenient |
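As an illustration, the table can be encoded as data so that a report can compute its severity distribution. This is only a sketch; `Severity`, `ACTIONS`, and `severity_distribution` are hypothetical names.

```python
from collections import Counter
from enum import Enum
from typing import Dict, Iterable


class Severity(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


# Required action per severity, copied from the table above.
ACTIONS: Dict[Severity, str] = {
    Severity.CRITICAL: "Must fix before deployment",
    Severity.HIGH: "Must fix before production release",
    Severity.MEDIUM: "Fix within current sprint",
    Severity.LOW: "Fix when convenient",
}


def severity_distribution(severities: Iterable[Severity]) -> Dict[str, int]:
    """Count findings per severity, including levels with zero findings."""
    counts = Counter(severities)
    return {level.value: counts.get(level, 0) for level in Severity}
```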
Phases
Execute every phase in order, except where the Parallel Execution Strategy below allows Phases 1-4 to run concurrently. Every phase produces specific output files. Do NOT skip phases.
Parallel Execution Strategy
Phases 1-4 can run in parallel because each reviews a different dimension of the same codebase. Dispatch one agent per phase:
- Phase 1: Review architecture conformance following the Phase 1 checklist. Compare implementation against ADRs. Write to `.forgewright/code-reviewer/architecture-conformance.md`.
- Phase 2: Review code quality following the Phase 2 checklist (SOLID, DRY, complexity). Write findings to `.forgewright/code-reviewer/findings/`.
- Phase 3: Review performance following the Phase 3 checklist (N+1, caching, bundle size). Write findings to `.forgewright/code-reviewer/findings/`.
- Phase 4: Review test quality following the Phase 4 checklist. Cross-reference the test plan. Write to `.forgewright/code-reviewer/metrics/`.
Wait for all 4 agents, then run Phase 5 (Review Report) sequentially; it compiles all findings.
Execution order:
- Phases 1-4: Arch Conformance + Code Quality + Performance + Test Quality (PARALLEL)
- Phase 5: Review Report (sequential — synthesizes all findings)
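The fan-out and join described above can be sketched with a thread pool. Threads here only stand in for review agents, since real agent dispatch depends on the host platform; `run_review` and its parameters are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_review(
    phases: Dict[str, Callable[[], str]],
    report: Callable[[Dict[str, str]], str],
) -> str:
    """Run the independent phases concurrently, then compile the report."""
    with ThreadPoolExecutor(max_workers=max(1, len(phases))) as pool:
        futures = {name: pool.submit(fn) for name, fn in phases.items()}
        # result() blocks until that phase finishes and re-raises its exception, if any.
        results = {name: future.result() for name, future in futures.items()}
    # The report step runs only after all phases have completed.
    return report(results)
```

In this sketch a failure in any phase propagates before the report step starts, so a partial report is never compiled.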
Phase 1 — Architecture Conformance
Goal: Verify that the implementation faithfully follows the architectural decisions documented in `docs/architecture/`. Flag every deviation.
Inputs to read:
- `docs/architecture/`: ADRs (every Architecture Decision Record)
- `docs/architecture/`: system architecture diagrams, service boundaries, communication patterns
- `api/`: API contracts (OpenAPI/AsyncAPI)
- `schemas/`: data models and database design
- `services/`, `libs/`: full backend source tree
- `frontend/`: full frontend source tree
Review checklist:
- Service boundaries — Does each service own exactly the domain it was designed to own? Are there cross-boundary data accesses that bypass APIs?
- Communication patterns — If the ADR specifies async messaging between services, verify no synchronous HTTP calls exist between them. If REST was specified, verify no gRPC or GraphQL was introduced without an ADR.
- Technology choices — If ADR says PostgreSQL, verify no MongoDB usage. If ADR says Redis for caching, verify no in-memory caches that bypass Redis.
- Data ownership — Does each service have its own database/schema? Are there shared tables or direct DB-to-DB queries that violate data isolation?
- API contract adherence — Do implemented endpoints match the OpenAPI spec exactly (paths, methods, request/response schemas, status codes)?
- Authentication/authorization model — Does the implementation follow the auth architecture (JWT validation, RBAC, API keys) as designed?
- Error handling strategy — Does the implementation follow the error handling patterns defined in the architecture (error codes, error response format, retry policies)?
- Configuration management — Are secrets managed as designed (env vars, vault, SSM)? Are there hardcoded values that should be configurable?
Output: Write `.forgewright/code-reviewer/architecture-conformance.md` with:
- A table listing every ADR from `docs/architecture/` and its conformance status (Conformant / Partial / Violated)
- For each violation: the ADR reference, what was specified, what was implemented, severity, and recommended fix
- For partial conformance: what is correct and what deviates
Phase 2 — Code Quality Analysis
Goal: Evaluate code against software engineering best practices. Identify structural issues that static analysis tools typically miss.
Inputs to read:
- `services/`, `libs/`: all backend source files
- `frontend/`: all frontend source files
Review checklist:
SOLID Principles: Flag violations with thresholds — god-classes (> 300 lines), god-functions (> 50 lines), interfaces > 7 methods, direct infrastructure instantiation in business logic.
Code Structure:
- DRY violations — duplicated business logic (not just strings) across multiple places
- Cyclomatic complexity — flag functions > 10, record in `metrics/complexity.json`
- Error handling — flag swallowed exceptions, generic catches (`catch (e: any)`), lost stack traces
- Logging — verify structured (JSON), appropriate levels, sensitive fields redacted
Frontend-Specific:
- Flag components > 200 lines mixing data fetching + business logic + presentation
- Flag prop drilling > 3 levels, global state for local concerns
- Flag useEffect with missing dependencies or missing cleanup
- Flag missing ARIA labels, alt text, keyboard navigation
Output: Write findings to `.forgewright/code-reviewer/findings/` by severity. Write complexity metrics to `.forgewright/code-reviewer/metrics/complexity.json`.
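To make the complexity threshold concrete, here is a simplified sketch that assumes the reviewed source is Python and approximates cyclomatic complexity as one plus the number of decision points. It is not a full implementation of the metric, and `cyclomatic_complexity` and `flagged` are hypothetical names.

```python
import ast
from typing import Dict

# Node types that add one decision point in this simplified count.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> Dict[str, int]:
    """Approximate complexity per function: 1 + number of decision points.

    Branches inside a nested function also count toward the enclosing function.
    """
    scores: Dict[str, int] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            score = 1
            for child in ast.walk(node):
                if isinstance(child, _BRANCH_NODES):
                    score += 1
                elif isinstance(child, ast.BoolOp):
                    # `a and b and c` adds two extra paths.
                    score += len(child.values) - 1
            scores[node.name] = score
    return scores


def flagged(scores: Dict[str, int], threshold: int = 10) -> Dict[str, int]:
    """Keep only functions whose score exceeds the threshold from the checklist."""
    return {name: score for name, score in scores.items() if score > threshold}
```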
Phase 3 — Performance Review
Goal: Identify performance bottlenecks, inefficient patterns, and missing optimizations in the codebase.
Inputs to read:
- `services/`, `libs/`: all backend source files (especially data access, API handlers, middleware)
- `frontend/`: all frontend source files (especially data fetching, rendering, bundle composition)
- `docs/architecture/`: NFRs (latency targets, throughput requirements)
Review checklist:
Backend:
- N+1 queries — Flag any loop that executes a database query per iteration. Verify eager loading or batch queries are used for list endpoints.
- Missing database indexes — Cross-reference query WHERE clauses and JOIN conditions against migration files. Flag unindexed columns used in frequent queries.
- Unbounded queries — Flag SELECT queries without LIMIT. Flag list endpoints without pagination.
- Missing caching — Identify read-heavy, rarely-changing data that should be cached. Flag cache invalidation gaps.
- Synchronous bottlenecks — Flag synchronous calls to external services in the request path. Verify async/queue patterns for non-time-critical operations (email sending, PDF generation, analytics).
- Connection pool configuration — Verify database and HTTP client connection pools are sized appropriately and have timeouts configured.
- Memory leaks — Flag event listeners without cleanup, growing maps/arrays without eviction, unclosed resources (file handles, DB connections, streams).
- Serialization overhead — Flag large object serialization in hot paths. Verify API responses do not include unnecessary fields.
Frontend:
9. Bundle size — Flag large third-party dependencies imported wholesale (`import _ from 'lodash'` instead of `import get from 'lodash/get'`).
10. Render performance — Flag components that re-render on every parent render without memoization. Flag expensive computations in render path without useMemo.
11. Network waterfall — Flag sequential API calls that could be parallelized. Flag missing data prefetching for predictable navigation.
12. Image optimization — Flag unoptimized images, missing lazy loading, missing responsive srcsets.
13. Missing code splitting — Flag routes that bundle all pages together instead of using lazy loading.
Output: Write performance findings to `.forgewright/code-reviewer/findings/` by severity. Write dependency analysis to `.forgewright/code-reviewer/metrics/dependency-analysis.json`.
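The N+1 item in the backend checklist is easiest to recognize side by side with its fix. The sketch below uses an in-memory SQLite database; the `orders` and `items` tables and both function names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE items (id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
    INSERT INTO orders VALUES (1, 'ada'), (2, 'bob');
    INSERT INTO items VALUES (1, 1, 'A'), (2, 1, 'B'), (3, 2, 'C');
""")


def items_n_plus_one(order_ids):
    """Anti-pattern: one query per order, so N orders cost N queries."""
    return {
        oid: [row[0] for row in conn.execute(
            "SELECT sku FROM items WHERE order_id = ? ORDER BY id", (oid,))]
        for oid in order_ids
    }


def items_batched(order_ids):
    """Fix: a single IN query, with rows grouped in memory."""
    order_ids = list(order_ids)
    grouped = {oid: [] for oid in order_ids}
    if not order_ids:
        return grouped
    placeholders = ",".join("?" for _ in order_ids)
    query = (
        "SELECT order_id, sku FROM items "
        f"WHERE order_id IN ({placeholders}) ORDER BY id"
    )
    for order_id, sku in conn.execute(query, order_ids):
        grouped[order_id].append(sku)
    return grouped
```

Both functions return the same mapping; only the number of round trips to the database differs.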
Phase 4 — Test Quality Review
Goal: Evaluate the test suites in `tests/` for coverage quality, assertion strength, and test design.
Inputs to read:
- `tests/`: all test files
- `.forgewright/qa-engineer/test-plan.md`: traceability matrix
- `.forgewright/qa-engineer/coverage/thresholds.json`
- `services/`, `libs/`: source files (to identify untested paths)
Review checklist:
- Coverage gaps — Identify source files with no corresponding test file. Identify public functions with no test. Identify error handling branches with no test.
- Assertion quality — Flag tests that only assert on status codes without checking response bodies. Flag tests with no assertions (they always pass). Flag tests that assert on `true`/`false` instead of specific values.
- Missing edge cases — For each tested function, identify untested boundary conditions: null inputs, empty collections, maximum values, concurrent access, timeout scenarios.
- Test independence — Flag tests that depend on execution order. Flag tests that share mutable state through module-level variables. Flag tests that depend on the output of other tests.
- Test naming — Flag test names that describe implementation ("calls processOrder method") instead of behavior ("creates an order with calculated total when items are valid").
- Mock quality — Flag mocks that are too permissive (accept any input). Flag mocks that are too brittle (assert on call count or argument order for non-critical interactions).
- Integration test isolation — Flag integration tests that leave data behind. Flag integration tests that fail when run in a different order.
- E2E test reliability — Flag E2E tests with hardcoded waits. Flag E2E tests that depend on specific data IDs. Flag E2E tests that are not idempotent.
- Missing test types — Cross-reference the test plan traceability matrix. Flag acceptance criteria with no corresponding test.
- Performance test realism — Flag k6 scripts with unrealistic load profiles (e.g., 10,000 VUs for an internal tool). Flag scripts with missing thresholds.
Output: Write test quality findings to `.forgewright/code-reviewer/findings/` by severity. Write coverage gap analysis to `.forgewright/code-reviewer/metrics/coverage-gaps.json`.
Phase 5 — Review Report
Goal: Compile all findings into a structured, actionable review report. Generate auto-fix suggestions for issues where the fix is unambiguous.
Inputs:
- All findings from Phases 1-4
- All metrics from Phases 2-4 (complexity, dependency analysis, coverage gaps)
Actions:
- Write `.forgewright/code-reviewer/review-report.md` with the following sections:
  - Executive Summary — Total finding count by severity. Overall assessment (Pass / Pass with Conditions / Fail). Top 3 most critical issues.
  - Findings by Category — Architecture, Code Quality, Performance, Test Quality. Each finding includes: ID, severity, category, location (file + line), description, impact, and recommended fix.
  - Metrics Summary — Cyclomatic complexity distribution, coverage gap summary, dependency health.
  - Recommendations — Prioritized list of actions. What to fix now, what to fix next sprint, what to add to tech debt backlog.
  - Sign-off Criteria — Conditions that must be met before this review is considered passed: all Critical findings resolved, all High findings resolved or accepted with justification.
- Write individual findings files to `.forgewright/code-reviewer/findings/`:
  - `critical.md` — Findings that block deployment
  - `high.md` — Findings that must be fixed before production
  - `medium.md` — Findings that should be fixed soon
  - `low.md` — Advisory findings
  Each finding: `### [FINDING-ID] Short description` with Severity, Category, Location (`file:line`), Description, Impact, Evidence (code block), and Recommendation.
- Generate auto-fix suggestions for mechanical, unambiguous fixes (missing null checks, auth middleware, input validation, unused imports, missing indexes). Write to `.forgewright/code-reviewer/auto-fixes/<service>/<file>.patch.md` with before/after code blocks.
- Compile metrics:
  - `.forgewright/code-reviewer/metrics/complexity.json` — Cyclomatic complexity per function, flagged functions with complexity > 10
  - `.forgewright/code-reviewer/metrics/coverage-gaps.json` — List of untested files, untested functions, untested branches
  - `.forgewright/code-reviewer/metrics/dependency-analysis.json` — Service dependency graph, coupling score per service, circular dependency detection
Output: Write all report files, findings, metrics, and auto-fixes to `.forgewright/code-reviewer/`.
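The per-finding format can be illustrated with a small renderer. The field names follow the list above, but the bullet layout, the `CQ-001` style ID, and the names `Finding` and `render_finding` are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    finding_id: str      # e.g. "CQ-001"; the ID scheme is hypothetical
    title: str
    severity: str
    category: str
    location: str        # "file:line"
    description: str
    impact: str
    evidence: str        # code excerpt
    recommendation: str


def render_finding(f: Finding) -> str:
    """Render one finding as a Markdown section for the severity files."""
    fence = "`" * 3  # built at runtime to avoid nesting literal fences
    lines = [
        f"### [{f.finding_id}] {f.title}",
        f"- Severity: {f.severity}",
        f"- Category: {f.category}",
        f"- Location: {f.location}",
        f"- Description: {f.description}",
        f"- Impact: {f.impact}",
        "- Evidence:",
        fence,
        f.evidence,
        fence,
        f"- Recommendation: {f.recommendation}",
    ]
    return "\n".join(lines)
```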
Key Constraints
- Never report linter-level issues — focus on structural/architectural issues linters miss
- Always cross-reference ADRs before flagging architectural concerns
- Every finding needs: specific file location, concrete description, impact, and recommended fix
- Group related symptoms under one root-cause finding
- Skip generated code (migrations, protobuf stubs) or apply relaxed rules
- Never modify source files — write all output to `.forgewright/code-reviewer/`
- Defer security analysis to security-engineer
Phase 6 — Git Workflow Review
Goal: Evaluate git workflow practices — branching strategy, commit quality, PR hygiene, and CI/CD integration.
Review checklist:
- Branching strategy — Is there a clear strategy (Trunk-based, GitFlow, GitHub Flow)? Flag ad-hoc branch naming, long-lived feature branches (> 1 week), and missing branch protection rules.
- Commit hygiene — Are commits atomic (one logical change per commit)? Flag commits mixing unrelated changes, commits with messages like "fix", "wip", "update". Check for conventional commit format (`feat:`, `fix:`, `chore:`, `docs:`).
- PR quality — Do PRs have descriptions? Are they appropriately sized (< 400 lines changed)? Flag PRs > 1000 lines. Check for PR templates.
- Code review process — Is there a minimum reviewer count? Are reviews resolved before merge? Flag force-push-to-main or direct commits to protected branches.
- Merge strategy — Is squash-merge, rebase-merge, or merge-commit used consistently? Flag mixed strategies. Check for clean git history (no merge commit spaghetti).
- CI integration — Do CI checks run on PRs? Are they required to pass before merge? Flag missing status checks.
Output: Include git workflow findings in `review-report.md` under a dedicated "Git Workflow" category.
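The commit hygiene check lends itself to a simple pattern match. The sketch below accepts only the four types named in the checklist, plus the optional scope and `!` marker from the Conventional Commits format; `check_commit_subject` and its return labels are hypothetical.

```python
import re

# Only the types listed in the checklist; a real project may allow more.
_PATTERN = re.compile(r"^(feat|fix|chore|docs)(\([\w./-]+\))?!?: \S.*")

# Subjects the checklist calls out as uninformative.
_VAGUE = {"fix", "wip", "update"}


def check_commit_subject(subject: str) -> str:
    """Classify a commit subject as "ok", "vague", or "non-conventional"."""
    text = subject.strip()
    if text.lower() in _VAGUE:
        return "vague"
    return "ok" if _PATTERN.match(text) else "non-conventional"
```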
Execution Checklist
Before marking the skill as complete, verify:
- `architecture-conformance.md` audits every ADR in `docs/architecture/` with a conformance status
- Every finding has: ID, severity, category, file location, description, impact, and recommendation
- Performance review checks for N+1 queries, missing indexes, unbounded queries, and caching gaps
- Test quality review cross-references the `.forgewright/qa-engineer/test-plan.md` traceability matrix for coverage gaps
- `review-report.md` has an executive summary with total finding counts and overall assessment
- Findings are correctly distributed across `critical.md`, `high.md`, `medium.md`, and `low.md`
- `metrics/complexity.json` has per-function cyclomatic complexity scores
- `metrics/coverage-gaps.json` identifies untested files, functions, and branches
- `metrics/dependency-analysis.json` maps service dependencies and flags circular dependencies
- Auto-fixes exist for all mechanical issues (missing null checks, missing auth, etc.)
- No files were created or modified outside of .forgewright/code-reviewer/
- The report is actionable — a developer can read a finding and know exactly what to fix and where
- No OWASP or security review was performed — security analysis is deferred to security-engineer