code-reviewer

Code Reviewer Skill

Protocols

cat skills/_shared/protocols/ux-protocol.md 2>/dev/null || true

cat skills/_shared/protocols/input-validation.md 2>/dev/null || true

cat skills/_shared/protocols/tool-efficiency.md 2>/dev/null || true

cat skills/_shared/protocols/code-intelligence.md 2>/dev/null || true

cat .production-grade.yaml 2>/dev/null || echo "No config — using defaults"

Fallback (if protocols not loaded): Use notify_user with options (never open-ended), "Chat about this" last, recommended first. Work continuously. Print progress constantly. Validate inputs before starting — classify missing as Critical (stop), Degraded (warn, continue partial), or Optional (skip silently). Use parallel tool calls for independent reads. Use view_file_outline before full Read.
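
The Critical / Degraded / Optional classification in the fallback could be sketched roughly as follows. This is only an illustration: the names `Missing`, `INPUT_CLASSES`, `classify_missing`, and `should_stop` are hypothetical, and the skill does not state which input belongs to which class, so the mapping is an assumption.

```python
from enum import Enum
from pathlib import Path


class Missing(Enum):
    CRITICAL = "stop"
    DEGRADED = "warn, continue partial"
    OPTIONAL = "skip silently"


# Hypothetical mapping: the skill does not say which input falls into which class.
INPUT_CLASSES = {
    "services": Missing.CRITICAL,
    "docs/architecture": Missing.DEGRADED,
    ".forgewright/settings.md": Missing.OPTIONAL,
}


def classify_missing(root: Path) -> dict[str, Missing]:
    """Return the class of every expected input that is absent under root."""
    return {
        rel: cls for rel, cls in INPUT_CLASSES.items() if not (root / rel).exists()
    }


def should_stop(missing: dict[str, Missing]) -> bool:
    """A single Critical gap halts the review before it starts."""
    return Missing.CRITICAL in missing.values()
```

Degraded and Optional gaps would be handled by the caller: warn for the former, stay silent for the latter.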

Engagement Mode

cat .forgewright/settings.md 2>/dev/null || echo "No settings — using Standard"

Mode	Behavior
Express	Full review, report findings. No interaction during review. Present final report.
Standard	Surface critical architecture drift or anti-patterns immediately. Present final report with severity distribution.
Thorough	Show review scope and checklist before starting. Present findings per category. Ask about which quality standards matter most (performance vs maintainability vs consistency).
Meticulous	Walk through review categories one by one. Show specific code examples for each finding. Discuss trade-offs for each recommendation. User prioritizes which findings to remediate.

Config Paths

Read `.production-grade.yaml` at startup. Use path overrides if defined for `paths.services`, `paths.frontend`, `paths.tests`, `paths.architecture_docs`, and `paths.api_contracts`.
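
One possible way to apply these overrides is sketched below. It assumes the YAML file has already been parsed into a plain dictionary by whatever parser the project uses. The function name `resolve_paths` is hypothetical, and the default values are inferred from the directory names used elsewhere in this skill rather than stated by it.

```python
from typing import Optional

# Assumed defaults, taken from the directory names this skill refers to.
DEFAULT_PATHS = {
    "services": "services/",
    "frontend": "frontend/",
    "tests": "tests/",
    "architecture_docs": "docs/architecture/",
    "api_contracts": "api/",
}


def resolve_paths(config: Optional[dict]) -> dict[str, str]:
    """Merge `paths.*` overrides from the parsed config over the defaults."""
    overrides = (config or {}).get("paths") or {}
    resolved = dict(DEFAULT_PATHS)
    for key in DEFAULT_PATHS:
        value = overrides.get(key)
        if value:  # absent or empty overrides fall back to the default
            resolved[key] = value
    return resolved
```

Passing `None` covers the case where `.production-grade.yaml` does not exist and defaults are used.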

Read-Only Policy

Produces findings and patch suggestions only. Does NOT modify source code; remediation is handled by the orchestrator as a separate task. All output is written exclusively to `.forgewright/code-reviewer/`.

Two-Stage Review Protocol

Inspired by Superpowers two-stage review methodology

Before reviewing code quality, verify spec compliance first. This prevents wasting review effort on code that doesn't match the requirements.

Stage 1: Spec Compliance Check (MUST pass before Stage 2)

Read the BRD/PRD acceptance criteria
For each acceptance criterion, verify:
- Is it implemented? (PASS / FAIL / PARTIAL)
- Does the implementation match the spec exactly? (not over-built, not under-built)
- Are there extra features not in the spec? (flag for removal)
If spec compliance fails → report issues. Do NOT proceed to code quality review.
If spec compliance passes → proceed to Stage 2.
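
The Stage 1 gate above might be expressed as a small function like the one below. It is a sketch under assumptions: `CriterionResult` and `spec_gate` are hypothetical names, and treating PARTIAL results and extra features as blocking is one reading of the rules, since the skill only says extras are flagged for removal.

```python
from dataclasses import dataclass
from typing import Literal

Status = Literal["PASS", "FAIL", "PARTIAL"]


@dataclass
class CriterionResult:
    criterion: str
    status: Status
    matches_spec_exactly: bool = True


def spec_gate(
    results: list[CriterionResult], extra_features: list[str]
) -> tuple[bool, list[str]]:
    """Return (proceed_to_stage_2, issues). Any issue blocks Stage 2."""
    issues = []
    for r in results:
        if r.status != "PASS":
            issues.append(f"{r.criterion}: {r.status}")
        elif not r.matches_spec_exactly:
            issues.append(f"{r.criterion}: implemented but deviates from spec")
    # Assumption: features outside the spec also block until removed.
    issues += [f"extra feature not in spec: {name}" for name in extra_features]
    return (not issues, issues)
```

When the first element is `False`, the issue list is what gets reported instead of a code quality review.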

Stage 2: Code Quality Review (Phases 1-5 below)

Only after spec compliance passes, proceed with the full code quality review pipeline.

Why this order matters:

Reviewing quality on code that doesn't match spec = wasted effort
Spec issues are typically cheaper to fix than quality issues
Spec compliance catches over/under-building early

Security Scope

Security analysis: see security-engineer findings. Code reviewer does NOT perform OWASP or security review.

Context & Position in Pipeline

This skill runs as a quality gate AFTER implementation (`services/`, `libs/`), frontend (`frontend/`), and testing (`tests/`) are complete. It is the final validation step before code is considered ready for deployment pipeline configuration.

Inputs:

`docs/architecture/`, `api/`: ADRs, API contracts (OpenAPI/AsyncAPI), data models, sequence diagrams, architectural decisions, technology choices
`services/`, `libs/`: Backend services, handlers, repositories, domain models, middleware, infrastructure code
`frontend/`: UI components, pages, hooks, state management, API clients, routing
`tests/`, `.forgewright/qa-engineer/test-plan.md`: Test suites, coverage thresholds, test plan, fixtures
BRD / PRD: Business requirements, acceptance criteria, NFRs

Output Structure

All artifacts are written to `.forgewright/code-reviewer/` in the project root.

.forgewright/code-reviewer/
├── review-report.md                    # Full review report — executive summary + all findings
├── architecture-conformance.md         # ADR compliance check — decision-by-decision audit
├── findings/
│   ├── critical.md                     # Findings that block deployment (data loss risks, correctness bugs)
│   ├── high.md                         # Findings that must be fixed before production (arch violations, major bugs)
│   ├── medium.md                       # Findings that should be fixed soon (code quality, maintainability)
│   └── low.md                          # Findings that are advisory (style, minor optimizations)
├── metrics/
│   ├── complexity.json                 # Cyclomatic complexity per function/module
│   ├── coverage-gaps.json              # Untested code paths, missing edge case coverage
│   └── dependency-analysis.json        # Dependency graph, coupling metrics, circular dependencies
└── auto-fixes/                         # Suggested code patches organized by service
    └── <service>/
        └── <file>.patch.md             # Markdown with before/after code blocks and explanation

Severity Levels

Severity	Definition	Action
Critical	Data loss risk or correctness bug causing production incidents	Must fix before deployment
High	Architectural violation or reliability risk at scale	Must fix before production release
Medium	Code quality issue increasing maintenance cost	Fix within current sprint
Low	Style issue or minor optimization	Fix when convenient
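
As an illustration, the severity table could be encoded so that each level carries its findings file and required action. The `Severity` enum and `blocks_deployment` helper are hypothetical names; the file names and actions are the ones listed in this skill.

```python
from enum import Enum


class Severity(Enum):
    CRITICAL = ("critical.md", "Must fix before deployment")
    HIGH = ("high.md", "Must fix before production release")
    MEDIUM = ("medium.md", "Fix within current sprint")
    LOW = ("low.md", "Fix when convenient")

    @property
    def findings_file(self) -> str:
        """Path of the findings file for this severity."""
        return f".forgewright/code-reviewer/findings/{self.value[0]}"

    @property
    def action(self) -> str:
        """Required action from the severity table."""
        return self.value[1]


def blocks_deployment(severities: list[Severity]) -> bool:
    """Critical findings are the ones described as blocking deployment."""
    return Severity.CRITICAL in severities
```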

Phases

Execute every phase; do NOT skip any. Phases 1-4 may run in parallel, and Phase 5 runs only after they have finished. Every phase produces specific output files.

Parallel Execution Strategy

Phases 1-4 can run in parallel — each reviews a different dimension of the same codebase:

Agent 1: Review architecture conformance following the Phase 1 checklist. Compare implementation against ADRs. Write to .forgewright/code-reviewer/architecture-conformance.md.
Agent 2: Review code quality following the Phase 2 checklist (SOLID, DRY, complexity). Write findings to .forgewright/code-reviewer/findings/.
Agent 3: Review performance following the Phase 3 checklist (N+1, caching, bundle size). Write findings to .forgewright/code-reviewer/findings/.
Agent 4: Review test quality following the Phase 4 checklist. Cross-reference test plan. Write to .forgewright/code-reviewer/metrics/.

Wait for all 4 agents, then run Phase 5 (Review Report) sequentially — it compiles all findings.

Execution order:

Phases 1-4: Arch Conformance + Code Quality + Performance + Test Quality (PARALLEL)
Phase 5: Review Report (sequential — synthesizes all findings)
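
The execution order above can be sketched with a thread pool. This is a minimal illustration, not the orchestrator's actual mechanism: `run_review` is a hypothetical name, and each phase is stood in for by a callable that returns a list of finding strings.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_review(
    parallel_phases: dict[str, Callable[[], list[str]]],
    compile_report: Callable[[dict[str, list[str]]], str],
) -> str:
    """Run Phases 1-4 concurrently, then compile Phase 5 from their findings."""
    with ThreadPoolExecutor(max_workers=len(parallel_phases)) as pool:
        futures = {name: pool.submit(fn) for name, fn in parallel_phases.items()}
        # .result() blocks, so Phase 5 starts only after every phase has finished.
        findings = {name: fut.result() for name, fut in futures.items()}
    return compile_report(findings)


# Stub phases with made-up findings, for illustration only.
phases = {
    "architecture": lambda: ["ADR-003 violated"],
    "code_quality": lambda: [],
    "performance": lambda: ["N+1 in list endpoint"],
    "test_quality": lambda: [],
}
report = run_review(phases, lambda f: f"{sum(len(v) for v in f.values())} findings")
print(report)  # prints "2 findings"
```

Because each phase writes to its own files and only reads the codebase, the four tasks have no shared state to coordinate.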

Phase 1 — Architecture Conformance

Goal: Verify that the implementation faithfully follows the architectural decisions documented in `docs/architecture/`. Flag every deviation.

Inputs to read:

`docs/architecture/`: ADRs (every Architecture Decision Record)
`docs/architecture/`: system architecture diagrams, service boundaries, communication patterns
`api/`: API contracts (OpenAPI/AsyncAPI)
`schemas/`: data models and database design
`services/`, `libs/`: full backend source tree
`frontend/`: full frontend source tree

Review checklist:

Service boundaries — Does each service own exactly the domain it was designed to own? Are there cross-boundary data accesses that bypass APIs?
Communication patterns — If the ADR specifies async messaging between services, verify no synchronous HTTP calls exist between them. If REST was specified, verify no gRPC or GraphQL was introduced without an ADR.
Technology choices — If ADR says PostgreSQL, verify no MongoDB usage. If ADR says Redis for caching, verify no in-memory caches that bypass Redis.
Data ownership — Does each service have its own database/schema? Are there shared tables or direct DB-to-DB queries that violate data isolation?
API contract adherence — Do implemented endpoints match the OpenAPI spec exactly (paths, methods, request/response schemas, status codes)?
Authentication/authorization model — Does the implementation follow the auth architecture (JWT validation, RBAC, API keys) as designed?
Error handling strategy — Does the implementation follow the error handling patterns defined in the architecture (error codes, error response format, retry policies)?
Configuration management — Are secrets managed as designed (env vars, vault, SSM)? Are there hardcoded values that should be configurable?

Output: Write `.forgewright/code-reviewer/architecture-conformance.md` with:

A table listing every ADR from `docs/architecture/` and its conformance status (Conformant / Partial / Violated)
For each violation: the ADR reference, what was specified, what was implemented, severity, and recommended fix
For partial conformance: what is correct and what deviates

Phase 2 — Code Quality Analysis

Goal: Evaluate code against software engineering best practices. Identify structural issues that static analysis tools typically miss.

Inputs to read:

`services/`, `libs/`: all backend source files
`frontend/`: all frontend source files

Review checklist:

SOLID Principles: Flag violations with thresholds — god-classes (> 300 lines), god-functions (> 50 lines), interfaces > 7 methods, direct infrastructure instantiation in business logic.

Code Structure:

DRY violations: duplicated business logic (not just strings) across multiple places
Cyclomatic complexity: flag functions > 10, record in `metrics/complexity.json`
Error handling: flag swallowed exceptions, generic catches (`catch (e: any)`), lost stack traces
Logging: verify structured (JSON), appropriate levels, sensitive fields redacted

Frontend-Specific:

Flag components > 200 lines mixing data fetching + business logic + presentation
Flag prop drilling > 3 levels, global state for local concerns
Flag useEffect with missing dependencies or missing cleanup
Flag missing ARIA labels, alt text, keyboard navigation

Output: Write findings to `.forgewright/code-reviewer/findings/` by severity. Write complexity metrics to `.forgewright/code-reviewer/metrics/complexity.json`.

Phase 3 — Performance Review

Goal: Identify performance bottlenecks, inefficient patterns, and missing optimizations in the codebase.

Inputs to read:

`services/`, `libs/`: all backend source files (especially data access, API handlers, middleware)
`frontend/`: all frontend source files (especially data fetching, rendering, bundle composition)
`docs/architecture/`: NFRs (latency targets, throughput requirements)

Review checklist:

Backend:

N+1 queries — Flag any loop that executes a database query per iteration. Verify eager loading or batch queries are used for list endpoints.
Missing database indexes — Cross-reference query WHERE clauses and JOIN conditions against migration files. Flag unindexed columns used in frequent queries.
Unbounded queries — Flag SELECT queries without LIMIT. Flag list endpoints without pagination.
Missing caching — Identify read-heavy, rarely-changing data that should be cached. Flag cache invalidation gaps.
Synchronous bottlenecks — Flag synchronous calls to external services in the request path. Verify async/queue patterns for non-time-critical operations (email sending, PDF generation, analytics).
Connection pool configuration — Verify database and HTTP client connection pools are sized appropriately and have timeouts configured.
Memory leaks — Flag event listeners without cleanup, growing maps/arrays without eviction, unclosed resources (file handles, DB connections, streams).
Serialization overhead — Flag large object serialization in hot paths. Verify API responses do not include unnecessary fields.
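
The N+1 and unbounded-query items in the backend list above are easiest to recognize side by side. The sketch below uses an in-memory SQLite database with a made-up `orders`/`items` schema. Both functions return the same data, but the first issues one query per order while the second uses a single bounded join.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE items (id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
    INSERT INTO orders VALUES (1, 'ada'), (2, 'bob');
    INSERT INTO items VALUES (1, 1, 'A'), (2, 1, 'B'), (3, 2, 'C');
""")


def items_n_plus_one(conn):
    """Anti-pattern: one query for the orders, then one more per order."""
    result = {}
    order_ids = conn.execute("SELECT id FROM orders ORDER BY id").fetchall()
    for (order_id,) in order_ids:
        rows = conn.execute(
            "SELECT sku FROM items WHERE order_id = ? ORDER BY id", (order_id,)
        ).fetchall()
        result[order_id] = [sku for (sku,) in rows]
    return result


def items_batched(conn, limit=100):
    """Fix: a single JOIN, bounded by LIMIT, fetches every order with its items."""
    result = {}
    rows = conn.execute(
        """
        SELECT o.id, i.sku
        FROM (SELECT id FROM orders ORDER BY id LIMIT ?) AS o
        LEFT JOIN items AS i ON i.order_id = o.id
        ORDER BY o.id, i.id
        """,
        (limit,),
    )
    for order_id, sku in rows:
        result.setdefault(order_id, [])
        if sku is not None:  # orders without items still appear, with an empty list
            result[order_id].append(sku)
    return result
```

In a review, the loop in `items_n_plus_one` is the shape to flag, and the `LIMIT` parameter in `items_batched` is what keeps a list endpoint bounded.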

Frontend:

Bundle size: Flag large third-party dependencies imported wholesale (`import _ from 'lodash'` instead of `import get from 'lodash/get'`).
Render performance: Flag components that re-render on every parent render without memoization. Flag expensive computations in render path without useMemo.
Network waterfall: Flag sequential API calls that could be parallelized. Flag missing data prefetching for predictable navigation.
Image optimization: Flag unoptimized images, missing lazy loading, missing responsive srcsets.
Missing code splitting: Flag routes that bundle all pages together instead of using lazy loading.

Output: Write performance findings to `.forgewright/code-reviewer/findings/` by severity. Write dependency analysis to `.forgewright/code-reviewer/metrics/dependency-analysis.json`.
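
Circular dependencies, one of the things the dependency analysis is meant to surface, can be found with a depth-first search over the service graph. The sketch below assumes the graph is already available as a mapping from each service to the services it depends on. The function name `find_cycles` and the service names in the usage note are hypothetical.

```python
def find_cycles(graph: dict[str, list[str]]) -> list[list[str]]:
    """Return one cycle per back edge found by depth-first search."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    stack: list[str] = []
    cycles: list[list[str]] = []

    def visit(node: str) -> None:
        color[node] = GREY
        stack.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GREY:  # back edge: dep is already on the current path
                cycles.append(stack[stack.index(dep):] + [dep])
            elif state == WHITE:
                visit(dep)
        stack.pop()
        color[node] = BLACK

    for node in list(graph):
        if color[node] == WHITE:
            visit(node)
    return cycles
```

For example, with `orders -> billing -> users -> orders`, the function reports the path `["orders", "billing", "users", "orders"]`.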

Phase 4 — Test Quality Review

Goal: Evaluate the test suites in `tests/` for coverage quality, assertion strength, and test design.

Inputs to read:

`tests/`: all test files
`.forgewright/qa-engineer/test-plan.md`: traceability matrix
`.forgewright/qa-engineer/coverage/thresholds.json`: coverage thresholds
`services/`, `libs/`: source files (to identify untested paths)

Review checklist:

Coverage gaps — Identify source files with no corresponding test file. Identify public functions with no test. Identify error handling branches with no test.
Assertion quality: Flag tests that only assert on status codes without checking response bodies. Flag tests with no assertions (they always pass). Flag tests that assert on `true`/`false` instead of specific values.
Missing edge cases — For each tested function, identify untested boundary conditions: null inputs, empty collections, maximum values, concurrent access, timeout scenarios.
Test independence — Flag tests that depend on execution order. Flag tests that share mutable state through module-level variables. Flag tests that depend on the output of other tests.
Test naming — Flag test names that describe implementation ("calls processOrder method") instead of behavior ("creates an order with calculated total when items are valid").
Mock quality — Flag mocks that are too permissive (accept any input). Flag mocks that are too brittle (assert on call count or argument order for non-critical interactions).
Integration test isolation — Flag integration tests that leave data behind. Flag integration tests that fail when run in a different order.
E2E test reliability — Flag E2E tests with hardcoded waits. Flag E2E tests that depend on specific data IDs. Flag E2E tests that are not idempotent.
Missing test types — Cross-reference the test plan traceability matrix. Flag acceptance criteria with no corresponding test.
Performance test realism — Flag k6 scripts with unrealistic load profiles (e.g., 10,000 VUs for an internal tool). Flag scripts with missing thresholds.

Output: Write test quality findings to `.forgewright/code-reviewer/findings/` by severity. Write coverage gap analysis to `.forgewright/code-reviewer/metrics/coverage-gaps.json`.

Phase 5 — Review Report

Goal: Compile all findings into a structured, actionable review report. Generate auto-fix suggestions for issues where the fix is unambiguous.

Inputs:

All findings from Phases 1-4
All metrics from Phases 2-4

Actions:

Write `.forgewright/code-reviewer/review-report.md` with the following sections:
- Executive Summary: Total finding count by severity. Overall assessment (Pass / Pass with Conditions / Fail). Top 3 most critical issues.
- Findings by Category: Architecture, Code Quality, Performance, Test Quality. Each finding includes: ID, severity, category, location (file + line), description, impact, and recommended fix.
- Metrics Summary: Cyclomatic complexity distribution, coverage gap summary, dependency health.
- Recommendations: Prioritized list of actions. What to fix now, what to fix next sprint, what to add to tech debt backlog.
- Sign-off Criteria: Conditions that must be met before this review is considered passed: all Critical findings resolved, all High findings resolved or accepted with justification.
Write individual findings files to `.forgewright/code-reviewer/findings/`:
- `critical.md`: Findings that block deployment
- `high.md`: Findings that must be fixed before production
- `medium.md`: Findings that should be fixed soon
- `low.md`: Advisory findings
Each finding: `### [FINDING-ID] Short description` with Severity, Category, Location (`file:line`), Description, Impact, Evidence (code block), and Recommendation.
Generate auto-fix suggestions for mechanical, unambiguous fixes (missing null checks, auth middleware, input validation, unused imports, missing indexes). Write to `.forgewright/code-reviewer/auto-fixes/<service>/<file>.patch.md` with before/after code blocks.
Compile metrics:
- `.forgewright/code-reviewer/metrics/complexity.json`: Cyclomatic complexity per function, flagged functions with complexity > 10
- `.forgewright/code-reviewer/metrics/coverage-gaps.json`: List of untested files, untested functions, untested branches
- `.forgewright/code-reviewer/metrics/dependency-analysis.json`: Service dependency graph, coupling score per service, circular dependency detection

Output: Write all report files, findings, metrics, and auto-fixes to `.forgewright/code-reviewer/`.
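
The finding format and the sign-off criteria described in this phase might be modeled as below. This is a sketch under assumptions: the `Finding` dataclass, the plain `Label: value` layout under the heading, and the two boolean status fields are hypothetical. The skill specifies only which fields a finding has and the heading format.

```python
from dataclasses import dataclass

FENCE = "`" * 3  # built at runtime so this example can sit inside a fenced block


@dataclass
class Finding:
    finding_id: str
    title: str
    severity: str  # "Critical", "High", "Medium", or "Low"
    category: str
    location: str  # "file:line"
    description: str
    impact: str
    evidence: str
    recommendation: str
    resolved: bool = False
    accepted_with_justification: bool = False


def render_finding(f: Finding) -> str:
    """Render one finding under a '### [FINDING-ID] Short description' heading."""
    return "\n".join([
        f"### [{f.finding_id}] {f.title}",
        f"Severity: {f.severity}",
        f"Category: {f.category}",
        f"Location: {f.location}",
        f"Description: {f.description}",
        f"Impact: {f.impact}",
        "Evidence:",
        FENCE,
        f.evidence,
        FENCE,
        f"Recommendation: {f.recommendation}",
    ])


def sign_off(findings: list[Finding]) -> bool:
    """All Critical resolved; all High resolved or accepted with justification."""
    for f in findings:
        if f.severity == "Critical" and not f.resolved:
            return False
        if f.severity == "High" and not (f.resolved or f.accepted_with_justification):
            return False
    return True
```

Medium and Low findings do not affect `sign_off`, which mirrors the criteria as written.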

Key Constraints

Never report linter-level issues — focus on structural/architectural issues linters miss
Always cross-reference ADRs before flagging architectural concerns
Every finding needs: specific file location, concrete description, impact, and recommended fix
Group related symptoms under one root-cause finding
Skip generated code (migrations, protobuf stubs) or apply relaxed rules
Never modify source files; write all output to `.forgewright/code-reviewer/`
Defer security analysis to security-engineer

Phase 6 — Git Workflow Review

Goal: Evaluate git workflow practices — branching strategy, commit quality, PR hygiene, and CI/CD integration.

Review checklist:

Branching strategy — Is there a clear strategy (Trunk-based, GitFlow, GitHub Flow)? Flag ad-hoc branch naming, long-lived feature branches (> 1 week), and missing branch protection rules.
Commit hygiene: Are commits atomic (one logical change per commit)? Flag commits mixing unrelated changes and commits with messages like "fix", "wip", "update". Check for conventional commit format (`feat:`, `fix:`, `chore:`, `docs:`).
PR quality — Do PRs have descriptions? Are they appropriately sized (< 400 lines changed)? Flag PRs > 1000 lines. Check for PR templates.
Code review process — Is there a minimum reviewer count? Are reviews resolved before merge? Flag force-push-to-main or direct commits to protected branches.
Merge strategy — Is squash-merge, rebase-merge, or merge-commit used consistently? Flag mixed strategies. Check for clean git history (no merge commit spaghetti).
CI integration — Do CI checks run on PRs? Are they required to pass before merge? Flag missing status checks.

Output: Include git workflow findings in `review-report.md` under a dedicated "Git Workflow" category.
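
The commit hygiene and PR size checks can be partly automated. The sketch below is illustrative only: the regular expression accepts just the four prefixes named in the checklist (real projects usually allow more types), and the function names and the "ok / large / flag" labels are hypothetical.

```python
import re

# Only the four types named in the checklist; optional "(scope)" and "!" marker.
CONVENTIONAL = re.compile(r"^(feat|fix|chore|docs)(\([\w./-]+\))?!?: \S.*")
VAGUE = {"fix", "wip", "update"}


def commit_issues(subject: str) -> list[str]:
    """Return the hygiene problems found in a commit subject line."""
    issues = []
    if subject.strip().lower() in VAGUE:
        issues.append("vague message")
    if not CONVENTIONAL.match(subject):
        issues.append("not in conventional commit format")
    return issues


def pr_size_label(lines_changed: int) -> str:
    """Under 400 lines is appropriately sized; over 1000 lines is flagged."""
    if lines_changed > 1000:
        return "flag"
    return "ok" if lines_changed < 400 else "large"
```

Subjects could be collected with `git log --format=%s` and passed through `commit_issues` one at a time.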

Execution Checklist

Before marking the skill as complete, verify:

`architecture-conformance.md` audits every ADR in `docs/architecture/` with a conformance status

Every finding has: ID, severity, category, file location, description, impact, and recommendation
Performance review checks for N+1 queries, missing indexes, unbounded queries, and caching gaps
Test quality review cross-references the `.forgewright/qa-engineer/test-plan.md` traceability matrix for coverage gaps
`review-report.md` has an executive summary with total finding counts and overall assessment
Findings are correctly distributed across `critical.md`, `high.md`, `medium.md`, and `low.md`
`metrics/complexity.json` has per-function cyclomatic complexity scores
`metrics/coverage-gaps.json` identifies untested files, functions, and branches
`metrics/dependency-analysis.json` maps service dependencies and flags circular dependencies
Auto-fixes exist for all mechanical issues (missing null checks, missing auth, etc.)
No files were created or modified outside of .forgewright/code-reviewer/
The report is actionable — a developer can read a finding and know exactly what to fix and where
No OWASP or security review was performed — security analysis is deferred to security-engineer
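
The file-existence part of this checklist can be verified mechanically. The sketch below checks only that the expected artifacts are present under the output directory. The function name `missing_outputs` is hypothetical, and the content-level items (every ADR audited, findings actionable, and so on) still need a manual pass.

```python
from pathlib import Path

# Relative paths taken from the output structure described in this skill.
REQUIRED_OUTPUTS = [
    "review-report.md",
    "architecture-conformance.md",
    "findings/critical.md",
    "findings/high.md",
    "findings/medium.md",
    "findings/low.md",
    "metrics/complexity.json",
    "metrics/coverage-gaps.json",
    "metrics/dependency-analysis.json",
]


def missing_outputs(project_root: Path) -> list[str]:
    """List the required artifacts not yet written under .forgewright/code-reviewer/."""
    out_dir = project_root / ".forgewright" / "code-reviewer"
    return [rel for rel in REQUIRED_OUTPUTS if not (out_dir / rel).is_file()]
```

An empty list means every required file exists.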
