adversarial-code-review

Adversarial Code Review

Core Principle: Review as if you're trying to break the code. Deliberately adopt hostile perspectives—each reveals issues the others miss.
This is not about finding fault. It's about finding problems before users do.

Review Mode

| Mode | Trigger | Focus |
|---|---|---|
| Diff-Focused (default) | No explicit instruction, PR review | What changed? What could break? |
| Audit | "audit", "holistic", "codebase review" | Broader scope, systematic coverage |

When in doubt, use diff-focused mode. Audit mode requires explicit request.
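
The mode rule above can be sketched as a small function. This is only an illustration: `select_mode` and `AUDIT_TRIGGERS` are hypothetical names, and the naive substring match is an assumption, not part of the original skill.

```python
from typing import Optional

# Trigger phrases taken from the Review Mode table; the matching strategy
# (case-insensitive substring search) is an assumption for illustration.
AUDIT_TRIGGERS = ("audit", "holistic", "codebase review")


def select_mode(request: Optional[str]) -> str:
    """Return "audit" only on an explicit request, else "diff-focused"."""
    text = (request or "").lower()
    if any(trigger in text for trigger in AUDIT_TRIGGERS):
        return "audit"
    # No explicit instruction, or an ordinary PR review: use the default.
    return "diff-focused"
```

A substring match would also fire on words like "auditor", so a real implementation would likely need stricter matching.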

The Six Adversarial Lenses

Review through each lens deliberately. Don't blend them—switching perspectives forces deeper analysis.
| Lens | Core Question | Reveals |
|---|---|---|
| Malicious User | "How would I exploit this?" | Input validation gaps, injection vectors, privilege escalation |
| Careless Colleague | "How would this break if used wrong?" | API misuse, unclear contracts, error handling gaps |
| Future Maintainer | "What will confuse me in 6 months?" | Implicit assumptions, missing context, temporal coupling |
| Ops/On-Call | "How will this fail at 3am?" | Observability gaps, recovery paths, failure modes |
| Data Integrity | "What happens to state?" | Race conditions, partial failures, consistency violations |
| Interaction Effects | "What does this change elsewhere?" | Unintended side effects, behavioral changes, contract breaks |

Lens Details

Malicious User

Assume the user is actively trying to break or exploit the system.
  • What inputs are trusted that shouldn't be?
  • Can I escalate privileges or access unauthorized data?
  • What happens if I send malformed/oversized/unexpected input?
  • Are there injection points (SQL, XSS, command, path traversal)?
For deep security reviews, see references/security-lens-detail.md.
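
As one concrete instance of the path traversal question, here is a minimal sketch of the kind of guard a reviewer might look for when user input is joined onto a base directory. The function name `resolve_inside` is hypothetical, and this covers only one injection class.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = Path(base_dir).resolve()
    # Joining an absolute user_path discards base, so it is caught below too.
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

With this guard, inputs such as `../secret.txt` or `/etc/passwd` raise `ValueError` instead of silently reading outside the intended directory.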

Careless Colleague

Assume another developer will use this code without reading documentation.
  • Is the API intuitive or are there "gotchas"?
  • What happens if methods are called in wrong order?
  • Are error messages helpful or cryptic?
  • Could someone misuse this and get silent wrong results?
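
To make the "wrong order" and "silent wrong results" questions concrete, here is a small sketch of an API that fails loudly on misuse. The `Upload` class is hypothetical and exists only for illustration.

```python
class Upload:
    """Collects chunks, then returns the assembled payload on close()."""

    def __init__(self) -> None:
        self._chunks = []
        self._closed = False

    def write(self, chunk: bytes) -> None:
        # Calling write() after close() is a misuse. Raising with a helpful
        # message beats silently dropping the chunk.
        if self._closed:
            raise RuntimeError("write() called after close(); create a new Upload")
        self._chunks.append(chunk)

    def close(self) -> bytes:
        self._closed = True
        return b"".join(self._chunks)
```

A version that ignored late writes would pass casual testing and lose data in production, which is the kind of gap this lens is meant to surface.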

Future Maintainer

Assume you'll revisit this code in 6 months with no memory of writing it.
  • Why does this code exist? Is that documented?
  • What assumptions are implicit that should be explicit?
  • Are there magic numbers/strings without explanation?
  • Would I understand the control flow on first read?

Ops/On-Call

Assume this will fail in production at the worst possible time.
  • How will I know when this fails? (Logging, metrics, alerts)
  • Can I diagnose the problem from logs alone?
  • Is there a recovery path? Can it be retried safely?
  • What's the blast radius if this fails?

Data Integrity

Assume multiple things will try to modify state simultaneously.
  • What happens if this runs twice concurrently?
  • Are there partial failure states that leave data inconsistent?
  • Is there a transaction boundary? What if it fails mid-way?
  • Are reads and writes properly synchronized?
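
The "runs twice concurrently" question often comes down to a check-then-act sequence that is not atomic. Below is a minimal sketch, assuming an in-process counter guarded by a `threading.Lock`. The `Inventory` class is hypothetical, and real systems would more likely rely on database transactions or row locks.

```python
import threading


class Inventory:
    """Tracks remaining stock; reserve() must never oversell."""

    def __init__(self, stock: int) -> None:
        self._stock = stock
        self._lock = threading.Lock()

    def reserve(self, quantity: int) -> bool:
        # The check and the decrement happen under one lock, so two
        # concurrent callers cannot both pass the check on the last unit.
        with self._lock:
            if self._stock < quantity:
                return False
            self._stock -= quantity
            return True

    @property
    def stock(self) -> int:
        with self._lock:
            return self._stock
```

Without the lock, two threads could both observe enough stock and both decrement, leaving a negative count. That is the consistency violation this lens looks for.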

Interaction Effects

Assume this change has consequences beyond its immediate scope.
  • What calls this? Will their expectations still hold?
  • What does this call? What assumptions are we making?
  • Does this subtly change behavior callers depend on?
  • Are there caches, indexes, or derived data that need updating?

Review Workflow

Copy this checklist when starting a review:
Adversarial Review Progress:
- [ ] Step 1: Determine mode (diff-focused or audit)
- [ ] Step 2: Understand the change/code purpose
- [ ] Step 3: Apply lenses (prioritize by risk, ~5 min each):
  - [ ] Malicious User
  - [ ] Careless Colleague
  - [ ] Future Maintainer
  - [ ] Ops/On-Call Engineer
  - [ ] Data Integrity
  - [ ] Interaction Effects
- [ ] Step 4: Filter findings through Impact Filter
- [ ] Step 5: Classify severity (Must Fix / Should Fix / Consider)
- [ ] Step 6: Limit "Consider" items to max 2
- [ ] Step 7: Identify at least one positive
- [ ] Step 8: Format report

Lens Prioritization

Not all lenses are equally important for all code. Prioritize:
| Code Type | Priority Lenses |
|---|---|
| User input handling | Malicious User, Data Integrity |
| API/public interface | Careless Colleague, Interaction Effects |
| Background jobs | Ops/On-Call, Data Integrity |
| Business logic | Future Maintainer, Interaction Effects |
| Database operations | Data Integrity, Ops/On-Call |
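
The prioritization table can be read as a lookup that reorders the six lenses. A sketch under assumptions: `order_lenses` is a hypothetical helper, and falling back to the default order for unknown code types is a choice made here, not something the table specifies.

```python
from typing import List

ALL_LENSES = [
    "Malicious User",
    "Careless Colleague",
    "Future Maintainer",
    "Ops/On-Call",
    "Data Integrity",
    "Interaction Effects",
]

# Direct transcription of the table above.
PRIORITY_LENSES = {
    "user input handling": ["Malicious User", "Data Integrity"],
    "api/public interface": ["Careless Colleague", "Interaction Effects"],
    "background jobs": ["Ops/On-Call", "Data Integrity"],
    "business logic": ["Future Maintainer", "Interaction Effects"],
    "database operations": ["Data Integrity", "Ops/On-Call"],
}


def order_lenses(code_type: str) -> List[str]:
    """Return all six lenses, with the priority lenses for code_type first."""
    priority = PRIORITY_LENSES.get(code_type.strip().lower(), [])
    rest = [lens for lens in ALL_LENSES if lens not in priority]
    return priority + rest
```

Every lens is still applied. Only the order changes, so the riskiest perspectives get attention first.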

The Five Iron Laws

<IMPORTANT>
  1. No findings without specific location AND impact
    • Bad: "This could have race conditions"
    • Good: "Line 45: concurrent access to `cache` without lock could cause data corruption when requests overlap"
  2. Severity matches actual risk, not theoretical worst-case
    • Bad: "CRITICAL: This string could theoretically be used for XSS" (in internal CLI)
    • Good: "LOW: Unescaped string. Not currently risky, but add escaping if this reaches browser"
  3. Every "Must Fix" requires demonstration or clear reasoning
    • Don't just assert the bug exists; show WHY it's a bug
  4. Alternative suggestions are optional, not mandated
    • Present options. Don't dictate implementation details.
  5. Acknowledge at least one thing done well
    • Adversarial doesn't mean hostile. Recognition builds trust.
</IMPORTANT>
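
The first law (no finding without a location and an impact) can be enforced structurally. Below is a minimal sketch using a dataclass; the `Finding` type and its field names are hypothetical and only mirror the wording of the law.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    location: str  # e.g. "file.ts:45-52"
    issue: str     # what is wrong
    impact: str    # what happens if it is not fixed

    def __post_init__(self) -> None:
        # Reject vague findings such as "this could have race conditions",
        # which name neither a place nor a consequence.
        if not self.location.strip():
            raise ValueError("finding needs a specific location")
        if not self.impact.strip():
            raise ValueError("finding needs a stated impact")
```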

Impact Filter

Every potential finding must pass this filter. Score 2+ to report:
□ Likely to occur (probability)
□ Impactful if it occurs (severity)
□ Non-obvious to the author (added value)
If a finding scores 0-1, don't report it. You're adding noise, not value.
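
The filter is a three-box score with a threshold of 2. A minimal sketch follows; the function name `passes_impact_filter` and its parameter names are hypothetical.

```python
def passes_impact_filter(likely: bool, impactful: bool, non_obvious: bool) -> bool:
    """Report a finding only if at least two of the three boxes are checked."""
    score = sum([likely, impactful, non_obvious])  # each True counts as 1
    return score >= 2
```

For example, a severe but unlikely issue that the author clearly already knows about scores 1 and is dropped. The same issue becomes reportable if it is also non-obvious.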

Severity Tiers

| Tier | Definition | Action | Examples |
|---|---|---|---|
| Must Fix | Breaks correctness, security, or data integrity | Block merge | SQL injection, race condition causing data loss, auth bypass |
| Should Fix | Likely problems but not immediately broken | Fix before or soon after merge | Missing error handling, unclear naming, no tests for edge case |
| Consider | Style, optimization, theoretical concerns | Max 2 per review | Could be more idiomatic, minor perf optimization |

The "Consider" Trap

Limit "Consider" comments to 2 maximum. More than that:
  • Dilutes important feedback
  • Feels like nitpicking
  • Reduces trust in your reviews
If you have many "Consider" items, pick the 2 most valuable and drop the rest.
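
The cap can be applied mechanically once each "Consider" item has some value estimate. In this sketch the numeric `value` is an assumption (the text does not say how to rank items), and `trim_consider` is a hypothetical helper.

```python
from typing import List, Tuple


def trim_consider(items: List[Tuple[str, int]], limit: int = 2) -> List[str]:
    """Keep the `limit` most valuable "Consider" items and drop the rest.

    Each item is (comment, value); higher value means more worth raising.
    """
    # sorted() is stable, so ties keep their original order.
    ranked = sorted(items, key=lambda item: item[1], reverse=True)
    return [comment for comment, _ in ranked[:limit]]
```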

What NOT to Flag

  • Style preferences covered by linter/formatter — Automation handles this
  • Alternative implementations of equal merit — "I would have done X" isn't a bug
  • Hypothetical futures — "What if we need to support Y someday..." isn't actionable
  • Things you'd do differently but aren't wrong — Preferences aren't defects
The Test: "Would a reasonable senior engineer disagree with me here?"
If yes → Probably not worth commenting.

Reporting Format

Structure findings clearly:

Summary

[1-2 sentence overview of the review]

What's Done Well

  • [Specific positive observation]

Must Fix

[Issue Title]

Location: file.ts:45-52
Lens: [Which lens found this]
Issue: [Clear description of the problem]
Impact: [What happens if not fixed]
Suggestion: [Optional - how to fix]

Should Fix

[Same format as Must Fix]

Consider

[Brief bullet points only - max 2 items]

When to Escalate

Stop the review and escalate when:
| Trigger | Action |
|---|---|
| Security-critical code (auth, crypto, payments) | See references/security-lens-detail.md, consider external review |
| 3+ "Must Fix" issues found | Stop reviewing. Escalate for fundamental redesign discussion. |
| You don't understand the code | Don't guess. Request walkthrough before reviewing. |
| Architectural concerns | Flag for design discussion, don't try to "fix" in review |

Audit Mode

When explicitly requested to audit (not just review changes):

Scope Definition

Before starting, clarify:
  • What areas/modules to focus on?
  • What's the primary concern? (Security? Performance? Maintainability?)
  • What's the time budget?

Sampling Strategy

For large codebases, don't review everything. Sample strategically:
  1. High-risk areas first — Auth, payments, user input handling
  2. Recently changed code: `git log --since="3 months ago" --name-only`
  3. Complex code — High cyclomatic complexity, many dependencies
  4. Code with no tests — Higher likelihood of hidden bugs
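
For the "recently changed code" step, the file list from `git log` can be turned into a churn ranking. This sketch assumes the command is run with an added `--pretty=format:` so that only file names are printed (an adjustment to the command in the list above); `rank_changed_files` is a hypothetical helper.

```python
from collections import Counter
from typing import List, Tuple


def rank_changed_files(git_log_output: str) -> List[Tuple[str, int]]:
    """Count how often each file appears in name-only git log output.

    Expects the output of something like:
        git log --since="3 months ago" --name-only --pretty=format:
    which prints one file path per line, with blank lines between commits.
    """
    paths = [line.strip() for line in git_log_output.splitlines() if line.strip()]
    # most_common() sorts by count, highest first.
    return Counter(paths).most_common()
```

Files near the top of the ranking changed most often in the window and are reasonable candidates to sample first, alongside the high-risk areas.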

Audit Checklist Addition

Audit-Specific Steps:
- [ ] Define scope and primary concerns with requester
- [ ] Identify high-risk areas for focused review
- [ ] Sample strategically (don't boil the ocean)
- [ ] Track coverage (what was reviewed vs skipped)
- [ ] Note systemic patterns across multiple files

Edge Cases

For systematic edge case generation by input domain, see references/edge-case-domains.md.

Common Mistakes

Reviewing Without Understanding

Don't start reviewing until you understand:
  • What is this code supposed to do?
  • Why does this change exist?
  • What's the broader context?
Reviewing without understanding produces shallow, unhelpful feedback.

Lens Blending

Don't try to apply all lenses simultaneously. You'll miss things.
Do this:
  1. Apply Malicious User lens → Note findings
  2. Apply Careless Colleague lens → Note findings
  3. Continue through remaining lenses
Not this:
  • "Let me look at this code and find all the issues"

Severity Inflation

Not everything is critical. Reserve "Must Fix" for actual blockers.
If everything is urgent, nothing is urgent.

Missing the Forest for Trees

After applying all lenses, step back:
  • Are there systemic patterns in the findings?
  • Is there a deeper design issue causing multiple symptoms?
  • Should this be a redesign conversation instead of a review?

Key Principle

The goal isn't to find as many issues as possible. It's to find the issues that matter before they reach users.
Quality over quantity. Impact over volume. Trust over thoroughness.