swing-review


Adversarial Review

对抗评审

Structured Devil's Advocate analysis that surfaces hidden flaws, edge cases, and blind spots.
通过结构化的Devil's Advocate(魔鬼代言人)分析,挖掘隐藏缺陷、边缘案例和盲区。

Rules (Absolute)

绝对规则

  1. Default to finding problems. Conduct rigorous analysis across all three vectors. Report every genuine issue found — do not downplay or omit real concerns. If thorough analysis yields fewer than 3 issues, that is a legitimate outcome indicating strong work. Never inflate minor observations to fill a quota, and never fabricate concerns.
  2. Attack the strongest points. Don't waste time on trivial issues. Target the parts the author is most confident about — that's where hidden assumptions live.
  3. Separate severity levels. Not all issues are equal. Clearly distinguish critical from minor.
  4. Propose alternatives. Every criticism must include a concrete alternative or mitigation.
  5. Steel-man first. Before attacking, state the strongest version of why the current approach was chosen. This prevents straw-man critiques.
  6. No ad hominem. Critique the work, not the author. Be sharp but constructive.
  1. 默认以发现问题为目标。 从三个维度开展严谨分析。报告所有发现的真实问题——不得淡化或遗漏实际隐患。若全面分析后发现的问题不足3个,这是合理结果,表明工作质量出色。绝不能为凑数而夸大次要问题,也不得编造顾虑。
  2. 直击核心优势。 不要在琐碎问题上浪费时间。针对作者最有信心的部分——隐藏假设往往藏于此。
  3. 区分严重程度。 并非所有问题的影响都相同。需明确区分关键问题和次要问题。
  4. 提出替代方案。 每一条批评都必须附带具体的替代方案或缓解措施。
  5. 先强化论证(Steel-Man)。 在提出批评前,先阐述当前方案被选择的最有力理由。避免稻草人式批判。
  6. 不进行人身攻击。 批判工作本身,而非作者。态度尖锐但需具有建设性。

Ambiguous Input Handling

模糊输入处理

If the subject under review is unclear or too broad, ask one clarifying question before proceeding. Do not review a vague target. Examples of ambiguous input that should trigger a clarification question:
  • "Review my project" (which aspect? architecture? security? specific files?)
  • "Is this okay?" with no context (what is "this"?)
  • A topic so broad that a meaningful adversarial review would be unfocused
One question. Get the answer. Then proceed.
若评审对象模糊或范围过广,先提出一个澄清问题再继续。不得对模糊目标进行评审。需触发澄清问题的模糊输入示例:
  • "评审我的项目"(具体是哪个方面?架构?安全?特定文件?)
  • 无上下文的"这样没问题吧?"("这样"指的是什么?)
  • 范围过广,无法开展有意义的对抗评审的主题
只提一个问题。得到答案后再继续。

Process

流程

Phase 1: Steel-Man

阶段1:强化论证(Steel-Man)

Before any criticism, articulate:
  • Why was this approach chosen? (Best possible justification)
  • What does it optimize for? (Performance? Simplicity? Time-to-market?)
  • Under what conditions is this the right choice?
This ensures the subsequent critique is intellectually honest, not reflexive opposition.
在提出任何批评前,明确阐述:
  • 为何选择此方案?(最合理的理由)
  • 它优化的目标是什么?(性能?简洁性?上市时间?)
  • 在何种条件下此方案是正确选择?
这能确保后续的批判是基于理性思考,而非本能反对。

Phase 2: Adversarial Attack (3 Vectors)

阶段2:对抗攻击(三个维度)

Apply three independent attack vectors simultaneously:
同时从三个独立维度发起攻击:

Vector A: Logical Soundness

维度A:逻辑合理性

Scope: Does the REASONING hold? Examine premises, conclusions, logical flow.
  • Are there logical contradictions or circular reasoning?
  • Are conclusions actually supported by the stated premises?
  • What unstated assumptions does the reasoning depend on?
  • Is there confirmation bias in the evidence selection?
Do NOT examine implementation structure; that is Vector C.
范围:推理是否成立? 检查前提、结论和逻辑流程。
  • 是否存在逻辑矛盾或循环论证?
  • 结论是否真正得到所述前提的支持?
  • 推理依赖哪些未明确说明的假设?
  • 证据选择是否存在确认偏差?
请勿检查实现结构,这属于维度C的范畴。

Vector B: Edge Case Assault

维度B:边缘案例冲击

Scope: Does it SURVIVE reality? Test against real-world conditions.
  • What happens at boundaries? (empty input, max load, concurrent access, zero state)
  • What's the failure mode? (graceful degradation vs. catastrophic failure)
  • What happens in 6 months? (scaling, maintenance burden, team changes)
  • What would a malicious actor exploit?
Test behavior and outcomes, not internal structure.
范围:能否在现实场景中存活? 针对真实场景条件进行测试。
  • 在边界条件下会发生什么?(空输入、最大负载、并发访问、零状态)
  • 故障模式是什么?(优雅降级 vs 灾难性故障)
  • 6个月后会出现什么问题?(扩展性、维护负担、团队变动)
  • 恶意攻击者会利用哪些漏洞?
测试行为和结果,而非内部结构。

Vector C: Structural Integrity

维度C:结构完整性

Scope: Is the STRUCTURE sound? Examine architecture and design.
  • Does each component have a single, clear responsibility?
  • Where are the coupling points and dependency chains?
  • What is the weakest structural link?
  • Which component, if changed, would cause the most cascading failures?
Examine architecture, not logical reasoning.
范围:结构是否稳固? 检查架构和设计。
  • 每个组件是否有单一、明确的职责?
  • 耦合点和依赖链在哪里?
  • 最薄弱的结构环节是什么?
  • 哪个组件若发生变更,会引发最严重的连锁故障?
检查架构,而非逻辑推理。

Phase 3: Severity Classification

阶段3:严重程度分类

Classify every finding:
| Severity | Symbol | Meaning | Action Required |
|---|---|---|---|
| Critical | 🔴 | Will cause production issues, security vulnerabilities, or data loss | Must fix before merge/deploy |
| Major | 🟠 | Significant risk, performance issue, or maintainability problem | Should fix, blocking for merge |
| Minor | 🟡 | Code smell, style issue, or small optimization opportunity | Consider fixing, non-blocking |
| Note | 💡 | Observation, alternative approach, or future consideration | Informational only |
对每个发现的问题进行分类:
| 严重程度 | 符号 | 含义 | 所需行动 |
|---|---|---|---|
| 关键 | 🔴 | 会导致生产问题、安全漏洞或数据丢失 | 合并/部署前必须修复 |
| 主要 | 🟠 | 重大风险、性能问题或可维护性缺陷 | 应修复,会阻塞合并 |
| 次要 | 🟡 | 代码异味、风格问题或小优化机会 | 可考虑修复,不阻塞合并 |
| 备注 | 💡 | 观察结果、替代方案或未来考量 | 仅作信息参考 |
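The severity table above can be captured as data so that tooling treats each level consistently. The following is a minimal Python sketch, not part of the skill itself; the names `SeverityInfo`, `Severity`, and `blocking` are hypothetical, and the `blocks_merge` flag is an assumption derived from the "Action Required" column.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class SeverityInfo:
    symbol: str
    action: str
    blocks_merge: bool  # assumption: derived from the "Action Required" column


class Severity(Enum):
    CRITICAL = SeverityInfo("🔴", "Must fix before merge/deploy", True)
    MAJOR = SeverityInfo("🟠", "Should fix, blocking for merge", True)
    MINOR = SeverityInfo("🟡", "Consider fixing, non-blocking", False)
    NOTE = SeverityInfo("💡", "Informational only", False)


def blocking(severities):
    """Return the severities that block a merge, in the order given."""
    return [s for s in severities if s.value.blocks_merge]
```

For example, `blocking([Severity.NOTE, Severity.MAJOR])` keeps only `Severity.MAJOR`.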

Phase 4: Counter-Proposal

阶段4:反提案

For each Critical and Major finding, provide:
  1. What's wrong (1-2 sentences)
  2. Why it matters (concrete impact)
  3. Suggested fix (code snippet or approach)
  4. Trade-off of the fix (nothing is free — what does the fix cost?)
针对每个关键和主要问题,提供:
  1. 问题所在(1-2句话)
  2. 影响(具体后果)
  3. 建议修复方案(代码片段或实现思路)
  4. 修复的权衡(没有完美方案——修复会付出什么代价?)
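One way to enforce the four-part counter-proposal is to check each finding before the review is emitted. This is an illustrative Python sketch under assumed names (`Finding`, `missing_counter_proposal_fields`); the skill does not prescribe any data model.

```python
from dataclasses import dataclass

# The four items Phase 4 asks for on every Critical and Major finding.
REQUIRED_FOR_BLOCKING = ("what", "impact", "fix", "trade_off")


@dataclass
class Finding:
    severity: str  # "critical", "major", "minor", or "note"
    title: str
    what: str = ""
    impact: str = ""
    fix: str = ""
    trade_off: str = ""


def missing_counter_proposal_fields(finding):
    """List the Phase 4 fields a Critical or Major finding has left empty."""
    if finding.severity not in ("critical", "major"):
        return []  # Minor findings and Notes do not need a counter-proposal
    return [name for name in REQUIRED_FOR_BLOCKING
            if not getattr(finding, name).strip()]
```

A Major finding that states only what is wrong and why it matters would return `["fix", "trade_off"]`, signalling that the counter-proposal is incomplete.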

Output Format

输出格式


Adversarial Review: [Subject]

对抗评审:[评审主题]

Steel-Man

强化论证

[Why this approach makes sense — strongest justification]
[此方案的合理性——最有力的理由]

Findings

发现的问题

🔴 Critical: [Title]

🔴 关键:[标题]

Vector: [Logical Soundness / Edge Case / Structural Integrity]
What: [Description]
Impact: [Concrete consequence]
Fix: [Proposed solution]
Trade-off: [Cost of the fix]
维度: [逻辑合理性 / 边缘案例 / 结构完整性]
问题: [描述]
影响: [具体后果]
修复方案: [建议的解决方案]
权衡: [修复的代价]

🟠 Major: [Title]

🟠 主要:[标题]

...
...

🟡 Minor: [Title]

🟡 次要:[标题]

...
...

💡 Note: [Title]

💡 备注:[标题]

...
...

Summary

总结

| Severity | Count |
|---|---|
| 🔴 Critical | N |
| 🟠 Major | N |
| 🟡 Minor | N |
| 💡 Note | N |

| 严重程度 | 数量 |
|---|---|
| 🔴 关键 | N |
| 🟠 主要 | N |
| 🟡 次要 | N |
| 💡 备注 | N |

Verdict

verdict

[PASS / PASS WITH CONDITIONS / FAIL]
  • [If PASS WITH CONDITIONS: list required changes]
  • [If FAIL: list blocking issues]
[通过 / 有条件通过 / 不通过]
  • [若为有条件通过:列出所需修改内容]
  • [若为不通过:列出阻塞性问题]

Verdict Criteria

verdict 判定标准

  • FAIL: Any Critical finding with no viable short-term mitigation, OR 3+ Major findings
  • PASS WITH CONDITIONS: Any Critical finding with viable mitigation, OR 1-2 Major findings
  • PASS: No Critical findings, no Major findings. Minor and Notes only.
These thresholds ensure consistent verdicts across invocations.
  • 不通过:存在无法短期缓解的关键问题,或3个及以上主要问题
  • 有条件通过:存在可缓解的关键问题,或1-2个主要问题
  • 通过:无关键问题,无主要问题。仅存在次要问题和备注。
这些阈值确保不同调用场景下的判定结果一致。

Hidden Assumptions Exposed

暴露的隐藏假设

  • [Assumption 1 that the current approach relies on]
  • [Assumption 2 that could invalidate the approach if wrong]
  • [当前方案依赖的假设1]
  • [若不成立会导致方案失效的假设2]
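The Verdict Criteria thresholds in the output format can be read as a small decision procedure. The sketch below is one possible Python rendering; the function name `decide_verdict` and the `has_viable_mitigation` flag are assumptions introduced for illustration, since the skill leaves "viable short-term mitigation" to the reviewer's judgment.

```python
from collections import Counter


def decide_verdict(findings):
    """Map findings to a verdict using the Verdict Criteria thresholds.

    findings: iterable of (severity, has_viable_mitigation) tuples, where
    severity is "critical", "major", "minor", or "note". The mitigation
    flag only matters for critical findings.
    """
    findings = list(findings)
    counts = Counter(severity for severity, _ in findings)
    unmitigated_critical = any(
        severity == "critical" and not mitigated
        for severity, mitigated in findings
    )
    # FAIL: a Critical with no viable mitigation, or 3+ Major findings.
    if unmitigated_critical or counts["major"] >= 3:
        return "FAIL"
    # PASS WITH CONDITIONS: a mitigable Critical, or 1-2 Major findings.
    if counts["critical"] > 0 or counts["major"] > 0:
        return "PASS WITH CONDITIONS"
    # PASS: only Minor findings and Notes remain.
    return "PASS"
```

Checking FAIL first means a review with a mitigable Critical and three Major findings still fails, which matches the "OR 3+ Major findings" clause.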

Quality Calibration

质量校准

BAD Adversarial Review (Don't Do This)

糟糕的对抗评审(请勿效仿)


Adversarial Review: User Auth Module

对抗评审:用户认证模块

Steel-Man

强化论证

It works.
它能运行。

Findings

发现的问题

🟡 Minor: Variable naming

🟡 次要:变量命名

Vector: Structural Integrity
What: Some variables could be named better.
Impact: Readability.
Fix: Rename them.
Trade-off: Time.
维度: 结构完整性
问题: 部分变量命名可优化。
影响: 可读性。
修复方案: 重命名变量。
权衡: 耗时。

🟡 Minor: Could add more comments

🟡 次要:可添加更多注释

Vector: Structural Integrity
What: Code could use more comments.
Impact: Future developers might be confused.
Fix: Add comments.
Trade-off: None.
维度: 结构完整性
问题: 代码可添加更多注释。
影响: 未来开发者可能会困惑。
修复方案: 添加注释。
权衡: 无。

🟡 Minor: Consider using TypeScript

🟡 次要:考虑使用TypeScript

Vector: Logical Soundness
What: TypeScript would catch type errors.
Impact: Fewer runtime bugs.
Fix: Migrate to TypeScript.
Trade-off: Migration effort.
维度: 逻辑合理性
问题: TypeScript可捕获类型错误。
影响: 减少运行时bug。
修复方案: 迁移到TypeScript。
权衡: 迁移工作量。

Verdict: PASS

verdict:通过


**Why this is bad:**
- Steel-man is lazy — no genuine engagement with design intent
- All findings are shallow nitpicks, not substantive concerns
- No Critical or Major issues even considered — no real stress-testing happened
- Vectors are misapplied ("variable naming" is not Structural Integrity)
- Fixes are vague ("rename them", "add comments") with no specifics
- "Consider using TypeScript" is a preference, not a flaw found through analysis

**为何糟糕:**
- 强化论证敷衍——未真正理解设计意图
- 所有发现都是表面的吹毛求疵,而非实质性问题
- 未考虑任何关键或主要问题——未真正开展压力测试
- 维度应用错误("变量命名"不属于结构完整性范畴)
- 修复方案模糊("重命名变量"、"添加注释"),无具体内容
- "考虑使用TypeScript"是个人偏好,而非通过分析发现的缺陷

GOOD Adversarial Review (Do This)

优秀的对抗评审(请效仿)


Adversarial Review: User Auth Module

对抗评审:用户认证模块

Steel-Man

强化论证

JWT-based stateless auth was chosen to avoid session storage overhead and enable horizontal scaling. The 15-minute access token + 7-day refresh token split balances security against UX friction. Using bcrypt with cost factor 12 is a well-established choice for password hashing. This design optimizes for scalability and simplicity in a microservices context.
选择基于JWT的无状态认证是为了避免会话存储开销,并支持水平扩展。15分钟访问令牌+7天刷新令牌的组合在安全性和用户体验摩擦之间取得了平衡。使用成本因子为12的bcrypt是密码哈希的成熟方案。此设计针对微服务场景优化了可扩展性和简洁性。

Findings

发现的问题

🔴 Critical: No refresh token rotation enables silent session hijacking

🔴 关键:无刷新令牌轮换机制,允许静默会话劫持

Vector: Edge Case
What: Refresh tokens are long-lived (7 days) and not rotated on use. A stolen refresh token grants persistent access for the full 7-day window with no detection mechanism.
Impact: An attacker who intercepts one refresh token (via XSS, network sniffing, or device access) maintains access even after the user changes their password, since token revocation is not implemented.
Fix: Implement refresh token rotation: issue a new refresh token on each refresh, invalidate the previous one, and maintain a token family chain to detect reuse (which indicates theft).
Trade-off: Requires server-side storage for the token family chain, partially negating the "stateless" benefit. Adds ~50ms per refresh request.
维度: 边缘案例
问题: 刷新令牌有效期长(7天),且使用时不轮换。被盗的刷新令牌可在整个7天窗口期内持续获取访问权限,且无检测机制。
影响: 攻击者若拦截到刷新令牌(通过XSS、网络嗅探或设备访问),即使用户修改密码,仍能保持访问权限,因为未实现令牌吊销功能。
修复方案: 实现刷新令牌轮换:每次刷新时颁发新的刷新令牌,作废旧令牌,并维护令牌家族链以检测重复使用(这表明令牌已被盗)。
权衡: 需要服务器端存储令牌家族链,部分抵消了"无状态"的优势。每次刷新请求会增加约50ms延迟。

🟠 Major: Rate limiting uses in-memory store, lost on restart

🟠 主要:限流使用内存存储,重启后丢失

Vector: Structural Integrity
What: Login rate limiting uses a Map() that resets on process restart.
Impact: An attacker can bypass rate limiting by timing attempts around deploys or crashes. In a multi-instance deployment, each instance has its own counter, effectively multiplying the allowed attempts by instance count.
Fix: Move rate limit state to Redis with TTL-based expiry.
Trade-off: Adds Redis as an infrastructure dependency for the auth service. ~2ms latency per rate limit check.
维度: 结构完整性
问题: 登录限流使用Map()实现,进程重启后会重置。
影响: 攻击者可通过在部署或崩溃前后发起请求来绕过限流。在多实例部署场景下,每个实例有独立的计数器,实际允许的请求次数会乘以实例数量。
修复方案: 将限流状态迁移到带TTL过期的Redis中。
权衡: 为认证服务增加了Redis作为基础设施依赖。每次限流检查会增加约2ms延迟。

Verdict: PASS WITH CONDITIONS

verdict:有条件通过

  • Must implement refresh token rotation before production deploy
  • Should migrate rate limiting to shared store before scaling to >1 instance
  • 生产部署前必须实现刷新令牌轮换
  • 扩展到1个以上实例前,应将限流迁移到共享存储

**Why this is good:**
- Steel-man genuinely engages with the design rationale and trade-offs
- Findings target real security risks, not style preferences
- Each finding has specific, concrete impact (not "readability" or "confusion")
- Fixes include implementation direction AND quantified trade-offs
- Vectors are correctly applied and don't overlap
- Verdict follows directly from the severity thresholds

**为何优秀:**
- 强化论证真正理解了设计原理和权衡
- 发现的问题针对真实安全风险,而非风格偏好
- 每个问题都有具体、明确的影响(而非"可读性"或"困惑")
- 修复方案包含实现方向和量化的权衡
- 维度应用正确,无重叠
- verdict直接基于严重程度阈值得出
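The fix proposed in the Critical finding of the GOOD example, refresh token rotation with reuse detection, can be sketched as follows. This is an illustrative in-memory Python model, not the reviewed module's code: the class `RefreshTokenStore` and its methods are hypothetical, a real service would persist the state (which is the storage trade-off the finding names), and token expiry is left out for brevity.

```python
import secrets


class RefreshTokenStore:
    """In-memory stand-in for the server-side token family storage."""

    def __init__(self):
        self._family_of = {}   # token -> family id, for every token ever issued
        self._current = {}     # family id -> the only token still valid
        self._revoked = set()  # family ids revoked after detected reuse

    def issue(self, family_id=None):
        """Issue a token, starting a new family unless one is given."""
        family_id = family_id or secrets.token_hex(8)
        token = secrets.token_urlsafe(32)
        self._family_of[token] = family_id
        self._current[family_id] = token
        return token

    def rotate(self, token):
        """Exchange a refresh token for a new one, or raise on reuse."""
        family_id = self._family_of.get(token)
        if family_id is None or family_id in self._revoked:
            raise PermissionError("unknown or revoked refresh token")
        if self._current[family_id] != token:
            # An already-rotated token came back: treat it as theft and
            # revoke the whole family, including the newest token.
            self._revoked.add(family_id)
            raise PermissionError("refresh token reuse detected")
        return self.issue(family_id)
```

Replaying an old token after it has been rotated revokes the whole family, so both the attacker and the legitimate user must log in again. That is the detection mechanism the finding says the original design lacks.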

Specialized Modes

专项模式

Code Review Mode

代码评审模式

When reviewing code (files, PRs, diffs):
  • Read all changed files with the Read tool
  • Check for OWASP Top 10 vulnerabilities
  • Verify error handling completeness
  • Assess test coverage of edge cases
  • Review naming, structure, and abstraction levels
评审代码(文件、PR、差异)时:
  • 使用Read工具读取所有变更文件
  • 检查OWASP Top 10漏洞
  • 验证错误处理的完整性
  • 评估边缘案例的测试覆盖率
  • 评审命名、结构和抽象层次

Architecture Decision Mode

架构决策模式

When reviewing architecture/design decisions:
  • Evaluate scalability assumptions
  • Mentally test the design at 10x and 100x the current load
  • Check for single points of failure
  • Assess vendor lock-in risks
  • Consider team capability alignment
评审架构/设计决策时:
  • 评估可扩展性假设
  • 模拟当前负载10倍和100倍的场景
  • 检查单点故障
  • 评估供应商锁定风险
  • 考虑团队能力匹配度
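The 10x and 100x load check can be made concrete with a back-of-envelope projection. The Python sketch below assumes load scales linearly and that a single known capacity figure bounds the system; both are simplifications, and the function name `headroom_report` and the sample numbers are hypothetical.

```python
def headroom_report(current_rps, capacity_rps, multipliers=(1, 10, 100)):
    """Return (multiplier, projected_rps, fits) for each load multiplier.

    Assumes linear scaling against one capacity limit; real systems
    usually hit several different bottlenecks as load grows.
    """
    return [
        (m, current_rps * m, current_rps * m <= capacity_rps)
        for m in multipliers
    ]
```

With a hypothetical 200 requests per second today and 5,000 of capacity, the projection fits at 10x (2,000) but not at 100x (20,000), which is the point at which the scalability assumptions need explicit review.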

PR Review Mode

PR评审模式

When reviewing pull requests:
  • Focus on behavioral changes, not style
  • Check for breaking changes to public APIs
  • Verify backward compatibility
  • Assess rollback strategy
  • Check migration paths for data changes
评审拉取请求(PR)时:
  • 聚焦行为变更,而非风格
  • 检查对公共API的破坏性变更
  • 验证向后兼容性
  • 评估回滚策略
  • 检查数据变更的迁移路径

When to Use

适用场景

  • Before merging any significant PR
  • Before committing to an architecture decision
  • When evaluating third-party dependencies
  • When someone says "this should be fine"
  • When stakes are high and mistakes are expensive
  • After completing implementation, before calling it done
  • 合并重要PR之前
  • 确定架构决策之前
  • 评估第三方依赖时
  • 当有人说"这样应该没问题"时
  • 风险高、错误代价大时
  • 完成实现后,在宣布完成之前

When NOT to Use

不适用场景

  • Trivial changes (typos, formatting)
  • When exploration is needed first (use swing-research)
  • When generating alternatives (use swing-options)
  • When you need neutral, exhaustive analysis without a verdict (use deep-dive-analyzer; it understands, this skill challenges)
  • Personal preferences or subjective design choices
  • 琐碎变更(拼写错误、格式调整)
  • 首先需要探索的场景(请使用 swing-research)
  • 需要生成替代方案时(请使用 swing-options)
  • 需要中立、全面分析但无需结论时(请使用 deep-dive-analyzer,它专注于理解;本技能专注于挑战)
  • 个人偏好或主观设计选择

Integration Notes

集成说明

  • With swing-clarify: Run swing-clarify first on ambiguous requests before invoking this skill. Clarified scope produces better results.
  • With swing-options: After adversarial review reveals problems, use swing-options to generate alternative approaches
  • With swing-research: Use research to verify claims made during review (e.g., "is this really a security risk?"). For a full-rigor workflow: swing-research → swing-review
  • With deep-dive-analyzer: For understanding before challenging: deep-dive-analyzer (understand) → swing-review (challenge). This skill focuses on finding flaws; deep-dive focuses on neutral exhaustive analysis.
  • With orchestrator strategy team: Complements the strategy team's Devil's Advocate agent with structured methodology
  • 与swing-clarify集成: 在调用本技能前,先对模糊请求运行swing-clarify。明确的范围能产生更好的结果。
  • 与swing-options集成: 对抗评审发现问题后,使用swing-options生成替代方案
  • 与swing-research集成: 使用研究工具验证评审期间提出的主张(例如,"这真的是安全风险吗?")。完整严谨的工作流:swing-research → swing-review
  • 与deep-dive-analyzer集成: 先理解再挑战:deep-dive-analyzer(理解)→ swing-review(挑战)。本技能专注于发现缺陷;deep-dive专注于中立、全面的分析。
  • 与orchestrator strategy team集成: 以结构化方法论补充策略团队的Devil's Advocate agent