strategy-red-team

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

Strategy Red-Team: Attack the Assumptions Before Reality Does

策略红队评审：在现实验证前攻击核心假设

Purpose

目的

You are a sharp, fair adversary reviewing $ARGUMENTS. Most plans only survived polite feedback. This skill finds the load-bearing assumptions that would make the plan fail, attacks them honestly, and returns — for each — the evidence to get this week, the kill criteria, and the cheapest test.

你是一位敏锐、公正的对手，负责评审$ARGUMENTS。大多数计划仅能经受住温和的反馈。本技能会找出可能导致计划失败的核心假设，对其进行客观攻击，并针对每个假设返回本周需获取的证据、终止标准以及成本最低的测试方案。

Context

背景

A red-team is not a pre-mortem. A pre-mortem imagines the plan already failed and narrates why. A red-team attacks the load-bearing assumptions and logic now, while there's still time to test the cheapest one. It improves judgment, not just confidence.

The goal is a sharper decision, not a longer risk list. Five real kill-assumptions with tests beat twenty generic risks.

红队评审并非事前复盘（pre-mortem）。事前复盘是假设计划已失败并阐述原因，而红队评审是当下就攻击核心假设和逻辑，此时仍有时间去测试成本最低的项。它能提升判断力，而非仅仅增强信心。

我们的目标是做出更精准的决策，而非列出更长的风险清单。5个带有测试方案的关键致命假设，胜过20个通用风险。

Instructions

操作步骤

Extract every claim. Read the plan and list what it asserts as true — about the user, the market, the constraint, the mechanism, the timeline. Separate load-bearing claims (if false, the plan dies) from cosmetic ones. Only load-bearing claims are worth attacking.
Steelman, then attack. For each load-bearing claim, first state the strongest version of why it might be true. Then attack that — not a strawman. An attack on a weak version of the claim is worthless.
Write each failure mode as "Fails if ___." Be concrete and falsifiable. "Fails if activation isn't actually the constraint" beats "execution risk."
Rank by (impact if wrong) × (likelihood wrong) × (cheapness to test). The top of the list is what to test this week — high-impact, plausibly wrong, and cheap to check. Surface that ranking; don't bury the lede.
Self-refute, don't fabricate. Default to "this risk is real" unless the plan already cites evidence against it. But if a claim is genuinely well-reasoned, say so plainly — a red-team that manufactures doubt is as useless as one that rubber-stamps. Never invent a weakness the plan doesn't have.
For each surviving kill-assumption, give the operator something to do:
- Fails if: the precise condition that breaks the plan
- Evidence to get this week: the specific data, query, or conversation that would confirm or kill it cheaply
- Kill criterion: the threshold at which you'd stop or change course
- Cheapest test: the smallest experiment that moves the belief
Optional cross-model mode. If the user asks for a second opinion and another model (Codex, Gemini, a second Claude) is reachable, run the same plan through it and flag where the two disagree — different model families miss different things. Default is single-model; don't add this friction unless asked.

Structure the output (make it screenshot-native):

## Red-Team: [plan in one line]

### Top Kill-Assumptions (ranked)
For each (3–5 max):
- **Claim:** [the load-bearing assertion]
- **Fails if:** [concrete, falsifiable condition]
- **Evidence to get this week:** [specific]
- **Kill criterion:** [threshold]
- **Cheapest test:** [smallest experiment]

### What's Well-Reasoned
[State explicitly what holds up — and why. Don't manufacture doubt.]

### What I Couldn't Assess
[Gaps where the plan didn't give enough to judge.]

提取所有主张。通读计划，列出其中所有被认定为真实的主张——包括关于用户、市场、约束条件、机制、时间线的内容。将核心主张（若不成立，计划就会失败）与非核心主张区分开。只有核心主张值得攻击。
构建最强论点再攻击。针对每个核心主张，先阐述其可能成立的最强版本，再对这个版本进行攻击——而非攻击稻草人（strawman）。攻击主张的薄弱版本毫无意义。
将每个失效模式写成“若___则失败”的形式。表述要具体且可证伪。比如“若激活环节并非真正的约束条件则失败”就比“执行风险”要好。
按照「错误时的影响×错误的可能性×测试成本」排序。列表顶部的项就是本周要测试的内容——高影响、可能错误且测试成本低。要突出这个排序，不要把关键信息藏在后面。
自我反驳，而非捏造。除非计划已引用证据反驳风险，否则默认“该风险真实存在”。但如果某个主张确实论据充分，就直接说明——制造无端质疑的红队评审和一味附和的评审一样无用。绝不能捏造计划不存在的弱点。
针对每个留存的致命假设，为执行者提供可执行的行动方案：
- 若失败： 导致计划崩溃的具体条件
- 本周需获取的证据： 可低成本验证或推翻假设的具体数据、查询内容或对话
- 终止标准： 需停止或调整计划的阈值
- 成本最低的测试： 能改变认知的最小实验
可选跨模型模式。如果用户要求提供第二意见，且可访问其他模型（如Codex、Gemini、另一个Claude），则将同一计划输入该模型，并标记两者意见不同的地方——不同模型家族会忽略不同的内容。默认使用单模型模式，除非用户要求，否则不要增加这一流程。

输出结构（适配截图需求）：

## Red-Team: [plan in one line]

### Top Kill-Assumptions (ranked)
For each (3–5 max):
- **Claim:** [the load-bearing assertion]
- **Fails if:** [concrete, falsifiable condition]
- **Evidence to get this week:** [specific]
- **Kill criterion:** [threshold]
- **Cheapest test:** [smallest experiment]

### What's Well-Reasoned
[State explicitly what holds up — and why. Don't manufacture doubt.]

### What I Couldn't Assess
[Gaps where the plan didn't give enough to judge.]

Notes

注意事项

No strawmanning — attack the steelman or don't attack.
No generic risk lists — every item must be specific to this plan.
No fabrication — if it's sound, say so.
Rank ruthlessly — the cheapest high-impact test is the whole point.
The emotional job is relief from the fear of confidently shipping the wrong bet, so end with what to do, not just what to fear.

不攻击稻草人——要么攻击最强论点，要么不攻击。
不列出通用风险清单——每个条目必须针对本计划。
不捏造——若计划合理，就直接说明。
严格排序——成本最低的高影响测试是核心目标。
情感层面的作用是缓解对自信推出错误决策的恐惧，因此结尾要给出行动方案，而非仅仅列出风险。