Pre-Mortem

Prospective failure analysis that defeats optimism bias by assuming failure first, then working backward to surface risks, early warnings, and escape hatches.

Based on Gary Klein's pre-mortem technique: instead of asking "will this work?" (which triggers optimism bias), this skill forces the question: "It's 6 months from now and this has completely failed. What went wrong?"

Key distinction from `swing-review`:

`swing-review` examines the CURRENT state: "what's wrong NOW?"
`swing-mortem` examines the FUTURE: "what will go wrong LATER?"

Adversarial review finds existing flaws. Pre-mortem anticipates flaws that don't exist yet.

Rules (Absolute)

Never produce generic risks. Every failure scenario must name specific technologies, quantities, timelines, or conditions. "The database might not scale" is banned. "PostgreSQL connection pool exhaustion at >2,000 concurrent users due to long-running analytical queries holding connections for 30s+" is acceptable.
Exactly 5 scenarios across 5 categories. One Technical, one Organizational, one External, one Temporal, one Assumption. No category may be skipped, no category may have more than one scenario.
Leading indicators must be observable and measurable. "Watch out for problems" is banned. Every indicator must specify what to measure, what threshold signals danger, and where to observe it.
Circuit breakers must include a specific trigger condition. "If things go wrong" is banned. Every trigger must be a measurable condition with a concrete threshold.
The pre-mortem summary is MANDATORY. It is the BLUF (bottom line up front) of the analysis. It must appear at the end and synthesize the highest risk, its leading indicator, and its escape hatch in one paragraph.
Assume complete failure. Not partial, not "underperformance." The premise is total failure. This extreme framing is what forces creative risk identification — do not soften it.
Specificity over coverage. One deeply analyzed, plausible failure scenario per category is worth more than five shallow ones. Depth beats breadth.
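
The "exactly 5 scenarios across 5 categories" rule is mechanical enough to check automatically. The following is a minimal sketch, assuming scenarios are represented only by their category names; the function name `category_violations` is hypothetical and not part of this skill.

```python
from collections import Counter

# The five categories named in the rules above.
REQUIRED_CATEGORIES = ("Technical", "Organizational", "External", "Temporal", "Assumption")


def category_violations(scenario_categories):
    """Return human-readable violations; an empty list means the rule holds."""
    counts = Counter(scenario_categories)
    problems = []
    for category in REQUIRED_CATEGORIES:
        n = counts.get(category, 0)
        if n == 0:
            problems.append(f"missing category: {category}")
        elif n > 1:
            problems.append(f"{category} has {n} scenarios, expected 1")
    # Anything outside the five named categories is also a violation.
    for category in counts:
        if category not in REQUIRED_CATEGORIES:
            problems.append(f"unknown category: {category}")
    return problems
```

A check like this only covers the counting rule. Whether a scenario is specific enough still needs a human (or model) judgment.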

Process

Execute these 6 phases sequentially. Do NOT skip phases.

Phase 1: Set the Failure Frame

Establish the temporal and contextual frame before any analysis.

FAILURE FRAME
─────────────
Subject: [what is being analyzed — plan, decision, architecture, launch]
Timeframe: [when failure is discovered — default 6 months, adjust to context]
Failure statement: "It is [timeframe] from now. [Subject] has failed completely.
Not partially underperformed — completely failed. The team is conducting a
post-mortem. What went wrong?"

If the subject is ambiguous or too broad, ask one clarifying question before proceeding. "Analyze our project" is too vague. "Analyze our migration from MongoDB to PostgreSQL for the user service" is actionable.

Before generating scenarios, gather context:

If code/architecture exists, read relevant files to ground scenarios in reality
If a project plan exists, examine timelines and dependencies
If prior decisions are documented, review the rationale and constraints

Do not generate scenarios from imagination alone when concrete artifacts are available.

Phase 2: Failure Scenario Generation

Generate exactly 5 failure scenarios, one per category. Each scenario must be a specific, plausible narrative — not a generic risk label.

Category 1: Technical

The technology didn't work as expected. Name the specific technology, the specific failure mode, and the specific conditions under which it failed.

Category 2: Organizational

Team, process, or communication broke down. Name the specific team dynamics, handoff points, or process gaps that caused failure.

Category 3: External

Market shifted, competitor moved, regulation changed, or a dependency broke. Name the specific external force and its specific impact.

Category 4: Temporal

Timeline was wrong. Name what took longer (or what window was missed) and by how much, with the specific cascading consequence.

Category 5: Assumption

A core assumption turned out to be false. Name the specific assumption, why it seemed reasonable at the time, and what reality turned out to be.

Format each scenario as:

SCENARIO [N]: [Category] — [Title]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

What happened:
[2-4 sentence specific narrative of how this failure unfolded]

Why it was plausible:
[1-2 sentences on why this wasn't obvious beforehand]

Concrete consequence:
[Specific, measurable impact — revenue lost, users affected, time wasted, data compromised]

Phase 3: Likelihood x Impact Matrix

Rate each scenario and determine priority:

RISK MATRIX
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
| # | Category       | Scenario             | Likelihood | Impact       | Priority |
|---|----------------|----------------------|------------|--------------|----------|
| 1 | Technical      | [title]              | H / M / L  | Cat/Sev/Mod  | [rank]   |
| 2 | Organizational | [title]              | H / M / L  | Cat/Sev/Mod  | [rank]   |
| 3 | External       | [title]              | H / M / L  | Cat/Sev/Mod  | [rank]   |
| 4 | Temporal       | [title]              | H / M / L  | Cat/Sev/Mod  | [rank]   |
| 5 | Assumption     | [title]              | H / M / L  | Cat/Sev/Mod  | [rank]   |

Likelihood: High (>50%), Medium (15-50%), Low (<15%)
Impact: Catastrophic (project killed, irreversible damage), Severe (major rework, significant loss), Moderate (setback, recoverable with effort)

Priority scoring:

High + Catastrophic = P1
High + Severe OR Medium + Catastrophic = P2
Medium + Severe OR High + Moderate = P3
Everything else = P4

Select the top 3 by priority for detailed analysis in Phases 4-5. In case of tie, prefer higher Likelihood.
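
The scoring table and tie-break rule above can be sketched in a few lines. This is an illustration under stated assumptions: scenarios are `(title, likelihood, impact)` tuples using the matrix abbreviations (`H`/`M`/`L`, `Cat`/`Sev`/`Mod`), and the function names are hypothetical.

```python
# Higher number = more likely; used only to break priority ties.
LIKELIHOOD_RANK = {"H": 3, "M": 2, "L": 1}


def priority(likelihood, impact):
    """Map a (likelihood, impact) pair to 1..4, i.e. P1..P4."""
    pair = (likelihood, impact)
    if pair == ("H", "Cat"):
        return 1
    if pair in {("H", "Sev"), ("M", "Cat")}:
        return 2
    if pair in {("M", "Sev"), ("H", "Mod")}:
        return 3
    return 4  # everything else, including Low + Catastrophic


def top_three(scenarios):
    """Sort by priority, break ties by higher likelihood, keep three titles."""
    ranked = sorted(
        scenarios,
        key=lambda s: (priority(s[1], s[2]), -LIKELIHOOD_RANK[s[1]]),
    )
    return [title for title, _, _ in ranked[:3]]
```

Note that under this table a Low-likelihood, Catastrophic-impact scenario lands in P4, so it is only selected for Phases 4-5 when fewer than three scenarios score higher.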

Phase 4: Leading Indicators

For each of the top 3 risks, identify 2-3 early warning signals. These are observable, measurable conditions that would indicate the failure mode is beginning to materialize — before it's too late.

LEADING INDICATORS — Scenario [N]: [Title]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Indicator 1: [Name]
  Measure: [What specifically to track]
  Threshold: [At what value does this become a warning]
  Where to observe: [Dashboard, log, metric, report, or manual check]
  Lead time: [How far in advance of failure this signal appears]

Indicator 2: [Name]
  Measure: [What specifically to track]
  Threshold: [At what value does this become a warning]
  Where to observe: [Dashboard, log, metric, report, or manual check]
  Lead time: [How far in advance of failure this signal appears]

Every indicator must pass the "intern test": could a new team member, given this description alone, determine whether the threshold has been crossed? If not, make it more specific.
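
One way to make the "intern test" concrete is to write each indicator so that the threshold check needs no interpretation. The sketch below is an assumption about how an indicator could be represented; the class, its field names, and the `higher_is_worse` flag are hypothetical and not prescribed by this skill.

```python
from dataclasses import dataclass


@dataclass
class LeadingIndicator:
    name: str
    measure: str            # what specifically to track
    threshold: float        # value at which this becomes a warning
    where_to_observe: str   # dashboard, log, metric, report, or manual check
    lead_time_days: int     # how far ahead of failure the signal appears
    higher_is_worse: bool = True  # assumption: direction of the danger zone

    def is_warning(self, observed: float) -> bool:
        """True when the observed value has crossed the threshold."""
        if self.higher_is_worse:
            return observed >= self.threshold
        return observed <= self.threshold


# Illustrative values only, echoing the PostgreSQL example from the rules.
pool_usage = LeadingIndicator(
    name="Connection pool saturation",
    measure="Peak share of pool connections in use, in percent",
    threshold=80.0,
    where_to_observe="Database metrics dashboard",
    lead_time_days=14,
)
```

If an indicator cannot be written with a numeric `threshold` and a named place to observe it, that is usually a sign it would fail the intern test.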

Phase 5: Circuit Breakers

For each of the top 3 risks, define the decision framework for when and how to change course.

CIRCUIT BREAKER — Scenario [N]: [Title]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Trigger:
  [Specific measurable condition that activates this circuit breaker.
   Must be a concrete threshold, not "if things go wrong."]

Fallback:
  [The alternative path. What do you switch to? Be specific about
   the replacement approach, not just "find another way."]

Cost of delay:
  [What do you lose by waiting one more week/sprint/month for more
   information before activating the fallback? Quantify if possible.]

Decision owner:
  [Who has authority to pull this trigger? Role, not name.]
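
A trigger that is "a concrete threshold" can be evaluated mechanically. The following is a minimal sketch, assuming the trigger is a single metric that must stay at or above a threshold for several consecutive readings; the class, its fields, and the consecutive-readings rule are illustrative assumptions rather than part of the template.

```python
from dataclasses import dataclass


@dataclass
class CircuitBreaker:
    trigger_metric: str
    trigger_threshold: float
    consecutive_periods: int  # assumption: must be >= 1
    fallback: str
    decision_owner: str       # a role, not a person's name

    def should_trip(self, readings):
        """True when the most recent N readings all meet or exceed the threshold."""
        recent = readings[-self.consecutive_periods:]
        if len(recent) < self.consecutive_periods:
            return False  # not enough data yet to decide
        return all(r >= self.trigger_threshold for r in recent)


# Illustrative values only.
breaker = CircuitBreaker(
    trigger_metric="p95 query latency (ms)",
    trigger_threshold=500.0,
    consecutive_periods=3,
    fallback="Route analytical queries to a read replica",
    decision_owner="Engineering lead",
)
```

Requiring several consecutive readings is one design choice for avoiding a fallback triggered by a single noisy measurement; a single-reading trigger is the special case `consecutive_periods=1`.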

Phase 6: Pre-Mortem Summary (BLUF)

Synthesize the entire analysis into one paragraph. This is the most important output — the reader should be able to read ONLY this paragraph and walk away with the critical insight.

Format:

PRE-MORTEM SUMMARY
━━━━━━━━━━━━━━━━━━

The highest risk to [subject] is [specific scenario from top priority].
You'll know it's happening when [most actionable leading indicator with
threshold]. Your escape hatch is [primary fallback from circuit breaker].
The cost of ignoring this: [concrete consequence]. The cost of acting
too early: [trade-off of the fallback]. Monitor [specific metric] starting
[when] to stay ahead of this risk.
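
Because the summary is mandatory and every slot in it matters, it can help to refuse to render it when a slot is empty. This is a sketch under assumptions: the placeholder names (`subject`, `scenario`, and so on) are hypothetical labels for the bracketed slots in the template above.

```python
from string import Formatter

SUMMARY_TEMPLATE = (
    "The highest risk to {subject} is {scenario}. "
    "You'll know it's happening when {indicator}. "
    "Your escape hatch is {fallback}. "
    "The cost of ignoring this: {consequence}. "
    "The cost of acting too early: {tradeoff}. "
    "Monitor {metric} starting {start} to stay ahead of this risk."
)


def render_summary(**fields):
    """Fill the template, raising ValueError if any slot is missing or blank."""
    # Formatter().parse yields (literal, field_name, format_spec, conversion).
    required = [name for _, name, _, _ in Formatter().parse(SUMMARY_TEMPLATE) if name]
    missing = [name for name in required if not str(fields.get(name, "")).strip()]
    if missing:
        raise ValueError(f"summary is missing: {', '.join(missing)}")
    return SUMMARY_TEMPLATE.format(**fields)
```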

Output Format

Pre-Mortem: [Subject]