swing-trace
Reasoning Tracer
Anti-black-box engine that makes reasoning chains visible, auditable, and decomposable.
Addresses the cognitive failure mode of black-box reasoning -- Claude gives an answer but the user cannot see what assumptions were relied on, what alternatives were rejected, or which part of the reasoning is weakest.
Rules (Absolute)
- Never present a single-path narrative. Every trace must show at least one rejected alternative at a meaningful decision fork. "I considered X but chose Y because Z" is the minimum; two rejected alternatives are preferred.
- Confidence decomposition requires 3+ sub-components. Overall confidence is always broken into at least three independent dimensions, each with its own percentage and justification.
- Every assumption gets rated. Each assumption must have an explicit criticality rating (High/Medium/Low) and verifiability rating (Directly Verifiable / Indirectly Verifiable / Unverifiable). No unrated assumptions.
- Weakest Link is MANDATORY. Never skip it. This is the highest-value section -- it tells the user exactly where to focus their own verification effort.
- No confidence theater. Do not assign high confidence (>80%) without specific justification. Vague appeals to "experience" or "common knowledge" are banned. Every confidence level must cite a concrete basis.
- Distinguish evidence types. Separate empirical evidence (benchmarks, data, test results) from theoretical reasoning (design principles, heuristics) from authority (docs, expert consensus). Label which type supports each claim.
- Trace must be falsifiable. Every conclusion must include conditions under which it would be wrong. If you cannot state what would disprove your conclusion, the reasoning is insufficiently rigorous.
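The two confidence rules above (3+ sub-components, no confidence theater) lend themselves to a mechanical pre-check. The following is a minimal sketch, not part of the skill itself: the function name, the dict keys, and the list of banned phrases are all assumptions made for illustration, and a substring match is only a crude proxy for "vague appeal".

```python
# Hypothetical phrase list: a crude stand-in for the "vague appeals" the rules ban.
BANNED_PHRASES = ("common knowledge", "in my experience", "from experience")


def confidence_rule_violations(components):
    """Check a confidence decomposition against the rules.

    components: list of dicts with keys 'name', 'confidence' (0-100), 'basis'.
    Returns a list of human-readable violation messages (empty if none found).
    """
    violations = []
    if len(components) < 3:
        violations.append("confidence decomposition has fewer than 3 sub-components")
    for comp in components:
        basis = comp.get("basis", "").strip()
        if not basis:
            violations.append(f"{comp['name']}: no concrete basis cited")
        elif any(phrase in basis.lower() for phrase in BANNED_PHRASES):
            violations.append(f"{comp['name']}: basis relies on a vague appeal")
    return violations


draft = [
    {"name": "Technical Feasibility", "confidence": 90, "basis": "Common knowledge"},
    {"name": "Timeline Estimate", "confidence": 45, "basis": ""},
]
for message in confidence_rule_violations(draft):
    print(message)
```

A check like this can only catch missing or obviously vague justifications; whether a cited basis is actually specific enough still needs a human or model judgment.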
Mode Selection
Quick Mode (Default)
When invoked without --full, execute only:
- Stage 1: Claim Isolation -- break into atomic claims
- Stage 2: Assumption Inventory -- enumerate assumptions with criticality/verifiability
- Stage 5: Weakest Link & Alternative Conclusion -- identify the single most fragile assumption
Skip Stages 3 (Decision Tree) and 4 (Confidence Decomposition).
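The mode split can be pictured as a simple lookup from the invocation arguments to the stages that run. This is an illustrative sketch only: the `STAGES` table and `stages_for` function are hypothetical names, and the skill itself is a prompt, not a Python program.

```python
# Stage names mirror the stage headings in this document.
STAGES = {
    1: "Claim Isolation",
    2: "Assumption Inventory",
    3: "Decision Tree with Branch Justifications",
    4: "Confidence Decomposition",
    5: "Weakest Link & Alternative Conclusion",
}

QUICK_STAGES = (1, 2, 5)  # quick mode skips stages 3 and 4


def stages_for(args):
    """Return the stage labels to execute for the given invocation arguments."""
    selected = sorted(STAGES) if "--full" in args else QUICK_STAGES
    return [f"Stage {n}: {STAGES[n]}" for n in selected]


print(stages_for([]))          # quick mode (default)
print(stages_for(["--full"]))  # full mode
```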
Quick mode output format:
Reasoning Trace: [Claim]
Atomic Claims
- [Claim 1]
- [Claim 2]
Assumption Inventory
| # | Assumption | Criticality | Verifiability |
|---|---|---|---|
| A1 | ... | High/Med/Low | Direct/Indirect/Unverifiable |
Weakest Link
Assumption [A#]: [restate]
- Why weakest: [explanation]
- If wrong: [alternative conclusion]
- How to verify: [concrete steps]
Full Mode (--full)
When invoked with --full, execute all 5 stages as documented below.
Process
Execute these 5 stages sequentially. Do NOT skip stages.
Stage 1: Claim Isolation
Identify the exact claim(s) being traced. Separate compound questions into atomic claims.
Input: "Why did you recommend microservices over a monolith?"
Atomic claims:
1. Microservices are a better architectural fit for this project
2. The team can handle microservices operational complexity
3. The migration cost is justified by long-term benefits
Each atomic claim gets its own assumption inventory and confidence score.
Stage 2: Assumption Inventory
For each atomic claim, enumerate every assumption the reasoning depends on. Each assumption gets three attributes:
| # | Assumption | Criticality | Verifiability |
|---|---|---|---|
| A1 | [Statement] | High -- conclusion changes if wrong | Directly Verifiable -- can test/measure |
| A2 | [Statement] | Medium -- conclusion weakens if wrong | Indirectly Verifiable -- can infer from proxy data |
| A3 | [Statement] | Low -- conclusion survives if wrong | Unverifiable -- must be accepted or rejected on judgment |
Criticality scale:
- High: If this assumption is wrong, the conclusion flips or becomes unjustifiable.
- Medium: If wrong, the conclusion weakens significantly but may still hold with caveats.
- Low: If wrong, the conclusion is largely unaffected.
Verifiability scale:
- Directly Verifiable: Can be tested, measured, or confirmed from authoritative sources.
- Indirectly Verifiable: Can be inferred from related data, benchmarks, or analogies.
- Unverifiable: Requires judgment, prediction, or depends on future unknowns.
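One way to make "no unrated assumptions" hard to violate is to model the two scales as enumerations, so an assumption cannot be constructed without both ratings. The sketch below is an illustration under that assumption; the class and function names (`Assumption`, `render_inventory`) are hypothetical and not defined by the skill.

```python
from dataclasses import dataclass
from enum import Enum


class Criticality(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


class Verifiability(Enum):
    DIRECT = "Directly Verifiable"
    INDIRECT = "Indirectly Verifiable"
    UNVERIFIABLE = "Unverifiable"


@dataclass(frozen=True)
class Assumption:
    id: str
    statement: str
    criticality: Criticality
    verifiability: Verifiability

    def __post_init__(self):
        # Dataclasses do not enforce annotations at runtime, so check explicitly.
        if not isinstance(self.criticality, Criticality):
            raise TypeError(f"{self.id}: criticality must be a Criticality member")
        if not isinstance(self.verifiability, Verifiability):
            raise TypeError(f"{self.id}: verifiability must be a Verifiability member")

    def to_row(self):
        """Render this assumption as one row of the inventory table."""
        return (f"| {self.id} | {self.statement} | "
                f"{self.criticality.value} | {self.verifiability.value} |")


def render_inventory(assumptions):
    """Render the full inventory table in the format shown above."""
    header = ["| # | Assumption | Criticality | Verifiability |", "|---|---|---|---|"]
    return "\n".join(header + [a.to_row() for a in assumptions])


a1 = Assumption("A1", "Deployment is multi-instance",
                Criticality.HIGH, Verifiability.DIRECT)
print(render_inventory([a1]))
```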
Stage 3: Decision Tree with Branch Justifications
At each significant fork in the reasoning, document:
- Decision point: What question needed answering?
- Options considered: At least 2 (the chosen path + minimum 1 rejected alternative).
- Evaluation criteria: What factors determined the choice?
- Chosen path: Which option was selected?
- Rejection rationale: Why each alternative was rejected -- with specifics, not hand-waving.
- Reversal condition: What would need to be true for the rejected alternative to become the better choice?
Decision Point: Database selection
├─ Option A: PostgreSQL [CHOSEN]
│ Strengths: ACID compliance, JSON support, ecosystem maturity
│ Evidence type: Empirical (benchmarks) + Authority (industry adoption data)
│
├─ Option B: SQLite [REJECTED]
│ Strengths: Zero-config, embedded, fast for reads
│ Rejection: Only one writer at a time (single-writer locking), incompatible with
│ multi-instance deployment requirement (Assumption A2)
│ Reversal: If deployment is single-instance AND write volume < 100/sec,
│ SQLite becomes the simpler, better choice
│
└─ Option C: MongoDB [REJECTED]
   Strengths: Schema flexibility, horizontal scaling
   Rejection: Data has strong relational structure (7 FK relationships);
   denormalization cost outweighs flexibility benefit
   Reversal: If schema changes weekly or data is primarily document-shaped
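The fork fields listed above can be captured in a small record type with a completeness check, so a fork missing a rejected alternative, a rejection rationale, or a reversal condition is caught before the trace is presented. This is a sketch under assumed names (`Option`, `DecisionFork`, `problems`); none of them come from the skill itself.

```python
from dataclasses import dataclass, field


@dataclass
class Option:
    name: str
    strengths: str
    rejection: str = ""  # left empty for the chosen option
    reversal: str = ""   # what would make a rejected option the better choice


@dataclass
class DecisionFork:
    decision_point: str
    criteria: list
    chosen: Option
    rejected: list = field(default_factory=list)

    def problems(self):
        """Return a list of completeness issues; empty means the fork is well-formed."""
        issues = []
        if not self.rejected:
            issues.append("no rejected alternative documented")
        for option in self.rejected:
            if not option.rejection.strip():
                issues.append(f"{option.name}: missing rejection rationale")
            if not option.reversal.strip():
                issues.append(f"{option.name}: missing reversal condition")
        return issues


fork = DecisionFork(
    decision_point="Database selection",
    criteria=["relational integrity", "deployment model"],
    chosen=Option("PostgreSQL", "ACID compliance, JSON support, ecosystem maturity"),
    rejected=[
        Option(
            "MongoDB",
            "Schema flexibility, horizontal scaling",
            rejection="Data has strong relational structure (7 FK relationships)",
            reversal="Schema changes weekly or data is primarily document-shaped",
        )
    ],
)
print(fork.problems())
```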
Stage 4: Confidence Decomposition
Break overall confidence into independent sub-components. Minimum 3, recommended 4-6.
Each sub-component gets:
- A percentage (0-100%)
- The evidence type supporting it (Empirical / Theoretical / Authority / Mixed)
- A 1-2 sentence justification citing the specific basis
Overall Confidence: 71%
Sub-components:
Technical Feasibility: 90% [Empirical] -- proven in similar systems (refs: X, Y benchmarks)
Timeline Estimate: 45% [Theoretical] -- based on analogy to past project, but team composition differs
Cost Projection: 60% [Mixed] -- infrastructure costs are empirical, opportunity cost is estimated
Team Capability Match: 75% [Authority] -- based on stated team skills; not independently verified
Risk Assessment: 80% [Theoretical] -- standard failure modes well-understood; novel integration untested
Weighted Overall: (90*0.30 + 45*0.25 + 60*0.15 + 75*0.15 + 80*0.15) = 70.5% ≈ 71%
The overall confidence is NOT the simple average. Weight sub-components by their importance to the conclusion.
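The weighting step is plain arithmetic and is easy to check by script. The sketch below uses the five sub-components and weights from the example; the function name `weighted_confidence` and the requirement that weights sum to 1.0 are assumptions of this illustration. With these weights the products are 27 + 11.25 + 9 + 11.25 + 12, which sum to 70.5, while the unweighted mean is 70.

```python
import math

# (name, confidence in percent, weight) taken from the example above.
COMPONENTS = [
    ("Technical Feasibility", 90, 0.30),
    ("Timeline Estimate", 45, 0.25),
    ("Cost Projection", 60, 0.15),
    ("Team Capability Match", 75, 0.15),
    ("Risk Assessment", 80, 0.15),
]


def weighted_confidence(components):
    """Return the importance-weighted overall confidence in percent."""
    if len(components) < 3:
        raise ValueError("at least 3 sub-components are required")
    total_weight = sum(weight for _, _, weight in components)
    if not math.isclose(total_weight, 1.0, abs_tol=1e-9):
        raise ValueError(f"weights must sum to 1.0, got {total_weight}")
    return sum(confidence * weight for _, confidence, weight in components)


overall = weighted_confidence(COMPONENTS)
simple_average = sum(c for _, c, _ in COMPONENTS) / len(COMPONENTS)
print(f"weighted: {overall:.1f}%  simple average: {simple_average:.1f}%")
```

Here the two figures happen to land close together; they diverge more when a heavily weighted dimension sits far from the rest.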
Stage 5: Weakest Link & Alternative Conclusion
Weakest Link Identification:
Which single assumption or sub-conclusion, if wrong, would MOST change the final answer?
Criteria for selecting the weakest link:
- Highest criticality among assumptions
- Lowest verifiability (hardest to confirm)
- Lowest confidence among sub-components
- The intersection of these three factors is the weakest link
Alternative Conclusion:
"If [weakest link assumption] is wrong, then the conclusion changes to [X]."
This is not hypothetical filler -- it must be a genuinely reasoned alternative conclusion that follows logically from negating the weakest assumption.
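The selection criteria can be read as a ranking over rated assumptions. The text does not say how to break ties when no single assumption is worst on all three factors, so the sketch below assumes one possible ordering: criticality first, then difficulty of verification, then lowest confidence. The names `RatedAssumption` and `weakest_link`, and the sample data, are hypothetical.

```python
from dataclasses import dataclass

CRITICALITY_RANK = {"High": 3, "Medium": 2, "Low": 1}
# Harder to verify means a higher rank, i.e. a more fragile assumption.
VERIFIABILITY_RANK = {
    "Unverifiable": 3,
    "Indirectly Verifiable": 2,
    "Directly Verifiable": 1,
}


@dataclass(frozen=True)
class RatedAssumption:
    id: str
    statement: str
    criticality: str
    verifiability: str
    confidence: int  # percent, 0-100


def weakest_link(assumptions):
    """Pick the most fragile assumption under the assumed lexicographic ordering."""
    if not assumptions:
        raise ValueError("cannot select a weakest link from an empty inventory")
    return max(
        assumptions,
        key=lambda a: (
            CRITICALITY_RANK[a.criticality],
            VERIFIABILITY_RANK[a.verifiability],
            -a.confidence,  # lower confidence ranks as weaker
        ),
    )


inventory = [
    RatedAssumption("A1", "Traffic stays under 100 writes/sec", "High", "Directly Verifiable", 70),
    RatedAssumption("A2", "Vendor keeps the API stable for 2 years", "High", "Unverifiable", 55),
    RatedAssumption("A3", "Team prefers SQL tooling", "Medium", "Indirectly Verifiable", 40),
]
link = weakest_link(inventory)
print(f"Weakest link: {link.id} -- {link.statement}")
```

A different ordering (for example, confidence first) is equally defensible; the point is that the choice of weakest link should be traceable to the ratings rather than picked by feel.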
Output Format
Reasoning Trace: [Claim/Question]
Atomic Claims
- [Claim 1]
- [Claim 2]
- [Claim N]
Assumption Inventory
| # | Assumption | Criticality | Verifiability | Tied to Claim |
|---|---|---|---|---|
| A1 | [Statement] | High | Directly Verifiable | Claim 1 |
| A2 | [Statement] | High | Unverifiable | Claim 2 |
| A3 | [Statement] | Medium | Indirectly Verifiable | Claim 1, 3 |
| ... | ... | ... | ... | ... |
Decision Forks
Fork 1: [Decision Point]
- Chosen: [Option] -- [rationale with evidence type label]
- Rejected: [Option] -- [specific rejection reason]
  - Reversal condition: [what would make this the right choice]
- Rejected: [Option] -- [specific rejection reason]
  - Reversal condition: [what would make this the right choice]
Fork 2: [Decision Point]
...
Confidence Decomposition
| Dimension | Confidence | Evidence Type | Basis |
|---|---|---|---|
| [Dimension 1] | XX% | Empirical | [specific justification] |
| [Dimension 2] | XX% | Theoretical | [specific justification] |
| [Dimension 3] | XX% | Authority | [specific justification] |
| [Dimension N] | XX% | Mixed | [specific justification] |
Overall Confidence: XX% (weighted by dimension importance)
Weakest Link
Assumption [A#]: [restate the assumption]
- Criticality: High
- Verifiability: [rating]
- Current confidence in this assumption: XX%
- Why this is the weakest link: [explanation]
- How to verify: [concrete steps the user can take]
Alternative Conclusion
If [weakest assumption] is wrong, then: [Alternative conclusion with reasoning -- not a throwaway sentence but a genuinely traced alternative]
Falsification Conditions
The main conclusion is wrong if ANY of these turn out to be true:
- [Condition 1]
- [Condition 2]
- [Condition N]
Quality Calibration
BAD Example (Do NOT produce this)
Q: "Why did you recommend React over Vue?"
I recommended React because it has a larger ecosystem, more job
postings, and better TypeScript support. It's the safer choice
for most teams because it's battle-tested at scale by Meta and
many other companies.
Confidence: High.
Problems:
- No assumptions stated -- what is this "high confidence" based on?
- No alternatives explored -- was Vue even seriously considered?
- No weakest link -- where should the user push back?
- No decision forks -- what criteria determined "larger" or "better"?
- "Battle-tested at scale" is an appeal to authority without specifics
- Reads like a marketing pitch, not a reasoning trace
GOOD Example (Target this quality)
Reasoning Trace: Why React over Vue for this project?
Atomic Claims
- React is a better framework fit for this project's requirements
- The team will be more productive with React than Vue
- React's ecosystem advantage outweighs Vue's simplicity advantage
Assumption Inventory
| # | Assumption | Criticality | Verifiability | Tied to Claim |
|---|---|---|---|---|
| A1 | Team has 2+ engineers with React experience | High | Directly Verifiable | Claim 2 |
| A2 | Project requires complex state management | Medium | Directly Verifiable | Claim 1 |
| A3 | Hiring pipeline will favor React candidates | High | Indirectly Verifiable | Claim 2 |
| A4 | Project will need SSR capabilities | Medium | Directly Verifiable | Claim 1 |
| A5 | TypeScript will be used project-wide | Low | Directly Verifiable | Claim 3 |
Decision Forks
Fork 1: Framework Selection
- Chosen: React -- larger component ecosystem (npm: 90k+ packages tagged "react" vs 25k+ "vue"), more mature SSR story (Next.js 15 stable vs Nuxt 4 recent), team has existing React experience (A1)
- Rejected: Vue -- lower learning curve, better developer ergonomics for smaller teams, Composition API is excellent
  - Reversal condition: If team has 0 React experience AND project is < 6 month lifespan AND no SSR needed, Vue's faster onboarding wins
- Rejected: Svelte -- best DX, smallest bundle, genuinely better reactivity model
  - Reversal condition: If team is greenfield (no framework experience), project is performance-critical consumer app, and hiring is not a constraint (small, stable team)
Confidence Decomposition
| Dimension | Confidence | Evidence Type | Basis |
|---|---|---|---|
| Framework capability match | 85% | Empirical | Feature comparison against requirements doc |
| Team productivity | 55% | Mixed | Based on stated skills (A1); not observed |
| Ecosystem longevity | 80% | Authority | Meta backing, npm download trends, State of JS 2025 |
| Hiring advantage | 50% | Indirectly Empirical | LinkedIn job postings (3:1 React:Vue ratio, but market shifts) |
Overall Confidence: 68% (productivity and hiring are uncertain but heavily weighted)
Weakest Link
Assumption A1: Team has 2+ engineers with React experience
- Criticality: High
- Verifiability: Directly Verifiable
- Current confidence: 70% (stated by PM, not verified via code review)
- Why this is the weakest link: If the team lacks real React experience, the productivity advantage evaporates and Vue's lower learning curve becomes the dominant factor
- How to verify: Review team members' recent commits; conduct brief technical screen on React patterns (hooks, context, suspense)
Alternative Conclusion
If team React experience (A1) is overstated, then Vue is the better choice: its gentler learning curve (Composition API maps well to React hooks mental model but with less boilerplate), better documentation, and faster time-to-productive offset the smaller ecosystem. The recommendation flips to Vue with Nuxt.
Falsification Conditions
- Team has < 2 engineers genuinely proficient in React (not just "used it once")
- Project scope shrinks to < 6 months with no SSR requirement
- Hiring is not a concern (stable team, no growth planned)
When to Use
- After any recommendation: "Why Postgres?" "Why microservices?" "Why this library?"
- After any estimate: "Why 2 weeks?" "Why $50k?" "Why 3 engineers?"
- During architecture decisions where stakeholders need to audit the reasoning
- Code review rationale: "Why did you flag this as a security issue?"
- When the user says "convince me" or "I'm not sure about this"
- Post-mortem analysis: "Why did we think X would work?"
- Any high-stakes decision where the reasoning must survive scrutiny
When NOT to Use
- Simple factual lookups with clear answers ("What port does Postgres use?")
- Creative brainstorming where structured reasoning kills ideation (use swing-options)
- When the user wants a quick answer and explicitly says so
- Exhaustive code/system analysis without a specific claim to trace (use deep-dive-analyzer)
- Stress-testing someone else's reasoning (use swing-review -- it attacks; this skill exposes)
- Research requiring external source verification (use swing-research for facts; this skill traces reasoning about facts)
Integration Notes
- With swing-clarify: Run swing-clarify first on ambiguous requests before invoking this skill. Clarified scope produces better results.
- Before swing-review: Trace your reasoning first, then stress-test it. The assumption inventory from swing-trace feeds directly into swing-review's attack vectors.
- After swing-research: Research gathers verified facts; swing-trace then makes the logic connecting those facts to a conclusion transparent.
- With swing-options: When swing-options generates options, swing-trace can trace why one option was selected over others.
- With deep-dive-analyzer: Deep-dive produces exhaustive understanding; swing-trace adds the "so what" -- why that understanding leads to a specific conclusion.
- With skill-composer: Common pipeline: swing-research -> swing-trace -> swing-review (gather facts -> trace reasoning -> stress-test conclusion).