debugger
Systematic Debugging
Overview
Random fixes waste time and create new bugs. Quick patches mask underlying issues.
Core principle: ALWAYS find root cause before attempting fixes. Symptom fixes are failure.
Violating the letter of this process is violating the spirit of debugging.
The Iron Law
NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
If you haven't completed Phase 1, you cannot propose fixes.
When to Use
Use for ANY technical issue:
- Test failures
- Bugs in production
- Unexpected behavior
- Performance problems
- Build failures
- Integration issues
Use this ESPECIALLY when:
- Under time pressure (emergencies make guessing tempting)
- "Just one quick fix" seems obvious
- You've already tried multiple fixes
- Previous fix didn't work
- You don't fully understand the issue
Don't skip when:
- Issue seems simple (simple bugs have root causes too)
- You're in a hurry (rushing guarantees rework)
The Four Phases
You MUST complete each phase before proceeding to the next.
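One way to make the phase ordering concrete is a small tracker that refuses to skip ahead. This is only a sketch: the `Phase` enum and `DebugSession` class are hypothetical names introduced for illustration, not part of any existing tool.

```python
from enum import IntEnum


class Phase(IntEnum):
    ROOT_CAUSE = 1
    PATTERN = 2
    HYPOTHESIS = 3
    IMPLEMENTATION = 4


class DebugSession:
    """Hypothetical tracker that refuses to skip phases."""

    def __init__(self):
        self.completed = set()

    def complete(self, phase):
        # Every earlier phase must already be marked complete.
        missing = [p for p in Phase if p < phase and p not in self.completed]
        if missing:
            names = ", ".join(p.name for p in missing)
            raise RuntimeError(f"cannot complete {phase.name}: unfinished {names}")
        self.completed.add(phase)

    def may_propose_fix(self):
        # The Iron Law: no fixes until Phase 1 is complete.
        return Phase.ROOT_CAUSE in self.completed
```

Calling `complete(Phase.HYPOTHESIS)` on a fresh session raises `RuntimeError`, and `may_propose_fix()` stays `False` until `Phase.ROOT_CAUSE` has been completed.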
Phase 1: Root Cause Investigation
BEFORE attempting ANY fix:
- Read Error Messages Carefully
  - Don't skip past errors or warnings
  - They often contain the exact solution
  - Read stack traces completely
  - Note line numbers, file paths, error codes
- Reproduce Consistently
  - Can you trigger it reliably?
  - What are the exact steps?
  - Does it happen every time?
  - If not reproducible → gather more data, don't guess
- Check Recent Changes
  - What changed that could cause this?
  - Git diff, recent commits
  - New dependencies, config changes
  - Environmental differences
- Gather Evidence in Multi-Component Systems
  WHEN system has multiple components: BEFORE proposing fixes, add diagnostic instrumentation:
  For EACH component boundary:
  - Log what data enters component
  - Log what data exits component
  - Verify environment/config propagation
  - Check state at each layer
  Run once to gather evidence showing WHERE it breaks
  THEN analyze evidence to identify failing component
  THEN investigate that specific component
- Trace Data Flow
  - Where does bad value originate?
  - What called this with bad value?
  - Keep tracing up until you find the source
  - Fix at source, not at symptom
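The boundary instrumentation described for multi-component systems could look roughly like the sketch below. The `log_boundary` decorator and the `parse` and `validate` components are hypothetical stand-ins; the point is only that each boundary logs what enters and what exits, so a single run shows where the bad value first appears.

```python
import functools
import logging

logger = logging.getLogger("boundary")


def log_boundary(component):
    """Log what enters and exits a component (hypothetical helper)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            logger.info("%s IN  args=%r kwargs=%r", component, args, kwargs)
            try:
                result = func(*args, **kwargs)
            except Exception:
                # Record which component failed, then let the error propagate.
                logger.exception("%s RAISED", component)
                raise
            logger.info("%s OUT result=%r", component, result)
            return result
        return wrapper
    return decorator


@log_boundary("parser")
def parse(raw):
    # Hypothetical component: turns "key=value" into a one-entry dict.
    key, _, value = raw.partition("=")
    return {key.strip(): value.strip()}


@log_boundary("validator")
def validate(config):
    # Hypothetical component: rejects empty values.
    if not all(config.values()):
        raise ValueError(f"empty value in {config}")
    return config


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    try:
        validate(parse("timeout="))
    except ValueError:
        pass  # the log, not the exception, is the evidence being gathered
```

In this run the validator is where the error surfaces, but the log shows the empty value already leaving the parser, which points the investigation further upstream, at the raw input.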
Phase 2: Pattern Analysis
Find the pattern before fixing:
- Find Working Examples — Locate similar working code in same codebase
- Compare Against References — Read reference implementation COMPLETELY
- Identify Differences — List every difference, however small
- Understand Dependencies — What other components does this need?
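For the "list every difference" step, a mechanical comparison can help avoid dismissing small differences by eye. The sketch below assumes the working and broken cases can each be summarized as a flat dictionary of settings; `list_differences` is a hypothetical helper name.

```python
def list_differences(working, broken):
    """Return one line per key that differs between the two flat dicts."""
    differences = []
    for key in sorted(working.keys() | broken.keys()):
        if key not in broken:
            differences.append(f"{key}: missing in broken (working={working[key]!r})")
        elif key not in working:
            differences.append(f"{key}: only in broken (broken={broken[key]!r})")
        elif working[key] != broken[key]:
            differences.append(
                f"{key}: working={working[key]!r} broken={broken[key]!r}"
            )
    return differences


if __name__ == "__main__":
    working = {"retries": 3, "timeout": 30, "tls": True}
    broken = {"retries": 3, "timeout": 5, "proxy": "http://localhost"}
    for line in list_differences(working, broken):
        print(line)
```

Every reported line is a candidate explanation to rule in or out, including the ones that look irrelevant.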
Phase 3: Hypothesis and Testing
Scientific method:
- Form Single Hypothesis — "I think X is the root cause because Y"
- Test Minimally — Make the SMALLEST possible change to test hypothesis
- Verify Before Continuing — Did it work? Yes → Phase 4. No → form NEW hypothesis
- When You Don't Know — Say "I don't understand X". Don't pretend to know.
Phase 4: Implementation
Fix the root cause, not the symptom:
- Create Failing Test Case: Simplest possible reproduction. MUST have before fixing.
- Implement Single Fix: Address the root cause. ONE change at a time.
- Verify Fix: Test passes? No other tests broken? Issue actually resolved?
- If Fix Doesn't Work: STOP. Count fixes tried. If ≥ 3: question the architecture.
- If 3+ Fixes Failed: Question Architecture
  - Is this pattern fundamentally sound?
  - Should we refactor architecture vs. continue fixing symptoms?
  - Discuss with user before attempting more fixes
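As an illustration of the first three steps, here is a minimal sketch using Python's `unittest`. The `slugify` function and its bug are invented for the example: the assumed buggy version replaced each space with a hyphen, so consecutive spaces produced consecutive hyphens.

```python
import unittest


def slugify(title):
    # Assumed buggy version, kept for reference:
    #     return title.lower().replace(" ", "-")
    # It mapped each space to a hyphen, so "Hello  World" became "hello--world".
    # Single fix at the root cause: split on runs of whitespace, then join.
    return "-".join(title.lower().split())


class SlugifyRegressionTest(unittest.TestCase):
    def test_consecutive_spaces_collapse_to_one_hyphen(self):
        # Simplest reproduction; written first, it fails against the buggy version.
        self.assertEqual(slugify("Hello  World"), "hello-world")

    def test_single_space_still_works(self):
        # Guards against the fix breaking the case that already worked.
        self.assertEqual(slugify("Hello World"), "hello-world")
```

Running the file with `python -m unittest` before the fix should show the first test failing, and after the fix both tests passing, which covers the "Verify Fix" questions for this small case.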
Red Flags — STOP and Follow Process
If you catch yourself thinking:
- "Quick fix for now, investigate later"
- "Just try changing X and see if it works"
- "Add multiple changes, run tests"
- "It's probably X, let me fix that"
- "I don't fully understand but this might work"
- "One more fix attempt" (when already tried 2+)
ALL of these mean: STOP. Return to Phase 1.
Common Rationalizations
| Excuse | Reality |
|---|---|
| "Issue is simple, don't need process" | Simple issues have root causes too. |
| "Emergency, no time for process" | Systematic debugging is FASTER than guess-and-check. |
| "Just try this first, then investigate" | First fix sets the pattern. Do it right from the start. |
| "Multiple fixes at once saves time" | Can't isolate what worked. Causes new bugs. |
| "I see the problem, let me fix it" | Seeing symptoms ≠ understanding root cause. |
| "One more fix attempt" (after 2+ failures) | 3+ failures = architectural problem. |
Quick Reference
| Phase | Key Activities | Success Criteria |
|---|---|---|
| 1. Root Cause | Read errors, reproduce, check changes, gather evidence | Understand WHAT and WHY |
| 2. Pattern | Find working examples, compare | Identify differences |
| 3. Hypothesis | Form theory, test minimally | Confirmed or new hypothesis |
| 4. Implementation | Create test, fix, verify | Bug resolved, tests pass |
Related skills:
- oac:verification-before-completion: Verify fix worked before claiming success
- oac:test-generation: For creating failing test cases (Phase 4, Step 1)