debugging-protocol
Debugging Protocol
Value: Feedback -- systematic investigation produces understanding.
Understanding produces correct fixes. Correct fixes prevent recurrence.
Skipping investigation produces symptom fixes that hide bugs.
Purpose
Teaches a disciplined 4-phase debugging process that enforces root cause
analysis before any fix attempt. Prevents the most common debugging failure
mode: jumping to a fix without understanding why the problem exists.
Practices
The Iron Law: No Fixes Without Investigation
Never change code to fix a bug until you have completed root cause
investigation. When you see an error and immediately know the fix, that is
exactly when you are most likely to be wrong. Investigate first.
Do:
- Read the complete error message and stack trace before doing anything else
- Reproduce the bug consistently before investigating
- Understand WHY something is broken, not just WHAT is broken
Do not:
- Add a null check because you see a null pointer error (symptom fix)
- Try "a few things" to see what sticks (random debugging)
- Skip investigation because "this is an easy one"
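The difference between a symptom fix and a root-cause fix can be sketched in a few lines of Python. Every name below (USERS, load_user_buggy, greeting_symptom_fix, and so on) is hypothetical and invented for this illustration; the bug itself, an int passed where string keys are expected, is an assumed scenario rather than one taken from this skill.

```python
# Hypothetical example: every name here is invented for illustration.
USERS = {"42": {"name": "Ada"}}


def load_user_buggy(user_id):
    # Root cause: callers pass an int, but the keys are strings,
    # so the lookup silently returns None.
    return USERS.get(user_id)


def greeting_symptom_fix(user_id):
    user = load_user_buggy(user_id)
    # Symptom fix: the None check hides the crash, but the user
    # is still never found, so the bug survives in a quieter form.
    if user is None:
        return "Hello, guest"
    return f"Hello, {user['name']}"


def load_user_fixed(user_id):
    # Root-cause fix: normalize the key type at the boundary.
    return USERS.get(str(user_id))


def greeting_root_cause_fix(user_id):
    user = load_user_fixed(user_id)
    if user is None:
        raise KeyError(f"unknown user id: {user_id!r}")
    return f"Hello, {user['name']}"
```

With the symptom fix, user 42 is greeted as a guest and nothing appears broken. Only asking why the lookup returned None leads to the key-type mismatch.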
Phase 1: Understand the Failure
Gather facts. Do not interpret yet.
- Read the full error message -- every line, not just the first
- Identify the exact file and line where the failure occurs
- Reproduce the failure consistently (if it does not reproduce, that is important information)
- Check recent changes: `git log --oneline -10` and `git diff`
- Note the data flow: where does the bad value come from?
Output: A clear statement of what is happening, where, and since when.
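One way to pin down the exact file and line without interpreting anything yet is to read the innermost traceback frame programmatically. The sketch below uses Python's standard traceback module; failure_location and parse_port are hypothetical helpers written only for this example.

```python
import traceback


def failure_location(exc):
    """Return (filename, line number, function name) of the innermost frame."""
    frames = traceback.extract_tb(exc.__traceback__)
    innermost = frames[-1]  # the last entry is where the exception was raised
    return innermost.filename, innermost.lineno, innermost.name


def parse_port(raw):
    # Hypothetical failing function, used only to produce a traceback.
    return int(raw)


try:
    parse_port("80a")
except ValueError as exc:
    # Record facts only: where it failed and the full message.
    location = failure_location(exc)
    message = str(exc)
```

Here location names parse_port as the failing function and message carries the offending input, which is the starting point for tracing where the bad value came from.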
Phase 2: Find Working Examples
Compare broken against working. The difference is the bug.
- Find similar code that works correctly
- Compare setup, inputs, state, and configuration
- Identify what differs between the working and failing case
- Check dependencies: did a library update? Did an environment change?
Output: A specific difference between working and failing cases.
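Comparing a working case against a failing one is often a mechanical diff. The following is a minimal sketch, assuming both cases can be captured as flat dictionaries of settings; diff_settings and the sample environments are hypothetical.

```python
def diff_settings(working, failing):
    """Return {key: (working_value, failing_value)} for every key that differs."""
    missing = object()  # sentinel so absent keys are distinguishable from None
    differences = {}
    for key in sorted(set(working) | set(failing)):
        w = working.get(key, missing)
        f = failing.get(key, missing)
        if w != f:
            differences[key] = (
                "<absent>" if w is missing else w,
                "<absent>" if f is missing else f,
            )
    return differences


# Hypothetical captured settings from a working and a failing environment.
working_env = {"timeout": 30, "retries": 3, "tls": True}
failing_env = {"timeout": 30, "retries": 0}
result = diff_settings(working_env, failing_env)
```

In this sample, result lists retries and tls as the only differences, which narrows the next hypothesis to two candidates instead of the whole configuration.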
Phase 3: Test One Hypothesis
Form a single, explicit hypothesis. Test it with one change. Learn from the
result.
- State the hypothesis: "I believe the bug is caused by [X] because [evidence]"
- Make ONE change to test it
- Observe the result
- If the hypothesis is wrong, UNDO the change completely
- Form a new hypothesis incorporating what you learned
Do not change multiple things at once. If you change the import, the type,
and the logic simultaneously, you cannot know which change mattered.
Output: Confirmed or refuted hypothesis with evidence.
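The one-change-then-undo discipline can be sketched as a small helper that applies a single change to a copy of the state and discards it when the hypothesis is refuted. This is an illustration under assumed names (try_hypothesis, renders_correctly, the config dictionary), not a prescribed tool; in a real repository the undo step would usually be a version-control revert.

```python
import copy


def try_hypothesis(state, change, check):
    """Apply ONE change to a copy of state; keep it only if check passes."""
    candidate = copy.deepcopy(state)
    change(candidate)
    if check(candidate):
        return candidate, True  # hypothesis confirmed: keep the change
    return state, False  # hypothesis refuted: original state is untouched


# Hypothetical bug: UTF-8 bytes are rendered with the wrong characters.
RAW = "café".encode("utf-8")


def renders_correctly(cfg):
    text = RAW.decode(cfg["encoding"])
    if cfg["strip"]:
        text = text.strip()
    return text == "café"


config = {"encoding": "latin-1", "strip": False}

# Hypothesis 1: stray whitespace causes the mismatch. Refuted, so undone.
state, confirmed_1 = try_hypothesis(
    config, lambda c: c.update(strip=True), renders_correctly
)

# Hypothesis 2: the decoder uses the wrong encoding. Confirmed.
state, confirmed_2 = try_hypothesis(
    state, lambda c: c.update(encoding="utf-8"), renders_correctly
)
```

Because the refuted change never reached state, the second test starts from a clean baseline and its result can be attributed to the encoding change alone.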
Phase 4: Fix and Verify
Fix with confidence because you understand the root cause.
- Write a failing test that reproduces the bug (if one does not already exist)
- Implement the fix targeting the root cause identified in Phase 3
- Verify: the new test passes, all existing tests still pass
- Confirm you fixed the cause, not the symptom
Output: A fix backed by a test, with all tests green.
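A regression test written before the fix might look like the sketch below, which uses Python's standard unittest module. The average function and its hard-coded divisor bug are hypothetical; the buggy version is kept alongside the fix only so the example shows what the test caught.

```python
import unittest


def average_buggy(values):
    # Hypothetical bug: the divisor was hard-coded for three-item inputs.
    return sum(values) / 3


def average(values):
    # Root-cause fix: divide by the actual number of values.
    return sum(values) / len(values)


class AverageRegressionTest(unittest.TestCase):
    def test_average_uses_actual_length(self):
        # Written first: this assertion fails against average_buggy,
        # which reproduces the bug before any fix is attempted.
        self.assertEqual(average([2, 4, 6, 8]), 5)

    def test_three_values_still_work(self):
        # Existing behavior must stay green after the fix.
        self.assertEqual(average([3, 6, 9]), 6)
```

Running the suite against the buggy function first confirms that the new test fails for the right reason; only then does a green run after the fix count as evidence.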
Escalation: Three Strikes Rule
If three fix attempts fail, stop. The problem is not what you think it is.
After the third failure:
- Stop attempting fixes entirely
- Document what you tried and why each attempt failed
- Question your assumptions: wrong abstraction? Wrong domain model? Wrong problem entirely?
- Seek a broader perspective -- architecture review, domain expert, or escalate to the user
Three failed fixes almost always signal a design problem, not a code problem.
More code fixes will not help.
Example:
Attempt 1: Add caching (hypothesis: slow queries) -> Still slow
Attempt 2: Add index (hypothesis: missing index) -> Still slow
Attempt 3: Eager loading (hypothesis: N+1) -> Still slow
STOP. Profile the system.
Result: 90% of time in external API call. Not a database problem at all.
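The "profile the system" step in the example can be as simple as timing each stage of a request. The sketch below is one assumed way to do that with the standard library; timed, handle_request, and the two stage labels are hypothetical, and the sleep call merely stands in for a slow network request.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(label):
    """Accumulate wall-clock time spent inside the with-block under label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + time.perf_counter() - start


def handle_request():
    with timed("database_query"):
        sum(range(1000))  # stand-in for a fast query
    with timed("external_api"):
        time.sleep(0.05)  # stand-in for a slow network call


handle_request()
slowest = max(timings, key=timings.get)
share = timings[slowest] / sum(timings.values())
```

Measuring first shows which stage dominates, so the next hypothesis targets the part of the system that actually consumes the time.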
Enforcement Note
This skill provides advisory guidance. It instructs the agent to investigate
before fixing but cannot mechanically prevent premature fix attempts. The
agent follows these practices by convention. If you observe the agent
skipping investigation, point it out.
Verification
After debugging guided by this skill, verify:
- Completed Phase 1 investigation before any code changes
- Read the complete error message (not just the first line)
- Reproduced the bug consistently
- Found a working example to compare against
- Stated an explicit hypothesis before each fix attempt
- Made only one change per hypothesis test
- Undid failed hypotheses before trying new ones
- Wrote or confirmed a failing test before implementing the fix
- Verified all tests pass after the fix
- Did not exceed three fix attempts without escalating
If any criterion is not met, revisit the relevant phase.
Dependencies
This skill works standalone with no required dependencies. It integrates with:
- tdd: When a test fails unexpectedly during TDD, this skill guides investigation before modifying code
- user-input-protocol: When debugging reaches an ambiguous decision point, pause and ask the user rather than guessing
- domain-modeling: If three fixes fail, the root cause may be a domain modeling problem -- escalate to domain review
Missing a dependency? Install with:
npx skills add jwilger/agent-skills --skill tdd