debug-mantra

Debug Mantra

Four-step discipline for any debug session. Recite verbatim, then apply in order.

Recite this — verbatim, as the first thing in your first response

Mantra:

First is reproducibility. Can the issue be reproduced reliably?

Know the fail path. Debugger first; then source trace + knob enumeration; then in-code instrumentation.

Question your hypothesis. What would disprove it?

Every run is a breadcrumb. Cross-reference all of them.

Then begin work.

1. Reproduce reliably

Build a runnable repro before anything else.

Reliable repro → capture the exact steps, inputs, and environment as a runnable artifact: failing test, curl script, CLI invocation, replay harness.
Flaky repro → the bug is not yet debuggable. Raise the rate first: loop the trigger, parallelise, add stress, narrow timing windows, inject sleeps. 50% flake is debuggable; 1% is not.
No repro at all → stop. Say so explicitly. Ask the user for env access, captured artifacts (HAR, log dump, core), or permission to instrument. Do not proceed to hypothesise.
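
For the flaky case, it can help to put a number on the flake rate before and after each rate-raising tactic. The sketch below is one minimal way to do that in Python. `failure_rate` and `flaky_trigger` are hypothetical names, and `flaky_trigger` only simulates a bug with a seeded RNG; in a real session it would be replaced by whatever actually triggers the failure.

```python
import random
from typing import Callable


def failure_rate(trigger: Callable[[], bool], runs: int = 200) -> float:
    """Call `trigger` `runs` times and return the fraction of calls that failed.

    `trigger` is assumed to return True when the bug fires.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    failures = sum(1 for _ in range(runs) if trigger())
    return failures / runs


def flaky_trigger(rng: random.Random, p_fail: float) -> bool:
    """Stand-in for the real trigger: fails with probability p_fail."""
    return rng.random() < p_fail


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the demo itself is repeatable
    baseline = failure_rate(lambda: flaky_trigger(rng, 0.01))
    stressed = failure_rate(lambda: flaky_trigger(rng, 0.50))
    print(f"baseline: {baseline:.1%}  stressed: {stressed:.1%}")
```

Measuring the rate after each tactic (looping, stress, injected sleeps) shows whether that tactic moved the bug toward the debuggable range or made no difference.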

Target: a fast (1–5 s), deterministic pass/fail signal. Pin time, seed the RNG, freeze network, isolate filesystem.
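
As an illustration of that target, the sketch below pins three of the four sources of nondeterminism: the clock, the RNG, and the filesystem. `build_report`, `FIXED_NOW`, and `FIXED_SEED` are hypothetical names chosen for the example, and the code under test is assumed to accept its clock and RNG as arguments. Freezing the network is not shown.

```python
import random
import tempfile
from datetime import datetime, timezone
from pathlib import Path

FIXED_NOW = datetime(2024, 1, 1, tzinfo=timezone.utc)  # arbitrary pinned clock
FIXED_SEED = 1234  # arbitrary pinned seed


def build_report(now: datetime, rng: random.Random, workdir: Path) -> str:
    """Hypothetical code under test: its output depends on time and randomness."""
    token = rng.randrange(10**6)
    path = workdir / "report.txt"
    path.write_text(f"{now.isoformat()} {token:06d}\n")
    return path.read_text()


def run_repro() -> str:
    """Run the code under test with time, RNG, and filesystem all pinned."""
    with tempfile.TemporaryDirectory() as tmp:  # fresh, isolated directory per run
        return build_report(FIXED_NOW, random.Random(FIXED_SEED), Path(tmp))


if __name__ == "__main__":
    first, second = run_repro(), run_repro()
    print("deterministic" if first == second else "still nondeterministic")
```

Running the repro twice and comparing the results is a cheap check that nothing unpinned is still leaking into the signal.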

2. Know the fail path

Once reproducible, find where the code breaks and what stops it from breaking. The differential between a failing run and a passing run narrows the search. Try the tactics below in this order, and escalate only when the prior tactic fails.

Attach a debugger. If the env supports it, attach and step to the failure site. One breakpoint beats ten logs. Do this before turning any knobs.
Source trace + knob enumeration. If no debugger (or it can't reach the bug), trace the code path end-to-end and list every knob that can influence the outcome:
- config flags, env vars, feature toggles
- branch conditions, input shape
- timing, concurrency, build options

Each knob is a candidate axis to flip in the differential. Flip one at a time.
In-code instrumentation. If outside knobs can't move the failure, go inside: `printf` / log statements at the suspected fail site, dump the relevant internal state. Tag every probe with a unique prefix (e.g. `[DBG-a4f2]`) so cleanup is a single grep. Let the trace show where reality diverges from your model.
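
The knob-flipping and probe-tagging tactics above can be sketched together. This is a minimal illustration, assuming every knob is a boolean and the repro can be called as a function that returns True on failure. `flip_one_at_a_time`, `probe`, `hypothetical_repro`, and the knob names are all invented for the example.

```python
from typing import Callable, Dict, List

PROBE_TAG = "[DBG-a4f2]"  # unique prefix so cleanup is a single grep


def probe(message: str) -> None:
    """Emit a tagged debug line."""
    print(f"{PROBE_TAG} {message}")


def flip_one_at_a_time(
    baseline: Dict[str, bool],
    fails: Callable[[Dict[str, bool]], bool],
) -> List[str]:
    """Return the knobs whose single flip changes the pass/fail outcome."""
    baseline_outcome = fails(baseline)
    probe(f"baseline={baseline} fails={baseline_outcome}")
    influential = []
    for knob in baseline:
        variant = dict(baseline)           # copy, so only one knob differs
        variant[knob] = not baseline[knob]
        outcome = fails(variant)
        probe(f"flip {knob}={variant[knob]} fails={outcome}")
        if outcome != baseline_outcome:
            influential.append(knob)
    return influential


def hypothetical_repro(knobs: Dict[str, bool]) -> bool:
    """Stand-in repro: fails only with the cache on and multiple threads."""
    return knobs["cache_enabled"] and not knobs["single_threaded"]


if __name__ == "__main__":
    baseline = {"cache_enabled": True, "single_threaded": False, "verbose": False}
    print(flip_one_at_a_time(baseline, hypothetical_repro))
```

Because each variant differs from the baseline in exactly one knob, any change in outcome can be attributed to that knob alone.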

3. Falsify the hypothesis

When a candidate root cause surfaces, scrutinise it before testing it.

Does it actually explain the symptom end-to-end? Walk it through.
What is the simplest proof? What is the cleanest disproof?
Run the disproof first. If the hypothesis survives, it is a real candidate. If it dies, you saved yourself from chasing a phantom.
Generate 3–5 ranked hypotheses, not one. Single-hypothesis thinking anchors on the first plausible idea.
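
One way to keep several hypotheses in play is to pair each with its disproof and run the disproofs in rank order. The sketch below is only an illustration: `Hypothesis`, `surviving`, and the example claims are hypothetical, and the lambdas stand in for real experiments.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    claim: str
    rank: int                     # 1 = most plausible
    disproof: Callable[[], bool]  # returns True if the experiment disproved the claim


def surviving(hypotheses: List[Hypothesis]) -> List[str]:
    """Run each disproof in rank order and return the claims that survived."""
    alive = []
    for h in sorted(hypotheses, key=lambda h: h.rank):
        if not h.disproof():
            alive.append(h.claim)
    return alive


if __name__ == "__main__":
    candidates = [
        Hypothesis("bad input encoding", rank=3, disproof=lambda: True),
        Hypothesis("stale cache entry", rank=1, disproof=lambda: True),
        Hypothesis("race in worker pool", rank=2, disproof=lambda: False),
    ]
    print(surviving(candidates))
```

A claim that survives its disproof is still only a candidate; it has not yet been checked against earlier runs.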

4. Every run is a breadcrumb

Maintain a running ledger of every experiment in this session. Each entry: what changed, what happened, what it ruled in or out.

When a new hypothesis surfaces, walk the ledger. Does it hold for every prior observation, not just the most recent?
If any past run contradicts it, the hypothesis is wrong or incomplete — refine or discard.
When in doubt, design the single experiment whose outcome makes it certain. Run that next, instead of churning on adjacent runs.
Update the ledger after every run. It is your memory across the session.
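
A ledger like this needs no tooling, but a small sketch shows what "walk the ledger" means mechanically. It assumes each run can be summarised as a dictionary of what changed plus a pass/fail outcome, and that a hypothesis can be written as a function predicting failure from those settings. `Run`, `Ledger`, and the knob names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Run:
    changed: Dict[str, Any]  # what changed
    failed: bool             # what happened
    note: str = ""           # what it ruled in or out


@dataclass
class Ledger:
    runs: List[Run] = field(default_factory=list)

    def record(self, changed: Dict[str, Any], failed: bool, note: str = "") -> None:
        """Append one experiment to the ledger."""
        self.runs.append(Run(changed, failed, note))

    def contradictions(self, predicts_failure: Callable[[Dict[str, Any]], bool]) -> List[Run]:
        """Return every past run whose outcome the hypothesis gets wrong."""
        return [r for r in self.runs if bool(predicts_failure(r.changed)) != r.failed]


if __name__ == "__main__":
    ledger = Ledger()
    ledger.record({"cache": True}, failed=True, note="baseline failure")
    ledger.record({"cache": False}, failed=False, note="cache is involved")
    ledger.record({"cache": True, "threads": 1}, failed=False, note="needs concurrency")

    # Hypothesis A: the bug fires whenever the cache is on.
    cache_only = lambda c: c.get("cache", False)
    # Hypothesis B: the bug needs the cache on and more than one thread
    # (a default of 4 threads is assumed when the run did not set it).
    cache_and_threads = lambda c: c.get("cache", False) and c.get("threads", 4) != 1

    print(len(ledger.contradictions(cache_only)))         # 1: the third run contradicts A
    print(len(ledger.contradictions(cache_and_threads)))  # 0: B fits every run so far
```

An empty contradiction list does not prove a hypothesis, but a non-empty one is enough to send it back for refinement.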

Operating rules

Recite the mantra block once per debug session, in your first response. Do not re-recite mid-session.
Recite verbatim. Never paraphrase, shorten, or skip lines of the recital.
If the user says "skip the mantra" → skip the recital but still apply the four steps silently.
Apply the four steps in order:
- Do not propose a fix before #1 is satisfied (reliable repro exists).
- Do not start testing hypotheses before #2 has narrowed the fail path.
- Do not commit to a hypothesis before #3 has tried to disprove it.
- Do not declare a hypothesis correct until #4 confirms it against every prior breadcrumb.
If you catch yourself proposing a fix without a reliable repro, stop and return to step 1.
The mantra is a constraint you carry through the session — not advice to deliver back to the user.
