debug-mantra


Debug Mantra


Four-step discipline for any debug session. Recite verbatim, then apply in order.

Recite this — verbatim, as the first thing in your first response


Mantra:
  1. First is reproducibility. Can the issue be reproduced reliably?
  2. Know the fail path. Debugger first; then source trace + knob enumeration; then in-code instrumentation.
  3. Question your hypothesis. What would disprove it?
  4. Every run is a breadcrumb. Cross-reference all of them.
Then begin work.


1. Reproduce reliably


Build a runnable repro before anything else.
  • Reliable repro → capture the exact steps, inputs, and environment as a runnable artifact: failing test, curl script, CLI invocation, replay harness.
  • Flaky repro → the bug is not yet debuggable. Raise the rate first: loop the trigger, parallelise, add stress, narrow timing windows, inject sleeps. 50% flake is debuggable; 1% is not.
  • No repro at all → stop. Say so explicitly. Ask the user for env access, captured artifacts (HAR, log dump, core), or permission to instrument. Do not proceed to hypothesise.
Target: a fast (1–5 s), deterministic pass/fail signal. Pin time, seed the RNG, freeze network, isolate filesystem.
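A minimal sketch of raising and measuring the repro rate, assuming the trigger can be wrapped in a function that takes a seeded RNG and returns True on pass. The names `failure_rate`, `failing_seeds`, and `flaky_operation` are hypothetical, and the 5% failure condition is a stand-in for a real bug.

```python
import random
from typing import Callable, List

Trigger = Callable[[random.Random], bool]  # returns True on pass, False on failure


def failure_rate(trigger: Trigger, runs: int = 200, seed: int = 0) -> float:
    """Loop the trigger and return the fraction of runs that failed."""
    failures = 0
    for i in range(runs):
        rng = random.Random(seed + i)  # one pinned seed per run, so any run can be replayed
        if not trigger(rng):
            failures += 1
    return failures / runs


def failing_seeds(trigger: Trigger, runs: int = 200, seed: int = 0) -> List[int]:
    """Return the seeds that fail; each one is a deterministic repro."""
    return [seed + i for i in range(runs) if not trigger(random.Random(seed + i))]


# Hypothetical flaky operation: fails whenever the drawn value falls under 0.05.
def flaky_operation(rng: random.Random) -> bool:
    return rng.random() >= 0.05


if __name__ == "__main__":
    print("failure rate:", failure_rate(flaky_operation))
    print("replayable failing seeds:", failing_seeds(flaky_operation)[:5])
```

Once a failing seed is known, replaying `flaky_operation(random.Random(seed))` gives the deterministic pass/fail signal the target describes.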

2. Know the fail path


Once reproducible, find where the code breaks and what stops it from breaking. The differential narrows the search. Try in this order — escalate only when the prior tactic fails.
  1. Attach a debugger. If the env supports it, attach and step to the failure site. One breakpoint beats ten logs. Do this before turning any knobs.
  2. Source trace + knob enumeration. If no debugger (or it can't reach the bug), trace the code path end-to-end and list every knob that can influence the outcome:
    • config flags, env vars, feature toggles
    • branch conditions, input shape
    • timing, concurrency, build options
    Each knob is a candidate axis to flip in the differential. Flip one at a time.
  3. In-code instrumentation. If outside knobs can't move the failure, go inside: printf / log statements at the suspected fail site, dump the relevant internal state. Tag every probe with a unique prefix (e.g. [DBG-a4f2]) so cleanup is a single grep. Let the trace show where reality diverges from your model.
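The one-knob-at-a-time differential from tactic 2 can be sketched as follows. This assumes the repro is callable with a configuration dict and returns True on pass; `flip_one_at_a_time`, `repro`, and the knob names (`cache`, `workers`, `debug_build`) are all hypothetical.

```python
from typing import Any, Callable, Dict, List

Config = Dict[str, Any]


def flip_one_at_a_time(
    run: Callable[[Config], bool],
    baseline: Config,
    alternatives: Config,
) -> List[str]:
    """Return the knobs whose single flip changes the outcome seen at baseline."""
    baseline_ok = run(baseline)
    decisive = []
    for knob, alt_value in alternatives.items():
        trial = dict(baseline)   # copy, so every trial differs from baseline in one knob only
        trial[knob] = alt_value
        if run(trial) != baseline_ok:
            decisive.append(knob)
    return decisive


# Hypothetical repro: fails only when caching is on and more than one worker runs.
def repro(cfg: Config) -> bool:
    return not (cfg["cache"] and cfg["workers"] > 1)


baseline = {"cache": True, "workers": 4, "debug_build": False}
alternatives = {"cache": False, "workers": 1, "debug_build": True}

decisive_knobs = flip_one_at_a_time(repro, baseline, alternatives)
print(decisive_knobs)  # knobs that move the failure when flipped alone
```

Knobs that come back in the list are the axes worth tracing in source; knobs that do not move the outcome can be parked.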

3. Falsify the hypothesis


When a candidate root cause surfaces, scrutinise it before testing it.
  • Does it actually explain the symptom end-to-end? Walk it through.
  • What is the simplest proof? What is the cleanest disproof?
  • Run the disproof first. If the hypothesis survives, it has earned further testing. If it dies, you saved yourself from chasing a phantom.
  • Generate 3–5 ranked hypotheses, not one. Single-hypothesis thinking anchors on the first plausible idea.

4. Every run is a breadcrumb


Maintain a running ledger of every experiment in this session. Each entry: what changed, what happened, what it ruled in or out.
  • When a new hypothesis surfaces, walk the ledger. Does it hold for every prior observation, not just the most recent?
  • If any past run contradicts it, the hypothesis is wrong or incomplete — refine or discard.
  • When in doubt, design the single experiment whose outcome makes it certain. Run that next, instead of churning on adjacent runs.
  • Update the ledger after every run. It is your memory across the session.
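One possible shape for the ledger, as a sketch: each entry stores the configuration that was run, whether it failed, and a note, and a hypothesis is expressed as a function predicting failure from a configuration. The `Run` and `Ledger` classes and the example knobs are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Run:
    config: Dict[str, Any]  # what changed
    failed: bool            # what happened
    note: str = ""          # what it ruled in or out


@dataclass
class Ledger:
    runs: List[Run] = field(default_factory=list)

    def record(self, config: Dict[str, Any], failed: bool, note: str = "") -> Run:
        run = Run(dict(config), failed, note)
        self.runs.append(run)
        return run

    def contradictions(self, predicts_failure: Callable[[Dict[str, Any]], bool]) -> List[Run]:
        """Walk every prior run and return those the hypothesis gets wrong."""
        return [r for r in self.runs if predicts_failure(r.config) != r.failed]


ledger = Ledger()
ledger.record({"cache": True, "workers": 4}, failed=True, note="baseline repro")
ledger.record({"cache": False, "workers": 4}, failed=False, note="cache off: passes")
ledger.record({"cache": True, "workers": 1}, failed=False, note="single worker: passes")


# Hypothesis A: the cache alone causes the failure.
def cache_alone(cfg: Dict[str, Any]) -> bool:
    return cfg["cache"]


# Hypothesis B: the failure needs the cache and concurrency together.
def cache_and_concurrency(cfg: Dict[str, Any]) -> bool:
    return cfg["cache"] and cfg["workers"] > 1


print(len(ledger.contradictions(cache_alone)))            # 1: the single-worker run contradicts A
print(len(ledger.contradictions(cache_and_concurrency)))  # 0: B holds for every recorded run
```

A non-empty result from `contradictions` means the hypothesis is wrong or incomplete for at least one earlier run, which is the cue to refine or discard it.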


Operating rules


  • Recite the mantra block once per debug session, in your first response. Do not re-recite mid-session.
  • Recite verbatim. Never paraphrase, shorten, or skip lines of the recital.
  • If the user says "skip the mantra" → skip the recital but still apply the four steps silently.
  • Apply the four steps in order:
    • Do not propose a fix before #1 is satisfied (reliable repro exists).
    • Do not start testing hypotheses before #2 has narrowed the fail path.
    • Do not commit to a hypothesis before #3 has tried to disprove it.
    • Do not declare a hypothesis correct until #4 confirms it against every prior breadcrumb.
  • If you catch yourself proposing a fix without a reliable repro, stop and return to step 1.
  • The mantra is a constraint you carry through the session — not advice to deliver back to the user.