Fix Root Causes
When debugging, do not paper over symptoms. Trace every problem to its root cause and fix it there.
Why: Symptom fixes accumulate. Each workaround makes the system harder to reason about, and the real bug remains. Root-cause fixes are slower upfront but reduce total debugging time.
Pattern:
- Reproduce first (if you can't reproduce it, you can't verify your fix)
- Ask "why" until you hit the root cause
- Resist the urge to add guards (adding a nil check to silence a crash is a symptom fix)
- Check for the pattern, not just the instance (grep for the same pattern, fix all instances)
- When stuck, instrument instead of guessing (add logging, read the actual error)
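The difference between a guard and a root-cause fix can be sketched in a few lines. This is a minimal illustration in Python (where `None` plays the role of nil); the functions, the data, and the bug itself are hypothetical. It assumes a lookup that fails because keys are stored with inconsistent casing.

```python
# Hypothetical example: all names and the bug are invented for illustration.

def register(users: dict, email: str, name: str) -> None:
    """Buggy writer: stores the email exactly as typed."""
    users[email] = name


def greeting_with_guard(users: dict, email: str) -> str:
    """Symptom fix: the None check silences the crash, but the lookup still misses."""
    name = users.get(email.lower())
    if name is None:  # guard added to stop an AttributeError on name.title()
        return "Hello, guest"
    return "Hello, " + name.title()


def register_fixed(users: dict, email: str, name: str) -> None:
    """Root-cause fix: normalize the key where the data enters the system."""
    users[email.strip().lower()] = name


def greeting(users: dict, email: str) -> str:
    """Registered users are always found once keys are normalized on write."""
    return "Hello, " + users[email.strip().lower()].title()


# Reproduce first: with the guard in place nothing crashes,
# yet a registered user is still greeted as a guest.
buggy = {}
register(buggy, "Ada@Example.com", "ada")
assert greeting_with_guard(buggy, "ada@example.com") == "Hello, guest"

# After fixing the writer, the same lookup succeeds without any guard.
fixed = {}
register_fixed(fixed, "Ada@Example.com", "ada")
assert greeting(fixed, "ada@example.com") == "Hello, Ada"
```

The first assertion doubles as the reproduction: it pins down the wrong behavior before the fix, so the second assertion can verify the fix.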
Restart bugs: suspect state before code
Code doesn't change between runs. State does. When something "fails after restart," suspect stale persistent state first: config files, caches, lock files, serialized state. If clearing a state file restores correct behavior, the stale state was the cause, so make the fix validate that state on load rather than patching the code that tripped over it.
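One way to validate state on load is sketched below. It assumes a JSON state file with a version field; `STATE_VERSION`, `DEFAULT_STATE`, the `last_offset` field, and `load_state` are all hypothetical names chosen for illustration.

```python
import json
import logging
from pathlib import Path

log = logging.getLogger(__name__)

STATE_VERSION = 2  # hypothetical schema version of the persisted state
DEFAULT_STATE = {"version": STATE_VERSION, "last_offset": 0}


def load_state(path: Path) -> dict:
    """Load persisted state; fall back to defaults if it is missing, corrupt, or stale."""
    try:
        state = json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return dict(DEFAULT_STATE)  # first run: nothing persisted yet
    except ValueError as exc:  # covers JSONDecodeError and UnicodeDecodeError
        log.warning("discarding unreadable state file %s: %s", path, exc)
        return dict(DEFAULT_STATE)

    # Validate shape and version instead of trusting whatever an older run wrote.
    if not isinstance(state, dict) or state.get("version") != STATE_VERSION:
        log.warning("discarding stale state file %s", path)
        return dict(DEFAULT_STATE)

    offset = state.get("last_offset")
    if isinstance(offset, bool) or not isinstance(offset, int) or offset < 0:
        log.warning("discarding state with invalid last_offset in %s", path)
        return dict(DEFAULT_STATE)

    return state
```

Logging the reason for each fallback keeps the next restart bug from being a guessing game: the log shows which check rejected the file.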