debug

Debugging Skill

Systematic approaches to investigating and diagnosing bugs.

Core Principle

Understand before fixing. A proper diagnosis leads to a proper fix.

Name

han-core:debug - Investigate and diagnose issues without necessarily fixing them

Synopsis

/debug [arguments]

Debug vs Fix

Use /debug when:
  • Investigating an issue to understand it
  • Need to gather information before fixing
  • Want to identify root cause without implementing solution
  • Triaging to determine severity/priority
  • Research phase before fix
Use /fix when:
  • Ready to implement the solution
  • Debugging AND fixing in one go
  • Issue is understood, just needs fixing

The Scientific Method for Debugging

1. Observe

Gather all the facts:
  • What's the symptom? (What's happening that shouldn't?)
  • When does it happen? (Always, sometimes, specific conditions?)
  • Who's affected? (All users, some users, specific scenarios?)
  • Error messages? (Exact text, stack traces, error codes?)
  • Recent changes? (What changed before this started?)
Evidence to collect:
  • Error messages and stack traces
  • Application logs
  • User reports
  • Reproduction steps
  • Environment details (browser, OS, versions)
  • Network requests/responses
  • Database query logs

2. Form Hypothesis

Based on symptoms, what could cause this?
Common categories:
  • Logic error: Code does wrong thing
  • State management: State gets out of sync
  • Async/timing: Race condition, callback hell
  • Data issue: Unexpected input format
  • Integration: API change, service down
  • Environment: Config, permissions, network
  • Resource: Memory leak, connection pool exhausted
Prioritize hypotheses:
  1. Most likely causes first
  2. Easiest to test first (when equal likelihood)
  3. Most impactful if true

3. Test Hypothesis

Design experiment to prove/disprove:
  • Add logging to see values
  • Add breakpoints to pause execution
  • Modify input to isolate variable
  • Disable feature to rule out
  • Compare with working version
Keep notes:
markdown
**Hypothesis:** Database query timeout
**Test:** Add query timing logs
**Result:** Query completes in 50ms
**Conclusion:** Not the database

**Hypothesis:** Network latency
**Test:** Check network tab, add timing
**Result:** API call takes 5 seconds
**Conclusion:** Found the issue

4. Analyze Results

What did you learn?
  • Hypothesis confirmed or rejected?
  • New questions raised?
  • Unexpected findings?
  • Root cause identified?

5. Repeat or Conclude

If root cause found:
  • Document findings
  • Estimate impact
  • Plan fix
If not found:
  • Form new hypothesis
  • Repeat cycle

Debugging Strategies

Strategy 1: Add Logging

Most universally useful technique:
typescript
// Strategic console.log placement
function processOrder(order) {
  console.log('processOrder START:', { orderId: order.id })

  const items = order.items
  console.log('items:', items.length)

  const validated = validate(items)
  console.log('validation result:', validated)

  if (!validated.success) {
    console.log('validation failed:', validated.errors)
    throw new Error('Invalid order')
  }

  const total = calculateTotal(items)
  console.log('total calculated:', total)

  console.log('processOrder END')
  return total
}
Logging guidelines:
  • Log function entry/exit
  • Log branching decisions
  • Log external calls (API, database)
  • Log unexpected values
  • Include context (IDs, user info)

Strategy 2: Use Debugger

Interactive debugging:
typescript
// Browser
function buggyFunction(input) {
  debugger;  // Execution pauses here
  const result = transform(input)
  debugger;  // And here
  return result
}

// Node.js: start the process from a shell with
//   node --inspect app.js
// Then open chrome://inspect in Chrome
Debugger features:
  • Step over (next line)
  • Step into (into function call)
  • Step out (back to caller)
  • Watch expressions
  • Call stack inspection
  • Variable inspection

Strategy 3: Binary Search

Isolate the problem area:
typescript
// 100 lines of code, bug somewhere

// Comment out lines 51-100
// Bug still happens? It's in lines 1-50

// Comment out lines 26-50
// Bug disappears? It's in lines 26-50

// Restore them, then comment out lines 38-50
// Bug still happens? It's in lines 26-37

// Continue until isolated to specific lines
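The same halving idea can be automated when the suspects form an ordered list (commits, builds, input records). The sketch below is illustrative only: `findFirstBad` and the `isBad` predicate are hypothetical names, and it assumes that once one item is bad, every later item is bad too.

```typescript
// Hypothetical helper: returns the index of the first "bad" item, or -1 if none.
// Assumes the items are ordered so that everything after the first bad item is also bad.
function findFirstBad<T>(items: T[], isBad: (item: T) => boolean): number {
  let lo = 0
  let hi = items.length // the search window is [lo, hi)
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2)
    if (isBad(items[mid])) {
      hi = mid // the first bad item is at mid or earlier
    } else {
      lo = mid + 1 // the first bad item is after mid
    }
  }
  return lo < items.length ? lo : -1
}

// Usage: builds 1-6 pass, the bug appears in build 7
const builds = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
const firstBad = findFirstBad(builds, build => build >= 7)
console.log('first bad build:', builds[firstBad])
```

Each check halves the remaining range, so 100 candidates need at most 7 checks.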

Strategy 4: Rubber Duck Debugging

Explain the problem out loud:
  1. "This function is supposed to calculate shipping cost"
  2. "It takes the weight and destination"
  3. "First it... wait, it's using price instead of weight!"
  4. (Bug found)
Why this works: Forces you to examine assumptions.

Strategy 5: Compare Working vs Broken

What's different?
Version comparison:
bash
# Find which commit broke it
git bisect start
git bisect bad HEAD
git bisect good v1.0.0

# Git checks out middle commit
npm test
git bisect good   # or: git bisect bad

# Repeat until found
Environment comparison:
  • Works locally but not production?
  • Works for some users but not others?
  • Worked yesterday but not today?
What changed?

Strategy 6: Simplify

Reduce to minimal reproduction:
typescript
// Complex case with bug
processUserOrderWithDiscountsAndShipping(user, cart, promo, address)

// Simplify inputs one at a time
processUserOrderWithDiscountsAndShipping(user, [], null, null)
// Still breaks? Not discount or address

processUserOrderWithDiscountsAndShipping(null, [], null, null)
// Works now? It's the user object

// What about the user object causes it?

Strategy 7: Check Assumptions

Question everything:
typescript
// Assumption: API returns array
const users = await api.getUsers()
users.forEach(...)  // Crashes

// Check assumption
console.log(typeof users)  // "undefined"
console.log(users)          // undefined

// Assumption was wrong!
Common wrong assumptions:
  • Function returns expected type
  • Variable is defined
  • Array is not empty
  • API will always respond
  • Async operation has completed
  • State is up to date
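One way to turn an assumption into a check is a small runtime guard that fails with context instead of crashing later. This is a minimal sketch: `assertIsArray` and `getUsersStub` are hypothetical names standing in for your own guard and API call.

```typescript
// Hypothetical guard: fails loudly, with a label, when a value is not an array.
function assertIsArray(value: unknown, label: string): asserts value is unknown[] {
  if (!Array.isArray(value)) {
    const actual = value === null ? 'null' : typeof value
    throw new TypeError(`${label}: expected an array, got ${actual}`)
  }
}

// Stand-in for an API call that unexpectedly returns undefined
function getUsersStub(): unknown {
  return undefined
}

let failureMessage = ''
try {
  const users = getUsersStub()
  assertIsArray(users, 'getUsers()')
  users.forEach(user => console.log(user)) // only reached if the assumption holds
} catch (err) {
  failureMessage = err instanceof Error ? err.message : String(err)
  console.log(failureMessage)
}
```

The error now names the broken assumption directly, rather than surfacing as a failure inside `forEach`.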

Debugging by Symptom

"Intermittent failure"

Likely causes:
  • Race condition (timing-dependent)
  • Data-dependent (certain inputs trigger it)
  • Resource leak (happens after N operations)
  • External service flakiness
Investigation:
  • Add extensive logging
  • Look for async operations
  • Check timing between operations
  • Look for shared state
  • Run many times to see pattern
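"Run many times to see pattern" can be as simple as a loop that records which attempts fail. The sketch below is hypothetical throughout: `collectFailures` is an illustrative harness, and `flakyOperation` is a stand-in that leaks one connection per call from an assumed pool of 5.

```typescript
// Hypothetical harness: runs an operation repeatedly and records the failing attempts.
function collectFailures(run: (attempt: number) => void, attempts: number): number[] {
  const failed: number[] = []
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      run(attempt)
    } catch {
      failed.push(attempt)
    }
  }
  return failed
}

// Stand-in for a flaky operation: it leaks one "connection" per call and
// fails once a pool of 5 is exhausted, after which the pool is reset.
let openConnections = 0
function flakyOperation(): void {
  if (openConnections >= 5) {
    openConnections = 0
    throw new Error('connection pool exhausted')
  }
  openConnections++
}

const failures = collectFailures(() => flakyOperation(), 20)
console.log('failed attempts:', failures) // [6, 12, 18]
```

A regular spacing like this points toward a resource that runs out after N operations rather than toward a timing race.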

"Works locally, fails in production"

Check differences:
  • Environment variables
  • Data (production has different/more data)
  • Network (CORS, SSL, proxies)
  • Dependencies (versions, OS)
  • Resources (memory, connections)

"Slow performance"

Don't guess - profile:
Frontend:
  • Chrome DevTools > Performance tab
  • Look for long tasks (> 50ms)
  • Check for layout thrashing
  • Look for memory leaks
Backend:
  • Add timing logs around operations
  • Check database query time (EXPLAIN ANALYZE)
  • Check external API call time
  • Profile with APM tool
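For "Add timing logs around operations", a small wrapper keeps the measurement out of the business logic. This is a sketch under assumptions: `timed` and `sumTo` are hypothetical names, the wrapper only handles synchronous work, and `Date.now()` gives millisecond resolution only.

```typescript
// Hypothetical helper: runs a synchronous operation and logs how long it took.
function timed<T>(label: string, operation: () => T): T {
  const start = Date.now()
  try {
    return operation()
  } finally {
    // Logged even if the operation throws
    console.log(`${label} took ${Date.now() - start}ms`)
  }
}

// Stand-in for real work such as a query or a parsing step
function sumTo(n: number): number {
  let total = 0
  for (let i = 1; i <= n; i++) total += i
  return total
}

const sum = timed('sumTo(1000000)', () => sumTo(1000000))
console.log('sum:', sum)
```

Wrap one suspect operation at a time so the log shows which step dominates the total.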

"Memory leak"

Investigation:
typescript
// Take heap snapshot
// Do operation that leaks
// Take another heap snapshot
// Compare - what increased?
Common causes:
  • Event listeners not removed
  • Closures holding references
  • Global variables accumulating
  • Intervals not cleared
  • Cache growing unbounded
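To illustrate the last cause, here is a minimal sketch of a cache with an upper bound, which is one possible remedy once a heap comparison shows a cache growing. `BoundedCache` is a hypothetical class, and oldest-first eviction is an assumption rather than a recommendation for every workload.

```typescript
// Hypothetical bounded cache: evicts the oldest entry once maxEntries is reached.
// Relies on Map iterating keys in insertion order.
class BoundedCache<K, V> {
  private readonly entries = new Map<K, V>()
  private readonly maxEntries: number

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) {
      this.entries.delete(key) // re-insert so the key counts as newest
    } else if (this.entries.size >= this.maxEntries) {
      const oldest = this.entries.keys().next()
      if (!oldest.done) this.entries.delete(oldest.value)
    }
    this.entries.set(key, value)
  }

  get(key: K): V | undefined {
    return this.entries.get(key)
  }

  get size(): number {
    return this.entries.size
  }
}

const cache = new BoundedCache<string, number>(3)
for (let i = 0; i < 1000; i++) {
  cache.set(`user:${i}`, i)
}
console.log('cache size:', cache.size) // 3, not 1000
```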

"Crash/Exception"

Read the stack trace:
Error: Cannot read property 'map' of undefined
    at processUsers (app.js:42:15)
    at handleRequest (app.js:23:3)
    at Server.<anonymous> (server.js:12:5)
Stack trace tells you:
  • app.js line 42: Where it crashed
  • app.js line 23: Where it was called from
  • server.js line 12: Origin of the request
Then:
  • Go to line 42
  • Check what's undefined
  • Trace back why it's undefined
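When traces arrive in bulk (for example in logs), the same reading can be done programmatically. The sketch below is illustrative: `parseStack` is a hypothetical helper that only understands frames of the form `at name (file:line:column)` and skips anything else.

```typescript
// Hypothetical parser for frames such as "at processUsers (app.js:42:15)".
interface StackFrame {
  fn: string
  file: string
  line: number
  column: number
}

function parseStack(stack: string): StackFrame[] {
  const framePattern = /^\s*at (.+?) \((.+):(\d+):(\d+)\)$/
  const parsed: StackFrame[] = []
  for (const text of stack.split('\n')) {
    const match = framePattern.exec(text)
    if (match) {
      parsed.push({
        fn: match[1],
        file: match[2],
        line: Number(match[3]),
        column: Number(match[4]),
      })
    }
  }
  return parsed
}

const trace = [
  "Error: Cannot read property 'map' of undefined",
  '    at processUsers (app.js:42:15)',
  '    at handleRequest (app.js:23:3)',
  '    at Server.<anonymous> (server.js:12:5)',
].join('\n')

const parsedFrames = parseStack(trace)
// The first frame is where the error was thrown
console.log('crashed in', parsedFrames[0].fn, 'at', `${parsedFrames[0].file}:${parsedFrames[0].line}`)
```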

"It works sometimes"

  • Race condition?
  • Timing issue?
  • Data-dependent?
  • Check for async issues

Common Bug Patterns

Null/Undefined

typescript
// Bug
function process(user) {
  return user.name.toUpperCase()  // Crashes if user is null
}

// Investigation
console.log('user:', user)  // undefined - why?
// Trace back to where user comes from

Off-by-One

typescript
// Bug
for (let i = 0; i <= array.length; i++) {  // <= instead of <
  process(array[i])  // Crashes on last iteration
}

// Investigation
console.log('i:', i, 'length:', array.length)
// Notice i === array.length causes array[i] === undefined
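A self-contained comparison makes the extra iteration visible. This is a small sketch with a made-up `values` array; it records what each loop bound actually visits.

```typescript
const values = [10, 20, 30]

// Buggy bound: i <= values.length visits one index too many
const visitedBuggy: Array<number | undefined> = []
for (let i = 0; i <= values.length; i++) {
  visitedBuggy.push(values[i])
}

// Correct bound: i < values.length stops at the last valid index
const visitedFixed: number[] = []
for (let i = 0; i < values.length; i++) {
  visitedFixed.push(values[i])
}

console.log(visitedBuggy) // [10, 20, 30, undefined]
console.log(visitedFixed) // [10, 20, 30]
```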

Async Timing

typescript
// Bug
let data
fetchData().then(result => {
  data = result
})
console.log(data)  // undefined - async not complete

// Investigation
console.log('1. Before fetch')
fetchData().then(result => {
  console.log('3. Got result')
  data = result
})
console.log('2. After fetch call')
// Output: 1, 2, 3 - async completes later

State Mutation

typescript
// Bug
function addItem(cart, item) {
  cart.items.push(item)  // Mutates input!
  return cart
}

const originalCart = { items: [] }
const newCart = addItem(originalCart, item)
// originalCart was modified - unexpected!

// Investigation
console.log('before:', originalCart)
addItem(originalCart, item)  // no second `const newCart`: redeclaring it would be a syntax error
console.log('after:', originalCart)  // Changed!
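Once the investigation confirms the mutation, a non-mutating variant is one possible direction for the fix. This is a sketch only: the `Cart` shape and `addItemImmutable` are assumptions for illustration, not the original code.

```typescript
// Assumed cart shape for this example
interface Cart {
  items: string[]
}

// Hypothetical non-mutating variant: returns a new cart and leaves the input untouched.
function addItemImmutable(cart: Cart, item: string): Cart {
  return { ...cart, items: [...cart.items, item] }
}

const originalCart: Cart = { items: [] }
const updatedCart = addItemImmutable(originalCart, 'book')

console.log('original:', originalCart.items.length) // 0
console.log('updated:', updatedCart.items.length)   // 1
```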

Scope Issues

typescript
// Bug
for (var i = 0; i < 3; i++) {
  setTimeout(() => console.log(i), 100)
}
// Prints: 3, 3, 3 (expected 0, 1, 2)

// Investigation
// var is function-scoped, so every callback shares the same i
// By the time the timeouts fire, the loop is done and i === 3

// Fix: Use let (block-scoped) or capture i
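The difference can be observed without timers by collecting the closures and calling them after the loop. This is a minimal sketch; `withVar` and `withLet` are illustrative names.

```typescript
// With var, every closure shares the single function-scoped i
const withVar: Array<() => number> = []
for (var i = 0; i < 3; i++) {
  withVar.push(() => i)
}

// With let, each iteration gets its own block-scoped j
const withLet: Array<() => number> = []
for (let j = 0; j < 3; j++) {
  withLet.push(() => j)
}

console.log(withVar.map(read => read())) // [3, 3, 3]
console.log(withLet.map(read => read())) // [0, 1, 2]
```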

Investigation Report Format

markdown
# Investigation: [Issue description]

## Symptoms

[What's happening that's wrong?]

## Evidence

- Error message: [exact text]
- When it happens: [conditions]
- Frequency: [always/sometimes/rarely]
- Affected users: [all/some/specific group]

## Reproduction Steps

1. [Step 1]
2. [Step 2]
3. [Observe error]

## Investigation Timeline

Hypothesis 1: [What I thought might be wrong]

- Tested by: [What I did to test]
- Result: [What I found]
- Conclusion: [Ruled out / Confirmed]

Hypothesis 2: [Next theory]

- Tested by: [What I did]
- Result: [What I found]
- Conclusion: [Ruled out / Confirmed]

## Root Cause

[What's actually causing the issue]

Evidence:

- [Log showing the problem]
- [Stack trace pointing to source]
- [Data showing the pattern]

## Impact

- Severity: [Critical/High/Medium/Low]
- Scope: [How many users/scenarios affected]
- Workaround: [Any temporary solutions]

## Next Steps

- [What should be done to fix]
- [Any additional investigation needed]
- [Related issues to check]

Debugging Tools

Browser Developer Tools

Console:
  • console.log() - Print values
  • console.table() - Display arrays/objects as table
  • console.trace() - Print stack trace
  • console.time() / console.timeEnd() - Measure duration
Debugger:
  • Set breakpoints
  • Step through code
  • Inspect variables
  • Watch expressions
  • Call stack
Network:
  • View all requests
  • See request/response headers and bodies
  • Measure timing
  • Replay requests
Performance:
  • Record profile
  • See function call tree
  • Identify bottlenecks
  • Check memory usage

Command Line Tools

bash
# Search for text in files
grep -r "error" logs/

# Follow log file
tail -f logs/app.log

# Search with context
grep -B 5 -A 5 "ERROR" logs/app.log

# Check disk space
df -h

# Check memory
free -m

# Check running processes
ps aux | grep node

Database Debugging

sql
-- PostgreSQL
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'test@example.com';

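-- Show slow queries (requires the pg_stat_statements extension;
-- the column is total_exec_time in PostgreSQL 13+, total_time in older versions)
SELECT * FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 10;

-- Check table size
SELECT pg_size_pretty(pg_total_relation_size('users'));

-- Check indexes (psql meta-command, not SQL)
\d users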

Debugging Checklist

Before Starting

  • Can reproduce the issue reliably?
  • Have error message or symptom description?
  • Know when it started happening?
  • Checked if recent changes related?
  • Checked logs for clues?

During Investigation

  • Formed clear hypothesis?
  • Testing hypothesis systematically?
  • Taking notes on findings?
  • Not making random changes hoping to fix?
  • Questioning assumptions?

After Finding Root Cause

  • Understand WHY it happens?
  • Can explain it to someone else?
  • Documented findings?
  • Estimated impact?
  • Identified proper fix?

Anti-Patterns

Random Code Changes

BAD: "Maybe if I change this... nope, try this... nope, try this..."
GOOD: "Hypothesis: X causes Y. Test: Change X. Result: Y still happens.
       Conclusion: X is not the cause."

Assuming Without Verifying

BAD: "The API must be returning valid data"
GOOD: "Let me log the API response to see what it actually returns"

Stopping at Symptoms

BAD: "The page is blank. Fixed by adding a null check."
GOOD: "The page is blank because user is null. User is null because
       authentication token expired. Root cause: token not being refreshed."

Debugging in Production

BAD: "Let me add console.log to production to see..."
GOOD: "Let me reproduce locally and debug there, or use proper logging"

No Reproduction Steps

BAD: "It crashed once, let me guess why"
GOOD: "Let me find a reliable way to reproduce it first"

Examples

When the user says:
  • "Why is this page loading slowly?"
  • "Investigate this intermittent test failure"
  • "Figure out why users are seeing this error"
  • "Debug the memory leak in production"
  • "What's causing the database timeouts?"

Integration with Other Skills

  • Use proof-of-work skill to document evidence
  • Use test-driven-development skill to add regression test after fix
  • Use explain skill when explaining bug to others
  • Use boy-scout-rule skill while fixing (improve surrounding code)

Notes

  • Use TaskCreate to track investigation steps
  • Document findings even if not fixing immediately
  • Create minimal reproduction case
  • Consider using /fix once root cause is found
  • Add logging/metrics to prevent future issues

Remember

  1. Reproduce first - If you can't reproduce, you can't debug
  2. Gather evidence - Don't guess, look at data
  3. Form hypothesis - What do you think is wrong?
  4. Test systematically - Prove or disprove hypothesis
  5. Find root cause - Not just symptoms
  6. Document - Help future you and others
Debugging is detective work. Be methodical, not random.