Debugging Skill

Provides comprehensive debugging capabilities with integrated extended thinking for complex scenarios.

When to Use This Skill

Activate this skill when working with:
  • Error troubleshooting
  • Log analysis
  • Performance debugging
  • Distributed system debugging
  • Memory and resource issues
  • Complex, multi-layered bugs requiring deep reasoning

Extended Thinking for Complex Debugging

When to Enable Extended Thinking

Use extended thinking (Claude's deeper reasoning mode) for debugging when:
  1. Root Cause Unknown: Multiple possible causes, unclear failure patterns
  2. Intermittent Issues: Race conditions, timing issues, non-deterministic failures
  3. Multi-System Failures: Distributed system bugs spanning multiple services
  4. Performance Mysteries: Unexpected slowdowns without obvious bottlenecks
  5. Complex State Issues: Bugs involving intricate state transitions or side effects
  6. Security Vulnerabilities: Subtle security issues requiring careful analysis

How to Activate Extended Thinking

```markdown
# In your debugging prompt

Claude, please use extended thinking to help debug this issue:
[Describe the problem with symptoms, context, and what you've tried]

Extended thinking will provide:
- Systematic hypothesis generation
- Multi-path investigation strategies
- Deeper pattern recognition
- Cross-domain insights (e.g., network + application + infrastructure)
```

Hypothesis-Driven Debugging Framework

Use this structured approach for complex bugs:

1. Observation Phase

What happened?
- Error message/stack trace
- Frequency (always/intermittent)
- When it started
- Environmental context
- Recent changes

2. Hypothesis Generation

Generate 3-5 plausible hypotheses:

H1: [Most likely cause based on symptoms]
   Evidence for: [...]
   Evidence against: [...]
   Test: [How to validate/invalidate]

H2: [Alternative explanation]
   Evidence for: [...]
   Evidence against: [...]
   Test: [How to validate/invalidate]

H3: [Edge case or rare scenario]
   Evidence for: [...]
   Evidence against: [...]
   Test: [How to validate/invalidate]

3. Systematic Testing

Priority order (high to low confidence):
1. Test H1 → Result: [Pass/Fail/Inconclusive]
2. Test H2 → Result: [Pass/Fail/Inconclusive]
3. Test H3 → Result: [Pass/Fail/Inconclusive]

New evidence discovered:
- [Finding 1]
- [Finding 2]

Revised hypotheses if needed:
- [...]

4. Root Cause Identification

Confirmed root cause: [...]
Contributing factors: [...]
Why it wasn't caught earlier: [...]

5. Fix + Validation

Fix implemented: [...]
Tests added: [...]
Validation: [...]
Prevention: [...]

Structured Debugging Templates

Template 1: MECE Bug Analysis (Mutually Exclusive, Collectively Exhaustive)

```markdown
# Bug: [Title]

## Problem Statement

- What: [Precise description]
- Where: [System/component]
- When: [Conditions/triggers]
- Impact: [Severity/scope]

## MECE Hypothesis Tree

Layer 1: System Boundaries
- Frontend issue
- Backend API issue
- Database issue
- Infrastructure/network issue
- External dependency issue

Layer 2: Component-Specific (based on Layer 1 finding)
- [Sub-component A]
- [Sub-component B]
- [Sub-component C]

Layer 3: Code-Level (based on Layer 2 finding)
- Logic error
- State management
- Resource handling
- Configuration

## Investigation Log

| Time | Action | Result | Next Step |
|------|--------|--------|-----------|
| [HH:MM] | [What you tested] | [Finding] | [Decision] |

## Root Cause

[Final determination with evidence]

## Fix

[Solution with rationale]
```

Template 2: 5 Whys Analysis

```markdown
# Issue: [Brief description]

Symptom: [Observable problem]

Why 1: Why did this happen? → [Answer]
Why 2: Why did [answer from Why 1] occur? → [Answer]
Why 3: Why did [answer from Why 2] occur? → [Answer]
Why 4: Why did [answer from Why 3] occur? → [Answer]
Why 5: Why did [answer from Why 4] occur? → [Root cause]

Fix: [Addresses root cause]
Prevention: [Process/check to prevent recurrence]
```

Template 3: Timeline Reconstruction

```markdown
# Incident Timeline: [Event]

Goal: Reconstruct exact sequence leading to failure

| Time | Event | System State | Evidence |
|------|-------|--------------|----------|
| T-5min | [Normal operation] | [State] | [Logs] |
| T-2min | [Trigger event] | [State change] | [Logs/metrics] |
| T-30s | [Cascade starts] | [Degraded] | [Alerts] |
| T-0 | [Failure] | [Failed state] | [Error logs] |
| T+5min | [Recovery action] | [Recovering] | [Actions taken] |

Critical Path: [Sequence of events that led to failure]
Alternative Scenarios: [What could have prevented it at each step]
```
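
A timeline like the one in Template 3 can be assembled from events collected across several sources. The sketch below is one possible approach under stated assumptions: the `build_timeline` helper, the `(timestamp, source, message)` tuple shape, and the sample events are hypothetical and not taken from the original skill.

```python
from datetime import datetime


def build_timeline(events, failure_time):
    """Sort (timestamp, source, message) tuples and label each with its offset from the failure."""
    rows = []
    for ts, source, message in sorted(events, key=lambda e: e[0]):
        offset = int((ts - failure_time).total_seconds())
        label = "T-0" if offset == 0 else f"T{offset:+d}s"
        rows.append((label, source, message))
    return rows


# Hypothetical events gathered from different systems, deliberately out of order.
failure = datetime(2024, 1, 15, 10, 30, 0)
events = [
    (datetime(2024, 1, 15, 10, 30, 0), "api", "500 responses begin"),
    (datetime(2024, 1, 15, 10, 28, 0), "deploy", "config change rolled out"),
    (datetime(2024, 1, 15, 10, 29, 30), "db", "connection pool saturated"),
]

timeline = build_timeline(events, failure)
for label, source, message in timeline:
    print(f"{label:>8}  {source:<7} {message}")
```

Sorting on the timestamp alone assumes the clocks of the contributing systems are reasonably synchronized; if they are not, the offsets need correcting before the sequence can be trusted.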

Python Debugging Patterns

Hypothesis-Driven Python Debugging Example

```python
"""
Bug: API endpoint returns 500 error intermittently
Symptoms: 1 in 10 requests fail, always with same user IDs
Hypothesis: Race condition in user data caching
"""

# H1: Cache key collision between users
# Test: Add detailed logging around cache operations
import logging

logging.basicConfig(level=logging.DEBUG)


# `cache`, `db`, and `User` are assumed to be provided by the application.
def get_user(user_id):
    cache_key = f"user:{user_id}"
    logging.debug(f"Fetching cache key: {cache_key} for user {user_id}")

    cached = cache.get(cache_key)
    if cached:
        logging.debug(f"Cache hit: {cache_key} -> {cached}")
        return cached

    user = db.query(User).filter_by(id=user_id).first()
    logging.debug(f"DB fetch for user {user_id}: {user}")

    cache.set(cache_key, user, timeout=300)
    logging.debug(f"Cache set: {cache_key} -> {user}")

    return user

# Result: Discovered cache_key had different format in different code paths
# Root cause: String formatting inconsistency (f"user:{id}" vs f"user_{id}")
```
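
One way to prevent the key-format inconsistency found above is to build the key in a single helper that every code path calls. This is a sketch only: the `user_cache_key` name is hypothetical, and normalizing with `int()` assumes user IDs are integers, which the original example does not state.

```python
def user_cache_key(user_id):
    """Single source of truth for the user cache key format."""
    # Assumption: user IDs are integers, so "42" and 42 should map to the same key.
    return f"user:{int(user_id)}"


# Both the read path and the write path call the helper instead of formatting inline.
read_key = user_cache_key(42)
write_key = user_cache_key("42")
print(read_key, write_key)
```

With the format defined in one place, a future change to the key scheme cannot leave one code path behind.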

Advanced Debugging with Context Managers

```python
import logging
import time
from contextlib import contextmanager


@contextmanager
def debug_timer(operation_name):
    """Time operations and log if slow"""
    start = time.perf_counter()
    try:
        yield
    finally:
        duration = time.perf_counter() - start
        if duration > 1.0:  # Slow operation threshold
            logging.warning(
                f"{operation_name} took {duration:.2f}s",
                extra={'operation': operation_name, 'duration': duration}
            )

# Usage
with debug_timer("database_query"):
    results = db.query(User).filter(...).all()


# capture_state() and compare_states() are placeholders for project-specific helpers.
@contextmanager
def hypothesis_test(hypothesis_name, expected_outcome):
    """Test and validate debugging hypotheses"""
    print(f"\n=== Testing: {hypothesis_name} ===")
    print(f"Expected: {expected_outcome}")
    start_state = capture_state()
    try:
        yield
    finally:
        end_state = capture_state()
        outcome = compare_states(start_state, end_state)
        print(f"Actual: {outcome}")
        print(f"Hypothesis {'CONFIRMED' if outcome == expected_outcome else 'REJECTED'}")

# Usage
with hypothesis_test(
    "H1: Database connection pool exhaustion",
    expected_outcome="pool_size increases during load"
):
    # Run load test
    for i in range(100):
        api_call()
```

pdb Debugger with Advanced Techniques

```python
# Basic breakpoint
import pdb; pdb.set_trace()

# Python 3.7+
breakpoint()

# Conditional breakpoint
if user_id == 12345:
    breakpoint()

# Post-mortem debugging (debug after crash)
import pdb
try:
    risky_function()
except Exception:
    pdb.post_mortem()

# Common pdb commands
# n(ext)     - Execute next line
# s(tep)     - Step into function
# c(ontinue) - Continue execution
# p expr     - Print expression
# pp expr    - Pretty print
# l(ist)     - Show source code
# w(here)    - Show stack trace
# u(p)       - Move up stack frame
# d(own)     - Move down stack frame
# b(reak)    - Set breakpoint
# cl(ear)    - Clear breakpoint
# q(uit)     - Quit debugger

# Advanced: Programmatic debugging
import pdb
pdb.run('my_function()', globals(), locals())
```

Logging

```python
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('debug.log'),
        logging.StreamHandler()
    ]
)

logger = logging.getLogger(__name__)

logger.debug("Debug message")
logger.info("Info message")
logger.warning("Warning message")
logger.error("Error message", exc_info=True)
```

Exception Handling

```python
import traceback

try:
    result = risky_operation()
except Exception as e:
    # Log full traceback
    logger.error(f"Operation failed: {e}")
    logger.error(traceback.format_exc())

    # Or get traceback as string
    tb = traceback.format_exception(type(e), e, e.__traceback__)
    error_details = ''.join(tb)
```
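
As an alternative to formatting the traceback by hand, the standard library's `logger.exception` logs at ERROR level and appends the active traceback automatically when called inside an `except` block. The sketch below captures the output in a `StringIO` buffer purely so it can be inspected; the logger name and the failing `risky_operation` are stand-ins for illustration.

```python
import io
import logging

# Route this demo logger to an in-memory buffer instead of the console.
stream = io.StringIO()
logger = logging.getLogger("exception_demo")
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False


def risky_operation():
    # Stand-in for real work that fails.
    raise ValueError("bad input")


try:
    risky_operation()
except Exception:
    # Logs the message at ERROR level followed by the current traceback.
    logger.exception("Operation failed")

captured = stream.getvalue()
print(captured)
```

This keeps the message and the traceback in a single log record, which is easier to correlate than two separate `logger.error` calls.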

JavaScript/Node.js Debugging

Hypothesis-Driven JavaScript Debugging Example

```javascript
/**
 * Bug: Memory leak in websocket connections
 * Symptoms: Memory grows over time, eventually crashes
 * Hypothesis: Event listeners not cleaned up on disconnect
 */

// H1: Event listeners accumulating
// Test: Track listener counts
class WebSocketManager {
  constructor() {
    this.connections = new Map();
    this.debugListenerCounts = true;
  }

  addConnection(userId, socket) {
    console.debug(`[H1 Test] Adding connection for user ${userId}`);

    if (this.debugListenerCounts) {
      console.debug(`[H1] Listener count before: ${socket.listenerCount('message')}`);
    }

    socket.on('message', (data) => this.handleMessage(userId, data));
    socket.on('close', () => this.removeConnection(userId));

    if (this.debugListenerCounts) {
      console.debug(`[H1] Listener count after: ${socket.listenerCount('message')}`);
    }

    this.connections.set(userId, socket);
  }

  removeConnection(userId) {
    console.debug(`[H1 Test] Removing connection for user ${userId}`);

    const socket = this.connections.get(userId);
    if (socket) {
      const messageListenerCount = socket.listenerCount('message');
      console.debug(`[H1] Listeners still attached: ${messageListenerCount}`);

      // Result: Found 3+ listeners on same event!
      // Root cause: Not removing listeners on reconnect
      socket.removeAllListeners();
      this.connections.delete(userId);
    }
  }
}
```

Advanced Console Debugging

```javascript
// Basic logging
console.log('Basic log');
console.error('Error message');
console.warn('Warning');

// Object inspection with depth
console.dir(object, { depth: null, colors: true });
console.table(array);

// Performance timing
console.time('operation');
// ... code ...
console.timeEnd('operation');

// Memory usage
console.memory; // Chrome only

// Stack trace
console.trace('Trace point');

// Grouping for organized logs
console.group('User Authentication Flow');
console.log('Step 1: Validate credentials');
console.log('Step 2: Generate token');
console.groupEnd();

// Conditional logging
const debug = (label, data) => {
  if (process.env.DEBUG) {
    console.log(`[DEBUG] ${label}:`, JSON.stringify(data, null, 2));
  }
};

// Hypothesis testing helper
function testHypothesis(name, test, expected) {
  console.group(`Testing: ${name}`);
  console.log(`Expected: ${expected}`);
  const actual = test();
  console.log(`Actual: ${actual}`);
  console.log(`Result: ${actual === expected ? 'PASS' : 'FAIL'}`);
  console.groupEnd();
  return actual === expected;
}

// Usage
testHypothesis(
  'H1: Cache returns stale data',
  () => cache.get('key').timestamp,
  Date.now()
);
```

Debugging Async/Promise Issues

```javascript
// Track promise chains
const debugPromise = (label, promise) => {
  console.log(`[${label}] Started`);
  return promise
    .then(result => {
      console.log(`[${label}] Resolved:`, result);
      return result;
    })
    .catch(error => {
      console.error(`[${label}] Rejected:`, error);
      throw error;
    });
};

// Usage (inside an async function or an ES module with top-level await)
await debugPromise('DB Query', db.users.findOne({ id: 123 }));

// Helper used below: resolve after the given number of milliseconds
const delay = (ms) => new Promise(resolve => setTimeout(resolve, ms));

// Debugging race conditions
async function debugRaceCondition() {
  const operations = [
    { name: 'Op1', fn: async () => { await delay(100); return 'A'; } },
    { name: 'Op2', fn: async () => { await delay(50); return 'B'; } },
    { name: 'Op3', fn: async () => { await delay(150); return 'C'; } }
  ];

  const results = await Promise.allSettled(
    operations.map(async op => {
      const start = Date.now();
      const result = await op.fn();
      const duration = Date.now() - start;
      console.log(`${op.name} completed in ${duration}ms: ${result}`);
      return { op: op.name, result, duration };
    })
  );

  console.table(results.map(r => r.value));
}

// Debugging memory leaks with weak references
class DebugMemoryLeaks {
  constructor() {
    this.weakMap = new WeakMap();
    this.strongRefs = new Map();
  }

  trackObject(id, obj) {
    // Weak reference - will be GC'd if no other references
    this.weakMap.set(obj, { id, created: Date.now() });

    // Strong reference - prevents GC (potential leak source)
    this.strongRefs.set(id, obj);

    console.log(`Tracking ${id}: Strong refs=${this.strongRefs.size}`);
  }

  release(id) {
    this.strongRefs.delete(id);
    console.log(`Released ${id}: Strong refs=${this.strongRefs.size}`);
  }

  checkLeaks() {
    console.log(`Potential leaks: ${this.strongRefs.size} strong references`);
    return Array.from(this.strongRefs.keys());
  }
}
```

Node.js Inspector

```bash
# Start with inspector
node --inspect app.js
node --inspect-brk app.js  # Break on first line

# Debug with Chrome DevTools
# Open chrome://inspect
```

VS Code Debug Configuration

```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "node",
      "request": "launch",
      "name": "Debug Agent",
      "program": "${workspaceFolder}/src/index.js",
      "env": {
        "NODE_ENV": "development"
      }
    }
  ]
}
```

Container Debugging

Docker

```bash
# View logs
docker logs <container> --tail=100 -f

# Execute shell
docker exec -it <container> /bin/sh

# Inspect container
docker inspect <container>

# Resource usage
docker stats <container>

# Debug running container
docker run -it --rm \
  --network=container:<target> \
  nicolaka/netshoot
```

Kubernetes

```bash
# Pod logs
kubectl logs <pod> -n agents -f
kubectl logs <pod> -n agents --previous  # Previous crash

# Execute in pod
kubectl exec -it <pod> -n agents -- /bin/sh

# Debug with ephemeral container
kubectl debug <pod> -n agents -it --image=busybox

# Port forward for local debugging
kubectl port-forward <pod> 8080:8080 -n agents

# Events
kubectl get events -n agents --sort-by='.lastTimestamp'

# Resource usage
kubectl top pods -n agents
```

Log Analysis

Pattern Matching

```bash
# Search logs for errors (-E enables alternation with |)
grep -iE "error|exception|failed" app.log

# Count occurrences
grep -c "ERROR" app.log

# Context around matches
grep -B 5 -A 5 "OutOfMemory" app.log

# Filter by time range
awk '/2024-01-15 10:00/,/2024-01-15 11:00/' app.log
```

JSON Logs

```bash
# Parse JSON logs with jq
cat app.log | jq 'select(.level == "error")'
cat app.log | jq 'select(.timestamp > "2024-01-15T10:00:00")'

# Extract specific fields
cat app.log | jq -r '[.timestamp, .level, .message] | @tsv'
```
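
When `jq` is not available, the same selection can be approximated in Python. The following is a rough sketch assuming one JSON object per line with `timestamp`, `level`, and `message` fields, as in the `jq` filters above; the `select_logs` helper and the sample lines are hypothetical.

```python
import json


def select_logs(lines, level=None, since=None):
    """Yield parsed JSON log records matching the level and newer than `since`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as stray stack traces
        if not isinstance(record, dict):
            continue
        if level is not None and record.get("level") != level:
            continue
        # ISO 8601 timestamps in the same format compare correctly as strings.
        if since is not None and record.get("timestamp", "") <= since:
            continue
        yield record


# Hypothetical log lines for illustration.
sample = [
    '{"timestamp": "2024-01-15T09:59:00", "level": "error", "message": "early failure"}',
    '{"timestamp": "2024-01-15T10:05:00", "level": "info", "message": "heartbeat"}',
    'not json',
    '{"timestamp": "2024-01-15T10:07:00", "level": "error", "message": "db timeout"}',
]

matches = list(select_logs(sample, level="error", since="2024-01-15T10:00:00"))
for record in matches:
    print(record["timestamp"], record["level"], record["message"], sep="\t")
```

To run it against a file, pass an open file handle as `lines`; the generator reads one line at a time, so large logs are not loaded into memory at once.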

Performance Debugging

Python Profiling

```python
# cProfile
import cProfile
cProfile.run('main()', 'output.prof')

# Line profiler (run with `kernprof -l script.py`, which injects @profile)
@profile
def slow_function():
    pass

# Memory profiler
from memory_profiler import profile

@profile
def memory_heavy():
    pass
```
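
Collected profile data still has to be read. The sketch below uses the standard library's `pstats` module to print the most expensive calls by cumulative time; `slow_sum` is a made-up workload, and the report is written to a `StringIO` buffer only so it can be examined programmatically.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Made-up workload to give the profiler something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Sort by cumulative time and keep the top five entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

A file written by `cProfile.run('main()', 'output.prof')` can be loaded the same way with `pstats.Stats('output.prof')`.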

Network Debugging

```bash
# Check connectivity
ping <host>
telnet <host> <port>
nc -zv <host> <port>

# DNS resolution
nslookup <host>
dig <host>

# HTTP debugging
curl -v http://localhost:8080/health
curl -X POST -d '{"test": true}' -H "Content-Type: application/json" http://localhost:8080/api
```
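
The `nc -zv` connectivity check can also be scripted, which helps when it needs to run from inside an application or a test. This is a minimal sketch using the standard `socket` module; the `port_open` helper is a hypothetical name, and the local listener exists only to make the demo self-contained.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Self-contained demo: listen on an ephemeral local port, then probe it.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
open_port = server.getsockname()[1]

reachable = port_open("127.0.0.1", open_port)
server.close()
unreachable = port_open("127.0.0.1", open_port, timeout=0.5)
print(reachable, unreachable)
```

A successful TCP connect only shows that something is listening; it says nothing about whether the service behind the port is healthy, so pair it with an application-level check such as the `curl` health request.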

Common Debug Checklist

  1. Check Logs: Application, system, container logs
  2. Verify Configuration: Environment variables, config files
  3. Test Connectivity: Network, database, external services
  4. Check Resources: CPU, memory, disk space
  5. Review Recent Changes: Git log, deployment history
  6. Reproduce Locally: Same environment, same data
  7. Binary Search: Isolate the problem scope
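
The binary search step in the checklist can be sketched as follows. The code assumes an ordered list of candidates (commits, config versions, input sizes) where every candidate after the first bad one is also bad, the same assumption `git bisect` relies on. The `first_bad` helper, the commit labels, and the `is_bad` predicate are all hypothetical.

```python
def first_bad(candidates, is_bad):
    """Binary search for the first candidate where is_bad(candidate) is True.

    Returns None if no candidate is bad.
    """
    lo, hi = 0, len(candidates)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(candidates[mid]):
            hi = mid          # first bad candidate is at mid or earlier
        else:
            lo = mid + 1      # first bad candidate is after mid
    return candidates[lo] if lo < len(candidates) else None


# Hypothetical history of eight commits; the regression landed in "f6".
commits = ["a1", "b2", "c3", "d4", "e5", "f6", "g7", "h8"]
checked = []


def is_bad(commit):
    # In practice this would build the commit and run the failing test.
    checked.append(commit)
    return commit >= "f6"


culprit = first_bad(commits, is_bad)
print(culprit, len(checked))
```

Eight candidates are narrowed down in three checks here, which is why bisection pays off when each check is an expensive build or deployment.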

Debugging Decision Tree

Use this decision tree to determine the right debugging approach:
START: What kind of bug?
├─ Known error message/stack trace
│  └─ Use: Direct log analysis + Stack trace walkthrough
├─ Intermittent/Race condition
│  └─ Use: Extended thinking + Timeline reconstruction + Hypothesis-driven
├─ Performance degradation
│  └─ Use: Profiling + Hypothesis-driven + MECE analysis
├─ Distributed system failure
│  └─ Use: Extended thinking + Timeline reconstruction + Multi-system tracing
├─ Complex state bug
│  └─ Use: Extended thinking + Hypothesis-driven + pdb/debugger
├─ Memory leak
│  └─ Use: Memory profiling + Hypothesis-driven + Weak reference analysis
└─ Unknown root cause
   └─ Use: Extended thinking + MECE analysis + 5 Whys
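
For tooling that routes bug reports automatically, the decision tree can be encoded as a simple lookup. This is an illustrative sketch: the approaches are copied from the tree, while the dictionary keys and the `recommend` helper are names invented for this example.

```python
# Approaches copied from the decision tree; the keys are hypothetical labels.
APPROACHES = {
    "known_error": ["Direct log analysis", "Stack trace walkthrough"],
    "intermittent": ["Extended thinking", "Timeline reconstruction", "Hypothesis-driven"],
    "performance": ["Profiling", "Hypothesis-driven", "MECE analysis"],
    "distributed": ["Extended thinking", "Timeline reconstruction", "Multi-system tracing"],
    "complex_state": ["Extended thinking", "Hypothesis-driven", "pdb/debugger"],
    "memory_leak": ["Memory profiling", "Hypothesis-driven", "Weak reference analysis"],
    "unknown": ["Extended thinking", "MECE analysis", "5 Whys"],
}


def recommend(bug_type):
    """Return the suggested approaches, falling back to the unknown-root-cause branch."""
    return APPROACHES.get(bug_type, APPROACHES["unknown"])


plan = recommend("memory_leak")
fallback = recommend("something_else")
print(" + ".join(plan))
```

Unrecognized categories fall through to the unknown-root-cause branch, mirroring the last leaf of the tree.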

Best Practices for Complex Debugging

1. Document Your Investigation

Always maintain a debugging log:
```markdown
# Bug Investigation: [Title]

Start Time: 2024-01-15 10:00
Investigator: [Name]

## Timeline

- 10:00 - Started investigation, checked logs
- 10:15 - Found error pattern in auth service
- 10:30 - Hypothesis: Cache expiration race condition
- 10:45 - Added debug logging, confirmed hypothesis
- 11:00 - Implemented fix, testing

## Hypotheses Tested

- H1: Cache race condition (CONFIRMED)
- H2: Database connection pool (REJECTED)
- H3: Network timeout (NOT TESTED)

## Root Cause

[Final determination]

## Fix Applied

[Solution details]

## Prevention

[How to prevent recurrence]
```

2. Use the Scientific Method

  1. Observe: Gather symptoms, error messages, logs
  2. Hypothesize: Generate 3-5 plausible explanations
  3. Predict: What would you see if the hypothesis is true?
  4. Test: Design experiments to validate/invalidate
  5. Analyze: Compare predictions vs actual results
  6. Conclude: Confirm root cause with evidence
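
The six steps above can be tracked with a small data structure, so that each hypothesis carries its prediction and its outcome. This is a rough sketch under stated assumptions: the `Hypothesis` class, the `Status` enum, and the example predictions are all hypothetical, and a single matching prediction is treated as confirmation only for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    NOT_TESTED = "NOT TESTED"
    CONFIRMED = "CONFIRMED"
    REJECTED = "REJECTED"


@dataclass
class Hypothesis:
    name: str
    prediction: str  # what you expect to observe if the hypothesis is true
    status: Status = Status.NOT_TESTED

    def record(self, prediction_held: bool) -> None:
        """Analyze step: compare the prediction with the observed result."""
        self.status = Status.CONFIRMED if prediction_held else Status.REJECTED


# Illustrative hypotheses; the predictions are made up for this sketch.
hypotheses = [
    Hypothesis("Cache race condition", "Errors cluster around cache expiry times"),
    Hypothesis("Database connection pool", "Pool has no idle connections during errors"),
    Hypothesis("Network timeout", "Failed requests show elevated inter-service latency"),
]

hypotheses[0].record(prediction_held=True)
hypotheses[1].record(prediction_held=False)

for h in hypotheses:
    print(f"{h.name}: {h.status.value}")
```

Writing the prediction down before running the test makes it harder to reinterpret an ambiguous result in favor of a preferred hypothesis.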

3. Leverage Extended Thinking

When to activate extended thinking:
  • Complexity threshold: More than 3 interacting systems
  • Uncertainty high: Multiple equally plausible causes
  • Stakes high: Production outage, security issue, data loss
  • Pattern unclear: No obvious error messages or logs
  • Time-sensitive: Need systematic approach under pressure
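
These criteria can be expressed as a simple checklist function. The sketch below assumes that meeting any single criterion is enough to justify extended thinking; the source does not say whether the criteria combine with "any" or "all", and the function and parameter names are hypothetical.

```python
def should_use_extended_thinking(
    interacting_systems: int,
    plausible_causes: int,
    high_stakes: bool,
    has_clear_error: bool,
    time_sensitive: bool = False,
) -> bool:
    """Return True if any of the listed criteria is met (an assumption of this sketch)."""
    return (
        interacting_systems > 3  # complexity threshold
        or plausible_causes > 1  # several equally plausible causes
        or high_stakes           # production outage, security issue, data loss
        or not has_clear_error   # no obvious error messages or logs
        or time_sensitive        # systematic approach needed under pressure
    )


# A single-service bug with a clear stack trace does not meet any criterion.
print(should_use_extended_thinking(1, 1, high_stakes=False, has_clear_error=True))
```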

4. Avoid Common Pitfalls

AVOID:
- ❌ Changing multiple things at once (can't isolate cause)
- ❌ Assuming first hypothesis is correct (confirmation bias)
- ❌ Debugging without logs/evidence (guessing)
- ❌ Not documenting what you tried (repeating failed attempts)
- ❌ Skipping reproduction step (fix might not work)

DO:
- ✅ Change one variable at a time
- ✅ Test multiple hypotheses systematically
- ✅ Add instrumentation before debugging
- ✅ Keep investigation log
- ✅ Write regression test after fix
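
As an illustration of the last item, a regression test pins down the exact input that triggered the bug. The example is a sketch built on a hypothetical function, `normalize_email`, whose assumed bug was that surrounding whitespace was not stripped; the test is written in pytest style but can also be called directly.

```python
def normalize_email(address: str) -> str:
    """Hypothetical function after the fix: strips whitespace and lowercases."""
    return address.strip().lower()


def test_normalize_email_strips_surrounding_whitespace():
    # Reproduces the input that originally failed, so the bug cannot silently return.
    assert normalize_email("  User@Example.com \n") == "user@example.com"
```

Naming the test after the failure it guards against makes the intent clear when it breaks again later.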

5. Debugging Instrumentation Patterns

```python
# Python: Comprehensive debugging decorator
import functools
import logging
import time

# Module-level logger used by the decorator below
logger = logging.getLogger(__name__)


def debug_trace(func):
    """Decorator to trace function execution with timing and state"""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        func_name = func.__qualname__
        logger.debug(f"→ Entering {func_name}")
        logger.debug(f"  Args: {args}")
        logger.debug(f"  Kwargs: {kwargs}")

        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
            duration = time.perf_counter() - start
            logger.debug(f"← Exiting {func_name} ({duration:.3f}s)")
            logger.debug(f"  Result: {result}")
            return result
        except Exception as e:
            duration = time.perf_counter() - start
            logger.error(f"✗ Exception in {func_name} ({duration:.3f}s): {e}")
            raise  # re-raise so callers still see the original exception

    return wrapper


# Usage
@debug_trace
def complex_operation(user_id, data):
    # Your code here
    pass
```

```javascript
// JavaScript: Comprehensive debugging wrapper
// Note: this uses the legacy (target, propertyKey, descriptor) decorator signature,
// which needs TypeScript's experimentalDecorators option or Babel's legacy decorators plugin.
function debugTrace(label) {
  return function(target, propertyKey, descriptor) {
    const originalMethod = descriptor.value;

    descriptor.value = async function(...args) {
      console.log(`→ Entering ${label || propertyKey}`);
      console.log(`  Args:`, args);

      const start = performance.now();
      try {
        const result = await originalMethod.apply(this, args);
        const duration = performance.now() - start;
        console.log(`← Exiting ${label || propertyKey} (${duration.toFixed(2)}ms)`);
        console.log(`  Result:`, result);
        return result;
      } catch (error) {
        const duration = performance.now() - start;
        console.error(`✗ Exception in ${label || propertyKey} (${duration.toFixed(2)}ms):`, error);
        throw error;
      }
    };

    return descriptor;
  };
}

// Usage
class UserService {
  @debugTrace('UserService.getUser')
  async getUser(userId) {
    // Your code here
  }
}
```

Cross-References and Related Skills

Related Skills

This debugging skill integrates with:
  1. extended-thinking (.claude/skills/extended-thinking/SKILL.md)
    • Use for: Complex bugs with unknown root causes
    • Activation: Add "use extended thinking" to your debugging prompt
    • Benefit: Deeper pattern recognition, systematic hypothesis generation
  2. complex-reasoning (.claude/skills/complex-reasoning/SKILL.md)
    • Use for: Multi-step debugging requiring logical chains
    • Patterns: Chain-of-thought, tree-of-thought for bug investigation
    • Benefit: Structured reasoning through complex bug scenarios
  3. deep-analysis (.claude/skills/deep-analysis/SKILL.md)
    • Use for: Post-mortem analysis, root cause investigation
    • Patterns: Comprehensive code review, architectural analysis
    • Benefit: Identifies systemic issues beyond surface bugs
  4. testing (.claude/skills/testing/SKILL.md)
    • Use for: Writing regression tests after bug fix
    • Integration: Bug → Debug → Fix → Test → Validate
    • Benefit: Ensures bug doesn't recur
  5. kubernetes (.claude/skills/kubernetes/SKILL.md)
    • Use for: Distributed system debugging in K8s
    • Tools: kubectl logs, exec, debug, events
    • Integration: Container debugging patterns

When to Combine Skills

| Scenario | Skills to Combine | Reasoning |
| --- | --- | --- |
| Production outage | debugging + extended-thinking + kubernetes | Complex distributed system requires deep reasoning |
| Intermittent test failure | debugging + testing + complex-reasoning | Need systematic hypothesis testing |
| Performance regression | debugging + deep-analysis | Root cause may be architectural |
| Security vulnerability | debugging + extended-thinking + deep-analysis | Requires careful, thorough analysis |
| Memory leak | debugging + complex-reasoning | Multi-step investigation needed |

Integration Examples

Example 1: Complex Production Bug

```text
# Prompt combining skills
Claude, I have a complex production bug affecting multiple services.
Please use extended thinking and the debugging skill to help investigate.

Symptoms:
- API requests time out intermittently (1 in 50 requests)
- Only affects authenticated users
- Started after recent deployment
- No obvious errors in logs

Please use:
1. MECE analysis to categorize possible causes
2. Hypothesis-driven debugging framework
3. Timeline reconstruction of recent changes
```
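
Timeline reconstruction, requested in the prompt above, usually starts by interleaving log lines from several services into one chronological view. The sketch below assumes each log line has the form `<ISO timestamp> <message>` and that each service's log is already in time order; the service names and log lines are invented for illustration.

```python
import heapq
from datetime import datetime


def parse(line: str, service: str) -> tuple[datetime, str, str]:
    """Split an assumed '<ISO timestamp> <message>' log line."""
    timestamp, message = line.split(" ", 1)
    return datetime.fromisoformat(timestamp), service, message


def merge_timelines(logs: dict[str, list[str]]) -> list[tuple[datetime, str, str]]:
    """Merge per-service logs (each already sorted by time) into one timeline."""
    streams = [[parse(line, service) for line in lines] for service, lines in logs.items()]
    return list(heapq.merge(*streams))


# Invented sample data for two hypothetical services.
logs = {
    "gateway": [
        "2024-01-15T10:00:00 request received",
        "2024-01-15T10:00:05 request timed out",
    ],
    "auth": [
        "2024-01-15T10:00:01 token cache miss",
        "2024-01-15T10:00:04 token refreshed",
    ],
}

for timestamp, service, message in merge_timelines(logs):
    print(f"{timestamp.time()} [{service}] {message}")
```

Note that timestamps from different hosts are only comparable if their clocks are reasonably synchronized; with noticeable clock skew, the merged order can be misleading.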

Example 2: Memory Leak Investigation

```text
# Prompt combining skills
Claude, use complex reasoning and debugging skills to investigate a memory leak.

Context:
- Node.js service memory grows from 200MB to 2GB over 6 hours
- No errors logged
- Happens only in production, not staging

Apply:
1. Hypothesis-driven framework (generate 5 hypotheses)
2. Memory leak detection patterns (weak references)
3. Extended thinking for pattern recognition across codebase
```
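
The weak-reference pattern mentioned in the prompt checks whether an object is actually freed once the code under test is done with it. The prompt concerns a Node.js service; Python is used here only to illustrate the technique, and the `Session` class, `cache` registry, and helper names are hypothetical.

```python
import gc
import weakref


class Session:
    """Hypothetical object suspected of leaking."""


cache = {}  # hypothetical long-lived registry that may hold stray references


def is_released(make_and_use) -> bool:
    """Create an object, drop our reference, and report whether it was freed."""
    obj = make_and_use()
    probe = weakref.ref(obj)  # does not keep the object alive
    del obj
    gc.collect()  # also clears objects kept alive only by reference cycles
    return probe() is None


def leaky():
    session = Session()
    cache["last"] = session  # strong reference outlives the call: a leak
    return session


def clean():
    return Session()


print(is_released(clean))  # True
print(is_released(leaky))  # False
```

If the probe still resolves after garbage collection, something else holds a strong reference, and the next step is to find out what (here, the `cache` dictionary).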

Quick Reference Card

Debugging Workflow Summary

1. OBSERVE
   - Collect error messages, logs, metrics
   - Identify patterns (frequency, conditions, scope)
   - Document symptoms

2. HYPOTHESIZE (use extended thinking if complex)
   - Generate 3-5 plausible hypotheses
   - Rank by likelihood
   - Design tests for each

3. TEST
   - Change one variable at a time
   - Add instrumentation (logging, tracing)
   - Collect evidence

4. ANALYZE
   - Compare predictions vs results
   - Eliminate invalidated hypotheses
   - Refine remaining hypotheses

5. FIX
   - Implement solution
   - Add regression test
   - Document root cause

6. VALIDATE
   - Verify fix in affected environment
   - Monitor metrics
   - Update documentation

Tool Selection Guide

| Problem Type | Primary Tool | Secondary Tools |
| --- | --- | --- |
| Logic error | pdb/debugger | Logging, unit tests |
| Performance | Profiler | Hypothesis testing, metrics |
| Memory leak | Memory profiler | Weak references, heap dumps |
| Async/timing | Timeline reconstruction | Extended thinking, logging |
| Distributed | Tracing (logs) | Kubernetes tools, MECE analysis |
| Unknown cause | Extended thinking | MECE, 5 Whys, hypothesis-driven |

Skill version: 2.0 (Enhanced with extended thinking integration) Last updated: 2024-01-15 Maintained by: Golden Armada AI Agent Fleet