root-cause-tracing

Root Cause Tracing

Overview

Bugs often manifest deep in the call stack (git init in wrong directory, file created in wrong location, database opened with wrong path). Your instinct is to fix where the error appears, but that's treating a symptom.

Core principle: Trace backward through the call chain until you find the original trigger, then fix at the source.

When to Use

Use when:

Error happens deep in execution (not at entry point)
Stack trace shows long call chain
Unclear where invalid data originated
Need to find which test/code triggers the problem

The Tracing Process

1. Observe the Symptom

Error: git init failed in ~/project/packages/core

2. Find Immediate Cause

What code directly causes this?

typescript

await execFileAsync('git', ['init'], { cwd: projectDir });

3. Ask: What Called This?

typescript

WorktreeManager.createSessionWorktree(projectDir, sessionId)
  → called by Session.initializeWorkspace()
  → called by Session.create()
  → called by test at Project.create()

4. Keep Tracing Up

What value was passed?

`projectDir = ''` (empty string!)
Empty string as `cwd` resolves to `process.cwd()`
That's the source code directory!

5. Find Original Trigger

Where did empty string come from?

typescript

const context = setupCoreTest(); // Returns { tempDir: '' }
Project.create('name', context.tempDir); // Accessed before beforeEach!
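
The timing problem behind this trigger can be reproduced without a test framework. The sketch below is an assumption about how `setupCoreTest()` might behave, not its actual implementation: `registerBeforeEach` and `runBeforeEachHooks` are hypothetical stand-ins for a test runner's `beforeEach`, and the temp path is made up.

```typescript
// Hypothetical stand-in for a test runner's beforeEach registry.
const pendingHooks: Array<() => void> = [];

function registerBeforeEach(hook: () => void): void {
  pendingHooks.push(hook);
}

function runBeforeEachHooks(): void {
  for (const hook of pendingHooks) hook();
}

// Assumed shape of the helper: tempDir starts empty and is only
// filled in once the registered hook runs.
function setupCoreTest(): { tempDir: string } {
  const context = { tempDir: '' };
  registerBeforeEach(() => {
    context.tempDir = '/tmp/core-test-example'; // illustrative path
  });
  return context;
}

const context = setupCoreTest();

// Read at module load time, before any hook has run: captures ''.
const capturedTooEarly = context.tempDir;

// A runner fires the hooks just before each test body.
runBeforeEachHooks();

// Read inside the test body: the directory is now set.
const readInsideTest = context.tempDir;

console.error({ capturedTooEarly, readInsideTest });
```

The value is copied when it is read, so a top-level read keeps the empty string even after the hook later fills in the real directory.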

Adding Stack Traces

When you can't trace manually, add instrumentation:

typescript

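import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

// Assumed definition: execFileAsync is a promisified child_process.execFile
const execFileAsync = promisify(execFile);

// Before the problematic operation
async function gitInit(directory: string) {
  // Capture the call chain that led here
  const stack = new Error().stack;
  console.error('DEBUG git init:', {
    directory,
    cwd: process.cwd(),
    nodeEnv: process.env.NODE_ENV,
    stack,
  });

  await execFileAsync('git', ['init'], { cwd: directory });
}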

Critical: Use `console.error()` in tests (not logger - may not show)

Run and capture:

bash

bun test 2>&1 | grep 'DEBUG git init'

Analyze stack traces:

Look for test file names
Find the line number triggering the call
Identify the pattern (same test? same parameter?)
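
Scanning long traces by eye is error-prone, so the first two checks above can be sketched as a small filter. This is an illustration only: `findTestFrames` is a hypothetical helper, it assumes V8-style frames that begin with `at`, and the default pattern assumes test files are named `*.test.*` or `*.spec.*`.

```typescript
// Returns the stack frames that point into test files.
// The default pattern (*.test.* / *.spec.*) is an assumed naming convention.
function findTestFrames(
  stack: string,
  testFilePattern: RegExp = /\.(test|spec)\.[cm]?[jt]sx?\b/,
): string[] {
  return stack
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.startsWith('at ') && testFilePattern.test(line));
}

// Made-up stack in V8 format, for illustration only.
const sampleStack = [
  'Error',
  '    at gitInit (/repo/src/git.ts:12:17)',
  '    at WorktreeManager.createSessionWorktree (/repo/src/worktree.ts:40:5)',
  '    at /repo/src/project.test.ts:8:3',
].join('\n');

const testFrames = findTestFrames(sampleStack);
console.error(testFrames);
```

Each surviving frame carries the test file name and the line number that triggered the call, which is usually enough to spot a repeated test or parameter.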

Finding Which Test Causes Pollution

If something appears during tests but you don't know which test:

Use the bisection script to run tests one-by-one:

Example: find which test creates .git in wrong place

bun test --run --bail 2>&1 | tee test-output.log


Runs tests one-by-one, stops at first polluter.
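
The one-by-one search can be sketched as follows. This is not the bisection script referred to above, only a minimal illustration of the idea; `findFirstPolluter`, `runTestFile`, and `isPolluted` are hypothetical names, and the run is simulated in memory rather than calling a real test runner.

```typescript
// Runs each test file in isolation and reports the first one after
// which the pollution check turns true. Returns null if none pollutes.
function findFirstPolluter(
  testFiles: readonly string[],
  runTestFile: (file: string) => void,
  isPolluted: () => boolean,
): string | null {
  if (isPolluted()) {
    throw new Error('Pollution already present; clean up before searching');
  }
  for (const file of testFiles) {
    runTestFile(file);
    if (isPolluted()) return file; // stop at the first polluter
  }
  return null;
}

// Simulated run: the second file "creates" an unwanted .git entry.
const createdPaths = new Set<string>();
const executedFiles: string[] = [];

const polluter = findFirstPolluter(
  ['a.test.ts', 'b.test.ts', 'c.test.ts'],
  (file) => {
    executedFiles.push(file);
    if (file === 'b.test.ts') createdPaths.add('.git');
  },
  () => createdPaths.has('.git'),
);

console.error('first polluter:', polluter);
```

In a real setup, `runTestFile` might shell out to the test runner for a single file and `isPolluted` might check whether the unwanted path exists on disk; both are assumptions about the surrounding project.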

Real Example: Empty projectDir

Symptom: `.git` created in `packages/core/` (source code)

Trace chain:

`git init` runs in `process.cwd()` ← empty cwd parameter
WorktreeManager called with empty projectDir
Session.create() passed empty string
Test accessed `context.tempDir` before beforeEach
setupCoreTest() returns `{ tempDir: '' }` initially

Root cause: Top-level variable initialization accessing empty value

Fix: Made tempDir a getter that throws if accessed before beforeEach
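
A getter of this kind might look like the sketch below. It is an illustration under assumptions, not the project's actual fix: `createGuardedContext` and `initialize` are hypothetical names, and `initialize` stands in for whatever a real `beforeEach` hook would do.

```typescript
interface GuardedTestContext {
  readonly tempDir: string;
  initialize(dir: string): void;
}

function createGuardedContext(): GuardedTestContext {
  let currentTempDir: string | undefined;
  return {
    // Fails loudly instead of handing out an empty value.
    get tempDir(): string {
      if (!currentTempDir) {
        throw new Error('tempDir accessed before beforeEach ran');
      }
      return currentTempDir;
    },
    // In a real suite this would be called from a beforeEach hook.
    initialize(dir: string): void {
      currentTempDir = dir;
    },
  };
}

const guarded = createGuardedContext();

// Reading too early now throws instead of returning ''.
let earlyAccessError: string | null = null;
try {
  void guarded.tempDir;
} catch (error) {
  earlyAccessError = error instanceof Error ? error.message : String(error);
}

guarded.initialize('/tmp/core-test-example'); // illustrative path
const tempDirAfterSetup = guarded.tempDir;

console.error({ earlyAccessError, tempDirAfterSetup });
```

The error surfaces at the line that reads the value too early, so the stack trace points at the original trigger rather than at `git init` several calls later.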

Also added defense-in-depth:

Layer 1: Project.create() validates directory
Layer 2: WorkspaceManager validates not empty
Layer 3: NODE_ENV guard refuses git init outside tmpdir
Layer 4: Stack trace logging before git init
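
Layers 2 and 3 can be sketched as a single guard that runs before `git init`. The function name `assertSafeGitInitDir`, the helper `isInsideDir`, and the rule that `NODE_ENV === 'test'` restricts the directory to `os.tmpdir()` are assumptions made for illustration; the source does not show how its guards are implemented.

```typescript
import { tmpdir } from 'node:os';
import { isAbsolute, join, relative, resolve, sep } from 'node:path';

// True when child is parent itself or nested somewhere below it.
function isInsideDir(parent: string, child: string): boolean {
  const rel = relative(resolve(parent), resolve(child));
  return rel.split(sep)[0] !== '..' && !isAbsolute(rel);
}

// Throws before git init can run in an unintended directory.
function assertSafeGitInitDir(
  directory: string,
  nodeEnv: string | undefined = process.env.NODE_ENV,
): void {
  if (directory.trim() === '') {
    throw new Error('Refusing git init: directory is empty');
  }
  if (nodeEnv === 'test' && !isInsideDir(tmpdir(), directory)) {
    throw new Error(`Refusing git init outside tmpdir during tests: ${directory}`);
  }
}

// Passes: a session directory under the OS temp directory.
assertSafeGitInitDir(join(tmpdir(), 'session-1'), 'test');

// Throws: the empty string that caused the original bug.
try {
  assertSafeGitInitDir('', 'test');
} catch (error) {
  console.error(String(error));
}
```

Each layer catches the bad value at a different depth, so a regression in one caller still fails before anything is written to the source tree.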

Key Principle

NEVER fix just where the error appears. Trace back to find the original trigger.

Stack Trace Tips

In tests: Use `console.error()`, not logger; logger may be suppressed
Before operation: Log before the dangerous operation, not after it fails
Include context: Directory, cwd, environment variables, timestamps
Capture stack: `new Error().stack` shows complete call chain
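
The tips above can be folded into one reusable helper. The names `buildDebugRecord` and `logBeforeOperation` and the record shape are hypothetical; this is a sketch of one way to bundle the context, not a prescribed format.

```typescript
interface DebugRecord {
  label: string;
  timestamp: string;
  cwd: string;
  nodeEnv: string | undefined;
  details: Record<string, unknown>;
  stack: string | undefined;
}

// Collects the context worth having before a risky operation runs.
function buildDebugRecord(
  label: string,
  details: Record<string, unknown>,
): DebugRecord {
  return {
    label,
    timestamp: new Date().toISOString(),
    cwd: process.cwd(),
    nodeEnv: process.env.NODE_ENV,
    details,
    stack: new Error().stack,
  };
}

// Writes to stderr via console.error so the output is less likely to be suppressed.
function logBeforeOperation(
  label: string,
  details: Record<string, unknown>,
): DebugRecord {
  const record = buildDebugRecord(label, details);
  console.error(`DEBUG ${label}:`, record);
  return record;
}

// Call this immediately before the operation, not in its error handler.
const debugRecord = logBeforeOperation('git init', { directory: '' });
```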

Real-World Impact

From debugging session:

Found root cause through 5-level trace
Fixed at source (getter validation)
Added 4 layers of defense
1847 tests passed, zero pollution
