Investigate

A systematic methodology for finding the root cause of bugs, failures, and unexpected behavior. Cycle through the characterize, isolate, hypothesize, and test steps, escalating to the oracle for hard problems. Diagnose the root cause only; do not apply fixes. Return results for the main agent to act on.
Optional: `$ARGUMENTS` contains the problem description or error message.

Step 1: Characterize

Gather the symptom and establish what is actually happening:
  1. Collect evidence: error message, stack trace, test output, log entries, or user description of unexpected behavior
  2. Classify the problem type:

| Signal | Type |
| --- | --- |
| Stack trace / exception | Runtime error |
| Test assertion failure | Test failure |
| Compilation / bundler / build error | Build failure |
| Type checker error (tsc, mypy, pyright) | Type error |
| Slow response / high CPU / memory growth | Performance |
| "It does X instead of Y" / no error | Unexpected behavior |

  3. Establish reproduction: run the failing command, test, or operation. If the problem cannot be reproduced (intermittent, environment-specific), document the constraints and proceed with historical evidence.
Record the exact reproduction command and its output for verification.
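Recording the reproduction can be scripted. The following is a minimal sketch, assuming a POSIX-style shell; the `record_repro` function name and the log layout are illustrative choices and not part of the methodology itself.

```bash
# record_repro LOGFILE CMD [ARGS...]
# Hypothetical helper: runs CMD, captures stdout and stderr together, and
# writes the command line, exit status, and output to LOGFILE.
# Returns the exit status of CMD so callers can still branch on it.
record_repro() {
  local logfile="$1" output status
  shift
  # Capture the status without tripping `set -e` in the caller.
  output=$("$@" 2>&1) && status=0 || status=$?
  {
    printf 'command: %s\n' "$*"
    printf 'exit status: %s\n' "$status"
    printf 'output:\n%s\n' "$output"
  } > "$logfile"
  return "$status"
}
```

For example, `record_repro repro.log npm test` would leave the command, its exit status, and its combined output in `repro.log`, which can be compared against a later run of the same command.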

Step 2: Isolate

Narrow from "something is wrong" to "the problem is in this area." Read references/problem-type-playbooks.md for type-specific first moves and tool sequences.

Git Archeology

For all problem types, check what changed recently near the failure point:
```bash
git log --oneline -20 -- <file>
git blame -L <start>,<end> <file>
```
If a known-good state exists (e.g., "this worked yesterday"), consider `git bisect` to pinpoint the breaking commit.

Scope Narrowing

  • Stack traces: Read the throwing function and its callers — full functions, not just the flagged line
  • Test failures: Read both the test and the system under test
  • Build errors: Read the config file and the referenced source
  • Unexpected behavior: Trace the data flow from input to the unexpected output

Step 3: Hypothesize

Generate 2-4 hypotheses ranked by likelihood. Each hypothesis must be falsifiable — specify what evidence would confirm or refute it.
Format:
H1 (most likely): [description] — confirmed if [X], refuted if [Y]
H2: [description] — confirmed if [X], refuted if [Y]
H3: [description] — confirmed if [X], refuted if [Y]

Parallel Investigation

For complex problems with 3+ hypotheses and a non-obvious root cause, spawn parallel background investigators simultaneously.
Spawn condition: 3+ hypotheses AND the problem is not a simple typo, missing import, or syntax error.
Skip when 1-2 hypotheses are obvious (e.g., stack trace points directly to the bug).
Launch in parallel using `run_in_background: true`:
  1. One subagent per hypothesis: each receives the hypothesis, relevant file paths, what evidence to look for, and instructions to report confirmed / refuted / inconclusive with evidence. Budget: max 5 tool calls per subagent.
  2. Codex exec (read-only): run the `/codex` skill in exec mode with a focused prompt describing the problem, reproduction, and files examined. This provides an independent perspective that may spot patterns the hypothesis-driven subagents miss. Run the `/evaluate-findings` skill on its output.
After all investigators complete, merge results. Codex findings that overlap with a subagent's confirmed hypothesis reinforce confidence. Novel codex findings become additional hypotheses to test in Step 4.
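The fan-out and merge pattern can be illustrated with plain shell background jobs. This is only an analogy, not the agent's `run_in_background` mechanism: each check is a stand-in command string, and the mapping from exit status to verdict (0 confirmed, 1 refuted, anything else inconclusive) is an assumption made for the sketch. The `investigate_in_parallel` name is hypothetical.

```bash
# investigate_in_parallel OUTDIR CHECK...
# Hypothetical helper: runs one background job per CHECK (a shell command
# string standing in for one hypothesis check), waits for all of them,
# then prints one merged verdict line per hypothesis in order.
investigate_in_parallel() {
  local outdir="$1" i=0 n=1 check rc verdict
  shift
  for check in "$@"; do
    i=$((i + 1))
    (
      sh -c "$check" > "$outdir/H$i.out" 2>&1 && rc=0 || rc=$?
      case $rc in
        0) verdict=confirmed ;;
        1) verdict=refuted ;;
        *) verdict=inconclusive ;;
      esac
      echo "$verdict" > "$outdir/H$i.verdict"
    ) &
  done
  # Block until every investigator has finished before merging.
  wait
  while [ "$n" -le "$i" ]; do
    printf 'H%s: %s\n' "$n" "$(cat "$outdir/H$n.verdict")"
    n=$((n + 1))
  done
}
```

Each job writes to its own files, so the investigators cannot interfere with one another, and the merge step only runs once `wait` has returned.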

Step 4: Test

Verify each hypothesis with minimal, targeted actions:

| Action Type | Tool |
| --- | --- |
| Find usage or pattern | Grep |
| Read surrounding code | Read |
| Check recent changes | Bash (`git log`, `git blame`, `git diff`) |
| Run isolated test | Bash (specific test command) |
| Check dependency version | Bash (`npm ls`, `pip3 show`, etc.) |
| Inspect runtime state | Bash (add temporary logging, run, check output) |

Record each result:

| Hypothesis | Verdict | Evidence |
| --- | --- | --- |
| H1 | confirmed / refuted / inconclusive | [what was found] |
| H2 | confirmed / refuted / inconclusive | [what was found] |
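Keeping the verdicts in one file makes it harder to lose a result between checks. The sketch below appends rows in the same three-column layout used for recording results; the `record_verdict` helper and its file format are assumptions for illustration.

```bash
# record_verdict FILE HYPOTHESIS VERDICT EVIDENCE
# Hypothetical helper: appends one result row to FILE and rejects any
# verdict other than confirmed, refuted, or inconclusive.
record_verdict() {
  local file="$1" hypothesis="$2" verdict="$3" evidence="$4"
  case $verdict in
    confirmed|refuted|inconclusive) ;;
    *) echo "unknown verdict: $verdict" >&2; return 2 ;;
  esac
  # Write the header once, when the file is missing or empty.
  if [ ! -s "$file" ]; then
    printf '| Hypothesis | Verdict | Evidence |\n| --- | --- | --- |\n' > "$file"
  fi
  printf '| %s | %s | %s |\n' "$hypothesis" "$verdict" "$evidence" >> "$file"
}
```

Restricting the verdict to the three allowed values keeps the final report consistent with the hypothesis format defined in Step 3.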

Iteration

If all hypotheses are refuted or inconclusive:
  1. Document what was learned — each refuted hypothesis eliminates a possibility and narrows the search
  2. Return to Step 2 with the new information to re-isolate
  3. Generate new hypotheses in Step 3 based on updated understanding
Cycle budget: maximum 2 full cycles (hypothesize → test → learn → repeat) before escalating.
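The cycle budget amounts to a small decision rule. A minimal sketch follows, assuming that any confirmed hypothesis ends the loop with a report; the `next_step` name and its output words are hypothetical.

```bash
# next_step CYCLES_COMPLETED CONFIRMED_COUNT
# Hypothetical helper: prints what to do after a hypothesize/test cycle.
#   report   - at least one hypothesis was confirmed
#   escalate - the 2-cycle budget is spent with nothing confirmed
#   iterate  - budget remains, return to Step 2
next_step() {
  local cycles="$1" confirmed="$2" max_cycles=2
  if [ "$confirmed" -gt 0 ]; then
    echo report
  elif [ "$cycles" -ge "$max_cycles" ]; then
    echo escalate
  else
    echo iterate
  fi
}
```

For example, `next_step 1 0` prints `iterate`, while `next_step 2 0` prints `escalate`.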

Escalation

After 2 failed hypothesis cycles, offer escalation to `/oracle` via `AskUserQuestion`:
Investigation stalled after [N] hypothesis cycles.

Tested: [summary of hypotheses and evidence]
Remaining unknowns: [what is still unclear]

Escalate to Oracle? (consults external model with full context)
Proceed only if the user approves.

Investigation Report

Present results using `AskUserQuestion`:
Investigation Report:

Problem: [one-line description]
Type: [runtime error | test failure | build failure | type error | performance | unexpected behavior]
Root cause: [confirmed cause, or "unresolved" with best hypothesis]

Evidence:
- [what confirmed the root cause]

Suggested fix: [description of what to change, or "needs further investigation"]
Reproduction command: [command to verify the fix once applied]

Hypotheses tested:
1. [hypothesis] — [confirmed/refuted/inconclusive] — [evidence]
2. [hypothesis] — [confirmed/refuted/inconclusive] — [evidence]

Escalation: [none | oracle]

Rules

  • If the problem turns out to be environmental (wrong Node version, missing dependency, OS-specific), report that clearly — it may not require a code fix.
  • If the problem is in a dependency (not the project's code), document the dependency issue and suggest workaround options rather than patching the dependency.
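When an environmental cause is suspected, a quick snapshot of tool versions gives the report something concrete to cite. The sketch below is one possible approach; the `tool_version` and `env_snapshot` names and the list of tools checked are assumptions and should be adjusted to the project.

```bash
# tool_version TOOL
# Hypothetical helper: prints "TOOL: <first line of TOOL --version>",
# or "TOOL: not installed" when TOOL is not on PATH.
tool_version() {
  local tool="$1" version
  if command -v "$tool" >/dev/null 2>&1; then
    version=$("$tool" --version 2>&1 | head -n 1)
    printf '%s: %s\n' "$tool" "$version"
  else
    printf '%s: not installed\n' "$tool"
  fi
}

# env_snapshot
# Prints the OS name and release, then one line per tool of interest.
env_snapshot() {
  local tool
  uname -sr
  for tool in node npm python3 git; do
    tool_version "$tool"
  done
}
```

Running `env_snapshot` on both a working and a failing machine and comparing the output is a cheap way to confirm or rule out a version mismatch before reading any code.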