Investigate
Systematic methodology for finding the root cause of bugs, failures, and unexpected behavior. Cycle through characterize-isolate-hypothesize-test steps, with oracle escalation for hard problems. Diagnose the root cause — do not apply fixes. Return results for the main agent to act on.
Optional: `$ARGUMENTS` contains the problem description or error message.
Step 1: Characterize
Gather the symptom and establish what is actually happening:
- Collect evidence — error message, stack trace, test output, log entries, or user description of unexpected behavior
- Classify the problem type:
| Signal | Type |
|---|---|
| Stack trace / exception | Runtime error |
| Test assertion failure | Test failure |
| Compilation / bundler / build error | Build failure |
| Type checker error (tsc, mypy, pyright) | Type error |
| Slow response / high CPU / memory growth | Performance |
| "It does X instead of Y" / no error | Unexpected behavior |
- Establish reproduction — run the failing command, test, or operation. If the problem cannot be reproduced (intermittent, environment-specific), document the constraints and proceed with historical evidence.
Record the exact reproduction command and its output for verification.
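One way to capture the reproduction command together with its output is a small wrapper like the sketch below. The function name `record_repro` and the log layout are illustrative assumptions, not part of the methodology itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a reproduction command and save the command line,
# its combined stdout/stderr, and its exit code to a log file.
record_repro() {
  local log="$1"
  shift
  # "$*" joins the arguments with spaces; original quoting is not preserved.
  printf 'command: %s\n' "$*" > "$log"
  printf '%s\n' '--- output ---' >> "$log"
  # Capture both streams so error messages land next to regular output.
  "$@" >> "$log" 2>&1
  local status=$?
  printf '%s %d\n' '--- exit code:' "$status" >> "$log"
  return "$status"
}
```

Calling `record_repro repro.log <failing command>` leaves a log that can be compared against a later run of the same command once a fix has been applied.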
Step 2: Isolate
Narrow from "something is wrong" to "the problem is in this area." Read references/problem-type-playbooks.md for type-specific first moves and tool sequences.
Git Archeology
For all problem types, check what changed recently near the failure point:
bash
git log --oneline -20 -- <file>
git blame -L <start>,<end> <file>
If a known-good state exists (e.g., "this worked yesterday"), consider `git bisect` to pinpoint the breaking commit.
Scope Narrowing
- Stack traces: Read the throwing function and its callers — full functions, not just the flagged line
- Test failures: Read both the test and the system under test
- Build errors: Read the config file and the referenced source
- Unexpected behavior: Trace the data flow from input to the unexpected output
Step 3: Hypothesize
Generate 2-4 hypotheses ranked by likelihood. Each hypothesis must be falsifiable — specify what evidence would confirm or refute it.
Format:
H1 (most likely): [description] — confirmed if [X], refuted if [Y]
H2: [description] — confirmed if [X], refuted if [Y]
H3: [description] — confirmed if [X], refuted if [Y]
Parallel Investigation
For complex problems with 3+ hypotheses and a non-obvious root cause, spawn parallel background investigators simultaneously.
Spawn condition: 3+ hypotheses AND the problem is not a simple typo, missing import, or syntax error.
Skip when 1-2 hypotheses are obvious (e.g., stack trace points directly to the bug).
Launch in parallel using `run_in_background: true`:
- One subagent per hypothesis: each receives the hypothesis, relevant file paths, what evidence to look for, and instructions to report confirmed / refuted / inconclusive with evidence. Budget: max 5 tool calls per subagent.
- Codex exec (read-only): run the `/codex` skill in exec mode with a focused prompt describing the problem, reproduction, and files examined. This provides an independent perspective that may spot patterns the hypothesis-driven subagents miss. Run the `/evaluate-findings` skill on its output.
After all investigators complete, merge results. Codex findings that overlap with a subagent's confirmed hypothesis reinforce confidence. Novel codex findings become additional hypotheses to test in Step 4.
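The fan-out and merge can be pictured with a plain shell analogy. This sketch does not model the agent tooling or its background option; the investigator commands and the `H<i>.txt` result files are hypothetical stand-ins for subagents and their reports.

```shell
#!/usr/bin/env bash
# Illustrative analogy only: each "investigator" is any command whose first
# output line is a verdict (confirmed / refuted / inconclusive).
investigate_in_parallel() {
  local outdir="$1"
  shift
  local i=0 cmd
  for cmd in "$@"; do
    i=$((i + 1))
    # Start one background job per hypothesis; results go to H<i>.txt.
    sh -c "$cmd" > "$outdir/H$i.txt" 2>&1 &
  done
  # Block until every investigator has finished before merging.
  wait
  local n=1
  while [ "$n" -le "$i" ]; do
    printf 'H%d: %s\n' "$n" "$(head -n 1 "$outdir/H$n.txt")"
    n=$((n + 1))
  done
}
```

Merging only after `wait` returns mirrors the rule that results are combined once all investigators complete, so a slow investigator cannot be silently dropped from the summary.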
Step 4: Test
Verify each hypothesis with minimal, targeted actions:
| Action Type | Tool |
|---|---|
| Find usage or pattern | Grep |
| Read surrounding code | Read |
| Check recent changes | Bash (git commands such as `git log` or `git blame`) |
| Run isolated test | Bash (specific test command) |
| Check dependency version | Bash (package manager version query) |
| Inspect runtime state | Bash (add temporary logging, run, check output) |
Record each result:
| Hypothesis | Verdict | Evidence |
|---|---|---|
| H1 | confirmed / refuted / inconclusive | [what was found] |
| H2 | confirmed / refuted / inconclusive | [what was found] |
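Keeping the verdict table in a file makes it easy to carry results across cycles. The helpers below are a minimal sketch; `record_verdict` and `print_verdicts` are hypothetical names, and the evidence strings in the usage note are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: append one verdict per tested hypothesis, then print
# the collected rows as a markdown table.
record_verdict() {
  local file="$1" hypothesis="$2" verdict="$3" evidence="$4"
  # Reject anything outside the three verdicts the methodology allows.
  case "$verdict" in
    confirmed|refuted|inconclusive) ;;
    *) echo "unknown verdict: $verdict" >&2; return 1 ;;
  esac
  printf '| %s | %s | %s |\n' "$hypothesis" "$verdict" "$evidence" >> "$file"
}

print_verdicts() {
  printf '%s\n' '| Hypothesis | Verdict | Evidence |' '|---|---|---|'
  cat "$1"
}
```

For example, `record_verdict verdicts.md H1 refuted "<what was found>"` followed by `print_verdicts verdicts.md` reproduces the table layout shown above.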
Iteration
If all hypotheses are refuted or inconclusive:
- Document what was learned — each refuted hypothesis eliminates a possibility and narrows the search
- Return to Step 2 with the new information to re-isolate
- Generate new hypotheses in Step 3 based on updated understanding
Cycle budget: maximum 2 full cycles (hypothesize → test → learn → repeat) before escalating.
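The cycle budget amounts to a bounded loop. In this sketch the cycle command is a placeholder supplied by the caller, assumed to exit 0 once a hypothesis is confirmed and non-zero when every hypothesis in that cycle was refuted or inconclusive.

```shell
#!/usr/bin/env bash
# Sketch of the cycle budget: try at most MAX_CYCLES hypothesis cycles, then
# signal that escalation should be offered.
MAX_CYCLES=2

investigate_with_budget() {
  local cycle=1
  while [ "$cycle" -le "$MAX_CYCLES" ]; do
    # The caller's command receives the cycle number as its last argument.
    if "$@" "$cycle"; then
      echo "root cause confirmed in cycle $cycle"
      return 0
    fi
    cycle=$((cycle + 1))
  done
  echo "no confirmed root cause after $MAX_CYCLES cycles: offer escalation"
  return 1
}
```

A non-zero return here means only that escalation should be offered, not that it happens automatically.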
Escalation
After 2 failed hypothesis cycles, offer escalation to `/oracle` via `AskUserQuestion`:
Investigation stalled after [N] hypothesis cycles.
Tested: [summary of hypotheses and evidence]
Remaining unknowns: [what is still unclear]
Escalate to Oracle? (consults external model with full context)
Proceed only if the user approves.
Investigation Report
Present results using `AskUserQuestion`:
Investigation Report:
Problem: [one-line description]
Type: [runtime error | test failure | build failure | type error | performance | unexpected behavior]
Root cause: [confirmed cause, or "unresolved" with best hypothesis]
Evidence:
- [what confirmed the root cause]
Suggested fix: [description of what to change, or "needs further investigation"]
Reproduction command: [command to verify the fix once applied]
Hypotheses tested:
1. [hypothesis] — [confirmed/refuted/inconclusive] — [evidence]
2. [hypothesis] — [confirmed/refuted/inconclusive] — [evidence]
Escalation: [none | oracle]
Rules
- If the problem turns out to be environmental (wrong Node version, missing dependency, OS-specific), report that clearly — it may not require a code fix.
- If the problem is in a dependency (not the project's code), document the dependency issue and suggest workaround options rather than patching the dependency.
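For the environmental case, a quick version comparison can rule a runtime mismatch in or out before any code is read. The sketch below assumes the caller already knows the required major version; `check_major_version` is a hypothetical helper, and the version string is assumed to look like the `v20.11.1` form printed by `node --version`.

```shell
#!/usr/bin/env bash
# Hypothetical check: compare a required major version against a version
# string such as the "v20.11.1" printed by `node --version`.
check_major_version() {
  local required="$1" actual="$2" major
  major="${actual#v}"      # drop the leading "v" if present
  major="${major%%.*}"     # keep only the part before the first dot
  if [ "$major" = "$required" ]; then
    echo "ok: major version $major matches"
  else
    echo "environmental: expected major $required, found $actual"
    return 1
  fi
}
```

A call such as `check_major_version 20 "$(node --version)"` gives a one-line result that can go straight into the Evidence section of the report.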