why

Why

Investigate the motivation and intent behind code. Why was it built this way? What edge cases were considered? What product, business, or operational constraints shaped the design? What alternatives were rejected, and why?
Companion to the `how` skill. `how` answers what the code does and how it works. `why` answers what forces led to its shape.

How this skill works

Historical context spreads across seven evidence categories: source control history, issue or ticket tracking, long-form documents, real-time team chat, infrastructure observability, error or exception tracking, and product analytics warehouses. You cannot predict from the question alone which one holds the answer, so the skill enumerates available MCPs at run time, maps each to a category, queries all seven in parallel, then synthesizes with explicit confidence calibration. Null results from searched categories are first-class evidence about how the decision was made; report them alongside positive findings. The default is coverage, not minimalism.

Operating Posture

Operate as a careful, cautious, precise investigator. Think like a detective piecing together a historical case from fragmentary records. When the record is thin, say so.
Concretely:
  • Evidence before narrative. Collect the pieces first, then see what story they support. Never pick a story and recruit the evidence that fits it.
  • Precision over polish. Prefer the exact quote and citation over a smooth paraphrase. A reader should be able to follow any claim back to its source and verify it in under a minute.
  • Consider what you haven't seen. The evidence you find is a sample, not the whole truth. Before concluding, ask what you would expect to see if an alternative explanation were true, and whether you looked for it.
  • Name the gaps. If a thread goes cold, a source isn't searchable, or a question has no answer, document the gap. Don't paper it over with an authoritative-sounding guess.
  • Hedge on purpose. When evidence is indirect, your language should signal it ("appears to", "likely", "suggests"). Confidence-matching phrasing is a feature of the output, not a stylistic choice the synthesizer may override.
  • No shortcut by code-reading. The code tells you what it does, rarely why it exists. Resist inferring intent from code shape.
This posture is the working method, not a disclaimer.

Core Epistemics

This skill builds a patchwork understanding from fragmented historical evidence. Tickets go stale. Chat threads get deleted. Commit messages lie. People change their minds between the PR description and the implementation. The original author may have left the company.
Be ruthlessly honest about what you know versus what you're inferring. The goal is not a satisfying story; it is to surface evidence, calibrate confidence, and let the user decide.
Principles:
  • Cite everything. Every claim about intent should reference a specific commit hash, PR number, ticket ID, doc URL, chat permalink, or code comment. If you can't cite it, it's inference, not fact, and must be labeled as such.
  • Prefer "appears to" over "because". Hedge when evidence is indirect. Reserve confident language for direct, explicit evidence.
  • Surface contradictions. If two sources disagree, show both. Don't quietly pick the one that fits your narrative.
  • Acknowledge gaps. If a question has no answer in any source you searched, say so. An honest "we couldn't find out why" beats a confident guess.
  • Multiple hypotheses are valid. When the evidence fits several stories, present them all with the evidence for each. Let the user triangulate.
  • Beware rationalization. Code that makes sense today may have been written for reasons that no longer apply, or for no good reason at all. Don't retrofit intent.
Read `references/epistemics.md` for the full confidence framework and phrasing guide. The synthesizer must follow it.
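As a rough illustration of the "cite everything" principle, the sketch below routes a draft bullet to an output section depending on whether it carries something shaped like a citation. The function names and the regular expression are assumptions made for illustration; they are not part of the skill or of `references/epistemics.md`. The pattern only recognizes PR numbers, ticket-style IDs, URLs, and hex commit hashes, and it will produce false positives (for example `UTF-8`). A match marks the bullet as a candidate for direct evidence; it does not verify the citation.

```bash
# Hypothetical heuristic: succeed if the text contains something shaped like a
# citation (PR number, ticket-style ID, URL, or a 7-40 character hex hash).
has_citation() {
  printf '%s\n' "$1" |
    grep -Eq 'PR #[0-9]+|[A-Z][A-Z0-9]+-[0-9]+|https?://[^ ]+|[0-9a-f]{7,40}'
}

# Route a draft bullet: cited bullets are candidates for direct evidence,
# uncited bullets must be labeled as inference.
section_for() {
  if has_citation "$1"; then
    echo "What We Found (direct evidence)"
  else
    echo "What We Can Reasonably Infer"
  fi
}
```

For example, `section_for "PR #49074 introduced exponential backoff"` selects the direct-evidence section, while a bullet with no identifiable source falls through to the inference section.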

Step 1. Understand the Target and the Question

Parse what the user is asking. The target is usually a chunk of code, a pattern, a feature, or a named design decision. The question is usually one of:
  • "Why was X designed this way?" Design rationale.
  • "Why do we do X instead of Y?" Tradeoff or alternatives.
  • "What edge cases motivated this?" Defensive reasoning.
  • "What business or product constraint led to this?" External forcing function.
  • "Why does this code still exist?" Dead-code territory.
  • "What's the history of X?" Broad archaeological sweep.
If the target is vague ("why do we do it this way?" with no clear referent), make your best guess from conversation context (open files, recent edits, cursor location, what was just discussed). State your interpretation briefly so the user can redirect if you're off, then proceed.

Step 2. Establish the Code Anchor

Before spawning investigators, anchor the investigation in concrete code. You need:
  • The relevant file path(s) and line range(s)
  • The key symbols (function names, class names, constants)
  • An initial commit list. The last few commits touching the target.
  • PR numbers from merge commits (pattern `(#1234)` in the subject line)
Build this inline. It's cheap, and every investigator needs it.
```bash
# Blame target lines for last-touch commits
git blame -L <start>,<end> <file>

# Full file history, with patches, through renames
git log --follow -p -- <file>

# Last N commits touching the file, PR numbers visible
git log --oneline -20 -- <file>

# Extract PR numbers from a commit message
git log -1 --format=%B <commit>
```

Pull PR bodies and discussion via `gh` for any substantive commits:

```bash
gh pr view <number> --json title,body,author,createdAt,mergedAt,labels,closingIssuesReferences,comments,reviews
```

Capture this as seed context (file paths, symbols, commits, PR numbers, linked ticket IDs). Pass it to the investigators so they don't rediscover it.
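
One way to collect the PR numbers for the seed context is to filter commit subject lines for the `(#1234)` pattern. This is a minimal sketch: the helper name `extract_pr_numbers` is hypothetical, and it assumes the repository follows the merge-commit convention of appending the PR number to the subject line.

```bash
# Hypothetical helper: read commit subject lines on stdin and print the
# PR numbers found in the "(#1234)" pattern, one per line, deduplicated
# and sorted numerically.
extract_pr_numbers() {
  grep -oE '\(#[0-9]+\)' | tr -d '(#)' | sort -un
}

# Usage, from inside the repository, with <file> as the target path:
#   git log --oneline -20 -- <file> | extract_pr_numbers
```

Each number it prints can then be passed to `gh pr view <number>` as shown above. Commits without a PR suffix produce no output, which is itself worth noting in the seed context.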

Step 3. Spawn Parallel Investigators (default posture)

Default to the full parallel investigation. Each evidence category lives in a different kind of system, and you cannot tell from the question alone which one holds the answer without looking. So look across every available category, in parallel, by default.

Discovery

Before spawning investigators, list the available MCPs from the Cursor environment. Use the available-tools map when present. Otherwise inspect the `mcps/` directory Cursor exposes for enabled MCP servers.
Map each available MCP to one evidence category:
  1. Source control history
  2. Issue / ticket tracker
  3. Long-form documents
  4. Real-time team chat
  5. Infrastructure observability
  6. Error / exception tracking
  7. Product analytics warehouse
Source control is always available through git and `gh`. For the other six, classify using the MCP name, server instructions, tool names, and resource descriptors. If an MCP could fit more than one category, choose the one matching its primary evidence. Record ambiguous cases in the coverage map.
Aim for a complete coverage map, not a minimal one. A null result from an issue tracker is evidence the decision was not ticketed, a useful fact in itself. Document the null, don't skip the search.
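The mapping step can be sketched as a name-based lookup. This is only a first pass under stated assumptions: the function name and the category slugs are hypothetical, the vendor names are the examples this document lists for each investigator, and matching on the server name alone ignores the server instructions, tool names, and resource descriptors that should also inform the classification.

```bash
# Sketch: classify an MCP server by name alone. The category slugs and the
# function name are assumptions; anything unmatched is reported as
# "unclassified" so it can be recorded as ambiguous in the coverage map.
classify_mcp() {
  mcp_name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$mcp_name" in
    *linear*|*jira*|*shortcut*)
      echo "issue-tracker" ;;
    *notion*|*confluence*|*coda*)
      echo "long-form-docs" ;;
    *slack*|*discord*|*teams*|*mattermost*)
      echo "team-chat" ;;
    *datadog*|*new*relic*|*honeycomb*|*grafana*|*splunk*)
      echo "infra-observability" ;;
    *sentry*|*rollbar*|*bugsnag*|*airbrake*)
      echo "error-tracking" ;;
    *databricks*|*snowflake*|*bigquery*|*clickhouse*|*redshift*)
      echo "product-analytics" ;;
    *)
      echo "unclassified" ;;
  esac
}
```

An `unclassified` result is a prompt to inspect the server's instructions and tool list by hand, not a reason to drop the MCP from the coverage map.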
Launch all matching investigators in a single message so they run concurrently. One investigator per category lets each specialize in one tool's query vocabulary and result shape. Don't ask one agent to cover multiple MCPs.
Subagent config (each):
  • `subagent_type`: `generalPurpose`
  • `model`: `composer-2.5-fast`
  • `readonly`: `false` (agent mode). Do not use readonly/Ask mode. It strips MCP access, which disables MCP-backed investigators entirely. The source control investigator would be safe in readonly, but keep modes uniform. Investigators still shouldn't write anything. That's a posture, not a sandbox.
Each investigator gets:
  1. The base prompt from `references/investigator-prompt.md`
  2. The category playbook `references/sources/<source>.md` for the selected MCP, adapted from the examples in `references/source-playbook.md`
  3. The cross-cutting `references/sources/incident-postmortem.md` if the target code looks defensive (null checks, retry logic, timeout handling, rate limiting, feature flags, egress guards, OOM handlers)
  4. The code anchor from Step 2 (file paths, symbols, commit hashes, PR numbers, ticket IDs)
  5. The user's original question

Investigator roster. One per available evidence category

Spawn one investigator per category that has a matching MCP. Each owns exactly one tool or MCP.
Each entry lists what the category physically contains and the kind of "why" it uniquely surfaces. Use it to know what to expect back, how to name a gap when a category returns empty, and (only in the rare provably-irrelevant case) to justify a skip. Every category overlaps, but each owns a kind of evidence the others cannot recover.
  1. Source control investigator. Git history, `gh` for PRs, code comments, tests. Always spawn; the only guaranteed source. Best at surfacing implementation-time rationale captured during review. PR descriptions stating the problem, review threads debating alternatives, inline comments encoding non-obvious constraints, test names that encode motivating edge cases, and commit messages linking tickets or incidents. Most trustworthy because it ties directly to the diff that shipped.
  2. Issue / ticket tracker investigator (e.g. Linear, Jira, GitHub Issues, Plane, Shortcut MCP). Tickets, project docs, status updates, spec attachments. Best at surfacing the product or business forcing function. Customer requests ("Acme needs X for their SOC2 audit"), compliance deadlines, parent-initiative framing ("Q3 enterprise readiness"), ticket-level scope changes, and labels that categorize the motivation (`customer:*`, `incident-followup`, `compliance`, `perf-regression`). Strongest when the why is external to engineering.
  3. Long-form documents investigator (e.g. Notion, Confluence, Google Docs, Coda MCP). PRDs, specs, RFCs, design docs, ADRs, postmortems, team pages, meeting notes. Best at surfacing long-form design rationale. Problem statements, explicit "alternatives considered" and "rejected approaches" sections, strategy documents that set priorities, ADRs with finalized decisions, and postmortem action items that tie directly to code. Where the why is written out before it becomes code.
  4. Real-time team chat investigator (e.g. Slack, Discord, Microsoft Teams, Mattermost MCP). Feature-name and symbol searches, PR URL mentions, incident channels (`#sev-*`, `#incident-*`), author-handle activity around the ship date. Best at surfacing real-time deliberation that never reached a doc. Fire-drill decisions during incidents, Q&A between the PR author and reviewers, casual "we decided X because Y" threads, and rationale for small changes that didn't warrant a PRD. Especially important when the source control, ticket, and doc paper trail is thin.
  5. Infrastructure observability investigator (e.g. Datadog, New Relic, Honeycomb, Grafana, Splunk MCP). Metrics, monitors, dashboards, logs, APM traces, formal incidents. Infra/runtime view. Best at surfacing infrastructure and runtime reality that motivated the code. Monitor thresholds whose numbers match code constants, metric spikes in the window right before a PR merge, dashboards created as postmortem action items, incident timelines that reference the target. Strongest when the target reacts to an infra signal (timeouts, retries, rate limits, circuit breakers).
  6. Error / exception tracking investigator (e.g. Sentry, Rollbar, Bugsnag, Airbrake MCP). Issues, events, stack traces, releases. Best at surfacing the specific exceptions and error trajectories that motivated defensive or corrective code. Stack traces that pass through the target function, issues whose first-seen/last-seen windows bracket the PR ship date, release correlations that show an error stopping at a specific version. Strongest for catch blocks, null guards, type checks, retries, and other defenses.
  7. Product analytics warehouse investigator (e.g. Databricks, Snowflake, BigQuery, ClickHouse, dbt, Redshift MCP). Product-analytics events, experiment and feature-flag exposure tables, usage and billing events, query history, warehouse telemetry. Product/data view. Complements infrastructure observability by covering user behavior and data reality around the ship date rather than infra metrics. Best at surfacing product and data reality that shaped the code. Feature-usage trajectories (a step-function ramp from zero is strong evidence that this PR launched it), experiment/flag exposure data tied to ship decisions, pre-ship distributions that reveal where a threshold constant came from (e.g., `limit = 128 * 1024` matching the p99 of an upload-size column), and data-pipeline scale evidence for migrations/backfills. Strongest for flag-gated code, experiment-driven ships, data migrations, and "where did this number come from" questions.

When to skip an investigator

Only skip with an explicit, written justification that goes in the final "Sources Consulted" section. Two valid reasons:
  • No MCP is available for that category in this environment. Flag this as a gap, not a choice. Example: "Real-time team chat skipped. No matching MCP available, so the conversational record was not searchable."
  • The source is provably irrelevant, not just "probably irrelevant." A high bar. Example: "Error / exception tracking skipped. Target is a build-time script with no runtime code path." Not "probably not in error tracking, it's a feature not an error."
"It's pure feature code, error tracking won't have anything" is not sufficient, and neither is "I doubt long-form docs would have this." Run the search; let the null result speak. The cost of an investigator returning empty is one subagent. The cost of missing a design doc that actually exists is a wrong answer.
If your scope assessment suggests a single-commit trivial target where the PR description already contains the complete answer, you may answer inline only after confirming all seven available category searches would be redundant. Say so explicitly. This should be rare.
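The first skip reason (no MCP available) can be computed mechanically from the discovery results. The sketch below is illustrative only: the category slugs, the `ALL_CATEGORIES` variable, and the `coverage_gaps` function are hypothetical names, and the emitted wording simply follows the example above. The second reason (provably irrelevant) still needs a written, case-specific justification.

```bash
# Assumed slugs for the seven evidence categories.
ALL_CATEGORIES="source-control issue-tracker long-form-docs team-chat infra-observability error-tracking product-analytics"

# Print one gap line for every category that has no available MCP.
# Arguments: the category slugs that discovery did find.
coverage_gaps() {
  for category in $ALL_CATEGORIES; do
    found=no
    for available in "$@"; do
      if [ "$available" = "$category" ]; then
        found=yes
      fi
    done
    if [ "$found" = no ]; then
      echo "$category: skipped. No matching MCP available in this environment."
    fi
  done
}
```

Calling `coverage_gaps` with every category except `team-chat` prints a single gap line for team chat, ready to be carried into the "Sources Consulted" section.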

Step 4. Synthesize

Spawn one synthesizer subagent:
  • `subagent_type`: `generalPurpose`
  • `model`: `claude-opus-4-8-thinking-xhigh`
  • `readonly`: `false` (agent mode). The synthesizer's quality check spot-verifies citations, which can require MCP access. Readonly/Ask mode strips MCPs and defeats that.
The synthesizer gets:
  1. The investigator findings, including any null results and any categories skipped with justification
  2. The code anchor from Step 2 (file paths, symbols, commit hashes, PR numbers, ticket IDs)
  3. The user's original question
  4. The epistemics framework from `references/epistemics.md`
  5. The synthesizer prompt template from `references/synthesizer-prompt.md`
Its job is the final output: a confidence-weighted, evidence-cited narrative with clearly separated "what we know" and "what we're inferring" sections, plus honest acknowledgment of gaps and null-result sources.

Step 5. Present

Take the synthesizer's output and present it to the user. You may lightly edit for clarity or add context from the conversation, but do not rewrite the confidence language. The epistemic framing is the product. Dropping the hedges to sound more authoritative is the exact failure mode this skill exists to prevent.

Output Format

The final output uses this structure. Adapt as needed, but keep the confidence separation intact.
The Question. Restate what the user asked, concisely.
The Code in Question. File paths, line ranges, and key symbols. One or two lines so the reader is anchored.
What We Found (direct evidence). Claims with explicit citations (PR #, ticket ID, doc URL, chat permalink, commit hash, code comment with file:line). Each bullet is a thing we have textual evidence for. Use present tense and quote or paraphrase the source.
What We Can Reasonably Infer. Claims well-supported by indirect evidence or combinations of signals, but not explicitly stated anywhere. Each bullet must explain the inference chain: "Given A and B, it's likely that C." Use hedged language ("appears to", "likely", "suggests").
Competing Hypotheses. If the evidence fits multiple stories, list them. For each, give the hypothesis, the evidence for it, and the evidence against it. Don't force a winner when the record doesn't support one. (Skip this section if there's a clear answer.)
What We Don't Know. Explicit gaps. Questions the user asked that the evidence didn't answer. Sources we searched and came up empty. Be specific. "We searched the issue tracker for 'rate limit' and found no ticket discussing this specific threshold" is more useful than "we don't know why."
Sources Consulted. One line per investigator, including the ones that returned nothing. The reader should see at a glance (a) which MCPs were queried, (b) which came back empty, and (c) which were skipped and why. This coverage map lets the user judge breadth and redirect if something obvious was missed.
Format each line as:
- <Source>: <what was searched>. <what was found, or "no relevant results," or "skipped. reason">.
Example:
  • Source control (git/gh): `git log --follow backend/retry.ts`, PRs #49074, #47812. Found PR #49074 introduced exponential backoff and linked ENG-4421.
  • Issue tracker (Linear): searched for "retry" and ENG-4421. Found ENG-4421 parent issue but no discussion of backoff parameters.
  • Long-form docs (Notion): searched for "retry policy," "backend retries," "ENG-4421." No relevant results.
  • Real-time team chat (Slack): skipped. No matching MCP available in this environment. Gap: conversational record not searched.
  • Infrastructure observability (Datadog): searched for `retry_count` metric and monitors around 2024-08-14. Found monitor "Upstream 5xx rate > 1%" created same day as PR #49074.
  • Error / exception tracking (Sentry): searched for issues first-seen in Aug 2024 with stack through `retry.ts`. Found issue SENTRY-3821 spiking in the week before the PR.
  • Product analytics warehouse (Databricks): queried `<your_analytics_db>.<schema>.stg_backend_upstream_retry` for the 30-day window around 2024-08-14. Daily failure-classified event count fell from ~1.2k/day pre-PR to <50/day post-PR. Also checked `system.query.history` for relevant migration queries. None found.
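To keep the coverage lines uniform, the line format above can be produced by a small formatter. This is a sketch under assumptions: `format_source_line` is a hypothetical helper, and defaulting an omitted result to a null-result phrase is a design choice made here so that an empty search is reported rather than silently dropped.

```bash
# Hypothetical helper: print one "Sources Consulted" line.
# Arguments: source label, what was searched, and an optional result.
# An omitted result is reported as a null result rather than left blank.
format_source_line() {
  source_label=$1
  searched=$2
  result=${3:-No relevant results}
  printf '%s\n' "- ${source_label}: ${searched}. ${result}."
}
```

For instance, calling it with the label `Long-form docs (Notion)` and a description of the search terms, but no third argument, yields a line ending in "No relevant results."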
After the Sources Consulted block, if the user's `why` question is a precursor to actually changing this code, convert the lineage findings into a Preserve / Change / Avoid / Risk constraint set suitable for planning the change.

Common Failure Modes to Avoid

  • Confident storytelling. A plausible narrative built from thin evidence. A bullet with no citation goes in "inferred" or "hypotheses," not "what we found."
  • Citing the code as evidence for its own intent. "Handles the null case because it checks for null" is mechanics, not motivation. Motivation comes from an external source (PR discussion, ticket, comment, conversation) or is labeled as inference.
  • Recency bias. Assuming the most recent commit is authoritative. The current shape is often the accretion of many earlier decisions. Trace back.
  • Sycophantic agreement. If the user suggests a reason ("I assume this is for performance?"), treat it as a hypothesis and check the evidence independently, don't just confirm it.
  • Skipping the gaps section. An honest accounting of what you couldn't find out is part of the value.
  • Skipping investigators by anticipation. Deciding up front that "long-form docs probably don't have this" or "this isn't an error tracking thing" without searching. The default-to-all-seven posture prevents this. A null result is a data point; a skipped search is a blind spot.
  • Collapsing investigators into one agent. Each MCP has its own query vocabulary, result shape, and pitfalls; pooling them dilutes specialization and makes coverage harder to reason about. Always one investigator per category.

Reference Files

  • `references/epistemics.md`: Confidence tiers and phrasing guide. The synthesizer must follow it.
  • `references/investigator-prompt.md`: Base prompt template for investigator subagents.
  • `references/source-playbook.md`: Index pointing at the category playbooks below.
  • `references/sources/*.md`: One self-contained example playbook per category, plus the cross-cutting `incident-postmortem.md`. Give an investigator the single file that matches its category and adapt it to the available MCP.
  • `references/synthesizer-prompt.md`: Prompt template for the synthesizer subagent, including the output format.