dossier

Dossier — Decision-Grade Entity Research

Portability: Requires `WebSearch` + `WebFetch`, Node.js with the `docx` package, and optionally `bash_tool` + `curl` for free APIs (SEC EDGAR, GitHub, ProPublica). BYOK MCPs (LinkedIn, Crunchbase, Apollo, Pitchbook, SimilarWeb) are optional enhancements. Works in Claude Code CLI natively.

Non-Generic Framing — The Differentiator

This skill is decision-grade entity research with hypothesis-testing. It refuses to be "tell me about Microsoft". Every invocation forces the user to expose their hypothesis upfront (Q4) so the dossier tests it rather than confirms it.

The use case shape:

"I'm pitching Microsoft Tuesday. My hypothesis is they're consolidating AI spend on their first-party Foundry platform. Validate or disprove, and give me three conversation hooks tied to what you find."

NOT:

"Tell me about Microsoft."

The forcing Q4 — the hypothesis question — is the non-generic anchor. Skip it and the skill produces a Wikipedia summary.

See `references/hypothesis_testing_discipline.md` for the canon.

Agent Integrity Rules (Research-Pack Convention)

Locked verbatim per PR #657 audit.

Execution discipline. Sequential search calls. WebSearch + WebFetch have looser rate limits than Consensus but still apply 1 q/sec etiquette. Confirm response received before next call.
Source discipline. Cite only sources returned by this session's tool calls. Wikipedia / training knowledge labeled `[Background — verify before quoting]` and excluded from primary findings count.
Three-count tracking. Queries sent / sources received / sources cited. Plus per-tier breakdown (primary / secondary / tertiary) unique to dossier. Surfaced in audit log.
Retry policy. On failure → wait 3s → retry once → log. After 3 consecutive failures: stop, alert user.
Source reliability tier. Each citation tagged primary (official, SEC, court records) / secondary (mainstream news, trade press) / tertiary (blogs, forums). DOCX surfaces tier on every flag.
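
The execution and retry rules above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: `SequentialCaller`, `ToolHalt`, and the `call_tool` callable (a stand-in for WebSearch or WebFetch) are hypothetical names, and it assumes a query counts as one failure only after its single retry has also failed.

```python
import time
from typing import Callable, List, Optional


class ToolHalt(Exception):
    """Raised after 3 consecutive failed queries: stop and alert the user."""


class SequentialCaller:
    """Runs tool calls one at a time, with pacing and a single retry."""

    def __init__(self, sleep: Callable[[float], None] = time.sleep):
        self.sleep = sleep  # injectable so tests do not actually wait
        self.consecutive_failures = 0
        self.log: List[str] = []

    def call(self, call_tool: Callable[[str], str], query: str) -> Optional[str]:
        for attempt in (1, 2):  # first try, then exactly one retry
            try:
                result = call_tool(query)  # returns only once the response is in
                self.consecutive_failures = 0
                self.sleep(1.0)  # 1 query/sec etiquette before the next call
                return result
            except Exception as exc:
                self.log.append(f"attempt {attempt} failed for {query!r}: {exc}")
                if attempt == 1:
                    self.sleep(3.0)  # wait 3s before the retry
        self.consecutive_failures += 1
        if self.consecutive_failures >= 3:
            raise ToolHalt("3 consecutive failures: stop and alert the user")
        return None
```

Because each call blocks until the tool responds, the "confirm response received before next call" rule falls out of the sequential structure rather than needing separate bookkeeping.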

Phase 1: Grill-Me Intake (6 forcing questions, one at a time)

Q1 (root) — Subject identity

Who is the subject? Give me the exact name and, if a company, the website or LinkedIn URL. If a person, their LinkedIn URL or a unique identifier (company affiliation + role).

Why I'm asking: Disambiguation. There are 47 John Smiths. There are three companies called "Atlas". I need a specific entity to research.

If user gives only a name, push for a second identifier. Refuse to proceed on ambiguous names.

Q2 (depends on Q1) — Subject type

What kind of subject is this? Pick one: person / company / nonprofit / government org / other.

Why I'm asking: Different source matrices apply. For people I check LinkedIn, GitHub, Scholar, news; for companies I check SEC EDGAR (if public), Crunchbase, news, GitHub for tech orgs; for nonprofits I check Form 990s on ProPublica.

Forcing choice. "Other" requires a one-line description.

Q3 (depends on Q2) — Purpose

What are you preparing for? Pick one:

Sales meeting / partnership pitch

Investment diligence

Acquisition diligence

Journalism / due diligence

Job interview prep

Competitive intelligence

Personal vetting (date, hire, business partner)

Other (specify)

Why I'm asking: The purpose dictates the angle, the depth, and the red-flag sensitivity. Sales prep needs conversation hooks. Investment diligence needs traction signals. Personal vetting needs careful sensitivity boundaries.

Q4 (depends on Q3) — Hypothesis — MANDATORY

What's your hypothesis going in? What do you already believe about this subject, and what do you want to verify or disprove?

Why I'm asking: This is the critical question. A dossier that just confirms what you already think is worthless. By stating your hypothesis upfront, I can search for evidence that would disprove it as well as evidence that supports it — and give you a verdict you can actually use.

Examples:

"I believe Microsoft is consolidating AI spend on first-party Foundry. Verify or disprove."

"I think the CEO is in over their head: too much TAM talk, no traction. Test that."

"I believe this nonprofit's overhead ratio is sketchy. Check the 990s."

"I think this person is technical enough to handle a CTO role. Verify."

MANDATORY. If user says "I don't have one", push back once: "Then guess. Commit to a position you can update later. The dossier needs a hypothesis to test, otherwise it's a generic profile and won't help you make a decision."

If still refused: fall back to implicit hypothesis "what's the most surprising thing I could find?" and flag the fallback in audit log.

This question is the non-generic anchor. Skip it and the skill becomes a Wikipedia summary.

Q5 (depends on Q3) — Depth

Time horizon: 5-minute brief or 15-minute decision-grade dossier?

Why I'm asking: Brief mode caps at ~10 searches and skips the network + reputation passes. Decision-grade goes deeper on every section. Pick based on how much skin you have in this decision.

Forcing choice.

Q6 (asked only if Q3 ∈ {journalism, personal vetting}) — Sensitivities

Anything sensitive to exclude? E.g., personal medical, family details, political history, or specific topics off-limits?

Why I'm asking: Some research contexts have ethical constraints. I'd rather know upfront than surface something you'd never share.

Skip for sales/investment/acquisition/competitive intel (low sensitivity); ask for journalism/personal vetting (high sensitivity).

Stop condition: After Q6 (or earlier with dependency skips), commit and start Phase 2. Never re-open intake after Phase 2 begins.
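
The intake dependencies can be sketched as below. This is an illustrative sketch only: the function names and the lowercase purpose strings are assumptions, and a refusal at Q4 is modelled as an empty or missing answer.

```python
from typing import List, Optional, Tuple

# Purposes that trigger Q6 (assumed string labels for the Q3 choices).
HIGH_SENSITIVITY_PURPOSES = {"journalism", "personal vetting"}

# Implicit hypothesis used when the user refuses twice at Q4.
FALLBACK_HYPOTHESIS = "what's the most surprising thing I could find?"


def intake_questions(purpose: str) -> List[str]:
    """Return the question IDs to ask, one at a time, for a given Q3 purpose."""
    questions = ["Q1", "Q2", "Q3", "Q4", "Q5"]
    if purpose.strip().lower() in HIGH_SENSITIVITY_PURPOSES:
        questions.append("Q6")  # sensitivities are asked only for these purposes
    return questions


def resolve_hypothesis(
    first_answer: Optional[str], second_answer: Optional[str] = None
) -> Tuple[str, bool]:
    """Apply the Q4 rule: push back once, then fall back and flag it.

    Returns (hypothesis, used_fallback); the flag belongs in the audit log.
    """
    for answer in (first_answer, second_answer):
        if answer and answer.strip():
            return answer.strip(), False
    return FALLBACK_HYPOTHESIS, True
```

Returning the fallback flag alongside the hypothesis keeps the audit-log requirement visible to whatever code consumes the intake result.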

Phase 2: Subject Disambiguation

Before Phase 3, resolve the subject to a specific entity:

For people: confirm LinkedIn URL OR (employer + role + city)
For companies: confirm domain OR (legal name + incorporation jurisdiction)
For nonprofits: confirm EIN OR (legal name + state)
For government orgs: confirm official .gov URL

If still ambiguous after Q1 push-back: halt and re-ask Q1 with disambiguating identifiers. Refuse to proceed.
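
One way to express the resolution rules above is a lookup of acceptable identifier sets per subject type. The field names (`linkedin_url`, `ein`, and so on) are hypothetical, and the "other" subject type is left unresolved here because the rules above do not define identifiers for it.

```python
from typing import Dict

# Each subject type resolves via one strong identifier OR a complete fallback set.
REQUIRED_IDENTIFIERS = {
    "person": [{"linkedin_url"}, {"employer", "role", "city"}],
    "company": [{"domain"}, {"legal_name", "incorporation_jurisdiction"}],
    "nonprofit": [{"ein"}, {"legal_name", "state"}],
    "government org": [{"gov_url"}],
}


def is_disambiguated(subject_type: str, identifiers: Dict[str, str]) -> bool:
    """True when the non-empty identifiers satisfy at least one option."""
    provided = {key for key, value in identifiers.items() if value}
    options = REQUIRED_IDENTIFIERS.get(subject_type, [])
    return any(option <= provided for option in options)
```

A caller would halt and re-ask Q1 whenever this returns `False`, rather than guessing at the entity.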

Phase 3: Source Matrix Selection

Routed by Q2 subject type. See `references/subject_type_source_matrix.md` for the full canon.

Person

LinkedIn (manual fetch or LinkedIn MCP if BYOK)
Personal website
Twitter/X (rate-limited; degrade gracefully)
GitHub (if technical subject)
Google Scholar (if academic)
News (WebSearch + WebFetch)
Conference talk transcripts, podcasts (WebSearch)

Company

Official website (about, leadership, news, careers)
SEC EDGAR (free API; 10-Ks, 10-Qs, 8-Ks for public co's)
Crunchbase free tier (or Crunchbase MCP if BYOK)
News (WebSearch + WebFetch)
GitHub (for tech orgs)
Glassdoor + Comparably (sentiment; degrade gracefully if scraping blocked)
LinkedIn company page

Nonprofit

ProPublica Nonprofit Explorer (free; Form 990s)
Official website
News
GuideStar (if accessible)

Government org

Official .gov sites
News
ProPublica (for federal agencies)

If a paid MCP is connected (Apollo, Pitchbook, SimilarWeb), use it but mark findings as BYOK-sourced in the audit log.
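
The routing by subject type could be represented as a small table of sources with optional conditions. This is a sketch under assumptions: the trait labels (`technical`, `academic`, `public`, `federal`) and the function name are invented for illustration, and the canonical matrix lives in the reference file named above.

```python
from typing import List, Set

# (source name, trait required to include it, or None if always included)
SOURCE_MATRIX = {
    "person": [
        ("LinkedIn", None),
        ("Personal website", None),
        ("Twitter/X", None),
        ("GitHub", "technical"),
        ("Google Scholar", "academic"),
        ("News", None),
        ("Conference talks and podcasts", None),
    ],
    "company": [
        ("Official website", None),
        ("SEC EDGAR", "public"),
        ("Crunchbase", None),
        ("News", None),
        ("GitHub", "technical"),
        ("Glassdoor + Comparably", None),
        ("LinkedIn company page", None),
    ],
    "nonprofit": [
        ("ProPublica Nonprofit Explorer", None),
        ("Official website", None),
        ("News", None),
        ("GuideStar", None),
    ],
    "government org": [
        ("Official .gov sites", None),
        ("News", None),
        ("ProPublica", "federal"),
    ],
}


def select_sources(subject_type: str, traits: Set[str]) -> List[str]:
    """Return the sources to query for a subject type, honoring conditions."""
    return [
        name
        for name, condition in SOURCE_MATRIX[subject_type]
        if condition is None or condition in traits
    ]
```

For example, `select_sources("company", {"public"})` includes SEC EDGAR but leaves out GitHub, since the `technical` trait is absent.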

Phase 4: Hypothesis-Driven Search

Every Phase 4 search MUST be classified as either:

Supporting evidence (confirms hypothesis), OR
Disconfirming evidence (would refute hypothesis)

≥30% of search budget allocated to disconfirming queries. Enforced via `scripts/disconfirming_evidence_balance.py`.

Example for hypothesis "Microsoft is consolidating AI spend on Foundry":

Supporting: "Microsoft Foundry adoption 2026", "Microsoft AI infrastructure consolidation"
Disconfirming: "Microsoft OpenAI deal renegotiation", "Microsoft AI vendor diversification", "Microsoft third-party model partnerships 2026"

This is what makes the dossier decision-grade rather than confirmation-biased.

For each search:

Record via `citation_tracker.py` with classification (supporting / disconfirming)
Apply source tier from `source_tier_classifier.py` to each result URL
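
The ≥30% rule amounts to a simple share check over the classified queries. The sketch below is not the contents of `scripts/disconfirming_evidence_balance.py`; the function name and return shape are assumptions, and the comparison uses integer arithmetic so that exactly 30% passes without floating-point surprises.

```python
from typing import List, Tuple

VALID_CLASSES = {"supporting", "disconfirming"}


def check_balance(classifications: List[str]) -> Tuple[float, bool]:
    """Return (disconfirming share, whether the >=30% rule is met)."""
    unknown = set(classifications) - VALID_CLASSES
    if unknown:
        raise ValueError(f"unclassified searches: {sorted(unknown)}")
    total = len(classifications)
    if total == 0:
        return 0.0, False  # no searches yet, so the rule cannot be satisfied
    disconfirming = classifications.count("disconfirming")
    meets_rule = disconfirming * 10 >= total * 3  # disconfirming / total >= 0.30
    return disconfirming / total, meets_rule


# The Foundry example above: 2 supporting and 3 disconfirming queries.
share, ok = check_balance(["supporting"] * 2 + ["disconfirming"] * 3)
```

Rejecting unclassified searches outright mirrors the requirement that every Phase 4 search carries one of the two labels.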

Phase 5: 12-Month Activity Timeline

The activity timeline defaults to a 12-month window; foundational identity facts may reach further back.

Categories:

News (acquisitions, hires, departures, product launches)
Funding rounds / financial events
Controversies / legal events
Public statements / strategy shifts

Reverse chronological. Each entry hyperlinked + tiered.
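
A timeline builder along these lines would filter to the window, reject entries that lack a link or tier, and sort newest first. The entry keys and function name are hypothetical, and the 12-month window is approximated as 365 days.

```python
from datetime import date, timedelta
from typing import Dict, List


def build_timeline(entries: List[Dict], today: date, window_days: int = 365) -> List[Dict]:
    """Keep entries inside the window and return them newest first."""
    cutoff = today - timedelta(days=window_days)
    recent = [e for e in entries if cutoff <= e["date"] <= today]
    for entry in recent:
        # Every timeline entry must be hyperlinked and tiered.
        if not entry.get("url") or not entry.get("tier"):
            raise ValueError(f"entry missing url or tier: {entry.get('title')!r}")
    return sorted(recent, key=lambda e: e["date"], reverse=True)
```

Older foundational facts are dropped from this view only; they would still belong in the identity facts table.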

Phase 6: Network + Reputation Signals

Network

Companies: investors (in/out), customers (named), partners
People: co-founders, advisors, mentors, employers, board roles
Nonprofits: funders, board, leadership

5-10 entries, ranked by relevance to hypothesis.

Reputation

Sentiment from news (recent 12 months)
Glassdoor for companies (overall rating + 3 representative reviews)
Peer mentions for people
Caveat: reputation data is noisy; tier accordingly

Phase 7: Red-Flag Pass

Surface but don't sensationalize:

Litigation (court records → primary tier)
Regulatory actions (SEC, DOJ, agency actions → primary)
Unusual departures (key personnel exits within 90 days)
Financial signals (going-concern notes in 10-Ks → primary)
Reputation hits (sustained negative coverage → secondary)

Each flag tiered. Tier shows up next to every flag in the DOCX.

Phase 8: Conversation Hook Generation

3-5 specific hooks tied to actual findings, not generic talking points.

See `references/conversation_hook_quality.md` for the canon.

❌ Generic	✅ Finding-tied
"Ask about their roadmap"	"Mention their recent acquisition of [X] — it signals they're investing in vertical Y. Suggested framing: 'Saw the [X] announcement — how does that change your roadmap on Y?'"
"Ask about hiring"	"Their VP Engineering left 3 weeks ago (LinkedIn). Suggested framing: 'I noticed [name] moved on — what's the eng leadership plan?'"
"Talk about their values"	"They updated their pricing page last week (their official site). Suggested framing: 'Saw the pricing refresh — what drove that?'"

Each hook:

The hook (one sentence)
The finding it's tied to (with hyperlink + tier)
Suggested framing (verbatim phrasing user can adapt)

Phase 9: DOCX Generation (9 Sections)

Via Node.js + `docx` library.

Executive Summary — one paragraph: who they are + why they matter + verdict on the hypothesis (SUPPORTED / PARTIALLY SUPPORTED / DISPROVEN / INCONCLUSIVE) + 3 things-you-should-know bullets.
Identity Facts Table — founded/born, location, size/stage, current role, key affiliations. All cells sourced; hover-text tier.
Hypothesis Test — user's hypothesis stated verbatim. Supporting evidence (3-5 bullets with hyperlinked citations). Disconfirming evidence (3-5 bullets with hyperlinked citations). Verdict paragraph (2-3 sentences explaining the weight).
12-Month Activity Timeline — News, funding, hires, departures, product launches, controversies. Reverse chronological. Each entry hyperlinked.
Network Signals — Collaborators / investors / associates. 5-10 entries, ranked by relevance to hypothesis.
Reputation Signals — Sentiment from news, Glassdoor for companies, peer mentions for people. Caveat: reputation data is noisy.
Red Flags + Hidden Patterns — Litigation, regulatory actions, unusual departures, financial signals, reputation hits. Tiered.
Conversation Hooks — 3-5 specific hooks tied to findings. Each: hook + finding + suggested framing.
Source Provenance + Audit Log — Per-source list with tier. Search summary table (#, query, classification, sources returned, sources cited). Three counts + per-tier counts. Failed searches. BYOK-MCP usage flag.
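
The three counts and the per-tier breakdown in the audit log could be derived from a per-search record like the one below. The record shape and key names are assumptions, as is the choice to compute the tier breakdown over cited sources rather than all received sources.

```python
from collections import Counter
from typing import Dict, List


def audit_summary(searches: List[Dict]) -> Dict:
    """Summarize queries sent, sources received, sources cited, and tiers.

    Each search is assumed to look like:
    {"query": str, "classification": str,
     "sources": [{"url": str, "tier": str, "cited": bool}, ...]}
    """
    received = [source for search in searches for source in search["sources"]]
    cited = [source for source in received if source["cited"]]
    return {
        "queries_sent": len(searches),
        "sources_received": len(received),
        "sources_cited": len(cited),
        "cited_by_tier": dict(Counter(source["tier"] for source in cited)),
    }
```

Keeping received and cited sources in the same record makes the gap between them easy to surface in the search summary table.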

Styling

Arial 12pt body, navy headings (#1a3a5c), light blue table headers (#e8f0f8), red red-flag callout, green conversation-hook callout.

Hyperlink patterns

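// Uses the docx npm package; `title` is the display text of the cited source.
const { ExternalHyperlink, TextRun } = require("docx");

new ExternalHyperlink({
  link: "https://...",
  children: [new TextRun({ text: title, style: "Hyperlink" })],
});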

Phase 10: Deliver

Save: `<output-dir>/dossier_<entity-slug>_<YYYY-MM-DD>.docx`

Chat summary: file path + verdict on hypothesis + audit counts + tier breakdown + BYOK MCPs used (if any)

Validate: `python scripts/office/validate.py <docx>`
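
The save path pattern can be assembled as sketched here. The slug format (lowercase, hyphen-separated) and both function names are assumptions; the spec only fixes the `dossier_<entity-slug>_<YYYY-MM-DD>.docx` shape.

```python
import re
from datetime import date
from pathlib import Path


def slugify(entity: str) -> str:
    """Lowercase the name and collapse non-alphanumeric runs into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", entity.lower()).strip("-")
    if not slug:
        raise ValueError("entity name produced an empty slug")
    return slug


def dossier_path(output_dir: str, entity: str, on: date) -> Path:
    """Build <output-dir>/dossier_<entity-slug>_<YYYY-MM-DD>.docx."""
    return Path(output_dir) / f"dossier_{slugify(entity)}_{on.isoformat()}.docx"
```

Passing the date in explicitly, rather than reading the clock inside the function, keeps the file name reproducible.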

Tooling

Script	Role
`scripts/citation_tracker.py`	Three-count audit + supporting/disconfirming classification + source-tier tagging at `~/.dossier_sessions/<session>.json`
`scripts/disconfirming_evidence_balance.py`	Verifies ≥30% of search budget allocated to disconfirming queries; warns if biased
`scripts/source_tier_classifier.py`	URL → primary / secondary / tertiary classification via domain heuristics
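
A domain-heuristic classifier in the spirit of `scripts/source_tier_classifier.py` might look like the sketch below. It is not that script: the secondary-domain list is a tiny illustrative sample, the `official_domains` parameter is a hypothetical way to treat the subject's own site as primary, and defaulting unknown hosts to tertiary is an assumed conservative choice.

```python
from typing import Iterable
from urllib.parse import urlparse

# Illustrative sample only; a real list would be far longer.
SECONDARY_DOMAINS = {"reuters.com", "bloomberg.com", "nytimes.com"}


def classify_tier(url: str, official_domains: Iterable[str] = ()) -> str:
    """Map a URL to primary / secondary / tertiary using its hostname."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    def matches(domain: str) -> bool:
        # Exact match or any subdomain of the given domain.
        return host == domain or host.endswith("." + domain)

    if host.endswith(".gov") or any(matches(d) for d in official_domains):
        return "primary"  # government sites and the subject's official site
    if any(matches(d) for d in SECONDARY_DOMAINS):
        return "secondary"  # mainstream news and trade press
    return "tertiary"  # blogs, forums, and anything unrecognized
```

Matching on the parsed hostname rather than the raw string avoids misclassifying a URL that merely contains a trusted domain in its path.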

References

`references/hypothesis_testing_discipline.md`: ≥30% rule + decision-grade vs encyclopedic (7+ sources)
`references/subject_type_source_matrix.md`: person/company/nonprofit/gov source matrices (7+ sources)
`references/conversation_hook_quality.md`: finding-tied hook discipline (7+ sources)

Error Handling

Failure	Behavior
Subject name ambiguous	Refuse to proceed. Re-ask Q1 with disambiguating identifier.
User refuses to state hypothesis	Push back once. If still refused, fall back to "what's the most surprising thing I could find?" implicit hypothesis. Flag in audit.
Subject has zero public footprint	Surface explicitly. Suggest that the name may be wrong or the entity early-stage. Don't fabricate.
LinkedIn scrape blocked	Note in audit; fall back to WebSearch; suggest user verify manually.
SEC EDGAR fails	Retry once. If still failing, note "public filings not retrieved" and continue.
Sentiment data sparse	Mark reputation section as "limited public signal"; don't infer from training.
Sensitive topic surfaces (Q6 exclusion)	Exclude from DOCX. Note in chat (not in DOCX) so user knows the exclusion was honored.
3 consecutive tool failures	Stop, alert user, share what has been collected so far.
DOCX generation fails	Save raw data as JSON fallback.
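
The last row's JSON fallback could be handled with a thin wrapper around the DOCX build step. This is a sketch under assumptions: `build_docx` is a hypothetical callable standing in for the Node.js generation step, and writing the fallback next to the intended DOCX path is an illustrative choice.

```python
import json
from pathlib import Path
from typing import Callable, Dict


def deliver(data: Dict, docx_path: Path, build_docx: Callable[[Dict, Path], None]) -> Path:
    """Try to build the DOCX; on failure, save the raw data as JSON instead."""
    try:
        build_docx(data, docx_path)
        return docx_path
    except Exception:
        # Preserve the collected research even though rendering failed.
        fallback = docx_path.with_suffix(".json")
        fallback.write_text(
            json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8"
        )
        return fallback
```

Returning the path that was actually written lets the chat summary report the correct file either way.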

Anti-Patterns To Reject

Producing a dossier without forcing Q4 hypothesis
Allocating <30% of search budget to disconfirming evidence
Batching intake questions
Accepting ambiguous subject names
Generic conversation hooks ("ask about their roadmap")
Sensationalizing red flags (tier them, don't editorialize)
Skipping the source-reliability tier on flags
Fabricating coverage when LinkedIn or scraping is blocked
Using BYOK-MCP data without flagging in audit log
Including sensitive topics user excluded in Q6
Confirmation-biased verdict ("SUPPORTED" without engaging with disconfirming evidence)

Version: 1.0.0
Source spec: `megaprompts/12-dossier-megaprompt.md`
Build pattern: Path B (direct conversion). Research-pack sibling, hypothesis-testing variant.
