Cross-Verified Research
Systematic research engine with anti-hallucination safeguards and source quality tiering.
Rules (Absolute)
- Never fabricate sources. No fake URLs, no invented papers, no hallucinated statistics.
- Confidence gate. If confidence < 90% on a factual claim, do NOT present it as fact. State uncertainty explicitly.
- No speculation as fact. Do not present unverified claims using hedging language as if they were findings. Banned patterns: "아마도", "~인 것 같습니다", "~로 보입니다", "~수도 있습니다", "probably", "I think", "seems like", "appears to be", "likely". If a claim is not verified, label it explicitly as Unverified or Contested — do not soften it with hedging.
- BLUF output. Lead with conclusion, follow with evidence. Never bury the answer.
- Minimum effort. At least 5 distinct search queries per research task. At least 5 verified sources in final output.
- Cross-verify. Every key claim must appear in 2+ independent sources before presenting as fact.
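The hedging-language rule is mechanical enough to lint for. A minimal sketch of such a check, assuming a plain-text claim pipeline (the function names and the idea of raising on violations are illustrative, not part of the skill itself):

```python
# Hedging patterns banned by the "no speculation as fact" rule.
# The list mirrors the patterns above; extend as needed.
BANNED_PATTERNS = [
    "아마도", "인 것 같습니다", "로 보입니다", "수도 있습니다",
    "probably", "i think", "seems like", "appears to be", "likely",
]

def find_hedging(text: str) -> list[str]:
    """Return every banned hedging pattern found in text."""
    lowered = text.lower()
    return [p for p in BANNED_PATTERNS if p.lower() in lowered]

def enforce_no_speculation(claim: str, verified: bool) -> str:
    """Label an unverified claim explicitly instead of softening it."""
    if verified:
        return claim
    if find_hedging(claim):
        raise ValueError(f"Hedged speculation presented as finding: {claim!r}")
    return f"[Unverified] {claim}"
```

Note that naive substring matching will over-flag (e.g. "likely" inside "unlikely"); a production version would use word boundaries.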
Pipeline
Execute these 4 stages sequentially. Do NOT skip stages.
Stage 1: Deconstruct
Break the research question into atomic sub-questions.
Input: "Should we use Bun or Node.js for our backend?"
Decomposed:
1. Runtime performance benchmarks (CPU, memory, startup)
2. Ecosystem maturity (npm compatibility, native modules)
3. Production stability (known issues, enterprise adoption)
4. Developer experience (tooling, debugging, testing)
5. Long-term viability (funding, community, roadmap)
- Identify what requires external verification vs. internal knowledge
- Flag any sub-question where confidence < 90%
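The decomposition step can be represented as simple data so that later stages track per-sub-question confidence against the 90% gate. A sketch, assuming rough self-assessed confidence scores (the class and function names are illustrative):

```python
from dataclasses import dataclass

@dataclass
class SubQuestion:
    text: str
    needs_external_verification: bool
    confidence: float  # 0.0-1.0 self-estimate from internal knowledge

def flag_low_confidence(subs: list[SubQuestion], gate: float = 0.9) -> list[SubQuestion]:
    """Return every sub-question that falls below the 90% confidence gate."""
    return [s for s in subs if s.confidence < gate]

subs = [
    SubQuestion("Runtime performance benchmarks", True, 0.4),
    SubQuestion("Ecosystem maturity", True, 0.7),
    SubQuestion("npm compatibility basics", False, 0.95),
]
flagged = flag_low_confidence(subs)  # the first two must not be stated as fact
```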
Stage 2: Search & Collect
For each sub-question requiring verification:
- Formulate diverse queries — vary keywords, include year filters, try both English and Korean
- Use WebSearch for broad discovery, WebFetch for specific page analysis
- Classify every source by tier immediately (see Source Tiers below)
- Extract specific data points — numbers, dates, versions, quotes with attribution
- Record contradictions — when sources disagree, note both positions
Minimum search pattern:
Query 1: [topic] + "benchmark" or "comparison"
Query 2: [topic] + "production" or "enterprise"
Query 3: [topic] + [current year] + "review"
Query 4: [topic] + "issues" or "problems" or "limitations"
Query 5: [topic] + site:github.com (issues, discussions)
Fallback when WebSearch is unavailable or returns no results:
- Use WebFetch to directly access known authoritative URLs (official docs, GitHub repos, Wikipedia)
- Rely on internal knowledge but label all claims as Unverified (no external search available)
- Ask the user to provide source URLs or documents for verification
- Reduce the minimum source requirement but maintain cross-verification where possible
Stage 3: Cross-Verify
For each key finding:
- Does it appear in 2+ independent Tier S/A sources? → Verified
- Does it appear in only 1 source? → Unverified (label it)
- Do sources contradict? → Contested (present both sides with tier labels)
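The three-way decision above can be sketched as a small classifier, assuming each source is a (name, tier) pair and contradictions are detected beforehand (the signature is illustrative):

```python
def verification_status(sources: list[tuple[str, str]], contradicts: bool = False) -> str:
    """Classify a claim: Verified needs 2+ independent Tier S/A sources."""
    if contradicts:
        return "Contested"
    strong = {name for name, tier in sources if tier in ("S", "A")}
    return "Verified" if len(strong) >= 2 else "Unverified"
```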
Build a verification matrix:
| Claim | Source 1 (Tier) | Source 2 (Tier) | Status |
|-------|----------------|----------------|--------|
| Bun 3x faster startup | benchmarks.dev (A) | bun.sh/blog (B) | Unverified (only one Tier A source; Bun's own blog = biased) |
Stage 4: Synthesize
Produce the final report in BLUF format.
Output Format
```markdown
# Research: [Topic]

## Conclusion (BLUF)
[1-3 sentence definitive answer or recommendation]

## Key Findings
[Numbered findings, each with inline source tier labels]
1. [Finding] — [evidence summary] Sources: 🏛️ [source1], 🛡️ [source2]
2. [Finding] — [evidence summary] Sources: 🛡️ [source1], 🛡️ [source2]

## Contested / Uncertain
[Any claims that couldn't be cross-verified or where sources conflict]
- ⚠️ [claim] — Source A says X, Source B says Y

## Verification Matrix
| Claim | Sources | Tier | Status |
|---|---|---|---|
| ... | ... | ... | Verified/Unverified/Contested |

## Sources
[All sources, grouped by tier]

### 🏛️ Tier S — Academic & Primary Research
- Title — Journal/Org (Year)

### 🛡️ Tier A — Trusted Official
- Title — Source (Year)

### ⚠️ Tier B — Community / Caution
- Title — Platform (Year)

### Tier C — General
- Title
```
Source Tiers
Classify every source on discovery.
| Tier | Label | Trust Level | Examples |
|---|---|---|---|
| S | 🏛️ | Academic, peer-reviewed, primary research, official specs | Google Scholar, arXiv, PubMed, W3C/IETF RFCs, language specs (ECMAScript, PEPs) |
| A | 🛡️ | Government, .edu, major press, official docs | .gov/.edu, Reuters/AP/BBC, official framework docs, company engineering blogs (Google AI, Netflix Tech) |
| B | ⚠️ | Social media, forums, personal blogs, wikis — flag to user | Twitter/X, Reddit, StackOverflow, Medium, dev.to, Wikipedia, 나무위키 |
| C | (none) | General websites not fitting above categories | Corporate marketing, press releases, SEO content, news aggregators |
Tier Classification Rules
- Company's own content about their product:
- Official docs → Tier A
- Feature announcements → Tier A (existence), Tier B (performance claims)
- Marketing pages → Tier C
- GitHub:
- Official repos (e.g., facebook/react) → Tier A
- Issues/Discussions with reproduction → Tier A (for bug existence)
- Random user repos → Tier B
- Benchmarks:
- Independent, reproducible, methodology disclosed → Tier S
- Official by neutral party → Tier A
- Vendor's own benchmarks → Tier B (note bias)
- StackOverflow: Accepted answers with high votes = borderline Tier A; non-accepted = Tier B
- Tier B sources must never be cited alone — corroborate with Tier S or A
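These rules can be sketched as a rough domain classifier; the domain lists below are illustrative and not exhaustive, and the edge cases in the bullets above (vendor benchmarks, accepted StackOverflow answers) still need human judgment:

```python
TIER_S_DOMAINS = ("arxiv.org", "pubmed.ncbi.nlm.nih.gov", "w3.org", "ietf.org")
TIER_A_DOMAINS = ("reuters.com", "apnews.com", "bbc.com")
TIER_B_DOMAINS = ("reddit.com", "stackoverflow.com", "medium.com",
                  "dev.to", "wikipedia.org", "x.com")

def classify_source(domain: str) -> str:
    """First-pass tier classification by domain; refine per the rules above."""
    if any(domain.endswith(d) for d in TIER_S_DOMAINS):
        return "S"
    if domain.endswith((".gov", ".edu")) or any(domain.endswith(d) for d in TIER_A_DOMAINS):
        return "A"
    if any(domain.endswith(d) for d in TIER_B_DOMAINS):
        return "B"
    return "C"
```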
When to Use
- Technology evaluation or comparison
- Fact-checking specific claims
- Architecture decision research
- Market/competitor analysis
- "Is X true?" verification tasks
- Any question where accuracy matters more than speed
When NOT to Use
- Creative writing or brainstorming (use creativity-sampler)
- Code implementation (use search-first for library discovery)
- Simple questions answerable from internal knowledge with high confidence
- Opinion-based questions with no verifiable answer
Integration Notes
- With brainstorming: Can be invoked during brainstorming's "Explore context" phase for fact-based inputs
- With search-first: search-first finds tools/libraries to USE; this skill VERIFIES factual claims. Different purposes.
- With adversarial-review: Research findings can feed into adversarial review for stress-testing conclusions