swing-research


Cross-Verified Research

Systematic research engine with anti-hallucination safeguards and source quality tiering.

Rules (Absolute)

  1. Never fabricate sources. No fake URLs, no invented papers, no hallucinated statistics.
  2. Source-traceability gate. Every factual claim must be traceable to a specific, citable source. If a claim cannot be traced to any source, mark it as Unverified (internal knowledge only) and state what verification would be needed. Never present untraced claims as findings.
  3. No speculation as fact. Do not present unverified claims using hedging language as if they were findings. Banned patterns: "아마도", "~인 것 같습니다", "~로 보입니다", "~수도 있습니다", "probably", "I think", "seems like", "appears to be", "likely". If a claim is not verified, label it explicitly as Unverified or Contested — do not soften it with hedging.
  4. BLUF output. Lead with conclusion, follow with evidence. Never bury the answer.
  5. Scaled effort. Match research depth to question scope:
    • Narrow factual (single claim, date, specification): 2-3 queries, 2+ sources
    • Technology comparison (A vs B): 5+ queries, 5+ sources
    • Broad landscape (market analysis, state of the art): 8+ queries, 8+ sources. Default to the higher tier when scope is ambiguous.
  6. Cross-verify. Every key claim must appear in 2+ independent sources before presenting as fact. "Independent" means the sources conducted their own analysis or reporting — two articles that both cite the same original source (press release, blog post, study) count as ONE source, not two. Trace claims back to their origin.
  7. Scope before search. If the research question is ambiguous or overly broad, decompose it into specific sub-questions in Stage 1 and present them to the user for confirmation before proceeding to Stage 2. Do not research a vague question — sharpen it first.
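Rule 6's independence test can be sketched as a small helper. This is a hypothetical illustration (the `url` and `traces_to` field names are assumptions, not part of the skill): sources that trace back to the same origin collapse into one.

```python
def count_independent(sources):
    """Count independent sources per Rule 6: two articles that both
    cite the same original (press release, blog post, study) collapse
    into a single source. A source that did its own analysis has no
    'traces_to' origin and counts as its own origin."""
    origins = set()
    for s in sources:
        origins.add(s.get("traces_to") or s["url"])
    return len(origins)

claim_sources = [
    {"url": "https://techblog-a.example/post", "traces_to": "vendor press release"},
    {"url": "https://newswire-b.example/article", "traces_to": "vendor press release"},
    {"url": "https://indie-bench.example/results", "traces_to": None},
]
print(count_independent(claim_sources))  # → 2
```

Two articles citing the same press release plus one independent benchmark yield two independent sources, not three.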

Pipeline

Execute these 4 stages sequentially. Do NOT skip stages.

Stage 1: Deconstruct

Break the research question into atomic sub-questions.
Input: "Should we use Bun or Node.js for our backend?"
Decomposed:
  1. Runtime performance benchmarks (CPU, memory, startup)
  2. Ecosystem maturity (npm compatibility, native modules)
  3. Production stability (known issues, enterprise adoption)
  4. Developer experience (tooling, debugging, testing)
  5. Long-term viability (funding, community, roadmap)
  • Identify what requires external verification vs. internal knowledge
  • If the original question is vague or overly broad, present the decomposed sub-questions to the user for confirmation before proceeding (Rule 7)
  • For each sub-question, note what a traceable source would look like
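The decomposition output above can be represented as a small data structure. A minimal sketch, assuming illustrative field names that are not prescribed by the skill:

```python
from dataclasses import dataclass

@dataclass
class SubQuestion:
    """One atomic sub-question produced in Stage 1."""
    text: str
    needs_external: bool   # external verification vs. internal knowledge
    expected_source: str   # what a traceable source would look like

plan = [
    SubQuestion("Runtime performance benchmarks (CPU, memory, startup)", True,
                "independent benchmark with disclosed methodology"),
    SubQuestion("Long-term viability (funding, community, roadmap)", True,
                "funding announcements, contributor statistics"),
]
```

Recording `expected_source` up front makes the Stage 2 searches concrete before any query is run.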

Stage 2: Search & Collect

For each sub-question requiring verification:
  1. Formulate diverse queries — vary keywords, include year filters, try both English and Korean
  2. Use WebSearch for broad discovery, WebFetch for specific page analysis
  3. Classify every source by tier immediately (see Source Tiers below)
  4. Extract specific data points — numbers, dates, versions, quotes with attribution
  5. Record contradictions — when sources disagree, note both positions
  6. Trace origin — when multiple sources cite the same underlying source, identify the original
Search pattern (scale per Rule 5):
Query 1: [topic] + "benchmark" or "comparison"
Query 2: [topic] + "production" or "enterprise"
Query 3: [topic] + [current year] + "review"
Query 4: [topic] + "issues" or "problems" or "limitations"
Query 5: [topic] + site:github.com (issues, discussions)
Fallback when WebSearch is unavailable or returns no results:
  1. Use WebFetch to directly access known authoritative URLs (official docs, GitHub repos, Wikipedia)
  2. Rely on internal knowledge but label all claims as Unverified (no external search available)
  3. Ask the user to provide source URLs or documents for verification
  4. Reduce the minimum source requirement but maintain cross-verification where possible
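The query pattern above, scaled to the Rule 5 effort tiers, can be sketched as follows. The scope labels and the extra query variations are assumptions for illustration:

```python
def build_queries(topic, scope, year=2024):
    """Generate search queries following the Stage 2 pattern,
    scaled to the Rule 5 minimums. Unknown scopes default to
    the higher tier, per Rule 5."""
    base = [
        f'{topic} "benchmark" OR "comparison"',
        f'{topic} "production" OR "enterprise"',
        f'{topic} {year} "review"',
        f'{topic} "issues" OR "problems" OR "limitations"',
        f'{topic} site:github.com',
    ]
    minimums = {"narrow": 2, "comparison": 5, "broad": 8}
    n = minimums.get(scope, minimums["broad"])
    if n <= len(base):
        return base[:n]
    # Broad scope: extend the pattern with further variations.
    extras = [
        f'{topic} "migration" OR "case study"',
        f'{topic} "alternatives"',
        f'{topic} "roadmap" OR "future"',
    ]
    return base + extras[: n - len(base)]
```

For example, `build_queries("Bun vs Node.js", "narrow")` yields the two-query minimum, while an ambiguous scope falls through to the eight-query broad tier.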

Stage 3: Cross-Verify

For each key finding:
  • Does it appear in 2+ independent Tier S/A sources? → Verified
  • Does it appear in only 1 source? → Unverified (label it)
  • Do sources contradict? → Contested (present both sides with tier labels)
Remember: "independent" means each source did its own analysis. Two articles both citing the same benchmark study = 1 source.
Build a verification matrix:
| Claim | Source 1 (Tier) | Source 2 (Tier) | Status |
|-------|----------------|----------------|--------|
| Bun 3x faster startup | benchmarks.dev (A) | bun.sh/blog (B) | Verified (note: Bun's own blog = biased) |
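The status assignment above can be sketched as a small function. This follows a strict reading of the three bullets; it is an illustration, not part of the skill:

```python
def verify_status(tiers, contradicted=False):
    """Assign a Stage 3 status from the tiers of the independent
    sources behind one claim, e.g. tiers=["S", "A"]."""
    if contradicted:
        return "Contested"
    # Verified requires 2+ independent Tier S/A sources.
    if sum(1 for t in tiers if t in ("S", "A")) >= 2:
        return "Verified"
    return "Unverified"

print(verify_status(["S", "A"]))        # → Verified
print(verify_status(["A", "B"]))        # → Unverified (only one S/A source)
print(verify_status(["S", "A"], True))  # → Contested
```

A human researcher may still annotate edge cases (e.g. "Verified (single case study)") that a strict rule cannot express.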

Stage 4: Synthesize

Produce the final report in BLUF format.

Output Format

Research: [Topic]

Conclusion (BLUF)

[1-3 sentence definitive answer or recommendation]

Key Findings

[Numbered findings, each with inline source tier labels]
  1. [Finding] — [evidence summary] Sources: 🏛️ [source1], 🛡️ [source2]
  2. [Finding] — [evidence summary] Sources: 🛡️ [source1], 🛡️ [source2]

Contested / Uncertain

[Any claims that couldn't be cross-verified or where sources conflict]
  • ⚠️ [claim] — Source A says X, Source B says Y

Verification Matrix

| Claim | Sources | Tier | Status |
|-------|---------|------|--------|
| ... | ... | ... | Verified / Unverified / Contested |

Sources

[All sources, grouped by tier]

🏛️ Tier S — Academic & Primary Research

  • Title — Journal/Org (Year)

🛡️ Tier A — Trusted Official

  • Title — Source (Year)

⚠️ Tier B — Community / Caution

  • Title — Platform (Year)

Tier C — General

  • Title

Quality Calibration

BAD Example — What to Avoid

Research: Is Rust faster than Go for web servers?

Conclusion (BLUF)

Rust is generally faster than Go for web servers due to zero-cost abstractions.

Key Findings

  1. Rust is 2-5x faster than Go — Rust's ownership model eliminates GC pauses. Sources: 🛡️ https://rust-performance-comparison.example.com
  2. Rust uses less memory — Typically 50% less memory in production. Sources: 🛡️ https://memory-benchmarks.example.com
  3. Go is easier to learn — Most developers pick up Go in a week. Sources: 🏛️ https://developer-survey.example.com

Verification Matrix

| Claim | Sources | Tier | Status |
|-------|---------|------|--------|
| 2-5x faster | 1 benchmark site | A | Verified |
| 50% less memory | 1 benchmark site | A | Verified |

**Why this is bad:**
- Source URLs are fabricated (nonexistent domains)
- "2-5x faster" and "50% less memory" are presented as **Verified** with only 1 source each
- No contested claims section despite this being a nuanced topic
- Claims are restated internal knowledge dressed up with fake citations
- No origin tracing — where did "2-5x" come from?
- The "Verified" labels are false — nothing was actually cross-verified

GOOD Example — What to Aim For

Research: Is Rust faster than Go for web servers?

Conclusion (BLUF)

Rust outperforms Go in raw throughput benchmarks (typically 1.5-3x in TechEmpower), but the gap narrows significantly with real-world I/O workloads. Go's GC pauses (sub-millisecond since Go 1.19) are rarely a bottleneck for typical web services. Choose based on your latency tail requirements, not averages.

Key Findings

  1. Rust frameworks lead TechEmpower benchmarks — Actix-web and Axum consistently rank in the top 10; Go's stdlib and Gin rank 20-40 range in plaintext/JSON tests. Sources: 🏛️ TechEmpower Round 22 (2024), 🛡️ Axum GitHub benchmarks
  2. Go's GC latency is sub-millisecond since 1.19 — p99 GC pause < 500μs confirmed by the Go team. Sources: 🛡️ Go Blog "Getting to Go" (2022), 🛡️ Go 1.19 Release Notes
  3. Real-world gap is smaller than microbenchmarks suggest — Discord's 2020 migration (Go→Rust) showed tail latency improvements, but their workload (millions of concurrent connections) is atypical. Sources: 🛡️ Discord Engineering Blog (2020), ⚠️ HN discussion with Discord engineer comments

Contested / Uncertain

  • ⚠️ "Rust uses 50% less memory than Go" — Frequently repeated on Reddit/HN but no independent benchmark reproduces a consistent figure. Memory usage depends heavily on allocator choice (jemalloc vs system) and workload. Unverified.
  • ⚠️ Developer productivity trade-off — Go advocates claim 2-3x faster development time. No peer-reviewed study supports a specific multiplier. Unverified (internal knowledge only) — would need controlled study to verify.

Verification Matrix

| Claim | Sources | Tier | Status |
|-------|---------|------|--------|
| Rust 1.5-3x faster (synthetic) | TechEmpower R22 (S), Axum bench (A) | S+A | Verified |
| Go GC < 500μs p99 | Go Blog (A), Release Notes (A) | A+A | Verified |
| Discord latency improvement | Discord Blog (A), HN thread (B) | A+B | Verified (single case study) |
| Rust 50% less memory | Reddit threads (B) only | B | Unverified |
| Go 2-3x dev speed | No source found | (none) | Unverified (internal knowledge only) |

Sources

🏛️ Tier S — Academic & Primary Research

🛡️ Tier A — Trusted Official

⚠️ Tier B — Community / Caution


**Why this is good:**
- Every URL is a real, verifiable page
- Claims that lack sources are explicitly labeled **Unverified**
- The "50% less memory" myth is called out rather than repeated
- Verification matrix honestly shows what's verified vs. not
- Sources are independent (TechEmpower did their own benchmarks, not citing each other)
- Nuance preserved: "the gap narrows with real-world I/O"

Source Tiers

Classify every source on discovery.
| Tier | Label | Trust Level | Examples |
|------|-------|-------------|----------|
| S | 🏛️ | Academic, peer-reviewed, primary research, official specs | Google Scholar, arXiv, PubMed, W3C/IETF RFCs, language specs (ECMAScript, PEPs) |
| A | 🛡️ | Government, .edu, major press, official docs | .gov/.edu, Reuters/AP/BBC, official framework docs, company engineering blogs (Google AI, Netflix Tech) |
| B | ⚠️ | Social media, forums, personal blogs, wikis — flag to user | Twitter/X, Reddit, StackOverflow, Medium, dev.to, Wikipedia, 나무위키 |
| C | (none) | General websites not fitting above categories | Corporate marketing, press releases, SEO content, news aggregators |

Tier Classification Rules

  • Company's own content about their product:
    • Official docs → Tier A
    • Feature announcements → Tier A (existence), Tier B (performance claims)
    • Marketing pages → Tier C
  • GitHub:
    • Official repos (e.g., facebook/react) → Tier A
    • Issues/Discussions with reproduction → Tier A (for bug existence)
    • Random user repos → Tier B
  • Benchmarks:
    • Independent, reproducible, methodology disclosed → Tier S
    • Official by neutral party → Tier A
    • Vendor's own benchmarks → Tier B (note bias)
  • StackOverflow: Accepted answers with high votes = borderline Tier A; non-accepted = Tier B
  • Tier B sources must never be cited alone — corroborate with Tier S or A
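The classification rules above can be approximated by a rough helper. A sketch only: the `kind` labels are hypothetical tags the researcher assigns on discovery, and the substring domain checks are deliberately crude.

```python
def classify_source(url, kind):
    """Rough tier classifier following the tier table and rules above.
    Covers only a few obvious cases; edge cases (e.g. accepted
    StackOverflow answers) still need human judgment."""
    community = ("reddit.com", "twitter.com", "news.ycombinator.com",
                 "medium.com", "stackoverflow.com")
    if kind in ("peer-reviewed", "spec", "rfc"):
        return "S"
    if kind == "vendor-benchmark":
        return "B"  # vendor's own benchmarks: note the bias
    if any(domain in url for domain in community):
        return "B"  # community sources: never cite alone
    if ".gov" in url or ".edu" in url or kind == "official-docs":
        return "A"
    return "C"      # general websites, marketing, press releases
```

For example, an IETF RFC tagged `"rfc"` classifies as Tier S, while a Reddit thread classifies as Tier B regardless of how it is tagged.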

When to Use

  • Technology evaluation or comparison
  • Fact-checking specific claims
  • Architecture decision research
  • Market/competitor analysis
  • "Is X true?" verification tasks
  • Any question where accuracy matters more than speed

When NOT to Use

  • Creative writing or brainstorming (use swing-options)
  • Code implementation (use search-first for library discovery)
  • Simple questions answerable from internal knowledge with high confidence
  • Opinion-based questions with no verifiable answer

Integration Notes

  • With swing-clarify: Run swing-clarify first on ambiguous requests before invoking this skill. Clarified scope produces better results.
  • With brainstorming: Can be invoked during brainstorming's "Explore context" phase for fact-based inputs
  • With search-first: search-first finds tools/libraries to USE; this skill VERIFIES factual claims. Different purposes.
  • With swing-review: Research findings can feed into adversarial review for stress-testing conclusions