swing-research
Cross-Verified Research
交叉验证式研究
Systematic research engine with anti-hallucination safeguards and source quality tiering.
具备防幻觉保障和来源质量分级的系统化研究引擎。
Rules (Absolute)
绝对规则
1. Never fabricate sources. No fake URLs, no invented papers, no hallucinated statistics.
2. Source-traceability gate. Every factual claim must be traceable to a specific, citable source. If a claim cannot be traced to any source, mark it as Unverified (internal knowledge only) and state what verification would be needed. Never present untraced claims as findings.
3. No speculation as fact. Do not use hedging language to present unverified claims as if they were findings. Banned patterns: "아마도", "~인 것 같습니다", "~로 보입니다", "~수도 있습니다", "probably", "I think", "seems like", "appears to be", "likely". If a claim is not verified, label it explicitly as Unverified or Contested; do not soften it with hedging.
4. BLUF (bottom line up front) output. Lead with the conclusion, follow with evidence. Never bury the answer.
5. Scaled effort. Match research depth to question scope:
   - Narrow factual (single claim, date, specification): 2-3 queries, 2+ sources
   - Technology comparison (A vs B): 5+ queries, 5+ sources
   - Broad landscape (market analysis, state-of-art): 8+ queries, 8+ sources

   Default to the higher tier when scope is ambiguous.
6. Cross-verify. Every key claim must appear in 2+ independent sources before it is presented as fact. "Independent" means the sources conducted their own analysis or reporting: two articles that both cite the same original source (press release, blog post, study) count as ONE source, not two. Trace claims back to their origin.
7. Scope before search. If the research question is ambiguous or overly broad, decompose it into specific sub-questions in Stage 1 and present them to the user for confirmation before proceeding to Stage 2. Do not research a vague question; sharpen it first.
1. 绝不编造来源。禁止伪造URL、虚构论文、捏造统计数据。
2. 来源可追溯门槛。每一个事实性主张都必须能追溯到具体的、可引用的来源。如果某个主张无法追溯到任何来源，需标记为未经验证(仅内部知识)，并说明需要哪些验证步骤。绝不能将无追溯来源的主张作为研究结果呈现。
3. 不得将推测当作事实。不得使用模糊表述将未经验证的主张伪装成研究结果。禁止使用的表述模式:“아마도”、“~인 것 같습니다”、“~로 보입니다”、“~수도 있습니다”、“probably”、“I think”、“seems like”、“appears to be”、“likely”。如果某个主张未经验证，需明确标记为未经验证或存在争议——不得用模糊表述弱化其不确定性。
4. BLUF输出。结论先行，随后附上证据。切勿隐藏答案。
5. 按比例投入精力。根据问题范围匹配研究深度:
   - 窄范围事实类(单个主张、日期、规格):2-3次查询，2+个来源
   - 技术对比类(A vs B):5+次查询，5+个来源
   - 广泛全景类(市场分析、技术现状):8+次查询，8+个来源

   当范围不明确时，默认采用更高层级的投入标准。
6. 交叉验证。每一个关键主张必须在2+个独立来源中出现，才能作为事实呈现。“独立”指各来源开展了自主分析或报道——两篇均引用同一原始来源(新闻稿、博客文章、研究报告)的文章只能算作一个来源，而非两个。需追溯主张的原始出处。
7. 先明确范围再搜索。如果研究问题模糊或过于宽泛，在第一阶段将其拆解为具体的子问题，提交用户确认后再进入第二阶段。不得针对模糊问题开展研究——先明确问题边界。
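The scaled-effort rule (Rule 5) amounts to a small lookup table. The sketch below is one possible encoding, not part of the skill itself: the names `Scope`, `EffortBudget`, and `effort_for` are hypothetical, and passing several candidate scopes is just one way to model an ambiguous question.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    NARROW_FACTUAL = "narrow_factual"    # single claim, date, specification
    COMPARISON = "comparison"            # A vs B
    BROAD_LANDSCAPE = "broad_landscape"  # market analysis, state of the art


@dataclass(frozen=True)
class EffortBudget:
    min_queries: int
    min_sources: int


# Minimums taken from Rule 5; all identifiers here are illustrative.
EFFORT_BY_SCOPE = {
    Scope.NARROW_FACTUAL: EffortBudget(min_queries=2, min_sources=2),
    Scope.COMPARISON: EffortBudget(min_queries=5, min_sources=5),
    Scope.BROAD_LANDSCAPE: EffortBudget(min_queries=8, min_sources=8),
}


def effort_for(candidates: list[Scope]) -> EffortBudget:
    """Return the budget of the most demanding candidate scope.

    Several candidates model an ambiguous question, for which Rule 5
    says to default to the higher tier.
    """
    if not candidates:
        raise ValueError("at least one candidate scope is required")
    return max(
        (EFFORT_BY_SCOPE[scope] for scope in candidates),
        key=lambda budget: (budget.min_queries, budget.min_sources),
    )
```

For example, a question that could be read as either a narrow fact check or a comparison would get the comparison budget of 5 queries and 5 sources.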
Pipeline
流程
Execute these 4 stages sequentially. Do NOT skip stages.
按顺序执行以下4个阶段,不得跳过任何阶段。
Stage 1: Deconstruct
阶段1:拆解问题
Break the research question into atomic sub-questions.
Input: "Should we use Bun or Node.js for our backend?"
Decomposed:
1. Runtime performance benchmarks (CPU, memory, startup)
2. Ecosystem maturity (npm compatibility, native modules)
3. Production stability (known issues, enterprise adoption)
4. Developer experience (tooling, debugging, testing)
5. Long-term viability (funding, community, roadmap)

- Identify what requires external verification vs. internal knowledge
- If the original question is vague or overly broad, present the decomposed sub-questions to the user for confirmation before proceeding (Rule 7)
- For each sub-question, note what a traceable source would look like
将研究问题拆解为原子化的子问题。
输入:“我们后端应该使用Bun还是Node.js?”
拆解结果:
1. 运行时性能基准测试(CPU、内存、启动速度)
2. 生态系统成熟度(npm兼容性、原生模块)
3. 生产稳定性(已知问题、企业级采用情况)
4. 开发者体验(工具链、调试、测试)
5. 长期可行性(资金支持、社区、路线图)

- 识别哪些内容需要外部验证,哪些可依赖内部知识
- 如果原始问题模糊或过于宽泛,需将拆解后的子问题提交用户确认后再继续(规则7)
- 为每个子问题标注可追溯来源的类型
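The output of Stage 1 can be held in a small data structure. This is a minimal sketch under assumptions: `SubQuestion` and `ResearchPlan` are hypothetical names, the `expected_source` strings are illustrative, and whether a question counts as vague is decided by whoever builds the plan.

```python
from dataclasses import dataclass, field


@dataclass
class SubQuestion:
    text: str
    needs_external_verification: bool
    expected_source: str  # what a traceable source would look like


@dataclass
class ResearchPlan:
    question: str
    is_vague: bool
    sub_questions: list[SubQuestion] = field(default_factory=list)

    def to_verify(self) -> list[SubQuestion]:
        """Sub-questions that Stage 2 has to search for."""
        return [s for s in self.sub_questions if s.needs_external_verification]

    def ready_for_stage_2(self, user_confirmed: bool) -> bool:
        """Rule 7 gate: a vague question waits for user confirmation."""
        if not self.sub_questions:
            return False
        return user_confirmed or not self.is_vague


# Example values are assumptions made for illustration.
plan = ResearchPlan(
    question="Should we use Bun or Node.js for our backend?",
    is_vague=True,
    sub_questions=[
        SubQuestion("Runtime performance benchmarks", True,
                    "benchmark with disclosed methodology"),
        SubQuestion("Ecosystem maturity", True,
                    "official compatibility documentation"),
    ],
)
```

With this plan, `plan.ready_for_stage_2(user_confirmed=False)` stays false until the user has approved the decomposition.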
Stage 2: Search & Collect
阶段2:搜索与收集
For each sub-question requiring verification:
- Formulate diverse queries — vary keywords, include year filters, try both English and Korean
- Use WebSearch for broad discovery, WebFetch for specific page analysis
- Classify every source by tier immediately (see Source Tiers below)
- Extract specific data points — numbers, dates, versions, quotes with attribution
- Record contradictions — when sources disagree, note both positions
- Trace origin — when multiple sources cite the same underlying source, identify the original
Search pattern (scale per Rule 5):
Query 1: [topic] + "benchmark" or "comparison"
Query 2: [topic] + "production" or "enterprise"
Query 3: [topic] + [current year] + "review"
Query 4: [topic] + "issues" or "problems" or "limitations"
Query 5: [topic] + site:github.com (issues, discussions)

Fallback when WebSearch is unavailable or returns no results:
- Use WebFetch to directly access known authoritative URLs (official docs, GitHub repos, Wikipedia)
- Rely on internal knowledge but label all claims as Unverified (no external search available)
- Ask the user to provide source URLs or documents for verification
- Reduce the minimum source requirement but maintain cross-verification where possible
针对每个需要验证的子问题:
- 制定多样化查询词——变换关键词,加入年份筛选,同时尝试英文和韩文查询
- 使用WebSearch进行广泛发现,使用WebFetch进行特定页面分析
- 立即对每个来源进行分级(见下方“来源分级”)
- 提取具体数据点——数字、日期、版本号、带引用的引述内容
- 记录矛盾点——当来源存在分歧时,记录双方观点
- 追溯原始出处——当多个来源引用同一底层来源时,识别出原始来源
搜索模式(根据规则5调整规模):
查询1:[主题] + "benchmark" 或 "comparison"
查询2:[主题] + "production" 或 "enterprise"
查询3:[主题] + [当前年份] + "review"
查询4:[主题] + "issues" 或 "problems" 或 "limitations"
查询5:[主题] + site:github.com(问题、讨论区)

当WebSearch不可用或无结果时的备用方案:
- 使用WebFetch直接访问已知权威URL(官方文档、GitHub仓库、维基百科)
- 依赖内部知识,但需将所有主张标记为未经验证(无外部搜索可用)
- 请求用户提供来源URL或文档用于验证
- 降低最低来源要求,但尽可能保持交叉验证
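The five-query search pattern from Stage 2 can be generated from a topic string. The sketch below is an assumption-laden illustration: `build_queries` is a hypothetical helper, and the `OR` operator is assumed to be understood by whatever search backend receives the strings.

```python
from datetime import date
from typing import Optional


def build_queries(topic: str, year: Optional[int] = None) -> list[str]:
    """Expand a topic into the five query shapes of the search pattern."""
    # Fall back to the current calendar year when none is given.
    year = year if year is not None else date.today().year
    return [
        f"{topic} benchmark OR comparison",
        f"{topic} production OR enterprise",
        f"{topic} {year} review",
        f"{topic} issues OR problems OR limitations",
        f"{topic} site:github.com",
    ]
```

Calling `build_queries("Bun vs Node.js", year=2024)` yields five strings, the third being `Bun vs Node.js 2024 review`. Broader scopes would add more query shapes to reach the minimums in Rule 5.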
Stage 3: Cross-Verify
阶段3:交叉验证
For each key finding:
- Does it appear in 2+ independent Tier S/A sources? → Verified
- Does it appear in only 1 source? → Unverified (label it)
- Do sources contradict? → Contested (present both sides with tier labels)
Remember: "independent" means each source did its own analysis. Two articles both citing the same benchmark study = 1 source.
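The three checks above, together with the independence rule, can be sketched as a single function. Everything here is hypothetical naming (`Evidence`, `classify`), and two choices are assumptions rather than something the text specifies: contradiction is checked first, and the "Tier S/A" wording is read literally so that Tier B and C sources do not count toward the two required sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str     # page title or URL of the article found
    tier: str       # "S", "A", "B", or "C"
    origin: str     # the underlying original the source relies on
    supports: bool  # False when the source contradicts the claim


def classify(evidence: list[Evidence]) -> str:
    """Return "Verified", "Contested", or "Unverified" for one claim."""
    supporting = [e for e in evidence if e.supports]
    opposing = [e for e in evidence if not e.supports]
    if supporting and opposing:
        return "Contested"
    # Collapse sources that share an origin: two articles citing the
    # same study count as one source.
    independent_origins = {e.origin for e in supporting if e.tier in {"S", "A"}}
    if len(independent_origins) >= 2:
        return "Verified"
    return "Unverified"
```

Two Tier A articles that both trace back to the same study therefore come out as Unverified, because they share one origin.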
Build a verification matrix:
| Claim | Source 1 (Tier) | Source 2 (Tier) | Status |
|-------|----------------|----------------|--------|
| Bun 3x faster startup | benchmarks.dev (A) | bun.sh/blog (B) | Verified (note: Bun's own blog = biased) |

针对每个关键发现:
- 是否在2+个独立的S/A级来源中出现?→ 已验证
- 是否仅在1个来源中出现?→ 未经验证(标记出来)
- 来源是否存在矛盾?→ 存在争议(呈现双方观点并标注来源等级)
请注意:“独立”指每个来源都开展了自主分析。两篇均引用同一基准测试研究的文章 = 1个来源。
构建验证矩阵:
| 主张 | 来源1(等级) | 来源2(等级) | 状态 |
|-------|----------------|----------------|--------|
| Bun启动速度快3倍 | benchmarks.dev (A) | bun.sh/blog (B) | 已验证(注意:Bun官方博客存在偏向性) |

Stage 4: Synthesize
阶段4:合成结果
Produce the final report in BLUF format.
以BLUF格式生成最终报告。
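Before the Stage 4 report is returned, the banned patterns from Rule 3 lend themselves to a mechanical check. The sketch below is only an illustration: `find_hedges` is a hypothetical helper, the Korean patterns are matched as plain substrings with the leading "~" dropped, and hits still need human review because a report may legitimately quote a banned phrase.

```python
import re

# Phrases taken from Rule 3.
ENGLISH_BANNED = ["probably", "I think", "seems like", "appears to be", "likely"]
KOREAN_BANNED = ["아마도", "인 것 같습니다", "로 보입니다", "수도 있습니다"]

# Word boundaries keep "likely" from matching inside "unlikely".
_ENGLISH_RE = re.compile(
    r"\b(" + "|".join(re.escape(p) for p in ENGLISH_BANNED) + r")\b",
    re.IGNORECASE,
)


def find_hedges(report: str) -> list[tuple[int, str]]:
    """Return (line_number, phrase) pairs for each banned phrase found."""
    hits = []
    for lineno, line in enumerate(report.splitlines(), start=1):
        for match in _ENGLISH_RE.finditer(line):
            hits.append((lineno, match.group(0)))
        # Korean endings attach to the preceding word, so substring
        # matching is used instead of word boundaries.
        for phrase in KOREAN_BANNED:
            if phrase in line:
                hits.append((lineno, phrase))
    return hits
```

A non-empty result means the draft should be revised so that each flagged claim is either sourced or labeled Unverified or Contested.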
Output Format
输出格式
Research: [Topic]
研究:[主题]
Conclusion (BLUF)
结论(BLUF)
[1-3 sentence definitive answer or recommendation]
[1-3句明确的答案或建议]
Key Findings
关键发现
[Numbered findings, each with inline source tier labels]
1. [Finding] — [evidence summary] Sources: 🏛️ [source1], 🛡️ [source2]
2. [Finding] — [evidence summary] Sources: 🛡️ [source1], 🛡️ [source2]
[编号列出的发现,每个发现内嵌来源等级标签]
1. [发现内容] — [证据摘要] 来源:🏛️ [来源1], 🛡️ [来源2]
2. [发现内容] — [证据摘要] 来源:🛡️ [来源1], 🛡️ [来源2]
Contested / Uncertain
存在争议/不确定内容
[Any claims that couldn't be cross-verified or where sources conflict]
- ⚠️ [claim] — Source A says X, Source B says Y
[所有无法交叉验证或来源存在冲突的主张]
- ⚠️ [主张内容] — 来源A称X,来源B称Y
Verification Matrix
验证矩阵
| Claim | Sources | Tier | Status |
|---|---|---|---|
| ... | ... | ... | Verified/Unverified/Contested |
| 主张 | 来源 | 等级 | 状态 |
|---|---|---|---|
| ... | ... | ... | 已验证/未经验证/存在争议 |
Sources
来源列表
[All sources, grouped by tier]
[所有来源,按等级分组]
🏛️ Tier S — Academic & Primary Research
🏛️ S级 — 学术与原始研究
- Title — Journal/Org (Year)
- 标题 — 期刊/机构(年份)
🛡️ Tier A — Trusted Official
🛡️ A级 — 可信官方来源
- Title — Source (Year)
- 标题 — 来源(年份)
⚠️ Tier B — Community / Caution
⚠️ B级 — 社区来源/需谨慎参考
- Title — Platform (Year)
- 标题 — 平台(年份)
Tier C — General
C级 — 通用来源
- Title
- 标题

Quality Calibration
质量校准
BAD Example — What to Avoid
反面示例 — 需避免的情况
Research: Is Rust faster than Go for web servers?
研究:Rust作为Web服务器是否比Go更快?
Conclusion (BLUF)
结论(BLUF)
Rust is generally faster than Go for web servers due to zero-cost abstractions.
由于零成本抽象,Rust作为Web服务器通常比Go更快。
Key Findings
关键发现
- Rust is 2-5x faster than Go — Rust's ownership model eliminates GC pauses. Sources: 🛡️ https://rust-performance-comparison.example.com
- Rust uses less memory — Typically 50% less memory in production. Sources: 🛡️ https://memory-benchmarks.example.com
- Go is easier to learn — Most developers pick up Go in a week. Sources: 🏛️ https://developer-survey.example.com
- Rust比Go快2-5倍 — Rust的所有权模型消除了GC停顿。 来源:🛡️ https://rust-performance-comparison.example.com
- Rust内存占用更低 — 生产环境中通常低50%。 来源:🛡️ https://memory-benchmarks.example.com
- Go更易学习 — 大多数开发者可在一周内掌握Go。 来源:🏛️ https://developer-survey.example.com
Verification Matrix
验证矩阵
| Claim | Sources | Tier | Status |
|---|---|---|---|
| 2-5x faster | 1 benchmark site | A | Verified |
| 50% less memory | 1 benchmark site | A | Verified |
**Why this is bad:**
- Source URLs are fabricated (nonexistent domains)
- "2-5x faster" and "50% less memory" are presented as **Verified** with only 1 source each
- No contested claims section despite this being a nuanced topic
- Claims are restated internal knowledge dressed up with fake citations
- No origin tracing — where did "2-5x" come from?
- The "Verified" labels are false: nothing was actually cross-verified

| 主张 | 来源 | 等级 | 状态 |
|---|---|---|---|
| 快2-5倍 | 1个基准测试网站 | A | 已验证 |
| 内存低50% | 1个基准测试网站 | A | 已验证 |
**问题所在:**
- 来源URL是编造的(不存在的域名)
- “快2-5倍”和“内存低50%”仅基于1个来源,却被标记为**已验证**
- 未设置“存在争议内容”部分,尽管这是一个存在争议的主题
- 主张是内部知识的重述,仅伪装成有引用来源
- 未追溯原始出处——“2-5倍”的数据来自哪里?
- “已验证”标签是虚假的——实际上未进行任何交叉验证

GOOD Example — What to Aim For
正面示例 — 目标标准
Research: Is Rust faster than Go for web servers?
研究:Rust作为Web服务器是否比Go更快?
Conclusion (BLUF)
结论(BLUF)
Rust outperforms Go in raw throughput benchmarks (typically 1.5-3x in TechEmpower), but the gap narrows significantly with real-world I/O workloads. Go's GC pauses (sub-millisecond since Go 1.19) are rarely a bottleneck for typical web services. Choose based on your latency tail requirements, not averages.
在原始吞吐量基准测试中,Rust的性能优于Go(在TechEmpower测试中通常快1.5-3倍),但在真实世界的I/O工作负载下,差距会显著缩小。Go 1.19及以后版本的GC停顿(亚毫秒级)在典型Web服务中很少成为瓶颈。应根据延迟尾部要求而非平均性能进行选择。
Key Findings
关键发现
- Rust frameworks lead TechEmpower benchmarks — Actix-web and Axum consistently rank in the top 10; Go's stdlib and Gin rank 20-40 range in plaintext/JSON tests. Sources: 🏛️ TechEmpower Round 22 (2024), 🛡️ Axum GitHub benchmarks
- Go's GC latency is sub-millisecond since 1.19 — p99 GC pause < 500μs confirmed by the Go team. Sources: 🛡️ Go Blog "Getting to Go" (2022), 🛡️ Go 1.19 Release Notes
- Real-world gap is smaller than microbenchmarks suggest — Discord's 2020 migration (Go→Rust) showed tail latency improvements, but their workload (millions of concurrent connections) is atypical. Sources: 🛡️ Discord Engineering Blog (2020), ⚠️ HN discussion with Discord engineer comments
- Rust框架在TechEmpower基准测试中名列前茅 — Actix-web和Axum始终位列前10;Go标准库和Gin在明文/JSON测试中位列20-40名。 来源:🏛️ TechEmpower第22轮测试(2024), 🛡️ Axum GitHub基准测试
- Go 1.19及以后版本的GC延迟为亚毫秒级 — Go团队确认p99 GC停顿<500μs。 来源:🛡️ Go博客《Getting to Go》(2022), 🛡️ Go 1.19发布说明
- 真实世界中的性能差距比微基准测试更小 — Discord 2020年从Go迁移到Rust后,尾部延迟得到改善,但他们的工作负载(数百万并发连接)并不典型。 来源:🛡️ Discord工程博客(2020), ⚠️ Hacker News上Discord工程师的讨论
Contested / Uncertain
存在争议/不确定内容
- ⚠️ "Rust uses 50% less memory than Go" — Frequently repeated on Reddit/HN but no independent benchmark reproduces a consistent figure. Memory usage depends heavily on allocator choice (jemalloc vs system) and workload. Unverified.
- ⚠️ Developer productivity trade-off — Go advocates claim 2-3x faster development time. No peer-reviewed study supports a specific multiplier. Unverified (internal knowledge only) — would need controlled study to verify.
- ⚠️ “Rust内存占用比Go低50%” — Reddit/HN上经常提到这一点,但没有独立基准测试能重现一致的数据。内存占用很大程度上取决于分配器选择(jemalloc vs 系统分配器)和工作负载。未经验证。
- ⚠️ 开发者生产力权衡 — Go的支持者声称开发速度快2-3倍。目前没有同行评审研究支持具体的倍数。未经验证(仅内部知识)——需要对照研究才能验证。
Verification Matrix
验证矩阵
| Claim | Sources | Tier | Status |
|---|---|---|---|
| Rust 1.5-3x faster (synthetic) | TechEmpower R22 (S), Axum bench (A) | S+A | Verified |
| Go GC < 500μs p99 | Go Blog (A), Release Notes (A) | A+A | Verified |
| Discord latency improvement | Discord Blog (A), HN thread (B) | A+B | Verified (single case study) |
| Rust 50% less memory | Reddit threads (B) only | B | Unverified |
| Go 2-3x dev speed | No source found | — | Unverified (internal knowledge only) |
| 主张 | 来源 | 等级 | 状态 |
|---|---|---|---|
| Rust在合成测试中快1.5-3倍 | TechEmpower R22 (S), Axum基准测试(A) | S+A | 已验证 |
| Go GC p99 <500μs | Go博客(A), 发布说明(A) | A+A | 已验证 |
| Discord延迟改善 | Discord博客(A), HN讨论(B) | A+B | 已验证(单个案例研究) |
| Rust内存低50% | 仅Reddit讨论(B) | B | 未经验证 |
| Go开发速度快2-3倍 | 未找到来源 | — | 未经验证(仅内部知识) |
Sources
来源列表
🏛️ Tier S — Academic & Primary Research
🏛️ S级 — 学术与原始研究
- TechEmpower Framework Benchmarks Round 22 — TechEmpower (2024)
- TechEmpower框架基准测试第22轮 — TechEmpower(2024)
🛡️ Tier A — Trusted Official
🛡️ A级 — 可信官方来源
- Getting to Go: The Journey of Go's Garbage Collector — Go Blog (2022)
- Go 1.19 Release Notes — Go Team (2022)
- Why Discord is Switching from Go to Rust — Discord Engineering (2020)
- Axum Benchmarks — Tokio Project
- Getting to Go: The Journey of Go's Garbage Collector — Go博客(2022)
- Go 1.19发布说明 — Go团队(2022)
- Why Discord is Switching from Go to Rust — Discord工程团队(2020)
- Axum基准测试 — Tokio项目
⚠️ Tier B — Community / Caution
⚠️ B级 — 社区来源/需谨慎参考
- HN Discussion on Discord migration — Hacker News (2020)
- Discord迁移相关HN讨论 — Hacker News(2020)

**Why this is good:**
- Every URL is a real, verifiable page
- Claims that lack sources are explicitly labeled **Unverified**
- The "50% less memory" myth is called out rather than repeated
- Verification matrix honestly shows what's verified vs. not
- Sources are independent (TechEmpower did their own benchmarks, not citing each other)
- Nuance preserved: "the gap narrows with real-world I/O"
**优势所在:**
- 所有URL都是真实、可验证的页面
- 无来源的主张被明确标记为**未经验证**
- “内存低50%”的传言被指出而非重复传播
- 验证矩阵如实展示了哪些内容已验证、哪些未验证
- 来源是独立的(TechEmpower开展了自主基准测试,未互相引用)
- 保留了细节:“在真实世界I/O场景下差距缩小”

Source Tiers
来源分级
Classify every source on discovery.
| Tier | Label | Source Types | Examples |
|---|---|---|---|
| S | 🏛️ | Academic, peer-reviewed, primary research, official specs | Google Scholar, arXiv, PubMed, W3C/IETF RFCs, language specs (ECMAScript, PEPs) |
| A | 🛡️ | Government, .edu, major press, official docs | .gov/.edu, Reuters/AP/BBC, official framework docs, company engineering blogs (Google AI, Netflix Tech) |
| B | ⚠️ | Social media, forums, personal blogs, wikis — flag to user | Twitter/X, Reddit, StackOverflow, Medium, dev.to, Wikipedia, 나무위키 |
| C | (none) | General websites not fitting above categories | Corporate marketing, press releases, SEO content, news aggregators |
在发现来源时立即进行分类。
| 等级 | 标识 | 来源类型 | 示例 |
|---|---|---|---|
| S | 🏛️ | 学术、同行评审、原始研究、官方规范 | Google Scholar、arXiv、PubMed、W3C/IETF RFC、语言规范(ECMAScript、PEPs) |
| A | 🛡️ | 政府、.edu、主流媒体、官方文档 | .gov/.edu域名、路透社/美联社/BBC、官方框架文档、企业工程博客(Google AI、Netflix Tech) |
| B | ⚠️ | 社交媒体、论坛、个人博客、维基类网站——需向用户标记 | Twitter/X、Reddit、StackOverflow、Medium、dev.to、维基百科、나무위키 |
| C | (无标识) | 不符合上述类别的通用网站 | 企业营销页面、新闻稿、SEO内容、新闻聚合器 |
Tier Classification Rules
等级分类规则
- Company's own content about their product:
- Official docs → Tier A
- Feature announcements → Tier A (existence), Tier B (performance claims)
- Marketing pages → Tier C
- GitHub:
- Official repos (e.g., facebook/react) → Tier A
- Issues/Discussions with reproduction → Tier A (for bug existence)
- Random user repos → Tier B
- Benchmarks:
- Independent, reproducible, methodology disclosed → Tier S
- Official by neutral party → Tier A
- Vendor's own benchmarks → Tier B (note bias)
- StackOverflow: Accepted answers with high votes = borderline Tier A; non-accepted = Tier B
- Tier B sources must never be cited alone — corroborate with Tier S or A
- 企业关于自身产品的内容:
- 官方文档 → A级
- 功能公告 → (存在性)A级,(性能主张)B级
- 营销页面 → C级
- GitHub:
- 官方仓库(如facebook/react)→ A级
- 带复现步骤的问题/讨论 → (bug存在性)A级
- 普通用户仓库 → B级
- 基准测试:
- 独立、可复现、披露方法论 → S级
- 中立第三方官方测试 → A级
- 厂商自行开展的基准测试 → B级(标注偏向性)
- StackOverflow: 高票已采纳答案接近A级;未采纳答案为B级
- B级来源不得单独引用——需与S级或A级来源相互佐证
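The classification rules above can be written as a lookup plus one corroboration check. This is a sketch under stated assumptions: the source-kind keys are hypothetical identifiers invented for illustration, "borderline Tier A" is rounded to Tier A, and unknown kinds fall back to Tier C as the general category.

```python
from enum import Enum


class Tier(Enum):
    S = "S"
    A = "A"
    B = "B"
    C = "C"


# Keys are hypothetical identifiers for the cases named in the rules.
TIER_BY_KIND = {
    "official_docs": Tier.A,
    "feature_announcement_existence": Tier.A,
    "feature_announcement_performance": Tier.B,
    "marketing_page": Tier.C,
    "github_official_repo": Tier.A,
    "github_issue_with_repro": Tier.A,    # for bug existence only
    "github_user_repo": Tier.B,
    "benchmark_independent_reproducible": Tier.S,
    "benchmark_neutral_official": Tier.A,
    "benchmark_vendor": Tier.B,           # note bias when citing
    "stackoverflow_accepted_high_votes": Tier.A,  # "borderline" in the rules
    "stackoverflow_other": Tier.B,
}


def classify_source(kind: str) -> Tier:
    """Map a source kind to its tier; unknown kinds default to Tier C."""
    return TIER_BY_KIND.get(kind, Tier.C)


def tier_b_is_corroborated(tiers: list[Tier]) -> bool:
    """True unless a Tier B source is cited without any Tier S or A source."""
    if Tier.B not in tiers:
        return True
    return any(t in (Tier.S, Tier.A) for t in tiers)
```

A claim backed only by `[Tier.B]` fails the check, while `[Tier.B, Tier.A]` passes.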
When to Use
适用场景
- Technology evaluation or comparison
- Fact-checking specific claims
- Architecture decision research
- Market/competitor analysis
- "Is X true?" verification tasks
- Any question where accuracy matters more than speed
- 技术评估或对比
- 特定主张的事实核查
- 架构决策研究
- 市场/竞品分析
- “X是否为真?”类验证任务
- 任何准确性优先于速度的问题
When NOT to Use
不适用场景
- Creative writing or brainstorming (use swing-options)
- Code implementation (use search-first for library discovery)
- Simple questions answerable from internal knowledge with high confidence
- Opinion-based questions with no verifiable answer
- 创意写作或头脑风暴(使用swing-options)
- 代码实现(使用search-first发现库)
- 可通过内部知识高置信度回答的简单问题
- 无验证答案的基于观点的问题
Integration Notes
集成说明
- With swing-clarify: Run swing-clarify first on ambiguous requests before invoking this skill. Clarified scope produces better results.
- With brainstorming: Can be invoked during brainstorming's "Explore context" phase for fact-based inputs
- With search-first: search-first finds tools/libraries to USE; this skill VERIFIES factual claims. Different purposes.
- With swing-review: Research findings can feed into adversarial review for stress-testing conclusions
- 与swing-clarify配合: 对模糊请求,先运行swing-clarify,再调用本技能。明确的范围能产生更好的结果。
- 与头脑风暴配合: 可在头脑风暴的“探索背景”阶段调用本技能,获取基于事实的输入。
- 与search-first配合: search-first用于发现可使用的工具/库;本技能用于验证事实主张。用途不同。
- 与swing-review配合: 研究发现可用于对抗性评审,对结论进行压力测试。