deep-research

Deep Research Skill

Trigger

Activate this skill when the user:
  • Asks to "research a topic", do a "literature review", "find papers about", or "survey papers on" something
  • Asks for a "deep dive into [topic]" or "what's the state of the art in [topic]"
  • Uses the /research <topic> slash command

Overview

This skill conducts systematic academic literature reviews in 6 phases, producing structured notes, a curated paper database, and a synthesized final report. Output is organized by phase for clarity.
Installation: ~/.claude/skills/deep-research/ (scripts, references, and this skill definition).
Output: /Users/lingzhi/Code/deep-research-output/{slug}/

Paper Quality Policy

Peer-reviewed conference papers take priority over arXiv preprints. Many arXiv papers have not undergone peer review and may contain unverified claims.

Source Priority (highest to lowest)

  1. Top AI conferences: NeurIPS, ICLR, ICML, ACL, EMNLP, NAACL, AAAI, IJCAI, CVPR, KDD, CoRL
  2. Peer-reviewed journals: JMLR, TACL, Nature, Science, etc.
  3. Workshop papers: NeurIPS/ICML workshops (lower bar but still reviewed)
  4. arXiv preprints with high citations: Likely high-quality but unverified
  5. Recent arXiv preprints: Use cautiously, note "preprint" status explicitly
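The priority order above can be expressed as a small ranking helper. This is a minimal sketch and not part of the skill's scripts: the function name `source_tier`, the `HIGH_CITATIONS` cutoff of 100, and the exact-match venue sets are assumptions made for illustration. The `venue` and `citationCount` fields follow the JSONL schema listed under Key Conventions.

```python
TOP_CONFERENCES = {"neurips", "iclr", "icml", "acl", "emnlp", "naacl",
                   "aaai", "ijcai", "cvpr", "kdd", "corl"}
JOURNALS = {"jmlr", "tacl", "nature", "science"}  # the source list ends with "etc."
HIGH_CITATIONS = 100  # assumed cutoff; the source does not define "high citations"


def source_tier(paper: dict) -> int:
    """Return 1 (highest priority) to 5 (lowest) for a paper record."""
    venue = (paper.get("venue") or "").lower()
    if "workshop" in venue:
        return 3
    if venue in TOP_CONFERENCES:
        return 1
    if venue in JOURNALS:
        return 2
    # Anything else is treated as a preprint in this sketch; venues outside
    # the two sets above would need extra handling in real use.
    if (paper.get("citationCount") or 0) >= HIGH_CITATIONS:
        return 4
    return 5


# Placeholder records, for illustration only.
papers = [
    {"title": "A", "venue": "arXiv", "citationCount": 3},
    {"title": "B", "venue": "NeurIPS", "citationCount": 40},
    {"title": "C", "venue": "ICML Workshop", "citationCount": 10},
    {"title": "D", "venue": "arXiv", "citationCount": 900},
]
ranked = sorted(papers, key=source_tier)
print([p["title"] for p in ranked])  # ['B', 'C', 'D', 'A']
```

Sorting by tier keeps the original order within a tier, so a secondary sort (for example by citation count) can be applied beforehand if needed.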

When to Use arXiv Papers

  • As supplementary evidence alongside peer-reviewed work
  • For very recent results (< 3 months old) not yet at conferences
  • When a peer-reviewed version doesn't exist yet (note (preprint) in citations)
  • For survey/review papers (these are useful even without peer review)

Search Tools (by priority)

1. paper_finder (primary — conference papers only)

Location: /Users/lingzhi/Code/documents/tool/paper_finder/paper_finder.py
Searches ai-paper-finder.info (HuggingFace Space) for published conference papers. Supports filtering by conference and year. Outputs JSONL with BibTeX.
bash
# Scrape search results for the queries in a YAML config
python /Users/lingzhi/Code/documents/tool/paper_finder/paper_finder.py --mode scrape --config <config.yaml>
# Download the papers listed in a results JSONL file
python /Users/lingzhi/Code/documents/tool/paper_finder/paper_finder.py --mode download --jsonl <results.jsonl>
# List the venues available for filtering
python /Users/lingzhi/Code/documents/tool/paper_finder/paper_finder.py --list-venues
Config example:
yaml
searches:
  - query: "long horizon reasoning agent"
    num_results: 100
    venues:
      neurips: [2024, 2025]
      iclr: [2024, 2025, 2026]
      icml: [2024, 2025]
output:
  root: /Users/lingzhi/Code/deep-research-output/{slug}/phase1_frontier/search_results
  overwrite: true

2. search_semantic_scholar.py (supplementary — citation data + broader coverage)

Location: /Users/lingzhi/.claude/skills/deep-research/scripts/search_semantic_scholar.py
Supports the --peer-reviewed-only and --top-conferences filters. API key: /Users/lingzhi/Code/keys.md (field S2_API_Key).

3. search_arxiv.py (supplementary — latest preprints)

Location: /Users/lingzhi/.claude/skills/deep-research/scripts/search_arxiv.py
For searching recent papers not yet published at conferences. Mark citations with (preprint).

Other Scripts

Script              Location                                  Key Flags
download_papers.py  ~/.claude/skills/deep-research/scripts/   --jsonl, --output-dir, --max-downloads, --sort-by-citations
extract_pdf.py      ~/.claude/skills/deep-research/scripts/   --pdf, --pdf-dir, --output-dir, --sections-only
paper_db.py         ~/.claude/skills/deep-research/scripts/   subcommands: merge, search, filter, tag, stats, add, export
bibtex_manager.py   ~/.claude/skills/deep-research/scripts/   --jsonl, --output, --keys-only
compile_report.py   ~/.claude/skills/deep-research/scripts/   --topic-dir

WebFetch Mode (no Bash)

  1. Paper discovery: WebSearch + WebFetch to query the Semantic Scholar/arXiv APIs
  2. Paper reading: WebFetch on ar5iv HTML, or the Read tool on downloaded PDFs
  3. Writing: the Write tool for JSONL, notes, and report files
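In WebFetch mode, discovery and reading come down to fetching the right URLs. The sketch below only builds those URLs, assuming the public Semantic Scholar Graph API paper-search endpoint, the arXiv API query endpoint, and the ar5iv HTML mirror. The function names and the chosen `fields` list are illustrative and not taken from the skill's scripts.

```python
from urllib.parse import urlencode


def semantic_scholar_search_url(query: str, limit: int = 20) -> str:
    """URL for a Semantic Scholar Graph API paper search."""
    params = {
        "query": query,
        "limit": limit,
        "fields": "title,authors,abstract,year,venue,citationCount,externalIds",
    }
    return "https://api.semanticscholar.org/graph/v1/paper/search?" + urlencode(params)


def arxiv_search_url(query: str, max_results: int = 20) -> str:
    """URL for an arXiv API query over all fields (returns an Atom feed)."""
    params = {"search_query": f"all:{query}", "start": 0, "max_results": max_results}
    return "http://export.arxiv.org/api/query?" + urlencode(params)


def ar5iv_url(arxiv_id: str) -> str:
    """URL of the ar5iv HTML rendering of an arXiv paper."""
    return f"https://ar5iv.labs.arxiv.org/html/{arxiv_id}"


print(semantic_scholar_search_url("long horizon reasoning agent", limit=5))
print(arxiv_search_url("long horizon reasoning agent", max_results=5))
print(ar5iv_url("1706.03762"))
```

Each URL can then be passed to WebFetch; the responses are JSON (Semantic Scholar), Atom XML (arXiv), and HTML (ar5iv) respectively.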

6-Phase Workflow

Phase 1: Frontier

Search the latest conference proceedings and preprints to understand current trends.
  1. Write phase1_frontier/paper_finder_config.yaml targeting the latest 1-2 years
  2. Run paper_finder scrape
  3. WebSearch for the latest accepted-paper lists
  4. Identify trending directions and key breakthroughs
→ Output: phase1_frontier/frontier.md, phase1_frontier/search_results/

Phase 2: Survey

Build a comprehensive landscape with a broader time range. Target 35-80 papers after filtering.
  1. Write phase2_survey/paper_finder_config.yaml covering 2023-2025
  2. Run paper_finder + Semantic Scholar + arXiv
  3. Merge all results:
    python /Users/lingzhi/.claude/skills/deep-research/scripts/paper_db.py merge
  4. Filter to the 35-80 most relevant papers:
    python /Users/lingzhi/.claude/skills/deep-research/scripts/paper_db.py filter --min-score 0.80 --max-papers 70
  5. Cluster by theme and write survey notes
→ Output: phase2_survey/survey.md, phase2_survey/search_results/, paper_db.jsonl
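The merge and filter steps can be pictured with the following sketch. It is not the implementation of paper_db.py, whose internals are not shown here: the deduplication rule, the tie-break in favor of peer-reviewed copies, and the per-record `score` field are all assumptions made for illustration. Only the ID preference (arxiv_id, then paperId) and the default thresholds come from this document.

```python
def paper_key(paper: dict) -> str:
    """Prefer arxiv_id, then Semantic Scholar paperId, then a normalized title."""
    return (paper.get("arxiv_id") or paper.get("paperId")
            or " ".join(paper.get("title", "").lower().split()))


def merge_papers(*sources: list) -> list:
    """Combine result lists, keeping one record per paper."""
    merged = {}
    for source in sources:
        for paper in source:
            key = paper_key(paper)
            current = merged.get(key)
            # On a duplicate, keep the peer-reviewed copy over the preprint copy.
            if current is None or (paper.get("peer_reviewed")
                                   and not current.get("peer_reviewed")):
                merged[key] = paper
    return list(merged.values())


def filter_papers(papers: list, min_score: float = 0.80, max_papers: int = 70) -> list:
    """Keep the highest-scoring papers at or above min_score (score is hypothetical)."""
    kept = [p for p in papers if p.get("score", 0.0) >= min_score]
    kept.sort(key=lambda p: p.get("score", 0.0), reverse=True)
    return kept[:max_papers]


# Placeholder records, for illustration only.
finder_results = [
    {"title": "Paper A", "arxiv_id": "0000.00001", "peer_reviewed": True, "score": 0.91},
]
arxiv_results = [
    {"title": "Paper A", "arxiv_id": "0000.00001", "peer_reviewed": False, "score": 0.91},
    {"title": "Paper B", "arxiv_id": "0000.00002", "peer_reviewed": False, "score": 0.42},
]
db = merge_papers(arxiv_results, finder_results)
selected = filter_papers(db)
print(len(db), [p["title"] for p in selected])  # 2 ['Paper A']
```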

Phase 3: Deep Dive

Select 8-15 papers. Prefer peer-reviewed papers for deep reading. Write the selection rationale, then read each paper fully and take structured notes.
→ Output: phase3_deep_dive/selection.md, phase3_deep_dive/deep_dive.md, phase3_deep_dive/papers/

Phase 4: Code & Tools

Extract GitHub URLs from the papers, then web-search for implementations and benchmarks.
→ Output: phase4_code/code_repos.md
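Extracting GitHub URLs from paper text is a simple pattern match. The following is a minimal sketch under the assumption that the paper text is already available as a string; the regular expression and the function name `extract_github_repos` are illustrative and not part of the skill's scripts.

```python
import re

# Matches https://github.com/<owner>/<repo>, ignoring any deeper path.
GITHUB_REPO = re.compile(r"https?://github\.com/([\w.-]+)/([\w.-]+)")


def extract_github_repos(text: str) -> list:
    """Return unique repository URLs in order of first appearance."""
    seen = []
    for owner, repo in GITHUB_REPO.findall(text):
        repo = repo.rstrip(".")      # drop a sentence-ending period
        if repo.endswith(".git"):
            repo = repo[:-4]         # normalize clone URLs
        url = f"https://github.com/{owner}/{repo}"
        if url not in seen:
            seen.append(url)
    return seen


# Placeholder text, for illustration only.
sample = (
    "Code is available at https://github.com/example-org/example-repo. "
    "See also https://github.com/example-org/example-repo/issues "
    "and http://github.com/other/tool.git"
)
print(extract_github_repos(sample))
# ['https://github.com/example-org/example-repo', 'https://github.com/other/tool']
```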

Phase 5: Synthesis

Analyze findings across papers, weighting peer-reviewed results more heavily. Produce a taxonomy, comparative tables, and a gap analysis.
→ Output: phase5_synthesis/synthesis.md, phase5_synthesis/gaps.md

Phase 6: Compilation

Assemble the final report. Mark preprint citations with a (preprint) suffix.
→ Output: phase6_report/report.md, phase6_report/references.bib

Output Directory

output/{topic-slug}/
├── paper_db.jsonl                    # Master database (accumulated)
├── phase1_frontier/
│   ├── paper_finder_config.yaml
│   ├── search_results/
│   └── frontier.md
├── phase2_survey/
│   ├── paper_finder_config.yaml
│   ├── search_results/
│   └── survey.md
├── phase3_deep_dive/
│   ├── papers/
│   ├── selection.md
│   └── deep_dive.md
├── phase4_code/
│   └── code_repos.md
├── phase5_synthesis/
│   ├── synthesis.md
│   └── gaps.md
└── phase6_report/
    ├── report.md
    └── references.bib
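The layout above can be created up front so that each phase has a place to write to. This is a sketch only: the helper name `init_topic_dir` is hypothetical, and whether the skill's scripts pre-create these folders or create them on demand is not stated in this document.

```python
from pathlib import Path
import tempfile

# Sub-directories taken from the tree above.
PHASE_DIRS = [
    "phase1_frontier/search_results",
    "phase2_survey/search_results",
    "phase3_deep_dive/papers",
    "phase4_code",
    "phase5_synthesis",
    "phase6_report",
]


def init_topic_dir(root: Path, slug: str) -> Path:
    """Create the per-topic directory skeleton and an empty master database."""
    topic_dir = root / slug
    for sub in PHASE_DIRS:
        (topic_dir / sub).mkdir(parents=True, exist_ok=True)
    (topic_dir / "paper_db.jsonl").touch(exist_ok=True)
    return topic_dir


# Demonstrated in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    topic = init_topic_dir(Path(tmp) / "output", "example-topic")
    print(sorted(p.name for p in topic.iterdir()))
```

Because `mkdir` and `touch` are called with `exist_ok=True`, running the helper again on an existing topic directory leaves earlier phase output untouched.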

Key Conventions

  • Paper IDs: Use arxiv_id when available, otherwise the Semantic Scholar paperId
  • Citations: [@key] format, key = firstAuthorYearWord (e.g., [@vaswani2017attention])
  • JSONL schema: title, authors, abstract, year, venue, venue_normalized, peer_reviewed, citationCount, paperId, arxiv_id, pdf_url, tags, source
  • Preprint marking: Always note (preprint) when citing non-peer-reviewed work
  • Incremental saves: Each phase writes to disk immediately
  • Paper count: Target 35-80 papers in the final paper_db.jsonl (use paper_db.py filter)
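The citation and preprint conventions above can be illustrated with a short sketch. It assumes that `authors` is a list of full-name strings and that the "word" in the key is the first title word outside a small stopword list. Neither detail is specified in this document, and the helper names are hypothetical.

```python
import json
import re

STOPWORDS = {"a", "an", "the", "on", "of", "for", "in", "to", "and"}  # assumed list


def citation_key(first_author: str, year: int, title: str) -> str:
    """Build a firstAuthorYearWord key, e.g. vaswani2017attention."""
    surname = re.sub(r"[^a-z]", "", first_author.split()[-1].lower())
    words = [w for w in re.findall(r"[a-z]+", title.lower()) if w not in STOPWORDS]
    return f"{surname}{year}{words[0] if words else ''}"


def cite(paper: dict) -> str:
    """Format a [@key] citation, adding (preprint) for non-peer-reviewed work."""
    key = citation_key(paper["authors"][0], paper["year"], paper["title"])
    suffix = "" if paper.get("peer_reviewed") else " (preprint)"
    return f"[@{key}]{suffix}"


# A partial record using a subset of the JSONL schema fields.
record = {
    "title": "Attention Is All You Need",
    "authors": ["Ashish Vaswani"],
    "year": 2017,
    "venue": "NeurIPS",
    "peer_reviewed": True,
    "arxiv_id": "1706.03762",
}
print(cite(record))        # [@vaswani2017attention]
print(json.dumps(record))  # one line as it might appear in paper_db.jsonl
```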

References

  • /Users/lingzhi/.claude/skills/deep-research/references/workflow-phases.md: Detailed 6-phase methodology
  • /Users/lingzhi/.claude/skills/deep-research/references/note-format.md: Note templates, BibTeX format, report structure
  • /Users/lingzhi/.claude/skills/deep-research/references/api-reference.md: arXiv, Semantic Scholar, ar5iv API guide

Related Skills

  • Downstream: literature-search, literature-review, citation-management
  • See also: novelty-assessment, survey-generation