Deep Research Skill
Trigger
Activate this skill when the user:
- Says "research a topic", "literature review", "find papers about", or "survey papers on"
- Says "deep dive into [topic]" or "what's the state of the art in [topic]"
- Uses the slash command
Overview
This skill conducts systematic academic literature reviews in 6 phases, producing structured notes, a curated paper database, and a synthesized final report. Output is organized by phase for clarity.
Installation: ~/.claude/skills/deep-research/ (scripts, references, and this skill definition)
Output: /Users/lingzhi/Code/deep-research-output/{slug}/
Paper Quality Policy
Peer-reviewed conference papers take priority over arXiv preprints. Many arXiv papers have not undergone peer review and may contain unverified claims.
Source Priority (highest to lowest)
- Top AI conferences: NeurIPS, ICLR, ICML, ACL, EMNLP, NAACL, AAAI, IJCAI, CVPR, KDD, CoRL
- Peer-reviewed journals: JMLR, TACL, Nature, Science, etc.
- Workshop papers: NeurIPS/ICML workshops (lower bar but still reviewed)
- arXiv preprints with high citations: Likely high-quality but unverified
- Recent arXiv preprints: Use cautiously, note "preprint" status explicitly
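As a rough illustration of the priority order above, the sketch below assigns each paper record a tier from 1 (highest) to 5 (lowest). The field names `venue_normalized`, `peer_reviewed`, and `citationCount` come from this skill's JSONL schema. The journal set, the workshop check, the handling of other peer-reviewed venues, and the citation threshold of 100 are assumptions made for the example and are not taken from the skill's scripts.

```python
TOP_CONFERENCES = {
    "neurips", "iclr", "icml", "acl", "emnlp", "naacl",
    "aaai", "ijcai", "cvpr", "kdd", "corl",
}
JOURNALS = {"jmlr", "tacl", "nature", "science"}  # assumed, non-exhaustive

# Assumption: the policy says "high citations" without giving a number.
HIGH_CITATION_THRESHOLD = 100


def source_tier(paper: dict) -> int:
    """Return 1 (highest priority) to 5 (lowest) for a paper record."""
    venue = (paper.get("venue_normalized") or "").lower()
    if "workshop" in venue:
        return 3
    if venue in TOP_CONFERENCES:
        return 1
    if venue in JOURNALS:
        return 2
    if paper.get("peer_reviewed"):
        # Assumption: other reviewed venues rank alongside workshops.
        return 3
    if (paper.get("citationCount") or 0) >= HIGH_CITATION_THRESHOLD:
        return 4
    return 5


def rank_papers(papers: list[dict]) -> list[dict]:
    """Sort by tier first, then by citation count (descending) within a tier."""
    return sorted(
        papers,
        key=lambda p: (source_tier(p), -(p.get("citationCount") or 0)),
    )
```

Sorting on a `(tier, -citations)` tuple keeps the policy readable: venue quality always dominates, and citations only break ties within a tier.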
When to Use arXiv Papers
- As supplementary evidence alongside peer-reviewed work
- For very recent results (< 3 months old) not yet published at conferences
- When no peer-reviewed version exists yet (note this in the citation)
- For survey/review papers (these are useful even without peer review)
Search Tools (by priority)
1. paper_finder (primary — conference papers only)
Location: /Users/lingzhi/Code/documents/tool/paper_finder/paper_finder.py
Searches ai-paper-finder.info (HuggingFace Space) for published conference papers. Supports filtering by conference + year. Outputs JSONL with BibTeX.
bash
# Scrape search results for the queries and venues in the config
python /Users/lingzhi/Code/documents/tool/paper_finder/paper_finder.py --mode scrape --config <config.yaml>
# Download the papers listed in a results JSONL file
python /Users/lingzhi/Code/documents/tool/paper_finder/paper_finder.py --mode download --jsonl <results.jsonl>
# List the supported venues
python /Users/lingzhi/Code/documents/tool/paper_finder/paper_finder.py --list-venues
Config example:
yaml
searches:
  - query: "long horizon reasoning agent"
    num_results: 100
    venues:
      neurips: [2024, 2025]
      iclr: [2024, 2025, 2026]
      icml: [2024, 2025]
output:
  root: /Users/lingzhi/Code/deep-research-output/{slug}/phase1_frontier/search_results
  overwrite: true
2. search_semantic_scholar.py (supplementary — citation data + broader coverage)
Location: /Users/lingzhi/.claude/skills/deep-research/scripts/search_semantic_scholar.py
Supports
and
filters. API key:
/Users/lingzhi/Code/keys.md
(field
)
3. search_arxiv.py (supplementary — latest preprints)
Location: /Users/lingzhi/.claude/skills/deep-research/scripts/search_arxiv.py
For searching recent papers not yet published at conferences. Mark citations with
.
Other Scripts
| Script | Location | Key Flags |
|---|---|---|
| | ~/.claude/skills/deep-research/scripts/ | , , , |
| | ~/.claude/skills/deep-research/scripts/ | , , , |
| | ~/.claude/skills/deep-research/scripts/ | subcommands: , , , , , , |
| | ~/.claude/skills/deep-research/scripts/ | , , |
| | ~/.claude/skills/deep-research/scripts/ | |
WebFetch Mode (no Bash)
- Paper discovery: + to query Semantic Scholar/arXiv APIs
- Paper reading: on ar5iv HTML or tool on downloaded PDFs
- Writing: tool for JSONL, notes, report files
6-Phase Workflow
Phase 1: Frontier
Search the latest conference proceedings and preprints to understand current trends.
- Write phase1_frontier/paper_finder_config.yaml targeting the latest 1-2 years
- Run paper_finder scrape
- Use WebSearch to find the latest accepted-paper lists
- Identify trending directions and key breakthroughs
→ Output: phase1_frontier/frontier.md, phase1_frontier/search_results/
Phase 2: Survey
Build a comprehensive landscape over a broader time range. Target 35-80 papers after filtering.
- Write phase2_survey/paper_finder_config.yaml covering 2023-2025
- Run paper_finder + Semantic Scholar + arXiv
- Merge all results: python /Users/lingzhi/.claude/skills/deep-research/scripts/paper_db.py merge
- Filter to the 35-80 most relevant: python /Users/lingzhi/.claude/skills/deep-research/scripts/paper_db.py filter --min-score 0.80 --max-papers 70
- Cluster by theme, write survey notes
→ Output: ,
phase2_survey/search_results/
,
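The merge and filter steps above can be pictured with the following minimal sketch. It is not the implementation of paper_db.py, whose internals are not shown here. Deduplicating on a normalized title, preferring the peer-reviewed copy of a duplicate, and reading relevance from a field named `score` are all assumptions made for illustration.

```python
import re


def title_key(paper: dict) -> str:
    """Normalize a title so trivial formatting differences do not block a match."""
    return re.sub(r"[^a-z0-9]+", " ", paper.get("title", "").lower()).strip()


def merge_results(*sources: list[dict]) -> list[dict]:
    """Merge result lists, keeping one record per normalized title."""
    merged: dict[str, dict] = {}
    for source in sources:
        for paper in source:
            key = title_key(paper)
            existing = merged.get(key)
            # Prefer the peer-reviewed record when a preprint duplicates it.
            if existing is None or (
                paper.get("peer_reviewed") and not existing.get("peer_reviewed")
            ):
                merged[key] = paper
    return list(merged.values())


def filter_papers(papers: list[dict], min_score: float = 0.80, max_papers: int = 70) -> list[dict]:
    """Keep papers at or above min_score, best first, capped at max_papers."""
    # "score" is a hypothetical relevance field, not part of the JSONL schema.
    kept = [p for p in papers if p.get("score", 0.0) >= min_score]
    kept.sort(key=lambda p: p["score"], reverse=True)
    return kept[:max_papers]
```

The defaults mirror the `--min-score 0.80 --max-papers 70` flags in the command above, so the cap keeps the database inside the 35-80 target range.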
Phase 3: Deep Dive
Select 8-15 papers, preferring peer-reviewed papers for deep reading. Write the selection rationale, then read each paper fully and take structured notes.
→ Output:
phase3_deep_dive/selection.md
,
phase3_deep_dive/deep_dive.md
,
Phase 4: Code & Tools
Extract GitHub URLs from the papers, then web-search for implementations and benchmarks.
→ Output: phase4_code/code_repos.md
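One way to do the URL extraction step is a regular expression over the paper text or abstract. This is a minimal sketch under the assumption that repositories are cited as plain `github.com/<owner>/<repo>` links. The function name is hypothetical, and the skill's own scripts may work differently.

```python
import re

# Matches the owner and repository segments of a GitHub URL.
GITHUB_REPO = re.compile(
    r"https?://github\.com/([A-Za-z0-9_.-]+)/([A-Za-z0-9_.-]+)"
)


def extract_github_repos(text: str) -> list[str]:
    """Return unique repository URLs in order of first appearance."""
    repos: list[str] = []
    for owner, repo in GITHUB_REPO.findall(text):
        repo = repo.rstrip(".")  # drop sentence-ending punctuation
        if repo.endswith(".git"):
            repo = repo[: -len(".git")]
        if not repo:
            continue
        url = f"https://github.com/{owner}/{repo}"
        if url not in repos:
            repos.append(url)
    return repos
```

Links to subpaths such as `/tree/main` collapse to the repository root, which keeps code_repos.md free of near-duplicate entries.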
Phase 5: Synthesis
Perform cross-paper analysis, weighting peer-reviewed findings higher. Produce a taxonomy, comparative tables, and a gap analysis.
→ Output:
phase5_synthesis/synthesis.md
,
Phase 6: Compilation
Assemble final report. Mark preprint citations with
suffix.
→ Output:
,
phase6_report/references.bib
Output Directory
output/{topic-slug}/
├── paper_db.jsonl # Master database (accumulated)
├── phase1_frontier/
│ ├── paper_finder_config.yaml
│ ├── search_results/
│ └── frontier.md
├── phase2_survey/
│ ├── paper_finder_config.yaml
│ ├── search_results/
│ └── survey.md
├── phase3_deep_dive/
│ ├── papers/
│ ├── selection.md
│ └── deep_dive.md
├── phase4_code/
│ └── code_repos.md
├── phase5_synthesis/
│ ├── synthesis.md
│ └── gaps.md
└── phase6_report/
    ├── report.md
    └── references.bib
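The layout above could be created up front with a few lines of Python, so that each phase can write to disk as soon as it has results. This is a sketch only: the function name and the `root` argument are assumptions, and the files inside each phase directory are left for the phases themselves to write.

```python
from pathlib import Path

# Directories taken from the tree above; files are created later by each phase.
PHASE_DIRS = [
    "phase1_frontier/search_results",
    "phase2_survey/search_results",
    "phase3_deep_dive/papers",
    "phase4_code",
    "phase5_synthesis",
    "phase6_report",
]


def init_output_dir(root: str, slug: str) -> Path:
    """Create the per-topic directory skeleton and an empty master database."""
    base = Path(root) / slug
    for rel in PHASE_DIRS:
        (base / rel).mkdir(parents=True, exist_ok=True)
    # touch() leaves an existing database untouched, so reruns are safe.
    (base / "paper_db.jsonl").touch(exist_ok=True)
    return base
```

Because every call uses `exist_ok=True`, rerunning the function on an existing topic directory does not overwrite earlier phase output.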
Key Conventions
- Paper IDs: Use when available, otherwise Semantic Scholar
- Citations: format, key = firstAuthorYearWord (e.g., )
- JSONL schema: title, authors, abstract, year, venue, venue_normalized, peer_reviewed, citationCount, paperId, arxiv_id, pdf_url, tags, source
- Preprint marking: Always note when citing non-peer-reviewed work
- Incremental saves: Each phase writes to disk immediately
- Paper count: Target 35-80 papers in final paper_db.jsonl (use )
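Two of the conventions above, the firstAuthorYearWord citation key and incremental saves, can be sketched as follows. The details are assumptions: `authors` is taken to be a list of "First Last" strings, the key is lowercased, and the "word" is the first title word that is not a common stopword. The helper names are hypothetical.

```python
import json
import re

STOPWORDS = {"a", "an", "the", "on", "of", "for", "in", "to", "and", "with"}


def citation_key(paper: dict) -> str:
    """Build a firstAuthorYearWord key, e.g. surname + year + first title word."""
    first_author = paper["authors"][0]
    surname = re.sub(r"[^a-z]", "", first_author.split()[-1].lower())
    words = re.findall(r"[a-z0-9]+", paper["title"].lower())
    first_word = next((w for w in words if w not in STOPWORDS), "")
    return f"{surname}{paper['year']}{first_word}"


def append_paper(db_path: str, paper: dict) -> None:
    """Append one record to the JSONL database so progress is saved immediately."""
    with open(db_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(paper, ensure_ascii=False) + "\n")
```

Appending one JSON object per line means an interrupted run loses at most the record being written, which is the point of saving incrementally.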
References
- /Users/lingzhi/.claude/skills/deep-research/references/workflow-phases.md: Detailed 6-phase methodology
- /Users/lingzhi/.claude/skills/deep-research/references/note-format.md: Note templates, BibTeX format, report structure
- /Users/lingzhi/.claude/skills/deep-research/references/api-reference.md: arXiv, Semantic Scholar, ar5iv API guide
Related Skills
- Downstream: literature-search, literature-review, citation-management
- See also: novelty-assessment, survey-generation