Explore and analyze GitHub repositories related to a research topic. Reads deep-research output, discovers repos from multiple sources, deeply analyzes code, and produces integration blueprints.
Install:
npx skill4agent add lingzhi227/agent-research-skills github-research

Usage: /github-research <deep-research-output-dir>
Skill directory: ~/.claude/skills/github-research/
Output directory: ./github-research-output/{slug}/
Deep-research files read: paper_db.jsonl, code_repos.md

Phase 1: Intake → Extract refs, URLs, keywords from deep-research output
Phase 2: Discovery → Multi-source broad GitHub search (50-200 repos)
Phase 3: Filtering → Score & rank → select top 15-30 repos
Phase 4: Deep Dive → Clone & deeply analyze top 8-15 repos (code reading)
Phase 5: Analysis → Per-repo reports + cross-repo comparison
Phase 6: Blueprint → Integration/reuse plan for research topic

github-research-output/{slug}/
├── repo_db.jsonl # Master repo database
├── phase1_intake/
│ ├── extracted_refs.jsonl # URLs, keywords, paper-repo links
│ └── intake_summary.md
├── phase2_discovery/
│ ├── search_results/ # Raw JSONL from each search
│ └── discovery_log.md
├── phase3_filtering/
│ ├── ranked_repos.jsonl # Scored & ranked subset
│ └── filtering_report.md
├── phase4_deep_dive/
│ ├── repos/ # Cloned repos (shallow)
│ ├── analyses/ # Per-repo analysis .md files
│ └── deep_dive_summary.md
├── phase5_analysis/
│ ├── comparison_matrix.md # Cross-repo comparison
│ ├── technique_map.md # Paper concept → code mapping
│ └── analysis_report.md
└── phase6_blueprint/
  ├── integration_plan.md # How to combine repos
  ├── reuse_catalog.md # Reusable components catalog
  ├── final_report.md # Complete compiled report
  └── blueprint_summary.md

Scripts directory: ~/.claude/skills/github-research/scripts/

| Script | Purpose | Key Flags |
|---|---|---|
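| `extract_research_refs.py` | Parse deep-research output for GitHub URLs, paper refs, keywords | `--research-dir`, `--output` |
| `search_github.py` | Search GitHub repos | `--query`, `--min-stars`, `--sort`, `--max-results`, `--output` |
| `search_github_code.py` | Search GitHub code for implementations | `--query`, `--language`, `--max-results`, `--output` |
| `search_paperswithcode.py` | Search Papers With Code for paper→repo mappings | `--arxiv-id`, `--output` |
| `repo_db.py` | JSONL repo database management | subcommands: `merge`, `score`, `tag`, `rank`, `filter` |
| `repo_metadata.py` | Fetch detailed metadata | `--repos`, `--input`, `--output`, `--delay` |
| `clone_repo.py` | Shallow-clone repos for analysis | `--repo`, `--output-dir` |
| `analyze_repo_structure.py` | Map file tree, key files, LOC stats | `--repo-dir`, `--output` |
| `extract_dependencies.py` | Extract and parse dependency files | `--repo-dir`, `--output` |
| `find_implementations.py` | Search cloned repo for specific code patterns | `--repo-dir`, `--patterns`, `--output` |
| `repo_readme_fetch.py` | Fetch README without cloning | `--input`, `--output` |
| `compare_repos.py` | Generate comparison matrix across repos | `--input`, `--output` |
| `compile_github_report.py` | Assemble final report from all phases | `--topic-dir` |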
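
The output layout shown earlier can also be created from Python rather than the shell. The following is a minimal sketch, not part of the skill: `make_slug` and `init_output_tree` are hypothetical helper names, and the slug rule assumed here is lowercase, spaces to hyphens, then keep only `a-z`, `0-9` and `-` (ASCII topics assumed).

```python
import tempfile
from pathlib import Path

# Characters kept in a slug after lowercasing (assumed rule: a-z, 0-9, hyphen).
ALLOWED = set("abcdefghijklmnopqrstuvwxyz0123456789-")

# Phase subdirectories, copied from the output layout in this document.
PHASE_DIRS = [
    "phase1_intake",
    "phase2_discovery/search_results",
    "phase3_filtering",
    "phase4_deep_dive/repos",
    "phase4_deep_dive/analyses",
    "phase5_analysis",
    "phase6_blueprint",
]


def make_slug(topic: str) -> str:
    """Hypothetical helper: lowercase, spaces to hyphens, drop everything else."""
    lowered = topic.lower().replace(" ", "-")
    return "".join(ch for ch in lowered if ch in ALLOWED)


def init_output_tree(base: Path, slug: str) -> Path:
    """Hypothetical helper: create <base>/github-research-output/<slug>/ and its phase folders."""
    root = base / "github-research-output" / slug
    for sub in PHASE_DIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        root = init_output_tree(Path(tmp), make_slug("Multi-Agent LLM Coordination"))
        for path in sorted(p for p in root.rglob("*") if p.is_dir()):
            print(path.relative_to(root).as_posix())
```

Because `mkdir` is called with `exist_ok=True`, re-running the setup on an existing topic directory leaves earlier phase outputs in place.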
SLUG=$(echo "$TOPIC" | tr '[:upper:]' '[:lower:]' | tr ' ' '-' | tr -cd 'a-z0-9-')
mkdir -p github-research-output/$SLUG/{phase1_intake,phase2_discovery/search_results,phase3_filtering,phase4_deep_dive/{repos,analyses},phase5_analysis,phase6_blueprint}

Phase 1: Intake

python ~/.claude/skills/github-research/scripts/extract_research_refs.py \
--research-dir <deep-research-output-dir> \
--output github-research-output/$SLUG/phase1_intake/extracted_refs.jsonl

Summary file: phase1_intake/intake_summary.md
Outputs: extracted_refs.jsonl, intake_summary.md

Phase 2: Discovery

python ~/.claude/skills/github-research/scripts/repo_metadata.py \
--repos owner1/name1 owner2/name2 ... \
--output github-research-output/$SLUG/phase2_discovery/search_results/direct_urls.jsonl

python ~/.claude/skills/github-research/scripts/search_paperswithcode.py \
--arxiv-id 2401.12345 \
--output github-research-output/$SLUG/phase2_discovery/search_results/pwc_2401.12345.jsonl

python ~/.claude/skills/github-research/scripts/search_github.py \
--query "multi-agent LLM coordination" \
--min-stars 10 --sort stars --max-results 50 \
--output github-research-output/$SLUG/phase2_discovery/search_results/gh_query1.jsonl

python ~/.claude/skills/github-research/scripts/search_github_code.py \
--query "class MultiAgentOrchestrator" \
--language python --max-results 30 \
--output github-research-output/$SLUG/phase2_discovery/search_results/code_query1.jsonl

python ~/.claude/skills/github-research/scripts/repo_readme_fetch.py \
--input <repos.jsonl> \
--output github-research-output/$SLUG/phase2_discovery/search_results/readmes.jsonl

python ~/.claude/skills/github-research/scripts/repo_db.py merge \
--inputs github-research-output/$SLUG/phase2_discovery/search_results/*.jsonl \
--output github-research-output/$SLUG/repo_db.jsonl

Discovery log: phase2_discovery/discovery_log.md
Rate limiting: --delay 1.0
Outputs: repo_db.jsonl, discovery_log.md

Phase 3: Filtering

python ~/.claude/skills/github-research/scripts/repo_metadata.py \
--input github-research-output/$SLUG/repo_db.jsonl \
--output github-research-output/$SLUG/repo_db.jsonl \
--delay 0.5

python ~/.claude/skills/github-research/scripts/repo_db.py score \
--input github-research-output/$SLUG/repo_db.jsonl \
--output github-research-output/$SLUG/repo_db.jsonl

Assign a relevance_score to each repo:

python ~/.claude/skills/github-research/scripts/repo_db.py tag \
--input github-research-output/$SLUG/repo_db.jsonl \
--ids owner/name --tags "relevance:0.85"

python ~/.claude/skills/github-research/scripts/repo_db.py score \
--input github-research-output/$SLUG/repo_db.jsonl \
--output github-research-output/$SLUG/repo_db.jsonl
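
The scoring formulas listed for this phase (`activity_score`, `quality_score`, `composite_score`) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in `repo_db.py`: the function names are hypothetical, and `normalize` is assumed here to be min-max scaling over the candidate set, since the document does not define it.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def activity_score(days_since_push: int, has_recent_commits: bool,
                   open_issues_ratio: float) -> float:
    # Booleans multiply as 0 or 1, matching the formula as written.
    raw = ((days_since_push < 90) * 0.4
           + has_recent_commits * 0.3
           + open_issues_ratio * 0.3)
    return sigmoid(raw)


def raw_quality(stars: int, forks: int, has_license: bool,
                has_readme: bool, archived: bool) -> float:
    # The un-normalized weighted sum inside normalize(...).
    return (math.log(stars + 1) * 0.3
            + math.log(forks + 1) * 0.2
            + has_license * 0.15
            + has_readme * 0.15
            + (not archived) * 0.2)


def normalize(values: list[float]) -> list[float]:
    # ASSUMPTION: min-max scaling to [0, 1]; the source does not say which
    # normalization is used. A constant list maps to all zeros.
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_score(relevance: float, quality: float, activity: float) -> float:
    return relevance * 0.4 + quality * 0.35 + activity * 0.25


if __name__ == "__main__":
    # Two made-up repos, purely for illustration.
    quality = normalize([
        raw_quality(1200, 150, True, True, False),
        raw_quality(8, 1, False, True, False),
    ])
    activity = activity_score(days_since_push=12, has_recent_commits=True,
                              open_issues_ratio=0.1)
    print(round(composite_score(0.85, quality[0], activity), 3))
```

One property worth knowing when reading the scores: if `open_issues_ratio` lies in [0, 1], the weighted sum passed to the sigmoid lies in [0, 1], so `activity_score` as written falls between 0.5 and about 0.73.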
python ~/.claude/skills/github-research/scripts/repo_db.py rank \
--input github-research-output/$SLUG/repo_db.jsonl \
--output github-research-output/$SLUG/phase3_filtering/ranked_repos.jsonl \
--by composite_score

python ~/.claude/skills/github-research/scripts/repo_db.py filter \
--input github-research-output/$SLUG/phase3_filtering/ranked_repos.jsonl \
--output github-research-output/$SLUG/phase3_filtering/ranked_repos.jsonl \
--max-repos 30 --not-archived

Filtering report: phase3_filtering/filtering_report.md

Scoring formulas:
activity_score = sigmoid((days_since_push < 90) * 0.4 + has_recent_commits * 0.3 + open_issues_ratio * 0.3)
quality_score = normalize(log(stars+1) * 0.3 + log(forks+1) * 0.2 + has_license * 0.15 + has_readme * 0.15 + not_archived * 0.2)
composite_score = relevance * 0.4 + quality * 0.35 + activity * 0.25

Outputs: ranked_repos.jsonl, filtering_report.md

Phase 4: Deep Dive

python ~/.claude/skills/github-research/scripts/clone_repo.py \
--repo owner/name \
--output-dir github-research-output/$SLUG/phase4_deep_dive/repos/

python ~/.claude/skills/github-research/scripts/analyze_repo_structure.py \
--repo-dir github-research-output/$SLUG/phase4_deep_dive/repos/name/ \
--output github-research-output/$SLUG/phase4_deep_dive/analyses/name_structure.json

python ~/.claude/skills/github-research/scripts/extract_dependencies.py \
--repo-dir github-research-output/$SLUG/phase4_deep_dive/repos/name/ \
--output github-research-output/$SLUG/phase4_deep_dive/analyses/name_deps.json

python ~/.claude/skills/github-research/scripts/find_implementations.py \
--repo-dir github-research-output/$SLUG/phase4_deep_dive/repos/name/ \
--patterns "class Transformer" "def forward" "attention" \
--output github-research-output/$SLUG/phase4_deep_dive/analyses/name_impls.jsonl

Per-repo analysis: phase4_deep_dive/analyses/{name}_analysis.md
Summary file: phase4_deep_dive/deep_dive_summary.md
Outputs: repos/, analyses/, deep_dive_summary.md

Phase 5: Analysis

python ~/.claude/skills/github-research/scripts/compare_repos.py \
--input github-research-output/$SLUG/phase4_deep_dive/analyses/ \
--output github-research-output/$SLUG/phase5_analysis/comparison.json

Reports: phase5_analysis/comparison_matrix.md, phase5_analysis/technique_map.md, phase5_analysis/analysis_report.md
Outputs: comparison_matrix.md, technique_map.md, analysis_report.md

Phase 6: Blueprint

Plans: phase6_blueprint/integration_plan.md, phase6_blueprint/reuse_catalog.md

python ~/.claude/skills/github-research/scripts/compile_github_report.py \
--topic-dir github-research-output/$SLUG/

Summary file: phase6_blueprint/blueprint_summary.md
Outputs: integration_plan.md, reuse_catalog.md, final_report.md, blueprint_summary.md

Composite score: relevance × 0.4 + quality × 0.35 + activity × 0.25
Also referenced: gh, repo_id

Related files:
references/phase-guide.md
~/.claude/skills/deep-research/SKILL.md
~/.claude/skills/deep-research/scripts/paper_db.py
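
The Phase 3 rank and filter steps (`rank --by composite_score`, then `filter --max-repos 30 --not-archived`) amount to a sort and a cut over JSONL rows. The sketch below shows one way that selection could work; it is not the `repo_db.py` implementation. The helper names and the `archived` field are assumptions made for illustration, while `repo_id` and `composite_score` are the identifiers this document mentions.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read one JSON object per non-empty line."""
    with Path(path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def dump_jsonl(rows, path):
    """Write one JSON object per line."""
    with Path(path).open("w", encoding="utf-8") as fh:
        for row in rows:
            fh.write(json.dumps(row) + "\n")


def rank_and_filter(rows, by="composite_score", max_repos=30, not_archived=True):
    """Hypothetical helper: drop archived repos, sort descending, keep the top N.

    ASSUMPTION: each row may carry a boolean "archived" field; rows without
    it are treated as not archived, and a missing score counts as 0.0.
    """
    kept = [r for r in rows if not (not_archived and r.get("archived", False))]
    kept.sort(key=lambda r: r.get(by, 0.0), reverse=True)
    return kept[:max_repos]


if __name__ == "__main__":
    # Made-up rows, purely for illustration.
    rows = [
        {"repo_id": "a/x", "composite_score": 0.4},
        {"repo_id": "b/y", "composite_score": 0.9, "archived": True},
        {"repo_id": "c/z", "composite_score": 0.7},
    ]
    for row in rank_and_filter(rows, max_repos=2):
        print(row["repo_id"], row["composite_score"])
```

Filtering before sorting keeps archived repos from occupying slots in the top N, which matters when `max_repos` is small relative to the candidate set.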