github-explorer

Original🇨🇳 Chinese
Translated

Deep-dive analysis of GitHub projects. Use when the user mentions a GitHub repo/project name and wants to understand it — triggered by phrases like "help me look at this project", "learn about XXX", "how is this project", "analyze the repo", or any request to explore/evaluate a GitHub project. Covers architecture, community health, competitive landscape, and cross-platform knowledge sources.

8installs
Added on

NPX Install

npx skill4agent add blessonism/openclaw-skills github-explorer

SKILL.md Content (Chinese)

View Translation Comparison →

GitHub Explorer — In-Depth Project Analysis

Philosophy: The README is just the facade; the real value lies in Issues, Commits, and community discussions.

Workflow

[Project Name] → [1. Locate Repo] → [2. Multi-Source Collection] → [3. Analysis and Judgment] → [4. Structured Output]

Phase 1: Locate the Repo

  • Use
    web_search
    to search
    site:github.com <project_name>
    to confirm the full org/repo path
  • Use
    search-layer
    (Deep Mode + Intent Awareness) to supplement community links and non-GitHub resources:
    bash
    python3 skills/search-layer/scripts/search.py \
      --queries "<project_name> review" "<project_name> evaluation user experience" \
      --mode deep --intent exploratory --num 5
  • Use
    web_fetch
    to crawl the repo homepage for basic information (README, Stars, Forks, License, latest updates)

Phase 2: Multi-Source Collection (Parallel)

⚠️ GitHub Page Crawling Rules (Mandatory): GitHub repo pages are SPAs (client-side rendered),
web_fetch
can only get the navigation bar shell. Do NOT use web_fetch to crawl github.com repo pages. Always use the GitHub API:
  • README:
    curl -s -H "Authorization: token {PAT}" -H "Accept: application/vnd.github.v3.raw" "https://api.github.com/repos/{owner}/{repo}/readme"
  • Repo Metadata:
    curl -s -H "Authorization: token {PAT}" "https://api.github.com/repos/{owner}/{repo}"
  • Issues:
    curl -s -H "Authorization: token {PAT}" "https://api.github.com/repos/{owner}/{repo}/issues?state=all&sort=comments&per_page=10"
  • Commits:
    curl -s -H "Authorization: token {PAT}" "https://api.github.com/repos/{owner}/{repo}/commits?per_page=10"
  • File Tree:
    curl -s -H "Authorization: token {PAT}" "https://api.github.com/repos/{owner}/{repo}/git/trees/{branch}?recursive=1"
See TOOLS.md for PAT.
Check the following sources as needed, collect if available, skip if not:
SourceURL PatternCollected ContentRecommended Tool
GitHub Repo
github.com/{org}/{repo}
README, About, Contributors
web_fetch
GitHub Issues
github.com/{org}/{repo}/issues?q=sort:comments
Top 3-5 high-quality Issues
browser
Chinese CommunitiesWeChat/Zhihu/XiaohongshuIn-depth reviews, usage experience
content-extract
Technical BlogsMedium/Dev.toTechnical architecture analysis
web_fetch
/
content-extract
Discussion ForumsV2EX/RedditUser feedback, pain points
search-layer
(Deep Mode)

search-layer Calling Specifications

search-layer v2 supports intent-aware scoring. Recommended usage for github-explorer scenarios:
ScenarioCommandDescription
Project Research (Default)
python3 skills/search-layer/scripts/search.py --queries "<project> review" "<project> evaluation" --mode deep --intent exploratory --num 5
Parallel multi-query, sorted by authority
Latest Updates
python3 skills/search-layer/scripts/search.py "<project> latest release" --mode deep --intent status --freshness pw --num 5
Prioritize freshness, filter content from the past week
Competitor Comparison
python3 skills/search-layer/scripts/search.py --queries "<project> vs <competitor>" "<project> alternatives" --mode deep --intent comparison --num 5
Comparison intent, dual weighting of keywords and authority
Quick Link Lookup
python3 skills/search-layer/scripts/search.py "<project> official docs" --mode fast --intent resource --num 3
Exact match, fastest speed
Community Discussions
python3 skills/search-layer/scripts/search.py "<project> discussion experience" --mode deep --intent exploratory --domain-boost reddit.com,news.ycombinator.com --num 5
Weighted community sites
Intent Type Quick Reference:
factual
(Factual) /
status
(Status) /
comparison
(Comparison) /
tutorial
(Tutorial) /
exploratory
(Exploratory) /
news
(News) /
resource
(Resource Locator)
Without
--intent
, the behavior is exactly the same as v1 (no scoring, output in original order).
Degradation Rules: If either Exa/Tavily returns 429/5xx → continue using remaining sources; if the entire script fails → fall back to single-source
web_search
.

Extraction Upgrade and Degradation Protocol

Must upgrade from
web_fetch
to
content-extract
when encountering the following situations:
  1. Domain Restrictions:
    mp.weixin.qq.com
    ,
    zhihu.com
    ,
    xiaohongshu.com
    .
  2. Complex Structure: Pages contain a large number of formulas (LaTeX), complex tables, or the Markdown returned by
    web_fetch
    is extremely messy.
  3. Content Missing:
    web_fetch
    returns empty content or a Challenge page due to anti-crawling measures.
Calling Method:
bash
python3 skills/content-extract/scripts/content_extract.py --url <URL>
content-extract internally:
  • First checks the domain whitelist (WeChat/Zhihu, etc.), uses MinerU directly if matched
  • Otherwise, first uses
    web_fetch
    for probing, falls back to MinerU-HTML if failed
  • Returns a unified JSON contract (including fields like
    ok
    ,
    markdown
    ,
    sources
    )

Phase 3: Analysis and Judgment

Make judgments based on collected data:
  • Project Phase: Early Experimentation / Rapid Growth / Mature & Stable / Maintenance Mode / Stagnant (based on commit frequency and content)
  • High-Quality Issue Criteria: High number of comments, maintainer participation, exposes architecture issues, or contains valuable technical discussions
  • Competitor Identification: Extract from the "Comparison"/"Alternatives" section of the README, Issue discussions, and web searches

Phase 4: Structured Output

Strictly follow the template below, each module must have substantive content or clearly marked "Not Found".

Formatting Rules (Mandatory)

  1. Title Must Link to GitHub Repository (Format:
    # [Project Name](https://github.com/org/repo)
    , ensure clickable jump)
  2. Uniform Empty Lines Around Titles (End of previous section → empty line → title → empty line → content, ensure clear visual separation)
  3. Telegram Empty Line Fix (Mandatory): Telegram will swallow empty lines after list items (starting with
    -
    ). Solution: Insert a line of braille space
    (U+2800) between the end of the list block and the next title, format as follows:
    - Last item in list
    
    **Next Title**
    This ensures the empty line before the title is not swallowed during Telegram rendering.
  4. All Titles Bold (emoji + bold text)
  5. Competitor Comparison Must Include Links (GitHub / official website / documentation, at least one)
  6. Community Volume Must Be Specific: Quote specific post/tweet/discussion content summaries, attach original links. Do not use general descriptions like "highly praised" or "very popular"; instead, write "someone said something" or "a certain post discussed specific issues"
  7. Information Traceability Principle: All quoted external information should include the original link so readers can trace the source
markdown
# [{Project Name}]({GitHub Repo URL})

**🎯 One-sentence Positioning**

{What it is, what problem it solves}

**⚙️ Core Mechanism**

{Technical principles/architecture, explained in plain language, not copied from README. Include key tech stack.}

**📊 Project Health**

- **Stars**: {Number}  |  **Forks**: {Number}  |  **License**: {Type}
- **Team/Author**: {Background}
- **Commit Trend**: {Recent activity + project phase judgment}
- **Latest Updates**: {Overview of recent important commits}

**🔥 Selected Issues**

{Top 3-5 high-quality Issues, each including title, link, core discussion points. Note if no high-quality Issues are available.}

**✅ Applicable Scenarios**

{When to use it, what specific problems it solves}

**⚠️ Limitations**

{When to avoid it, known issues}

**🆚 Competitor Comparison**

{Comparison with same-track projects, differences. Each competitor must include a GitHub or official website link, example format:}
- **vs [GraphRAG](https://github.com/microsoft/graphrag)** — Difference description
- **vs [RAGFlow](https://github.com/infiniflow/ragflow)** — Difference description

**🌐 Knowledge Graph**

- **DeepWiki**: {Link or "Not Included"}
- **Zread.ai**: {Link or "Not Included"}

**🎬 Demo**

{Online experience link, or "None"}

**📄 Related Papers**

{arXiv link, or "None"}

**📰 Community Volume**

**X/Twitter**

{Quote specific tweet content summaries + links, example format:}
- [@Username](Link): "Specific content..."
- [Discussion Thread](Link): Discussed specific issues...
{Note "No relevant discussions found" if none are available}

**Chinese Communities**

{Quote specific post titles/content summaries + links, example format:}
- [Zhihu: Post Title](Link) — Discussed content
- [V2EX: Post Title](Link) — Discussed content
{Note "No relevant discussions found" if none are available}

**💬 My Judgment**

{Subjective evaluation: Whether it's worth investing time in, suitable for what level of users, suggestions on how to use it}

Execution Notes

  • Prioritize using
    web_search
    +
    web_fetch
    , with
    browser
    as a fallback
  • Search Enhancement: For project research tasks, default to
    search-layer
    v2 Deep Mode +
    --intent exploratory
    (Brave + Exa + Tavily three-source parallel deduplication + intent-aware scoring), single-source failure does not block the main process
  • Mandatory Crawling Degradation: When
    web_fetch
    fails/returns 403/anti-crawling page/too short content, or the source domain belongs to high-risk sites (such as WeChat/Zhihu/Xiaohongshu): switch to
    content-extract
    (which internally falls back to MinerU-HTML) to get cleaner Markdown + traceable sources
  • Collect from different sources in parallel to improve efficiency
  • All links must be valid and accessible, do not fabricate URLs
  • Output in Chinese, retain English for technical terms

⚠️ Output Self-Check List (Mandatory, Check Item by Item Before Each Output)

Before sending the output report, must check the following items one by one, send only if all are passed:
  • Title Link: In the format
    # [Project Name](GitHub URL)
    , clickable for jump
  • Empty Lines Around Titles: Each bold title (
    **🎯 ...**
    ) has one empty line before and after
  • Telegram Empty Line: There is a braille space
    line between the end of each list block and the next title (prevents Telegram from swallowing empty lines)
  • Issue Links: Each selected Issue is in the complete format
    [#Number Title](Full URL)
  • Competitor Links: Each competitor is attached with
    [Name](GitHub/Official Website Link)
  • Community Volume Links: Each quote is in the format
    [Source: Title](URL)
  • No Vague Descriptions: No general descriptions like "highly praised" or "very popular" in the community volume section
  • Information Traceability: All external quotes are attached with original links

Dependencies

This Skill depends on the following OpenClaw tools and Skills:
DependencyTypePurpose
web_search
Built-in ToolBrave Search retrieval
web_fetch
Built-in ToolWeb content crawling
browser
Built-in ToolDynamic page rendering (fallback)
search-layer
SkillMulti-source search + intent-aware scoring (Brave + Exa + Tavily + Grok), v2.1 supports
--intent
/
--queries
/
--freshness
content-extract
SkillHigh-fidelity content extraction (degradation solution for anti-crawling sites)