github-explorer

Original：🇨🇳 Chinese

Translated

Deep-dive analysis of GitHub projects. Use when the user mentions a GitHub repo/project name and wants to understand it — triggered by phrases like "help me look at this project", "learn about XXX", "how is this project", "analyze the repo", or any request to explore/evaluate a GitHub project. Covers architecture, community health, competitive landscape, and cross-platform knowledge sources.

8installs

Sourceblessonism/openclaw-skills

Added on2026-02-21

NPX Install

npx skill4agent add blessonism/openclaw-skills github-explorer

SKILL.md Content (Chinese)

View Translation Comparison →

GitHub Explorer — In-Depth Project Analysis

Philosophy: The README is just the facade; the real value lies in Issues, Commits, and community discussions.

Workflow

[Project Name] → [1. Locate Repo] → [2. Multi-Source Collection] → [3. Analysis and Judgment] → [4. Structured Output]

Phase 1: Locate the Repo

Use
```
web_search
```
to search
```
site:github.com <project_name>
```
to confirm the full org/repo path

Use

search-layer

(Deep Mode + Intent Awareness) to supplement community links and non-GitHub resources:

bash

python3 skills/search-layer/scripts/search.py \
  --queries "<project_name> review" "<project_name> evaluation user experience" \
  --mode deep --intent exploratory --num 5

Use
```
web_fetch
```
to crawl the repo homepage for basic information (README, Stars, Forks, License, latest updates)

Phase 2: Multi-Source Collection (Parallel)

⚠️ GitHub Page Crawling Rules (Mandatory): GitHub repo pages are SPAs (client-side rendered),

web_fetch

can only get the navigation bar shell. Do NOT use web_fetch to crawl github.com repo pages. Always use the GitHub API:

README:

curl -s -H "Authorization: token {PAT}" -H "Accept: application/vnd.github.v3.raw" "https://api.github.com/repos/{owner}/{repo}/readme"

Repo Metadata:

curl -s -H "Authorization: token {PAT}" "https://api.github.com/repos/{owner}/{repo}"

Issues:

curl -s -H "Authorization: token {PAT}" "https://api.github.com/repos/{owner}/{repo}/issues?state=all&sort=comments&per_page=10"

Commits:

curl -s -H "Authorization: token {PAT}" "https://api.github.com/repos/{owner}/{repo}/commits?per_page=10"

File Tree:

curl -s -H "Authorization: token {PAT}" "https://api.github.com/repos/{owner}/{repo}/git/trees/{branch}?recursive=1"

See TOOLS.md for PAT.

Check the following sources as needed, collect if available, skip if not:

Source	URL Pattern	Collected Content	Recommended Tool
GitHub Repo	`github.com/{org}/{repo}`	README, About, Contributors	`web_fetch`
GitHub Issues	`github.com/{org}/{repo}/issues?q=sort:comments`	Top 3-5 high-quality Issues	`browser`
Chinese Communities	WeChat/Zhihu/Xiaohongshu	In-depth reviews, usage experience	`content-extract`
Technical Blogs	Medium/Dev.to	Technical architecture analysis	`web_fetch` / `content-extract`
Discussion Forums	V2EX/Reddit	User feedback, pain points	`search-layer` (Deep Mode)

search-layer Calling Specifications

search-layer v2 supports intent-aware scoring. Recommended usage for github-explorer scenarios:

Scenario	Command	Description
Project Research (Default)	`python3 skills/search-layer/scripts/search.py --queries "<project> review" "<project> evaluation" --mode deep --intent exploratory --num 5`	Parallel multi-query, sorted by authority
Latest Updates	`python3 skills/search-layer/scripts/search.py "<project> latest release" --mode deep --intent status --freshness pw --num 5`	Prioritize freshness, filter content from the past week
Competitor Comparison	`python3 skills/search-layer/scripts/search.py --queries "<project> vs <competitor>" "<project> alternatives" --mode deep --intent comparison --num 5`	Comparison intent, dual weighting of keywords and authority
Quick Link Lookup	`python3 skills/search-layer/scripts/search.py "<project> official docs" --mode fast --intent resource --num 3`	Exact match, fastest speed
Community Discussions	`python3 skills/search-layer/scripts/search.py "<project> discussion experience" --mode deep --intent exploratory --domain-boost reddit.com,news.ycombinator.com --num 5`	Weighted community sites

Intent Type Quick Reference:

factual

(Factual) /

status

(Status) /

comparison

(Comparison) /

tutorial

(Tutorial) /

exploratory

(Exploratory) /

news

(News) /

resource

(Resource Locator)

Without
--intent
, the behavior is exactly the same as v1 (no scoring, output in original order).

Degradation Rules: If either Exa/Tavily returns 429/5xx → continue using remaining sources; if the entire script fails → fall back to single-source

web_search

Extraction Upgrade and Degradation Protocol

Must upgrade from

web_fetch

content-extract

when encountering the following situations:

Domain Restrictions:

mp.weixin.qq.com

zhihu.com

xiaohongshu.com

Complex Structure: Pages contain a large number of formulas (LaTeX), complex tables, or the Markdown returned by
```
web_fetch
```
is extremely messy.
Content Missing:
```
web_fetch
```
returns empty content or a Challenge page due to anti-crawling measures.

Calling Method:

bash

python3 skills/content-extract/scripts/content_extract.py --url <URL>

content-extract internally:

First checks the domain whitelist (WeChat/Zhihu, etc.), uses MinerU directly if matched
Otherwise, first uses
```
web_fetch
```
for probing, falls back to MinerU-HTML if failed
Returns a unified JSON contract (including fields like
```
ok
```
,
```
markdown
```
,
```
sources
```
)

Phase 3: Analysis and Judgment

Make judgments based on collected data:

Project Phase: Early Experimentation / Rapid Growth / Mature & Stable / Maintenance Mode / Stagnant (based on commit frequency and content)
High-Quality Issue Criteria: High number of comments, maintainer participation, exposes architecture issues, or contains valuable technical discussions
Competitor Identification: Extract from the "Comparison"/"Alternatives" section of the README, Issue discussions, and web searches

Phase 4: Structured Output

Strictly follow the template below, each module must have substantive content or clearly marked "Not Found".

Formatting Rules (Mandatory)

Title Must Link to GitHub Repository (Format:
```
# [Project Name](https://github.com/org/repo)
```
, ensure clickable jump)
Uniform Empty Lines Around Titles (End of previous section → empty line → title → empty line → content, ensure clear visual separation)
Telegram Empty Line Fix (Mandatory): Telegram will swallow empty lines after list items (starting with
```
-
```
). Solution: Insert a line of braille space
```
⠀
```
(U+2800) between the end of the list block and the next title, format as follows:
```
- Last item in list

⠀
**Next Title**
```
This ensures the empty line before the title is not swallowed during Telegram rendering.
All Titles Bold (emoji + bold text)
Competitor Comparison Must Include Links (GitHub / official website / documentation, at least one)
Community Volume Must Be Specific: Quote specific post/tweet/discussion content summaries, attach original links. Do not use general descriptions like "highly praised" or "very popular"; instead, write "someone said something" or "a certain post discussed specific issues"
Information Traceability Principle: All quoted external information should include the original link so readers can trace the source

markdown

# [{Project Name}]({GitHub Repo URL})

**🎯 One-sentence Positioning**

{What it is, what problem it solves}

**⚙️ Core Mechanism**

{Technical principles/architecture, explained in plain language, not copied from README. Include key tech stack.}

**📊 Project Health**

- **Stars**: {Number}  |  **Forks**: {Number}  |  **License**: {Type}
- **Team/Author**: {Background}
- **Commit Trend**: {Recent activity + project phase judgment}
- **Latest Updates**: {Overview of recent important commits}

**🔥 Selected Issues**

{Top 3-5 high-quality Issues, each including title, link, core discussion points. Note if no high-quality Issues are available.}

**✅ Applicable Scenarios**

{When to use it, what specific problems it solves}

**⚠️ Limitations**

{When to avoid it, known issues}

**🆚 Competitor Comparison**

{Comparison with same-track projects, differences. Each competitor must include a GitHub or official website link, example format:}
- **vs [GraphRAG](https://github.com/microsoft/graphrag)** — Difference description
- **vs [RAGFlow](https://github.com/infiniflow/ragflow)** — Difference description

**🌐 Knowledge Graph**

- **DeepWiki**: {Link or "Not Included"}
- **Zread.ai**: {Link or "Not Included"}

**🎬 Demo**

{Online experience link, or "None"}

**📄 Related Papers**

{arXiv link, or "None"}

**📰 Community Volume**

**X/Twitter**

{Quote specific tweet content summaries + links, example format:}
- [@Username](Link): "Specific content..."
- [Discussion Thread](Link): Discussed specific issues...
{Note "No relevant discussions found" if none are available}

**Chinese Communities**

{Quote specific post titles/content summaries + links, example format:}
- [Zhihu: Post Title](Link) — Discussed content
- [V2EX: Post Title](Link) — Discussed content
{Note "No relevant discussions found" if none are available}

**💬 My Judgment**

{Subjective evaluation: Whether it's worth investing time in, suitable for what level of users, suggestions on how to use it}

Execution Notes

Prioritize using
```
web_search
```
+
```
web_fetch
```
, with
```
browser
```
as a fallback
Search Enhancement: For project research tasks, default to
```
search-layer
```
v2 Deep Mode +
```
--intent exploratory
```
(Brave + Exa + Tavily three-source parallel deduplication + intent-aware scoring), single-source failure does not block the main process
Mandatory Crawling Degradation: When
```
web_fetch
```
fails/returns 403/anti-crawling page/too short content, or the source domain belongs to high-risk sites (such as WeChat/Zhihu/Xiaohongshu): switch to
```
content-extract
```
(which internally falls back to MinerU-HTML) to get cleaner Markdown + traceable sources
Collect from different sources in parallel to improve efficiency
All links must be valid and accessible, do not fabricate URLs
Output in Chinese, retain English for technical terms

⚠️ Output Self-Check List (Mandatory, Check Item by Item Before Each Output)

Before sending the output report, must check the following items one by one, send only if all are passed:

Title Link: In the format
```
# [Project Name](GitHub URL)
```
, clickable for jump
Empty Lines Around Titles: Each bold title (
```
**🎯 ...**
```
) has one empty line before and after
Telegram Empty Line: There is a braille space
```
⠀
```
line between the end of each list block and the next title (prevents Telegram from swallowing empty lines)
Issue Links: Each selected Issue is in the complete format
```
[#Number Title](Full URL)
```
Competitor Links: Each competitor is attached with
```
[Name](GitHub/Official Website Link)
```
Community Volume Links: Each quote is in the format
```
[Source: Title](URL)
```
No Vague Descriptions: No general descriptions like "highly praised" or "very popular" in the community volume section
Information Traceability: All external quotes are attached with original links

Dependencies

This Skill depends on the following OpenClaw tools and Skills:

Dependency	Type	Purpose
`web_search`	Built-in Tool	Brave Search retrieval
`web_fetch`	Built-in Tool	Web content crawling
`browser`	Built-in Tool	Dynamic page rendering (fallback)
`search-layer`	Skill	Multi-source search + intent-aware scoring (Brave + Exa + Tavily + Grok), v2.1 supports `--intent` / `--queries` / `--freshness`
`content-extract`	Skill	High-fidelity content extraction (degradation solution for anti-crawling sites)