# GitHub Explorer — In-Depth Project Analysis
Philosophy: The README is just the facade; the real value lies in Issues, Commits, and community discussions.
## Workflow

[Project Name] → [1. Locate Repo] → [2. Multi-Source Collection] → [3. Analysis & Judgment] → [4. Structured Output]

## Phase 1: Locate the Repo
- Use `web_search` with `site:github.com <project_name>` to confirm the full org/repo path
- Use `search-layer` (Deep mode + intent awareness) to supplement community links and non-GitHub resources:

  ```bash
  # The second query is deliberately in Chinese to surface Chinese-community reviews
  python3 skills/search-layer/scripts/search.py \
    --queries "<project_name> review" "<project_name> 评测 使用体验" \
    --mode deep --intent exploratory --num 5
  ```

- Use `web_fetch` to fetch the repo homepage for basic information (README, Stars, Forks, License, recent updates)
## Phase 2: Multi-Source Collection (Parallel)
⚠️ **GitHub Page Crawling Rules (Mandatory):** GitHub repo pages are SPAs (client-side rendered), so `web_fetch` only gets the navigation-bar shell. Never use `web_fetch` on github.com repo pages. Always use the GitHub API:

- README: `curl -s -H "Authorization: token {PAT}" -H "Accept: application/vnd.github.v3.raw" "https://api.github.com/repos/{owner}/{repo}/readme"`
- Repo metadata: `curl -s -H "Authorization: token {PAT}" "https://api.github.com/repos/{owner}/{repo}"`
- Issues: `curl -s -H "Authorization: token {PAT}" "https://api.github.com/repos/{owner}/{repo}/issues?state=all&sort=comments&per_page=10"`
- Commits: `curl -s -H "Authorization: token {PAT}" "https://api.github.com/repos/{owner}/{repo}/commits?per_page=10"`
- File tree: `curl -s -H "Authorization: token {PAT}" "https://api.github.com/repos/{owner}/{repo}/git/trees/{branch}?recursive=1"`
See TOOLS.md for PAT.
Check the following sources as needed, collect if available, skip if not:
| Source | URL Pattern | Collected Content | Recommended Tool |
|---|---|---|---|
| GitHub Repo | | README, About, Contributors | |
| GitHub Issues | | Top 3-5 high-quality Issues | |
| Chinese Communities | WeChat/Zhihu/Xiaohongshu | In-depth reviews, usage experience | |
| Technical Blogs | Medium/Dev.to | Technical architecture analysis | |
| Discussion Forums | V2EX/Reddit | User feedback, pain points | |
## search-layer Invocation Conventions
search-layer v2 supports intent-aware scoring. Recommended usage for github-explorer scenarios:
| Scenario | Command | Description |
|---|---|---|
| Project Research (Default) | | Parallel multi-query, sorted by authority |
| Latest Updates | | Prioritize freshness, filter content from the past week |
| Competitor Comparison | | Comparison intent, dual weighting of keywords and authority |
| Quick Link Lookup | | Exact match, fastest speed |
| Community Discussions | | Weighted community sites |
Intent Type Quick Reference: (Factual) / (Status) / (Comparison) / (Tutorial) / (Exploratory) / (News) / (Resource Locator)
factualstatuscomparisontutorialexploratorynewsresourceWithout, the behavior is exactly the same as v1 (no scoring, output in original order).--intent
Degradation Rules: If either Exa/Tavily returns 429/5xx → continue using remaining sources; if the entire script fails → fall back to single-source .
web_search抓取降级与增强协议 (Extraction Upgrade)
## Extraction Upgrade and Degradation Protocol
You must upgrade from `web_fetch` to `content-extract` in the following situations:

- Domain restrictions: mp.weixin.qq.com, zhihu.com, xiaohongshu.com
- Complex structure: the page contains many formulas (LaTeX) or complex tables, or the Markdown returned by `web_fetch` is extremely messy
- Missing content: `web_fetch` returns empty content or a challenge page due to anti-crawling measures
Calling Method:
bash
python3 skills/content-extract/scripts/content_extract.py --url <URL>content-extract internally:
- First checks the domain whitelist (WeChat/Zhihu, etc.), uses MinerU directly if matched
- Otherwise, first uses for probing, falls back to MinerU-HTML if failed
web_fetch - Returns a unified JSON contract (including fields like ,
ok,markdown)sources
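A small sketch of consuming that contract from an orchestrating script. The field names `ok`, `markdown`, and `sources` come from the contract described above; the `error` field is an assumption:

```python
import json
import subprocess

def run_content_extract(url: str) -> dict:
    """Invoke the skill script and parse its unified JSON contract from stdout."""
    proc = subprocess.run(
        ["python3", "skills/content-extract/scripts/content_extract.py", "--url", url],
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)

def markdown_or_fail(doc: dict) -> str:
    """Return the extracted Markdown if the contract reports success."""
    if not doc.get("ok"):
        raise RuntimeError(f"extraction failed: {doc.get('error', 'unknown')}")
    return doc["markdown"]
```

Checking `ok` before touching `markdown` keeps a failed extraction from silently producing an empty report section.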
## Phase 3: Analysis and Judgment

Make judgments based on the collected data:

- Project phase: Early Experimentation / Rapid Growth / Mature & Stable / Maintenance Mode / Stagnant (based on commit frequency and content)
- High-quality Issue criteria: many comments, maintainer participation, exposes architectural problems, or contains valuable technical discussion
- Competitor identification: extract from the README's "Comparison"/"Alternatives" sections, Issue discussions, and web searches
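The phase judgment can be made mechanical. A rough heuristic sketch fed by the commit dates collected in Phase 2; the thresholds here are illustrative assumptions, not part of this spec:

```python
from datetime import datetime, timedelta, timezone

def project_phase(commit_dates, created, now=None):
    """Classify a repo's phase from recent commit dates (e.g. the last 10 commits)."""
    now = now or datetime.now(timezone.utc)
    if not commit_dates or now - max(commit_dates) > timedelta(days=365):
        return "stagnant"                 # no commits for a year
    if now - max(commit_dates) > timedelta(days=90):
        return "maintenance mode"         # alive, but only just
    if now - created < timedelta(days=180):
        return "early experimentation"    # young repo, still forming
    recent = [d for d in commit_dates if now - d <= timedelta(days=30)]
    return "rapid growth" if len(recent) >= 5 else "mature & stable"
```

Commit content (refactors vs. typo fixes) should still be judged by reading the messages; this only covers the frequency half of the criterion.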
## Phase 4: Structured Output

Output strictly according to the template below; every module must contain substantive content or be explicitly marked "Not Found".
### Formatting Rules (Mandatory)

- The title must link to the GitHub repository (format: `# [Project Name](https://github.com/org/repo)`, so it is clickable)
- Blank lines before and after every title (end of previous section → blank line → title → blank line → content, for clear visual separation)
- Telegram blank-line fix (mandatory): Telegram swallows blank lines after list items (lines starting with `-`). Solution: between the end of a list block and the next title, insert a line containing only a braille blank (U+2800):

  - last list item
  ⠀
  **Next Title**

  This ensures the blank line before the title is not swallowed when Telegram renders the report.
- All titles bold (emoji + bold text)
- Competitor comparisons must include links (GitHub / official site / docs, at least one)
- Community buzz must be specific: quote concrete post/tweet/discussion summaries with their original links. Do not write generic descriptions like "highly praised" or "very popular"; write "user X said ..." or "post Y discussed issue Z"
- Information traceability: every external reference should carry its original link so readers can trace it to the source
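The braille-blank rule can be applied as a mechanical post-processing pass over the finished Markdown. A sketch; the detection logic is a simplifying assumption (it only looks for a list item followed by a blank line and a bold title):

```python
BRAILLE_BLANK = "\u2800"  # "⠀": an invisible character Telegram does not strip

def fix_telegram_blanks(md: str) -> str:
    """After a list item that precedes a blank line and a bold title,
    insert a braille-blank line so the visual gap survives Telegram."""
    lines = md.split("\n")
    out = []
    for i, line in enumerate(lines):
        out.append(line)
        list_before_title = (
            line.lstrip().startswith("- ")
            and i + 2 < len(lines)
            and lines[i + 1].strip() == ""
            and lines[i + 2].startswith("**")
        )
        if list_before_title:
            out.append(BRAILLE_BLANK)
    return "\n".join(out)
```

Running this as the last step before sending keeps authors from having to remember U+2800 by hand.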
```markdown
# [{Project Name}]({GitHub Repo URL})

**🎯 One-Sentence Positioning**
{What it is, what problem it solves}

**⚙️ Core Mechanism**
{Technical principles/architecture explained in plain language, not copied from the README. Include the key tech stack.}

**📊 Project Health**
- Stars: {count} | Forks: {count} | License: {type}
- Team/author: {background}
- Commit trend: {recent activity + project-phase judgment}
- Recent activity: {overview of the latest important commits}

**🔥 Selected Issues**
{Top 3-5 high-quality Issues, each with title, link, and core discussion points. Note explicitly if none qualify.}

**✅ Applicable Scenarios**
{When to use it, what specific problems it solves}

**⚠️ Limitations**
{When to avoid it, known issues}

**🆚 Competitor Comparison**
{Comparison with projects in the same space and how they differ. Each competitor must include a GitHub or official-site link.}

**🌐 Knowledge Graph**
- DeepWiki: {link or "Not indexed"}
- Zread.ai: {link or "Not indexed"}

**🎬 Demo**
{Online demo link, or "None"}

**📄 Related Papers**
{arXiv link, or "None"}

**📰 Community Buzz**
X/Twitter
{Specific tweet summaries + links, for example:}
- @username: "what they actually said..."
- A thread: what specific issue it discussed... {If none found, note "No relevant discussions found"}

Chinese communities
{Specific post titles/summaries + links, for example:}
- Zhihu: post title (what it discussed)
- V2EX: post title (what it discussed) {If none found, note "No relevant discussions found"}

**💬 My Judgment**
{Subjective take: whether it is worth the time, what level of user it suits, and how to use it}
```
## Execution Notes
- Prefer `web_search` + `web_fetch`, with `browser` as a fallback
- Search enhancement: project-research tasks default to search-layer v2 Deep mode + `--intent exploratory` (Brave + Exa + Tavily three-source parallel deduplication + intent-aware scoring); a single-source failure does not block the main flow
- Crawl degradation (mandatory): when `web_fetch` fails, returns 403, hits an anti-crawling page, or yields too little body text, or when the source domain is a high-risk site (WeChat/Zhihu/Xiaohongshu), switch to `content-extract` (which internally falls back to MinerU-HTML) to get cleaner Markdown plus traceable sources
- Collect from different sources in parallel for efficiency
- All links must be real and accessible; never fabricate URLs
- Output in Chinese, keeping technical terms in English
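The parallel-collection note can be sketched with a thread pool; the source callables below are placeholders for the `web_search` / `web_fetch` / `content-extract` calls described above:

```python
from concurrent.futures import ThreadPoolExecutor

def collect_parallel(tasks: dict) -> dict:
    """Run all source fetchers concurrently; a failing source yields None
    instead of blocking the others (matching the degradation rules above)."""
    def safe(fn):
        try:
            return fn()
        except Exception:
            return None  # one dead source must not sink the whole collection
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        futures = {name: pool.submit(safe, fn) for name, fn in tasks.items()}
        return {name: fut.result() for name, fut in futures.items()}
```

Threads suit this workload because every task is I/O-bound network fetching.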
## ⚠️ Output Self-Check List (Mandatory; Verify Every Item Before Each Output)

Before sending a report, check each item below; send only when all of them pass:
- Title link: uses the `# [Project Name](GitHub URL)` format and is clickable
- Blank lines around titles: every bold title (`**🎯 ...**`) has one blank line before and after
- Telegram blank line: a braille-blank (`⠀`) line sits between the end of each list block and the next title (prevents Telegram from swallowing blank lines)
- Issue links: every selected Issue uses the full `[#number title](full URL)` format
- Competitor links: every competitor carries a `[Name](GitHub/official-site link)`
- Community-buzz links: every quote uses the `[Source: title](URL)` format
- No vague descriptions: the community-buzz section contains no generic phrases like "highly praised" or "very popular"
- Traceability: every external quote carries its original link
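Part of this checklist is mechanically verifiable before sending. A sketch covering two of the link checks; the regexes are illustrative assumptions, not a complete validator:

```python
import re

CHECKS = {
    "title link": r"^# \[.+\]\(https://github\.com/[^)]+\)",  # "# [Name](repo URL)"
    "issue link": r"\[#\d+ [^\]]+\]\(https://[^)]+\)",        # "[#123 title](URL)"
}

def failed_checks(report: str) -> list:
    """Return the names of checklist items the report does not pass."""
    return [name for name, pattern in CHECKS.items()
            if not re.search(pattern, report, flags=re.MULTILINE)]
```

An empty return value means the mechanical subset passed; the subjective items (no vague descriptions, blank-line placement) still need a human pass.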
## Dependencies

This Skill depends on the following OpenClaw tools and Skills:
| Dependency | Type | Purpose |
|---|---|---|
| `web_search` | Built-in tool | Brave Search retrieval |
| `web_fetch` | Built-in tool | Web content fetching |
| `browser` | Built-in tool | Dynamic page rendering (fallback) |
| `search-layer` | Skill | Multi-source search + intent-aware scoring (Brave + Exa + Tavily + Grok); v2.1 supports `--intent` |
| `content-extract` | Skill | High-fidelity content extraction (fallback for anti-crawling sites) |