web-doc-resolver

Web Documentation Resolver

Resolve query or URL inputs into compact, high-signal markdown for agents and RAG systems using an intelligent cascade.

When to Use This Skill

Activate this skill when you need to:
  • Fetch and parse documentation from a URL
  • Search for technical information across the web
  • Build context from web sources
  • Extract markdown from websites
  • Query for technical documentation, APIs, or code examples

Platform Tool Mapping

This skill works across multiple platforms. Use the appropriate tools for your platform:
| Platform | Fetch Tool | Search Tool |
| --- | --- | --- |
| opencode | webfetch | websearch |
| claude code | WebFetch (MCP) | WebSearch (MCP) |
| blackbox | web_fetch | web_search |
| Python script | Auto-detects available tools | Auto-detects available tools |
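
The mapping above can also be written as a small lookup table. The following is only an illustrative sketch: `PLATFORM_TOOLS` and `tools_for` are hypothetical names and are not taken from scripts/resolve.py.

```python
# Hypothetical lookup table mirroring the platform/tool mapping above.
PLATFORM_TOOLS = {
    "opencode": {"fetch": "webfetch", "search": "websearch"},
    "claude code": {"fetch": "WebFetch", "search": "WebSearch"},
    "blackbox": {"fetch": "web_fetch", "search": "web_search"},
}


def tools_for(platform: str) -> dict[str, str]:
    """Return the fetch/search tool names for a platform (case-insensitive)."""
    key = platform.strip().lower()
    if key not in PLATFORM_TOOLS:
        raise KeyError(f"unknown platform: {platform!r}")
    return PLATFORM_TOOLS[key]
```

For example, `tools_for("Blackbox")["search"]` returns `"web_search"`.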

Cascade Resolution Strategy

For URL inputs

Use this cascade (in order):
  1. Check llms.txt first: Probe https://origin/llms.txt for site-provided structured documentation (free, always check first)
  2. Fetch URL: Use the platform's fetch tool to get markdown content
  3. Search fallback: Use the platform's search tool to find cached/mirrored versions if direct fetch fails
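
The three steps above can be sketched as a single function. This is a minimal illustration, not the code in scripts/resolve.py: `resolve_url` is a hypothetical name, the `fetch` and `search` callables stand in for whichever platform tools are available, and the sketch assumes that the first non-empty result wins.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit

# Assumed contract: a fetcher takes a URL and returns markdown, or None on failure.
Fetcher = Callable[[str], Optional[str]]
# Assumed contract: a searcher takes a query and returns markdown, or None.
Searcher = Callable[[str], Optional[str]]


def resolve_url(url: str, fetch: Fetcher, search: Searcher) -> Optional[str]:
    """Try llms.txt, then a direct fetch, then a search fallback, in that order."""
    parts = urlsplit(url)
    llms_url = f"{parts.scheme}://{parts.netloc}/llms.txt"

    # Step 1: site-provided structured documentation.
    content = fetch(llms_url)
    if content:
        return content

    # Step 2: fetch the page itself.
    content = fetch(url)
    if content:
        return content

    # Step 3: look for cached or mirrored copies via search.
    return search(url)
```

Passing the tools in as callables keeps the cascade order independent of any one platform.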

For query inputs

Use this cascade (in order):
  1. Search first: Use the platform's search tool with a relevant query (fast, free)
  2. Fetch top results: Use the fetch tool to get markdown from the top search results if needed
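
A rough sketch of the query cascade follows. It assumes the search tool returns a list of result URLs and that only the first few are fetched; `resolve_query` and `top_n` are hypothetical names introduced for illustration.

```python
from typing import Callable, Optional


def resolve_query(
    query: str,
    search: Callable[[str], list[str]],
    fetch: Callable[[str], Optional[str]],
    top_n: int = 3,  # assumed cutoff; the source does not specify one
) -> list[str]:
    """Search first, then fetch markdown from the top results."""
    urls = search(query)[:top_n]
    pages = []
    for url in urls:
        content = fetch(url)
        if content:  # skip results that failed to fetch
            pages.append(content)
    return pages
```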

Implementation

Python Script (scripts/resolve.py)

The skill includes a Python script that auto-detects available tools:
bash
# Resolve a URL
python scripts/resolve.py "https://docs.rust-lang.org/book/"

# Resolve a query
python scripts/resolve.py "Rust async programming"

# JSON output
python scripts/resolve.py "query" --json

# Custom max chars
python scripts/resolve.py "query" --max-chars 4000

# Force specific backend
python scripts/resolve.py "query" --backend httpx
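
To make the flags in the commands above concrete, here is a sketch of how such a command line could be parsed with `argparse`. It is not the parser used by scripts/resolve.py; `build_parser` is a hypothetical name, and the defaults shown are assumptions because the source does not state them.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in the commands above (sketch only)."""
    parser = argparse.ArgumentParser(prog="resolve.py")
    parser.add_argument("input", help="URL or search query to resolve")
    parser.add_argument("--json", action="store_true", help="emit JSON output")
    # A default of None (no limit) is an assumption; the real default is not stated.
    parser.add_argument("--max-chars", type=int, default=None)
    # Only "httpx" appears in the source; other backend names are unknown.
    parser.add_argument("--backend", default=None)
    return parser


args = build_parser().parse_args(["query", "--max-chars", "4000", "--backend", "httpx"])
print(args.input, args.max_chars, args.backend, args.json)
# prints: query 4000 httpx False
```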

Direct Tool Usage by Platform

opencode

bash
# Check for llms.txt

# Fetch URL
webfetch --format markdown https://docs.rust-lang.org/book/

# Search
websearch "Rust book documentation"

claude code (MCP)

python
# Check for llms.txt

# Fetch URL

# Search
WebSearch(query="Rust book documentation")

blackbox

python
# Check for llms.txt
web_fetch(url="https://example.com/llms.txt", prompt="Extract all content")

# Fetch URL
web_fetch(url="https://docs.rust-lang.org/book/", prompt="Extract main content")

# Search
web_search(query="Rust book documentation")

Usage Examples

Basic URL Resolution

bash
# Using Python script (auto-detects backend)
python scripts/resolve.py "https://docs.rust-lang.org/book/"

# Or use platform tool directly

Query Resolution

bash
# Using Python script
python scripts/resolve.py "Rust async programming best practices 2026"

# Or use platform tool directly
websearch "Tokio runtime configuration options" # opencode

Workflow for Building Context

  1. Check for llms.txt first: Probe https://origin/llms.txt
  2. Fetch content: Use the fetch tool to get markdown from the URL
  3. Search if needed: Use the search tool for additional context or when fetch fails
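
Once the steps above have produced one or more markdown snippets, they typically have to fit a limited context window. The sketch below shows one possible way to combine them under a character budget. `build_context` is a hypothetical helper, and its `max_chars` parameter is only modeled on the script's `--max-chars` flag; how the script itself truncates output is not described in the source.

```python
def build_context(sections: list[str], max_chars: int = 4000) -> str:
    """Join markdown sections in order, stopping once the character budget is spent."""
    parts: list[str] = []
    remaining = max_chars
    for section in sections:
        text = section.strip()
        if not text:
            continue  # ignore empty results
        separator = "\n\n" if parts else ""
        needed = len(separator) + len(text)
        if needed > remaining:
            # Keep a truncated piece only if there is room for some of the text.
            room = remaining - len(separator)
            if room > 0:
                parts.append(separator + text[:room])
            break
        parts.append(separator + text)
        remaining -= needed
    return "".join(parts)
```

Sections are consumed in order, so placing llms.txt or official documentation first means lower-priority sources are the ones that get cut.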

Best Practices

  • Check for llms.txt first: Many documentation sites provide /llms.txt for structured content
  • Use specific queries: "rust tokio spawn vs spawn_blocking difference" gets better results than "rust tokio"
  • Filter by date: Add "2025" or "2026" to queries for current information
  • Prefer official docs: Always check official documentation first
  • Try multiple sources: If one URL fails, search for alternative mirrors

Quality Indicators

Good content has:
  • Code examples with language markers
  • API signatures and type annotations
  • Configuration examples
  • Version information
  • Clear headings and structure
Poor content has:
  • Excessive boilerplate/navigation
  • Paywall blocks
  • Login requirements
  • Heavy advertising
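
Some of the indicators above can be approximated with simple pattern checks. The following is a rough heuristic sketch, not part of the skill: `quality_score` is a hypothetical name, and every pattern is an illustrative guess that a real implementation would need to tune.

```python
import re

# Patterns are illustrative guesses at a few of the indicators listed above.
GOOD_PATTERNS = {
    # A fence of three backticks followed directly by a language marker.
    "fenced_code_with_language": re.compile(r"^`{3}[A-Za-z]", re.MULTILINE),
    "headings": re.compile(r"^#{1,6} \S", re.MULTILINE),
    "version_info": re.compile(r"\bv?\d+\.\d+(\.\d+)?\b"),
}
POOR_PATTERNS = {
    "paywall": re.compile(r"subscribe to (read|continue)", re.IGNORECASE),
    "login_required": re.compile(r"(sign|log) ?in to (view|continue)", re.IGNORECASE),
}


def quality_score(markdown: str) -> int:
    """Count good indicators present minus poor indicators present."""
    good = sum(1 for p in GOOD_PATTERNS.values() if p.search(markdown))
    poor = sum(1 for p in POOR_PATTERNS.values() if p.search(markdown))
    return good - poor
```

A negative score could be used as a signal to move on to the next source in the cascade.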

Error Handling

  • Provider failures should trigger cascade fallback
  • Use alternative sources when primary sources fail
  • Log errors for debugging
  • Fall back to search when direct fetch fails
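
The fallback behavior above can be sketched as a loop over providers that logs each failure and moves on. This is a minimal illustration under assumed names: `first_success`, the provider tuples, and the logger name are all hypothetical.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("web_doc_resolver")  # logger name is an assumption

Provider = Callable[[str], Optional[str]]


def first_success(target: str, providers: list[tuple[str, Provider]]) -> Optional[str]:
    """Call each provider in order; log failures and fall through to the next."""
    for name, provider in providers:
        try:
            result = provider(target)
        except Exception as exc:  # a failed provider must not abort the cascade
            logger.warning("provider %s failed: %s", name, exc)
            continue
        if result:
            return result
        logger.info("provider %s returned no content", name)
    return None
```

Listing direct fetch before search in `providers` gives the "fall back to search when direct fetch fails" behavior.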

Testing

Run tests:
bash
cd .agents/skills/web-doc-resolver
python -m pytest tests/ -v
Run samples:
bash
python samples/sample_basic.py
python samples/sample_json.py

Files

  • scripts/resolve.py - Main implementation (multi-backend)
  • tests/test_resolve.py - Unit tests
  • samples/sample_basic.py - Basic usage examples
  • samples/sample_json.py - JSON output examples
  • reference.md - Detailed reference documentation