web-doc-resolver
Resolve queries or URLs into compact, LLM-ready markdown using a low-cost cascade. Prioritizes llms.txt for structured docs, uses web fetch/search tools for extraction. Use when you need to fetch documentation, resolve web URLs to markdown, search for technical content, or build context from web sources.
NPX Install
npx skill4agent add d-o-hub/rust-self-learning-memory web-doc-resolver
Web Documentation Resolver
Resolve query or URL inputs into compact, high-signal markdown for agents and RAG systems using an intelligent cascade.
When to Use This Skill
Activate this skill when you need to:
- Fetch and parse documentation from a URL
- Search for technical information across the web
- Build context from web sources
- Extract markdown from websites
- Query for technical documentation, APIs, or code examples
Platform Tool Mapping
This skill works across multiple platforms. Use the appropriate tools for your platform:
| Platform | Fetch Tool | Search Tool |
|---|---|---|
| opencode | webfetch | websearch |
| claude code | WebFetch | WebSearch |
| blackbox | web_fetch | web_search |
| Python script | Auto-detects available tools | Auto-detects available tools |
Cascade Resolution Strategy
For URL inputs
Use this cascade (in order):
- Check llms.txt first: Probe https://origin/llms.txt for site-provided structured documentation (free, always check first)
- Fetch URL: Use platform's fetch tool to get markdown content
- Search fallback: Use platform's search tool to find cached/mirrored versions if direct fetch fails
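The URL cascade above can be sketched in a few lines of Python. This is a minimal illustration, not the code in scripts/resolve.py: `llms_txt_url`, `resolve_url`, and the `fetch`/`search` callables are hypothetical names standing in for whichever platform tools are available, and the sketch assumes each callable returns markdown text or `None`.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit

# Hypothetical adapter types: each wraps a platform tool and returns
# markdown text, or None when nothing usable came back.
Fetch = Callable[[str], Optional[str]]
Search = Callable[[str], Optional[str]]


def llms_txt_url(url: str) -> str:
    """Build the llms.txt probe URL at the origin of `url`."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/llms.txt"


def resolve_url(url: str, fetch: Fetch, search: Search) -> Optional[str]:
    """Return the first non-empty result from the three cascade steps."""
    # Steps 1 and 2: probe llms.txt, then fetch the URL itself.
    for candidate in (llms_txt_url(url), url):
        content = fetch(candidate)
        if content:
            return content
    # Step 3: fall back to search for a cached or mirrored copy.
    return search(url)
```

Passing the tools in as callables keeps the ordering logic independent of any one platform, so the same function can be exercised with stubs in tests.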
For query inputs
Use this cascade (in order):
- Search first: Use platform's search tool with relevant query (fast, free)
- Fetch top results: Use fetch tool to get markdown from top search results if needed
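One way to read "if needed" in the query cascade is: fetch full pages only when the search snippets are too thin to answer the query. The sketch below assumes that reading. `Hit`, `resolve_query`, `top_n`, and `min_chars` are hypothetical names and thresholds chosen for illustration, and the `search` and `fetch` callables stand in for the platform tools.

```python
from typing import Callable, List, NamedTuple, Optional


class Hit(NamedTuple):
    """Hypothetical shape of one search result."""
    url: str
    snippet: str


def resolve_query(
    query: str,
    search: Callable[[str], List[Hit]],
    fetch: Callable[[str], Optional[str]],
    top_n: int = 2,
    min_chars: int = 200,
) -> str:
    """Search first; fetch the top results only when snippets are short."""
    hits = search(query)
    text = "\n\n".join(hit.snippet for hit in hits if hit.snippet)
    if len(text) >= min_chars:
        return text  # snippets alone carry enough content
    # Snippets were too short, so pull markdown from the top results.
    pages = []
    for hit in hits[:top_n]:
        page = fetch(hit.url)
        if page:
            pages.append(page)
    return "\n\n".join(pages) if pages else text
```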
Implementation
Python Script (scripts/resolve.py)
The skill includes a Python script that auto-detects available tools:
bash
# Resolve a URL
python scripts/resolve.py "https://docs.rust-lang.org/book/"
# Resolve a query
python scripts/resolve.py "Rust async programming"
# JSON output
python scripts/resolve.py "query" --json
# Custom max chars
python scripts/resolve.py "query" --max-chars 4000
# Force specific backend
python scripts/resolve.py "query" --backend httpx
Direct Tool Usage by Platform
opencode
bash
# Check for llms.txt
webfetch https://example.com/llms.txt
# Fetch URL
webfetch --format markdown https://docs.rust-lang.org/book/
# Search
websearch "Rust book documentation"
claude code (MCP)
python
# Check for llms.txt
WebFetch(url="https://example.com/llms.txt")
# Fetch URL
WebFetch(url="https://docs.rust-lang.org/book/")
# Search
WebSearch(query="Rust book documentation")
blackbox
python
# Check for llms.txt
web_fetch(url="https://example.com/llms.txt", prompt="Extract all content")
# Fetch URL
web_fetch(url="https://docs.rust-lang.org/book/", prompt="Extract main content")
# Search
web_search(query="Rust book documentation")
Usage Examples
Basic URL Resolution
bash
# Using Python script (auto-detects backend)
python scripts/resolve.py "https://docs.rust-lang.org/book/"
# Or use platform tool directly
webfetch https://docs.rust-lang.org/book/ # opencode
Query Resolution
bash
# Using Python script
python scripts/resolve.py "Rust async programming best practices 2026"
# Or use platform tool directly
websearch "Tokio runtime configuration options" # opencode
Workflow for Building Context
- Check for llms.txt first: Probe https://origin/llms.txt
- Fetch content: Use fetch tool to get markdown from the URL
- Search if needed: Use search tool for additional context or when fetch fails
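Content gathered this way usually has to fit a size budget before it is handed to an agent, which is what the script's `--max-chars` option suggests. How the script actually truncates is not documented here, so the following is only an assumed approach: `trim_markdown` is a hypothetical helper that cuts at the last paragraph break inside the budget rather than mid-sentence.

```python
def trim_markdown(markdown: str, max_chars: int) -> str:
    """Cut at the last paragraph break that fits within max_chars."""
    if len(markdown) <= max_chars:
        return markdown
    # Look for a blank line (paragraph boundary) inside the budget.
    cut = markdown.rfind("\n\n", 0, max_chars)
    if cut <= 0:
        cut = max_chars  # no boundary found, so fall back to a hard cut
    return markdown[:cut].rstrip()
```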
Best Practices
- Check for llms.txt first: Many documentation sites have /llms.txt for structured content
- Use specific queries: "rust tokio spawn vs spawn_blocking difference" gets better results than "rust tokio"
- Filter by date: Add "2025" or "2026" to queries for current information
- Prefer official docs: Always check official documentation first
- Try multiple sources: If one URL fails, search for alternative mirrors
Quality Indicators
Good content has:
- Code examples with language markers
- API signatures and type annotations
- Configuration examples
- Version information
- Clear headings and structure
Poor content has:
- Excessive boilerplate/navigation
- Paywall blocks
- Login requirements
- Heavy advertising
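Some of these indicators can be approximated mechanically. The sketch below is a rough heuristic, not part of the skill: `quality_signals`, the regular expressions, and the blocker phrases are all assumptions made for illustration, and the heading count will also pick up `#` comment lines inside code blocks.

```python
import re

# Opening fences that carry a language marker, e.g. three backticks + "python".
FENCE_WITH_LANG = re.compile(r"^`{3}[A-Za-z0-9_+-]+[ \t]*$", re.MULTILINE)
HEADING = re.compile(r"^#{1,6} \S", re.MULTILINE)
VERSION = re.compile(r"\bv?\d+\.\d+(?:\.\d+)?\b")
# Hypothetical phrases that hint at a paywall or login wall.
BLOCKERS = ("sign in to continue", "log in to continue", "subscribe to read")


def quality_signals(markdown: str) -> dict:
    """Collect rough signals for the good/poor indicators listed above."""
    lowered = markdown.lower()
    return {
        "code_blocks_with_language": len(FENCE_WITH_LANG.findall(markdown)),
        "headings": len(HEADING.findall(markdown)),
        "has_version": bool(VERSION.search(markdown)),
        "blocked": any(phrase in lowered for phrase in BLOCKERS),
    }
```

A caller might reject a result when `blocked` is true and prefer, among the remaining candidates, the one with more headings and language-marked code blocks.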
Error Handling
- Provider failures should trigger cascade fallback
- Use alternative sources when primary sources fail
- Log errors for debugging
- Fall back to search when direct fetch fails
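These rules amount to trying providers in order, logging each failure, and moving on. A minimal sketch follows, assuming each provider is wrapped as a zero-argument callable; `first_success` and the logger name are hypothetical and not taken from scripts/resolve.py.

```python
import logging
from typing import Callable, Iterable, Optional, Tuple

logger = logging.getLogger("web_doc_resolver")  # hypothetical logger name

Provider = Tuple[str, Callable[[], Optional[str]]]


def first_success(providers: Iterable[Provider]) -> Optional[str]:
    """Run providers in order and return the first non-empty result."""
    for name, provider in providers:
        try:
            result = provider()
        except Exception as exc:
            # Log the failure for debugging, then fall through to the next provider.
            logger.warning("provider %s failed: %s", name, exc)
            continue
        if result:
            return result
        logger.info("provider %s returned no content", name)
    return None
```

Catching the broad `Exception` is deliberate in this sketch: any provider error, whether a timeout or a parsing failure, should trigger the fallback rather than abort the whole resolution.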
Testing
Run tests:
bash
cd .agents/skills/web-doc-resolver
python -m pytest tests/ -v
Run samples:
bash
python samples/sample_basic.py
python samples/sample_json.py
Files
- scripts/resolve.py - Main implementation (multi-backend)
- tests/test_resolve.py - Unit tests
- samples/sample_basic.py - Basic usage examples
- samples/sample_json.py - JSON output examples
- reference.md - Detailed reference documentation