web-fetch
Web Fetch Skill
Fetch and parse web content from URLs.
When to Use
✅ USE this skill when:
- "Fetch content from URL"
- "Download file from..."
- "Extract article text from..."
- "Get page title and description"
- "Scrape data from webpage"
When NOT to Use
❌ DON'T use this skill when:
- Interactive browser actions → use browser-tools
- Authenticated sessions → use browser-tools with profile
- JavaScript-heavy SPAs → use browser-tools
Commands
Fetch Content
bash
{baseDir}/fetch.sh "https://example.com"
{baseDir}/fetch.sh "https://example.com" --markdown
{baseDir}/fetch.sh "https://example.com" --json
Extract Article
bash
{baseDir}/extract.sh "https://example.com/article"
{baseDir}/extract.sh "https://example.com/article" --format markdown
Download File
bash
{baseDir}/download.sh "https://example.com/file.pdf" --out /tmp/file.pdf
{baseDir}/download.sh "https://example.com/archive.zip" --out /tmp/archive.zip
Get Page Metadata
bash
{baseDir}/metadata.sh "https://example.com"
{baseDir}/metadata.sh "https://example.com" --json
Extract Links
bash
{baseDir}/links.sh "https://example.com"
{baseDir}/links.sh "https://example.com" --filter "blog"
Extract Images
bash
{baseDir}/images.sh "https://example.com"
{baseDir}/images.sh "https://example.com" --download --out /tmp/images/
Options
- --markdown: Output as Markdown
- --json: Output as JSON
- --text: Plain text output
- --timeout N: Timeout in seconds (default: 30)
- --user-agent: Custom user agent
- --out <path>: Output file path
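To make the option list concrete, here is a minimal sketch of how a wrapper script could parse these flags. It is not the skill's actual implementation: the function name `parse_fetch_args`, the variable names, the plain-text default format, and the assumption that `--user-agent` takes its value as the next argument are all hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: the real scripts may parse their arguments differently.

parse_fetch_args() {
  FORMAT="text"     # assumption: plain text is the default format
  TIMEOUT=30        # documented default, in seconds
  USER_AGENT=""     # assumption: empty means "use the tool's own default"
  OUT=""            # assumption: empty means "write to stdout"
  URL=""

  while [ "$#" -gt 0 ]; do
    case "$1" in
      --markdown) FORMAT="markdown" ;;
      --json)     FORMAT="json" ;;
      --text)     FORMAT="text" ;;
      --timeout|--user-agent|--out)
        # These options are assumed to take a value in the next argument.
        if [ "$#" -lt 2 ]; then
          echo "missing value for $1" >&2
          return 2
        fi
        case "$1" in
          --timeout)    TIMEOUT="$2" ;;
          --user-agent) USER_AGENT="$2" ;;
          --out)        OUT="$2" ;;
        esac
        shift
        ;;
      -*)
        echo "unknown option: $1" >&2
        return 2
        ;;
      *)
        URL="$1"
        ;;
    esac
    shift
  done

  if [ -z "$URL" ]; then
    echo "usage: fetch.sh URL [options]" >&2
    return 2
  fi
}
```

Calling `parse_fetch_args "https://example.com" --json --timeout 10` would leave `FORMAT=json` and `TIMEOUT=10`, with the other variables at their defaults.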
Output Formats
Plain Text
Extract visible text from HTML, cleaned of scripts and styles.
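As a rough illustration of what "visible text, cleaned of scripts and styles" can mean, the sketch below strips `<script>` and `<style>` blocks, removes the remaining tags, and collapses whitespace. The function name `html_to_text` is hypothetical and this is not how the skill itself is known to work; it does not decode HTML entities and is no substitute for a real HTML parser.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of plain-text extraction; reads HTML on stdin.

html_to_text() {
  awk '
    # Remove every <tag ...> ... </tag> block, matching case-insensitively.
    function strip_block(s, tag,    low, otag, ctag, start, stop) {
      otag = "<" tag
      ctag = "</" tag ">"
      while (1) {
        low = tolower(s)
        start = index(low, otag)
        if (start == 0) break
        stop = index(substr(low, start), ctag)
        if (stop == 0) {            # unclosed block: drop the rest
          s = substr(s, 1, start - 1)
          break
        }
        s = substr(s, 1, start - 1) " " substr(s, start + stop - 1 + length(ctag))
      }
      return s
    }
    { doc = doc $0 " " }            # slurp the whole input into one string
    END {
      doc = strip_block(doc, "script")
      doc = strip_block(doc, "style")
      gsub(/<[^>]*>/, " ", doc)     # drop the remaining tags
      gsub(/[ \t\r\n]+/, " ", doc)  # collapse runs of whitespace
      sub(/^ /, "", doc)
      sub(/ $/, "", doc)
      print doc
    }
  '
}
```

It can be fed any HTML on stdin, for example the saved body of a page: `html_to_text < page.html`.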
Markdown
Convert HTML to markdown with proper formatting.
JSON
Structured output with title, content, and metadata.
Examples
Get article content:
bash
{baseDir}/extract.sh "https://example.com/blog/post" --markdown
Download all PDFs from page:
bash
{baseDir}/links.sh "https://example.com" --filter ".pdf" | xargs -I {} {baseDir}/download.sh "{}"
Get page metadata:
bash
{baseDir}/metadata.sh "https://example.com" --json
Output: {"title": "...", "description": "...", "og:image": "..."}
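The PDF example passes each link straight to the downloader. If each file should land at an explicit `--out` path, a loop like the following sketch could derive a file name from every URL. It assumes `links.sh` prints one URL per line, which the source does not state; `url_to_filename`, `download_all`, and the `DOWNLOAD_CMD` variable are hypothetical names introduced for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: download every URL read from stdin into a directory.

# Derive a local file name from a URL: drop the query string and
# fragment, then keep the last path segment.
url_to_filename() {
  local url="${1%%[?#]*}"   # strip ?query and #fragment
  url="${url%/}"            # strip one trailing slash
  local name="${url##*/}"
  printf '%s\n' "${name:-download}"   # fall back if nothing is left
}

# Usage: <url list on stdin> | download_all DEST_DIR
# DOWNLOAD_CMD is a hypothetical override; it defaults to download.sh on PATH.
download_all() {
  local dest="$1" url
  mkdir -p "$dest"
  while IFS= read -r url; do
    [ -n "$url" ] || continue
    # </dev/null keeps the downloader from consuming the URL list.
    "${DOWNLOAD_CMD:-download.sh}" "$url" --out "$dest/$(url_to_filename "$url")" </dev/null
  done
}
```

With these functions defined, the pipeline would read `{baseDir}/links.sh "https://example.com" --filter ".pdf" | DOWNLOAD_CMD={baseDir}/download.sh download_all /tmp/pdfs`. Two URLs that end in the same file name would overwrite each other, so a real script would need a collision check.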
Notes
- Respects robots.txt by default
- Rate limiting: 1 request per second by default
- Use --user-agent to set a custom user agent
- For JavaScript-heavy pages, use browser-tools instead
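The notes describe a default pace of one request per second. When a caller drives several separate invocations from its own loop, a similar pace could be kept with a sketch like this one. The function name `fetch_each` and the `DELAY` variable are hypothetical, and whether the skill's built-in limit carries across separate invocations is not stated in the source.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: run a command once per URL, pausing between calls.

# Usage: <url list on stdin> | fetch_each COMMAND [ARGS...]
# The URL is appended as the last argument of COMMAND.
# DELAY is the pause in seconds between calls (defaults to 1).
fetch_each() {
  local delay="${DELAY:-1}" first=1 url
  while IFS= read -r url; do
    [ -n "$url" ] || continue              # skip blank lines
    [ "$first" -eq 1 ] || sleep "$delay"   # no pause before the first call
    first=0
    "$@" "$url" </dev/null                 # keep COMMAND from reading the URL list
  done
}
```

For example, `fetch_each {baseDir}/fetch.sh < urls.txt` would fetch each listed URL in turn, waiting one second between requests; `urls.txt` is a placeholder file name.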