agent-fetch
agent-fetch Skill
A better web fetch for text content. Your built-in web fetch summarizes or truncates pages. agent-fetch extracts the complete article — every paragraph, heading, and link — using 7 extraction strategies and browser impersonation. No server required, runs as a local CLI tool.
When to Use This Skill
Use agent-fetch whenever you need to read a URL. It returns full article text with structure preserved — better than your built-in web fetch for any task involving page content.
- User asks to read, fetch, or analyze a URL
- User types `/agent-fetch <url>`
- You need the full text, not a summary or truncation
- Your built-in web fetch returned incomplete or garbled content
Prerequisites
agent-fetch runs via npx (no install needed):
```bash
npx agent-fetch --help
```
Commands
`/agent-fetch <url>` - Fetch and Extract Article
Default usage. Fetches the URL with browser impersonation and extracts the complete article content as Markdown.
```bash
npx agent-fetch "<url>" --json
```
Parse the JSON output and present it to the user:
```markdown
---
title: {title}
author: {byline || "Unknown"}
source: {siteName}
url: {url}
date: {publishedTime || "Unknown"}
fetched_in: {latencyMs}ms
---

{markdown || textContent}
```
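
The parse-and-present step can be sketched in Python. This is a minimal illustration, not part of agent-fetch itself: it assumes the `--json` flag prints a single JSON object to stdout, and it uses only the field names that appear in the template (`title`, `byline`, `siteName`, `url`, `publishedTime`, `latencyMs`, `markdown`, `textContent`). The function names `fetch_article` and `render_article` are hypothetical.

```python
import json
import subprocess


def fetch_article(url: str) -> dict:
    """Run the CLI and parse its output.

    Assumption: `--json` writes one JSON object to stdout.
    """
    proc = subprocess.run(
        ["npx", "agent-fetch", url, "--json"],
        capture_output=True,
        text=True,
        check=False,  # inspect the JSON even when the fetch fails
    )
    return json.loads(proc.stdout)


def render_article(result: dict) -> str:
    """Fill the front-matter template, applying its `||` fallbacks."""
    header = [
        "---",
        f"title: {result.get('title', '')}",
        f"author: {result.get('byline') or 'Unknown'}",
        f"source: {result.get('siteName', '')}",
        f"url: {result.get('url', '')}",
        f"date: {result.get('publishedTime') or 'Unknown'}",
        f"fetched_in: {result.get('latencyMs', '')}ms",
        "---",
    ]
    # Prefer markdown; fall back to plain text when markdown is empty or missing.
    body = result.get("markdown") or result.get("textContent") or ""
    return "\n".join(header) + "\n\n" + body
```

Calling `print(render_article(fetch_article("<url>")))` would then print the formatted article. Using `or` rather than a default argument mirrors the template's `||`, so empty strings and nulls both fall back.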
**If fetch fails**, check `suggestedAction` in the JSON:
| suggestedAction | What it means | Next action |
| -------------------- | ----------------------- | ------------------------------------- |
| `retry_with_extract` | Needs full browser | Inform user; agent-fetch is HTTP-only |
| `wait_and_retry` | Rate limited | Wait 60s and retry |
| `skip` | Cannot access this site | Inform user |
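
The failure table maps directly onto a small dispatch function. The sketch below assumes the parsed JSON is a Python dict with a `suggestedAction` key, as described above. The function name `next_step` and the `("retry" | "inform", message)` return shape are illustrative choices, not part of agent-fetch.

```python
WAIT_SECONDS = 60  # from the table: "Wait 60s and retry"


def next_step(result: dict) -> tuple[str, str]:
    """Map `suggestedAction` to ("retry" | "inform", message)."""
    action = result.get("suggestedAction")
    if action == "wait_and_retry":
        return ("retry", f"Rate limited; wait {WAIT_SECONDS}s and retry.")
    if action == "retry_with_extract":
        return ("inform", "This page needs a full browser; agent-fetch is HTTP-only.")
    if action == "skip":
        return ("inform", "agent-fetch cannot access this site.")
    # Any other value is outside the table; surface it rather than guess.
    return ("inform", f"Fetch failed (suggestedAction={action!r}).")
```

A caller would sleep for `WAIT_SECONDS` and re-run the fetch when the first element is `"retry"`, and otherwise relay the message to the user.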

`/agent-fetch raw <url>` - Raw HTML
Fetch raw HTML without extraction.
```bash
npx agent-fetch "<url>" --raw
```
`/agent-fetch quiet <url>` - Markdown Only
Just the article markdown, no metadata.
```bash
npx agent-fetch "<url>" -q
```
`/agent-fetch text <url>` - Plain Text Only
Plain text content without formatting or metadata.
```bash
npx agent-fetch "<url>" --text
```
Why agent-fetch Extracts More
agent-fetch runs 7 extraction strategies in parallel and picks the most complete result:
| Strategy | What it does |
|---|---|
| Readability | Mozilla's Reader View algorithm (strict + relaxed) |
| Text density | Statistical text-to-tag ratio analysis (CETD) |
| JSON-LD | Parses schema.org structured data |
| Next.js | Extracts from page props |
| React Server Components | Parses streaming RSC payloads |
| WordPress REST API | Fetches content via the WordPress REST API |
| CSS selectors | Probes semantic containers |
The longest valid result wins. Metadata (author, date, site name) is composed from the best source across all strategies.
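
The selection rule can be illustrated with a short Python sketch. It is not agent-fetch's actual implementation. The `Candidate` type, the `MIN_CHARS` validity threshold, and the "first non-empty value wins" metadata merge are all assumptions made for illustration, since the text above does not define what counts as "valid" or "best".

```python
from dataclasses import dataclass, field

MIN_CHARS = 200  # hypothetical validity threshold, not documented by agent-fetch


@dataclass
class Candidate:
    strategy: str
    text: str
    metadata: dict = field(default_factory=dict)


def pick_longest(candidates, min_chars=MIN_CHARS):
    """Return the longest candidate that passes the validity check, or None."""
    valid = [c for c in candidates if len(c.text.strip()) >= min_chars]
    return max(valid, key=lambda c: len(c.text), default=None)


def compose_metadata(candidates, keys=("author", "date", "siteName")):
    """Take each field from the first strategy that supplies a non-empty value."""
    merged = {}
    for key in keys:
        for candidate in candidates:
            value = candidate.metadata.get(key)
            if value:
                merged[key] = value
                break
    return merged
```

Keeping text selection and metadata composition separate lets the winning text come from one strategy while the author or date comes from another, which matches the behavior described above.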
agent-fetch vs Built-in Web Fetch
| | agent-fetch | Built-in web fetch |
|---|---|---|
| Content | Full article text | Summary/truncation |
| Structure | Markdown with headings, links, lists | Plain text |
| Metadata | Title, author, date, site name | None |
| Extraction | 7 strategies (best result wins) | Basic parse |
| TLS fingerprinting | Browser impersonation via httpcloak | Basic headers |
| Speed | 200-700ms | 2-5s |
| Install needed | Yes (npm) | No (built-in) |
| JavaScript execution | No (HTTP-only) | Yes |