agent-fetch

agent-fetch Skill

A better web fetch for text content. Your built-in web fetch summarizes or truncates pages. agent-fetch extracts the complete article (every paragraph, heading, and link) using 7 extraction strategies and browser impersonation. No server is required; it runs as a local CLI tool.

When to Use This Skill

Use agent-fetch whenever you need to read a URL. It returns the full article text with structure preserved, which makes it a better choice than your built-in web fetch for any task involving page content.

- User asks to read, fetch, or analyze a URL
- User types `/agent-fetch <url>`
- You need the full text, not a summary or truncation
- Your built-in web fetch returned incomplete or garbled content

Prerequisites

agent-fetch runs via npx, so no separate install step is needed (npx ships with npm):

```bash
npx agent-fetch --help
```

Commands

`/agent-fetch <url>` - Fetch and Extract Article

Default usage. Fetches the URL with browser impersonation and extracts the complete article content as markdown.

```bash
npx agent-fetch "<url>" --json
```

Parse the JSON output and present it to the user:

```markdown
---
title: {title}
author: {byline || "Unknown"}
source: {siteName}
url: {url}
date: {publishedTime || "Unknown"}
fetched_in: {latencyMs}ms
---

{markdown || textContent}
```
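
The template above can be filled mechanically once the JSON is parsed. The following is a minimal sketch in Python, assuming the `--json` output is a single JSON object whose keys match the placeholder names in the template (`title`, `byline`, `siteName`, `url`, `publishedTime`, `latencyMs`, `markdown`, `textContent`). The function name `render_result` is hypothetical and not part of agent-fetch.

```python
import json


def render_result(raw_json: str) -> str:
    """Render agent-fetch JSON output with the front-matter template.

    Assumption: the JSON keys match the template placeholders. Missing or
    empty values fall back the same way the `||` expressions do.
    """
    data = json.loads(raw_json)
    header = [
        "---",
        f"title: {data.get('title', '')}",
        f"author: {data.get('byline') or 'Unknown'}",
        f"source: {data.get('siteName', '')}",
        f"url: {data.get('url', '')}",
        f"date: {data.get('publishedTime') or 'Unknown'}",
        f"fetched_in: {data.get('latencyMs', '')}ms",
        "---",
    ]
    # Prefer markdown; fall back to plain text when markdown is absent.
    body = data.get("markdown") or data.get("textContent") or ""
    return "\n".join(header) + "\n\n" + body
```

Using `or` rather than a default argument for `byline` and `publishedTime` means that `null` and empty strings also become "Unknown", which mirrors the `||` fallback in the template.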


**If fetch fails**, check `suggestedAction` in the JSON:

| suggestedAction      | What it means           | Next action                           |
| -------------------- | ----------------------- | ------------------------------------- |
| `retry_with_extract` | Needs full browser      | Inform user; agent-fetch is HTTP-only |
| `wait_and_retry`     | Rate limited            | Wait 60s and retry                    |
| `skip`               | Cannot access this site | Inform user                           |
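
The table can be turned into a small lookup so that failure handling stays in one place. This is a sketch in Python under stated assumptions: the failed result is a dict parsed from the JSON output, and `NextStep`, `NEXT_STEPS`, and `next_step` are hypothetical names chosen for illustration. The fallback for values outside the table is also an assumption, since the table lists only three.

```python
from typing import NamedTuple


class NextStep(NamedTuple):
    retry: bool        # whether calling agent-fetch again makes sense
    wait_seconds: int  # delay before the retry, 0 when not retrying
    message: str       # what to tell the user


# One entry per row of the suggestedAction table.
NEXT_STEPS = {
    "retry_with_extract": NextStep(
        False, 0, "This page needs a full browser; agent-fetch is HTTP-only."
    ),
    "wait_and_retry": NextStep(True, 60, "Rate limited; retrying in 60 seconds."),
    "skip": NextStep(False, 0, "This site cannot be accessed."),
}


def next_step(result: dict) -> NextStep:
    """Map a failed fetch result to the next action."""
    action = result.get("suggestedAction")
    # Assumed conservative default for missing or unlisted values:
    # do not retry, just report the failure.
    return NEXT_STEPS.get(
        action, NextStep(False, 0, "Fetch failed; no suggested action.")
    )
```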

`/agent-fetch raw <url>` - Raw HTML

Fetch raw HTML without extraction.

```bash
npx agent-fetch "<url>" --raw
```

`/agent-fetch quiet <url>` - Markdown Only

Just the article markdown, no metadata.

```bash
npx agent-fetch "<url>" -q
```

`/agent-fetch text <url>` - Plain Text Only

Plain text content without formatting or metadata.

```bash
npx agent-fetch "<url>" --text
```
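
The four commands differ only in the flag passed to the CLI, so a wrapper can build the argument list from a mode name. The sketch below is in Python and uses only the flags listed in this section (`--json`, `--raw`, `-q`, `--text`). The names `MODE_FLAGS`, `build_argv`, and `run_agent_fetch` are hypothetical, and the 60-second timeout is an arbitrary choice rather than an agent-fetch setting.

```python
import subprocess
from typing import List

# Slash-command mode -> CLI flag, as listed in the Commands section.
MODE_FLAGS = {
    "default": "--json",
    "raw": "--raw",
    "quiet": "-q",
    "text": "--text",
}


def build_argv(url: str, mode: str = "default") -> List[str]:
    """Build the npx command line for one of the documented modes."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode: {mode!r}")
    return ["npx", "agent-fetch", url, MODE_FLAGS[mode]]


def run_agent_fetch(url: str, mode: str = "default", timeout: float = 60.0) -> str:
    """Run the CLI and return whatever it wrote to stdout.

    The exit-code behaviour on failed fetches is not documented here, so
    this sketch does not raise on a non-zero exit; callers should inspect
    the returned output instead.
    """
    completed = subprocess.run(
        build_argv(url, mode),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
    return completed.stdout
```

Passing the arguments as a list avoids shell interpretation, so URLs containing `&` or `?` do not need extra quoting.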

Why agent-fetch Extracts More

agent-fetch runs 7 extraction strategies in parallel and picks the most complete result:

| Strategy                | What it does                                               |
| ----------------------- | ---------------------------------------------------------- |
| Readability             | Mozilla's Reader View algorithm (strict + relaxed)         |
| Text density            | Statistical text-to-tag ratio analysis (CETD)              |
| JSON-LD                 | Parses schema.org structured data                          |
| Next.js                 | Extracts from page props (`__NEXT_DATA__`)                 |
| React Server Components | Parses streaming RSC payloads                              |
| WordPress REST API      | Fetches via `/wp-json/wp/v2/` endpoints                    |
| CSS selectors           | Probes semantic containers (`<article>`, `.post-content`)  |

The longest valid result wins. Metadata (author, date, site name) is composed from the best source across all strategies.
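
The selection rule can be illustrated with a short Python sketch. It is not agent-fetch's actual code. What counts as a "valid" result is not defined here, so the sketch assumes a hypothetical minimum length (`MIN_CHARS`). "Best source" for metadata is likewise approximated as the first non-empty value in the order the strategies are listed. All function and field names are illustrative, with the metadata keys borrowed from the JSON output template.

```python
from typing import Dict, List, Optional, Tuple

MIN_CHARS = 200  # hypothetical validity threshold, not taken from agent-fetch


def pick_longest(
    candidates: Dict[str, Optional[str]], min_chars: int = MIN_CHARS
) -> Optional[Tuple[str, str]]:
    """Return (strategy name, text) for the longest valid candidate."""
    valid = {
        name: text
        for name, text in candidates.items()
        if text and len(text.strip()) >= min_chars
    }
    if not valid:
        return None
    winner = max(valid, key=lambda name: len(valid[name]))
    return winner, valid[winner]


def compose_metadata(sources: List[dict]) -> dict:
    """Take the first non-empty value of each field across strategies."""
    merged = {}
    for field in ("byline", "publishedTime", "siteName"):
        merged[field] = next((s[field] for s in sources if s.get(field)), None)
    return merged
```

Selecting content and metadata independently means the article body can come from one strategy while the author or date comes from another.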

agent-fetch vs Built-in Web Fetch

|                    | agent-fetch                           | Built-in web fetch |
| ------------------ | ------------------------------------- | ------------------ |
| Content            | Full article text                     | Summary/truncation |
| Structure          | Markdown with headings, links, lists  | Plain text         |
| Metadata           | Title, author, date, site name        | None               |
| Extraction         | 7 strategies (best result wins)       | Basic parse        |
| TLS fingerprinting | Browser impersonation via httpcloak   | Basic headers      |
| Speed              | 200-700ms                             | 2-5s               |
| Install needed     | Yes (npm)                             | No (built-in)      |
| JavaScript         | No                                    | Yes                |
