agent-fetch

agent-fetch Skill

A better web fetch for text content. Your built-in web fetch summarizes or truncates pages. agent-fetch extracts the complete article (every paragraph, heading, and link) using 7 extraction strategies and browser impersonation. No server is required; it runs as a local CLI tool.

When to Use This Skill

Use agent-fetch whenever you need to read a URL. It returns full article text with structure preserved — better than your built-in web fetch for any task involving page content.
  • User asks to read, fetch, or analyze a URL
  • User types `/agent-fetch <url>`
  • You need the full text, not a summary or truncation
  • Your built-in web fetch returned incomplete or garbled content

Prerequisites

agent-fetch runs via npx (no install needed):

```bash
npx agent-fetch --help
```

Commands

/agent-fetch <url> - Fetch and Extract Article

Default usage. Fetches the URL with browser impersonation and extracts the complete article content as markdown.

```bash
npx agent-fetch "<url>" --json
```

Parse the JSON output and present it to the user:

```markdown
---
title: {title}
author: {byline || "Unknown"}
source: {siteName}
url: {url}
date: {publishedTime || "Unknown"}
fetched_in: {latencyMs}ms
---

{markdown || textContent}
```
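
The template above can be filled in with a few lines of code. The following is a minimal Python sketch, assuming the `--json` output is a single flat JSON object whose keys match the placeholder names in the template (`title`, `byline`, `siteName`, `url`, `publishedTime`, `latencyMs`, `markdown`, `textContent`). The function name `render_article` is hypothetical and not part of agent-fetch.

```python
def render_article(result: dict) -> str:
    """Format a parsed agent-fetch JSON result as front matter plus body.

    Assumes the keys below sit at the top level of the JSON object.
    """
    header = [
        "---",
        f"title: {result.get('title')}",
        # Fall back to "Unknown" when the field is missing or empty.
        f"author: {result.get('byline') or 'Unknown'}",
        f"source: {result.get('siteName')}",
        f"url: {result.get('url')}",
        f"date: {result.get('publishedTime') or 'Unknown'}",
        f"fetched_in: {result.get('latencyMs')}ms",
        "---",
    ]
    # Prefer markdown; fall back to plain text when markdown is absent.
    body = result.get("markdown") or result.get("textContent") or ""
    return "\n".join(header) + "\n\n" + body
```

Typical use would be `render_article(json.loads(stdout))`, where `stdout` is the captured output of the `--json` command.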

**If fetch fails**, check `suggestedAction` in the JSON:

| suggestedAction      | What it means           | Next action                           |
| -------------------- | ----------------------- | ------------------------------------- |
| `retry_with_extract` | Needs full browser      | Inform user; agent-fetch is HTTP-only |
| `wait_and_retry`     | Rate limited            | Wait 60s and retry                    |
| `skip`               | Cannot access this site | Inform user                           |
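
The table can be turned into a small dispatch function. This is a sketch only: it assumes `suggestedAction` is a top-level string field in the JSON output, and the function name `next_step` and its return labels (`"retry"`, `"inform_user"`) are hypothetical.

```python
from typing import Tuple

def next_step(result: dict) -> Tuple[str, int, str]:
    """Map a failed result to (step, wait_seconds, note) per the table above."""
    action = result.get("suggestedAction")
    if action == "wait_and_retry":
        # Rate limited: wait 60 seconds, then run the same command again.
        return ("retry", 60, "Rate limited")
    if action == "retry_with_extract":
        # The page needs a full browser; agent-fetch is HTTP-only.
        return ("inform_user", 0, "Needs full browser; agent-fetch is HTTP-only")
    if action == "skip":
        return ("inform_user", 0, "Cannot access this site")
    # Unlisted or missing values: report rather than guess (an assumption).
    return ("inform_user", 0, "Fetch failed")
```

The caller would sleep for the returned number of seconds (for example with `time.sleep`) before retrying, and otherwise pass the note to the user.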

/agent-fetch raw <url> - Raw HTML

Fetch raw HTML without extraction.

```bash
npx agent-fetch "<url>" --raw
```

/agent-fetch quiet <url> - Markdown Only

Just the article markdown, no metadata.

```bash
npx agent-fetch "<url>" -q
```

/agent-fetch text <url> - Plain Text Only

Plain text content without formatting or metadata.

```bash
npx agent-fetch "<url>" --text
```
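
The four commands differ only in the flag passed to the CLI, so a wrapper can be sketched as a lookup table. In the Python sketch below, the flags (`--json`, `--raw`, `-q`, `--text`) come from the commands above; the mode label `"default"` and the function names are hypothetical. The `fetch` helper assumes `npx` is on the `PATH` and returns stdout even on a non-zero exit, since whether a failed fetch still prints JSON is an assumption here.

```python
import subprocess
from typing import List

# Slash-command mode -> CLI flag, as listed in the commands above.
MODE_FLAGS = {
    "default": "--json",  # full article plus metadata as JSON
    "raw": "--raw",       # raw HTML, no extraction
    "quiet": "-q",        # article markdown only
    "text": "--text",     # plain text only
}

def build_command(url: str, mode: str = "default") -> List[str]:
    """Build the npx argument list for a given mode."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode: {mode!r}")
    # Passing a list (not a shell string) avoids quoting problems in URLs.
    return ["npx", "agent-fetch", url, MODE_FLAGS[mode]]

def fetch(url: str, mode: str = "default") -> str:
    """Run agent-fetch and return whatever it printed to stdout."""
    completed = subprocess.run(
        build_command(url, mode), capture_output=True, text=True, check=False
    )
    return completed.stdout
```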

Why agent-fetch Extracts More

agent-fetch runs 7 extraction strategies in parallel and picks the most complete result:

| Strategy                | What it does                                                |
| ----------------------- | ----------------------------------------------------------- |
| Readability             | Mozilla's Reader View algorithm (strict + relaxed)          |
| Text density            | Statistical text-to-tag ratio analysis (CETD)               |
| JSON-LD                 | Parses schema.org structured data                           |
| Next.js                 | Extracts from page props (`__NEXT_DATA__`)                  |
| React Server Components | Parses streaming RSC payloads                               |
| WordPress REST API      | Fetches via `/wp-json/wp/v2/` endpoints                     |
| CSS selectors           | Probes semantic containers (`<article>`, `.post-content`)   |

The longest valid result wins. Metadata (author, date, site name) is composed from the best source across all strategies.
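
The "longest valid result wins" rule can be illustrated with a short Python sketch. It is not agent-fetch's actual implementation: what counts as "valid" is not specified here, so the sketch assumes a result is valid when it is non-empty and at least `min_chars` characters long, and the threshold of 200 is an arbitrary placeholder.

```python
from typing import Dict, Optional, Tuple

MIN_CHARS = 200  # hypothetical validity threshold, not taken from agent-fetch

def pick_longest(
    candidates: Dict[str, Optional[str]], min_chars: int = MIN_CHARS
) -> Optional[Tuple[str, str]]:
    """Return (strategy_name, text) for the longest valid candidate, or None."""
    valid = {
        name: text
        for name, text in candidates.items()
        if text and len(text.strip()) >= min_chars
    }
    if not valid:
        return None
    # Longest text wins; on a tie the first strategy in dict order is kept.
    winner = max(valid, key=lambda name: len(valid[name]))
    return winner, valid[winner]
```

Here `candidates` maps a strategy name to the text it extracted, with `None` for strategies that found nothing.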

agent-fetch vs Built-in Web Fetch

|                    | agent-fetch                          | Built-in web fetch |
| ------------------ | ------------------------------------ | ------------------ |
| Content            | Full article text                    | Summary/truncation |
| Structure          | Markdown with headings, links, lists | Plain text         |
| Metadata           | Title, author, date, site name       | None               |
| Extraction         | 7 strategies (best result wins)      | Basic parse        |
| TLS fingerprinting | Browser impersonation via httpcloak  | Basic headers      |
| Speed              | 200-700ms                            | 2-5s               |
| Install needed     | Yes (npm)                            | No (built-in)      |
| JavaScript         | No                                   | Yes                |