Browser Tools

Browser automation and content capture patterns using agent-browser CLI, Playwright, and Puppeteer. Each category has individual rule files in `references/`, loaded on demand.

Quick Reference

Category	Rules	When to Use
Playwright Setup	1	Installing and configuring Playwright for automation
Page Interaction	1	Clicking, filling forms, navigating with snapshot + refs
Content Extraction	1	Extracting text, HTML, structured data from pages
SPA Extraction	1	React, Vue, Angular apps with client-side rendering
Scraping Strategies	1	Multi-page crawls, pagination, recursive crawling
Anti-Bot Handling	1	Rate limiting, CAPTCHA, session management
Authentication Flows	1	Login forms, OAuth, SSO, session persistence
Structured Output	1	Converting scraped content to clean markdown/JSON

Total: 8 rules across 8 categories

Quick Start

```bash
# Install agent-browser
npm install -g agent-browser
agent-browser install  # Download Chromium

# Basic capture workflow
agent-browser open https://example.com
agent-browser wait --load networkidle
agent-browser snapshot -i  # Get interactive elements with @refs
agent-browser get text @e5  # Extract content by ref
agent-browser screenshot /tmp/page.png
agent-browser close
```

```bash
# Fallback decision tree
# 1. Try WebFetch first (fast, no browser overhead)
# 2. If empty/partial -> use agent-browser
# 3. If SPA -> wait --load networkidle
# 4. If login required -> authentication flow + state save
# 5. If dynamic -> wait @element or wait --text
```
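The fallback decision tree can be sketched as a small shell function that prints the steps it would take. This is a minimal sketch under assumptions: `plan_capture`, its argument order, the yes/no flags, and the 500-byte threshold for "empty/partial" are hypothetical and are not part of agent-browser or WebFetch.

```bash
# Hypothetical helper: print the capture steps suggested by the decision tree.
# Usage: plan_capture <fetched_bytes> <is_spa> <needs_login> <is_dynamic>
# The yes/no flags and the 500-byte threshold are assumptions for illustration.
plan_capture() {
  pc_bytes="$1"; pc_spa="$2"; pc_login="$3"; pc_dynamic="$4"

  # Steps 1-2: WebFetch is enough unless the result is empty or partial.
  if [ "$pc_bytes" -ge 500 ]; then
    echo "webfetch"
    return 0
  fi
  echo "agent-browser"

  # Step 3: SPAs need the network to settle before extraction.
  if [ "$pc_spa" = "yes" ]; then echo "wait --load networkidle"; fi
  # Step 4: gated content needs a login flow and a saved state file.
  if [ "$pc_login" = "yes" ]; then echo "authentication flow + state save"; fi
  # Step 5: dynamic content needs an explicit wait on an element or text.
  if [ "$pc_dynamic" = "yes" ]; then echo "wait @element or wait --text"; fi
  return 0
}
```

For example, `plan_capture 120 yes no yes` would list the browser fallback followed by the SPA and dynamic-content waits, while `plan_capture 2000 no no no` stops at WebFetch.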

```bash
# Authentication with state persistence
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "$EMAIL"
agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save /tmp/auth-state.json
```

```bash
# SPA extraction (React/Vue/Angular)
agent-browser open https://react-app.example.com
agent-browser wait --load networkidle
agent-browser eval "document.querySelector('article').innerText"
```

Playwright Setup

Browser automation setup using agent-browser CLI (93% less context than full Playwright MCP) or Playwright directly.

Rule	Description
`playwright-setup.md`	Installation, configuration, environment variables, cloud providers

Key Decisions: agent-browser CLI preferred | `snapshot -i` for element discovery | `--session` for parallel isolation
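Parallel isolation with `--session` might look like the sketch below, with one named session per URL. Several parts are assumptions: the `AB` variable (a hypothetical override that allows a dry run with `AB=echo`), the helper names, the session names `docs` and `blog`, and the placement of `--session` before the subcommand. Check `agent-browser --help` for the exact syntax.

```bash
# AB is a hypothetical override so the commands can be dry-run with AB=echo.
AB="${AB:-agent-browser}"

# Hypothetical helper: capture one URL in its own named session.
# Usage: capture_in_session <session-name> <url>
capture_in_session() {
  "$AB" --session "$1" open "$2" &&
  "$AB" --session "$1" wait --load networkidle &&
  "$AB" --session "$1" snapshot -i &&
  "$AB" --session "$1" close
}

# Run two isolated sessions in the background, then wait for both to finish.
capture_pair() {
  capture_in_session docs "$1" &
  capture_in_session blog "$2" &
  wait
}
```

Because each call carries its own session name, the two captures do not share cookies or page state.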

Page Interaction

Interact with page elements using snapshot refs for clicking, filling, and navigating.

Rule	Description
`page-interaction.md`	Click, fill, navigate, wait patterns using snapshot refs

Key Decisions: Always re-snapshot after navigation | Use refs (@e1) not CSS selectors | Wait for networkidle after navigation
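The re-snapshot rule can be wrapped in a helper so that a click is always followed by a wait and a fresh snapshot. This is a sketch only: `click_and_resnapshot` and the `AB` override are hypothetical names, while the three commands are those listed under Agent-Browser Key Commands.

```bash
# AB is a hypothetical override so the sequence can be dry-run with AB=echo.
AB="${AB:-agent-browser}"

# Hypothetical helper: click a ref, let the page settle, then refresh refs.
# Refs such as @e3 come from the most recent snapshot and may not survive
# navigation, so a new snapshot is taken before any further interaction.
click_and_resnapshot() {
  "$AB" click "$1" &&
  "$AB" wait --load networkidle &&
  "$AB" snapshot -i
}
```

A call such as `click_and_resnapshot @e3` then leaves the session with refs that match the page after navigation.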

Content Extraction

Extract text, HTML, and structured data from web pages.

Rule	Description
`content-extraction.md`	Text extraction, HTML capture, JavaScript eval for custom extraction

Key Decisions: Use targeted refs over full-body extraction | Remove noise elements before extraction | Cache extracted content
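Caching extracted content could be as simple as one file per URL. The sketch below is an assumption about layout, not something the rule files prescribe: `CACHE_DIR`, `cache_path`, `cache_get`, and `cache_put` are hypothetical names, and the file name is derived with the POSIX `cksum` utility.

```bash
# Hypothetical cache layout: one text file per URL under CACHE_DIR.
CACHE_DIR="${CACHE_DIR:-/tmp/capture-cache}"

# Map a URL to a stable cache file path using the POSIX cksum utility.
cache_path() {
  cp_sum=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  printf '%s/%s.txt\n' "$CACHE_DIR" "$cp_sum"
}

# Print cached text for a URL, or return non-zero on a cache miss.
cache_get() {
  cg_file=$(cache_path "$1")
  [ -f "$cg_file" ] && cat "$cg_file"
}

# Read extracted text from stdin and store it under the URL's cache path.
cache_put() {
  mkdir -p "$CACHE_DIR" && cat > "$(cache_path "$1")"
}
```

A capture step could then run `cache_get "$url" || agent-browser get text @e5 | cache_put "$url"`, where `@e5` stands for whatever targeted ref the latest snapshot reported.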

SPA Extraction

Extract content from JavaScript-rendered Single Page Applications.

Rule	Description
`spa-extraction.md`	React, Vue, Angular, Next.js, Nuxt, Docusaurus extraction patterns

Key Decisions: Wait for hydration, not just DOM ready | Use framework-specific detection | Handle infinite scroll and lazy loading
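Framework-specific detection can be approximated by passing a JavaScript expression to `eval` that looks for well-known markers. This is a hedged sketch: `DETECT_JS`, `detect_framework`, and the `AB` override are hypothetical, and the markers are common conventions rather than guarantees (for instance, `data-reactroot` appears only in older React output, and not every Next.js page exposes `__NEXT_DATA__`).

```bash
# AB is a hypothetical override so the commands can be dry-run with AB=echo.
AB="${AB:-agent-browser}"

# JavaScript expression that reports which framework markers are present.
# Markers: __NEXT_DATA__ (Next.js), __NUXT__ (Nuxt), the ng-version
# attribute (Angular), data-reactroot (older React). None is guaranteed.
DETECT_JS='JSON.stringify({
  next: typeof window.__NEXT_DATA__ !== "undefined",
  nuxt: typeof window.__NUXT__ !== "undefined",
  angular: document.querySelector("[ng-version]") !== null,
  react: document.querySelector("[data-reactroot]") !== null
})'

# Hypothetical helper: wait for the network to settle, then run the detection.
detect_framework() {
  "$AB" wait --load networkidle &&
  "$AB" eval "$DETECT_JS"
}
```

The result can then select a framework-specific wait, such as `wait --text` on content that only appears after hydration.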

Scraping Strategies

Multi-page crawling, pagination handling, and recursive site extraction.

Rule	Description
`scraping-strategies.md`	Multi-page crawl, pagination, recursive depth-limited crawling, parallel sessions

Key Decisions: Extract links first, then visit | Depth-limit recursive crawls | Use parallel sessions for throughput
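"Extract links first, then visit" and the depth limit can be sketched as two small helpers. These are illustrative assumptions: `filter_links`, `within_depth`, and the default `MAX_DEPTH` of 2 are hypothetical, and the candidate URLs are expected one per line on stdin (for example, from an `eval` call that lists anchor `href` values).

```bash
# Assumed default depth limit for recursive crawls.
MAX_DEPTH="${MAX_DEPTH:-2}"

# Hypothetical helper: read candidate URLs on stdin, strip #fragments, keep
# only those under the given base URL, and drop duplicates (order preserved).
# Usage: filter_links <base-url> < candidates.txt
filter_links() {
  fl_base="$1"
  sed 's/#.*$//' | awk -v base="$fl_base" 'index($0, base) == 1 && !seen[$0]++'
}

# Depth guard: succeed only while the given depth is within MAX_DEPTH.
within_depth() {
  [ "$1" -le "$MAX_DEPTH" ]
}
```

A crawl loop would call `within_depth` before following links from a page and pass every extracted link list through `filter_links` before queueing it.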

Anti-Bot Handling

Rate limiting, CAPTCHA handling, session management, and respectful scraping.

Rule	Description
`anti-bot-handling.md`	Rate limiting, robots.txt, CAPTCHA, error handling, resume capability

Key Decisions: Always check robots.txt | Add delays between requests | Use headed mode for CAPTCHA | Implement resume capability
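Delays between requests and resume capability can be combined in one loop that records each finished URL. The sketch covers only those two decisions, not robots.txt or CAPTCHA handling, and everything in it is an assumption for illustration: `DONE_FILE`, `DELAY_SECONDS`, and the helper names are hypothetical.

```bash
# Hypothetical progress file: one completed URL per line.
DONE_FILE="${DONE_FILE:-/tmp/crawl-done.txt}"
# Assumed politeness delay in seconds between page visits.
DELAY_SECONDS="${DELAY_SECONDS:-2}"

# Succeed if the URL was already captured in an earlier run.
is_done() {
  [ -f "$DONE_FILE" ] && grep -Fxq -- "$1" "$DONE_FILE"
}

# Record a URL as captured so an interrupted crawl can resume after it.
mark_done() {
  printf '%s\n' "$1" >> "$DONE_FILE"
}

# Read URLs from stdin and run the given command once per new URL, sleeping
# between visits. Failed pages are not marked, so a later run retries them.
# Usage: crawl_with_resume <command> [args...] < urls.txt
crawl_with_resume() {
  while IFS= read -r cr_url; do
    if is_done "$cr_url"; then continue; fi
    if "$@" "$cr_url" </dev/null; then mark_done "$cr_url"; fi
    sleep "$DELAY_SECONDS"
  done
}
```

Rerunning the same command after an interruption skips every URL already listed in the progress file.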

Authentication Flows

Login forms, OAuth/SSO flows, session persistence, and multi-step authentication.

Rule	Description
`auth-flows.md`	Form login, OAuth popup, SSO redirect, state save/restore, session management

Key Decisions: Save state after login | Use headed mode for OAuth/SSO | Never hardcode credentials | Clean up state files
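Two of these decisions, never hardcoding credentials and cleaning up state files, can be enforced with small guards around the login flow. This is a minimal sketch: `require_credentials`, `cleanup_state`, and the `STATE_FILE` variable are hypothetical names, and `EMAIL` and `PASSWORD` follow the environment variables used in the Quick Start.

```bash
# Hypothetical state file location; keep it outside any code repository.
STATE_FILE="${STATE_FILE:-/tmp/auth-state.json}"

# Fail early unless credentials are supplied through the environment.
require_credentials() {
  if [ -z "${EMAIL:-}" ] || [ -z "${PASSWORD:-}" ]; then
    echo "EMAIL and PASSWORD must be set in the environment" >&2
    return 1
  fi
}

# Remove the saved state once the capture session is finished.
cleanup_state() {
  rm -f "$STATE_FILE"
}
```

A capture script could call `require_credentials || exit 1` before the login commands and register `trap cleanup_state EXIT` so the state file does not outlive the run.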

Structured Output

Convert scraped content to clean, structured formats for downstream processing.

Rule	Description
`structured-output.md`	Markdown conversion, JSON extraction, metadata preservation, content cleaning

Key Decisions: Remove noise elements before extraction | Preserve metadata (title, URL, timestamp) | Validate extracted data structure
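Metadata preservation and structure validation might be sketched as follows. The front-matter layout, the field names, and the helpers `to_markdown` and `has_metadata` are assumptions for illustration; the rule file may define a different format. Titles that contain double quotes would need escaping, which this sketch does not handle.

```bash
# Hypothetical helper: wrap extracted text from stdin in a Markdown document
# whose front matter preserves the title, source URL, and capture timestamp.
# Usage: to_markdown <title> <url> < extracted.txt
to_markdown() {
  tm_title="$1"; tm_url="$2"
  tm_stamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  printf '%s\n' '---'
  printf 'title: "%s"\n' "$tm_title"
  printf 'url: %s\n' "$tm_url"
  printf 'captured_at: %s\n' "$tm_stamp"
  printf '%s\n\n' '---'
  cat
}

# Validate the structure: the first five lines must contain all three fields.
# Usage: has_metadata <file>
has_metadata() {
  hm_head=$(sed -n '1,5p' "$1")
  for hm_key in title url captured_at; do
    printf '%s\n' "$hm_head" | grep -q "^$hm_key: " || return 1
  done
}
```

For example, `agent-browser get text @e5 | to_markdown "Example" "https://example.com" > page.md` followed by `has_metadata page.md` would reject a file whose front matter is missing a field.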

Anti-Patterns (FORBIDDEN)

```bash
# Automation
agent-browser fill @e2 "hardcoded-password"  # Never hardcode credentials
agent-browser open "$UNVALIDATED_URL"        # Always validate URLs

# Scraping
# Crawling without checking robots.txt
# No delay between requests (hammering servers)
# Ignoring rate limit responses (429)

# Content capture
agent-browser get text body  # Prefer targeted ref extraction
# Trusting page content without validation
# Not waiting for SPA hydration before extraction

# Session management
# Storing auth state in code repositories
# Not cleaning up state files after use
```
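As a counterpart to the unvalidated `open` call, a URL can be checked against an expected host before the browser visits it. This is a minimal sketch, assuming a single allowed host: `validate_url` is a hypothetical helper, and it deliberately rejects anything other than plain `http://` or `https://` URLs on that host, including URLs with ports or embedded credentials.

```bash
# Hypothetical allow-list check: accept only http(s) URLs on an expected host.
# Usage: validate_url <url> <allowed-host>
validate_url() {
  case "$1" in
    "https://$2"|"https://$2/"*|"http://$2"|"http://$2/"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

The forbidden call then becomes `validate_url "$UNVALIDATED_URL" example.com && agent-browser open "$UNVALIDATED_URL"`, where `example.com` stands for whichever host the capture is meant to reach.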

Agent-Browser Key Commands

Command	Purpose
`open <url>`	Navigate to URL
`snapshot -i`	Interactive elements with refs
`click @e1`	Click element
`fill @e2 "text"`	Clear + type into input
`get text @e1`	Extract element text
`get html @e1`	Get element HTML
`eval "<js>"`	Run custom JavaScript
`wait --load networkidle`	Wait for SPA render
`wait --text "Expected"`	Wait for specific text
`wait @e1`	Wait for element
`screenshot <path>`	Save screenshot
`state save <file>`	Persist cookies/storage
`state load <file>`	Restore session
`--session <name>`	Isolate parallel sessions
`--headed`	Show browser window
`console`	Read JS console
`network requests`	Monitor XHR/fetch
`record start <path>`	Start video recording
`record stop`	Stop recording
`close`	Close browser

Run `agent-browser --help` for the full 60+ command reference.

Detailed Documentation

Resource	Description
references/playwright-setup.md	Installation, configuration, environment variables
references/page-interaction.md	Click, fill, navigate, wait patterns
references/content-extraction.md	Text, HTML, and JS-based content extraction
references/spa-extraction.md	React, Vue, Angular, Next.js extraction
references/scraping-strategies.md	Multi-page crawl, pagination, parallel sessions
references/anti-bot-handling.md	Rate limiting, robots.txt, CAPTCHA, resume
references/auth-flows.md	Login, OAuth, SSO, session persistence
references/structured-output.md	Markdown/JSON conversion, metadata, validation

Related Skills

Capability Details

browser-automation

Keywords: browser, automation, headless, agent-browser, playwright, puppeteer, CLI

Solves:

Automate browser tasks with agent-browser CLI
Set up headless browser environments
Run parallel browser sessions
Cloud browser provider configuration

content-capture

Keywords: capture, extract, scrape, content, text, HTML, screenshot, web page

Solves:

Extract content from JavaScript-rendered pages
Capture screenshots and visual verification
Handle dynamic content loading
WebFetch returns empty or partial content

spa-extraction

Keywords: react, vue, angular, spa, javascript, client-side, hydration, ssr, next.js, nuxt

Solves:

React/Vue/Angular app content extraction
Wait for SPA hydration before extraction
Handle infinite scroll and lazy loading
Framework detection and specific wait strategies

authentication

Keywords: login, authentication, session, cookie, protected, private, gated, OAuth, SSO

Solves:

Content behind login wall
Multi-step authentication flows
OAuth and SSO with headed mode
Session persistence across captures

multi-page-crawl

Keywords: crawl, sitemap, navigation, multiple pages, documentation, pagination, recursive

Solves:

Capture entire documentation sites
Handle click-based and URL-based pagination
Recursive depth-limited crawling
Parallel crawling with sessions

anti-bot

Keywords: rate limit, robots.txt, CAPTCHA, bot detection, delay, throttle

Solves:

Respectful scraping with rate limits
Handling CAPTCHA and bot detection
Resume capability for interrupted crawls
Error handling for failed pages
