browser-tools
Browser Tools
Browser automation and content capture patterns using the agent-browser CLI, Playwright, and Puppeteer. Each category has individual rule files in `references/`, loaded on demand.
Quick Reference
| Category | Rules | When to Use |
|---|---|---|
| Playwright Setup | 1 | Installing and configuring Playwright for automation |
| Page Interaction | 1 | Clicking, filling forms, navigating with snapshot + refs |
| Content Extraction | 1 | Extracting text, HTML, structured data from pages |
| SPA Extraction | 1 | React, Vue, Angular apps with client-side rendering |
| Scraping Strategies | 1 | Multi-page crawls, pagination, recursive crawling |
| Anti-Bot Handling | 1 | Rate limiting, CAPTCHA, session management |
| Authentication Flows | 1 | Login forms, OAuth, SSO, session persistence |
| Structured Output | 1 | Converting scraped content to clean markdown/JSON |
Total: 8 rules across 8 categories
Quick Start
```bash
# Install agent-browser
npm install -g agent-browser
agent-browser install  # Download Chromium

# Basic capture workflow
agent-browser open https://example.com
agent-browser wait --load networkidle
agent-browser snapshot -i  # Get interactive elements with @refs
agent-browser get text @e5  # Extract content by ref
agent-browser screenshot /tmp/page.png
agent-browser close

# Fallback decision tree
# 1. Try WebFetch first (fast, no browser overhead)
# 2. If empty/partial -> use agent-browser
# 3. If SPA -> wait --load networkidle
# 4. If login required -> authentication flow + state save
# 5. If dynamic -> wait @element or wait --text

# Authentication with state persistence
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "$EMAIL"
agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save /tmp/auth-state.json

# SPA extraction (React/Vue/Angular)
agent-browser open https://react-app.example.com
agent-browser wait --load networkidle
agent-browser eval "document.querySelector('article').innerText"
```
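The first two steps of the fallback decision tree could be scripted roughly as below. This is a sketch under assumptions, not part of the skill: `curl` stands in for WebFetch, the 200-character threshold (`MIN_TEXT_CHARS`) is an arbitrary guess at what "empty/partial" means, and `ab`, `AGENT_BROWSER`, `visible_char_count`, `needs_browser`, and `capture` are hypothetical names.

```bash
# Hypothetical wrapper: AGENT_BROWSER lets a dry run swap in a stub.
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# Count non-whitespace characters left after crudely stripping tags.
visible_char_count() {
  printf '%s' "$1" | sed -e 's/<[^>]*>//g' | tr -d '[:space:]' | wc -c | tr -d ' '
}

# Succeeds when the static body looks empty or partial (assumed threshold).
needs_browser() {
  [ "$(visible_char_count "$1")" -lt "${MIN_TEXT_CHARS:-200}" ]
}

capture() {
  url="$1"
  body="$(curl -fsSL "$url" 2>/dev/null)"   # stand-in for WebFetch
  if needs_browser "$body"; then
    # Fall back to the browser and print refs for a targeted "get text @eN".
    ab open "$url" && ab wait --load networkidle && ab snapshot -i
  else
    printf '%s\n' "$body"
  fi
}
```

The tag stripping is deliberately crude (inline scripts still count as text), so the threshold may need tuning per site.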
Playwright Setup
Browser automation setup using agent-browser CLI (93% less context than full Playwright MCP) or Playwright directly.
| Rule | Description |
|---|---|
| references/playwright-setup.md | Installation, configuration, environment variables, cloud providers |
Key Decisions: agent-browser CLI preferred | `snapshot -i` for element discovery | `--session` for parallel isolation
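Parallel isolation with `--session` might look like the sketch below. It assumes `--session` is given as a global option before the subcommand; the `ab` wrapper, the `AGENT_BROWSER` variable, and `capture_in_session` are hypothetical names used only for illustration.

```bash
# Hypothetical wrapper: AGENT_BROWSER lets a dry run swap in a stub.
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# Open a URL in its own named session, screenshot it, and always close the session.
capture_in_session() {
  session="$1"; url="$2"; out="$3"
  ab --session "$session" open "$url" &&
    ab --session "$session" wait --load networkidle &&
    ab --session "$session" screenshot "$out"
  status=$?
  ab --session "$session" close
  return "$status"
}
```

Two captures can then run side by side, for example `capture_in_session docs https://example.com/docs /tmp/docs.png &` followed by a second call with another session name and a final `wait`.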
Page Interaction
Interact with page elements using snapshot refs for clicking, filling, and navigating.
| Rule | Description |
|---|---|
| references/page-interaction.md | Click, fill, navigate, wait patterns using snapshot refs |
Key Decisions: Always re-snapshot after navigation | Use refs (@e1) not CSS selectors | Wait for networkidle after navigation
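The "re-snapshot after navigation" decision can be wrapped in a small helper. A minimal sketch, assuming the clicked ref triggers a navigation; `ab`, `AGENT_BROWSER`, and `click_and_resnapshot` are hypothetical names.

```bash
# Hypothetical wrapper: AGENT_BROWSER lets a dry run swap in a stub.
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# Click a ref, wait for the new page to settle, then print fresh refs.
click_and_resnapshot() {
  ab click "$1" || return 1
  ab wait --load networkidle || return 1
  ab snapshot -i   # refs from the previous page are stale after navigation
}
```

Calling `click_and_resnapshot @e3` keeps the next `fill` or `get text` from using a ref that belonged to the old page.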
Content Extraction
Extract text, HTML, and structured data from web pages.
| Rule | Description |
|---|---|
| references/content-extraction.md | Text extraction, HTML capture, JavaScript eval for custom extraction |
Key Decisions: Use targeted refs over full-body extraction | Remove noise elements before extraction | Cache extracted content
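Caching extracted content could be done with a small file cache keyed by URL and extraction expression. This is only a sketch: `CACHE_DIR`, `cache_key`, `extract_cached`, `ab`, and `AGENT_BROWSER` are hypothetical, it assumes `eval` prints the raw result to stdout, and the `cksum` key is not collision-proof.

```bash
# Hypothetical wrapper: AGENT_BROWSER lets a dry run swap in a stub.
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# Derive a short cache key (CRC from cksum; fine for a sketch, not collision-proof).
cache_key() { printf '%s' "$1" | cksum | cut -d' ' -f1; }

# extract_cached URL JS_EXPRESSION: reuse a cached result or extract and store it.
extract_cached() {
  url="$1"; js="$2"
  dir="${CACHE_DIR:-/tmp/browser-tools-cache}"
  file="$dir/$(cache_key "$url $js").txt"
  if [ ! -s "$file" ]; then
    mkdir -p "$dir" || return 1
    ab open "$url" >/dev/null && ab wait --load networkidle >/dev/null &&
      ab eval "$js" > "$file.tmp" && mv "$file.tmp" "$file"
    status=$?
    ab close >/dev/null
    if [ "$status" -ne 0 ]; then rm -f "$file.tmp"; return "$status"; fi
  fi
  cat "$file"
}
```

A call such as `extract_cached https://react-app.example.com "document.querySelector('article').innerText"` only opens the browser the first time.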
SPA Extraction
Extract content from JavaScript-rendered Single Page Applications.
| Rule | Description |
|---|---|
| references/spa-extraction.md | React, Vue, Angular, Next.js, Nuxt, Docusaurus extraction patterns |
Key Decisions: Wait for hydration, not just DOM ready | Use framework-specific detection | Handle infinite scroll and lazy loading
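Infinite scroll and lazy loading can be handled by scrolling until the page height stops growing. The loop below is a sketch, not the skill's own rule: it assumes `eval` prints the bare number returned by the expression, and `scroll_until_stable`, `SCROLL_PAUSE`, `ab`, and `AGENT_BROWSER` are hypothetical names.

```bash
# Hypothetical wrapper: AGENT_BROWSER lets a dry run swap in a stub.
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# Scroll to the bottom repeatedly; succeed once two consecutive heights match.
scroll_until_stable() {
  max_rounds="${1:-10}"
  prev=""
  i=0
  while [ "$i" -lt "$max_rounds" ]; do
    height="$(ab eval "window.scrollTo(0, document.body.scrollHeight); document.body.scrollHeight")"
    if [ "$height" = "$prev" ]; then return 0; fi
    prev="$height"
    ab wait --load networkidle
    sleep "${SCROLL_PAUSE:-1}"   # extra pause in case lazy loads start late
    i=$((i + 1))
  done
  return 1   # still growing after max_rounds
}
```

The round limit keeps the loop from running forever on feeds that never end.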
Scraping Strategies
Multi-page crawling, pagination handling, and recursive site extraction.
| Rule | Description |
|---|---|
| references/scraping-strategies.md | Multi-page crawl, pagination, recursive depth-limited crawling, parallel sessions |
Key Decisions: Extract links first, then visit | Depth-limit recursive crawls | Use parallel sessions for throughput
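A depth-limited crawl that extracts links first and visits them level by level might be sketched as follows. Everything named here (`crawl`, `extract_links`, `CRAWL_DELAY`, `ab`, `AGENT_BROWSER`) is hypothetical, and `extract_links` assumes `eval` prints the joined string as raw lines, which may not match the CLI's actual output format.

```bash
# Hypothetical wrapper: AGENT_BROWSER lets a dry run swap in a stub.
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# Print one absolute link per line for the given page (output format is an assumption).
extract_links() {
  ab open "$1" >/dev/null && ab wait --load networkidle >/dev/null &&
    ab eval "Array.from(document.querySelectorAll('a[href]'), a => a.href).join('\n')"
}

# crawl START_URL [MAX_DEPTH]: breadth-first, prints "depth<TAB>url" per visited page.
crawl() {
  start="$1"; max_depth="${2:-2}"
  work="$(mktemp -d)" || return 1
  printf '%s\n' "$start" > "$work/frontier"
  : > "$work/seen"
  depth=0
  while [ "$depth" -le "$max_depth" ] && [ -s "$work/frontier" ]; do
    : > "$work/next"
    while IFS= read -r url; do
      grep -qxF -- "$url" "$work/seen" && continue
      printf '%s\n' "$url" >> "$work/seen"
      printf '%s\t%s\n' "$depth" "$url"   # placeholder for the real per-page capture
      if [ "$depth" -lt "$max_depth" ]; then
        # Keep only links under the start URL so the crawl stays on-site.
        extract_links "$url" </dev/null |
          awk -v p="$start" 'index($0, p) == 1' >> "$work/next"
      fi
      sleep "${CRAWL_DELAY:-1}"           # delay between requests
    done < "$work/frontier"
    sort -u "$work/next" > "$work/frontier"
    depth=$((depth + 1))
  done
  rm -rf "$work"
}
```

The `seen` file prevents revisits when pages link back to each other, and the depth counter bounds the recursion.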
Anti-Bot Handling
Rate limiting, CAPTCHA handling, session management, and respectful scraping.
| Rule | Description |
|---|---|
| references/anti-bot-handling.md | Rate limiting, robots.txt, CAPTCHA, error handling, resume capability |
Key Decisions: Always check robots.txt | Add delays between requests | Use headed mode for CAPTCHA | Implement resume capability
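The robots.txt check and rate-limit handling could be approximated as below. This is a simplified sketch: `robots_allows` only honors `Disallow` prefixes in a `User-agent: *` group (no `Allow` rules, wildcards, or multi-agent groups), `fetch_with_backoff` uses `curl` to probe the status code, and both function names plus `BASE_DELAY` and `MAX_TRIES` are hypothetical.

```bash
# robots_allows PATH < robots.txt
# Succeeds when no Disallow prefix in the "User-agent: *" group matches PATH.
robots_allows() {
  awk -v path="$1" '
    { sub(/\r$/, ""); sub(/#.*/, "") }
    tolower($0) ~ /^user-agent:/ {
      ua = $0; sub(/^[^:]*:[ \t]*/, "", ua)
      in_star = (ua ~ /^\*[ \t]*$/)
      next
    }
    in_star && tolower($0) ~ /^disallow:/ {
      rule = $0; sub(/^[^:]*:[ \t]*/, "", rule); sub(/[ \t]+$/, "", rule)
      if (rule != "" && index(path, rule) == 1) blocked = 1
    }
    END { exit (blocked ? 1 : 0) }
  '
}

# Probe a URL; on HTTP 429 wait and retry with a doubling delay.
fetch_with_backoff() {
  url="$1"; tries=0; delay="${BASE_DELAY:-2}"
  while [ "$tries" -lt "${MAX_TRIES:-4}" ]; do
    code="$(curl -s -o /dev/null -w '%{http_code}' "$url")"
    if [ "$code" != 429 ]; then echo "$code"; return 0; fi
    sleep "$delay"
    delay=$((delay * 2))
    tries=$((tries + 1))
  done
  return 1   # still rate limited after MAX_TRIES attempts
}
```

For example, `curl -s https://example.com/robots.txt | robots_allows /docs/` would gate a crawl of `/docs/` before any browser session is opened.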
Authentication Flows
Login forms, OAuth/SSO flows, session persistence, and multi-step authentication.
| Rule | Description |
|---|---|
| references/auth-flows.md | Form login, OAuth popup, SSO redirect, state save/restore, session management |
Key Decisions: Save state after login | Use headed mode for OAuth/SSO | Never hardcode credentials | Clean up state files
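Three of these decisions (save state after login, never hardcode credentials, clean up state files) can be combined in one wrapper. A minimal sketch, assuming the login form exposes the same refs as the Quick Start example (`@e1`, `@e2`, `@e3`); `login_and_save_state`, `with_auth_state`, `ab`, and `AGENT_BROWSER` are hypothetical names.

```bash
# Hypothetical wrapper: AGENT_BROWSER lets a dry run swap in a stub.
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# Log in with credentials taken from the environment and save state to the given file.
login_and_save_state() {
  state_file="$1"
  if [ -z "${EMAIL:-}" ] || [ -z "${PASSWORD:-}" ]; then
    echo "EMAIL and PASSWORD must be set in the environment" >&2
    return 1
  fi
  ab open https://app.example.com/login &&
    ab snapshot -i &&
    ab fill @e1 "$EMAIL" &&
    ab fill @e2 "$PASSWORD" &&
    ab click @e3 &&
    ab wait --url "**/dashboard" &&
    ab state save "$state_file"
}

# with_auth_state COMMAND...: run COMMAND with a temporary state file, then delete it.
with_auth_state() {
  state_file="$(mktemp)" || return 1   # mktemp creates the file owner-readable only
  login_and_save_state "$state_file" && "$@" "$state_file"
  status=$?
  rm -f "$state_file"
  return "$status"
}
```

The state file lives outside the repository and is removed whether or not the wrapped command succeeds.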
Structured Output
Convert scraped content to clean, structured formats for downstream processing.
| Rule | Description |
|---|---|
| references/structured-output.md | Markdown conversion, JSON extraction, metadata preservation, content cleaning |
Key Decisions: Remove noise elements before extraction | Preserve metadata (title, URL, timestamp) | Validate extracted data structure
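Preserving metadata and validating the record can be done without extra tools. The sketch below builds a one-line JSON record in plain shell; `json_escape` and `to_json_record` are hypothetical helpers, the field names are illustrative, and the escaping covers only backslashes, quotes, tabs, and newlines (other control characters are not handled).

```bash
# Escape a string for use inside a JSON string literal (limited character set).
json_escape() {
  tab="$(printf '\t')"
  printf '%s' "$1" | tr -d '\r' |
    sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' -e "s/$tab/\\\\t/g" |
    awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# to_json_record URL TITLE CONTENT: emit a record with metadata, rejecting empty content.
to_json_record() {
  url="$1"; title="$2"; content="$3"
  if [ -z "$content" ]; then
    echo "refusing to emit a record with empty content" >&2
    return 1
  fi
  printf '{"url":"%s","title":"%s","captured_at":"%s","content":"%s"}\n' \
    "$(json_escape "$url")" "$(json_escape "$title")" \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(json_escape "$content")"
}
```

Where `jq` is available it would be a more robust choice for building the record; the plain-shell version is shown only to keep the sketch dependency-free.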
Anti-Patterns (FORBIDDEN)
```bash
# Automation
agent-browser fill @e2 "hardcoded-password"  # Never hardcode credentials
agent-browser open "$UNVALIDATED_URL"        # Always validate URLs

# Scraping
# Crawling without checking robots.txt
# No delay between requests (hammering servers)
# Ignoring rate limit responses (429)

# Content capture
agent-browser get text body  # Prefer targeted ref extraction
# Trusting page content without validation
# Not waiting for SPA hydration before extraction

# Session management
# Storing auth state in code repositories
# Not cleaning up state files after use
```
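As a counterpart to the unvalidated `open` call, a URL can be checked against a scheme and host allowlist before it reaches the browser. A minimal sketch: `validate_url` and `ALLOWED_HOSTS` are hypothetical, only `https` is accepted, and IPv6 literals are not handled.

```bash
# validate_url URL: succeed only for https URLs whose host is in ALLOWED_HOSTS
# (space-separated, hypothetical; defaults to example.com).
validate_url() {
  url="$1"
  case "$url" in
    https://*) ;;
    *) return 1 ;;
  esac
  rest="${url#https://}"
  host="${rest%%[/?#]*}"                    # drop path, query, fragment
  case "$host" in ""|*@*) return 1 ;; esac  # reject empty hosts and userinfo tricks
  host="${host%%:*}"                        # drop an optional port
  for allowed in ${ALLOWED_HOSTS:-example.com}; do
    if [ "$host" = "$allowed" ]; then return 0; fi
  done
  return 1
}
```

A guarded call then reads `validate_url "$URL" && agent-browser open "$URL"`.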
Agent-Browser Key Commands
| Command | Purpose |
|---|---|
| `open <url>` | Navigate to URL |
| `snapshot -i` | Interactive elements with refs |
| `click @e1` | Click element |
| `fill @e1 "text"` | Clear + type into input |
| `get text @e1` | Extract element text |
| | Get element HTML |
| `eval "<js>"` | Run custom JavaScript |
| `wait --load networkidle` | Wait for SPA render |
| `wait --text <text>` | Wait for specific text |
| `wait @e1` | Wait for element |
| `screenshot <path>` | Save screenshot |
| `state save <file>` | Persist cookies/storage |
| | Restore session |
| `--session` | Isolate parallel sessions |
| | Show browser window |
| | Read JS console |
| | Monitor XHR/fetch |
| | Start video recording |
| | Stop recording |
| `close` | Close browser |
Run `agent-browser --help` for the full 60+ command reference.
Detailed Documentation
| Resource | Description |
|---|---|
| references/playwright-setup.md | Installation, configuration, environment variables |
| references/page-interaction.md | Click, fill, navigate, wait patterns |
| references/content-extraction.md | Text, HTML, and JS-based content extraction |
| references/spa-extraction.md | React, Vue, Angular, Next.js extraction |
| references/scraping-strategies.md | Multi-page crawl, pagination, parallel sessions |
| references/anti-bot-handling.md | Rate limiting, robots.txt, CAPTCHA, resume |
| references/auth-flows.md | Login, OAuth, SSO, session persistence |
| references/structured-output.md | Markdown/JSON conversion, metadata, validation |
Related Skills
- `testing-patterns` - Comprehensive testing patterns including E2E and webapp testing
- `api-design` - API design patterns for endpoints discovered during scraping
- `data-visualization` - Visualizing extracted data
Capability Details
browser-automation
Keywords: browser, automation, headless, agent-browser, playwright, puppeteer, CLI
Solves:
- Automate browser tasks with agent-browser CLI
- Set up headless browser environments
- Run parallel browser sessions
- Cloud browser provider configuration
content-capture
Keywords: capture, extract, scrape, content, text, HTML, screenshot, web page
Solves:
- Extract content from JavaScript-rendered pages
- Capture screenshots and visual verification
- Handle dynamic content loading
- WebFetch returns empty or partial content
spa-extraction
Keywords: react, vue, angular, spa, javascript, client-side, hydration, ssr, next.js, nuxt
Solves:
- React/Vue/Angular app content extraction
- Wait for SPA hydration before extraction
- Handle infinite scroll and lazy loading
- Framework detection and specific wait strategies
authentication
Keywords: login, authentication, session, cookie, protected, private, gated, OAuth, SSO
Solves:
- Content behind login wall
- Multi-step authentication flows
- OAuth and SSO with headed mode
- Session persistence across captures
multi-page-crawl
Keywords: crawl, sitemap, navigation, multiple pages, documentation, pagination, recursive
Solves:
- Capture entire documentation sites
- Handle click-based and URL-based pagination
- Recursive depth-limited crawling
- Parallel crawling with sessions
anti-bot
Keywords: rate limit, robots.txt, CAPTCHA, bot detection, delay, throttle
Solves:
- Respectful scraping with rate limits
- Handling CAPTCHA and bot detection
- Resume capability for interrupted crawls
- Error handling for failed pages