browser-agent
Browser Agent
A browser automation toolset for AI agents. It provides three complementary tools for retrieving web data and automating browser operations.
Tool Selection Guide

User Request
│
├── Simple static content scraping?
│   └── Use curl / WebFetch (faster)
│
├── Need JS rendering / bypass anti-scraping?
│   ├── agent-browser ── Extract accessibility tree
│   │
│   ├── Screenshot? ── agent-browser -s
│   │
│   └── Target site in actionbook list?
│       └── actionbook get <site> ── Get dedicated recipe
│
└── Complex multi-step automation?
    └── browser-use (Python) ── AI-powered autonomous operation
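
The decision tree above can be sketched as a small routing function. This is only an illustration: the `Request` fields and the `choose_tool` name are hypothetical, and the set of actionbook-supported sites is passed in by the caller rather than queried from the tool.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical description of a scraping task (not part of any tool's API)."""
    needs_js: bool = False          # page needs JS rendering or blocks plain HTTP clients
    needs_screenshot: bool = False
    multi_step: bool = False        # e.g. login, form filling, navigating several pages
    site: str = ""                  # short site name, e.g. "github"


def choose_tool(req: Request, actionbook_sites: frozenset = frozenset()) -> str:
    """Return the command or library suggested by the selection guide."""
    if req.multi_step:
        return "browser-use"
    if not req.needs_js and not req.needs_screenshot:
        return "curl / WebFetch"
    if req.needs_screenshot:
        return "agent-browser -s"
    if req.site in actionbook_sites:
        return f"actionbook get {req.site}"
    return "agent-browser"
```

For example, `choose_tool(Request(needs_js=True, site="github"), frozenset({"github"}))` selects the actionbook recipe, while the same request for a site outside the set falls back to plain `agent-browser`.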
agent-browser
A CLI tool that uses Playwright to launch a headless browser and extract the page's accessibility tree.
Core advantages:
- Accesses most content without logging in
- Returns structured, readable text
- Supports screenshots
- Handles JS rendering automatically
Basic Usage

```bash
# Extract web content (accessibility tree)
agent-browser <URL>

# Take a screenshot
agent-browser -s <URL>

# Specify the output format
agent-browser -f markdown <URL>
agent-browser -f html <URL>
agent-browser -f text <URL>

# Interactive mode (click and scroll available)
agent-browser -i <URL>

# Specify the browser
agent-browser --browser chromium <URL>
agent-browser --browser firefox <URL>
```

Common Scenarios

```bash
# Get X/Twitter post content
agent-browser "https://x.com/username/status/123456"

# Get GitHub repository information
agent-browser "https://github.com/owner/repo"

# Get a Reddit post
agent-browser "https://reddit.com/r/subreddit/comments/abc123"

# Get a news article (JS-rendered)
agent-browser "https://example.com/article"
```

**Detailed Command Reference**: [references/agent-browser-reference.md](references/agent-browser-reference.md)

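
When an agent calls the CLI from Python, a thin wrapper keeps the flag handling in one place. The sketch below uses only the `-s` and `-f` flags documented in Basic Usage; the function names are hypothetical, and it assumes that `agent-browser` is on `PATH` and writes its result to standard output, which this document does not confirm.

```python
import subprocess

ALLOWED_FORMATS = {"markdown", "html", "text"}  # formats listed in Basic Usage


def build_command(url, fmt=None, screenshot=False):
    """Build the argument list for one agent-browser call (hypothetical helper)."""
    cmd = ["agent-browser"]
    if screenshot:
        cmd.append("-s")
    if fmt is not None:
        if fmt not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported format: {fmt}")
        cmd += ["-f", fmt]
    cmd.append(url)
    return cmd


def fetch_page(url, fmt=None, timeout=60):
    """Run agent-browser and return its stdout; ASSUMES the CLI prints the result."""
    completed = subprocess.run(
        build_command(url, fmt=fmt),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

Passing the arguments as a list rather than a shell string avoids quoting problems with URLs that contain `&` or `?`.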
actionbook
Precomputed automation "recipes" for 50+ websites, each a validated automation template.
Basic Usage

```bash
# List all supported websites
actionbook list

# Get the recipe for a specific site
actionbook get <site>

# Examples
actionbook get github
actionbook get reddit
actionbook get amazon
```

Workflow
- Run `actionbook list` to view the supported websites
- Run `actionbook get <site>` to obtain the automation template for that site
- Write the automation script from the template (using browser-use, or agent-browser directly)

**Detailed Command Reference**: [references/actionbook-reference.md](references/actionbook-reference.md)
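
The first two workflow steps can be scripted. This is a sketch under a stated assumption: that `actionbook list` prints one site name per line, which may not match the real output format. `parse_site_list` and `get_recipe` are hypothetical helper names.

```python
import subprocess


def parse_site_list(output):
    """Parse `actionbook list` output, ASSUMING one site name per line."""
    return {line.strip().lower() for line in output.splitlines() if line.strip()}


def get_recipe(site):
    """Return the recipe text for `site`, or None if actionbook does not list it."""
    listing = subprocess.run(
        ["actionbook", "list"], capture_output=True, text=True, check=True
    )
    if site.lower() not in parse_site_list(listing.stdout):
        return None  # no recipe: fall back to agent-browser or browser-use
    recipe = subprocess.run(
        ["actionbook", "get", site], capture_output=True, text=True, check=True
    )
    return recipe.stdout
```

A caller would check for `None` and, when a recipe is returned, use it as the template for the automation script in the third step.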
browser-use
A Python library that uses AI to autonomously control a browser and complete complex tasks.
Installation

```bash
pip install browser-use
playwright install chromium
```

Basic Usage

```python
import asyncio

from browser_use import Agent
from langchain_openai import ChatOpenAI


async def main():
    # Describe the task in natural language; the agent plans and runs the browser steps
    agent = Agent(
        task="Go to GitHub and find the trending Python repositories",
        llm=ChatOpenAI(model="gpt-4"),
    )
    result = await agent.run()
    print(result)


# Run the coroutine on an event loop
asyncio.run(main())
```

Common Scenarios

```python
# These snippets assume `llm` is defined as in Basic Usage,
# e.g. llm = ChatOpenAI(model="gpt-4")

# Form filling
agent = Agent(
    task="Go to example.com and fill out the contact form with test data",
    llm=llm,
)

# Data scraping
agent = Agent(
    task="Go to Amazon, search for 'wireless headphones', and extract the top 5 products with prices",
    llm=llm,
)

# Multi-step operations
agent = Agent(
    task="Log into Twitter, navigate to settings, and enable two-factor authentication",
    llm=llm,
)
```

**Detailed API Reference**: [references/browser-use-reference.md](references/browser-use-reference.md)

Decision Flow
Task Type → Tool Selection
| Task Type | Recommended Tool | Reason |
|---|---|---|
| Quick single-page scraping | agent-browser | Simple and straightforward, accessibility tree output |
| Need page screenshots | agent-browser -s | Built-in screenshot functionality |
| Target site is in actionbook | actionbook + browser-use | Ready-made best practices available |
| Complex multi-step operations | browser-use | Autonomous AI decision-making and execution |
| Sites requiring login | browser-use | Can handle login flows |
| Batch data collection | browser-use | Supports loops and conditional logic |
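
For batch collection, the loop and the conditional logic live in ordinary Python around the agent. A minimal sketch: `run_batch` is a hypothetical helper that takes a factory `make_agent(task)` returning any object with an async `run()` method (such as the `Agent` shown earlier), so the loop itself does not depend on a particular browser-use version.

```python
import asyncio


async def run_batch(tasks, make_agent, delay_seconds=0.0):
    """Run tasks one at a time; collect results and errors separately."""
    results, errors = {}, {}
    for index, task in enumerate(tasks):
        if index > 0 and delay_seconds > 0:
            await asyncio.sleep(delay_seconds)  # pause between tasks to limit request rate
        try:
            results[task] = await make_agent(task).run()
        except Exception as exc:  # keep going; one failed task should not stop the batch
            errors[task] = exc
    return results, errors
```

With browser-use this could be called as `asyncio.run(run_batch(tasks, lambda t: Agent(task=t, llm=llm)))`, assuming `Agent` and `llm` are set up as in Basic Usage.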
Example Workflows

**Scenario: Get X/Twitter Post Content**

```bash
# Method 1: Use agent-browser directly (recommended)
agent-browser "https://x.com/username/status/123456"

# Method 2: Use browser-use for more complex operations
# Write a Python script
```

**Scenario: GitHub Trending Analysis**

```bash
# Method 1: agent-browser
agent-browser "https://github.com/trending"

# Method 2: Use actionbook to get the GitHub recipe
actionbook get github
# Then write a script based on the recipe
```

Notes
- Rate limiting: Frequent requests may get you blocked by the target website; add delays between requests.
- Login requirements: Some content is only accessible after login; use browser-use to handle it.
- Dynamic content: agent-browser waits for the page to finish loading, but infinite-scroll pages require interactive mode.
- Legal compliance: Make sure your scraping complies with the target website's terms of service.
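
The rate-limiting advice can be implemented as a fixed delay plus a little random jitter between requests. This is a sketch: `fetch` is any callable you supply (for example, a function that invokes agent-browser), `fetch_all` is a hypothetical name, and the default delay values are arbitrary placeholders rather than recommendations for any particular site.

```python
import random
import time


def fetch_all(urls, fetch, min_delay=2.0, jitter=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Sleep min_delay plus up to `jitter` extra seconds so requests are not evenly spaced
            sleep(min_delay + random.uniform(0.0, jitter))
        results.append(fetch(url))
    return results
```

The `sleep` parameter exists only so the pause can be replaced in tests; normal callers can leave it at its default.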
Troubleshooting
| Issue | Solution |
|---|---|
| Page loading timeout | Use |
| Content not rendered | Use interactive mode |
| Anti-scraping block | Try a different user agent, or use browser-use |
| Blank screenshot | Ensure the page is fully loaded before taking the screenshot |