agent-browser
Browser Automation with agent-browser
Core Workflow
Every browser automation follows this pattern:
- Navigate: `agent-browser open <url>`
- Snapshot: `agent-browser snapshot -i` (get element refs like `@e1`, `@e2`)
- Interact: Use refs to click, fill, select
- Re-snapshot: After navigation or DOM changes, get fresh refs
```bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i  # Check result
```
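A script can pick a ref out of the snapshot text instead of hard-coding it. The following is a minimal sketch, assuming snapshot entries look like the sample output line above (`@eN [tag ...]`, comma-separated or one per line); the real output format may differ, and `ref_for` is a hypothetical helper, not part of agent-browser.

```bash
# Hypothetical helper: print the first ref (@eN) whose snapshot entry
# contains a given fixed string.
# Assumption: entries look like  @e1 [input type="email"]  and are separated
# by commas or newlines, as in the sample output shown above.
ref_for() {
  # $1 = snapshot text, $2 = fixed string to look for inside an entry
  printf '%s\n' "$1" | tr ',' '\n' | grep -F -- "$2" | grep -o '@e[0-9][0-9]*' | head -n 1
}

# Sample taken from the output line shown above.
snapshot='@e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"'
ref_for "$snapshot" 'type="password"'   # prints @e2
```

In a live session the first argument would come from `"$(agent-browser snapshot -i)"`. Splitting on commas is a simplification that breaks if an element's label itself contains a comma.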
Essential Commands
```bash
# Navigation
agent-browser open <url>               # Navigate (aliases: goto, navigate)
agent-browser close                    # Close browser

# Snapshot
agent-browser snapshot -i              # Interactive elements with refs (recommended)
agent-browser snapshot -s "#selector"  # Scope to CSS selector

# Interaction (use @refs from snapshot)
agent-browser click @e1                # Click element
agent-browser fill @e2 "text"          # Clear and type text
agent-browser type @e2 "text"          # Type without clearing
agent-browser select @e1 "option"      # Select dropdown option
agent-browser check @e1                # Check checkbox
agent-browser press Enter              # Press key
agent-browser scroll down 500          # Scroll page

# Get information
agent-browser get text @e1             # Get element text
agent-browser get url                  # Get current URL
agent-browser get title                # Get page title

# Wait
agent-browser wait @e1                 # Wait for element
agent-browser wait --load networkidle  # Wait for network idle
agent-browser wait --url "**/page"     # Wait for URL pattern
agent-browser wait 2000                # Wait milliseconds

# Capture
agent-browser screenshot               # Screenshot to temp dir
agent-browser screenshot --full        # Full page screenshot
agent-browser pdf output.pdf           # Save as PDF
```
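When several of these commands are chained in a script, it helps to stop at the first one that fails. The sketch below assumes agent-browser exits non-zero on failure (not confirmed by this document); `AB`, `step`, and `open_and_capture` are hypothetical names introduced for illustration.

```bash
# Hypothetical override so the CLI can be swapped for a stub in tests.
AB="${AB:-agent-browser}"

# Run one agent-browser command, logging it to stderr.
# Returns 1 if the command fails (assumes a non-zero exit code on failure).
step() {
  echo "+ $AB $*" >&2
  $AB "$@" || { echo "step failed: $*" >&2; return 1; }
}

# Open a page, wait for the network to go idle, then take a full-page screenshot.
# The && chain skips the remaining steps after the first failure.
open_and_capture() {
  step open "$1" &&
  step wait --load networkidle &&
  step screenshot --full
}
```

A call such as `open_and_capture https://example.com` then returns non-zero as soon as any step fails, so a calling script can react to it.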
Common Patterns
Form Submission
```bash
agent-browser open https://example.com/signup
agent-browser snapshot -i
agent-browser fill @e1 "Jane Doe"
agent-browser fill @e2 "jane@example.com"
agent-browser select @e3 "California"
agent-browser check @e4
agent-browser click @e5
agent-browser wait --load networkidle
```
Authentication with State Persistence
```bash
# Login once and save state
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "$USERNAME"
agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json

# Reuse in future sessions
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
```
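The two halves of this pattern can be combined into one function that logs in only when no saved state exists. This is a sketch under assumptions: the refs `@e1` to `@e3` match the login form as in the example above, and `AB`, `STATE_FILE`, and `ensure_logged_in` are hypothetical names. It does not detect an expired saved session.

```bash
# Hypothetical overrides: AB allows stubbing the CLI, STATE_FILE names the saved state.
AB="${AB:-agent-browser}"
STATE_FILE="${STATE_FILE:-auth.json}"

# Load saved state if the file exists; otherwise log in and save it.
# Assumes @e1/@e2/@e3 are the username, password, and submit refs, as above.
ensure_logged_in() {
  if [ -f "$STATE_FILE" ]; then
    $AB state load "$STATE_FILE"
  else
    $AB open https://app.example.com/login &&
    $AB snapshot -i &&
    $AB fill @e1 "$USERNAME" &&
    $AB fill @e2 "$PASSWORD" &&
    $AB click @e3 &&
    $AB wait --url "**/dashboard" &&
    $AB state save "$STATE_FILE"
  fi
}
```

After `ensure_logged_in` succeeds, `agent-browser open https://app.example.com/dashboard` can run as in the reuse step.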
Data Extraction
```bash
agent-browser open https://example.com/products
agent-browser snapshot -i
agent-browser get text @e5              # Get specific element text
agent-browser get text body > page.txt  # Get all page text

# JSON output for parsing
agent-browser snapshot -i --json
agent-browser get text @e1 --json
```
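The `get text body > page.txt` idiom extends naturally to several pages. The sketch below saves the text of each URL to a numbered file; `AB`, `OUT_DIR`, and `dump_pages` are hypothetical names, and the wait step is an assumption about when page text is ready.

```bash
# Hypothetical overrides: AB allows stubbing the CLI, OUT_DIR is where text files go.
AB="${AB:-agent-browser}"
OUT_DIR="${OUT_DIR:-./pages}"

# Save the body text of every URL given as an argument to OUT_DIR/page-N.txt.
# Stops and returns 1 at the first command that fails.
dump_pages() {
  mkdir -p "$OUT_DIR" || return 1
  i=0
  for url in "$@"; do
    i=$((i + 1))
    $AB open "$url" &&
    $AB wait --load networkidle &&
    $AB get text body > "$OUT_DIR/page-$i.txt" || return 1
  done
}
```

For example, `dump_pages https://example.com/products https://example.com/about` would write `page-1.txt` and `page-2.txt`.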
Parallel Sessions
```bash
agent-browser --session site1 open https://site-a.com
agent-browser --session site2 open https://site-b.com
agent-browser --session site1 snapshot -i
agent-browser --session site2 snapshot -i
agent-browser session list
```
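Because each session is addressed by name, the per-site commands can be run as background jobs. The following is a sketch, assuming separate sessions can be driven concurrently from one shell; `AB` and `snapshot_all` are hypothetical names, and the snapshot files are written to the current directory.

```bash
# Hypothetical override so the CLI can be stubbed.
AB="${AB:-agent-browser}"

# Open each URL in its own named session (site1, site2, ...) in the background
# and save that session's snapshot to snapshot-siteN.txt.
snapshot_all() {
  n=0
  for url in "$@"; do
    n=$((n + 1))
    # The subshell captures the current value of n when it is started.
    ( $AB --session "site$n" open "$url" &&
      $AB --session "site$n" snapshot -i > "snapshot-site$n.txt" ) &
  done
  wait   # block until every background job has finished
}
```

A bare `wait` does not report which job failed, so a production script would need to collect the job PIDs and check each one.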
Visual Browser (Debugging)
```bash
agent-browser --headed open https://example.com
agent-browser highlight @e1            # Highlight element
agent-browser record start demo.webm   # Record session
```
Ref Lifecycle (Important)
Refs (`@e1`, `@e2`, etc.) are invalidated when the page changes. Always re-snapshot after:
- Clicking links or buttons that navigate
- Form submissions
- Dynamic content loading (dropdowns, modals)
```bash
agent-browser click @e5      # Navigates to new page
agent-browser snapshot -i    # MUST re-snapshot
agent-browser click @e1      # Use new refs
```
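One way to make the re-snapshot rule hard to forget is to bundle it with the click. This is a minimal sketch; `AB` and `click_and_resnapshot` are hypothetical names, and waiting for `networkidle` before the snapshot is an assumption about when the new page is ready.

```bash
# Hypothetical override so the CLI can be stubbed.
AB="${AB:-agent-browser}"

# Click a ref, wait for the page to settle, then print a fresh snapshot.
# Refs from before the click should be treated as stale afterwards.
click_and_resnapshot() {
  $AB click "$1" &&
  $AB wait --load networkidle &&
  $AB snapshot -i
}
```

Calling `click_and_resnapshot @e5` then prints the new refs, which replace the ones used before the click.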
Semantic Locators (Alternative to Refs)
When refs are unavailable or unreliable, use semantic locators:
```bash
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find role button click --name "Submit"
agent-browser find placeholder "Search" type "query"
agent-browser find testid "submit-btn" click
```
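A ref and a semantic locator can be combined so that the locator acts as a fallback. The sketch below assumes `agent-browser click` exits non-zero when the ref is stale or missing, which this document does not confirm; `AB` and `click_ref_or_text` are hypothetical names.

```bash
# Hypothetical override so the CLI can be stubbed.
AB="${AB:-agent-browser}"

# Try to click by ref first; if that command fails, fall back to clicking
# the element found by its visible text.
# Assumption: a stale or unknown ref makes `click` exit non-zero.
click_ref_or_text() {
  # $1 = ref such as @e3, $2 = visible text to fall back on
  $AB click "$1" 2>/dev/null || $AB find text "$2" click
}
```

For example, `click_ref_or_text @e3 "Sign In"` uses the ref when it is still valid and the text locator otherwise.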
Deep-Dive Documentation
| Reference | When to Use |
|---|---|
| references/commands.md | Full command reference with all options |
| references/snapshot-refs.md | Ref lifecycle, invalidation rules, troubleshooting |
| references/session-management.md | Parallel sessions, state persistence, concurrent scraping |
| references/authentication.md | Login flows, OAuth, 2FA handling, state reuse |
| references/video-recording.md | Recording workflows for debugging and documentation |
| references/proxy-support.md | Proxy configuration, geo-testing, rotating proxies |
Ready-to-Use Templates
| Template | Description |
|---|---|
| templates/form-automation.sh | Form filling with validation |
| templates/authenticated-session.sh | Login once, reuse state |
| templates/capture-workflow.sh | Content extraction with screenshots |
```bash
./templates/form-automation.sh https://example.com/form
./templates/authenticated-session.sh https://app.example.com/login
./templates/capture-workflow.sh https://example.com ./output
```