agent-browser

Agent Browser Testing Skill

Browser automation and end-to-end testing using Vercel's agent-browser CLI. Uses ref-based element targeting for reliable, AI-friendly browser interaction.

Quick Decision Tree

What do you need?
├─ Take a screenshot of a page?
│  └─ agent-browser open [url] && agent-browser screenshot
├─ Fill out a form?
│  └─ open → snapshot -i → fill @ref → click @submit → snapshot
├─ Test a login flow?
│  └─ See references/authentication.md
├─ Run an E2E test?
│  └─ See references/testing-patterns.md
├─ Scrape page content?
│  └─ agent-browser open [url] && agent-browser snapshot -i
└─ Debug element targeting?
   └─ agent-browser snapshot -i --format json

Installation

bash
# Install agent-browser globally
npm install -g agent-browser
# Install browser dependencies (Chromium)
agent-browser install
# Verify installation
agent-browser --version

Core Concept: Ref-Based Targeting

Agent-browser uses refs (like @e1, @e2, @e3) to identify interactive elements on the page. These refs are assigned when you take a snapshot.
bash
# Take a snapshot with interactive elements labeled
agent-browser snapshot -i
# Output shows refs:
# @e1: [button] "Sign In"
# @e2: [input] Email field
# @e3: [input] Password field
# @e4: [button] "Submit"
# Use refs to interact
agent-browser click @e1
agent-browser fill @e2 "user@example.com"

**Important:** Refs are session-specific and become invalid when the page changes. Always re-snapshot after navigation or DOM updates.
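
When a script drives the CLI, it can help to turn snapshot text into a lookup table so that refs are resolved by label instead of being hard-coded. The following is a minimal sketch that assumes the snapshot lines look exactly like the sample output in this section (`@e1: [button] "Sign In"`); the real output format may differ, and `parse_refs` and `find_ref` are hypothetical helper names, not part of agent-browser.

```python
import re

# Assumed line shape, based only on the sample output in this section:
#   @e1: [button] "Sign In"
REF_LINE = re.compile(r'^(@e\d+):\s*\[(\w+)\]\s*(.*)$')


def parse_refs(snapshot_text):
    """Map each ref to its element kind and label."""
    refs = {}
    for line in snapshot_text.splitlines():
        match = REF_LINE.match(line.strip())
        if match:
            ref, kind, label = match.groups()
            # Drop surrounding quotes so '"Submit"' and 'Submit' compare equal
            refs[ref] = {"kind": kind, "label": label.strip().strip('"')}
    return refs


def find_ref(refs, kind, label):
    """Return the first ref whose kind and label match, or None."""
    for ref, info in refs.items():
        if info["kind"] == kind and info["label"] == label:
            return ref
    return None


sample = '''@e1: [button] "Sign In"
@e2: [input] Email field
@e3: [input] Password field
@e4: [button] "Submit"'''

refs = parse_refs(sample)
print(find_ref(refs, "button", "Submit"))  # prints @e4
```

Because refs become invalid when the page changes, a table built this way should be discarded and rebuilt from a fresh snapshot after every navigation or DOM update.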

Essential Workflow

bash
# 1. Open the target URL
agent-browser open https://example.com
# 2. Take a snapshot to see the page and get refs
agent-browser snapshot -i
# 3. Interact with elements using refs
agent-browser click @e1
agent-browser fill @e2 "test value"
# 4. Take another snapshot to verify changes
agent-browser snapshot -i
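
The four-step workflow can also be scripted. The sketch below wraps the open, snapshot, interact, snapshot sequence in one Python function; `run_workflow` is a hypothetical name, and the refs passed in `actions` are placeholders, since real refs are only known after reading the first snapshot. The command runner is injectable, so the demo records commands with a stand-in instead of launching a browser.

```python
import subprocess


def run_workflow(url, actions, run=subprocess.run):
    """Open url, snapshot, apply actions, then snapshot again.

    actions is a list of argument lists such as ["click", "@e1"].
    run defaults to subprocess.run and can be swapped for a test double.
    """
    steps = [["open", url], ["snapshot", "-i"], *actions, ["snapshot", "-i"]]
    for step in steps:
        # check=True makes subprocess.run raise on the first failing command
        run(["agent-browser", *step], check=True)
    return steps


# Demo with a recording stand-in, so nothing is executed here
recorded = []


def fake_run(cmd, check):
    recorded.append(cmd)


run_workflow(
    "https://example.com",
    [["click", "@e1"], ["fill", "@e2", "test value"]],
    run=fake_run,
)
for cmd in recorded:
    print(" ".join(cmd))
```

Dropping the `run=fake_run` argument would invoke the real CLI, which assumes agent-browser is installed and on the PATH.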

Common Commands Quick Reference

Navigation

bash
agent-browser open <url>              # Navigate to URL
agent-browser back                    # Go back
agent-browser forward                 # Go forward
agent-browser refresh                 # Reload page

Snapshots

bash
agent-browser snapshot                # Text snapshot
agent-browser snapshot -i             # With interactive refs
agent-browser snapshot --format json  # JSON output
agent-browser screenshot [path]       # Save screenshot

Interaction

bash
agent-browser click @ref              # Click element
agent-browser fill @ref "value"       # Fill input field
agent-browser select @ref "option"    # Select dropdown option
agent-browser hover @ref              # Hover over element
agent-browser press Enter             # Press keyboard key

Semantic Locators

bash
agent-browser find role button "Submit"    # Find by ARIA role
agent-browser find text "Welcome"          # Find by visible text
agent-browser find label "Email"           # Find by label

Waiting

bash
agent-browser wait visible @ref            # Wait for element visible
agent-browser wait hidden @ref             # Wait for element hidden
agent-browser wait network                 # Wait for network idle
agent-browser wait time 2000               # Wait milliseconds

Session Management

bash
agent-browser session save mystate         # Save browser state
agent-browser session load mystate         # Load saved state
agent-browser session list                 # List saved sessions
agent-browser close                        # Close browser

Security Notes

Never commit these files:
  • *.state - Browser session state files contain cookies
  • agent-browser-profile/ - Profile directories with credentials
  • Screenshots that may contain sensitive data
Add to .gitignore:
gitignore
*.state
agent-browser-profile/
.agent-browser/
screenshots/
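
Checking the ignore rules by hand is easy to forget, so a setup script can do it. This is a minimal sketch, assuming the four patterns listed in this section are the ones to enforce; `ensure_ignored` is a hypothetical helper, and it only matches whole lines rather than interpreting gitignore pattern semantics. The demo runs against a temporary file so that no real repository is touched.

```python
import tempfile
from pathlib import Path

# Patterns taken from the .gitignore list in this section
REQUIRED = ("*.state", "agent-browser-profile/", ".agent-browser/", "screenshots/")


def ensure_ignored(gitignore_path, required=REQUIRED):
    """Append any missing patterns to the file and return what was added."""
    path = Path(gitignore_path)
    existing = path.read_text().splitlines() if path.exists() else []
    present = {line.strip() for line in existing}
    missing = [pattern for pattern in required if pattern not in present]
    if missing:
        # Rewrite the file with the original lines first, then the additions
        path.write_text("\n".join(existing + missing) + "\n")
    return missing


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / ".gitignore"
    target.write_text("node_modules/\n*.state\n")
    added = ensure_ignored(target)
    final_lines = target.read_text().splitlines()

print(added)
```

Running it a second time on the same file adds nothing, so it is safe to call from a project bootstrap step.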

Integration with Other Skills

With Parallel Research

bash
# Research a topic, then verify claims on websites
parallel_research.py chat "Find pricing for Acme Corp"
# Then use agent-browser to verify on their actual pricing page
agent-browser open https://acme.com/pricing
agent-browser snapshot -i

With Screenshot Comparison

bash
# Take baseline screenshots for visual regression
agent-browser open https://myapp.com
agent-browser screenshot baseline.png
# After changes, compare
agent-browser screenshot current.png
# Use image comparison tool
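
The source does not name an image comparison tool. As a stand-in, the sketch below uses only the Python standard library to check whether two screenshot files are byte-for-byte identical by comparing SHA-256 digests. This is a much stricter test than a perceptual diff: any rendering or encoding difference counts as a change, so it suits a quick "did anything change at all" gate rather than real visual regression. The function names are hypothetical, and the demo uses placeholder bytes instead of real PNG files.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def screenshots_identical(baseline, current):
    """True only when both files contain exactly the same bytes."""
    return file_digest(baseline) == file_digest(current)


with tempfile.TemporaryDirectory() as tmp:
    baseline = Path(tmp) / "baseline.png"
    current = Path(tmp) / "current.png"
    # Placeholder bytes stand in for real screenshots
    baseline.write_bytes(b"fake image bytes")
    current.write_bytes(b"fake image bytes")
    same = screenshots_identical(baseline, current)
    current.write_bytes(b"different bytes")
    changed = not screenshots_identical(baseline, current)

print(same, changed)  # True True
```

For tolerance-based comparison (anti-aliasing, small layout shifts), a dedicated pixel-diff tool would be needed in place of this check.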

With Form Data from Sheets

python
# Load test data from Google Sheets, run form tests
import subprocess
# get_sheet_data is not defined here; supply a loader that returns
# a list of dicts with "email" and "password" keys.
test_data = get_sheet_data("Form Test Cases")
for row in test_data:
    # @email, @password, and @submit are placeholders for refs from the latest snapshot
    subprocess.run(["agent-browser", "fill", "@email", row["email"]], check=True)
    subprocess.run(["agent-browser", "fill", "@password", row["password"]], check=True)
    subprocess.run(["agent-browser", "click", "@submit"], check=True)

Files in This Skill

  • references/commands.md - Full command reference
  • references/authentication.md - Login flow patterns
  • references/testing-patterns.md - E2E test workflows
  • references/snapshot-workflow.md - Ref system deep dive
  • scripts/browser_test.py - Python automation wrapper

Example: Complete Form Test

bash
# Open the registration page
agent-browser open https://example.com/register
# Get element refs
agent-browser snapshot -i
# Fill the form (refs from snapshot output)
agent-browser fill @e1 "John Doe"
agent-browser fill @e2 "john@example.com"
agent-browser fill @e3 "SecurePass123!"
agent-browser select @e4 "United States"
agent-browser click @e5  # Terms checkbox
agent-browser click @e6  # Submit button
# Wait for navigation and verify
agent-browser wait network
agent-browser snapshot -i
# Take confirmation screenshot
agent-browser screenshot registration-success.png

Troubleshooting

Element not found:
  • Re-run snapshot -i to get fresh refs
  • Use semantic locators: agent-browser find text "Submit"
  • Check if element is in an iframe
Page not loading:
  • Increase timeout: agent-browser open <url> --timeout 30000
  • Wait for network: agent-browser wait network
Session expired:
  • Save state before tests: agent-browser session save backup
  • Load state to restore: agent-browser session load backup