vercel-agent-browser
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseVercel Agent Browser Skill
Vercel Agent Browser Skill
Overview
概述
agent-browserKey Features:
- Native Rust performance with npm/Homebrew distribution
- Accessibility tree snapshots with stable element refs (,
@e1, etc.)@e2 - Semantic locators (role, text, label, placeholder)
- AI chat mode for natural language control
- Batch command execution
- Network interception and HAR recording
- Chrome DevTools Protocol (CDP) streaming
agent-browser核心特性:
- 原生Rust性能,支持npm/Homebrew分发
- 带有稳定元素引用(、
@e1等)的无障碍树快照@e2 - 语义定位符(角色、文本、标签、占位符)
- 自然语言控制的AI聊天模式
- 批量命令执行
- 网络拦截与HAR录制
- Chrome DevTools Protocol (CDP) 流处理
Installation
安装
Global (Recommended)
全局安装(推荐)
bash
npm install -g agent-browser
agent-browser install # Downloads Chrome for Testingbash
npm install -g agent-browser
agent-browser install # 下载Chrome for TestingProject Local
项目本地安装
bash
npm install agent-browser
npx agent-browser installbash
npm install agent-browser
npx agent-browser installAlternative Methods
其他安装方式
bash
undefinedbash
undefinedHomebrew (macOS)
Homebrew(macOS)
brew install agent-browser
agent-browser install
brew install agent-browser
agent-browser install
Cargo (Rust)
Cargo(Rust)
cargo install agent-browser
agent-browser install
cargo install agent-browser
agent-browser install
Linux with system dependencies
带系统依赖的Linux安装
agent-browser install --with-deps
undefinedagent-browser install --with-deps
undefinedUpgrading
升级
bash
agent-browser upgrade # Auto-detects installation methodbash
agent-browser upgrade # 自动检测安装方式Core Concepts
核心概念
Element References (@refs)
元素引用(@refs)
The most powerful feature for AI agents is the accessibility snapshot with stable refs:
bash
undefined对AI Agent而言最强大的功能是带有稳定引用的无障碍快照:
bash
undefinedGet accessibility tree with element refs
获取带元素引用的无障碍树
agent-browser open https://example.com
agent-browser snapshot
agent-browser open https://example.com
agent-browser snapshot
Output includes refs like:
输出包含如下引用:
@e1 heading "Example Domain"
@e1 heading "Example Domain"
@e2 link "More information..."
@e2 link "More information..."
@e3 textbox "Search" (placeholder)
@e3 textbox "Search" (placeholder)
Use refs directly in commands
直接在命令中使用引用
agent-browser click @e2
agent-browser fill @e3 "search query"
agent-browser get text @e1
undefinedagent-browser click @e2
agent-browser fill @e3 "search query"
agent-browser get text @e1
undefinedTraditional Selectors
传统选择器
Standard CSS selectors also work:
bash
agent-browser click "#submit-button"
agent-browser fill "input[name='email']" "user@example.com"
agent-browser get text ".header h1"标准CSS选择器同样适用:
bash
agent-browser click "#submit-button"
agent-browser fill "input[name='email']" "user@example.com"
agent-browser get text ".header h1"Semantic Locators
语义定位符
Find elements by their semantic meaning:
bash
undefined通过元素的语义含义查找元素:
bash
undefinedBy ARIA role
通过ARIA角色
agent-browser find role button click --name "Submit"
agent-browser find role textbox fill "user@example.com" --name "Email"
agent-browser find role button click --name "Submit"
agent-browser find role textbox fill "user@example.com" --name "Email"
By visible text
通过可见文本
agent-browser find text "Sign In" click
agent-browser find text "Welcome" click --exact
agent-browser find text "Sign In" click
agent-browser find text "Welcome" click --exact
By label
通过标签
agent-browser find label "Password" fill "secret123"
agent-browser find label "Password" fill "secret123"
By placeholder
通过占位符
agent-browser find placeholder "Search..." type "rust cli"
agent-browser find placeholder "Search..." type "rust cli"
By test ID
通过测试ID
agent-browser find testid "login-form" click
undefinedagent-browser find testid "login-form" click
undefinedCommon Workflows
常见工作流
Basic Navigation and Interaction
基础导航与交互
bash
undefinedbash
undefinedStart browser and navigate
启动浏览器并导航
agent-browser open https://github.com/login
agent-browser open https://github.com/login
Get page structure
获取页面结构
agent-browser snapshot
agent-browser snapshot
Fill login form using refs from snapshot
使用快照中的引用填写登录表单
agent-browser fill @e5 "username"
agent-browser fill @e6 "password"
agent-browser click @e7
agent-browser fill @e5 "username"
agent-browser fill @e6 "password"
agent-browser click @e7
Or use semantic locators
或者使用语义定位符
agent-browser find label "Username" fill "username"
agent-browser find label "Password" fill "password"
agent-browser find role button click --name "Sign in"
agent-browser find label "Username" fill "username"
agent-browser find label "Password" fill "password"
agent-browser find role button click --name "Sign in"
Wait for navigation
等待导航完成
agent-browser wait --url "**/dashboard"
agent-browser wait --url "**/dashboard"
Take screenshot
截图
agent-browser screenshot success.png
undefinedagent-browser screenshot success.png
undefinedForm Automation
表单自动化
bash
agent-browser open https://forms.example.combash
agent-browser open https://forms.example.comFill text inputs
填写文本输入框
agent-browser fill "#name" "John Doe"
agent-browser fill "#email" "john@example.com"
agent-browser fill "#name" "John Doe"
agent-browser fill "#email" "john@example.com"
Select dropdown
选择下拉选项
agent-browser select "#country" "United States"
agent-browser select "#country" "United States"
Check checkboxes
勾选复选框
agent-browser check "#newsletter"
agent-browser check "#terms"
agent-browser check "#newsletter"
agent-browser check "#terms"
Upload files
上传文件
agent-browser upload "#resume" "/path/to/resume.pdf"
agent-browser upload "#resume" "/path/to/resume.pdf"
Submit
提交表单
agent-browser find role button click --name "Submit"
agent-browser find role button click --name "Submit"
Wait for success message
等待成功提示
agent-browser wait --text "Thank you"
undefinedagent-browser wait --text "Thank you"
undefinedData Extraction
数据提取
bash
agent-browser open https://news.ycombinator.combash
agent-browser open https://news.ycombinator.comGet page snapshot for structure
获取页面快照查看结构
agent-browser snapshot -i # Interactive mode shows full tree
agent-browser snapshot -i # 交互模式显示完整树结构
Extract specific content
提取特定内容
agent-browser get text ".titleline > a"
agent-browser get text ".titleline > a"
Get multiple elements with JavaScript
使用JavaScript获取多个元素
agent-browser eval "Array.from(document.querySelectorAll('.titleline > a')).map(a => a.textContent)"
agent-browser eval "Array.from(document.querySelectorAll('.titleline > a')).map(a => a.textContent)"
Get structured data as JSON
获取结构化JSON数据
agent-browser eval "JSON.stringify({
title: document.querySelector('title').textContent,
links: Array.from(document.querySelectorAll('.titleline > a')).map(a => ({
text: a.textContent,
url: a.href
}))
})"
undefinedagent-browser eval "JSON.stringify({
title: document.querySelector('title').textContent,
links: Array.from(document.querySelectorAll('.titleline > a')).map(a => ({
text: a.textContent,
url: a.href
}))
})"
undefinedBatch Execution
批量执行
Reduce overhead by running multiple commands in one invocation:
bash
undefined通过一次调用运行多个命令以减少开销:
bash
undefinedArgument mode
参数模式
agent-browser batch
"open https://example.com"
"snapshot -i"
"click @e2"
"wait 1000"
"screenshot result.png"
"open https://example.com"
"snapshot -i"
"click @e2"
"wait 1000"
"screenshot result.png"
agent-browser batch
"open https://example.com"
"snapshot -i"
"click @e2"
"wait 1000"
"screenshot result.png"
"open https://example.com"
"snapshot -i"
"click @e2"
"wait 1000"
"screenshot result.png"
With --bail to stop on first error
使用--bail参数在首次出错时停止
agent-browser batch --bail
"open https://login.example.com"
"fill #username user@example.com"
"fill #password ${PASSWORD}"
"click #submit"
"wait --url '**/dashboard'"
"open https://login.example.com"
"fill #username user@example.com"
"fill #password ${PASSWORD}"
"click #submit"
"wait --url '**/dashboard'"
agent-browser batch --bail
"open https://login.example.com"
"fill #username user@example.com"
"fill #password ${PASSWORD}"
"click #submit"
"wait --url '**/dashboard'"
"open https://login.example.com"
"fill #username user@example.com"
"fill #password ${PASSWORD}"
"click #submit"
"wait --url '**/dashboard'"
JSON mode (piped from script)
JSON模式(从脚本管道输入)
echo '[
["open", "https://example.com"],
["snapshot"],
["click", "@e1"],
["screenshot"]
]' | agent-browser batch --json
undefinedecho '[
["open", "https://example.com"],
["snapshot"],
["click", "@e1"],
["screenshot"]
]' | agent-browser batch --json
undefinedAI Chat Mode
AI聊天模式
Natural language browser control:
bash
undefined自然语言浏览器控制:
bash
undefinedSingle-shot command
单次命令
agent-browser chat "go to github.com and search for rust browser automation"
agent-browser chat "go to github.com and search for rust browser automation"
Interactive REPL
交互式REPL
agent-browser chat
agent-browser chat
> navigate to example.com
> navigate to example.com
> click the first link
> click the first link
> take a screenshot
> take a screenshot
> exit
> exit
undefinedundefinedNetwork Interception
网络拦截
bash
undefinedbash
undefinedBlock specific resources
拦截特定资源
agent-browser network route "**/.jpg" --abort
agent-browser network route "https://ads.example.com/" --abort
agent-browser network route "**/.jpg" --abort
agent-browser network route "https://ads.example.com/" --abort
Block resource types
拦截资源类型
agent-browser network route '' --abort --resource-type script
agent-browser network route '' --abort --resource-type image,media
agent-browser network route '' --abort --resource-type script
agent-browser network route '' --abort --resource-type image,media
Mock API responses
模拟API响应
agent-browser network route "https://api.example.com/user" --body '{
"status": "ok",
"data": {"id": 1, "name": "Test User"}
}'
agent-browser network route "https://api.example.com/user" --body '{
"status": "ok",
"data": {"id": 1, "name": "Test User"}
}'
View network requests
查看网络请求
agent-browser network requests
agent-browser network requests --filter api
agent-browser network requests --type fetch,xhr
agent-browser network requests --method POST
agent-browser network requests --status 200
agent-browser network requests
agent-browser network requests --filter api
agent-browser network requests --type fetch,xhr
agent-browser network requests --method POST
agent-browser network requests --status 200
View specific request detail
查看特定请求详情
agent-browser network request req_123abc
agent-browser network request req_123abc
HAR recording
HAR录制
agent-browser network har start
agent-browser network har start
... perform actions ...
... 执行操作 ...
agent-browser network har stop capture.har
undefinedagent-browser network har stop capture.har
undefinedScreenshot Strategies
截图策略
bash
undefinedbash
undefinedBasic screenshot (saves to temp if no path)
基础截图(无路径则保存到临时目录)
agent-browser screenshot
agent-browser screenshot
Save to specific path
保存到指定路径
agent-browser screenshot ./screenshots/page.png
agent-browser screenshot ./screenshots/page.png
Full page screenshot
全屏截图
agent-browser screenshot --full full-page.png
agent-browser screenshot --full full-page.png
Annotated with element labels
带元素标注的截图
agent-browser screenshot --annotate labeled.png
agent-browser screenshot --annotate labeled.png
Custom directory and format
自定义目录与格式
agent-browser screenshot --screenshot-dir ./shots --screenshot-format jpeg --screenshot-quality 80
agent-browser screenshot --screenshot-dir ./shots --screenshot-format jpeg --screenshot-quality 80
Element screenshot
元素截图
agent-browser eval "document.querySelector('#main').screenshot()" -b > element.png
undefinedagent-browser eval "document.querySelector('#main').screenshot()" -b > element.png
undefinedMulti-Tab Operations
多标签页操作
bash
undefinedbash
undefinedList tabs
列出标签页
agent-browser tab
agent-browser tab
Open new tab with label
打开带标签的新标签页
agent-browser tab new --label docs https://docs.example.com
agent-browser tab new --label docs https://docs.example.com
Open link in new tab
在新标签页打开链接
agent-browser click "#external-link" --new-tab
agent-browser click "#external-link" --new-tab
Switch to tab by label
通过标签切换标签页
agent-browser tab docs
agent-browser tab docs
Switch to tab by ID
通过ID切换标签页
agent-browser tab t2
agent-browser tab t2
Close tab
关闭标签页
agent-browser tab close docs
agent-browser tab close t2
undefinedagent-browser tab close docs
agent-browser tab close t2
undefinedWait Strategies
等待策略
bash
undefinedbash
undefinedWait for element to appear
等待元素出现
agent-browser wait "#result"
agent-browser wait "#result"
Wait for time (milliseconds)
等待指定时长(毫秒)
agent-browser wait 2000
agent-browser wait 2000
Wait for text (substring match)
等待文本出现(子串匹配)
agent-browser wait --text "Success"
agent-browser wait --text "Success"
Wait for URL pattern
等待URL匹配模式
agent-browser wait --url "**/dashboard"
agent-browser wait --url "**/dashboard"
Wait for network idle
等待网络空闲
agent-browser wait --load networkidle
agent-browser wait --load networkidle
Wait for JavaScript condition
等待JavaScript条件满足
agent-browser wait --fn "document.readyState === 'complete'"
agent-browser wait --fn "window.dataLoaded === true"
agent-browser wait --fn "document.readyState === 'complete'"
agent-browser wait --fn "window.dataLoaded === true"
Wait for element to disappear
等待元素消失
agent-browser wait --fn "!document.body.innerText.includes('Loading...')"
agent-browser wait "#spinner" --state hidden
undefinedagent-browser wait --fn "!document.body.innerText.includes('Loading...')"
agent-browser wait "#spinner" --state hidden
undefinedKeyboard and Mouse Control
键盘与鼠标控制
bash
undefinedbash
undefinedKeyboard shortcuts
键盘快捷键
agent-browser press "Control+a"
agent-browser press "Control+c"
agent-browser press "Enter"
agent-browser press "Tab"
agent-browser press "Control+a"
agent-browser press "Control+c"
agent-browser press "Enter"
agent-browser press "Tab"
Type with real keystrokes (at current focus)
模拟真实按键输入(当前焦点处)
agent-browser keyboard type "Hello World"
agent-browser keyboard type "Hello World"
Insert text without key events
直接插入文本(无按键事件)
agent-browser keyboard inserttext "Pasted content"
agent-browser keyboard inserttext "Pasted content"
Hold and release keys
按住并释放按键
agent-browser keydown "Shift"
agent-browser press "a"
agent-browser keyup "Shift"
agent-browser keydown "Shift"
agent-browser press "a"
agent-browser keyup "Shift"
Mouse control
鼠标控制
agent-browser mouse move 100 200
agent-browser mouse down left
agent-browser mouse up left
agent-browser mouse wheel 100 0 # dy dx
agent-browser mouse move 100 200
agent-browser mouse down left
agent-browser mouse up left
agent-browser mouse wheel 100 0 # dy dx
Drag and drop
拖拽操作
agent-browser drag "#source" "#target"
undefinedagent-browser drag "#source" "#target"
undefinedCookies and Storage
Cookie与存储
bash
undefinedbash
undefinedCookies
Cookie管理
agent-browser cookies
agent-browser cookies set "sessionId" "abc123"
agent-browser cookies clear
agent-browser cookies
agent-browser cookies set "sessionId" "abc123"
agent-browser cookies clear
Import from cURL
从cURL导入Cookie
agent-browser cookies set --curl curl-cookies.txt
agent-browser cookies set --curl curl-cookies.txt
LocalStorage
LocalStorage管理
agent-browser storage local
agent-browser storage local get "token"
agent-browser storage local set "token" "eyJ..."
agent-browser storage local clear
agent-browser storage local
agent-browser storage local get "token"
agent-browser storage local set "token" "eyJ..."
agent-browser storage local clear
SessionStorage
SessionStorage管理
agent-browser storage session
agent-browser storage session set "tempData" '{"key":"value"}'
undefinedagent-browser storage session
agent-browser storage session set "tempData" '{"key":"value"}'
undefinedBrowser Configuration
浏览器配置
bash
undefinedbash
undefinedSet viewport
设置视口
agent-browser set viewport 1920 1080
agent-browser set viewport 1920 1080 2 # Retina (2x scale)
agent-browser set viewport 1920 1080
agent-browser set viewport 1920 1080 2 # 视网膜屏(2倍缩放)
Emulate device
模拟设备
agent-browser set device "iPhone 14"
agent-browser set device "iPhone 14"
Geolocation
设置地理位置
agent-browser set geo 37.7749 -122.4194 # San Francisco
agent-browser set geo 37.7749 -122.4194 # 旧金山
Offline mode
离线模式
agent-browser set offline on
agent-browser set offline off
agent-browser set offline on
agent-browser set offline off
Custom headers
自定义请求头
agent-browser set headers '{"X-Custom-Header":"value"}'
agent-browser set headers '{"X-Custom-Header":"value"}'
HTTP auth
HTTP认证
agent-browser set credentials username password
agent-browser set credentials username password
Color scheme
配色方案
agent-browser set media dark
agent-browser set media light
undefinedagent-browser set media dark
agent-browser set media light
undefinedAdvanced JavaScript Evaluation
高级JavaScript评估
bash
undefinedbash
undefinedSimple evaluation
简单评估
agent-browser eval "document.title"
agent-browser eval "document.title"
Complex extraction
复杂数据提取
agent-browser eval "
JSON.stringify({
url: window.location.href,
links: Array.from(document.querySelectorAll('a')).map(a => ({
text: a.textContent.trim(),
href: a.href
})).filter(l => l.text)
})
"
agent-browser eval "
JSON.stringify({
url: window.location.href,
links: Array.from(document.querySelectorAll('a')).map(a => ({
text: a.textContent.trim(),
href: a.href
})).filter(l => l.text)
})
"
Binary output (base64)
二进制输出(base64)
agent-browser eval "document.querySelector('canvas').toDataURL()" -b > canvas.png
agent-browser eval "document.querySelector('canvas').toDataURL()" -b > canvas.png
From stdin
从标准输入获取代码
echo "document.body.innerHTML" | agent-browser eval --stdin
undefinedecho "document.body.innerHTML" | agent-browser eval --stdin
undefinedAutomation Script Example
自动化脚本示例
Create a reusable automation script:
bash
#!/bin/bash创建可复用的自动化脚本:
bash
#!/bin/bashlogin-and-extract.sh
login-and-extract.sh
set -e
SESSION_FILE=".browser-session"
set -e
SESSION_FILE=".browser-session"
Start browser
启动浏览器
agent-browser open https://app.example.com/login
agent-browser open https://app.example.com/login
Login
登录
agent-browser find label "Email" fill "${USER_EMAIL}"
agent-browser find label "Password" fill "${USER_PASSWORD}"
agent-browser find role button click --name "Sign in"
agent-browser find label "Email" fill "${USER_EMAIL}"
agent-browser find label "Password" fill "${USER_PASSWORD}"
agent-browser find role button click --name "Sign in"
Wait for dashboard
等待仪表盘加载
agent-browser wait --url "**/dashboard"
agent-browser wait 1000
agent-browser wait --url "**/dashboard"
agent-browser wait 1000
Extract data
提取数据
DATA=$(agent-browser eval "
JSON.stringify({
user: document.querySelector('.user-name').textContent,
notifications: Array.from(document.querySelectorAll('.notification')).map(n => n.textContent)
})
")
echo "$DATA" | jq .
DATA=$(agent-browser eval "
JSON.stringify({
user: document.querySelector('.user-name').textContent,
notifications: Array.from(document.querySelectorAll('.notification')).map(n => n.textContent)
})
")
echo "$DATA" | jq .
Screenshot
截图
agent-browser screenshot dashboard.png
agent-browser screenshot dashboard.png
Cleanup
清理会话
agent-browser close
Usage:
```bash
export USER_EMAIL="user@example.com"
export USER_PASSWORD="secret"
chmod +x login-and-extract.sh
./login-and-extract.shagent-browser close
使用方式:
```bash
export USER_EMAIL="user@example.com"
export USER_PASSWORD="secret"
chmod +x login-and-extract.sh
./login-and-extract.shBatch Automation Pattern
批量自动化模式
For complex multi-step workflows, use batch mode with a script:
javascript
#!/usr/bin/env node
// automation.js
const commands = [
["open", "https://example.com"],
["snapshot"],
// Process snapshot, determine next steps...
["click", "@e5"],
["wait", "1000"],
["screenshot", "step1.png"],
["find", "role", "button", "click", "--name", "Next"],
["wait", "--url", "**/step2"],
["screenshot", "step2.png"]
];
console.log(JSON.stringify(commands));Run with:
bash
node automation.js | agent-browser batch --json --bail针对复杂的多步骤工作流,使用批量模式配合脚本:
javascript
#!/usr/bin/env node
// automation.js
const commands = [
["open", "https://example.com"],
["snapshot"],
// 处理快照,确定下一步操作...
["click", "@e5"],
["wait", "1000"],
["screenshot", "step1.png"],
["find", "role", "button", "click", "--name", "Next"],
["wait", "--url", "**/step2"],
["screenshot", "step2.png"]
];
console.log(JSON.stringify(commands));运行方式:
bash
node automation.js | agent-browser batch --json --bailCDP Streaming for Real-Time Monitoring
CDP流处理用于实时监控
Connect to Chrome DevTools Protocol for event streaming:
bash
undefined连接到Chrome DevTools Protocol进行事件流处理:
bash
undefinedGet CDP WebSocket URL
获取CDP WebSocket地址
CDP_URL=$(agent-browser get cdp-url)
echo "CDP URL: $CDP_URL"
CDP_URL=$(agent-browser get cdp-url)
echo "CDP URL: $CDP_URL"
Or connect directly to CDP port
或直接连接到CDP端口
agent-browser connect 9222
agent-browser connect 9222
Enable runtime streaming
启用运行时流处理
agent-browser stream enable --port 9223
agent-browser stream status
agent-browser stream enable --port 9223
agent-browser stream status
... perform actions, stream events to ws://localhost:9223 ...
... 执行操作,事件将流到ws://localhost:9223 ...
agent-browser stream disable
undefinedagent-browser stream disable
undefinedTroubleshooting
故障排除
Browser Not Found
浏览器未找到
bash
undefinedbash
undefinedRe-download Chrome for Testing
重新下载Chrome for Testing
agent-browser install
agent-browser install
Or specify custom Chrome path
或指定自定义Chrome路径
export CHROME_PATH=/path/to/chrome
agent-browser open
undefinedexport CHROME_PATH=/path/to/chrome
agent-browser open
undefinedElements Not Found
元素未找到
bash
undefinedbash
undefinedUse snapshot to see available refs
使用快照查看可用引用
agent-browser snapshot -i
agent-browser snapshot -i
Wait for element before interacting
等待元素加载后再交互
agent-browser wait "#dynamic-element"
agent-browser click "#dynamic-element"
agent-browser wait "#dynamic-element"
agent-browser click "#dynamic-element"
Use semantic selectors for robustness
使用语义选择器提高鲁棒性
agent-browser find role button click --name "Submit"
undefinedagent-browser find role button click --name "Submit"
undefinedSession Management
会话管理
bash
undefinedbash
undefinedList all active sessions
列出所有活跃会话
agent-browser tab
agent-browser tab
Close specific session
关闭特定会话
agent-browser close
agent-browser close
Close all sessions
关闭所有会话
agent-browser close --all
undefinedagent-browser close --all
undefinedDebugging
调试
bash
undefinedbash
undefinedGet CDP URL for DevTools connection
获取用于连接DevTools的CDP地址
agent-browser get cdp-url
agent-browser get cdp-url
View network activity
查看网络活动
agent-browser network requests
agent-browser network requests
Enable HAR recording
启用HAR录制
agent-browser network har start
agent-browser network har start
... actions ...
... 执行操作 ...
agent-browser network har stop debug.har
undefinedagent-browser network har stop debug.har
undefinedClipboard Access Issues
剪贴板访问问题
On Linux, clipboard requires or :
xclipxselbash
sudo apt-get install xclip在Linux系统中,剪贴板功能需要或:
xclipxselbash
sudo apt-get install xclipor
或
sudo apt-get install xsel
undefinedsudo apt-get install xsel
undefinedEnvironment Variables
环境变量
bash
undefinedbash
undefinedCustom Chrome binary
自定义Chrome二进制文件路径
export CHROME_PATH=/Applications/Brave Browser.app/Contents/MacOS/Brave Browser
export CHROME_PATH=/Applications/Brave Browser.app/Contents/MacOS/Brave Browser
Custom user data directory
自定义用户数据目录
export CHROME_USER_DATA_DIR=~/.config/agent-browser
export CHROME_USER_DATA_DIR=~/.config/agent-browser
Headless mode (default)
无头模式(默认开启)
export CHROME_HEADLESS=true
export CHROME_HEADLESS=true
Show browser UI
显示浏览器界面
export CHROME_HEADLESS=false
undefinedexport CHROME_HEADLESS=false
undefinedBest Practices for AI Agents
AI Agent最佳实践
- Use snapshots first: Always get to understand page structure before acting
agent-browser snapshot - Prefer refs over selectors: Use style refs from snapshots for stability
@e1 - Use semantic locators: is more resilient than CSS selectors
find role button --name "Submit" - Wait strategically: Add commands for dynamic content before interacting
wait - Batch when possible: Use mode to reduce per-command overhead
batch - Handle errors: Use in batch mode to stop on first error
--bail - Clean up: Always sessions when done
close - Screenshot for verification: Take screenshots at key steps for debugging
- Use labels for tabs: Assign meaningful labels when opening multiple tabs
- 先使用快照:在执行操作前,始终运行了解页面结构
agent-browser snapshot - 优先使用引用而非选择器:使用快照中的风格引用以保证稳定性
@e1 - 使用语义定位符:比CSS选择器更具韧性
find role button --name "Submit" - 合理使用等待:在交互动态内容前添加命令
wait - 尽可能使用批量模式:使用模式减少单命令开销
batch - 处理错误:在批量模式中使用参数在首次出错时停止
--bail - 清理会话:完成操作后务必执行关闭会话
close - 截图验证:在关键步骤截图用于调试
- 为标签页添加标签:打开多个标签页时分配有意义的标签
Quick Reference
快速参考
bash
undefinedbash
undefinedEssential commands for AI agents
AI Agent核心命令
agent-browser open <url> # Start
agent-browser snapshot # Get structure with refs
agent-browser click @eN # Interact with ref
agent-browser find role button click --name "Text" # Semantic
agent-browser wait --text "Success" # Wait for content
agent-browser screenshot result.png # Verify
agent-browser get text @eN # Extract
agent-browser eval "JS" # Complex extraction
agent-browser batch "cmd1" "cmd2" # Multi-step
agent-browser close # Cleanup
undefinedagent-browser open <url> # 启动浏览器
agent-browser snapshot # 获取带引用的页面结构
agent-browser click @eN # 通过引用交互元素
agent-browser find role button click --name "Text" # 语义定位交互
agent-browser wait --text "Success" # 等待内容出现
agent-browser screenshot result.png # 截图验证
agent-browser get text @eN # 提取文本
agent-browser eval "JS" # 复杂数据提取
agent-browser batch "cmd1" "cmd2" # 多步骤批量执行
agent-browser close # 清理会话
undefined