agent-browser
Browser Automation with agent-browser
Quick start
```bash
agent-browser open <url>         # Navigate to page
agent-browser snapshot -i        # Get interactive elements with refs
agent-browser click @e1          # Click element by ref
agent-browser fill @e2 "text"    # Fill input by ref
agent-browser close              # Close browser
```
Core workflow
- Navigate: `agent-browser open <url>`
- Snapshot: `agent-browser snapshot -i` (returns elements with refs like `@e1`, `@e2`)
- Interact using refs from the snapshot
- Re-snapshot after navigation or significant DOM changes
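The four steps above can be chained in a script. The following is a minimal sketch, not part of agent-browser itself: `open_and_snapshot` and `click_and_resnapshot` are hypothetical helper names, and the sketch assumes the CLI exits with a non-zero status when a command fails.

```bash
# Hypothetical helper: open a page, then list its interactive elements.
# Assumption: agent-browser returns a non-zero exit status on failure.
open_and_snapshot() {
  agent-browser open "$1" || return 1   # step 1: navigate
  agent-browser snapshot -i             # step 2: snapshot with refs (@e1, @e2, ...)
}

# Hypothetical helper: click a ref, then re-snapshot because the
# click may have navigated or changed the DOM.
click_and_resnapshot() {
  agent-browser click "$1" || return 1  # step 3: interact using a ref
  agent-browser snapshot -i             # step 4: refresh the refs
}
```

A typical call sequence would be `open_and_snapshot https://example.com/form` followed by `click_and_resnapshot @e3`, reading the refs from each snapshot before the next interaction.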
Commands
Navigation
```bash
agent-browser open <url>    # Navigate to URL
agent-browser back          # Go back
agent-browser forward       # Go forward
agent-browser reload        # Reload page
agent-browser close         # Close browser
```
Snapshot (page analysis)
```bash
agent-browser snapshot         # Full accessibility tree
agent-browser snapshot -i      # Interactive elements only (recommended)
agent-browser snapshot -c      # Compact output
agent-browser snapshot -d 3    # Limit depth to 3
```
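When scripting against a snapshot, it can help to pull out just the refs. The sketch below assumes the snapshot is written to stdout and marks each element as `[ref=eN]`, the notation shown in the form-submission example in this document; `list_refs` is a hypothetical helper name, and the real output format may differ.

```bash
# Hypothetical helper: read snapshot text on stdin and print one @ref per line.
# Assumption: refs appear as "[ref=e1]", "[ref=e2]", ... in the snapshot output.
list_refs() {
  grep -o 'ref=e[0-9][0-9]*' | sed 's/^ref=/@/'
}
```

Under those assumptions it would be used as `agent-browser snapshot -i | list_refs`.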
Interactions (use @refs from snapshot)
```bash
agent-browser click @e1             # Click
agent-browser dblclick @e1          # Double-click
agent-browser fill @e2 "text"       # Clear and type
agent-browser type @e2 "text"       # Type without clearing
agent-browser press Enter           # Press key
agent-browser press Control+a       # Key combination
agent-browser hover @e1             # Hover
agent-browser check @e1             # Check checkbox
agent-browser uncheck @e1           # Uncheck checkbox
agent-browser select @e1 "value"    # Select dropdown
agent-browser scroll down 500       # Scroll page
agent-browser scrollintoview @e1    # Scroll element into view
```
Get information
```bash
agent-browser get text @e1     # Get element text
agent-browser get value @e1    # Get input value
agent-browser get title        # Get page title
agent-browser get url          # Get current URL
```
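The `get` commands are useful for checking that an interaction had the intended effect. As a rough sketch, the hypothetical helper `fill_and_verify` below fills an input and reads it back. It assumes `get value` prints only the input's value on stdout, which the source does not state explicitly.

```bash
# Hypothetical helper: fill an input by ref, then confirm the value took effect.
# Assumption: "get value" prints the bare value on stdout.
fill_and_verify() {
  local actual
  agent-browser fill "$1" "$2" || return 1
  actual=$(agent-browser get value "$1") || return 1
  if [ "$actual" != "$2" ]; then
    echo "expected '$2' in $1, got '$actual'" >&2
    return 1
  fi
}
```

For example, `fill_and_verify @e1 "user@example.com" || exit 1` would stop a script early if the field did not accept the text.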
Screenshots
```bash
agent-browser screenshot             # Screenshot to stdout
agent-browser screenshot path.png    # Save to file
agent-browser screenshot --full      # Full page
```
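For longer sessions it may be convenient to keep screenshots in one directory with timestamped names. The helper below is only a sketch: `screenshot_to` and its directory and label arguments are hypothetical, and it relies solely on the `screenshot path.png` form listed above.

```bash
# Hypothetical helper: save a screenshot as <dir>/<label>-<timestamp>.png
# and print the path that was used.
screenshot_to() {
  local path
  mkdir -p "$1" || return 1
  path="$1/$2-$(date +%Y%m%d-%H%M%S).png"
  agent-browser screenshot "$path" && printf '%s\n' "$path"
}
```

A call such as `screenshot_to ./shots login` would create `./shots` if needed and print the file path on its last output line.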
Wait
```bash
agent-browser wait @e1                   # Wait for element
agent-browser wait 2000                  # Wait milliseconds
agent-browser wait --text "Success"      # Wait for text
agent-browser wait --load networkidle    # Wait for network idle
```
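Waits are usually paired with the action that triggers the change. The sketch below combines a click with two of the wait forms listed above; `click_and_wait_for_text` is a hypothetical name, and the sketch assumes `wait` exits non-zero if the condition is never met, which should be checked against the tool's actual behavior.

```bash
# Hypothetical helper: click a ref, let the network settle, then wait for
# confirmation text to appear.
# Assumption: "wait" returns a non-zero exit status on timeout.
click_and_wait_for_text() {
  agent-browser click "$1" || return 1
  agent-browser wait --load networkidle || return 1
  agent-browser wait --text "$2"
}
```

For instance, `click_and_wait_for_text @e3 "Success"` would click the element and then block until the text "Success" is present.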
Semantic locators (alternative to refs)
```bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
```
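Because semantic locators do not depend on a prior snapshot, a short form interaction can be scripted without refs. This is a sketch under assumptions: `login_by_label` is a hypothetical helper, and the page is assumed to have inputs labelled "Email" and "Password" and a button named "Submit".

```bash
# Hypothetical helper: fill and submit a login form using semantic locators only.
# Assumption: the page labels its fields "Email" and "Password" and has a
# button whose accessible name is "Submit".
login_by_label() {
  agent-browser find label "Email" fill "$1" || return 1
  agent-browser find label "Password" fill "$2" || return 1
  agent-browser find role button click --name "Submit"
}
```

It would be called as `login_by_label "user@test.com" "password123"` after opening the page.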
Example: Form submission
```bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i    # Check result
```