agent-browser - Browser Automation for AI Agents
When to use this skill
- Open websites and automate UI actions
- Fill forms, click controls, and verify outcomes
- Capture screenshots/PDFs or extract content
- Run deterministic web checks with accessibility refs
- Execute parallel browser tasks via isolated sessions
Core workflow
Always use the deterministic ref loop:
- `agent-browser open <url>`
- `agent-browser snapshot -i`
- Interact with refs (`@e1`, `@e2`, ...)
- Run `agent-browser snapshot -i` again after page/DOM changes
```bash
agent-browser open https://example.com/form
agent-browser wait --load networkidle
agent-browser snapshot -i
agent-browser fill @e1 "user@example.com"
agent-browser click @e2
agent-browser snapshot -i
```
Command patterns
Use `&&` chaining when intermediate output is not needed.
```bash
# Good chaining: open -> wait -> snapshot
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i

# Separate calls when output is needed first
agent-browser snapshot -i
# parse refs
agent-browser click @e2
```
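The "parse refs" step above is left open, so here is one possible way to fill it in. This is a minimal sketch, not part of agent-browser: `extract_refs` is a hypothetical helper, and it assumes refs appear in the snapshot text either as `ref=eN` or as `@eN`. Adjust the pattern if your snapshot output looks different.

```shell
# Sketch only: extract_refs is a hypothetical helper, not an agent-browser command.
# Assumption: refs show up in snapshot text as "ref=eN" or "@eN".
extract_refs() {
  # Pull out every ref token, normalize it to the @eN form used by
  # click/fill, and drop repeats while keeping first-seen order.
  grep -oE '(ref=|@)e[0-9]+' \
    | sed -E 's/^(ref=|@)/@/' \
    | awk '!seen[$0]++'
}

# Usage (requires agent-browser on PATH):
#   agent-browser snapshot -i | extract_refs
```

Because the helper only reads stdin, it can be tried on a saved snapshot file before wiring it into a live session.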
High-value commands:
- Navigation: `open`, `close`
- Snapshot: `snapshot -i`, `snapshot -i -C`, `snapshot -s "#selector"`
- Interaction: `click`, `fill`, `type`, `select`, `check`, `press`
- Verification: `diff snapshot`, `diff screenshot --baseline <file>`
- Capture: `screenshot`, `screenshot --annotate`, `pdf`
- Wait: `wait --load networkidle`, `wait <selector|@ref|ms>`
Verification patterns
Use explicit evidence after actions.
```bash
# Baseline -> action -> verify structure
agent-browser snapshot -i
agent-browser click @e3
agent-browser diff snapshot

# Visual regression
agent-browser screenshot baseline.png
agent-browser click @e5
agent-browser diff screenshot --baseline baseline.png
```
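The baseline -> action -> verify sequence can be wrapped so that an action is never run without its evidence step. The following is a minimal sketch under stated assumptions: `click_and_diff` and the `AB_CMD` variable are hypothetical names introduced here (they are not agent-browser settings), and only the `snapshot -i`, `click`, and `diff snapshot` commands shown above are used.

```shell
# Sketch only: click_and_diff and AB_CMD are hypothetical names for this example.
# AB_CMD lets the helper point at another binary (or a stub for dry runs).
click_and_diff() {
  local ref="$1"                        # e.g. @e3, taken from a fresh snapshot
  local ab="${AB_CMD:-agent-browser}"
  "$ab" snapshot -i || return 1         # baseline
  "$ab" click "$ref" || return 1        # action
  "$ab" diff snapshot                   # evidence of the structural change
}

# Usage (requires agent-browser on PATH):
#   click_and_diff @e3
```

Setting `AB_CMD=echo` prints the commands instead of running them, which is a cheap way to review the sequence first.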
Safety and reliability
- Refs are invalid after navigation or significant DOM updates; re-snapshot before the next action.
- Prefer `wait --load networkidle` or selector/ref waits over fixed sleeps.
- For multi-step JS, use `eval --stdin` (or base64) to avoid shell escaping breakage.
- For concurrent tasks, isolate with `--session <name>`.
- Use output controls in long pages to reduce context flooding.
- Optional hardening in sensitive flows: domain allowlist and action policies.
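To illustrate the `eval --stdin` advice in the list above: feeding JavaScript through a quoted heredoc keeps `$`, backticks, and quotes away from shell expansion. This is a rough sketch, assuming `eval --stdin` reads the script from standard input as its name suggests. `run_page_js` and `AB_CMD` are hypothetical names for this example only.

```shell
# Sketch only: run_page_js and AB_CMD are hypothetical names for this example.
run_page_js() {
  # Forward whatever arrives on stdin to `eval --stdin`, so the JavaScript
  # never passes through shell quoting.
  "${AB_CMD:-agent-browser}" eval --stdin
}

# The quoted 'JS' delimiter stops the shell from expanding $, backticks, or quotes.
# Usage (requires agent-browser on PATH):
#   run_page_js <<'JS'
#   const links = [...document.querySelectorAll("a[href$='.pdf']")];
#   links.map((a) => a.href);
#   JS
```

The same script passed as a command-line argument would need every quote and `$` escaped by hand, which is where the breakage usually comes from.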
Optional hardening examples:
```bash
# Wrap page content with boundaries to reduce prompt-injection risk
export AGENT_BROWSER_CONTENT_BOUNDARIES=1
# Limit output volume for long pages
export AGENT_BROWSER_MAX_OUTPUT=50000
# Restrict navigation and network to trusted domains
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"
# Restrict allowed action types
export AGENT_BROWSER_ACTION_POLICY=./policy.json
```
Example `policy.json`:
```json
{
  "default": "deny",
  "allow": ["navigate", "snapshot", "click", "fill", "scroll", "wait", "get"],
  "deny": ["eval", "download", "upload", "network", "state"]
}
```
CLI-flag equivalent:
```bash
agent-browser --content-boundaries --max-output 50000 --allowed-domains "example.com,*.example.com" --action-policy ./policy.json open https://example.com
```
Troubleshooting
- `command not found`: install and run `agent-browser install`.
- Wrong element clicked: run `snapshot -i` again and use fresh refs.
- Dynamic SPA content missing: wait with `--load networkidle` or a targeted `wait` on a selector.
- Session collisions: assign unique `--session` names and close each session.
- Large output pressure: narrow snapshots (`-i`, `-c`, `-d`, `-s`) and extract only needed text.
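For the session-collision item above, one way to keep parallel tasks apart is to give each task its own `--session` name and always close it at the end. This is a minimal sketch: `check_site` and `AB_CMD` are hypothetical names, and placing `--session` before the subcommand is an assumption based on the CLI-flag example earlier in this document.

```shell
# Sketch only: check_site and AB_CMD are hypothetical names for this example.
check_site() {
  local name="$1" url="$2"
  local ab="${AB_CMD:-agent-browser}"
  # Every command carries the same --session name, so each task drives
  # its own isolated browser session.
  "$ab" --session "$name" open "$url" &&
    "$ab" --session "$name" wait --load networkidle &&
    "$ab" --session "$name" snapshot -i
  local rc=$?
  # Close the session even when a step failed, then report that failure.
  "$ab" --session "$name" close
  return "$rc"
}

# Usage: run two checks concurrently, then wait for both.
#   check_site task-a https://example.com/a > task-a.txt &
#   check_site task-b https://example.com/b > task-b.txt &
#   wait
```

Closing inside the helper means a failed step does not leave a named session behind to collide with the next run.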
References
Deep-dive docs in this skill:
- commands
- snapshot-refs
- session-management
- authentication
Related resources:
Ready templates:
- `./templates/form-automation.sh`
- `./templates/capture-workflow.sh`
Metadata
- Version: 1.1.0
- Last updated: 2026-02-26
- Scope: deterministic browser automation for agent workflows