agent-browser

agent-browser - Browser Automation for AI Agents


When to use this skill


  • Open websites and automate UI actions
  • Fill forms, click controls, and verify outcomes
  • Capture screenshots/PDFs or extract content
  • Run deterministic web checks with accessibility refs
  • Execute parallel browser tasks via isolated sessions

Core workflow


Always use the deterministic ref loop:
  1. `agent-browser open <url>`
  2. `agent-browser snapshot -i`
  3. Interact with refs (`@e1`, `@e2`, ...)
  4. Run `agent-browser snapshot -i` again after page/DOM changes

    agent-browser open https://example.com/form
    agent-browser wait --load networkidle
    agent-browser snapshot -i
    agent-browser fill @e1 "user@example.com"
    agent-browser click @e2
    agent-browser snapshot -i

Command patterns


Use `&&` chaining when intermediate output is not needed.

    # Good chaining: open -> wait -> snapshot
    agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i

    # Separate calls when output is needed first
    agent-browser snapshot -i
    # parse refs
    agent-browser click @e2
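
The "parse refs" step is left implicit in the pattern above. As a minimal sketch, assuming snapshot lines carry a `[ref=eN]` marker (for example `- button "Submit" [ref=e2]`; this output format is an assumption, not something this document specifies), a hypothetical helper `ref_for` could pull the ref for a labelled element out of snapshot text:

```shell
# Hypothetical helper: read snapshot text on stdin and print the @ref of the
# first line that mentions the given label.
# Assumes lines look like: - button "Submit" [ref=e2]
ref_for() {
  grep -F -- "$1" | sed -n 's/.*\[ref=\([A-Za-z0-9]*\)\].*/@\1/p' | head -n 1
}

# Sample text standing in for `agent-browser snapshot -i` output (format assumed).
sample_snapshot='- textbox "Email" [ref=e1]
- button "Submit" [ref=e2]'

ref=$(printf '%s\n' "$sample_snapshot" | ref_for 'Submit')
echo "$ref"
```

In a real run the sample text would be replaced by the output of `agent-browser snapshot -i`, and the resulting ref passed to `agent-browser click`. Matching on label text is a simplification; ambiguous labels would need a stricter pattern.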

High-value commands:
- Navigation: `open`, `close`
- Snapshot: `snapshot -i`, `snapshot -i -C`, `snapshot -s "#selector"`
- Interaction: `click`, `fill`, `type`, `select`, `check`, `press`
- Verification: `diff snapshot`, `diff screenshot --baseline <file>`
- Capture: `screenshot`, `screenshot --annotate`, `pdf`
- Wait: `wait --load networkidle`, `wait <selector|@ref|ms>`

Verification patterns


Use explicit evidence after actions.

    # Baseline -> action -> verify structure
    agent-browser snapshot -i
    agent-browser click @e3
    agent-browser diff snapshot

    # Visual regression
    agent-browser screenshot baseline.png
    agent-browser click @e5
    agent-browser diff screenshot --baseline baseline.png

Safety and reliability


  • Refs are invalid after navigation or significant DOM updates; re-snapshot before the next action.
  • Prefer `wait --load networkidle` or selector/ref waits over fixed sleeps.
  • For multi-step JS, use `eval --stdin` (or base64) to avoid shell escaping breakage.
  • For concurrent tasks, isolate with `--session <name>`.
  • Use output controls on long pages to reduce context flooding.
  • Optional hardening in sensitive flows: domain allowlist and action policies.
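
To illustrate the `eval --stdin` advice, here is a sketch that feeds JavaScript through a quoted heredoc so that the double quotes and brackets in the script reach the browser untouched. The `ab` wrapper, the `AGENT_BROWSER_BIN` override, and the `list_links` helper are hypothetical names introduced for this example, and it assumes `eval --stdin` reads the script from standard input and prints the value of the expression.

```shell
# Thin wrapper around the CLI. AGENT_BROWSER_BIN is a hypothetical override
# (not an agent-browser setting) so the sketch can be dry-run with a stub.
ab() { "${AGENT_BROWSER_BIN:-agent-browser}" "$@"; }

# Hypothetical helper: collect the href of every link on the current page.
list_links() {
  # The quoted 'EOF' delimiter disables shell expansion inside the heredoc,
  # so no character in the JavaScript needs escaping.
  ab eval --stdin <<'EOF'
JSON.stringify(
  Array.from(document.querySelectorAll("a[href]")).map((a) => a.href)
)
EOF
}
```

After an `open` and a wait, calling `list_links` would run the script in the page. Writing the same expression inline as a shell argument would require escaping every double quote.
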
Optional hardening examples:

    # Wrap page content with boundaries to reduce prompt-injection risk
    export AGENT_BROWSER_CONTENT_BOUNDARIES=1

    # Limit output volume for long pages
    export AGENT_BROWSER_MAX_OUTPUT=50000

    # Restrict navigation and network to trusted domains
    export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"

    # Restrict allowed action types
    export AGENT_BROWSER_ACTION_POLICY=./policy.json

Example `policy.json`:

```json
{
  "default": "deny",
  "allow": ["navigate", "snapshot", "click", "fill", "scroll", "wait", "get"],
  "deny": ["eval", "download", "upload", "network", "state"]
}
```

CLI-flag equivalent:

    agent-browser --content-boundaries --max-output 50000 --allowed-domains "example.com,*.example.com" --action-policy ./policy.json open https://example.com
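
When the flag form is preferred over environment variables, one way to avoid forgetting a flag is to route every call through a single wrapper. The sketch below is only an illustration: `ab_hardened` and the `AGENT_BROWSER_BIN` override are hypothetical names, while the flags and values are the ones shown in this section.

```shell
# Hypothetical wrapper: apply the hardening flags from this section to every
# call. AGENT_BROWSER_BIN is an assumed override (not an agent-browser
# setting) that lets the wrapper be dry-run against a stub.
ab_hardened() {
  "${AGENT_BROWSER_BIN:-agent-browser}" \
    --content-boundaries \
    --max-output 50000 \
    --allowed-domains "example.com,*.example.com" \
    --action-policy ./policy.json \
    "$@"
}
```

A call such as `ab_hardened open https://example.com` then expands to the CLI-flag command shown in this section. The domain list is quoted so the shell does not glob-expand `*.example.com`.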

Troubleshooting


  • `command not found`: install the tool, then run `agent-browser install`.
  • Wrong element clicked: run `snapshot -i` again and use fresh refs.
  • Dynamic SPA content missing: wait with `--load networkidle` or a targeted `wait` selector.
  • Session collisions: assign unique `--session` names and close each session.
  • Large output pressure: narrow snapshots (`-i`, `-c`, `-d`, `-s`) and extract only needed text.
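
The session-collision advice can be sketched as a helper that runs the ref loop in a named session and always closes it afterwards. The `ab` wrapper, the `AGENT_BROWSER_BIN` override, and `snapshot_in_session` are hypothetical names for this example; it assumes `--session <name>` is given before the subcommand, as the other global flags in this document are, and that `close` ends only the named session.

```shell
# Thin wrapper around the CLI. AGENT_BROWSER_BIN is a hypothetical override
# (not an agent-browser setting) so the sketch can be dry-run with a stub.
ab() { "${AGENT_BROWSER_BIN:-agent-browser}" "$@"; }

# Hypothetical helper: open -> wait -> snapshot inside one named session,
# then close that session whether or not the steps succeeded.
snapshot_in_session() {
  local name=$1 url=$2 status=0
  ab --session "$name" open "$url" &&
    ab --session "$name" wait --load networkidle &&
    ab --session "$name" snapshot -i || status=$?
  # Close even after a failure so the session does not linger.
  ab --session "$name" close
  return "$status"
}
```

Two independent tasks could then run side by side, for example `snapshot_in_session site-a https://example.com/a > a.txt &` and `snapshot_in_session site-b https://example.com/b > b.txt &` followed by `wait`. The session names, URLs, and output files here are placeholders.
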

References


Deep-dive docs in this skill:
  • commands
  • snapshot-refs
  • session-management
  • authentication
Related resources:
Ready templates:
  • ./templates/form-automation.sh
  • ./templates/capture-workflow.sh

Metadata


  • Version: 1.1.0
  • Last updated: 2026-02-26
  • Scope: deterministic browser automation for agent workflows