agent-browser

Browser automation with agent-browser

A usage guide to the browser automation tool for AI agents

Core Workflow

Every browser automation follows this pattern:
  1. Navigate: `agent-browser open <url>`
  2. Snapshot: `agent-browser snapshot -i` (get element refs like `@e1`, `@e2`)
  3. Interact: Use refs to click, fill, select
  4. Re-snapshot: After navigation or DOM changes, get fresh refs

```bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i  # Check result
```
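Since every interaction depends on the refs a snapshot prints, a script often needs to pull those refs out of the output first. The following is a minimal sketch, assuming the snapshot text contains refs in the `@eN` form shown in the sample output; `list_refs` is a hypothetical helper name, not part of agent-browser.

```bash
# Hypothetical helper: print each distinct @eN ref found on stdin,
# one per line, in order of first appearance.
# Assumes refs look like @e1, @e2, ... as in the sample output above.
list_refs() {
  grep -oE '@e[0-9]+' | awk '!seen[$0]++'
}
```

It could be used as `agent-browser snapshot -i | list_refs` to check which refs are available before interacting.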

Command Chaining

Commands can be chained with `&&` in a single shell invocation. The browser persists between commands via a background daemon, so chaining is safe and more efficient than separate calls.

```bash
# Chain open + wait + snapshot in one call
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i

# Chain multiple interactions
agent-browser fill @e1 "user@example.com" && agent-browser fill @e2 "password123" && agent-browser click @e3

# Navigate and capture
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser screenshot page.png
```

**When to chain:** Use `&&` when you don't need to read the output of an intermediate command before proceeding (e.g., open + wait + screenshot). Run commands separately when you need to parse the output first (e.g., snapshot to discover refs, then interact using those refs).
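The split between chained and separate steps can be sketched as two small shell functions. This is an illustration only: the function names and the `AB` variable (which names the CLI binary so it can be swapped for a stub) are assumptions, not part of agent-browser.

```bash
# AB is a hypothetical variable naming the CLI binary; override it to stub the tool.
AB="${AB:-agent-browser}"

# Chain the steps whose output is not needed: open, then wait for the
# network to settle. Stops at the first failing step, like && does.
open_and_wait() {
  "$AB" open "$1" && "$AB" wait --load networkidle
}

# Run the snapshot as its own step so its output can be captured and
# parsed before any ref is used.
snapshot_refs() {
  "$AB" snapshot -i
}
```

A caller might run `open_and_wait https://example.com && refs=$(snapshot_refs)` and only then decide which ref to click.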

Essential Commands

```bash
# Navigation
agent-browser open <url>              # Navigate (aliases: goto, navigate)
agent-browser close                   # Close browser

# Snapshot
agent-browser snapshot -i             # Interactive elements with refs (recommended)
agent-browser snapshot -i -C          # Include cursor-interactive elements (divs with onclick, cursor:pointer)
agent-browser snapshot -s "#selector" # Scope to CSS selector

# Interaction (use @refs from snapshot)
agent-browser click @e1               # Click element
agent-browser click @e1 --new-tab     # Click and open in new tab
agent-browser fill @e2 "text"         # Clear and type text
agent-browser type @e2 "text"         # Type without clearing
agent-browser select @e1 "option"     # Select dropdown option
agent-browser check @e1               # Check checkbox
agent-browser press Enter             # Press key
agent-browser scroll down 500         # Scroll page

# Get information
agent-browser get text @e1            # Get element text
agent-browser get url                 # Get current URL
agent-browser get title               # Get page title

# Wait
agent-browser wait @e1                # Wait for element
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --url "**/page"    # Wait for URL pattern
agent-browser wait 2000               # Wait milliseconds

# Capture
agent-browser screenshot              # Screenshot to temp dir
agent-browser screenshot --full       # Full page screenshot
agent-browser screenshot --annotate   # Annotated screenshot with numbered element labels
agent-browser pdf output.pdf          # Save as PDF

# Diff (compare page states)
agent-browser diff snapshot                                    # Compare current vs last snapshot
agent-browser diff snapshot --baseline before.txt              # Compare current vs saved file
agent-browser diff screenshot --baseline before.png            # Visual pixel diff
agent-browser diff url <url1> <url2>                           # Compare two pages
agent-browser diff url <url1> <url2> --wait-until networkidle  # Custom wait strategy
agent-browser diff url <url1> <url2> --selector "#main"        # Scope to element
```

Common Patterns

Form Submission

```bash
agent-browser open https://example.com/signup
agent-browser snapshot -i
agent-browser fill @e1 "Jane Doe"
agent-browser fill @e2 "jane@example.com"
agent-browser select @e3 "California"
agent-browser check @e4
agent-browser click @e5
agent-browser wait --load networkidle
```

Authentication with State Persistence

```bash
# Login once and save state
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "$USERNAME"
agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json

# Reuse in future sessions
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
```
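The "login once, reuse later" pattern can be wrapped in a single function that checks for the saved state file first. This is a sketch under stated assumptions: `ensure_auth` and the `AB` variable are hypothetical names, and the refs `@e1`, `@e2`, `@e3` and the `**/dashboard` pattern are copied from the example above, so they would need to be adapted to a real login page.

```bash
# AB is a hypothetical variable naming the CLI binary; override it to stub the tool.
AB="${AB:-agent-browser}"

# Hypothetical helper: reuse saved state when the file exists,
# otherwise run the login flow and save the state.
# Refs and the URL pattern are taken from the example and are page-specific.
ensure_auth() {
  state_file=$1
  login_url=$2
  if [ -f "$state_file" ]; then
    "$AB" state load "$state_file"
  else
    "$AB" open "$login_url" &&
      "$AB" snapshot -i &&
      "$AB" fill @e1 "$USERNAME" &&
      "$AB" fill @e2 "$PASSWORD" &&
      "$AB" click @e3 &&
      "$AB" wait --url "**/dashboard" &&
      "$AB" state save "$state_file"
  fi
}
```

A call such as `ensure_auth auth.json https://app.example.com/login` would then be followed by opening the page that needs the session. The state file holds session data, so it is worth keeping out of version control.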

Session Persistence

```bash
# Auto-save/restore cookies and localStorage across browser restarts
agent-browser --session-name myapp open https://app.example.com/login
# ... login flow ...
agent-browser close  # State auto-saved to ~/.agent-browser/sessions/

# Next time, state is auto-loaded
agent-browser --session-name myapp open https://app.example.com/dashboard

# Encrypt state at rest
export AGENT_BROWSER_ENCRYPTION_KEY=$(openssl rand -hex 32)
agent-browser --session-name secure open https://app.example.com

# Manage saved states
agent-browser state list
agent-browser state show myapp-default.json
agent-browser state clear myapp
agent-browser state clean --older-than 7
```

Data Extraction

```bash
agent-browser open https://example.com/products
agent-browser snapshot -i
agent-browser get text @e5           # Get specific element text
agent-browser get text body > page.txt  # Get all page text

# JSON output for parsing
agent-browser snapshot -i --json
agent-browser get text @e1 --json
```

Parallel Sessions

```bash
agent-browser --session site1 open https://site-a.com
agent-browser --session site2 open https://site-b.com

agent-browser --session site1 snapshot -i
agent-browser --session site2 snapshot -i

agent-browser session list
```
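When the list of sites is not fixed, the same pattern can be driven by a loop that gives each URL its own session name. A minimal sketch: `open_in_sessions`, the `AB` variable, and the `siteN` naming scheme are illustrative assumptions.

```bash
# AB is a hypothetical variable naming the CLI binary; override it to stub the tool.
AB="${AB:-agent-browser}"

# Hypothetical helper: open each URL in its own named session
# (site1, site2, ...) and print the session names, one per line.
# Stops and returns non-zero at the first URL that fails to open.
open_in_sessions() {
  n=0
  for url in "$@"; do
    n=$((n + 1))
    "$AB" --session "site$n" open "$url" >/dev/null || return 1
    echo "site$n"
  done
}
```

The printed names can then be fed to later commands, for example `agent-browser --session site1 snapshot -i`.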

Connect to Existing Chrome

```bash
# Auto-discover running Chrome with remote debugging enabled
agent-browser --auto-connect open https://example.com
agent-browser --auto-connect snapshot

# Or with explicit CDP port
agent-browser --cdp 9222 snapshot
```

Visual Browser (Debugging)

```bash
agent-browser --headed open https://example.com
agent-browser highlight @e1          # Highlight element
agent-browser record start demo.webm # Record session
agent-browser profiler start         # Start Chrome DevTools profiling
agent-browser profiler stop trace.json # Stop and save profile (path optional)
```

Local Files (PDFs, HTML)

```bash
# Open local files with file:// URLs
agent-browser --allow-file-access open file:///path/to/document.pdf
agent-browser --allow-file-access open file:///path/to/page.html
agent-browser screenshot output.png
```

iOS Simulator (Mobile Safari)

```bash
# List available iOS simulators
agent-browser device list

# Launch Safari on a specific device
agent-browser -p ios --device "iPhone 16 Pro" open https://example.com

# Same workflow as desktop - snapshot, interact, re-snapshot
agent-browser -p ios snapshot -i
agent-browser -p ios tap @e1          # Tap (alias for click)
agent-browser -p ios fill @e2 "text"
agent-browser -p ios swipe up         # Mobile-specific gesture

# Take screenshot
agent-browser -p ios screenshot mobile.png

# Close session (shuts down simulator)
agent-browser -p ios close
```

**Requirements:** macOS with Xcode, Appium (`npm install -g appium && appium driver install xcuitest`)

**Real devices:** Works with physical iOS devices if pre-configured. Use `--device "<UDID>"` where UDID is from `xcrun xctrace list devices`.

Diffing (Verifying Changes)

Use `diff snapshot` after performing an action to verify it had the intended effect. This compares the current accessibility tree against the last snapshot taken in the session.

```bash
# Typical workflow: snapshot -> action -> diff
agent-browser snapshot -i          # Take baseline snapshot
agent-browser click @e2            # Perform action
agent-browser diff snapshot        # See what changed (auto-compares to last snapshot)
```

For visual regression testing or monitoring:

```bash
# Save a baseline screenshot, then compare later
agent-browser screenshot baseline.png
# ... time passes or changes are made ...
agent-browser diff screenshot --baseline baseline.png

# Compare staging vs production
agent-browser diff url https://staging.example.com https://prod.example.com --screenshot
```

`diff snapshot` output uses `+` for additions and `-` for removals, similar to git diff. `diff screenshot` produces a diff image with changed pixels highlighted in red, plus a mismatch percentage.
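Because additions and removals are marked with `+` and `-`, a script can summarize a snapshot diff by counting those lines. A minimal sketch, assuming each change sits on its own line beginning with the marker; the exact output layout is not specified here, and `diff_summary` is a hypothetical helper name.

```bash
# Hypothetical helper: count added and removed lines in `diff snapshot`
# output read from stdin.
# Assumes additions start with "+" and removals with "-" at line start.
diff_summary() {
  awk '
    /^\+/ { added++ }
    /^-/  { removed++ }
    END   { printf "added=%d removed=%d\n", added, removed }
  '
}
```

For example, `agent-browser diff snapshot | diff_summary` would print a one-line count that a caller can check before deciding whether an action had any effect.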

Timeouts and Slow Pages

The default Playwright timeout is 60 seconds for local browsers. For slow websites or large pages, use explicit waits instead of relying on the default timeout:

```bash
# Wait for network activity to settle (best for slow pages)
agent-browser wait --load networkidle

# Wait for a specific element to appear
agent-browser wait "#content"
agent-browser wait @e1

# Wait for a specific URL pattern (useful after redirects)
agent-browser wait --url "**/dashboard"

# Wait for a JavaScript condition
agent-browser wait --fn "document.readyState === 'complete'"

# Wait a fixed duration (milliseconds) as a last resort
agent-browser wait 5000
```

When dealing with consistently slow websites, use `wait --load networkidle` after `open` to ensure the page is fully loaded before taking a snapshot. If a specific element is slow to render, wait for it directly with `wait <selector>` or `wait @ref`.
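The ordering of these strategies (targeted wait first, fixed delay last) can be expressed as a fallback chain. This sketch rests on an assumption the text above does not confirm: that `wait` exits with a non-zero status when it times out. `wait_ready` and the `AB` variable are hypothetical names.

```bash
# AB is a hypothetical variable naming the CLI binary; override it to stub the tool.
AB="${AB:-agent-browser}"

# Hypothetical helper: try a targeted wait, fall back to network idle,
# and use a fixed delay only as the last resort.
# Assumes `wait` returns non-zero on timeout (not confirmed by the text above).
wait_ready() {
  target=$1                 # CSS selector or @ref
  "$AB" wait "$target" ||
    "$AB" wait --load networkidle ||
    "$AB" wait "${2:-5000}" # optional second argument: delay in milliseconds
}
```

A call such as `wait_ready "#content"` keeps the common case fast while still giving slow pages a chance to settle.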

Session Management and Cleanup

When running multiple agents or automations concurrently, always use named sessions to avoid conflicts:

```bash
# Each agent gets its own isolated session
agent-browser --session agent1 open site-a.com
agent-browser --session agent2 open site-b.com

# Check active sessions
agent-browser session list
```

Always close your browser session when done to avoid leaked processes:

```bash
agent-browser close                    # Close default session
agent-browser --session agent1 close   # Close specific session
```

If a previous session was not closed properly, the daemon may still be running. Use `agent-browser close` to clean it up before starting new work.
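One way to make sure a session is closed even when a step fails is to register the close command in an `EXIT` trap. A minimal sketch: `with_session` and the `AB` variable are hypothetical names, and the wrapped command is whatever automation uses that session.

```bash
# AB is a hypothetical variable naming the CLI binary; override it to stub the tool.
AB="${AB:-agent-browser}"

# Hypothetical helper: run a command, then close the named session
# whether the command succeeded or failed.
# The subshell confines the EXIT trap to this one call.
with_session() {
  session=$1
  shift
  (
    trap '"$AB" --session "$session" close' EXIT
    "$@"
  )
}
```

For instance, `with_session agent1 ./my-automation.sh` (a hypothetical script that passes `--session agent1` to its commands) would leave no browser process behind, and the wrapped command's exit status is still returned to the caller.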

Ref Lifecycle (Important)

Refs (`@e1`, `@e2`, etc.) are invalidated when the page changes. Always re-snapshot after:
  • Clicking links or buttons that navigate
  • Form submissions
  • Dynamic content loading (dropdowns, modals)

```bash
agent-browser click @e5              # Navigates to new page
agent-browser snapshot -i            # MUST re-snapshot
agent-browser click @e1              # Use new refs
```

Annotated Screenshots (Vision Mode)

Use `--annotate` to take a screenshot with numbered labels overlaid on interactive elements. Each label `[N]` maps to ref `@eN`. This also caches refs, so you can interact with elements immediately without a separate snapshot.

```bash
agent-browser screenshot --annotate
# Output includes the image path and a legend:
#   [1] @e1 button "Submit"
#   [2] @e2 link "Home"
#   [3] @e3 textbox "Email"
agent-browser click @e2              # Click using ref from annotated screenshot
```

Use annotated screenshots when:
- The page has unlabeled icon buttons or visual-only elements
- You need to verify visual layout or styling
- Canvas or chart elements are present (invisible to text snapshots)
- You need spatial reasoning about element positions
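The legend makes it possible to look up a ref by its visible name instead of hard-coding it. A minimal sketch, assuming legend lines have the `[N] @eN role "Name"` shape shown above; `ref_for_label` is a hypothetical helper name.

```bash
# Hypothetical helper: read a legend on stdin and print the ref of the
# first entry whose quoted name equals the argument.
# Assumes lines look like: [2] @e2 link "Home"
ref_for_label() {
  grep -F "\"$1\"" | awk 'NR == 1 { print $2 }'
}
```

A caller could capture the legend once and then run something like `agent-browser click "$(printf '%s\n' "$legend" | ref_for_label Home)"`, where `legend` is a variable holding the captured output.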

Semantic Locators (Alternative to Refs)

When refs are unavailable or unreliable, use semantic locators:

```bash
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find role button click --name "Submit"
agent-browser find placeholder "Search" type "query"
agent-browser find testid "submit-btn" click
```

JavaScript Evaluation (eval)

Use `eval` to run JavaScript in the browser context. Shell quoting can corrupt complex expressions; use `--stdin` or `-b` to avoid issues.

```bash
# Simple expressions work with regular quoting
agent-browser eval 'document.title'
agent-browser eval 'document.querySelectorAll("img").length'

# Complex JS: use --stdin with heredoc (RECOMMENDED)
agent-browser eval --stdin <<'EVALEOF'
JSON.stringify(
  Array.from(document.querySelectorAll("img"))
    .filter(i => !i.alt)
    .map(i => ({ src: i.src.split("/").pop(), width: i.width }))
)
EVALEOF

# Alternative: base64 encoding (avoids all shell escaping issues)
agent-browser eval -b "$(echo -n 'Array.from(document.querySelectorAll("a")).map(a => a.href)' | base64)"
```

**Why this matters:** When the shell processes your command, inner double quotes, `!` characters (history expansion), backticks, and `$()` can all corrupt the JavaScript before it reaches agent-browser. The `--stdin` and `-b` flags bypass shell interpretation entirely.

**Rules of thumb:**
- Single-line, no nested quotes -> regular `eval 'expression'` with single quotes is fine
- Nested quotes, arrow functions, template literals, or multiline -> use `eval --stdin <<'EVALEOF'`
- Programmatic/generated scripts -> use `eval -b` with base64
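For generated scripts, the base64 route can be wrapped in a helper that reads the JavaScript from stdin. One detail worth handling is that some `base64` implementations (GNU coreutils, for example, wraps at 76 characters by default) insert line breaks into long output, so this sketch strips them. `eval_b64` and the `AB` variable are hypothetical names.

```bash
# AB is a hypothetical variable naming the CLI binary; override it to stub the tool.
AB="${AB:-agent-browser}"

# Hypothetical helper: read JavaScript on stdin, base64-encode it,
# and pass the result to `eval -b` as a single argument.
# tr removes line breaks that base64 may add to long output.
eval_b64() {
  encoded=$(base64 | tr -d '\n')
  "$AB" eval -b "$encoded"
}
```

Usage might look like `printf '%s' 'document.title' | eval_b64`; `printf '%s'` is used instead of `echo -n` because the latter is not portable across shells.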

Configuration File

Create `agent-browser.json` in the project root for persistent settings:

```json
{
  "headed": true,
  "proxy": "http://localhost:8080",
  "profile": "./browser-data"
}
```

Priority (lowest to highest): `~/.agent-browser/config.json` < `./agent-browser.json` < env vars < CLI flags. Use `--config <path>` or the `AGENT_BROWSER_CONFIG` env var for a custom config file (exits with error if missing/invalid). All CLI options map to camelCase keys (e.g., `--executable-path` -> `"executablePath"`). Boolean flags accept `true`/`false` values (e.g., `--headed false` overrides config). Extensions from user and project configs are merged, not replaced.
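The precedence rule can be illustrated with a small function that takes one value per source, ordered from lowest to highest priority, and keeps the last one that is set. This only models the rule as described above; it is not how agent-browser itself loads configuration, and `resolve_setting` is a hypothetical name.

```bash
# Hypothetical illustration of the precedence order described above.
# Arguments, lowest to highest priority:
#   user config, project config, env var, CLI flag
# An empty string means "not set at that level".
resolve_setting() {
  result=""
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      result=$candidate
    fi
  done
  printf '%s\n' "$result"
}
```

With `headed` set to `false` in the user config and `true` in the project config, `resolve_setting false true "" ""` yields `true`; adding `--headed false` on the command line corresponds to `resolve_setting false true "" false`, which yields `false`.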

Deep-Dive Documentation

| Reference | When to Use |
| --- | --- |
| references/commands.md | Full command reference with all options |
| references/snapshot-refs.md | Ref lifecycle, invalidation rules, troubleshooting |
| references/session-management.md | Parallel sessions, state persistence, concurrent scraping |
| references/authentication.md | Login flows, OAuth, 2FA handling, state reuse |
| references/video-recording.md | Recording workflows for debugging and documentation |
| references/profiling.md | Chrome DevTools profiling for performance analysis |
| references/proxy-support.md | Proxy configuration, geo-testing, rotating proxies |

Ready-to-Use Templates

| Template | Description |
| --- | --- |
| templates/form-automation.sh | Form filling with validation |
| templates/authenticated-session.sh | Login once, reuse state |
| templates/capture-workflow.sh | Content extraction with screenshots |

```bash
./templates/form-automation.sh https://example.com/form
./templates/authenticated-session.sh https://app.example.com/login
./templates/capture-workflow.sh https://example.com ./output
```