browser-act

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Browser Automation with browser-act CLI

使用browser-act CLI实现浏览器自动化

browser-act
is a CLI for browser automation with stealth and captcha solving capabilities. It supports two browser types (Stealth and Real Chrome) and provides commands for navigation, page interaction, data extraction, tab/session management, and more.
All commands output human-readable text by default. Use
--format json
for structured JSON output, ideal for AI agent integration and scripting.
browser-act
是一款具备隐身能力和验证码解决功能的浏览器自动化CLI。它支持两种浏览器类型(隐身浏览器和真实Chrome),提供导航、页面交互、数据提取、标签页/会话管理等命令。
所有命令默认输出人类可读文本。使用
--format json
可获取结构化JSON输出,非常适合AI Agent集成和脚本编写。

Installation

安装

bash
undefined
bash
undefined

Upgrade if installed, otherwise install fresh

Upgrade if installed, otherwise install fresh

uv tool upgrade browser-act-cli || uv tool install browser-act-cli --python 3.12

Run this at the start of every session to ensure the latest version.

**Global options** available on every command:

| Option | Default | Description |
|--------|---------|-------------|
| `--session <name>` | `default` | Session name (isolates browser state) |
| `--format <text\|json>` | `text` | Output format |
| `--intent <desc>` | none | Caller intent for analytics |
| `--version` | | Show version |
| `-h, --help` | | Show help |
uv tool upgrade browser-act-cli || uv tool install browser-act-cli --python 3.12

每次会话开始时运行此命令以确保使用最新版本。

**全局选项** 所有命令都支持以下选项:

| 选项 | 默认值 | 描述 |
|--------|---------|-------------|
| `--session <name>` | `default` | 会话名称(隔离浏览器状态) |
| `--format <text\|json>` | `text` | 输出格式 |
| `--intent <desc>` | none | 用于分析的调用方意图 |
| `--version` | | 显示版本 |
| `-h, --help` | | 显示帮助 |

Authentication

认证

Some features require a BrowserAct API key (stealth browsers, captcha solving, etc.). Real Chrome and basic page operations work without one.
Option 1: Interactive registration (recommended)
bash
undefined
部分功能需要BrowserAct API密钥(隐身浏览器、验证码解决等)。真实Chrome和基础页面操作无需API密钥。
选项1:交互式注册(推荐)
bash
undefined

Step 1: Get registration URL

Step 1: Get registration URL

browser-act auth login
browser-act auth login

Output: registration URL + instructions

Output: registration URL + instructions

Step 2: Check registration status (single check, not a loop)

Step 2: Check registration status (single check, not a loop)

browser-act auth poll
browser-act auth poll

Returns API key on success, or pending status if not yet completed

Returns API key on success, or pending status if not yet completed


**AI agent flow:** Call `auth login`, present the registration URL to the user, then loop `auth poll` every few seconds until it returns success. When the response indicates less than 10 minutes remaining before expiry, warn the user to complete registration promptly.

```bash
browser-act auth login

**AI Agent流程:** 调用`auth login`,向用户展示注册URL,然后每隔几秒循环调用`auth poll`直到返回成功。当响应显示剩余有效期不足10分钟时,提醒用户尽快完成注册。

```bash
browser-act auth login

→ show URL to user, ask them to register

→ show URL to user, ask them to register

browser-act auth poll # check browser-act auth poll # retry after a few seconds browser-act auth poll # ... until success, expiry, or give up
browser-act auth poll # check browser-act auth poll # retry after a few seconds browser-act auth poll # ... until success, expiry, or give up

⚠ if remaining time < 10 min, warn the user

⚠ if remaining time < 10 min, warn the user


**Option 2: Direct set**

```bash
browser-act auth set <your_api_key>
Get your API key at: https://www.browseract.com
You do not need to set up the API key upfront. When a command requires authentication, the CLI returns a structured error with setup instructions.

**选项2:直接设置**

```bash
browser-act auth set <your_api_key>
获取你的API密钥地址:https://www.browseract.com
不需要提前设置API密钥。当某个命令需要认证时,CLI会返回结构化错误并附带设置说明。

Browser Selection

浏览器选择

browser-act supports two browser types. Choose based on the task:
ScenarioUseWhy
Target site has bot detection / anti-scrapingStealthAnti-detection fingerprinting bypasses bot checks
Need proxy or privacy modeStealthReal Chrome does not support
--proxy
/
--mode
Need multiple browsers in parallelStealthEach Stealth browser is independent; create multiple and run in parallel sessions
Need user's existing login sessions from their daily browserReal ChromeConnects directly to user's Chrome with existing cookies
No bot detection, no login neededEitherStealth is safer default; Real Chrome is simpler
browser-act支持两种浏览器类型,可根据任务选择:
场景选用浏览器原因
目标站点有机器人检测 / 反爬机制隐身浏览器反检测指纹可以绕过机器人检查
需要代理或隐私模式隐身浏览器真实Chrome不支持
--proxy
/
--mode
参数
需要并行运行多个浏览器隐身浏览器每个隐身浏览器都是独立的,可创建多个并在并行会话中运行
需要复用用户日常浏览器中的已有登录会话真实Chrome直接连接用户的Chrome,复用现有Cookie
无机器人检测、无需登录任意隐身浏览器是更安全的默认选择,真实Chrome使用更简单

Stealth Browser

隐身浏览器

Local browsers with anti-detection fingerprinting. Ideal for sites with bot detection.
bash
undefined
具备反检测指纹的本地浏览器,非常适合访问带有机器人检测的站点。
bash
undefined

Create

Create

browser-act browser create "my-browser" browser-act browser create "my-browser" --proxy socks5://host:port --mode private
browser-act browser create "my-browser" browser-act browser create "my-browser" --proxy socks5://host:port --mode private

Update

Update

browser-act browser update <browser_id> --name "new-name" browser-act browser update <browser_id> --proxy http://proxy:8080 --mode private
browser-act browser update <browser_id> --name "new-name" browser-act browser update <browser_id> --proxy http://proxy:8080 --mode private

List / Delete / Clear profile

List / Delete / Clear profile

browser-act browser list browser-act browser delete <browser_id> browser-act browser clear-profile <browser_id>

| Option | Description |
|--------|-------------|
| `--desc` | Browser description |
| `--proxy <url>` | Proxy with scheme (`http`, `https`, `socks4`, `socks5`), e.g. `socks5://host:port` |
| `--mode <normal\|private>` | `normal` (default): persists cache, cookies, login across launches. `private`: fresh environment every launch, no saved state |

Stealth browsers in `normal` mode (default) persist cookies, cache, and login sessions across launches — you can log in once and reuse the session, similar to a regular browser profile.
browser-act browser list browser-act browser delete <browser_id> browser-act browser clear-profile <browser_id>

| 选项 | 描述 |
|--------|-------------|
| `--desc` | 浏览器描述 |
| `--proxy <url>` | 带协议的代理地址(`http`、`https`、`socks4`、`socks5`),例如 `socks5://host:port` |
| `--mode <normal\|private>` | `normal`(默认):跨启动持久化缓存、Cookie、登录状态;`private`:每次启动都是全新环境,不保存任何状态 |

默认`normal`模式的隐身浏览器会跨启动持久化Cookie、缓存和登录会话——你可以登录一次就复用会话,和常规浏览器配置文件类似。

Real Chrome

真实Chrome

Connect to your local Chrome instance (uses your existing login sessions).
bash
browser-act browser real open https://example.com                  # Real Chrome with Default profile (existing logins/cookies)
browser-act browser real open https://example.com --cdp 9222       # Connect to Chrome on a specific CDP port
browser-act browser real open https://example.com --auto-connect   # Auto-discover running Chrome via CDP
Important: Do NOT manually create new Chrome profiles to obtain a CDP address. If the user's local Chrome is unavailable, use a Stealth browser instead.
连接你本地的Chrome实例(复用你现有的登录会话)。
bash
browser-act browser real open https://example.com                  # Real Chrome with Default profile (existing logins/cookies)
browser-act browser real open https://example.com --cdp 9222       # Connect to Chrome on a specific CDP port
browser-act browser real open https://example.com --auto-connect   # Auto-discover running Chrome via CDP
重要提示: 不要手动创建新的Chrome配置文件来获取CDP地址。如果用户本地Chrome不可用,请改用隐身浏览器

Core Workflow

核心工作流

Every browser automation follows this loop: Open → Inspect → Interact → Verify
  1. Open:
    browser-act browser open <browser_id> <url>
    (Stealth) or
    browser-act browser real open <url>
    (Real Chrome)
  2. Inspect:
    browser-act state
    — returns interactive elements with index numbers
  3. Interact: use indices from
    state
    (
    browser-act click 5
    ,
    browser-act input 3 "text"
    )
  4. Verify:
    browser-act state
    or
    browser-act screenshot
    — confirm result
bash
browser-act browser open <browser_id> https://example.com/login
browser-act state
所有浏览器自动化都遵循这个循环:打开 → 检查 → 交互 → 验证
  1. 打开
    browser-act browser open <browser_id> <url>
    (隐身浏览器)或
    browser-act browser real open <url>
    (真实Chrome)
  2. 检查
    browser-act state
    —— 返回带索引编号的可交互元素
  3. 交互:使用
    state
    返回的索引(
    browser-act click 5
    browser-act input 3 "text"
  4. 验证
    browser-act state
    browser-act screenshot
    —— 确认操作结果
bash
browser-act browser open <browser_id> https://example.com/login
browser-act state

Output: [3] input "Email", [4] input "Password", [5] button "Sign In"

Output: [3] input "Email", [4] input "Password", [5] button "Sign In"

browser-act input 3 "user@example.com" browser-act input 4 "password123" browser-act click 5 browser-act wait stable browser-act state # Always re-inspect after page changes

**Important:** After any action that changes the page (click, navigation, form submit), run `wait stable` then `state` to get fresh element indices. Old indices become invalid after page changes.
browser-act input 3 "user@example.com" browser-act input 4 "password123" browser-act click 5 browser-act wait stable browser-act state # 页面变更后务必重新检查元素

**重要提示:** 执行任何会改变页面的操作(点击、导航、表单提交)后,先运行`wait stable`再运行`state`获取最新的元素索引。页面变更后旧索引会失效。

Command Chaining

命令串联

Commands can be chained with
&&
in a single shell invocation. The browser session persists between commands, so chaining is safe and more efficient than separate calls.
bash
undefined
可以在单次shell调用中用
&&
串联命令。浏览器会话在命令之间会保留,因此串联是安全的,比分开调用效率更高。
bash
undefined

Open + wait + inspect in one call

Open + wait + inspect in one call

browser-act browser open <browser_id> https://example.com && browser-act wait stable && browser-act state
browser-act browser open <browser_id> https://example.com && browser-act wait stable && browser-act state

Chain multiple interactions

Chain multiple interactions

browser-act input 3 "user@example.com" && browser-act input 4 "password123" && browser-act click 5
browser-act input 3 "user@example.com" && browser-act input 4 "password123" && browser-act click 5

Navigate and capture

Navigate and capture

browser-act navigate https://example.com/dashboard && browser-act wait stable && browser-act screenshot

**When to chain:** Use `&&` when you don't need to read intermediate output before proceeding (e.g., fill multiple fields, then click). Run commands separately when you need to parse the output first (e.g., `state` to discover indices, then interact using those indices).
browser-act navigate https://example.com/dashboard && browser-act wait stable && browser-act screenshot

**何时使用串联:** 当你不需要在继续执行前读取中间输出时使用`&&`(例如填写多个字段后点击)。当你需要先解析输出时分开运行命令(例如运行`state`获取索引,然后使用这些索引进行交互)。

Command Reference

命令参考

Navigation

导航

bash
browser-act navigate <url>      # Navigate to URL
browser-act back                # Go back
browser-act forward             # Go forward
browser-act reload              # Reload page
bash
browser-act navigate <url>      # 跳转到指定URL
browser-act back                # 返回上一页
browser-act forward             # 前进到下一页
browser-act reload              # 刷新页面

Page State & Interaction

页面状态与交互

bash
undefined
bash
undefined

Inspect

检查

browser-act state # Interactive elements with index numbers browser-act screenshot # Screenshot (auto path) browser-act screenshot ./page.png # Screenshot to specific path
browser-act state # 获取带索引编号的可交互元素 browser-act screenshot # 截图(自动生成保存路径) browser-act screenshot ./page.png # 截图保存到指定路径

Interact (use index from state)

交互(使用state返回的索引)

browser-act click <index> # Click element browser-act hover <index> # Hover over element browser-act input <index> "text" # Click element then type browser-act keys "Enter" # Send keyboard keys browser-act scroll down # Scroll down (default 500px) browser-act scroll up --amount 1000 # Scroll up 1000px
undefined
browser-act click <index> # 点击元素 browser-act hover <index> # 悬浮在元素上 browser-act input <index> "text" # 点击元素后输入文本 browser-act keys "Enter" # 发送键盘按键 browser-act scroll down # 向下滚动(默认500px) browser-act scroll up --amount 1000 # 向上滚动1000px
undefined

Data Extraction

数据提取

bash
browser-act get title                     # Page title
browser-act get html                      # Full page HTML
browser-act get text <index>              # Text content of element
browser-act get value <index>             # Value of input/textarea
browser-act get markdown                  # Page as markdown
bash
browser-act get title                     # 获取页面标题
browser-act get html                      # 获取完整页面HTML
browser-act get text <index>              # 获取元素的文本内容
browser-act get value <index>             # 获取input/textarea的value值
browser-act get markdown                  # 获取页面的markdown格式内容

JavaScript Evaluation

JavaScript执行

bash
browser-act eval "document.title"         # Execute JavaScript
bash
browser-act eval "document.title"         # 执行JavaScript代码

Tab Management

标签页管理

bash
browser-act tab list                      # List open tabs
browser-act tab switch <tab_id>           # Switch to tab
browser-act tab close                     # Close current tab
browser-act tab close <tab_id>            # Close specific tab
bash
browser-act tab list                      # 列出打开的标签页
browser-act tab switch <tab_id>           # 切换到指定标签页
browser-act tab close                     # 关闭当前标签页
browser-act tab close <tab_id>            # 关闭指定标签页

Wait

等待

bash
browser-act wait stable                   # Wait for page stable (doc ready + network idle)
browser-act wait stable --timeout 60000   # Custom timeout (ms)
bash
browser-act wait stable                   # 等待页面稳定(文档加载完成 + 网络空闲)
browser-act wait stable --timeout 60000   # 自定义超时时间(毫秒)

Captcha Solving

验证码解决

When a page presents a captcha challenge, use
solve-captcha
to solve it automatically. This is a built-in capability of the browser-act service — run the command directly without hesitation.
bash
browser-act solve-captcha                 # Solve captcha on current page
当页面出现验证码挑战时,使用
solve-captcha
自动解决。这是browser-act服务的内置能力——可以直接运行该命令,无需犹豫。
bash
browser-act solve-captcha                 # 解决当前页面的验证码

Parallel Automation

并行自动化

Use separate sessions to run multiple browsers in parallel. Each
--session <name>
creates an isolated browser context — commands to different sessions can execute concurrently without conflicts.
bash
undefined
使用独立会话并行运行多个浏览器。每个
--session <name>
都会创建一个隔离的浏览器上下文——发送到不同会话的命令可以并发执行,不会冲突。
bash
undefined

Create stealth browsers for each task

Create stealth browsers for each task

browser-act browser create "site-a" --desc "Scraper for site A" browser-act browser create "site-b" --desc "Scraper for site B"
browser-act browser create "site-a" --desc "Scraper for site A" browser-act browser create "site-b" --desc "Scraper for site B"

Open each in its own session (run in parallel)

Open each in its own session (run in parallel)

browser-act --session site-a browser open <browser_id_a> https://site-a.com browser-act --session site-b browser open <browser_id_b> https://site-b.com
browser-act --session site-a browser open <browser_id_a> https://site-a.com browser-act --session site-b browser open <browser_id_b> https://site-b.com

Interact independently (can run in parallel)

Interact independently (can run in parallel)

browser-act --session site-a state browser-act --session site-a click 3
browser-act --session site-b state browser-act --session site-b click 5
browser-act --session site-a state browser-act --session site-a click 3
browser-act --session site-b state browser-act --session site-b click 5

Clean up

Clean up

browser-act session close site-a browser-act session close site-b

Always close sessions when done to free resources.
browser-act session close site-a browser-act session close site-b

使用完后请始终关闭会话以释放资源。

Session Management

会话管理

Sessions isolate browser state. Each session runs its own background server.
bash
undefined
会话用于隔离浏览器状态。每个会话运行自己的后台服务。
bash
undefined

Use a named session

Use a named session

browser-act --session scraper navigate https://example.com browser-act --session scraper state
browser-act --session scraper navigate https://example.com browser-act --session scraper state

List active sessions

List active sessions

browser-act session list
browser-act session list

Close sessions

Close sessions

browser-act session close # Close default session browser-act session close scraper # Close specific session browser-act session close --all # Close all sessions

The server auto-shuts down after a period of inactivity.
browser-act session close # 关闭默认会话 browser-act session close scraper # 关闭指定会话 browser-act session close --all # 关闭所有会话

服务器会在一段时间无活动后自动关闭。

Site Notes

站点笔记

Operational experience accumulated during browser automation is stored per domain in
references/site-notes/
.
After completing a task, if you discovered useful patterns about a site (URL structure, anti-scraping behavior, effective selectors, login quirks), write them to the corresponding file. Only write verified facts, not guesses.
File format:
markdown
---
domain: example.com
updated: 2026-03-28
---
浏览器自动化过程中积累的运营经验会按域名存储在
references/site-notes/
目录下。
完成任务后,如果你发现了关于某个站点的有用规律(URL结构、反爬行为、有效选择器、登录注意事项),请将其写入对应的文件。只写入已验证的事实,不要写猜测内容。
文件格式:
markdown
---
domain: example.com
updated: 2026-03-28
---

Platform Characteristics

平台特性

Architecture, anti-scraping behavior, login requirements, content loading patterns.
架构、反爬行为、登录要求、内容加载模式。

Effective Patterns

有效方案

Verified URL patterns, selectors, interaction strategies.
已验证的URL模式、选择器、交互策略。

Known Pitfalls

已知坑点

What fails and why.

**Before operating on a target site**, check if a note file exists and read it for prior knowledge. Notes are dated — treat them as hints that may have changed, not guarantees.
哪些操作会失败及原因。

**在操作目标站点之前**,检查是否存在笔记文件,阅读以获取先验知识。笔记带有日期——仅作为可能已变更的提示,不作为保证。

System Commands

系统命令

bash
browser-act report-log                    # Upload logs to help diagnose issues
browser-act feedback "message"            # Send feedback to help improve this skill
If you encounter issues or have suggestions for improving browser-act, use
feedback
to let us know. This directly helps us improve the tool and this skill.
bash
browser-act report-log                    # 上传日志帮助诊断问题
browser-act feedback "message"            # 发送反馈帮助改进该工具
如果你遇到问题或有改进browser-act的建议,使用
feedback
告知我们。这将直接帮助我们改进工具和该技能。

Troubleshooting

故障排除

  • browser-act: command not found
    — Run
    uv tool install browser-act-cli --python 3.12
  • browser-act: command not found
    —— 运行
    uv tool install browser-act-cli --python 3.12

References

参考

PathDescription
references/site-notes/{domain}.md
Per-site operational experience. Read before operating on a known site.
路径描述
references/site-notes/{domain}.md
按站点存储的运营经验,操作已知站点前请阅读。