browser-act
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseBrowser Automation with browser-act CLI
使用browser-act CLI实现浏览器自动化
browser-actAll commands output human-readable text by default. Use for structured JSON output, ideal for AI agent integration and scripting.
--format jsonbrowser-act所有命令默认输出人类可读文本。使用可获取结构化JSON输出,非常适合AI Agent集成和脚本编写。
--format jsonInstallation
安装
bash
undefinedbash
undefinedUpgrade if installed, otherwise install fresh
Upgrade if installed, otherwise install fresh
uv tool upgrade browser-act-cli || uv tool install browser-act-cli --python 3.12
Run this at the start of every session to ensure the latest version.
**Global options** available on every command:
| Option | Default | Description |
|--------|---------|-------------|
| `--session <name>` | `default` | Session name (isolates browser state) |
| `--format <text\|json>` | `text` | Output format |
| `--intent <desc>` | none | Caller intent for analytics |
| `--version` | | Show version |
| `-h, --help` | | Show help |uv tool upgrade browser-act-cli || uv tool install browser-act-cli --python 3.12
每次会话开始时运行此命令以确保使用最新版本。
**全局选项** 所有命令都支持以下选项:
| 选项 | 默认值 | 描述 |
|--------|---------|-------------|
| `--session <name>` | `default` | 会话名称(隔离浏览器状态) |
| `--format <text\|json>` | `text` | 输出格式 |
| `--intent <desc>` | none | 用于分析的调用方意图 |
| `--version` | | 显示版本 |
| `-h, --help` | | 显示帮助 |Authentication
认证
Some features require a BrowserAct API key (stealth browsers, captcha solving, etc.). Real Chrome and basic page operations work without one.
Option 1: Interactive registration (recommended)
bash
undefined部分功能需要BrowserAct API密钥(隐身浏览器、验证码解决等)。真实Chrome和基础页面操作无需API密钥。
选项1:交互式注册(推荐)
bash
undefinedStep 1: Get registration URL
Step 1: Get registration URL
browser-act auth login
browser-act auth login
Output: registration URL + instructions
Output: registration URL + instructions
Step 2: Check registration status (single check, not a loop)
Step 2: Check registration status (single check, not a loop)
browser-act auth poll
browser-act auth poll
Returns API key on success, or pending status if not yet completed
Returns API key on success, or pending status if not yet completed
**AI agent flow:** Call `auth login`, present the registration URL to the user, then loop `auth poll` every few seconds until it returns success. When the response indicates less than 10 minutes remaining before expiry, warn the user to complete registration promptly.
```bash
browser-act auth login
**AI Agent流程:** 调用`auth login`,向用户展示注册URL,然后每隔几秒循环调用`auth poll`直到返回成功。当响应显示剩余有效期不足10分钟时,提醒用户尽快完成注册。
```bash
browser-act auth login→ show URL to user, ask them to register
→ show URL to user, ask them to register
browser-act auth poll # check
browser-act auth poll # retry after a few seconds
browser-act auth poll # ... until success, expiry, or give up
browser-act auth poll # check
browser-act auth poll # retry after a few seconds
browser-act auth poll # ... until success, expiry, or give up
⚠ if remaining time < 10 min, warn the user
⚠ if remaining time < 10 min, warn the user
**Option 2: Direct set**
```bash
browser-act auth set <your_api_key>Get your API key at: https://www.browseract.com
You do not need to set up the API key upfront. When a command requires authentication, the CLI returns a structured error with setup instructions.
**选项2:直接设置**
```bash
browser-act auth set <your_api_key>获取你的API密钥地址:https://www.browseract.com
你不需要提前设置API密钥。当某个命令需要认证时,CLI会返回结构化错误并附带设置说明。
Browser Selection
浏览器选择
browser-act supports two browser types. Choose based on the task:
| Scenario | Use | Why |
|---|---|---|
| Target site has bot detection / anti-scraping | Stealth | Anti-detection fingerprinting bypasses bot checks |
| Need proxy or privacy mode | Stealth | Real Chrome does not support |
| Need multiple browsers in parallel | Stealth | Each Stealth browser is independent; create multiple and run in parallel sessions |
| Need user's existing login sessions from their daily browser | Real Chrome | Connects directly to user's Chrome with existing cookies |
| No bot detection, no login needed | Either | Stealth is safer default; Real Chrome is simpler |
browser-act支持两种浏览器类型,可根据任务选择:
| 场景 | 选用浏览器 | 原因 |
|---|---|---|
| 目标站点有机器人检测 / 反爬机制 | 隐身浏览器 | 反检测指纹可以绕过机器人检查 |
| 需要代理或隐私模式 | 隐身浏览器 | 真实Chrome不支持 |
| 需要并行运行多个浏览器 | 隐身浏览器 | 每个隐身浏览器都是独立的,可创建多个并在并行会话中运行 |
| 需要复用用户日常浏览器中的已有登录会话 | 真实Chrome | 直接连接用户的Chrome,复用现有Cookie |
| 无机器人检测、无需登录 | 任意 | 隐身浏览器是更安全的默认选择,真实Chrome使用更简单 |
Stealth Browser
隐身浏览器
Local browsers with anti-detection fingerprinting. Ideal for sites with bot detection.
bash
undefined具备反检测指纹的本地浏览器,非常适合访问带有机器人检测的站点。
bash
undefinedCreate
Create
browser-act browser create "my-browser"
browser-act browser create "my-browser" --proxy socks5://host:port --mode private
browser-act browser create "my-browser"
browser-act browser create "my-browser" --proxy socks5://host:port --mode private
Update
Update
browser-act browser update <browser_id> --name "new-name"
browser-act browser update <browser_id> --proxy http://proxy:8080 --mode private
browser-act browser update <browser_id> --name "new-name"
browser-act browser update <browser_id> --proxy http://proxy:8080 --mode private
List / Delete / Clear profile
List / Delete / Clear profile
browser-act browser list
browser-act browser delete <browser_id>
browser-act browser clear-profile <browser_id>
| Option | Description |
|--------|-------------|
| `--desc` | Browser description |
| `--proxy <url>` | Proxy with scheme (`http`, `https`, `socks4`, `socks5`), e.g. `socks5://host:port` |
| `--mode <normal\|private>` | `normal` (default): persists cache, cookies, login across launches. `private`: fresh environment every launch, no saved state |
Stealth browsers in `normal` mode (default) persist cookies, cache, and login sessions across launches — you can log in once and reuse the session, similar to a regular browser profile.browser-act browser list
browser-act browser delete <browser_id>
browser-act browser clear-profile <browser_id>
| 选项 | 描述 |
|--------|-------------|
| `--desc` | 浏览器描述 |
| `--proxy <url>` | 带协议的代理地址(`http`、`https`、`socks4`、`socks5`),例如 `socks5://host:port` |
| `--mode <normal\|private>` | `normal`(默认):跨启动持久化缓存、Cookie、登录状态;`private`:每次启动都是全新环境,不保存任何状态 |
默认`normal`模式的隐身浏览器会跨启动持久化Cookie、缓存和登录会话——你可以登录一次就复用会话,和常规浏览器配置文件类似。Real Chrome
真实Chrome
Connect to your local Chrome instance (uses your existing login sessions).
bash
browser-act browser real open https://example.com # Real Chrome with Default profile (existing logins/cookies)
browser-act browser real open https://example.com --cdp 9222 # Connect to Chrome on a specific CDP port
browser-act browser real open https://example.com --auto-connect # Auto-discover running Chrome via CDPImportant: Do NOT manually create new Chrome profiles to obtain a CDP address. If the user's local Chrome is unavailable, use a Stealth browser instead.
连接你本地的Chrome实例(复用你现有的登录会话)。
bash
browser-act browser real open https://example.com # Real Chrome with Default profile (existing logins/cookies)
browser-act browser real open https://example.com --cdp 9222 # Connect to Chrome on a specific CDP port
browser-act browser real open https://example.com --auto-connect # Auto-discover running Chrome via CDP重要提示: 不要手动创建新的Chrome配置文件来获取CDP地址。如果用户本地Chrome不可用,请改用隐身浏览器。
Core Workflow
核心工作流
Every browser automation follows this loop: Open → Inspect → Interact → Verify
- Open: (Stealth) or
browser-act browser open <browser_id> <url>(Real Chrome)browser-act browser real open <url> - Inspect: — returns interactive elements with index numbers
browser-act state - Interact: use indices from (
state,browser-act click 5)browser-act input 3 "text" - Verify: or
browser-act state— confirm resultbrowser-act screenshot
bash
browser-act browser open <browser_id> https://example.com/login
browser-act state所有浏览器自动化都遵循这个循环:打开 → 检查 → 交互 → 验证
- 打开:(隐身浏览器)或
browser-act browser open <browser_id> <url>(真实Chrome)browser-act browser real open <url> - 检查:—— 返回带索引编号的可交互元素
browser-act state - 交互:使用返回的索引(
state、browser-act click 5)browser-act input 3 "text" - 验证:或
browser-act state—— 确认操作结果browser-act screenshot
bash
browser-act browser open <browser_id> https://example.com/login
browser-act stateOutput: [3] input "Email", [4] input "Password", [5] button "Sign In"
Output: [3] input "Email", [4] input "Password", [5] button "Sign In"
browser-act input 3 "user@example.com"
browser-act input 4 "password123"
browser-act click 5
browser-act wait stable
browser-act state # Always re-inspect after page changes
**Important:** After any action that changes the page (click, navigation, form submit), run `wait stable` then `state` to get fresh element indices. Old indices become invalid after page changes.browser-act input 3 "user@example.com"
browser-act input 4 "password123"
browser-act click 5
browser-act wait stable
browser-act state # 页面变更后务必重新检查元素
**重要提示:** 执行任何会改变页面的操作(点击、导航、表单提交)后,先运行`wait stable`再运行`state`获取最新的元素索引。页面变更后旧索引会失效。Command Chaining
命令串联
Commands can be chained with in a single shell invocation. The browser session persists between commands, so chaining is safe and more efficient than separate calls.
&&bash
undefined可以在单次shell调用中用串联命令。浏览器会话在命令之间会保留,因此串联是安全的,比分开调用效率更高。
&&bash
undefinedOpen + wait + inspect in one call
Open + wait + inspect in one call
browser-act browser open <browser_id> https://example.com && browser-act wait stable && browser-act state
browser-act browser open <browser_id> https://example.com && browser-act wait stable && browser-act state
Chain multiple interactions
Chain multiple interactions
browser-act input 3 "user@example.com" && browser-act input 4 "password123" && browser-act click 5
browser-act input 3 "user@example.com" && browser-act input 4 "password123" && browser-act click 5
Navigate and capture
Navigate and capture
browser-act navigate https://example.com/dashboard && browser-act wait stable && browser-act screenshot
**When to chain:** Use `&&` when you don't need to read intermediate output before proceeding (e.g., fill multiple fields, then click). Run commands separately when you need to parse the output first (e.g., `state` to discover indices, then interact using those indices).browser-act navigate https://example.com/dashboard && browser-act wait stable && browser-act screenshot
**何时使用串联:** 当你不需要在继续执行前读取中间输出时使用`&&`(例如填写多个字段后点击)。当你需要先解析输出时分开运行命令(例如运行`state`获取索引,然后使用这些索引进行交互)。Command Reference
命令参考
Navigation
导航
bash
browser-act navigate <url> # Navigate to URL
browser-act back # Go back
browser-act forward # Go forward
browser-act reload # Reload pagebash
browser-act navigate <url> # 跳转到指定URL
browser-act back # 返回上一页
browser-act forward # 前进到下一页
browser-act reload # 刷新页面Page State & Interaction
页面状态与交互
bash
undefinedbash
undefinedInspect
检查
browser-act state # Interactive elements with index numbers
browser-act screenshot # Screenshot (auto path)
browser-act screenshot ./page.png # Screenshot to specific path
browser-act state # 获取带索引编号的可交互元素
browser-act screenshot # 截图(自动生成保存路径)
browser-act screenshot ./page.png # 截图保存到指定路径
Interact (use index from state)
交互(使用state返回的索引)
browser-act click <index> # Click element
browser-act hover <index> # Hover over element
browser-act input <index> "text" # Click element then type
browser-act keys "Enter" # Send keyboard keys
browser-act scroll down # Scroll down (default 500px)
browser-act scroll up --amount 1000 # Scroll up 1000px
undefinedbrowser-act click <index> # 点击元素
browser-act hover <index> # 悬浮在元素上
browser-act input <index> "text" # 点击元素后输入文本
browser-act keys "Enter" # 发送键盘按键
browser-act scroll down # 向下滚动(默认500px)
browser-act scroll up --amount 1000 # 向上滚动1000px
undefinedData Extraction
数据提取
bash
browser-act get title # Page title
browser-act get html # Full page HTML
browser-act get text <index> # Text content of element
browser-act get value <index> # Value of input/textarea
browser-act get markdown # Page as markdownbash
browser-act get title # 获取页面标题
browser-act get html # 获取完整页面HTML
browser-act get text <index> # 获取元素的文本内容
browser-act get value <index> # 获取input/textarea的value值
browser-act get markdown # 获取页面的markdown格式内容JavaScript Evaluation
JavaScript执行
bash
browser-act eval "document.title" # Execute JavaScriptbash
browser-act eval "document.title" # 执行JavaScript代码Tab Management
标签页管理
bash
browser-act tab list # List open tabs
browser-act tab switch <tab_id> # Switch to tab
browser-act tab close # Close current tab
browser-act tab close <tab_id> # Close specific tabbash
browser-act tab list # 列出打开的标签页
browser-act tab switch <tab_id> # 切换到指定标签页
browser-act tab close # 关闭当前标签页
browser-act tab close <tab_id> # 关闭指定标签页Wait
等待
bash
browser-act wait stable # Wait for page stable (doc ready + network idle)
browser-act wait stable --timeout 60000 # Custom timeout (ms)bash
browser-act wait stable # 等待页面稳定(文档加载完成 + 网络空闲)
browser-act wait stable --timeout 60000 # 自定义超时时间(毫秒)Captcha Solving
验证码解决
When a page presents a captcha challenge, use to solve it automatically. This is a built-in capability of the browser-act service — run the command directly without hesitation.
solve-captchabash
browser-act solve-captcha # Solve captcha on current page当页面出现验证码挑战时,使用自动解决。这是browser-act服务的内置能力——可以直接运行该命令,无需犹豫。
solve-captchabash
browser-act solve-captcha # 解决当前页面的验证码Parallel Automation
并行自动化
Use separate sessions to run multiple browsers in parallel. Each creates an isolated browser context — commands to different sessions can execute concurrently without conflicts.
--session <name>bash
undefined使用独立会话并行运行多个浏览器。每个都会创建一个隔离的浏览器上下文——发送到不同会话的命令可以并发执行,不会冲突。
--session <name>bash
undefinedCreate stealth browsers for each task
Create stealth browsers for each task
browser-act browser create "site-a" --desc "Scraper for site A"
browser-act browser create "site-b" --desc "Scraper for site B"
browser-act browser create "site-a" --desc "Scraper for site A"
browser-act browser create "site-b" --desc "Scraper for site B"
Open each in its own session (run in parallel)
Open each in its own session (run in parallel)
browser-act --session site-a browser open <browser_id_a> https://site-a.com
browser-act --session site-b browser open <browser_id_b> https://site-b.com
browser-act --session site-a browser open <browser_id_a> https://site-a.com
browser-act --session site-b browser open <browser_id_b> https://site-b.com
Interact independently (can run in parallel)
Interact independently (can run in parallel)
browser-act --session site-a state
browser-act --session site-a click 3
browser-act --session site-b state
browser-act --session site-b click 5
browser-act --session site-a state
browser-act --session site-a click 3
browser-act --session site-b state
browser-act --session site-b click 5
Clean up
Clean up
browser-act session close site-a
browser-act session close site-b
Always close sessions when done to free resources.browser-act session close site-a
browser-act session close site-b
使用完后请始终关闭会话以释放资源。Session Management
会话管理
Sessions isolate browser state. Each session runs its own background server.
bash
undefined会话用于隔离浏览器状态。每个会话运行自己的后台服务。
bash
undefinedUse a named session
Use a named session
browser-act --session scraper navigate https://example.com
browser-act --session scraper state
browser-act --session scraper navigate https://example.com
browser-act --session scraper state
List active sessions
List active sessions
browser-act session list
browser-act session list
Close sessions
Close sessions
browser-act session close # Close default session
browser-act session close scraper # Close specific session
browser-act session close --all # Close all sessions
The server auto-shuts down after a period of inactivity.browser-act session close # 关闭默认会话
browser-act session close scraper # 关闭指定会话
browser-act session close --all # 关闭所有会话
服务器会在一段时间无活动后自动关闭。Site Notes
站点笔记
Operational experience accumulated during browser automation is stored per domain in .
references/site-notes/After completing a task, if you discovered useful patterns about a site (URL structure, anti-scraping behavior, effective selectors, login quirks), write them to the corresponding file. Only write verified facts, not guesses.
File format:
markdown
---
domain: example.com
updated: 2026-03-28
---浏览器自动化过程中积累的运营经验会按域名存储在目录下。
references/site-notes/完成任务后,如果你发现了关于某个站点的有用规律(URL结构、反爬行为、有效选择器、登录注意事项),请将其写入对应的文件。只写入已验证的事实,不要写猜测内容。
文件格式:
markdown
---
domain: example.com
updated: 2026-03-28
---Platform Characteristics
平台特性
Architecture, anti-scraping behavior, login requirements, content loading patterns.
架构、反爬行为、登录要求、内容加载模式。
Effective Patterns
有效方案
Verified URL patterns, selectors, interaction strategies.
已验证的URL模式、选择器、交互策略。
Known Pitfalls
已知坑点
What fails and why.
**Before operating on a target site**, check if a note file exists and read it for prior knowledge. Notes are dated — treat them as hints that may have changed, not guarantees.哪些操作会失败及原因。
**在操作目标站点之前**,检查是否存在笔记文件,阅读以获取先验知识。笔记带有日期——仅作为可能已变更的提示,不作为保证。System Commands
系统命令
bash
browser-act report-log # Upload logs to help diagnose issues
browser-act feedback "message" # Send feedback to help improve this skillIf you encounter issues or have suggestions for improving browser-act, use to let us know. This directly helps us improve the tool and this skill.
feedbackbash
browser-act report-log # 上传日志帮助诊断问题
browser-act feedback "message" # 发送反馈帮助改进该工具如果你遇到问题或有改进browser-act的建议,使用告知我们。这将直接帮助我们改进工具和该技能。
feedbackTroubleshooting
故障排除
- — Run
browser-act: command not founduv tool install browser-act-cli --python 3.12
- —— 运行
browser-act: command not founduv tool install browser-act-cli --python 3.12
References
参考
| Path | Description |
|---|---|
| Per-site operational experience. Read before operating on a known site. |
| 路径 | 描述 |
|---|---|
| 按站点存储的运营经验,操作已知站点前请阅读。 |