browser

Browser Automation

Automate browser interactions using the browse CLI with Claude.

Setup check

Before running any browser commands, verify the CLI is available:

```bash
which browse || npm install -g browse
```

Environment Selection (Local vs Remote)

The CLI supports explicit per-command environment flags. If you do nothing, the next session defaults to Browserbase when `BROWSERBASE_API_KEY` is set and to local otherwise.
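
The default rule above can be mirrored in a small shell helper, for example to log which environment a script expects before it starts a session. This is only a sketch of the documented rule: `expected_default_mode` is a hypothetical name, not part of the CLI, and it treats an empty variable as unset, which is an assumption. `browse status` remains the way to see the mode the daemon actually resolved.

```bash
# Hypothetical helper: predicts the default environment from the rule above.
# It does not call the CLI; it only inspects BROWSERBASE_API_KEY.
# Assumption: an empty value counts as "not set".
expected_default_mode() {
  if [ -n "${BROWSERBASE_API_KEY:-}" ]; then
    echo "remote"   # Browserbase is the default when the key is set
  else
    echo "local"    # otherwise the next session starts locally
  fi
}
```

An explicit flag such as `--local` or `--remote` on the first command takes the guesswork out entirely, so a helper like this is mainly useful for diagnostics.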

Local mode

  • `browse open <url> --local` starts a clean isolated local browser
  • `browse open <url> --auto-connect` attaches to an already-running debuggable Chrome; use `--local` when no debuggable Chrome is available
  • `browse open <url> --cdp <port|url>` attaches to a specific CDP target
  • Best for: development, localhost, trusted sites, and reproducible runs

Remote mode (Browserbase)

  • `browse open <url> --remote` starts a Browserbase session
  • Without a local flag, Browserbase is also the default when `BROWSERBASE_API_KEY` is set
  • Provides: Browserbase Identity, Verified browsers, automatic CAPTCHA solving, residential proxies, session persistence
  • Use remote mode when: the target site has bot detection, CAPTCHAs, IP rate limiting, Cloudflare protection, or requires geo-specific access
  • Get credentials at https://browserbase.com/settings

When to choose which

  • Repeatable local testing / clean state: `browse open <url> --local`
  • Reuse your local login/cookies: `browse open <url> --auto-connect`
  • Simple browsing (docs, wikis, public APIs): local mode is fine
  • Protected sites (login walls, CAPTCHAs, anti-scraping): use remote mode
  • If local mode fails with bot detection or access denied: switch to remote mode
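
The choices above can be captured as a small lookup, which keeps the decision in one place when several scripts open sessions. This is a minimal sketch: `mode_flag_for` and its task categories (`clean`, `reuse-login`, `simple`, `protected`) are hypothetical names invented for illustration; only the flags themselves come from the list above.

```bash
# Hypothetical helper: maps a task category to one of the documented flags.
# The category names are illustrative, not part of the CLI.
mode_flag_for() {
  case "$1" in
    clean)       echo "--local" ;;         # repeatable runs, clean state
    reuse-login) echo "--auto-connect" ;;  # reuse local login/cookies
    simple)      echo "--local" ;;         # docs, wikis, public APIs
    protected)   echo "--remote" ;;        # login walls, CAPTCHAs, anti-scraping
    *)           echo "unknown task type: $1" >&2; return 1 ;;
  esac
}

# Usage: browse open "https://example.com" "$(mode_flag_for clean)"
```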

Commands

Most driver commands work across local, remote, and CDP sessions after the daemon starts.

Navigation

```bash
browse open <url>                        # Go to URL
browse open <url> --local                # Go to URL in a clean local browser
browse open <url> --remote               # Go to URL in a Browserbase session
browse reload                            # Reload current page
browse back                              # Go back in history
browse forward                           # Go forward in history
```

Page state (prefer snapshot over screenshot)

```bash
browse snapshot                          # Get accessibility tree with element refs (fast, structured)
browse screenshot --path <path>          # Take visual screenshot (slow, uses vision tokens)
browse get url                           # Get current URL
browse get title                         # Get page title
browse get text <selector>               # Get text content (use "body" for all text)
browse get html <selector>               # Get HTML content of element
browse get value <selector>              # Get form field value
```

Use `browse snapshot` as your default for understanding page state: it returns the accessibility tree with element refs you can use to interact. Only use `browse screenshot` when you need visual context (layout, images, debugging).
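
Because the snapshot is text, its element refs can be pulled out with ordinary shell tools. The sketch below assumes refs appear in the output in the `@0-5` form used in this document's examples; the exact snapshot format is not specified here, so the pattern may need adjusting. `list_refs` is a hypothetical helper name.

```bash
# Hypothetical helper: lists the distinct element refs found in snapshot text
# read from stdin. Assumes refs look like @0-5; the real format may differ.
list_refs() {
  grep -oE '@[0-9]+-[0-9]+' | sort -u
}

# Usage: browse snapshot | list_refs
```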

Interaction

```bash
browse click <ref>                       # Click element by ref from snapshot (e.g., @0-5)
browse type <text>                       # Type text into focused element
browse fill <selector> <value>           # Fill input; add --press-enter if Enter is needed
browse select <selector> <values...>     # Select dropdown option(s)
browse press <key>                       # Press key (Enter, Tab, Escape, Cmd+A, etc.)
browse mouse drag <fromX> <fromY> <toX> <toY>  # Drag from one point to another
browse mouse scroll <x> <y> <deltaX> <deltaY>  # Scroll at coordinates
browse highlight <selector>              # Highlight element on page
browse is visible <selector>             # Check if element is visible
browse is checked <selector>             # Check if element is checked
browse wait <type> [arg]                 # Wait for: load, selector, timeout
```

Session management

```bash
browse stop                              # Stop the browser daemon
browse status                            # Check daemon status and resolved mode
browse tab list                          # List all open tabs
browse tab switch <index-or-target-id>   # Switch to tab by index or target ID
browse tab close [index-or-target-id]    # Close tab
```

Typical workflow

If the environment matters, put `--local`, `--remote`, `--auto-connect`, or `--cdp <port|url>` on the first browser command.

  1. `browse open <url> --local` or `browse open <url> --remote`: navigate to the page
  2. `browse snapshot`: read the accessibility tree to understand page structure and get element refs
  3. `browse click <ref>` / `browse type <text>` / `browse fill <selector> <value>`: interact using refs from snapshot
  4. `browse snapshot`: confirm the action worked
  5. Repeat 3-4 as needed
  6. `browse stop`: close the browser when done
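
The steps above can be strung together in a single shell function. This is a sketch under stated assumptions rather than a recommended wrapper: `fill_and_verify` is a hypothetical name, the URL, selector, and value are supplied by the caller, and the function always uses `--local` for the first command. It only uses commands listed in this document.

```bash
# Hypothetical wrapper around the workflow above.
# $1 = URL, $2 = selector, $3 = value; all are placeholders from the caller.
fill_and_verify() {
  url=$1 selector=$2 value=$3

  browse open "$url" --local || return 1   # 1. navigate; mode flag goes on the first command
  browse snapshot                          # 2. read page structure and element refs
  browse fill "$selector" "$value"         # 3. interact
  status=$?                                #    remember whether the interaction succeeded
  browse snapshot                          # 4. confirm the action worked
  browse stop                              # 6. close the browser when done
  return "$status"
}

# Usage: fill_and_verify "https://example.com" "#search" "hello"
```

In interactive use you would normally read the first snapshot before deciding what to click or fill; the function fixes that decision in advance only to keep the example short.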

Quick Example

```bash
browse open https://example.com
browse snapshot                          # see page structure + element refs
browse click @0-5                        # click element with ref 0-5
browse get title
browse stop
```

Mode Comparison

| Feature | Local | Browserbase |
|---|---|---|
| Speed | Faster | Slightly slower |
| Setup | Chrome required | API key required |
| Reuse existing local cookies | With `browse open <url> --auto-connect` | N/A |
| Verified browser | No | Yes (Browserbase Verified browser via Identity) |
| CAPTCHA solving | No | Yes (automatic reCAPTCHA/hCaptcha) |
| Residential proxies | No | Yes (201 countries, geo-targeting) |
| Session persistence | No | Yes (cookies/auth persist via contexts) |
| Best for | Development/simple pages | Protected sites, Browserbase Identity + Verified access, production scraping |

Best Practices

  1. Choose the mode deliberately: use `browse open <url> --local` for clean state, `browse open <url> --auto-connect` for existing local credentials, and `browse open <url> --remote` for protected sites
  2. Always run `browse open` first, before interacting
  3. Use `browse snapshot` to check page state; it's fast and gives you element refs
  4. Only screenshot when visual context is needed (layout checks, images, debugging)
  5. Use refs from snapshot to click/interact, e.g., `browse click @0-5`
  6. Run `browse stop` when done to clean up the browser session and clear the env override

Troubleshooting

  • "No active page": Run `browse stop`, then check `browse status`. If it still says running, kill the zombie daemon with `pkill -f "browse.*daemon"`, then retry `browse open`
  • Chrome not found: Install Chrome, use `browse open <url> --auto-connect` if you already have a debuggable Chrome running, or switch to `browse open <url> --remote`
  • Action fails: Run `browse snapshot` to see available elements and their refs
  • Browserbase fails: Verify that `BROWSERBASE_API_KEY` is set
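
The "No active page" recovery above can be scripted as follows. The sketch assumes that `browse status` prints the word "running" while the daemon is alive; the exact status output is not documented here, and a message such as "not running" would also match, so adjust the pattern to what your version prints. `recover_daemon` is a hypothetical name.

```bash
# Hypothetical helper for the "No active page" recovery described above.
# Assumption: `browse status` output contains "running" only while the daemon is alive.
recover_daemon() {
  browse stop
  if browse status 2>/dev/null | grep -qi "running"; then
    pkill -f "browse.*daemon"   # daemon survived the stop: kill the zombie process
  fi
}

# Usage: recover_daemon, then retry your `browse open` command
```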

Switching to Remote Mode

Switch to remote when you detect: CAPTCHAs (reCAPTCHA, hCaptcha, Turnstile), bot detection pages ("Checking your browser..."), HTTP 403/429, empty pages on sites that should have content, or the user asks for it.
Don't switch for simple sites (docs, wikis, public APIs, localhost).

```bash
browse open <url> --local          # clean isolated local browser
browse open <url> --auto-connect   # attach to existing debuggable Chrome
browse open <url> --remote         # Browserbase session
```

Mode flags are applied when a session starts. After `browse stop`, the next start falls back to env-var-based auto detection. Use `browse status` to inspect the resolved mode and target while the daemon is running.
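
The switch signals listed at the start of this section can be checked mechanically once you have a status code and the page text. The following is a rough sketch, not a complete detector: `should_switch_to_remote` is a hypothetical helper, and how the HTTP status is obtained is left to the caller, since the commands listed in this document do not expose it. The page text could come from `browse get text body`.

```bash
# Hypothetical helper: succeeds (exit 0) when a local attempt looks blocked.
# $1 = HTTP status code (obtained by the caller), $2 = page text.
should_switch_to_remote() {
  case "$1" in
    403|429) return 0 ;;                             # access denied or rate limited
  esac
  case "$2" in
    "") return 0 ;;                                  # empty page where content was expected
    *"Checking your browser"*) return 0 ;;           # bot detection interstitial
    *reCAPTCHA*|*hCaptcha*|*Turnstile*) return 0 ;;  # CAPTCHA challenge
  esac
  return 1                                           # nothing suspicious: stay local
}

# Usage (status and text are variables set by the caller):
#   if should_switch_to_remote "$status" "$text"; then
#     browse stop
#     browse open "$url" --remote
#   fi
```

Stopping the daemon before reopening matters here because, as noted above, mode flags only take effect when a session starts.
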
For detailed examples, see EXAMPLES.md. For API reference, see REFERENCE.md.