cmux-browser

Browser Automation with cmux

Use this skill for browser tasks inside cmux webviews.

Core Workflow

  1. Open or target a browser surface.
  2. Snapshot (`--interactive`) to get fresh element refs.
  3. Act with refs (`click`, `fill`, `type`, `select`, `press`).
  4. Wait for state changes.
  5. Re-snapshot after DOM/navigation changes.

```bash
cmux browser open https://example.com --json
# use returned surface ref, for example: surface:7
cmux browser surface:7 snapshot --interactive
cmux browser surface:7 fill e1 "hello"
cmux browser surface:7 click e2 --snapshot-after --json
cmux browser surface:7 wait --load-state complete --timeout-ms 15000
cmux browser surface:7 snapshot --interactive
```
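
The workflow above hard-codes `surface:7`. In a script, the ref would have to be read from the output of `cmux browser open`. The sketch below is one hedged way to do that: it assumes only that the short ref appears literally somewhere in the `--json` output, and it does not assume any particular JSON key. `extract_surface_ref` is a hypothetical helper name, not part of cmux.

```bash
# Pull the first short surface ref (surface:N) out of whatever text is piped in.
# Assumption: the ref appears literally somewhere in the --json output.
extract_surface_ref() {
  grep -oE 'surface:[0-9]+' | head -n 1
}

# Hypothetical usage (not run here):
#   SURFACE=$(cmux browser open https://example.com --json | extract_surface_ref)
#   cmux browser "$SURFACE" snapshot --interactive
```

If the real output exposes a stable JSON field for the ref, parsing that field with a JSON tool would be more robust than pattern matching.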

Surface Targeting

```bash
# identify current context
cmux identify --json

# open routed to a specific topology target
cmux browser open https://example.com --workspace workspace:2 --window window:1 --json
```

Notes:
- CLI output defaults to short refs (`surface:N`, `pane:N`, `workspace:N`, `window:N`).
- UUIDs are still accepted on input; only request UUID output when needed (`--id-format uuids|both`).
- Keep using one `surface:N` per task unless you intentionally switch.
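
A script that expects the short ref format described in the notes could check a value before passing it on. The following is a minimal sketch of such a check; `is_short_ref` is a hypothetical helper, and it deliberately rejects UUIDs even though cmux accepts them on input.

```bash
# Return 0 if $1 looks like a short ref: surface:N, pane:N, workspace:N, or window:N.
# Hypothetical helper; UUIDs (also accepted by cmux on input) are rejected here.
is_short_ref() {
  local n
  case "$1" in
    surface:*|pane:*|workspace:*|window:*) ;;
    *) return 1 ;;
  esac
  n=${1#*:}                       # the part after the first colon
  case "$n" in
    ''|*[!0-9]*) return 1 ;;      # empty or contains a non-digit
  esac
  return 0
}
```

For example, `is_short_ref "$SURFACE" || echo "unexpected ref: $SURFACE" >&2` would flag an empty or malformed value early.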

Wait Support

cmux supports wait patterns similar to agent-browser:

```bash
cmux browser <surface> wait --selector "#ready" --timeout-ms 10000
cmux browser <surface> wait --text "Success" --timeout-ms 10000
cmux browser <surface> wait --url-contains "/dashboard" --timeout-ms 10000
cmux browser <surface> wait --load-state complete --timeout-ms 15000
cmux browser <surface> wait --function "document.readyState === 'complete'" --timeout-ms 10000
```
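
In a script, it may help to wrap these waits so that a failure is reported instead of silently ignored. The sketch below assumes that `cmux browser <surface> wait ...` exits non-zero when the condition is not met within `--timeout-ms`; that exit-code behavior is an assumption and should be checked against the actual CLI. `wait_or_fail` is a hypothetical name.

```bash
# Run one wait condition against a surface and report failures on stderr.
# Assumption: the wait command exits non-zero when the condition is not met in time.
wait_or_fail() {
  local surface=$1
  shift
  if cmux browser "$surface" wait "$@"; then
    return 0
  fi
  echo "wait failed on $surface: $*" >&2
  return 1
}

# Hypothetical usage (not run here):
#   wait_or_fail surface:7 --text "Success" --timeout-ms 10000 || exit 1
```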

Common Flows

Form Submit

```bash
cmux browser open https://example.com/signup --json
cmux browser surface:7 snapshot --interactive
cmux browser surface:7 fill e1 "Jane Doe"
cmux browser surface:7 fill e2 "jane@example.com"
cmux browser surface:7 click e3 --snapshot-after --json
cmux browser surface:7 wait --url-contains "/welcome" --timeout-ms 15000
cmux browser surface:7 snapshot --interactive
```

Clear an Input

```bash
cmux browser surface:7 fill e11 "" --snapshot-after --json
cmux browser surface:7 get value e11 --json
```

Stable Agent Loop (Recommended)

```bash
# snapshot -> action -> wait -> snapshot
cmux browser surface:7 snapshot --interactive
cmux browser surface:7 click e5 --snapshot-after --json
cmux browser surface:7 wait --load-state complete --timeout-ms 15000
cmux browser surface:7 snapshot --interactive
```
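
The action, wait, and re-snapshot part of this loop can be folded into one helper so that no step is skipped. This is a sketch only: `act_and_refresh` is a hypothetical name, the caller is expected to have taken a fresh snapshot already and to pass refs from it, and the helper assumes each cmux command exits non-zero on failure.

```bash
# Perform one action, wait for the page to settle, then take a fresh snapshot.
# Hypothetical helper; assumes cmux commands exit non-zero on failure.
# $1 is the surface ref; the remaining arguments are the action and its flags.
act_and_refresh() {
  local surface=$1
  shift
  cmux browser "$surface" "$@" || return 1
  cmux browser "$surface" wait --load-state complete --timeout-ms 15000 || return 1
  cmux browser "$surface" snapshot --interactive
}

# Hypothetical usage (not run here):
#   cmux browser surface:7 snapshot --interactive
#   act_and_refresh surface:7 click e5 --snapshot-after --json
```

Refs from the snapshot printed at the end are the ones to use for the next action.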

Deep-Dive References

| Reference | When to Use |
| --- | --- |
| references/commands.md | Full browser command mapping and quick syntax |
| references/snapshot-refs.md | Ref lifecycle and stale-ref troubleshooting |
| references/authentication.md | Login/OAuth/2FA patterns and state save/load |
| references/authentication.md#saving-authentication-state | Save authenticated state right after login |
| references/session-management.md | Multi-surface isolation and state persistence patterns |
| references/video-recording.md | Current recording status and practical alternatives |
| references/proxy-support.md | Proxy behavior in WKWebView and workarounds |

Ready-to-Use Templates

| Template | Description |
| --- | --- |
| templates/form-automation.sh | Snapshot/ref form fill loop |
| templates/authenticated-session.sh | Login once, save/load state |
| templates/capture-workflow.sh | Navigate + capture snapshots/screenshots |

Limits (WKWebView)

These commands currently return `not_supported` because they rely on Chrome/CDP-only APIs not exposed by WKWebView:
  • viewport emulation
  • offline emulation
  • trace/screencast recording
  • network route interception/mocking
  • low-level raw input injection

Use supported high-level commands (`click`, `fill`, `press`, `scroll`, `wait`, `snapshot`) instead.
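
A script can separate the `not_supported` case from other failures and switch to a supported command. The sketch below assumes the literal string `not_supported` appears in the command's combined stdout and stderr; how cmux actually surfaces it (exit code, JSON field) is not specified here. `run_or_flag_unsupported` and its return code 2 are conventions of this sketch, not of cmux.

```bash
# Run a cmux browser command, echo its output, and flag the not_supported case.
# Assumption: the literal string not_supported appears in stdout or stderr.
# Return code 2 is this sketch's own convention for "not supported".
run_or_flag_unsupported() {
  local out status
  if out=$(cmux browser "$@" 2>&1); then
    status=0
  else
    status=$?
  fi
  printf '%s\n' "$out"
  case "$out" in
    *not_supported*) return 2 ;;
  esac
  return "$status"
}
```

A caller could test for return code 2 and then retry with one of the supported high-level commands.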