better-agent-browser
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseBetter Agent Browser
更优Agent浏览器
Extends CLI with four capabilities:
agent-browser- Login Watch — Auto-detect login walls, notify user, poll until login completes
- CAPTCHA / Automation Block Detection — Distinguish solvable CAPTCHAs from unsolvable automation blocks
- Site Experience — Per-domain patterns that accumulate over time as agents learn site quirks
- Parallel Tabs — Operate many tabs simultaneously via CDP proxy, sharing one Chrome and one login state
This skill complements agent-browser, not replaces it. is the execution layer. This skill adds detection, experience, and coordination on top.
agent-browser${SKILL_PATH}扩展 CLI的四项能力:
agent-browser- 登录监听(Login Watch) — 自动检测登录拦截墙,通知用户,轮询直到登录完成
- CAPTCHA/自动化拦截检测 — 区分可人工解决的CAPTCHA和无法绕过的自动化拦截
- 站点适配经验(Site Experience) — 按域名存储的适配规则,随Agent遇到和解决站点问题不断积累
- 并行标签(Parallel Tabs) — 通过CDP代理同时操作多个标签页,共享同一个Chrome实例和登录状态
本技能是agent-browser的补充,而非替代。 是执行层,本技能在其之上增加了检测、经验沉淀和协同能力。
agent-browser${SKILL_PATH}Language
语言要求
Match user's language.
匹配用户使用的语言。
Layers — Start Simple, Escalate When Needed
层级架构 — 从简单起步,按需升级
Layer 0a: agent-browser open ← local dev / public pages (no CDP)
Layer 0b: agent-browser connect <port> ← external sites needing login / anti-bot
Layer 1: + login-watch / captcha-watch ← when hitting login walls or anti-bot
+ site-patterns ← read before, write after
Layer 2: + CDP proxy ← parallel multi-tab onlyAlways start at Layer 0. Escalate only when you hit a specific problem.
Layer 0a: agent-browser open ← 本地开发/公开页面(无需CDP)
Layer 0b: agent-browser connect <port> ← 需要登录/反爬的外部站点
Layer 1: + login-watch / captcha-watch ← 遇到登录墙或反爬拦截时启用
+ site-patterns ← 操作前读取,操作后回写
Layer 2: + CDP proxy ← 仅在需要并行多标签时启用始终从Layer 0开始执行,仅在遇到明确问题时才升级层级。
Layer 0: Direct agent-browser
Layer 0: 直接使用agent-browser
Do you need the user's login state or need to bypass anti-bot?
你需要用户的登录状态,或是需要绕过反爬机制吗?
0a: No login needed
0a: 无需登录
Local dev, testing, public pages. Just use — launches its own browser, no setup.
agent-browser openbash
agent-browser open http://localhost:3000
agent-browser snapshot -i
agent-browser click @e3
agent-browser screenshot page.png本地开发、测试、公开页面场景,直接使用即可,会启动独立浏览器,无需额外配置:
agent-browser openbash
agent-browser open http://localhost:3000
agent-browser snapshot -i
agent-browser click @e3
agent-browser screenshot page.png0b: External sites
0b: 外部站点
Connect to the agent's dedicated Chrome profile at . This is a persistent profile — login state, cookies, and site data survive across sessions and agents. Once the user logs into a site here, it stays logged in for all future agent sessions.
~/.chrome-debug-profileUse to connect safely — it checks if another agent is already using Layer 0b and auto-routes to Layer 2 if so:
browser-connect.shbash
RESULT=$(bash ${SKILL_PATH}/scripts/browser-connect.sh 9333)
MODE=$(echo "$RESULT" | jq -r '.mode') | Meaning | Action |
|---|---|---|
| Lock acquired, connected | Use |
| Another agent holds the lock | Use CDP proxy (Layer 2) instead |
| Chrome not running | Guide user to start Chrome |
After connecting:
bash
agent-browser open https://github.com/settings
agent-browser snapshot -iDefault assumption: already logged in. This profile accumulates login state over time. Navigate directly — don't pre-check login. Only escalate to Layer 1's login-watch if you hit a login wall.
Tab management:
bash
agent-browser tab list # [index] title - url
agent-browser tab 0 # Switch by index (0-based)
agent-browser tab new https://... # Open new tab
agent-browser tab close # Close current tab
agent-browser get url # Which tab am I on?Tab discipline: Don't touch existing tabs. Work in tabs you create ( / ). Close your tabs when done.
tab newopenWhen done, always disconnect to release the lock for other agents:
bash
bash ${SKILL_PATH}/scripts/browser-disconnect.sh 9333First-time Chrome setup (once per machine boot, when debug port is not running):
bash
undefined连接到存放在的Agent专属Chrome配置文件,这是持久化配置 — 登录状态、Cookie、站点数据会在会话和Agent之间保留。用户在此配置中登录站点后,后续所有Agent会话都会保持登录状态。
~/.chrome-debug-profile使用安全连接 — 它会检查是否有其他Agent正在使用Layer 0b,如果是会自动路由到Layer 2:
browser-connect.shbash
RESULT=$(bash ${SKILL_PATH}/scripts/browser-connect.sh 9333)
MODE=$(echo "$RESULT" | jq -r '.mode') | 含义 | 操作 |
|---|---|---|
| 已获取锁,连接成功 | 正常使用 |
| 其他Agent持有锁 | 改用CDP代理(Layer 2) |
| Chrome未运行 | 引导用户启动Chrome |
连接成功后:
bash
agent-browser open https://github.com/settings
agent-browser snapshot -i默认假设:已登录。 该配置会逐步积累登录状态,直接导航即可,不要预先检查登录状态。仅在遇到登录墙时才升级到Layer 1的登录监听功能。
标签管理:
bash
agent-browser tab list # [索引] 标题 - url
agent-browser tab 0 # 按索引切换标签(从0开始计数)
agent-browser tab new https://... # 打开新标签页
agent-browser tab close # 关闭当前标签页
agent-browser get url # 查看当前标签页地址标签使用规范: 不要操作已存在的标签页,仅在你创建的标签页( / )中工作,使用完毕后关闭你创建的标签页。
tab newopen操作完成后必须断开连接,释放锁供其他Agent使用:
bash
bash ${SKILL_PATH}/scripts/browser-disconnect.sh 9333Chrome首次启动配置(每台机器重启后执行一次,调试端口未运行时需要):
bash
undefined1. Quit Chrome (Cmd+Q)
1. 退出Chrome(Cmd+Q)
2. Start the agent's dedicated Chrome profile
2. 启动Agent专属Chrome配置
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome
--remote-debugging-port=9333 '--remote-allow-origins=*'
--user-data-dir="$HOME/.chrome-debug-profile" 2>/dev/null &
--remote-debugging-port=9333 '--remote-allow-origins=*'
--user-data-dir="$HOME/.chrome-debug-profile" 2>/dev/null &
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome
--remote-debugging-port=9333 '--remote-allow-origins=*'
--user-data-dir="$HOME/.chrome-debug-profile" 2>/dev/null &
--remote-debugging-port=9333 '--remote-allow-origins=*'
--user-data-dir="$HOME/.chrome-debug-profile" 2>/dev/null &
3. Verify
3. 验证启动成功
Port 9222 is often occupied by Electron — prefer **9333**.
**Get CDP port programmatically:**
```bash
agent-browser get cdp-url # → ws://127.0.0.1:9333/devtools/browser/...Diagnostics: → JSON with agent-browser, Node, CDP, proxy status.
bash ${SKILL_PATH}/scripts/check-deps.sh
端口9222常被Electron占用,优先使用**9333**端口。
**程序化获取CDP端口:**
```bash
agent-browser get cdp-url # → ws://127.0.0.1:9333/devtools/browser/...诊断命令: → 返回包含agent-browser、Node、CDP、代理状态的JSON。
bash ${SKILL_PATH}/scripts/check-deps.shLayer 1: Login Watch + CAPTCHA Watch + Site Patterns
Layer 1: 登录监听 + CAPTCHA监听 + 站点适配规则
Site Patterns — Read Before, Write After
站点适配规则 — 操作前读取,操作后回写
Accumulated experience about specific domains. Grows as agents encounter and solve problems.
Before navigating to an external domain:
bash
PATTERN_FILE="${SKILL_PATH}/references/site-patterns/${DOMAIN}.md"- File exists → Read it. Treat as hints (may be outdated), not guarantees.
- and not connected to real Chrome → STOP,
requires_cdp: truefirst.connect
- File doesn't exist → Proceed. You may create one later.
After successful operations, write back what you learned:
markdown
---
domain: example.com
aliases: [示例, Example]
requires_cdp: true
updated: 2026-04-05
---积累的特定域名适配经验,随Agent遇到和解决问题不断更新。
导航到外部域名前:
bash
PATTERN_FILE="${SKILL_PATH}/references/site-patterns/${DOMAIN}.md"- 文件存在 → 读取内容,作为参考提示(可能过时),不保证完全有效。
- 若且未连接到真实Chrome → 停止操作,先执行
requires_cdp: true连接。connect
- 若
- 文件不存在 → 继续操作,后续可以创建对应规则文件。
操作成功后,回写你总结的适配经验:
markdown
---
domain: example.com
aliases: [示例, Example]
requires_cdp: true
updated: 2026-04-05
---Platform Characteristics
平台特性
Architecture, anti-bot, login needs, content loading.
架构、反爬机制、登录要求、内容加载方式。
Effective Patterns
有效适配方案
Verified URLs, selectors, strategies. Tag each with date.
验证过的URL、选择器、操作策略,每条都标注日期。
Known Traps
已知坑点
What fails and why. Tag each with date.
Rules:
- Only write verified facts, not guesses
- Tag findings with dates — patterns decay
- If a pattern stops working, update the file
**List available patterns:**
```bash
ls ${SKILL_PATH}/references/site-patterns/*.md | grep -v _template失效的操作及原因,每条都标注日期。
规则:
- 仅写入已验证的事实,不要写猜测内容
- 所有发现都标注日期 — 适配规则会随站点更新失效
- 如果某条规则不再生效,更新对应文件
**列出所有可用适配规则:**
```bash
ls ${SKILL_PATH}/references/site-patterns/*.md | grep -v _templateLogin Watch — Auto-Detect and Wait
登录监听 — 自动检测并等待
Don't pre-assume login is needed. Try to access content first. Only if the page shows a login wall:
- Tell the user: "Please log in to [site] in your Chrome."
- Start login-watch in the background:
bash
bash ${SKILL_PATH}/scripts/login-watch.sh --interval 5 --timeout 300MUST run in background with long timeout — user may take minutes to log in.
- When it returns , navigate to the target URL and continue.
logged_in: true
Exit codes:
| Exit | Meaning | Action |
|---|---|---|
| Login detected | Navigate to target and continue |
| Timeout | Tell user login-watch timed out, ask them to confirm when done |
| Error | agent-browser not connected, fix connection |
Output:
{ "logged_in": bool, "url": string, "elapsed": number, "hint": string }Login detection keywords: "Sign in", "Log in", "Create account", "Enter your password", etc. If a site uses unusual login wall text, the detection may miss it — fall back to asking the user.
不要预先假设需要登录,先尝试访问目标内容,仅当页面出现登录墙时:
- 告知用户:"请在你的Chrome中登录[站点名称]。"
- 后台启动登录监听:
bash
bash ${SKILL_PATH}/scripts/login-watch.sh --interval 5 --timeout 300必须后台运行并设置较长超时时间 — 用户可能需要几分钟完成登录。
- 当返回时,导航到目标URL继续操作。
logged_in: true
退出码说明:
| 退出码 | 含义 | 操作 |
|---|---|---|
| 检测到登录完成 | 导航到目标页面继续操作 |
| 超时 | 告知用户登录监听超时,让用户完成登录后确认 |
| 错误 | agent-browser未连接,修复连接问题 |
输出:
{ "logged_in": bool, "url": string, "elapsed": number, "hint": string }登录检测关键词: "Sign in"、"Log in"、"Create account"、"Enter your password"等。如果站点使用不常见的登录墙文本,检测可能失效,降级为直接询问用户。
CAPTCHA Watch — Check After Navigating
CAPTCHA监听 — 导航后检查
Run only when you suspect a block (blank page, challenge page, Cloudflare interstitial):
bash
bash ${SKILL_PATH}/scripts/captcha-watch.sh --cdp <PORT>Use the port Chrome is running on. Get it from if unsure.
agent-browser get cdp-url| Exit | Meaning | Action |
|---|---|---|
| No CAPTCHA | Continue |
| Solvable CAPTCHA (real Chrome) | Screenshot → notify user → poll (background, same pattern as login-watch) |
| Automation block (unsolvable) | STOP. Must use real Chrome. Never retry in session mode. |
| Error | Retry once |
Solvable CAPTCHA polling:
bash
agent-browser screenshot captcha.png
echo "CAPTCHA detected. Please solve it in Chrome."仅当你怀疑出现拦截时运行(空白页、验证页、Cloudflare拦截页):
bash
bash ${SKILL_PATH}/scripts/captcha-watch.sh --cdp <PORT>使用Chrome运行的端口,不确定的话可以通过获取。
agent-browser get cdp-url| 退出码 | 含义 | 操作 |
|---|---|---|
| 无CAPTCHA | 继续操作 |
| 可人工解决的CAPTCHA(真实Chrome环境) | 截图 → 通知用户 → 轮询(后台运行,和登录监听模式一致) |
| 自动化拦截(无法解决) | 停止操作,必须使用真实Chrome,不要在会话模式下重试。 |
| 错误 | 重试一次 |
可解决CAPTCHA轮询逻辑:
bash
agent-browser screenshot captcha.png
echo "检测到CAPTCHA,请在Chrome中完成验证。"Run in background — same pattern as login-watch
后台运行 — 和登录监听模式一致
for i in $(seq 1 24); do
sleep 5
RESULT=$(bash ${SKILL_PATH}/scripts/captcha-watch.sh --cdp <PORT>)
if echo "$RESULT" | jq -e '.captcha == false' >/dev/null 2>&1; then
break
fi
done
---for i in $(seq 1 24); do
sleep 5
RESULT=$(bash ${SKILL_PATH}/scripts/captcha-watch.sh --cdp <PORT>)
if echo "$RESULT" | jq -e '.captcha == false' >/dev/null 2>&1; then
break
fi
done
---Layer 2: CDP Proxy (Parallel Multi-Tab)
Layer 2: CDP代理(并行多标签)
HTTP API to manage multiple tabs in the user's real Chrome. The only way to do parallel multi-tab while sharing login state and clean fingerprint.
Why not ? Sessions spawn Playwright instances — , blocked by anti-bot, no shared login.
agent-browser --sessionnavigator.webdriver=true通过HTTP API管理用户真实Chrome中的多个标签页,是实现并行多标签同时共享登录状态、干净指纹的唯一方案。
为什么不使用? 会话模式会启动Playwright实例,,会被反爬机制拦截,也无法共享登录状态。
agent-browser --sessionnavigator.webdriver=truePreflight
前置检查
bash
bash ${SKILL_PATH}/scripts/check-deps.sh [CDP_PORT]Output: .
{ ready, mode, agent_browser, node, chrome_cdp, proxy, hint } | | Action |
|---|---|---|
| | Proceed |
| | Layer 2 unavailable. Fall back to Layer 0. |
| — | STOP. Fix per |
bash
bash ${SKILL_PATH}/scripts/check-deps.sh [CDP_PORT]输出:。
{ ready, mode, agent_browser, node, chrome_cdp, proxy, hint } | | 操作 |
|---|---|---|
| | 继续操作 |
| | Layer 2不可用,降级到Layer 0 |
| — | 停止操作,按照 |
Start Proxy
启动代理
bash
undefinedbash
undefinedReuse if already running
如果已经运行则复用
curl -sf http://127.0.0.1:3456/health >/dev/null 2>&1 ||
node ${SKILL_PATH}/scripts/cdp-proxy.mjs & PROXY="http://127.0.0.1:3456"
node ${SKILL_PATH}/scripts/cdp-proxy.mjs & PROXY="http://127.0.0.1:3456"
undefinedcurl -sf http://127.0.0.1:3456/health >/dev/null 2>&1 ||
node ${SKILL_PATH}/scripts/cdp-proxy.mjs & PROXY="http://127.0.0.1:3456"
node ${SKILL_PATH}/scripts/cdp-proxy.mjs & PROXY="http://127.0.0.1:3456"
undefinedBatch Operations
批量操作
bash
TABS=$(curl -s -X POST "$PROXY/batch" \
-H 'Content-Type: application/json' \
-d '{"urls":["https://site1.com","https://site2.com"]}')
TARGETS=$(echo "$TABS" | jq '[.[].targetId]')
curl -s -X POST "$PROXY/batch-eval" \
-H 'Content-Type: application/json' \
-d "{\"targets\":$TARGETS,\"expression\":\"document.title\"}"
curl -s -X POST "$PROXY/batch-close" \
-H 'Content-Type: application/json' \
-d "{\"targets\":$TARGETS}"bash
TABS=$(curl -s -X POST "$PROXY/batch" \
-H 'Content-Type: application/json' \
-d '{"urls":["https://site1.com","https://site2.com"]}')
TARGETS=$(echo "$TABS" | jq '[.[].targetId]')
curl -s -X POST "$PROXY/batch-eval" \
-H 'Content-Type: application/json' \
-d "{\"targets\":$TARGETS,\"expression\":\"document.title\"}"
curl -s -X POST "$PROXY/batch-close" \
-H 'Content-Type: application/json' \
-d "{\"targets\":$TARGETS}"Parallel Sub-Agents
并行子Agent
Main Agent
├── Start CDP proxy (once)
├── Sub-Agent A: /new → targetId-A → /eval → /close
├── Sub-Agent B: /new → targetId-B → /eval → /close
└── Sub-Agent C: /new → targetId-C → /eval → /close
All share same Chrome, login state, cookies主Agent
├── 启动CDP代理(仅一次)
├── 子Agent A: /new → targetId-A → /eval → /close
├── 子Agent B: /new → targetId-B → /eval → /close
└── 子Agent C: /new → targetId-C → /eval → /close
所有子Agent共享同一个Chrome、登录状态、CookieCAPTCHA Watch for Proxy Tabs
代理标签页的CAPTCHA监听
bash
bash ${SKILL_PATH}/scripts/captcha-watch.sh --proxy-eval <targetId>bash
bash ${SKILL_PATH}/scripts/captcha-watch.sh --proxy-eval <targetId>Cleanup
清理
Tab discipline applies. Only close tabs you opened. Never close user's tabs. Don't kill the proxy if other agents may be using it.
bash
curl -s -X POST "$PROXY/batch-close" \
-H 'Content-Type: application/json' \
-d "{\"targets\":$TARGETS}"Full API: .
references/api.md适用标签使用规范。 仅关闭你打开的标签页,永远不要关闭用户的标签页。如果有其他Agent可能在使用代理,不要停止代理进程。
bash
curl -s -X POST "$PROXY/batch-close" \
-H 'Content-Type: application/json' \
-d "{\"targets\":$TARGETS}"完整API参考:。
references/api.mdScripts Reference
脚本参考
| Script | Purpose | Layer | Run in background? |
|---|---|---|---|
| Safe connect with lock-based conflict detection | 0b | No |
| Release lock and close browser | 0b | No |
| Environment diagnostics | 0+ | No |
| Poll until login wall disappears | 1 | Yes |
| Detect CAPTCHA/automation block | 1 | Polling loop: yes |
| Same, for proxy tabs | 2 | Polling loop: yes |
| HTTP API for parallel tabs | 2 | Yes (daemon) |
| 脚本 | 用途 | 适用层级 | 是否需要后台运行? |
|---|---|---|---|
| 基于锁的冲突检测的安全连接 | 0b | 否 |
| 释放锁并关闭浏览器 | 0b | 否 |
| 环境诊断 | 0+ | 否 |
| 轮询直到登录墙消失 | 1 | 是 |
| 检测CAPTCHA/自动化拦截 | 1 | 轮询循环:是 |
| 同上,适用于代理标签页 | 2 | 轮询循环:是 |
| 并行标签页的HTTP API | 2 | 是(守护进程) |
Degradation
降级方案
| Missing | Impact |
|---|---|
| Chrome CDP | Layer 0b/1/2 unavailable. 0a still works. |
| CDP proxy | Layer 2 unavailable. 0/1 still work. |
| agent-browser | Fatal. |
| Node.js 22+ | Layer 2 unavailable. 0/1 still work. |
| Site pattern file | Proceed without experience. Risk: unexpected blocks. |
| 缺失依赖 | 影响 |
|---|---|
| Chrome CDP | Layer 0b/1/2不可用,0a仍可正常工作 |
| CDP代理 | Layer 2不可用,0/1仍可正常工作 |
| agent-browser | 致命错误 |
| Node.js 22+ | Layer 2不可用,0/1仍可正常工作 |
| 站点适配规则文件 | 无适配经验继续操作,风险:出现意外拦截 |
Troubleshooting
故障排查
| Symptom | Fix |
|---|---|
| Chrome not running with debug port. See setup in Layer 0. |
| CAPTCHA unsolvable (exit 2) | |
| Connected to Playwright/Electron. Use real Chrome. |
| Proxy port 3456 in use | |
| Slow page. |
| Tabs accumulate | |
| login-watch misses login wall | Site uses unusual text. Fall back to asking user. |
| 现象 | 修复方案 |
|---|---|
| Chrome未启动调试端口,参考Layer 0的启动配置 |
| CAPTCHA无法解决(退出码2) | |
CDP中 | 连接到了Playwright/Electron,使用真实Chrome |
| 代理端口3456被占用 | |
| 页面加载慢,添加 |
| 标签页堆积 | 调用 |
| 登录监听未识别登录墙 | 站点使用不常见文本,降级为直接询问用户 |
References (read on demand)
参考文档(按需读取)
| Document | When |
|---|---|
| Layer 2: proxy endpoints, batch examples |
| Layer 1: before navigating to known domains |
| 文档 | 适用场景 |
|---|---|
| Layer 2:代理端点、批量操作示例 |
| Layer 1:导航到已知域名前参考 |