kernel-agent-browser

Agent-Browser with Kernel Cloud Browsers

This skill documents best practices for using agent-browser's built-in Kernel provider (-p kernel) for cloud browser automation.

When to Use This Skill

Use this skill when you need to:
  • Automate websites using agent-browser -p kernel commands
  • Handle bot detection on sites with aggressive anti-bot measures
  • Persist login sessions across automation runs using profiles
  • Work with iframes, including cross-origin payment forms
  • Get live view URLs for debugging or manual intervention
  • Find the underlying Kernel session ID for advanced Playwright scripting
  • Create site-specific automation skills for new websites

References

  • Creating Site-Specific Skills - Guide for building automation skills for specific websites

Prerequisites

Load the kernel-cli skill for Kernel CLI installation and authentication.

Environment Variables

Set these before your first agent-browser -p kernel call. The CLI holds state between invocations.
Variable                Description                                               Default
KERNEL_API_KEY          Required. Your Kernel API key for authentication          (none)
KERNEL_HEADLESS         Run browser in headless mode (true/false)                 false
KERNEL_STEALTH          Enable stealth mode to avoid bot detection (true/false)   true
KERNEL_TIMEOUT_SECONDS  Session timeout in seconds                                300
KERNEL_PROFILE_NAME     Browser profile name for persistent cookies/logins        (none)

Recommended Configuration

bash
export KERNEL_API_KEY="your-api-key"
export KERNEL_TIMEOUT_SECONDS=600     # 10-minute timeout for complex workflows
export KERNEL_STEALTH=true            # Avoid bot detection (default)
export KERNEL_PROFILE_NAME=mysite     # Persist login sessions across runs
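A small preflight check can catch a missing key or a malformed timeout before the first call. The sketch below is one possible way to do that. The function name check_kernel_env is hypothetical and not part of agent-browser or the Kernel CLI, and it only inspects the variables listed in the table above.

```shell
# Hypothetical helper (not part of agent-browser): validate the documented
# environment variables before the first `agent-browser -p kernel` call.
check_kernel_env() {
  if [ -z "${KERNEL_API_KEY:-}" ]; then
    echo "KERNEL_API_KEY is required but not set" >&2
    return 1
  fi
  # The timeout, when provided, must contain digits only and must not be 0.
  # An unset value falls back to the documented default of 300 seconds.
  case "${KERNEL_TIMEOUT_SECONDS:-300}" in
    *[!0-9]*|0)
      echo "KERNEL_TIMEOUT_SECONDS must be a positive integer" >&2
      return 1
      ;;
  esac
  return 0
}
```

A typical call would chain it in front of the first command, for example check_kernel_env && agent-browser -p kernel open https://example.com.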

Profile Persistence

When KERNEL_PROFILE_NAME is set:
  • The profile is created if it doesn't exist
  • Cookies, logins, and session data are automatically saved when the browser session ends
  • Future sessions with the same profile name restore the saved state
This is especially useful for sites that require login: authenticate once, then reuse the saved state across sessions.

Basic Usage

bash
agent-browser -p kernel open <url>        # Navigate to page
agent-browser -p kernel snapshot -i       # Get interactive elements with refs
agent-browser -p kernel click @e1         # Click element by ref
agent-browser -p kernel fill @e2 "text"   # Fill input by ref
agent-browser -p kernel close             # Close browser and save profile
Always pass the -p kernel flag with every command.
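Because every command needs the same flag, a thin shell wrapper can make it harder to forget. This is only a convenience sketch. The name ab is hypothetical, and the wrapper simply forwards its arguments after -p kernel.

```shell
# Hypothetical convenience wrapper: prepends `-p kernel` to every call
# and forwards the remaining arguments unchanged.
ab() {
  agent-browser -p kernel "$@"
}

# Usage (equivalent to the commands listed above):
#   ab open https://example.com
#   ab snapshot -i
#   ab close
```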

Semantic Selectors (Recommended)

Instead of ephemeral @e refs that change on every page load, use semantic selectors via the find command for more stable, readable automation:
bash
# By ARIA role + accessible name (most stable)
agent-browser -p kernel find role button click --name "Log In"
agent-browser -p kernel find role textbox fill "user@email.com" --name "Email"

# By visible text content
agent-browser -p kernel find text "View Menus" click
agent-browser -p kernel find text "Submit Order" click

# By form label (great for inputs)
agent-browser -p kernel find label "Username" fill "myuser"
agent-browser -p kernel find label "Password" fill "secret123"

# By placeholder text
agent-browser -p kernel find placeholder "Search..." type "query"

# By data-testid (if the site uses them)
agent-browser -p kernel find testid "submit-btn" click

# By position (when needed)
agent-browser -p kernel find first "li.item" click
agent-browser -p kernel find nth 2 ".card" hover

When to Use Which Selector

Selector Type      Best For                         Stability
find role --name   Buttons, links, navigation       ⭐⭐⭐ Most stable
find label         Form inputs with labels          ⭐⭐⭐ Most stable
find text          Clickable text elements          ⭐⭐ Stable
find testid        Sites with test attributes       ⭐⭐⭐ Most stable
find placeholder   Search boxes, inputs             ⭐⭐ Stable
@e refs            Unknown sites, quick iteration   ⭐ Ephemeral
Recommendation: Use find for production automation. Use @e refs for exploration and quick prototyping, then convert to semantic selectors.

Finding Session ID and Live View URL

agent-browser creates a Kernel browser session under the hood. To get the session ID or live view URL:
bash
# List all Kernel browsers (find yours by profile name or creation time)
kernel browsers list

# Get live view URL for a specific session
kernel browsers view <session-id>
This is useful when:
  • You need to execute Playwright scripts directly against the session
  • You want to share a live view URL with the user for manual intervention
  • You're debugging and want to watch the browser in real time

Handling Bot Detection

Stealth Mode

Stealth mode (KERNEL_STEALTH=true) is enabled by default and helps avoid detection. However, some sites have aggressive bot detection that can still trigger.

Manual Login Fallback

If login automation fails due to bot detection:
  1. Get the live view URL:
    bash
    kernel browsers list
    # Find your session by profile name
    kernel browsers view <session-id>
  2. Share the live view URL with the user and ask them to complete the login manually
  3. Once logged in, continue automation; the profile will save the authenticated state
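In an interactive script, the hand-off to the user can be made explicit by pausing until they confirm the login is done. The helper below is a rough sketch of that pause. The name manual_login_pause is hypothetical, the session ID must be looked up beforehand with kernel browsers list, and the sketch assumes kernel browsers view prints the live view URL to standard output.

```shell
# Hypothetical helper: show the live view for a session, then block until
# the user confirms they have finished logging in manually.
manual_login_pause() {
  session_id=$1
  # Assumption: this prints the live view URL for the user to open.
  kernel browsers view "$session_id" || return 1
  printf 'Log in through the live view, then press Enter to continue: ' >&2
  read -r _reply
}

# Usage:
#   manual_login_pause <session-id> && agent-browser -p kernel snapshot -i
```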

JavaScript Fallback for Tricky Elements

Some elements (especially on bot-protected sites) don't respond to standard commands:
bash
# Click by CSS selector
agent-browser -p kernel eval "document.querySelector('.submit-btn').click()"

# Fill by selector (with event dispatch)
agent-browser -p kernel eval "
  const el = document.querySelector('#email');
  el.value = 'user@example.com';
  el.dispatchEvent(new Event('input', {bubbles: true}));
  el.dispatchEvent(new Event('change', {bubbles: true}));
"

# Click by test ID (inner double quotes are escaped for the shell)
agent-browser -p kernel eval "document.querySelector('[data-testid=\"submit\"]').click()"

Anti-Bot Form Fields

Some payment processors (e.g., Point and Pay) use decoy form fields. Only fill fields matching specific patterns:
bash
agent-browser -p kernel eval "
  const realInputs = Array.from(document.querySelectorAll('input'))
    .filter(el => el.name && el.name.startsWith('xeiinput'));
  // Fill only these inputs
"

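To make the decoy-field filter concrete, the sketch below fills one of the non-decoy inputs by position and dispatches the same input and change events used in the JavaScript fallback. The function name fill_real_input is hypothetical, the 'xeiinput' prefix is taken from the example above and will differ on other sites, and the value is pasted directly into a JavaScript string, so it is only suitable for simple values.

```shell
# Hypothetical helper: fill the Nth non-decoy input (0-based) with a value.
# Caveat: the value is interpolated into a JavaScript string literal, so it
# must not contain single quotes, backslashes, or newlines.
fill_real_input() {
  index=$1
  value=$2
  agent-browser -p kernel eval "
    const realInputs = Array.from(document.querySelectorAll('input'))
      .filter(el => el.name && el.name.startsWith('xeiinput'));
    const el = realInputs[$index];
    el.value = '$value';
    el.dispatchEvent(new Event('input', {bubbles: true}));
    el.dispatchEvent(new Event('change', {bubbles: true}));
  "
}

# Usage: fill the first real input
#   fill_real_input 0 "4111111111111111"
```
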
Handling Iframes

Same-Origin Iframes

Use the frame command to switch context:
bash
agent-browser -p kernel frame "#iframe-id"   # Switch to iframe
agent-browser -p kernel snapshot -i          # Snapshot within iframe
agent-browser -p kernel click @e1            # Interact within iframe
agent-browser -p kernel frame main           # Return to main frame

Cross-Origin Iframes

Cross-origin iframes require executing a Playwright script directly against the Kernel session:
  1. Find the session ID:
    bash
    kernel browsers list
  2. Execute a Playwright script:
    bash
    kernel browsers exec <session-id> --code "
      const frame = page.frameLocator('#payment-iframe');
      await frame.locator('#card-number').fill('4111111111111111');
      await frame.locator('#submit').click();
    "
See the kernel-cli skill for more details on executing Playwright code.

Waiting Strategies

Smart waits are critical for fast, reliable automation. Using condition-based waits instead of fixed timeouts can reduce execution time by 50% or more while improving reliability.

Smart Waits (Recommended)

bash
# Wait for page load states
agent-browser -p kernel wait --load domcontentloaded   # DOM ready
agent-browser -p kernel wait --load networkidle        # Network settled

# Wait for specific URL pattern (great for redirects after login)
agent-browser -p kernel wait --url "**/dashboard"
agent-browser -p kernel wait --url "**/order-confirmation"

# Wait for text to appear (great for dynamic content)
agent-browser -p kernel wait --text "Password"         # Field appeared
agent-browser -p kernel wait --text "Order confirmed"  # Success message

# Wait for JavaScript condition
agent-browser -p kernel wait --fn "window.appReady === true"
agent-browser -p kernel wait --fn "document.querySelector('.spinner') === null"

# Wait for element by CSS selector
agent-browser -p kernel wait "#login-form"
agent-browser -p kernel wait ".results-loaded"

Fixed Waits (Last Resort)

bash
# Only when no condition is available
agent-browser -p kernel wait 2000

Element Refs Best Practices

Element refs (@e1, @e2, etc.) are ephemeral and change:
  • After page navigation
  • After significant DOM updates
  • Between browser sessions
Always take a fresh snapshot before interacting:
bash
agent-browser -p kernel snapshot -i

# Now use the refs from this snapshot
agent-browser -p kernel click @e5

Filtering Snapshots

bash
# Filter for specific elements (-E enables the | alternation)
agent-browser -p kernel snapshot -i | grep -iE "button|submit"

# Scope to a specific area
agent-browser -p kernel snapshot -s "#main-content" -i

Login Patterns

Single-Page Form (Optimized)

Username and password on the same page:
bash
agent-browser -p kernel open https://example.com/login
agent-browser -p kernel wait --load domcontentloaded

# Use semantic selectors for stability
agent-browser -p kernel find label "Email" fill "user@example.com"
agent-browser -p kernel find label "Password" fill "secret123"
agent-browser -p kernel find role button click --name "Sign In"

# Wait for actual redirect, not arbitrary timeout
agent-browser -p kernel wait --url "**/dashboard"
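For repeated use, the single-page flow can be wrapped in a function that stops at the first failing step. The sketch below is one possible shape. The function name login_single_page is hypothetical, the labels, button name, and redirect pattern are the site-specific values from the example, and the && chaining assumes agent-browser exits with a non-zero status when a step fails.

```shell
# Hypothetical helper wrapping the single-page login flow.
# Arguments: login URL, email, password.
# Each step runs only if the previous one succeeded (assumes agent-browser
# returns a non-zero exit status on failure).
login_single_page() {
  login_url=$1
  email=$2
  password=$3
  agent-browser -p kernel open "$login_url" &&
    agent-browser -p kernel wait --load domcontentloaded &&
    agent-browser -p kernel find label "Email" fill "$email" &&
    agent-browser -p kernel find label "Password" fill "$password" &&
    agent-browser -p kernel find role button click --name "Sign In" &&
    agent-browser -p kernel wait --url "**/dashboard"
}

# Usage:
#   login_single_page https://example.com/login user@example.com secret123
```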

Two-Step Form (Optimized)

Username first, then password on a second screen:
bash
agent-browser -p kernel open https://example.com/login
agent-browser -p kernel wait --load domcontentloaded

# Step 1: Username
agent-browser -p kernel find label "Username" fill "myuser"
agent-browser -p kernel press Enter

# Wait for password field to appear (not a fixed sleep!)
agent-browser -p kernel wait --text "Password"

# Step 2: Password
agent-browser -p kernel find label "Password" fill "secret123"
agent-browser -p kernel press Enter

# Wait for successful redirect
agent-browser -p kernel wait --url "**/home"

Comparison: Old vs Optimized

bash
# OLD (slow, fragile):
agent-browser -p kernel wait 2000
agent-browser -p kernel fill @e1 "username"
agent-browser -p kernel wait 2000
agent-browser -p kernel fill @e3 "password"
agent-browser -p kernel wait 5000

# OPTIMIZED (fast, stable):
agent-browser -p kernel wait --load domcontentloaded
agent-browser -p kernel find label "Username" fill "username"
agent-browser -p kernel wait --text "Password"
agent-browser -p kernel find label "Password" fill "password"
agent-browser -p kernel wait --url "**/dashboard"

Modal Login

Login form appears in a modal overlay:
bash
# Click login link to open modal
agent-browser -p kernel find text "Log In" click
agent-browser -p kernel wait --text "Password"   # Wait for modal

# Fill modal fields
agent-browser -p kernel find label "Email" fill "user@example.com"
agent-browser -p kernel find label "Password" fill "password123"
agent-browser -p kernel find role button click --name "Sign In"
agent-browser -p kernel wait --url "**/dashboard"

Fallback: JavaScript for Tricky Modals

Some modals don't expose accessible labels:
bash
agent-browser -p kernel eval "document.querySelector('.login-link').click()"
agent-browser -p kernel wait 1000

agent-browser -p kernel eval "
  document.getElementById('username').value = 'user@example.com';
  document.getElementById('username').dispatchEvent(new Event('input', {bubbles: true}));
  document.getElementById('password').value = 'password123';
  document.getElementById('password').dispatchEvent(new Event('input', {bubbles: true}));
  document.querySelector('button[type=submit]').click();
"
agent-browser -p kernel wait --url "**/dashboard"

Handling New Tabs

Some links open in new tabs:
bash
# Click link that opens new tab
agent-browser -p kernel click @e38
agent-browser -p kernel tab 1          # Switch to new tab (0-indexed)
agent-browser -p kernel wait 2000
agent-browser -p kernel snapshot -i    # Interact with new tab

Screenshots and Debugging

bash
# Take screenshot
agent-browser -p kernel screenshot ~/Downloads/page.png

# Full page screenshot
agent-browser -p kernel screenshot ~/Downloads/full.png --full

# View console messages
agent-browser -p kernel console

# View page errors
agent-browser -p kernel errors

# Get current URL
agent-browser -p kernel get url

Session Management

Cleanup

Always close the browser when done to save the profile:
bash
agent-browser -p kernel close
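In a script, an EXIT trap is one way to make sure the close step still runs when an earlier command fails. The wrapper below is a sketch of that idea. The name with_kernel_browser is hypothetical, and it runs the given command in a subshell so the trap does not leak into the calling shell.

```shell
# Hypothetical helper: run a command, then always close the browser so the
# profile is saved, whether the command succeeded or failed.
with_kernel_browser() {
  (
    # The trap fires when this subshell exits, on success or failure.
    trap 'agent-browser -p kernel close' EXIT
    "$@"
    exit $?   # propagate the command's exit status to the caller
  )
}

# Usage (my_workflow is a placeholder for your own function or script):
#   with_kernel_browser my_workflow
```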

Multiple Sessions

Run parallel browser sessions with named sessions:
bash
agent-browser -p kernel --session site1 open https://site1.com
agent-browser -p kernel --session site2 open https://site2.com
agent-browser -p kernel session list
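When the list of sites grows, the same --session pattern can be driven from a loop. The sketch below assumes a hypothetical open_sessions helper that takes name=url pairs. It opens the sessions one after another and stops at the first failure.

```shell
# Hypothetical helper: open one named session per "name=url" argument.
open_sessions() {
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first '='
    url=${pair#*=}     # text after the first '=' (later '=' signs are kept)
    agent-browser -p kernel --session "$name" open "$url" || return 1
  done
}

# Usage:
#   open_sessions site1=https://site1.com site2=https://site2.com
#   agent-browser -p kernel session list
```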

Common Gotchas

  1. Refs change after navigation: Always re-snapshot after clicking links or submitting forms.
  2. Wait after actions: Add waits after clicks/submits that trigger page loads or AJAX.
  3. Profile not saving: Make sure to run agent-browser -p kernel close to save the profile state.
  4. Timeout too short: Increase KERNEL_TIMEOUT_SECONDS for workflows with user pauses or slow pages.
  5. Stealth not working: Some sites detect bots despite stealth. Use the manual login fallback.
  6. eval for stubborn elements: If fill or click don't work, try eval with direct DOM manipulation.
  7. Cross-origin iframes: Can't interact via agent-browser commands. Use Kernel's Playwright execution.

Quick Reference

bash
# Start session with profile persistence
export KERNEL_PROFILE_NAME=mysite
export KERNEL_TIMEOUT_SECONDS=600
agent-browser -p kernel open https://example.com

# Basic interaction with semantic selectors (recommended)
agent-browser -p kernel wait --load domcontentloaded
agent-browser -p kernel find label "Email" fill "user@example.com"
agent-browser -p kernel find label "Password" fill "secret"
agent-browser -p kernel find role button click --name "Submit"
agent-browser -p kernel wait --url "**/success"

# Alternative: snapshot + refs (for exploration)
agent-browser -p kernel snapshot -i
agent-browser -p kernel fill @eN "text"
agent-browser -p kernel click @eM

# Get session info for manual intervention
kernel browsers list
kernel browsers view <session-id>

# Cleanup
agent-browser -p kernel close

Selector Cheat Sheet

bash
# Buttons and links
agent-browser -p kernel find role button click --name "Submit"
agent-browser -p kernel find role link click --name "Next"
agent-browser -p kernel find text "Click here" click

# Form inputs
agent-browser -p kernel find label "Email" fill "user@example.com"
agent-browser -p kernel find placeholder "Search" type "query"
agent-browser -p kernel find testid "username-input" fill "myuser"

# Smart waits
agent-browser -p kernel wait --load domcontentloaded
agent-browser -p kernel wait --text "Success"
agent-browser -p kernel wait --url "**/dashboard"
agent-browser -p kernel wait --fn "window.loaded === true"