browserwing-executor

BrowserWing Executor API

Overview

BrowserWing Executor provides comprehensive browser automation capabilities through HTTP APIs. You can control browser navigation, interact with page elements, extract data, and analyze page structure.
API Base URL: http://localhost:8080/api/v1/executor
Authentication: Use the X-BrowserWing-Key: <api-key> header or the Authorization: Bearer <token> header.
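As an illustration of the two authentication options, here is a minimal Python sketch that uses only the standard library. The helper name build_request and its parameters are hypothetical; only the base URL and the two header names come from this document. The function builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/api/v1/executor"  # base URL from this document


def build_request(method, path, body=None, api_key=None, token=None):
    """Build (but do not send) a request carrying one of the two auth headers.

    build_request is a hypothetical helper, not part of BrowserWing.
    """
    headers = {}
    if api_key is not None:
        headers["X-BrowserWing-Key"] = api_key
    elif token is not None:
        headers["Authorization"] = f"Bearer {token}"
    data = None
    if body is not None:
        # JSON bodies are sent with the same Content-Type as the curl examples.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out so the sketch runs without a live server.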

Core Capabilities

  • Page Navigation: Navigate to URLs, go back/forward, reload
  • Element Interaction: Click, type, select, hover on page elements
  • Data Extraction: Extract text, attributes, values from elements
  • Accessibility Analysis: Get accessibility snapshot to understand page structure
  • Advanced Operations: Screenshot, JavaScript execution, keyboard input
  • Batch Processing: Execute multiple operations in sequence

API Endpoints

1. Discover Available Commands

IMPORTANT: Always call this endpoint first to see all available commands and their parameters.
bash
curl -X GET 'http://localhost:8080/api/v1/executor/help'
Response: Returns complete list of all commands with parameters, examples, and usage guidelines.
Query specific command:
bash
curl -X GET 'http://localhost:8080/api/v1/executor/help?command=extract'

2. Get Accessibility Snapshot

CRITICAL: Always call this after navigation to understand page structure and get element RefIDs.
bash
curl -X GET 'http://localhost:8080/api/v1/executor/snapshot'
Response Example:
json
{
  "success": true,
  "snapshot_text": "Clickable Elements:\n  @e1 Login (role: button)\n  @e2 Sign Up (role: link)\n\nInput Elements:\n  @e3 Email (role: textbox) [placeholder: your@email.com]\n  @e4 Password (role: textbox)"
}
Use Cases:
  • Understand what interactive elements are on the page
  • Get element RefIDs (@e1, @e2, etc.) for precise identification
  • See element labels, roles, and attributes
  • The accessibility tree is cleaner than raw DOM and better for LLMs
  • RefIDs are stable references that work reliably across page changes
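To show how the RefIDs in snapshot_text might be consumed by a client, here is a small Python sketch. It assumes every element line has the shape shown in the response example ("@e1 Login (role: button)"); the real format may carry more variants, and parse_snapshot is a hypothetical helper name.

```python
import re

# Assumed line shape, taken from the response example: "@e1 Login (role: button)"
REF_LINE = re.compile(r"^\s*(@e\d+)\s+(.*?)\s+\(role:\s*([^)]+)\)")


def parse_snapshot(snapshot_text):
    """Map each RefID found in snapshot_text to its label and role."""
    elements = {}
    for line in snapshot_text.splitlines():
        match = REF_LINE.match(line)
        if match:
            ref_id, label, role = match.groups()
            elements[ref_id] = {"label": label, "role": role}
    return elements
```

Section headers such as "Clickable Elements:" and trailing attributes such as "[placeholder: ...]" are ignored by this pattern.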

3. Common Operations

Navigate to URL

bash
curl -X POST 'http://localhost:8080/api/v1/executor/navigate' \
  -H 'Content-Type: application/json' \
  -d '{"url": "https://example.com"}'

Click Element

bash
curl -X POST 'http://localhost:8080/api/v1/executor/click' \
  -H 'Content-Type: application/json' \
  -d '{"identifier": "@e1"}'
Identifier formats:
  • RefID (Recommended): @e1, @e2 (from snapshot)
  • CSS Selector: #button-id, .class-name
  • XPath: //button[@type='submit']
  • Text: Login (text content)

Type Text

bash
curl -X POST 'http://localhost:8080/api/v1/executor/type' \
  -H 'Content-Type: application/json' \
  -d '{"identifier": "@e3", "text": "user@example.com"}'

Extract Data

bash
curl -X POST 'http://localhost:8080/api/v1/executor/extract' \
  -H 'Content-Type: application/json' \
  -d '{
    "selector": ".product-item",
    "fields": ["text", "href"],
    "multiple": true
  }'

Wait for Element

bash
curl -X POST 'http://localhost:8080/api/v1/executor/wait' \
  -H 'Content-Type: application/json' \
  -d '{"identifier": ".loading", "state": "hidden", "timeout": 10}'

Batch Operations

bash
curl -X POST 'http://localhost:8080/api/v1/executor/batch' \
  -H 'Content-Type: application/json' \
  -d '{
    "operations": [
      {"type": "navigate", "params": {"url": "https://example.com"}, "stop_on_error": true},
      {"type": "click", "params": {"identifier": "@e1"}, "stop_on_error": true},
      {"type": "type", "params": {"identifier": "@e3", "text": "query"}, "stop_on_error": true}
    ]
  }'
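The batch body above repeats the same three keys for every operation, so a client may prefer to generate it. The following Python sketch is one way to do that; batch_payload is a hypothetical helper, and the key names ("operations", "type", "params", "stop_on_error") are copied from the curl example.

```python
def batch_payload(steps, stop_on_error=True):
    """Turn (type, params) pairs into the JSON body shape used by POST /batch.

    Hypothetical helper; every operation gets the same stop_on_error value.
    """
    return {
        "operations": [
            {"type": op_type, "params": params, "stop_on_error": stop_on_error}
            for op_type, params in steps
        ]
    }


payload = batch_payload([
    ("navigate", {"url": "https://example.com"}),
    ("click", {"identifier": "@e1"}),
    ("type", {"identifier": "@e3", "text": "query"}),
])
```

Serializing payload with json.dumps should give a body equivalent to the one in the curl example.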

Instructions

Step-by-step workflow:
  1. Discover commands: Call GET /help to see all available operations and their parameters (do this first if unsure).
  2. Navigate: Use POST /navigate to open the target webpage.
  3. Analyze page: Call GET /snapshot to understand page structure and get element RefIDs.
  4. Interact: Use element RefIDs (like @e1, @e2) or CSS selectors to:
    • Click elements: POST /click
    • Input text: POST /type
    • Select options: POST /select
    • Wait for elements: POST /wait
  5. Extract data: Use POST /extract to get information from the page.
  6. Present results: Format and show extracted data to the user.

Complete Example

User Request: "Search for 'laptop' on example.com and get the first 5 results"
Your Actions:
  1. Navigate to search page:
bash
curl -X POST 'http://localhost:8080/api/v1/executor/navigate' \
  -H 'Content-Type: application/json' \
  -d '{"url": "https://example.com/search"}'
  2. Get page structure to find search input:
bash
curl -X GET 'http://localhost:8080/api/v1/executor/snapshot'
Response shows:
@e3 Search (role: textbox) [placeholder: Search...]
  3. Type search query:
bash
curl -X POST 'http://localhost:8080/api/v1/executor/type' \
  -H 'Content-Type: application/json' \
  -d '{"identifier": "@e3", "text": "laptop"}'
  4. Press Enter to submit:
bash
curl -X POST 'http://localhost:8080/api/v1/executor/press-key' \
  -H 'Content-Type: application/json' \
  -d '{"key": "Enter"}'
  5. Wait for results to load:
bash
curl -X POST 'http://localhost:8080/api/v1/executor/wait' \
  -H 'Content-Type: application/json' \
  -d '{"identifier": ".search-results", "state": "visible", "timeout": 10}'
  6. Extract search results:
bash
curl -X POST 'http://localhost:8080/api/v1/executor/extract' \
  -H 'Content-Type: application/json' \
  -d '{
    "selector": ".result-item",
    "fields": ["text", "href"],
    "multiple": true
  }'
  7. Present the extracted data:
Found 15 results for 'laptop':
1. Gaming Laptop - $1299 (https://...)
2. Business Laptop - $899 (https://...)
...

Key Commands Reference

Navigation

  • POST /navigate - Navigate to URL
  • POST /go-back - Go back in history
  • POST /go-forward - Go forward in history
  • POST /reload - Reload current page

Element Interaction

  • POST /click - Click element (supports: RefID @e1, CSS selector, XPath, text content)
  • POST /type - Type text into input (supports: RefID @e3, CSS selector, XPath)
  • POST /select - Select dropdown option
  • POST /hover - Hover over element
  • POST /wait - Wait for element state (visible, hidden, enabled)
  • POST /press-key - Press keyboard key (Enter, Tab, Ctrl+S, etc.)

Data Extraction

  • POST /extract - Extract data from elements (supports multiple elements, custom fields)
  • POST /get-text - Get element text content
  • POST /get-value - Get input element value
  • GET /page-info - Get page URL and title
  • GET /page-text - Get all page text
  • GET /page-content - Get full HTML

Page Analysis

  • GET /snapshot - Get accessibility snapshot (⭐ ALWAYS call after navigation)
  • GET /clickable-elements - Get all clickable elements
  • GET /input-elements - Get all input elements

Advanced

  • POST /screenshot - Take page screenshot (base64 encoded)
  • POST /evaluate - Execute JavaScript code
  • POST /batch - Execute multiple operations in sequence
  • POST /scroll-to-bottom - Scroll to page bottom
  • POST /resize - Resize browser window
  • POST /tabs - Manage browser tabs (list, new, switch, close)
  • POST /fill-form - Intelligently fill multiple form fields at once

Debug & Monitoring

  • GET /console-messages - Get browser console messages (logs, warnings, errors)
  • GET /network-requests - Get network requests made by the page
  • POST /handle-dialog - Configure JavaScript dialog (alert, confirm, prompt) handling
  • POST /file-upload - Upload files to input elements
  • POST /drag - Drag and drop elements
  • POST /close-page - Close the current page/tab

Element Identification

You can identify elements using:
  1. RefID (Recommended): @e1, @e2, @e3
    • Most reliable method - stable across page changes
    • Get RefIDs from the /snapshot endpoint
    • Valid for 5 minutes after snapshot
    • Example: "identifier": "@e1"
    • Works with multi-strategy fallback for robustness
  2. CSS Selector: #id, .class, button[type="submit"]
    • Standard CSS selectors
    • Example: "identifier": "#login-button"
  3. XPath: //button[@id='login'], //a[contains(text(), 'Submit')]
    • XPath expressions for complex queries
    • Example: "identifier": "//button[@id='login']"
  4. Text Content: Login, Sign Up, Submit
    • Searches buttons and links with matching text
    • Example: "identifier": "Login"
  5. ARIA Label: Elements with an aria-label attribute
    • Automatically searched
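Because RefIDs are described as valid for 5 minutes after a snapshot, a client may want to track snapshot age and re-snapshot before reusing old RefIDs. The Python sketch below is client-side bookkeeping only; SnapshotTracker and its method names are hypothetical, and the only value taken from this document is the 5-minute window.

```python
import time

REFID_TTL_SECONDS = 5 * 60  # "Valid for 5 minutes after snapshot", per this document


class SnapshotTracker:
    """Remember when the last snapshot was taken (hypothetical client-side helper)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the logic can be checked without waiting
        self._taken_at = None    # no snapshot taken yet

    def mark_snapshot(self):
        """Call right after a successful GET /snapshot."""
        self._taken_at = self._clock()

    def refids_usable(self):
        """True while the last snapshot is younger than the documented window."""
        if self._taken_at is None:
            return False
        return self._clock() - self._taken_at < REFID_TTL_SECONDS
```

When refids_usable() returns False, calling /snapshot again before the next interaction is the safer choice.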

Guidelines

Before starting:
  • Call GET /help if you're unsure about available commands or their parameters
  • Ensure browser is started (if not, it will auto-start on first operation)
During automation:
  • Always call /snapshot after navigation to get page structure and RefIDs
  • Prefer RefIDs (like @e1) over CSS selectors for reliability and stability
  • Re-snapshot after page changes to get updated RefIDs
  • Use /wait for dynamic content that loads asynchronously
  • Check element states before interaction (visible, enabled)
  • Use /batch for multiple sequential operations to improve efficiency
Error handling:
  • If operation fails, check element identifier and try different format
  • For timeout errors, increase timeout value
  • If element not found, call /snapshot again to refresh page structure
  • Explain errors clearly to user with suggested solutions
Data extraction:
  • Use fields parameter to specify what to extract: ["text", "href", "src"]
  • Set multiple: true to extract from multiple elements
  • Format extracted data in a readable way for user

Complete Workflow Example

Scenario: User wants to log in to a website
User: "Please log in to example.com with username 'john' and password 'secret123'"
Your Actions:
Step 1: Navigate to login page
bash
curl -X POST 'http://localhost:8080/api/v1/executor/navigate' \
  -H 'Content-Type: application/json' \
  -d '{"url": "https://example.com/login"}'
Step 2: Get page structure
bash
curl -X GET 'http://localhost:8080/api/v1/executor/snapshot'
Response:
Clickable Elements:
  @e1 Login (role: button)

Input Elements:
  @e2 Username (role: textbox)
  @e3 Password (role: textbox)
Step 3: Enter username
bash
curl -X POST 'http://localhost:8080/api/v1/executor/type' \
  -H 'Content-Type: application/json' \
  -d '{"identifier": "@e2", "text": "john"}'
Step 4: Enter password
bash
curl -X POST 'http://localhost:8080/api/v1/executor/type' \
  -H 'Content-Type: application/json' \
  -d '{"identifier": "@e3", "text": "secret123"}'
Step 5: Click login button
bash
curl -X POST 'http://localhost:8080/api/v1/executor/click' \
  -H 'Content-Type: application/json' \
  -d '{"identifier": "@e1"}'
Step 6: Wait for login success (optional)
bash
curl -X POST 'http://localhost:8080/api/v1/executor/wait' \
  -H 'Content-Type: application/json' \
  -d '{"identifier": ".welcome-message", "state": "visible", "timeout": 10}'
Step 7: Inform user
"Successfully logged in to example.com!"

Batch Operation Example

Scenario: Fill out a form with multiple fields
Instead of making 5 separate API calls, use one batch operation:
bash
curl -X POST 'http://localhost:8080/api/v1/executor/batch' \
  -H 'Content-Type: application/json' \
  -d '{
    "operations": [
      {
        "type": "navigate",
        "params": {"url": "https://example.com/form"},
        "stop_on_error": true
      },
      {
        "type": "type",
        "params": {"identifier": "#name", "text": "John Doe"},
        "stop_on_error": true
      },
      {
        "type": "type",
        "params": {"identifier": "#email", "text": "john@example.com"},
        "stop_on_error": true
      },
      {
        "type": "select",
        "params": {"identifier": "#country", "value": "United States"},
        "stop_on_error": true
      },
      {
        "type": "click",
        "params": {"identifier": "#submit"},
        "stop_on_error": true
      }
    ]
  }'

Best Practices

  1. Discovery first: If unsure, call /help or /help?command=<name> to learn about commands
  2. Structure first: Always call /snapshot after navigation to understand the page
  3. Use RefIDs: They're more reliable than CSS selectors (elements might have dynamic classes)
  4. Wait for dynamic content: Use /wait before interacting with elements that load asynchronously
  5. Batch when possible: Use /batch for multiple sequential operations
  6. Handle errors gracefully: Provide clear explanations and suggestions when operations fail
  7. Verify results: After operations, check if desired outcome was achieved

Common Scenarios

Form Filling

  1. Navigate to form page
  2. Get accessibility snapshot to find input elements and their RefIDs
  3. Use /type for each field: @e1, @e2, etc.
  4. Use /select for dropdowns
  5. Click submit button using its RefID

Data Scraping

  1. Navigate to target page
  2. Wait for content to load with /wait
  3. Use /extract with CSS selector and multiple: true
  4. Specify fields to extract: ["text", "href", "src"]

Search Operations

  1. Navigate to search page
  2. Get accessibility snapshot to locate search input
  3. Type search query into input
  4. Press Enter or click search button
  5. Wait for results
  6. Extract results data

Login Automation

  1. Navigate to login page
  2. Get accessibility snapshot to find RefIDs
  3. Type username: @e2
  4. Type password: @e3
  5. Click login button: @e1
  6. Wait for success indicator

Important Notes

  • Browser must be running (it will auto-start on first operation if needed)
  • Operations are executed on the currently active browser tab
  • Accessibility snapshot updates after each navigation and click operation
  • All timeouts are in seconds
  • Use wait_visible: true (default) for reliable element interaction
  • Replace localhost:8080 with actual API host address
  • Authentication required: use X-BrowserWing-Key header or JWT token

Troubleshooting

Element not found:
  • Call /snapshot to see available elements
  • Try different identifier format (RefID, CSS selector, text)
  • Check if page has finished loading
Timeout errors:
  • Increase timeout value in request
  • Check if element actually appears on page
  • Use /wait with appropriate state before interaction
Extraction returns empty:
  • Verify CSS selector matches target elements
  • Check if content has loaded (use /wait first)
  • Try different extraction fields or type
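The advice to increase the timeout can be sketched as a retry loop in Python. Everything here is illustrative: wait_with_growing_timeout is a hypothetical helper, the timeout values are arbitrary, and send_wait stands for whatever caller-supplied function posts a body to /wait and raises an exception on failure.

```python
def wait_with_growing_timeout(send_wait, identifier, state="visible", timeouts=(5, 10, 20)):
    """Try POST /wait with each timeout (in seconds) in turn.

    Returns the first successful result. If every attempt fails, the last
    error is re-raised so the caller can explain it to the user.
    """
    if not timeouts:
        raise ValueError("timeouts must not be empty")
    last_error = None
    for timeout in timeouts:
        body = {"identifier": identifier, "state": state, "timeout": timeout}
        try:
            return send_wait(body)
        except Exception as exc:  # broad on purpose: the real error type is not known here
            last_error = exc
    raise last_error
```

Keeping the transport in send_wait means the same loop works with any HTTP client and can be exercised with a stub.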

Quick Reference

bash
# Discover commands
GET localhost:8080/api/v1/executor/help

# Navigate
POST localhost:8080/api/v1/executor/navigate {"url": "..."}

# Get page structure
GET localhost:8080/api/v1/executor/snapshot

# Click element
POST localhost:8080/api/v1/executor/click {"identifier": "@e1"}

# Type text
POST localhost:8080/api/v1/executor/type {"identifier": "@e3", "text": "..."}

# Extract data
POST localhost:8080/api/v1/executor/extract {"selector": "...", "fields": [...], "multiple": true}

Response Format

All operations return:
json
{
  "success": true,
  "message": "Operation description",
  "timestamp": "2026-01-15T10:30:00Z",
  "data": {
    // Operation-specific data
  }
}
Error response:
json
{
  "error": "error.operationFailed",
  "detail": "Detailed error message"
}
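A client can tell the two response shapes apart by checking for the "error" key. The Python sketch below assumes the bodies look exactly like the examples above; ExecutorError and check_response are hypothetical names, not part of BrowserWing.

```python
import json


class ExecutorError(Exception):
    """Raised when a response body carries the documented error shape."""


def check_response(body_text):
    """Parse a response body; return it on success, raise ExecutorError otherwise."""
    body = json.loads(body_text)
    if "error" in body:
        # Error shape from this document: {"error": ..., "detail": ...}
        raise ExecutorError(f'{body["error"]}: {body.get("detail", "")}')
    if not body.get("success", False):
        raise ExecutorError(body.get("message", "operation did not succeed"))
    return body
```

The whole parsed body is returned, not only "data", because the snapshot example earlier in this document places snapshot_text at the top level.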