Browse - Browser Automation CLI

Browser automation CLI for AI agents. Create, test, and deploy web automations using the browse CLI.

Setup (Run First!)

Before using this skill, install the required CLIs:
bash
npm install -g @browserbasehq/browse-cli @browserbasehq/sdk-functions
Set your credentials:
bash
export BROWSERBASE_API_KEY="your_api_key"
export BROWSERBASE_PROJECT_ID="your_project_id"
Get credentials from: https://browserbase.com/settings
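Before going further, it can help to confirm that both variables are actually set. The following is a minimal sketch, not part of either CLI: the helper name missingCredentials is hypothetical, and the environment is passed in as a plain object so the check stays easy to test.

```typescript
// Hypothetical helper; not provided by browse-cli or sdk-functions.
const REQUIRED_VARS = ["BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID"] as const;

// Returns the names of required variables that are unset or blank.
function missingCredentials(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node script this could be called as missingCredentials(process.env), stopping early when the returned list is non-empty.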

When to Use

Use this skill when:
  • User wants to automate a website task
  • User needs to scrape data from a site
  • User wants to create a Browserbase Function
  • Starting from scratch on a new automation

Workflow

1. Understand the Goal

Ask clarifying questions:
  • What website/URL are you automating?
  • What's the end goal (extract data, submit forms, monitor changes)?
  • Does it require authentication?
  • Should this run on a schedule or on-demand?

2. Explore the Site Interactively

Start a local browser session to understand the site structure:
bash
browse open https://example.com
Use snapshot to understand the DOM:
bash
browse snapshot
Take screenshots to see the visual layout:
bash
browse screenshot exploration.png

3. Identify Key Elements

For each step of the automation, identify:
  • Selectors for interactive elements
  • Wait conditions needed
  • Data to extract
Use the accessibility tree refs to understand element relationships:
[@0-5] button: "Submit"
[@0-6] textbox: "Email"
[@0-7] textbox: "Password"
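When a snapshot is long, it can be easier to look refs up by role and label than to read them by eye. The sketch below parses lines shaped exactly like the three shown above. It assumes that shape only; real snapshot output may include indentation, unnamed nodes, or other fields this regex does not handle. The names parseSnapshotRefs and findRef are hypothetical.

```typescript
interface SnapshotRef {
  ref: string;  // e.g. "@0-5"
  role: string; // e.g. "button"
  name: string; // e.g. "Submit"
}

// Assumed line shape: [@0-5] button: "Submit"
const REF_LINE = /^\s*\[(@[\w-]+)\]\s+([\w-]+):\s*"(.*)"\s*$/;

// Collects every line that matches the assumed shape; other lines are skipped.
function parseSnapshotRefs(snapshot: string): SnapshotRef[] {
  const refs: SnapshotRef[] = [];
  for (const line of snapshot.split("\n")) {
    const match = REF_LINE.exec(line);
    if (match) {
      refs.push({ ref: match[1]!, role: match[2]!, name: match[3]! });
    }
  }
  return refs;
}

// Returns the ref of the first element with the given role and label, if any.
function findRef(refs: SnapshotRef[], role: string, name: string): string | undefined {
  for (const entry of refs) {
    if (entry.role === role && entry.name === name) return entry.ref;
  }
  return undefined;
}
```

With the three lines above as input, findRef(refs, "textbox", "Email") yields the ref to pass to browse fill.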

4. Test Interactions Manually

Before writing code, verify each step works:
bash
browse fill @0-6 "test@example.com"
browse fill @0-7 "password123"
browse click @0-5
browse wait load networkidle
browse snapshot

5. Enable Network Capture (if needed)

For API-based automations or debugging:
bash
browse network on
# perform actions
browse network path
# inspect captured requests in the directory

6. Create the Function

Once you understand the flow, create a full function project:
bash
pnpm dlx @browserbasehq/sdk-functions init my-automation
cd my-automation
This creates a complete project with:
  • package.json with dependencies
  • .env for credentials
  • tsconfig.json
  • index.ts template
Edit index.ts with your automation logic:
typescript
import { defineFn } from "@browserbasehq/sdk-functions";
import { chromium } from "playwright-core";

defineFn("my-automation", async (context) => {
  const { session } = context;
  const browser = await chromium.connectOverCDP(session.connectUrl);
  const page = browser.contexts()[0]!.pages()[0]!;

  // Your automation steps here
  await page.goto("https://example.com");
  await page.fill('input[name="email"]', context.params.email);
  await page.click('button[type="submit"]');
  
  // Extract and return data
  const result = await page.textContent('.result');
  return { success: true, result };
});
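The template above passes context.params.email straight to page.fill, so a missing or malformed parameter only surfaces as a Playwright error. A minimal sketch of checking the input first follows. It assumes params arrives as untyped JSON; the helper name checkEmailParam and the "contains @" rule are illustrative choices, not part of the SDK.

```typescript
type ParamCheck =
  | { ok: true; email: string }
  | { ok: false; error: string };

// Hypothetical validator: confirms params is an object carrying a plausible email.
function checkEmailParam(params: unknown): ParamCheck {
  if (typeof params !== "object" || params === null) {
    return { ok: false, error: "params must be an object" };
  }
  const email = (params as Record<string, unknown>)["email"];
  if (typeof email !== "string" || email.indexOf("@") === -1) {
    return { ok: false, error: "params.email must be a string containing '@'" };
  }
  return { ok: true, email };
}
```

Inside the handler, the result could be checked before any navigation, returning something like { success: false, error } when ok is false.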

7. Test Locally

Start the local development server:
bash
pnpm bb dev index.ts
Then invoke locally via curl:
bash
curl -X POST http://127.0.0.1:14113/v1/functions/my-automation/invoke \
  -H "Content-Type: application/json" \
  -d '{"params": {"email": "test@example.com"}}'

8. Deploy to Browserbase

When ready for production:
bash
pnpm bb publish index.ts

9. Test Production

Invoke the deployed function via API:
bash
curl -X POST https://api.browserbase.com/v1/functions/<function-id>/invoke \
  -H "Content-Type: application/json" \
  -H "x-bb-api-key: $BROWSERBASE_API_KEY" \
  -d '{"params": {"email": "test@example.com"}}'
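The local and production curl calls differ only in base URL, function identifier, and the x-bb-api-key header. As a sketch, the same request can be assembled in TypeScript. The helper name buildInvokeRequest is hypothetical; the path, headers, and body shape are copied from the curl commands in this document, and nothing else about the API is assumed.

```typescript
interface InvokeRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the invoke request; the API key header is added only when a key is given.
function buildInvokeRequest(
  baseUrl: string,
  functionRef: string, // local function name or deployed function id
  params: Record<string, unknown>,
  apiKey?: string,
): InvokeRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey) {
    headers["x-bb-api-key"] = apiKey;
  }
  const trimmedBase = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${trimmedBase}/v1/functions/${encodeURIComponent(functionRef)}/invoke`,
    method: "POST",
    headers,
    body: JSON.stringify({ params }),
  };
}
```

On a runtime with a global fetch (Node 18 or later, for example), the result can be sent with const { url, ...init } = buildInvokeRequest(...) followed by fetch(url, init).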

Best Practices

Selectors

  • Prefer data attributes (data-testid) over CSS classes
  • Use text content as fallback (text=Submit)
  • Avoid fragile selectors like nth-child
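One way to apply the preference order above is to try selectors from most to least stable and use the first that matches. This is a sketch under assumptions: LocatorPage is a hand-written structural subset of Playwright's Page (only locator(selector).count() is used), and firstMatchingSelector is a hypothetical helper.

```typescript
// Minimal subset of Playwright's Page needed here, so the helper can be tested with a fake.
interface LocatorPage {
  locator(selector: string): { count(): Promise<number> };
}

// Returns the first candidate that matches at least one element, or undefined.
async function firstMatchingSelector(
  page: LocatorPage,
  candidates: string[],
): Promise<string | undefined> {
  for (const selector of candidates) {
    const matches = await page.locator(selector).count();
    if (matches > 0) return selector;
  }
  return undefined;
}
```

A call might pass ['[data-testid="submit"]', 'text=Submit'], where the data-testid value is an example and depends on the site being automated.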

Waiting

  • Always wait for navigation/network after clicks
  • Use waitForSelector for dynamic content
  • Set reasonable timeouts
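The three points above can be combined into one small helper: click, then wait for the element that signals success, with an explicit timeout. In this sketch, WaitPage is a hand-written structural subset of Playwright's Page, clickAndWaitFor is a hypothetical name, and the 10 second default is an arbitrary starting point to tune per site.

```typescript
// Minimal subset of Playwright's Page needed here, so the helper can be tested with a fake.
interface WaitPage {
  click(selector: string): Promise<void>;
  waitForSelector(selector: string, options?: { timeout?: number }): Promise<unknown>;
}

// Clicks, then waits for resultSelector. Any rejection from the wait
// (including a timeout) is reported as false instead of being thrown.
async function clickAndWaitFor(
  page: WaitPage,
  clickSelector: string,
  resultSelector: string,
  timeoutMs: number = 10000,
): Promise<boolean> {
  await page.click(clickSelector);
  try {
    await page.waitForSelector(resultSelector, { timeout: timeoutMs });
    return true;
  } catch {
    return false;
  }
}
```

Returning a boolean keeps the caller's control flow simple; a caller that needs the underlying error can call waitForSelector directly.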

Error Handling

  • Wrap risky operations in try/catch
  • Return structured error information
  • Log intermediate steps for debugging
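A sketch of how these three points might fit together: each step runs inside try/catch, logs its start and outcome, and returns a structured result instead of throwing. The names runStep and StepResult are hypothetical, and the logger is passed in so the caller decides where messages go (console.log, for instance).

```typescript
type StepResult<T> =
  | { ok: true; step: string; value: T }
  | { ok: false; step: string; error: string };

// Runs one named step, logging progress and converting any thrown error
// into a structured failure result.
async function runStep<T>(
  step: string,
  action: () => Promise<T>,
  log: (message: string) => void,
): Promise<StepResult<T>> {
  log(`start: ${step}`);
  try {
    const value = await action();
    log(`done: ${step}`);
    return { ok: true, step, value };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    log(`failed: ${step}: ${error}`);
    return { ok: false, step, error };
  }
}
```

A function handler could wrap each navigation or extraction in runStep and return the first failing result to the caller as its error payload.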

Data Extraction

  • Use page.evaluate() for complex extraction
  • Validate extracted data before returning
  • Handle missing elements gracefully
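As an illustration of validating before returning, the sketch below turns raw price text into a number and reports a missing or unparseable value as null rather than a default. The helper name parsePrice is hypothetical. It strips everything except digits and dots, so it assumes a dot decimal separator; formats such as "1.299,00" would need different handling.

```typescript
// Converts raw text such as "$1,299.99" to a number.
// Returns null when the text is missing or does not contain a parseable number.
function parsePrice(raw: string | null | undefined): number | null {
  if (!raw) return null;
  const cleaned = raw.replace(/[^0-9.]/g, ""); // keep digits and dots only
  if (cleaned === "") return null;
  const value = parseFloat(cleaned);
  return isFinite(value) ? value : null;
}
```

Returning null lets the caller tell "element missing" apart from a genuine price of 0.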

Example: E-commerce Price Monitor

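typescript
import { defineFn } from "@browserbasehq/sdk-functions";
import { chromium } from "playwright-core";

defineFn("price-monitor", async (context) => {
  const { session, params } = context;
  // Attach to the Browserbase session over CDP and reuse its first page
  const browser = await chromium.connectOverCDP(session.connectUrl);
  const page = browser.contexts()[0]!.pages()[0]!;

  await page.goto(params.productUrl);
  // Wait for the price element before reading it
  await page.waitForSelector('.price');

  // Keep only digits and dots from the price text
  const price = await page.evaluate(() => {
    const el = document.querySelector('.price');
    return el?.textContent?.replace(/[^0-9.]/g, '');
  });

  return {
    url: params.productUrl,
    // Falls back to 0 when no price text was found
    price: parseFloat(price || '0'),
    timestamp: new Date().toISOString(),
  };
});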