browse
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseBrowse - Browser Automation CLI
Browse - 浏览器自动化CLI
Browser automation CLI for AI agents. Create, test, and deploy web automations using the CLI.
browse面向AI Agent的浏览器自动化CLI。使用 CLI创建、测试和部署Web自动化任务。
browseSetup (Run First!)
安装设置(请先执行!)
Before using this skill, install the required CLIs:
bash
npm install -g @browserbasehq/browse-cli @browserbasehq/sdk-functionsSet your credentials:
bash
export BROWSERBASE_API_KEY="your_api_key"
export BROWSERBASE_PROJECT_ID="your_project_id"Get credentials from: https://browserbase.com/settings
使用本工具前,请安装所需的CLI:
bash
npm install -g @browserbasehq/browse-cli @browserbasehq/sdk-functions设置您的凭证:
bash
export BROWSERBASE_API_KEY="your_api_key"
export BROWSERBASE_PROJECT_ID="your_project_id"从以下地址获取凭证:https://browserbase.com/settings
When to Use
适用场景
Use this skill when:
- User wants to automate a website task
- User needs to scrape data from a site
- User wants to create a Browserbase Function
- Starting from scratch on a new automation
在以下场景使用本工具:
- 用户想要自动化网站任务
- 用户需要从网站爬取数据
- 用户想要创建Browserbase Function
- 从零开始构建新的自动化任务
Workflow
工作流程
1. Understand the Goal
1. 明确目标
Ask clarifying questions:
- What website/URL are you automating?
- What's the end goal (extract data, submit forms, monitor changes)?
- Does it require authentication?
- Should this run on a schedule or on-demand?
提出澄清问题:
- 您要自动化哪个网站/URL?
- 最终目标是什么(提取数据、提交表单、监控变更)?
- 是否需要身份验证?
- 它应该按计划运行还是按需运行?
2. Explore the Site Interactively
2. 交互式探索网站
Start a local browser session to understand the site structure:
bash
browse open https://example.comUse snapshot to understand the DOM:
bash
browse snapshotTake screenshots to see the visual layout:
bash
browse screenshot exploration.png启动本地浏览器会话以了解网站结构:
bash
browse open https://example.com使用快照功能了解DOM结构:
bash
browse snapshot截取屏幕截图查看可视化布局:
bash
browse screenshot exploration.png3. Identify Key Elements
3. 识别关键元素
For each step of the automation, identify:
- Selectors for interactive elements
- Wait conditions needed
- Data to extract
Use the accessibility tree refs to understand element relationships:
[@0-5] button: "Submit"
[@0-6] textbox: "Email"
[@0-7] textbox: "Password"针对自动化的每个步骤,确定:
- 交互元素的选择器
- 所需的等待条件
- 要提取的数据
使用可访问性树引用了解元素关系:
[@0-5] button: "Submit"
[@0-6] textbox: "Email"
[@0-7] textbox: "Password"4. Test Interactions Manually
4. 手动测试交互
Before writing code, verify each step works:
bash
browse fill @0-6 "test@example.com"
browse fill @0-7 "password123"
browse click @0-5
browse wait load networkidle
browse snapshot编写代码前,验证每个步骤是否可行:
bash
browse fill @0-6 "test@example.com"
browse fill @0-7 "password123"
browse click @0-5
browse wait load networkidle
browse snapshot5. Enable Network Capture (if needed)
5. 启用网络捕获(如有需要)
For API-based automations or debugging:
bash
browse network on对于基于API的自动化或调试:
bash
browse network onperform actions
执行操作
browse network path
browse network path
inspect captured requests in the directory
在目录中检查捕获的请求
undefinedundefined6. Create the Function
6. 创建Function
Once you understand the flow, create a full function project:
bash
pnpm dlx @browserbasehq/sdk-functions init my-automation
cd my-automationThis creates a complete project with:
- with dependencies
package.json - for credentials
.env tsconfig.json- template
index.ts
Edit with your automation logic:
index.tstypescript
import { defineFn } from "@browserbasehq/sdk-functions";
import { chromium } from "playwright-core";
defineFn("my-automation", async (context) => {
const { session } = context;
const browser = await chromium.connectOverCDP(session.connectUrl);
const page = browser.contexts()[0]!.pages()[0]!;
// Your automation steps here
await page.goto("https://example.com");
await page.fill('input[name="email"]', context.params.email);
await page.click('button[type="submit"]');
// Extract and return data
const result = await page.textContent('.result');
return { success: true, result };
});了解流程后,创建完整的Function项目:
bash
pnpm dlx @browserbasehq/sdk-functions init my-automation
cd my-automation这会创建一个包含以下内容的完整项目:
- 包含依赖的
package.json - 用于存储凭证的
.env tsconfig.json- 模板
index.ts
编辑添加您的自动化逻辑:
index.tstypescript
import { defineFn } from "@browserbasehq/sdk-functions";
import { chromium } from "playwright-core";
defineFn("my-automation", async (context) => {
const { session } = context;
const browser = await chromium.connectOverCDP(session.connectUrl);
const page = browser.contexts()[0]!.pages()[0]!;
// 在此添加您的自动化步骤
await page.goto("https://example.com");
await page.fill('input[name="email"]', context.params.email);
await page.click('button[type="submit"]');
// 提取并返回数据
const result = await page.textContent('.result');
return { success: true, result };
});7. Test Locally
7. 本地测试
Start the local development server:
bash
pnpm bb dev index.tsThen invoke locally via curl:
bash
curl -X POST http://127.0.0.1:14113/v1/functions/my-automation/invoke \
-H "Content-Type: application/json" \
-d '{"params": {"email": "test@example.com"}}'启动本地开发服务器:
bash
pnpm bb dev index.ts然后通过curl本地调用:
bash
curl -X POST http://127.0.0.1:14113/v1/functions/my-automation/invoke \
-H "Content-Type: application/json" \
-d '{"params": {"email": "test@example.com"}}'8. Deploy to Browserbase
8. 部署到Browserbase
When ready for production:
bash
pnpm bb publish index.ts准备好生产环境部署时:
bash
pnpm bb publish index.ts9. Test Production
9. 生产环境测试
Invoke the deployed function via API:
bash
curl -X POST https://api.browserbase.com/v1/functions/<function-id>/invoke \
-H "Content-Type: application/json" \
-H "x-bb-api-key: $BROWSERBASE_API_KEY" \
-d '{"params": {"email": "test@example.com"}}'通过API调用已部署的Function:
bash
curl -X POST https://api.browserbase.com/v1/functions/<function-id>/invoke \
-H "Content-Type: application/json" \
-H "x-bb-api-key: $BROWSERBASE_API_KEY" \
-d '{"params": {"email": "test@example.com"}}'Best Practices
最佳实践
Selectors
选择器
- Prefer data attributes () over CSS classes
data-testid - Use text content as fallback ()
text=Submit - Avoid fragile selectors like nth-child
- 优先使用数据属性()而非CSS类
data-testid - 使用文本内容作为备选方案()
text=Submit - 避免使用像nth-child这样脆弱的选择器
Waiting
等待策略
- Always wait for navigation/network after clicks
- Use for dynamic content
waitForSelector - Set reasonable timeouts
- 点击后始终等待导航/网络完成
- 对动态内容使用
waitForSelector - 设置合理的超时时间
Error Handling
错误处理
- Wrap risky operations in try/catch
- Return structured error information
- Log intermediate steps for debugging
- 将风险操作包裹在try/catch中
- 返回结构化的错误信息
- 记录中间步骤以便调试
Data Extraction
数据提取
- Use for complex extraction
page.evaluate() - Validate extracted data before returning
- Handle missing elements gracefully
- 对复杂提取使用
page.evaluate() - 返回前验证提取的数据
- 优雅处理缺失元素的情况
Example: E-commerce Price Monitor
示例:电商价格监控器
typescript
defineFn("price-monitor", async (context) => {
const { session, params } = context;
const browser = await chromium.connectOverCDP(session.connectUrl);
const page = browser.contexts()[0]!.pages()[0]!;
await page.goto(params.productUrl);
await page.waitForSelector('.price');
const price = await page.evaluate(() => {
const el = document.querySelector('.price');
return el?.textContent?.replace(/[^0-9.]/g, '');
});
return {
url: params.productUrl,
price: parseFloat(price || '0'),
timestamp: new Date().toISOString(),
};
});typescript
defineFn("price-monitor", async (context) => {
const { session, params } = context;
const browser = await chromium.connectOverCDP(session.connectUrl);
const page = browser.contexts()[0]!.pages()[0]!;
await page.goto(params.productUrl);
await page.waitForSelector('.price');
const price = await page.evaluate(() => {
const el = document.querySelector('.price');
return el?.textContent?.replace(/[^0-9.]/g, '');
});
return {
url: params.productUrl,
price: parseFloat(price || '0'),
timestamp: new Date().toISOString(),
};
});