agent-onboarding


# Bright Data — Agent Onboarding


Bright Data gives agents reliable access to the open web: SERP results that look like a real browser, clean markdown from any URL (with CAPTCHA + JS handled), structured datasets for 40+ platforms (Amazon, LinkedIn, Instagram, TikTok, YouTube, Reddit, Crunchbase, …), and a Browser API for pages that need real interaction.
This skill is the entry point. Read it once, pick a path, then hand off to the narrower skill that owns that path.

## Install


One command installs the CLI and the agent skills, and walks the human through OAuth in the browser:

### macOS / Linux — fastest install


### Cross-platform (or if you don't want the install script)


```bash
npm install -g @brightdata/cli
```

### One-off, no install


```bash
npx --yes --package @brightdata/cli brightdata <command>
```

Requires Node.js >= 20. After install, both `brightdata` and `bdata`
(shorthand) are available.

Then authenticate **once**:

```bash
bdata login
```

This single command:

1. Opens the browser for OAuth (or use `bdata login --device` on headless / SSH machines)
2. Saves the API key locally — you never need to paste a token again
3. Auto-creates the required proxy zones (`cli_unlocker`, `cli_browser`)
4. Sets sensible default configuration

For non-interactive setups you can pass the key directly:

```bash
bdata login --api-key <key>
```

or

```bash
export BRIGHTDATA_API_KEY=<key>
```

Verify the install before doing real work:

```bash
bdata version
bdata config            # confirms auth + zones
bdata zones             # should list cli_unlocker, cli_browser
bdata budget            # confirms account + balance
```

If any of these fail, route to Path C (auth) before continuing.

## Install agent skills (optional, recommended)


The CLI ships an installer that drops Bright Data skills directly into your coding agent's skill directory:

### Interactive picker — choose skills + target agent


```bash
bdata skill add
```

### Install a specific skill


```bash
bdata skill add scrape
bdata skill add data-feeds
bdata skill add competitive-intel
```

### See everything available


```bash
bdata skill list
```

These are the skills you'll hand off to from the paths below (`scrape`, `search`, `data-feeds`, `scraper-builder`, `brightdata-cli`, `bright-data-mcp`, …).

## Choose your path


All paths share the same install + auth above. The difference is what you do next.

| Situation | Path |
| --- | --- |
| Need web data during this session | Path A — live CLI tools |
| Need to add Bright Data to app code | Path B — SDK / REST integration |
| Want a drop-in tool layer for an LLM agent | Path M — MCP server |
| Need an API key first | Path C — auth only |
| Don't want to install anything | Path D — REST API directly |

If your task spans paths, do them in order: auth → live tools to explore → app integration once the shape is known.


## Path A — Live web tools (CLI)


Use this when the agent itself needs web data right now: discovering URLs, fetching clean content, pulling structured records from a known platform, or running a quick competitive scan.

After install + login, hand off to the narrower skills:

- `brightdata-cli` — overall command surface (`scrape`, `search`, `pipelines`, `status`, `zones`, `budget`, `config`)
- `search` — discovery via `bdata search` (Google / Bing / Yandex SERP, structured JSON)
- `scrape` — clean content from a known URL via `bdata scrape` (markdown / HTML / JSON / screenshot)
- `data-feeds` — structured records from 40+ supported platforms via `bdata pipelines <type>` (Amazon, LinkedIn, Instagram, TikTok, YouTube, Reddit, Crunchbase, Google Maps, …)
- `competitive-intel` — packaged competitor / pricing / review / hiring / SEO analyses on top of the CLI
- `seo-audit` — sitemap-stratified live SEO audits

Default flow for live web work:

1. Search first when you need discovery:
   ```bash
   bdata search "query" --json
   ```
2. Pipelines next if the target is a supported platform — you get structured JSON with no parsing:
   ```bash
   bdata pipelines amazon_product "https://amazon.com/dp/..."
   ```
3. Scrape when you have a URL and no platform pipeline applies:
   ```bash
   bdata scrape "https://example.com" -f markdown
   ```
4. Browser API only when the page truly needs clicks, forms, or login (see the `brightdata-cli` skill for `bdata browser` and the `bright-data-best-practices` browser-api reference).

When the task shifts from "fetch data now" to "wire this into an app," switch to Path B.
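The search → scrape steps of this flow can be driven from an agent loop. A minimal sketch using only the standard library, with one loud assumption: that `bdata search --json` emits a JSON array whose items carry a `url` field. Verify the real output shape against the `search` skill before relying on it.

```python
import json
import subprocess


def search_cmd(query: str) -> list[str]:
    # Step 1 of the flow: discovery via SERP, structured JSON
    return ["bdata", "search", query, "--json"]


def scrape_cmd(url: str) -> list[str]:
    # Step 3 of the flow: clean markdown from a known URL
    return ["bdata", "scrape", url, "-f", "markdown"]


def discover_then_scrape(query: str) -> str:
    """Search, then scrape the top result as markdown."""
    out = subprocess.run(
        search_cmd(query), check=True, capture_output=True, text=True
    ).stdout
    results = json.loads(out)
    # The "url" field per result is an assumption; check the real schema.
    return subprocess.run(
        scrape_cmd(results[0]["url"]), check=True, capture_output=True, text=True
    ).stdout
```

If the target turns out to be a supported platform, swap the scrape step for a `bdata pipelines <type>` call (step 2 above) and skip the markdown parsing entirely.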


## Path B — Integrate Bright Data into an app


Use this when you're building an application, agent, or workflow that calls Bright Data from code and needs `BRIGHTDATA_API_KEY` (and a zone) in `.env` or runtime config.

The required question on this path is: **What should Bright Data do in the product?** Use the answer to pick the API:

| Job in product | API | Skill |
| --- | --- | --- |
| Fetch a single page as markdown / HTML / JSON | Web Unlocker | `bright-data-best-practices` → web-unlocker.md |
| Search engine results in structured JSON | SERP API | `bright-data-best-practices` → serp-api.md |
| Structured records from supported platforms | Web Scraper API | `bright-data-best-practices` → web-scraper-api.md |
| JS-heavy / interactive pages with Playwright/Puppeteer | Browser API | `bright-data-best-practices` → browser-api.md |
| Build a custom scraper for an arbitrary site | All four, picked by site shape | `scraper-builder` |

### Pick a stack


- **Python** → use the official SDK:
  ```bash
  pip install brightdata-sdk
  ```
  Hand off to `python-sdk-best-practices` for client setup (async/sync), platform scrapers, SERP, datasets, Browser API, and error handling.
- **Node / TypeScript / shell / other** → call the REST API directly (Path D below has the endpoints), or use the CLI as a library via `npx @brightdata/cli`.
- **LLM tool layer (Claude, ChatGPT, etc.)** → use the MCP server (Path M).

### Set credentials


```dotenv
BRIGHTDATA_API_KEY=...
BRIGHTDATA_UNLOCKER_ZONE=cli_unlocker   # created automatically by `bdata login`
BRIGHTDATA_SERP_ZONE=cli_unlocker       # or a dedicated SERP zone
```

If you don't have a key yet, do Path C first.
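In application code, read those variables once at startup and fail fast instead of letting a missing key surface later as an opaque 401. A minimal stdlib sketch; the zone defaults mirror the `cli_unlocker` zone that `bdata login` creates:

```python
import os
from typing import Mapping


def load_brightdata_config(env: Mapping[str, str] = os.environ) -> dict:
    """Read Bright Data credentials from the environment, failing fast."""
    key = env.get("BRIGHTDATA_API_KEY")
    if not key:
        raise RuntimeError("BRIGHTDATA_API_KEY is not set; do Path C first")
    return {
        "api_key": key,
        # Fall back to the zone auto-created by `bdata login`
        "unlocker_zone": env.get("BRIGHTDATA_UNLOCKER_ZONE", "cli_unlocker"),
        "serp_zone": env.get("BRIGHTDATA_SERP_ZONE", "cli_unlocker"),
    }
```

Call it once at process start and pass the resulting dict down, rather than reading `os.environ` throughout the codebase.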

### Smoke test before writing real code


Always run one real Bright Data request before scaling up integration work — catches auth, zone, and quota issues before they hide inside your app's error paths.

#### Web Unlocker via REST


```bash
curl -sS https://api.brightdata.com/request \
  -H "Authorization: Bearer $BRIGHTDATA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "url": "https://example.com", "zone": "'"$BRIGHTDATA_UNLOCKER_ZONE"'", "format": "raw", "data_format": "markdown" }' | head -40
```

If this prints clean markdown, you're wired up. If not, check the zone name and key.
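The same smoke test from Python, standard library only. This is a sketch of the raw REST call, not the official SDK (that path is covered by `python-sdk-best-practices`); the request body mirrors the curl call above.

```python
import json
import os
import urllib.request

API_URL = "https://api.brightdata.com/request"


def unlocker_payload(url: str, zone: str, data_format: str = "markdown") -> dict:
    # Same body as the curl smoke test
    return {"url": url, "zone": zone, "format": "raw", "data_format": data_format}


def fetch_page(url: str) -> str:
    """POST /request through the Web Unlocker zone and return the body."""
    zone = os.environ.get("BRIGHTDATA_UNLOCKER_ZONE", "cli_unlocker")
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(unlocker_payload(url, zone)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['BRIGHTDATA_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.read().decode()


if __name__ == "__main__":
    print(fetch_page("https://example.com")[:1000])
```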

---

## Path M — MCP server (LLM tool layer)


Use this when the consumer is an LLM agent that should call Bright Data as tools (e.g., Claude Code, ChatGPT desktop, custom agent loops). The MCP server exposes 60+ tools — search, scrape, structured data per platform, browser automation — over a single URL.

Connect with:

```
https://mcp.brightdata.com/mcp?token=YOUR_BRIGHTDATA_API_TOKEN
```

Optional URL parameters:

| Parameter | Effect |
| --- | --- |
| `pro=1` | Enable all 60+ Pro tools |
| `groups=<name>` | Enable a tool group (`social`, `ecommerce`, `business`, `finance`, `research`, `app_stores`, `travel`, `browser`, `advanced_scraping`) |
| `tools=<names>` | Enable a specific tool list, comma-separated |

Hand off to the `bright-data-mcp` skill for tool selection, tool-group auto-enabling, and workflow patterns. That skill explicitly replaces WebFetch / WebSearch with Bright Data MCP equivalents.


## Path C — Get an API key (auth only)


Use this when the human still needs to sign up, sign in, or generate a key. Skip this path if `bdata config` already shows an authenticated account, or if `BRIGHTDATA_API_KEY` is already set in the environment.

### Easiest: use the CLI's OAuth flow


```bash
bdata login            # browser-based OAuth
bdata login --device   # headless / SSH (device-code flow)
```

This handles signup-or-signin, key generation, zone creation, and local config in one step. Prefer this over manual flows.

### Manual: dashboard


If the human prefers the web UI:
  1. Go to https://brightdata.com/cp (sign up if needed)
  2. Create a Web Unlocker zone ("Add" → "Unlocker zone")
  3. Copy the API key from the dashboard
  4. Save it where the rest of the app reads secrets:
```bash
echo "BRIGHTDATA_API_KEY=..." >> .env
echo "BRIGHTDATA_UNLOCKER_ZONE=<zone-name>" >> .env
```

### Verify


```bash
bdata budget    # any successful response means the key works
```

If verification fails, the key is wrong, the zone is wrong, or the account has no active subscription — surface the error to the human rather than guessing.


## Path D — Use Bright Data without installing anything


Use this when the environment can't run `npm` / `curl | bash`, or when you only need one or two requests and don't want the CLI / SDK. Works for both live agent work and app integration.

You still need an API key and a zone. Two ways to get them:

- **Human pastes it in** — if a key already exists, set `BRIGHTDATA_API_KEY=...` and `BRIGHTDATA_UNLOCKER_ZONE=...` in the environment
- **Browser flow** — do Path C; the dashboard issues both

Base URL: `https://api.brightdata.com`
Auth header: `Authorization: Bearer $BRIGHTDATA_API_KEY`

### Core endpoints



#### Web Unlocker — clean content from any URL


```http
POST /request
{
  "url": "https://target.com",
  "zone": "<unlocker-zone>",
  "format": "raw",
  "data_format": "markdown"   // or "html", "screenshot", "parsed_light"
}
```

#### SERP API — structured search results


Use the same `/request` endpoint with a SERP zone and a search URL, adding `brd_json=1` to receive parsed JSON instead of raw HTML.

```http
POST /request
{
  "url": "https://www.google.com/search?q=web+scraping&brd_json=1",
  "zone": "<serp-zone>",
  "format": "raw"
}
```
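When building that search URL programmatically, the query must be URL-encoded; `quote_plus` reproduces the `web+scraping` form shown above. A tiny stdlib helper (the default engine URL is Google's, taken from the example; swap in Bing/Yandex search URLs as needed):

```python
from urllib.parse import quote_plus


def serp_search_url(query: str,
                    engine: str = "https://www.google.com/search") -> str:
    # brd_json=1 asks the SERP zone for parsed JSON instead of raw HTML
    return f"{engine}?q={quote_plus(query)}&brd_json=1"
```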

#### Web Scraper API — structured data for 40+ platforms (async)


```http
POST /datasets/v3/trigger?dataset_id=<id>
[ { "url": "https://amazon.com/dp/B09V3KXJPB" } ]
```

then poll:

```http
GET /datasets/v3/snapshot/<snapshot_id>?format=json
```

For the full parameter surface (special headers like `x-unblock-expect`, async response IDs, dataset progress states, Browser API CDP commands), read the `bright-data-best-practices` skill — its references are the source of truth for REST-level work.
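The trigger-then-poll pattern can be sketched in Python with the standard library. Two assumptions to verify against the `bright-data-best-practices` references before shipping: that the trigger response carries a `snapshot_id` field, and that the snapshot endpoint returns a non-200 status while the job is still running.

```python
import json
import os
import time
import urllib.request

API = "https://api.brightdata.com"


def trigger_path(dataset_id: str) -> str:
    return f"/datasets/v3/trigger?dataset_id={dataset_id}"


def snapshot_path(snapshot_id: str) -> str:
    return f"/datasets/v3/snapshot/{snapshot_id}?format=json"


def _call(method: str, path: str, body=None) -> tuple[int, str]:
    req = urllib.request.Request(
        API + path, method=method,
        data=None if body is None else json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['BRIGHTDATA_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.status, resp.read().decode()


def scrape_dataset(dataset_id: str, urls: list[str], poll_every: float = 10.0):
    """Trigger an async Web Scraper job, then poll until the snapshot is ready."""
    _, body = _call("POST", trigger_path(dataset_id), [{"url": u} for u in urls])
    snapshot_id = json.loads(body)["snapshot_id"]   # assumed response field
    while True:
        status, body = _call("GET", snapshot_path(snapshot_id))
        if status == 200 and body.strip():
            return json.loads(body)
        time.sleep(poll_every)
```

Real integrations should add a retry cap and surface dataset progress states instead of polling forever; see the skill references above.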

## Documentation




## After onboarding — where to go next


Once the agent is set up, route the work to the narrowest skill that fits. Quick map:

| User says… | Skill |
| --- | --- |
| "scrape this URL" / "get this page" | `scrape` |
| "search Google for…" / "find URLs about…" | `search` |
| "get Amazon / LinkedIn / Instagram / TikTok / YouTube / Reddit data" | `data-feeds` |
| "build a scraper for \<site\>" | `scraper-builder` |
| "analyze my competitor" / "compare pricing" | `competitive-intel` |
| "audit SEO" / "rank check" / "schema check" | `seo-audit` |
| "write Bright Data code in Python" | `python-sdk-best-practices` |
| "plug Bright Data into my LLM agent" | `bright-data-mcp` |
| "use the CLI" / "run from terminal" | `brightdata-cli` |
| "debug a Browser API session" | `brd-browser-debug` |

When in doubt, prefer the more specific skill: `data-feeds` over `scrape` for supported platforms, `scraper-builder` over `scrape` for multi-page extraction, `bright-data-mcp` over `brightdata-cli` when the consumer is an LLM agent rather than a human at a terminal.