agent-onboarding

# Bright Data — Agent Onboarding

Bright Data gives agents reliable access to the open web: SERP results
that look like a real browser, clean markdown from any URL (with
CAPTCHA + JS handled), structured datasets for 40+ platforms (Amazon,
LinkedIn, Instagram, TikTok, YouTube, Reddit, Crunchbase, …), and a
Browser API for pages that need real interaction.
This skill is the entry point. Read it once, pick a path, then hand
off to the narrower skill that owns that path.
## Install
One command installs the CLI and the agent skills, and walks the
human through OAuth in the browser:
```bash
# macOS / Linux — fastest install
curl -fsSL https://cli.brightdata.com/install.sh | bash

# Cross-platform (or if you don't want the install script)
npm install -g @brightdata/cli

# One-off, no install
npx --yes --package @brightdata/cli brightdata <command>
```
Requires Node.js >= 20. After install, both `brightdata` and `bdata`
(shorthand) are available.
Then authenticate **once**:
```bash
bdata login
```

This single command:
- Opens the browser for OAuth (or use `bdata login --device` on headless / SSH machines)
- Saves the API key locally — you never need to paste a token again
- Auto-creates the required proxy zones (`cli_unlocker`, `cli_browser`)
- Sets sensible default configuration

For non-interactive setups you can pass the key directly:

```bash
bdata login --api-key <key>
```

or

```bash
export BRIGHTDATA_API_KEY=<key>
```

Verify the install before doing real work:

```bash
bdata version
bdata config   # confirms auth + zones
bdata zones    # should list cli_unlocker, cli_browser
bdata budget   # confirms account + balance
```

If any of these fail, route to Path C (auth) before continuing.
### Install agent skills (optional, recommended)
The CLI ships an installer that drops Bright Data skills directly into
your coding agent's skill directory:
```bash
# Interactive picker — choose skills + target agent
bdata skill add

# Install a specific skill
bdata skill add scrape
bdata skill add data-feeds
bdata skill add competitive-intel

# See everything available
bdata skill list
```

These are the skills you'll hand off to from the paths below (`scrape`, `search`, `data-feeds`, `scraper-builder`, `brightdata-cli`, `bright-data-mcp`, …).

## Choose your path
All paths share the same install + auth above. The difference is what
you do next.
| Situation | Path |
|---|---|
| Need web data during this session | Path A — live CLI tools |
| Need to add Bright Data to app code | Path B — SDK / REST integration |
| Want a drop-in tool layer for an LLM agent | Path M — MCP server |
| Need an API key first | Path C — auth only |
| Don't want to install anything | Path D — REST API directly |
If your task spans paths, do them in order: auth → live tools to
explore → app integration once the shape is known.
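A minimal sketch of that order, using only commands introduced in this document; the search query and URL are placeholders:

```bash
# Auth first (Path C), then explore live (Path A), then wire into code (Path B / D).
bdata login                                       # one-time OAuth; creates cli_unlocker / cli_browser
bdata search "competitor pricing page" --json     # discovery
bdata scrape "https://example.com" -f markdown    # clean content for a known URL
# Once the right call is clear, move it into app code via the SDK or REST.
```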
## Path A — Live web tools (CLI)
Use this when the agent itself needs web data right now: discovering
URLs, fetching clean content, pulling structured records from a known
platform, or running a quick competitive scan.
After install + login, hand off to the narrower skills:
- `brightdata-cli` — overall command surface (`scrape`, `search`, `pipelines`, `status`, `zones`, `budget`, `config`)
- `search` — discovery via `bdata search` (Google / Bing / Yandex SERP, structured JSON)
- `scrape` — clean content from a known URL via `bdata scrape` (markdown / HTML / JSON / screenshot)
- `data-feeds` — structured records from 40+ supported platforms via `bdata pipelines <type>` (Amazon, LinkedIn, Instagram, TikTok, YouTube, Reddit, Crunchbase, Google Maps, …)
- `competitive-intel` — packaged competitor / pricing / review / hiring / SEO analyses on top of the CLI
- `seo-audit` — sitemap-stratified live SEO audits

Default flow for live web work:
- Search first when you need discovery: `bdata search "query" --json`
- Pipelines next if the target is a supported platform — you get structured JSON with no parsing: `bdata pipelines amazon_product "https://amazon.com/dp/..."`
- Scrape when you have a URL and no platform pipeline applies: `bdata scrape "https://example.com" -f markdown`
- Browser API only when the page truly needs clicks, forms, or login (see the `brightdata-cli` skill for `bdata browser` and the `bright-data-best-practices` browser-api reference)
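A hedged sketch chaining the search and scrape steps; it assumes `jq` is installed and that the `--json` search output exposes result links under an `organic[].link` field, which may not match the actual shape:

```bash
# Assumption: the search JSON has an organic[] array with .link fields.
# Adjust the jq path to whatever `bdata search ... --json` actually returns.
first_url=$(bdata search "web scraping best practices" --json | jq -r '.organic[0].link')
bdata scrape "$first_url" -f markdown > page.md
```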
When the task shifts from "fetch data now" to "wire this into an
app," switch to Path B.
## Path B — Integrate Bright Data into an app
Use this when you're building an application, agent, or workflow that calls Bright Data from code and needs `BRIGHTDATA_API_KEY` (and a zone) in `.env` or runtime config.

The required question on this path is: What should Bright Data do in the product?

Use the answer to pick the API:
| Job in product | API | Skill |
|---|---|---|
| Fetch a single page as markdown / HTML / JSON | Web Unlocker | |
| Search engine results in structured JSON | SERP API | |
| Structured records from supported platforms | Web Scraper API | |
| JS-heavy / interactive pages with Playwright/Puppeteer | Browser API | |
| Build a custom scraper for an arbitrary site | All four, picked by site shape | |
### Pick a stack
- **Python** → use the official SDK:

  ```bash
  pip install brightdata-sdk
  ```

  Hand off to `python-sdk-best-practices` for client setup (async/sync), platform scrapers, SERP, datasets, Browser API, and error handling.
- **Node / TypeScript / shell / other** → call the REST API directly (Path D below has the endpoints), or use the CLI as a library via `npx @brightdata/cli` (a one-off sketch follows this list).
- **LLM tool layer (Claude, ChatGPT, etc.)** → use the MCP server (Path M).
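That one-off form is the same `npx` invocation from the Install section; a minimal sketch of calling the CLI from a non-Python stack without a global install:

```bash
# Runs the CLI straight from the npm registry, e.g. in a CI step or npm script.
npx --yes --package @brightdata/cli brightdata scrape "https://example.com" -f markdown
```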
### Set credentials
```dotenv
BRIGHTDATA_API_KEY=...
BRIGHTDATA_UNLOCKER_ZONE=cli_unlocker   # created automatically by `bdata login`
BRIGHTDATA_SERP_ZONE=cli_unlocker       # or a dedicated SERP zone
```

If you don't have a key yet, do Path C first.
### Smoke test before writing real code
Always run one real Bright Data request before scaling up integration
work — catches auth, zone, and quota issues before they hide inside
your app's error paths.
```bash
# Web Unlocker via REST
curl -sS https://api.brightdata.com/request \
  -H "Authorization: Bearer $BRIGHTDATA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "url": "https://example.com", "zone": "'"$BRIGHTDATA_UNLOCKER_ZONE"'", "format": "raw", "data_format": "markdown" }' | head -40
```

If this prints clean markdown, you're wired up. If not, check the zone name and key.
---

## Path M — MCP server (LLM tool layer)
Use this when the consumer is an LLM agent that should call Bright
Data as tools (e.g., Claude Code, ChatGPT desktop, custom agent
loops). The MCP server exposes 60+ tools — search, scrape, structured
data per platform, browser automation — over a single URL.
Connect with:

`https://mcp.brightdata.com/mcp?token=YOUR_BRIGHTDATA_API_TOKEN`

Optional URL parameters:
| Parameter | Effect |
|---|---|
| | Enable all 60+ Pro tools |
| | Enable a tool group |
| | Enable a specific tool list, comma-separated |
Hand off to the `bright-data-mcp` skill for tool selection, tool-group
auto-enabling, and workflow patterns. That skill explicitly replaces
WebFetch / WebSearch with Bright Data MCP equivalents.
## Path C — Get an API key (auth only)
Use this when the human still needs to sign up, sign in, or generate a key. Skip this path if `bdata config` already shows an authenticated account, or if `BRIGHTDATA_API_KEY` is already set in the environment.
### Easiest: use the CLI's OAuth flow
```bash
bdata login            # browser-based OAuth
bdata login --device   # headless / SSH (device-code flow)
```

This handles signup-or-signin, key generation, zone creation, and local config in one step. Prefer this over manual flows.
### Manual: dashboard
If the human prefers the web UI:
- Go to https://brightdata.com/cp (sign up if needed)
- Create a Web Unlocker zone ("Add" → "Unlocker zone")
- Copy the API key from the dashboard
- Save it where the rest of the app reads secrets:
```bash
echo "BRIGHTDATA_API_KEY=..." >> .env
echo "BRIGHTDATA_UNLOCKER_ZONE=<zone-name>" >> .env
```
echo "BRIGHTDATA_UNLOCKER_ZONE=<zone-name>" >> .envVerify
验证
```bash
bdata budget   # any successful response means the key works
```

If verification fails, the key is wrong, the zone is wrong, or the account has no active subscription — surface the error to the human rather than guessing.
## Path D — Use Bright Data without installing anything
Use this when the environment can't run `npm` / `curl | bash`, or when you only need one or two requests and don't want the CLI / SDK. Works for both live agent work and app integration.

You still need an API key and a zone. Two ways to get them:
- Human pastes it in — if a key already exists, set `BRIGHTDATA_API_KEY=...` and `BRIGHTDATA_UNLOCKER_ZONE=...` in the environment
- Browser flow — do Path C; the dashboard issues both

- Base URL: `https://api.brightdata.com`
- Auth header: `Authorization: Bearer $BRIGHTDATA_API_KEY`
### Core endpoints
```http
# Web Unlocker — clean content from any URL
POST /request
{
  "url": "https://target.com",
  "zone": "<unlocker-zone>",
  "format": "raw",
  "data_format": "markdown"   // or "html", "screenshot", "parsed_light"
}
```
```http
# SERP API — structured search results
# Use the same /request endpoint with a SERP zone and a search URL,
# adding brd_json=1 to receive parsed JSON instead of raw HTML.
POST /request
{
  "url": "https://www.google.com/search?q=web+scraping&brd_json=1",
  "zone": "<serp-zone>",
  "format": "raw"
}
```
```http
# Web Scraper API — structured data for 40+ platforms (async)
POST /datasets/v3/trigger?dataset_id=<id>
[ { "url": "https://amazon.com/dp/B09V3KXJPB" } ]

# then poll
GET /datasets/v3/snapshot/<snapshot_id>?format=json
```
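A hedged curl sketch of that trigger-then-poll cycle; `<dataset-id>` is a placeholder, and the `snapshot_id` response field is an assumption about the trigger response shape to verify against the `bright-data-best-practices` references:

```bash
# Trigger the collection (assumes the response JSON contains a snapshot_id field).
snapshot_id=$(curl -sS "https://api.brightdata.com/datasets/v3/trigger?dataset_id=<dataset-id>" \
  -H "Authorization: Bearer $BRIGHTDATA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '[ { "url": "https://amazon.com/dp/B09V3KXJPB" } ]' | jq -r '.snapshot_id')

# Poll until the snapshot returns records instead of a "still running" status.
curl -sS "https://api.brightdata.com/datasets/v3/snapshot/$snapshot_id?format=json" \
  -H "Authorization: Bearer $BRIGHTDATA_API_KEY"
```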
For the full parameter surface (special headers like
`x-unblock-expect`, async response IDs, dataset progress states,
Browser API CDP commands), read the `bright-data-best-practices`
skill — its references are the source of truth for REST-level work.
## Documentation
- Product docs: https://docs.brightdata.com
- LLM-friendly docs index: https://docs.brightdata.com/llms.txt
- Dashboard (zones, keys, billing): https://brightdata.com/cp
## After onboarding — where to go next
Once the agent is set up, route the work to the narrowest skill that
fits. Quick map:
| User says… | Skill |
|---|---|
| "scrape this URL" / "get this page" | |
| "search Google for…" / "find URLs about…" | |
| "get Amazon / LinkedIn / Instagram / TikTok / YouTube / Reddit data" | |
| "build a scraper for <site>" | |
| "analyze my competitor" / "compare pricing" | |
| "audit SEO" / "rank check" / "schema check" | |
| "write Bright Data code in Python" | |
| "plug Bright Data into my LLM agent" | |
| "use the CLI" / "run from terminal" | |
| "debug a Browser API session" | |
When in doubt, prefer the more specific skill: `data-feeds` over `scrape` for supported platforms, `scraper-builder` over `scrape` for multi-page extraction, `bright-data-mcp` over `brightdata-cli` when the consumer is an LLM agent rather than a human at a terminal.