brightdata-web-mcp

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Bright Data Web MCP

Bright Data Web MCP

Use this skill for reliable web access in MCP-compatible agents. Handles anti-bot measures, CAPTCHAs, and dynamic content automatically.
该技能可为兼容MCP的Agent提供可靠的网页访问能力,能自动处理反机器人措施、CAPTCHA验证以及动态内容。

Quick Start

快速开始

Search the web

网页搜索

Tool: search_engine
Input: { "query": "latest AI news", "engine": "google" }
Returns JSON for Google, Markdown for Bing/Yandex. Use
cursor
parameter for pagination.
Tool: search_engine
Input: { "query": "latest AI news", "engine": "google" }
Google返回JSON格式结果,必应/雅虎返回Markdown格式结果。可使用
cursor
参数实现分页。

Scrape a page to Markdown

将页面内容爬取为Markdown格式

Tool: scrape_as_markdown
Input: { "url": "https://example.com/article" }
Tool: scrape_as_markdown
Input: { "url": "https://example.com/article" }

Extract structured data (Pro/advanced_scraping)

提取结构化数据(专业版/高级爬取模式)

Tool: extract
Input: { 
  "url": "https://example.com/product",
  "prompt": "Extract: name, price, description, availability"
}
Tool: extract
Input: { 
  "url": "https://example.com/product",
  "prompt": "Extract: name, price, description, availability"
}

When to Use

适用场景

ScenarioToolMode
Web search results
search_engine
Rapid (Free)
Clean page content
scrape_as_markdown
Rapid (Free)
Parallel searches (up to 10)
search_engine_batch
Pro/advanced_scraping
Multiple URLs at once
scrape_batch
Pro/advanced_scraping
HTML structure needed
scrape_as_html
Pro/advanced_scraping
AI JSON extraction
extract
Pro/advanced_scraping
Dynamic/JS-heavy sites
scraping_browser_*
Pro/browser
Amazon/LinkedIn/social data
web_data_*
Pro
场景工具模式
网页搜索结果
search_engine
快速版(免费)
纯净页面内容获取
scrape_as_markdown
快速版(免费)
并行搜索(最多10个)
search_engine_batch
专业版/高级爬取
批量处理多个URL
scrape_batch
专业版/高级爬取
需要HTML结构
scrape_as_html
专业版/高级爬取
AI驱动的JSON提取
extract
专业版/高级爬取
动态/重度依赖JS的网站
scraping_browser_*
专业版/浏览器模式
亚马逊/领英/社交媒体数据
web_data_*
专业版

Setup

配置方法

Remote (recommended) - No installation required:
SSE Endpoint:
https://mcp.brightdata.com/sse?token=YOUR_API_TOKEN
Streamable HTTP Endpoint:
https://mcp.brightdata.com/mcp?token=YOUR_API_TOKEN
Local:
bash
API_TOKEN=<token> npx @brightdata/mcp
远程模式(推荐)- 无需安装:
SSE端点:
https://mcp.brightdata.com/sse?token=YOUR_API_TOKEN
流式HTTP端点:
https://mcp.brightdata.com/mcp?token=YOUR_API_TOKEN
本地模式:
bash
API_TOKEN=<token> npx @brightdata/mcp

Modes & Configuration

模式与配置

Rapid Mode (Free - Default)

快速版(免费 - 默认)

  • 5,000 requests/month free
  • Tools:
    search_engine
    ,
    scrape_as_markdown
  • 每月免费5000次请求
  • 可用工具:
    search_engine
    scrape_as_markdown

Pro Mode

专业版

  • All Rapid tools + 60+ advanced tools
  • Remote: add
    &pro=1
    to URL
  • Local: set
    PRO_MODE=true
  • 包含所有快速版工具 + 60+高级工具
  • 远程模式:在URL后添加
    &pro=1
  • 本地模式:设置
    PRO_MODE=true

Tool Groups

工具组

Select specific tool bundles instead of all Pro tools:
  • Remote:
    &groups=ecommerce,social
  • Local:
    GROUPS=ecommerce,social
GroupDescriptionFeatured Tools
ecommerce
Retail & marketplace data
web_data_amazon_product
,
web_data_walmart_product
social
Social media insights
web_data_linkedin_posts
,
web_data_instagram_profiles
browser
Browser automation
scraping_browser_*
business
Company intelligence
web_data_crunchbase_company
,
web_data_zoominfo_company_profile
finance
Financial data
web_data_yahoo_finance_business
research
News & dev data
web_data_github_repository_file
,
web_data_reuter_news
app_stores
App store data
web_data_google_play_store
,
web_data_apple_app_store
travel
Travel information
web_data_booking_hotel_listings
advanced_scraping
Batch & AI extraction
scrape_batch
,
extract
,
search_engine_batch
可选择特定的工具包,而非全部专业版工具:
  • 远程模式:
    &groups=ecommerce,social
  • 本地模式:
    GROUPS=ecommerce,social
工具组描述特色工具
ecommerce
零售与电商平台数据
web_data_amazon_product
,
web_data_walmart_product
social
社交媒体洞察
web_data_linkedin_posts
,
web_data_instagram_profiles
browser
浏览器自动化
scraping_browser_*
business
企业情报
web_data_crunchbase_company
,
web_data_zoominfo_company_profile
finance
金融数据
web_data_yahoo_finance_business
research
新闻与开发数据
web_data_github_repository_file
,
web_data_reuter_news
app_stores
应用商店数据
web_data_google_play_store
,
web_data_apple_app_store
travel
旅游信息
web_data_booking_hotel_listings
advanced_scraping
批量与AI提取
scrape_batch
,
extract
,
search_engine_batch

Custom Tools

自定义工具

Cherry-pick individual tools:
  • Remote:
    &tools=scrape_as_markdown,web_data_linkedin_person_profile
  • Local:
    TOOLS=scrape_as_markdown,web_data_linkedin_person_profile
Note:
GROUPS
or
TOOLS
override
PRO_MODE
when specified.
可单独挑选所需工具:
  • 远程模式:
    &tools=scrape_as_markdown,web_data_linkedin_person_profile
  • 本地模式:
    TOOLS=scrape_as_markdown,web_data_linkedin_person_profile
注意:指定
GROUPS
TOOLS
会覆盖
PRO_MODE
设置。

Core Tools Reference

核心工具参考

Search & Scraping (Rapid Mode)

搜索与爬取(快速版)

  • search_engine
    - Google/Bing/Yandex SERP results (JSON for Google, Markdown for others)
  • scrape_as_markdown
    - Clean Markdown from any URL with anti-bot bypass
  • search_engine
    - 谷歌/必应/雅虎的搜索结果页面(Google返回JSON,其他返回Markdown)
  • scrape_as_markdown
    - 从任意URL获取纯净Markdown内容,自动绕过反机器人机制

Advanced Scraping (Pro/advanced_scraping)

高级爬取(专业版/高级爬取模式)

  • search_engine_batch
    - Up to 10 parallel searches
  • scrape_batch
    - Up to 10 URLs in one request
  • scrape_as_html
    - Full HTML response
  • extract
    - AI-powered JSON extraction with custom prompt
  • session_stats
    - Monitor tool usage during session
  • search_engine_batch
    - 最多10个并行搜索请求
  • scrape_batch
    - 单次请求处理最多10个URL
  • scrape_as_html
    - 返回完整HTML响应
  • extract
    - 基于AI的JSON提取,支持自定义提示词
  • session_stats
    - 监控会话期间的工具使用情况

Browser Automation (Pro/browser)

浏览器自动化(专业版/浏览器模式)

For JavaScript-rendered content or user interactions:
ToolDescription
scraping_browser_navigate
Open URL in browser session
scraping_browser_go_back
Navigate back
scraping_browser_go_forward
Navigate forward
scraping_browser_snapshot
Get ARIA snapshot with element refs
scraping_browser_click_ref
Click element by ref
scraping_browser_type_ref
Type into input (optional submit)
scraping_browser_screenshot
Capture page image
scraping_browser_wait_for_ref
Wait for element visibility
scraping_browser_scroll
Scroll to bottom
scraping_browser_scroll_to_ref
Scroll element into view
scraping_browser_get_text
Get page text content
scraping_browser_get_html
Get full HTML
scraping_browser_network_requests
List network requests
适用于JS渲染内容或用户交互场景:
工具描述
scraping_browser_navigate
在浏览器会话中打开URL
scraping_browser_go_back
后退页面
scraping_browser_go_forward
前进页面
scraping_browser_snapshot
获取带元素引用的ARIA快照
scraping_browser_click_ref
通过元素引用点击对应元素
scraping_browser_type_ref
在输入框中输入内容(可选提交)
scraping_browser_screenshot
捕获页面截图
scraping_browser_wait_for_ref
等待元素可见
scraping_browser_scroll
滚动至页面底部
scraping_browser_scroll_to_ref
滚动至指定元素可见
scraping_browser_get_text
获取页面文本内容
scraping_browser_get_html
获取完整HTML内容
scraping_browser_network_requests
列出所有网络请求

Structured Data (Pro)

结构化数据(专业版)

Pre-built extractors for popular platforms:
E-commerce:
  • web_data_amazon_product
    ,
    web_data_amazon_product_reviews
    ,
    web_data_amazon_product_search
  • web_data_walmart_product
    ,
    web_data_walmart_seller
  • web_data_ebay_product
    ,
    web_data_google_shopping
  • web_data_homedepot_products
    ,
    web_data_bestbuy_products
    ,
    web_data_etsy_products
    ,
    web_data_zara_products
Social Media:
  • web_data_linkedin_person_profile
    ,
    web_data_linkedin_company_profile
    ,
    web_data_linkedin_job_listings
    ,
    web_data_linkedin_posts
    ,
    web_data_linkedin_people_search
  • web_data_instagram_profiles
    ,
    web_data_instagram_posts
    ,
    web_data_instagram_reels
    ,
    web_data_instagram_comments
  • web_data_facebook_posts
    ,
    web_data_facebook_marketplace_listings
    ,
    web_data_facebook_company_reviews
    ,
    web_data_facebook_events
  • web_data_tiktok_profiles
    ,
    web_data_tiktok_posts
    ,
    web_data_tiktok_shop
    ,
    web_data_tiktok_comments
  • web_data_x_posts
  • web_data_youtube_videos
    ,
    web_data_youtube_profiles
    ,
    web_data_youtube_comments
  • web_data_reddit_posts
Business & Finance:
  • web_data_google_maps_reviews
    ,
    web_data_crunchbase_company
    ,
    web_data_zoominfo_company_profile
  • web_data_zillow_properties_listing
    ,
    web_data_yahoo_finance_business
Other:
  • web_data_github_repository_file
    ,
    web_data_reuter_news
  • web_data_google_play_store
    ,
    web_data_apple_app_store
  • web_data_booking_hotel_listings
针对主流平台的预构建提取工具:
电商领域:
  • web_data_amazon_product
    ,
    web_data_amazon_product_reviews
    ,
    web_data_amazon_product_search
  • web_data_walmart_product
    ,
    web_data_walmart_seller
  • web_data_ebay_product
    ,
    web_data_google_shopping
  • web_data_homedepot_products
    ,
    web_data_bestbuy_products
    ,
    web_data_etsy_products
    ,
    web_data_zara_products
社交媒体:
  • web_data_linkedin_person_profile
    ,
    web_data_linkedin_company_profile
    ,
    web_data_linkedin_job_listings
    ,
    web_data_linkedin_posts
    ,
    web_data_linkedin_people_search
  • web_data_instagram_profiles
    ,
    web_data_instagram_posts
    ,
    web_data_instagram_reels
    ,
    web_data_instagram_comments
  • web_data_facebook_posts
    ,
    web_data_facebook_marketplace_listings
    ,
    web_data_facebook_company_reviews
    ,
    web_data_facebook_events
  • web_data_tiktok_profiles
    ,
    web_data_tiktok_posts
    ,
    web_data_tiktok_shop
    ,
    web_data_tiktok_comments
  • web_data_x_posts
  • web_data_youtube_videos
    ,
    web_data_youtube_profiles
    ,
    web_data_youtube_comments
  • web_data_reddit_posts
商业与金融:
  • web_data_google_maps_reviews
    ,
    web_data_crunchbase_company
    ,
    web_data_zoominfo_company_profile
  • web_data_zillow_properties_listing
    ,
    web_data_yahoo_finance_business
其他领域:
  • web_data_github_repository_file
    ,
    web_data_reuter_news
  • web_data_google_play_store
    ,
    web_data_apple_app_store
  • web_data_booking_hotel_listings

Workflow Patterns

工作流模式

Basic Research Flow

基础研究流程

  1. Search
    search_engine
    to find relevant URLs
  2. Scrape
    scrape_as_markdown
    to get content
  3. Extract
    extract
    for structured JSON (if needed)
  1. 搜索 → 使用
    search_engine
    找到相关URL
  2. 爬取 → 使用
    scrape_as_markdown
    获取内容
  3. 提取 → 如需结构化数据,使用
    extract
    生成JSON

E-commerce Analysis

电商分析流程

  1. Use
    web_data_amazon_product
    for structured product data
  2. Use
    web_data_amazon_product_reviews
    for review analysis
  3. Flatten nested data for token-efficient processing
  1. 使用
    web_data_amazon_product
    获取结构化产品数据
  2. 使用
    web_data_amazon_product_reviews
    获取评论用于分析
  3. 扁平化嵌套数据以提升令牌处理效率

Social Media Monitoring

社交媒体监控流程

  1. Use platform-specific
    web_data_*
    tools for structured extraction
  2. For unsupported platforms, use
    scrape_as_markdown
    +
    extract
  1. 使用平台专属的
    web_data_*
    工具进行结构化提取
  2. 对于不支持的平台,使用
    scrape_as_markdown
    +
    extract
    组合

Dynamic Site Automation

动态网站自动化流程

  1. scraping_browser_navigate
    → open URL
  2. scraping_browser_snapshot
    → get element refs
  3. scraping_browser_click_ref
    /
    scraping_browser_type_ref
    → interact
  4. scraping_browser_screenshot
    → capture results
  1. scraping_browser_navigate
    → 打开目标URL
  2. scraping_browser_snapshot
    → 获取元素引用
  3. scraping_browser_click_ref
    /
    scraping_browser_type_ref
    → 执行交互操作
  4. scraping_browser_screenshot
    → 捕获操作结果

Environment Variables (Local)

本地环境变量

VariableDescriptionDefault
API_TOKEN
Bright Data API token (required)-
PRO_MODE
Enable all Pro tools
false
GROUPS
Comma-separated tool groups-
TOOLS
Comma-separated individual tools-
RATE_LIMIT
Request rate limit
100/1h
WEB_UNLOCKER_ZONE
Custom zone for scraping
mcp_unlocker
BROWSER_ZONE
Custom zone for browser
mcp_browser
变量描述默认值
API_TOKEN
Bright Data API令牌(必填)-
PRO_MODE
启用所有专业版工具
false
GROUPS
逗号分隔的工具组列表-
TOOLS
逗号分隔的单个工具列表-
RATE_LIMIT
请求速率限制
100/1h
WEB_UNLOCKER_ZONE
爬取自定义区域
mcp_unlocker
BROWSER_ZONE
浏览器自定义区域
mcp_browser

Best Practices

最佳实践

Tool Selection

工具选择

  • Use structured
    web_data_*
    tools when available (faster, more reliable)
  • Fall back to
    scrape_as_markdown
    +
    extract
    for unsupported sites
  • Use browser automation only when JavaScript rendering is required
  • 优先使用预构建的
    web_data_*
    结构化工具(速度更快、更可靠)
  • 对于不支持的网站,使用
    scrape_as_markdown
    +
    extract
    组合
  • 仅在需要JS渲染时使用浏览器自动化工具

Performance

性能优化

  • Batch requests when possible (
    scrape_batch
    ,
    search_engine_batch
    )
  • Set appropriate timeouts (180s recommended for complex sites)
  • Monitor usage with
    session_stats
  • 尽可能使用批量请求(
    scrape_batch
    ,
    search_engine_batch
  • 设置合适的超时时间(复杂网站推荐180秒)
  • 使用
    session_stats
    监控使用情况

Security

安全建议

  • Treat scraped content as untrusted data
  • Filter and validate before passing to LLMs
  • Use structured extraction over raw text when possible
  • 将爬取的内容视为不可信数据
  • 在传递给大语言模型前进行过滤和验证
  • 优先使用结构化提取而非原始文本

Compliance

合规要求

  • Respect robots.txt and terms of service
  • Avoid scraping personal data without consent
  • Use minimal, targeted requests
  • 遵守robots.txt和服务条款
  • 未经同意不得爬取个人数据
  • 使用最小化、针对性的请求

Troubleshooting

故障排除

"spawn npx ENOENT" Error

"spawn npx ENOENT" 错误

Use full Node.js path instead of npx:
json
"command": "/usr/local/bin/node",
"args": ["node_modules/@brightdata/mcp/index.js"]
使用完整Node.js路径替代npx:
json
"command": "/usr/local/bin/node",
"args": ["node_modules/@brightdata/mcp/index.js"]

Timeout Issues

超时问题

  • Increase timeout to 180s in client settings
  • Use specialized
    web_data_*
    tools (often faster)
  • Keep browser automation operations close together
  • 在客户端设置中增加超时时间至180秒
  • 使用专用的
    web_data_*
    工具(通常速度更快)
  • 保持浏览器自动化操作的连贯性

References

参考资料

For detailed documentation, see:
  • references/tools.md - Complete tool reference
  • references/quickstart.md - Setup details
  • references/integrations.md - Client configs
  • references/toon-format.md - Token optimization
  • references/examples.md - Usage examples
如需详细文档,请查看:
  • references/tools.md - 完整工具参考
  • references/quickstart.md - 配置细节
  • references/integrations.md - 客户端配置
  • references/toon-format.md - 令牌优化
  • references/examples.md - 使用示例