apify-ultimate-scraper
Universal Web Scraper
AI-driven data extraction from 55+ Actors across all major platforms. This skill automatically selects the best Actor for your task.
Prerequisites
(No need to check it upfront)
- `.env` file with `APIFY_TOKEN`
- Node.js 20.6+ (for native `--env-file` support)
- `mcpc` CLI tool: `npm install -g @apify/mcpc`
Workflow
Copy this checklist and track progress:
Task Progress:
- [ ] Step 1: Understand user goal and select Actor
- [ ] Step 2: Fetch Actor schema via mcpc
- [ ] Step 3: Ask user preferences (format, filename)
- [ ] Step 4: Run the scraper script
- [ ] Step 5: Summarize results and offer follow-ups
Step 1: Understand User Goal and Select Actor
First, understand what the user wants to achieve. Then select the best Actor from the options below.
Instagram Actors (12)
| Actor ID | Best For |
|---|---|
| | Profile data, follower counts, bio info |
| | Individual post details, engagement metrics |
| | Comment extraction, sentiment analysis |
| | Hashtag content, trending topics |
| | Hashtag performance metrics |
| | Reels content and metrics |
| | Search users, places, hashtags |
| | Posts tagged with specific accounts |
| | Follower count tracking |
| | Comprehensive Instagram data |
| | API-based Instagram access |
| | Bulk comment/post export |
Facebook Actors (14)
| Actor ID | Best For |
|---|---|
| | Page data, metrics, contact info |
| | Emails, phones, addresses from pages |
| | Post content and engagement |
| | Comment extraction |
| | Reaction analysis |
| | Page reviews |
| | Group content and members |
| | Event data |
| | Ad creative and targeting |
| | Search results |
| | Reels content |
| | Photo extraction |
| | Marketplace listings |
| | Follower/following lists |
TikTok Actors (14)
| Actor ID | Best For |
|---|---|
| | Comprehensive TikTok data |
| | Free TikTok extraction |
| | Profile data |
| | Video details and metrics |
| | Comment extraction |
| | Follower lists |
| | Find users by keywords |
| | Hashtag content |
| | Trending sounds |
| | Ad content |
| | Discover page content |
| | Explore content |
| | Trending content |
| | Live stream data |
YouTube Actors (5)
| Actor ID | Best For |
|---|---|
| | Video data and metrics |
| | Channel information |
| | Comment extraction |
| | Shorts content |
| | Videos by hashtag |
Google Maps Actors (4)
| Actor ID | Best For |
|---|---|
| | Business listings, ratings, contact info |
| | Detailed business data |
| | Review extraction |
| | Email discovery from listings |
Other Actors (6)
| Actor ID | Best For |
|---|---|
| | Google search results |
| | Google Trends data |
| | Booking.com hotel data |
| | Booking.com reviews |
| | TripAdvisor reviews |
| | Contact enrichment from URLs |
Actor Selection by Use Case
| Use Case | Primary Actors |
|---|---|
| Lead Generation | |
| Influencer Discovery | |
| Brand Monitoring | |
| Competitor Analysis | |
| Content Analytics | |
| Trend Research | |
| Review Analysis | |
| Audience Analysis | |
Multi-Actor Workflows
For complex tasks, chain multiple Actors:
| Workflow | Step 1 | Step 2 |
|---|---|---|
| Lead enrichment | | |
| Influencer vetting | | |
| Competitor deep-dive | | |
| Local business analysis | | |
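Chaining two Actors usually comes down to turning the first run's results into the second run's input. The sketch below shows one way to do that, assuming the results are an array of objects. The function names, the `website` field, and the `urls` input key are hypothetical; the real names depend on the schemas of the Actors being chained.

```javascript
// Collect the unique, non-empty string values of one field from step-1 results.
function collectField(items, field) {
  const seen = new Set();
  for (const item of items) {
    const value = item[field];
    if (typeof value === "string" && value.trim() !== "") {
      seen.add(value.trim());
    }
  }
  return [...seen];
}

// Build the JSON input string for the step-2 Actor.
// `inputKey` is whatever the second Actor's schema calls its list parameter (assumption).
function buildStepTwoInput(items, field, inputKey) {
  return JSON.stringify({ [inputKey]: collectField(items, field) });
}

// Hypothetical usage: feed websites found in step 1 into a step-2 Actor.
const stepOneResults = [
  { name: "Cafe A", website: "https://a.example" },
  { name: "Cafe B", website: "https://a.example" },
  { name: "Cafe C" },
];
console.log(buildStepTwoInput(stepOneResults, "website", "urls"));
```

Deduplicating before the second run keeps the step-2 input small, which matters when each input item is billed or rate-limited.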
Can't Find a Suitable Actor?
If none of the Actors above match the user's request, search the Apify Store directly:
```bash
export $(grep APIFY_TOKEN .env | xargs) && mcpc --json mcp.apify.com --header "Authorization: Bearer $APIFY_TOKEN" tools-call search-actors keywords:="SEARCH_KEYWORDS" limit:=10 offset:=0 category:="" | jq -r '.content[0].text'
```
Replace `SEARCH_KEYWORDS` with 1-3 simple terms (e.g., "LinkedIn profiles", "Amazon products", "Twitter").
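When the search is launched from a Node.js script rather than typed into a shell, the same arguments can be assembled as an array, which avoids quoting problems with multi-word keywords. This is a minimal sketch: `buildSearchArgs` is a hypothetical helper, and it assumes `mcpc` accepts the arguments exactly as the shell would pass them after removing the quotes in the command above.

```javascript
// Build the argument list for `mcpc ... tools-call search-actors`.
// The argument syntax mirrors the shell command; the helper itself is illustrative.
function buildSearchArgs(keywords, token, limit = 10, offset = 0) {
  if (typeof keywords !== "string" || keywords.trim() === "") {
    throw new Error("keywords must be a non-empty string");
  }
  return [
    "--json",
    "mcp.apify.com",
    "--header",
    `Authorization: Bearer ${token}`,
    "tools-call",
    "search-actors",
    `keywords:=${keywords.trim()}`,
    `limit:=${limit}`,
    `offset:=${offset}`,
    "category:=", // empty category, as in the shell command
  ];
}

// Example: the token would normally come from process.env.APIFY_TOKEN.
console.log(buildSearchArgs("LinkedIn profiles", "your_token").join(" "));
```

The resulting array could be handed to `child_process.execFile("mcpc", args)`, which runs the command without passing it through a shell.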
Step 2: Fetch Actor Schema
Fetch the Actor's input schema and details dynamically using mcpc:
```bash
export $(grep APIFY_TOKEN .env | xargs) && mcpc --json mcp.apify.com --header "Authorization: Bearer $APIFY_TOKEN" tools-call fetch-actor-details actor:="ACTOR_ID" | jq -r ".content"
```
Replace `ACTOR_ID` with the selected Actor (e.g., `compass/crawler-google-places`).
This returns:
- Actor description and README
- Required and optional input parameters
- Output fields (if available)
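Once the required and optional parameters are known, it can help to check a draft input against them before starting a run. The sketch below assumes the input schema is available as a JSON-Schema-like object with `properties` and `required`; that shape, the `checkInput` helper, and the `queries` and `maxItems` parameter names are assumptions for illustration, not something the fetched details are guaranteed to match.

```javascript
// Compare a draft Actor input against a schema of the assumed shape
// { properties: {...}, required: [...] } and report problems.
function checkInput(schema, input) {
  const required = schema.required ?? [];
  const known = Object.keys(schema.properties ?? {});
  return {
    // Required parameters the draft input does not provide.
    missing: required.filter((key) => !(key in input)),
    // Parameters in the draft input that the schema does not list.
    unknown: Object.keys(input).filter((key) => !known.includes(key)),
  };
}

// Hypothetical schema and draft input.
const schema = {
  properties: { queries: { type: "array" }, maxItems: { type: "integer" } },
  required: ["queries"],
};
console.log(checkInput(schema, { maxItems: 50, language: "en" }));
```

A non-empty `missing` list is worth resolving with the user before Step 4; an `unknown` entry often points to a misspelled parameter name.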
Step 3: Ask User Preferences
Before running, ask:
- Output format:
  - Quick answer - Display the top few results in chat (no file saved)
  - CSV - Full export with all fields
  - JSON - Full export in JSON format
- Number of results: Choose based on the nature of the use case
Step 4: Run the Script
Quick answer (display in chat, no file):
```bash
node --env-file=.env ${CLAUDE_PLUGIN_ROOT}/reference/scripts/run_actor.js \
  --actor "ACTOR_ID" \
  --input 'JSON_INPUT'
```
CSV:
```bash
node --env-file=.env ${CLAUDE_PLUGIN_ROOT}/reference/scripts/run_actor.js \
  --actor "ACTOR_ID" \
  --input 'JSON_INPUT' \
  --output YYYY-MM-DD_OUTPUT_FILE.csv \
  --format csv
```
JSON:
```bash
node --env-file=.env ${CLAUDE_PLUGIN_ROOT}/reference/scripts/run_actor.js \
  --actor "ACTOR_ID" \
  --input 'JSON_INPUT' \
  --output YYYY-MM-DD_OUTPUT_FILE.json \
  --format json
```
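The three variants differ only in whether `--output` and `--format` are present. As a rough sketch, the flags for `run_actor.js` could be derived from the user's Step 3 choice as follows. `buildRunArgs` and its `baseName` option are hypothetical; only the flag names and the `YYYY-MM-DD_` filename prefix come from the commands above.

```javascript
// Build the flags for run_actor.js from the chosen output format.
// format: "quick" (no file), "csv", or "json".
function buildRunArgs({ actor, input, format = "quick", baseName = "results", date = new Date() }) {
  // JSON.stringify produces the JSON_INPUT string without manual quoting.
  const args = ["--actor", actor, "--input", JSON.stringify(input)];
  if (format === "quick") {
    return args; // quick answer: display in chat, no file saved
  }
  if (format !== "csv" && format !== "json") {
    throw new Error(`Unsupported format: ${format}`);
  }
  // toISOString() is UTC, so the date prefix is the UTC calendar day.
  const day = date.toISOString().slice(0, 10);
  return [...args, "--output", `${day}_${baseName}.${format}`, "--format", format];
}

// Hypothetical input; real parameter names come from the Actor schema in Step 2.
console.log(
  buildRunArgs({
    actor: "compass/crawler-google-places",
    input: { maxItems: 20 },
    format: "csv",
    baseName: "coffee_shops",
  })
);
```

Passing the input as an object and serializing it in one place avoids hand-escaping quotes inside `'JSON_INPUT'`.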
Step 5: Summarize Results and Offer Follow-ups
After completion, report:
- Number of results found
- File location and name
- Key fields available
- Suggested follow-up workflows based on results:
| If User Got | Suggest Next |
|---|---|
| Business listings | Enrich with |
| Influencer profiles | Analyze engagement with comment scrapers |
| Competitor pages | Deep-dive with post/ad scrapers |
| Trend data | Validate with platform-specific hashtag scrapers |
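The first three items of the report can be computed directly from the results. The sketch below assumes the results have already been parsed into an array of objects (for example from the JSON export); `summarize` is a hypothetical helper, and ranking fields by how often they appear is just one reasonable way to pick "key fields".

```javascript
// Summarize a result set: item count, output file, and fields ordered by
// how many items contain them (most common first).
function summarize(items, filePath = null) {
  const fieldCounts = new Map();
  for (const item of items) {
    for (const key of Object.keys(item)) {
      fieldCounts.set(key, (fieldCounts.get(key) ?? 0) + 1);
    }
  }
  const keyFields = [...fieldCounts.entries()]
    .sort((a, b) => b[1] - a[1]) // stable sort keeps first-seen order on ties
    .map(([key]) => key);
  return { count: items.length, file: filePath, keyFields };
}

// Hypothetical results and filename.
const results = [
  { title: "Cafe A", url: "https://a.example", rating: 4.5 },
  { title: "Cafe B", url: "https://b.example" },
];
console.log(summarize(results, "2024-03-05_coffee_shops.json"));
```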
Error Handling
- `APIFY_TOKEN not found`: create a `.env` file containing `APIFY_TOKEN=your_token`
- `mcpc not found`: install it with `npm install -g @apify/mcpc`
- `Actor not found`
- `Run FAILED`
- `Timeout`: adjust `--timeout`