
Customer Discovery


Find all customers of a company by scanning multiple public data sources. Produces a deduplicated report with confidence scoring.

Quick Start


Example prompts:

  • Find all customers of Datadog
  • Who are Notion's customers? Use deep mode.

Inputs


| Input | Required | Default | Description |
|-------|----------|---------|-------------|
| Company name | Yes | | The company to research |
| Website URL | No | Auto-detected | The company's website URL |
| Depth | No | standard | `quick`, `standard`, or `deep` |

Procedure


Step 1: Gather Inputs


Ask the user for:
  1. Company name (required)
  2. Company website URL (optional — if not provided, WebSearch for it)
  3. Depth tier — present these options, default to Standard:
    • Quick (~2-3 min): Website logos, case studies, G2 reviews, press search
    • Standard (~5-8 min): Quick + blog posts, Wayback Machine, LinkedIn, Twitter, Reddit, HN, job postings, YouTube
    • Deep (~10-15 min): Standard + SEC filings, podcasts, GitHub, integration directories, BuiltWith, Crunchbase
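The tier options above can be sketched as a small lookup table (the time estimates and source names are copied from the bullets; the dict layout and key names are illustrative, not part of the skill's API):

```python
# Depth tiers: estimated minutes and the sources each tier ADDS.
# Each tier includes all sources from the tiers before it.
DEPTH_TIERS = {
    "quick": {
        "est_minutes": (2, 3),
        "adds": ["logo_wall", "case_studies", "g2_reviews", "press_search"],
    },
    "standard": {
        "est_minutes": (5, 8),
        "adds": ["blog_posts", "wayback_machine", "linkedin", "twitter",
                 "reddit", "hn", "job_postings", "youtube"],
    },
    "deep": {
        "est_minutes": (10, 15),
        "adds": ["sec_filings", "podcasts", "github",
                 "integration_directories", "builtwith", "crunchbase"],
    },
}
DEFAULT_TIER = "standard"
```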

Step 2: Create Output Directory


```bash
mkdir -p customer-discovery-[company-slug]
```

Step 3: Run Sources for Selected Tier


Collect all results into a running list. For each customer found, record:
  • name: Company name
  • confidence: high / medium / low
  • source_type: e.g., "logo_wall", "case_study", "g2_review", "press", "job_posting"
  • evidence_url: URL where the evidence was found
  • notes: Brief description of the evidence
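As a minimal sketch, each entry in the running list could be a small dataclass (the field names follow the list above; the dataclass itself and the sample values are illustrative):

```python
from dataclasses import dataclass

@dataclass
class CustomerRecord:
    """One piece of evidence that a company is a customer."""
    name: str           # company name
    confidence: str     # "high" / "medium" / "low"
    source_type: str    # e.g. "logo_wall", "case_study", "g2_review"
    evidence_url: str   # URL where the evidence was found
    notes: str = ""     # brief description of the evidence

# Hypothetical example record (URL is a placeholder):
record = CustomerRecord(
    name="Shopify",
    confidence="high",
    source_type="case_study",
    evidence_url="https://example.com/customers/shopify",
    notes="Dedicated case study page",
)
```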

Quick Sources


1. **Website logo wall**
   Run the scrape_website_logos.py script:
   ```bash
   python3 skills/capabilities/customer-discovery/scripts/scrape_website_logos.py \
     --url "[company-url]" --output json
   ```
   Parse the JSON output and add each result to the customer list.
2. **Case studies page**
   Use WebFetch on the company's case studies page (try `/case-studies`, `/customers`, `/resources/case-studies`). Extract customer names from page headings and content.
3. **G2/Capterra reviews**
   First, WebSearch for the company's G2 page: `site:g2.com "[company]"`. Then, if the `review-scraper` skill is available, use it to find reviewer companies:
   ```bash
   python3 skills/capabilities/review-scraper/scripts/scrape_reviews.py \
     --platform g2 --url "[g2-product-url]" --max-reviews 50 --output json
   ```
   Extract reviewer company names from review author info.
4. **Web search for press**
   WebSearch these queries and extract customer mentions from results:
   • `"[company]" customer OR "case study" OR partnership`
   • `"[company]" "we use" OR "switched to" OR "chose"`

Standard Sources (in addition to Quick)


5. **Company blog posts**
   WebSearch: `site:[company-domain] customer OR "case study" OR partnership OR "customer story"`
6. **Wayback Machine logos**
   Run the scrape_wayback_logos.py script:
   ```bash
   python3 skills/capabilities/customer-discovery/scripts/scrape_wayback_logos.py \
     --url "[company-url]" --output json
   ```
   Logos marked `still_present: false` are especially interesting — they indicate former customers.
7. **Founder/exec LinkedIn posts**
   WebSearch: `site:linkedin.com "[company]" customer OR "excited to announce" OR "welcome"`
8. **Twitter/X mentions**
   WebSearch: `site:twitter.com "[company]" "we use" OR "just switched to" OR "loving"`
9. **Reddit/HN mentions**
   WebSearch these queries:
   • `site:reddit.com "we use [company]" OR "[company] customer"`
   • `site:news.ycombinator.com "[company]" customer OR user`
10. **Job postings**
    WebSearch: `"experience with [company]" site:linkedin.com/jobs OR site:greenhouse.io OR site:lever.co`
    Companies requiring experience with the product are likely customers.
11. **YouTube testimonials**
    WebSearch: `site:youtube.com "[company]" customer OR testimonial OR review`

Deep Sources (in addition to Standard)


12. **SEC filings**
    WebSearch: `site:sec.gov "[company]"` — look for mentions in 10-K and 10-Q filings.
13. **Podcast transcripts**
    WebSearch: `"[company]" podcast customer OR transcript OR interview`
14. **GitHub usage signals**
    WebSearch: `site:github.com "[company-package-name]"` to find references in dependency files (package.json, requirements.txt, etc.).
15. **Integration directories**
    WebFetch marketplace pages where the company lists integrations:
    • Salesforce AppExchange
    • Zapier integrations page
    • Slack App Directory
    • Any marketplace relevant to the company
16. **BuiltWith detection**
    ```bash
    python3 skills/capabilities/customer-discovery/scripts/search_builtwith.py \
      --technology "[company-slug]" --max-results 50 --output json
    ```
17. **Crunchbase**
    WebSearch: `site:crunchbase.com "[company]" customers OR partners`

Step 4: Deduplicate Results


Merge results by company name using fuzzy matching:
  • Normalize: lowercase, strip suffixes (Inc, Corp, LLC, Ltd, Co., GmbH)
  • Treat "Acme Inc" = "Acme" = "ACME Corp" = "acme.com" as the same company
  • When merging, keep the highest confidence level and all evidence URLs
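The merge rules above can be sketched as follows. This is a minimal illustration: it normalizes names exactly as the bullets describe (lowercase, strip suffixes, fold `acme.com` to `acme`) and merges on the normalized key; a production version might add real fuzzy matching. Function names and the record shape are hypothetical:

```python
import re

SUFFIXES = {"inc", "corp", "llc", "ltd", "co", "gmbh"}
CONF_RANK = {"low": 0, "medium": 1, "high": 2}

def normalize(name: str) -> str:
    """Lowercase, drop a trailing domain TLD, strip corporate suffixes."""
    name = name.lower().strip()
    name = re.sub(r"\.(com|io|ai|net|org)$", "", name)  # "acme.com" -> "acme"
    words = [w for w in re.findall(r"[a-z0-9]+", name) if w not in SUFFIXES]
    return " ".join(words)

def merge(records: list[dict]) -> dict[str, dict]:
    """Merge records by normalized name; keep the highest confidence
    and accumulate every evidence URL."""
    merged: dict[str, dict] = {}
    for r in records:
        key = normalize(r["name"])
        if key not in merged:
            merged[key] = {**r, "evidence_urls": [r["evidence_url"]]}
        else:
            m = merged[key]
            m["evidence_urls"].append(r["evidence_url"])
            if CONF_RANK[r["confidence"]] > CONF_RANK[m["confidence"]]:
                m["confidence"] = r["confidence"]
    return merged
```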

Step 5: Assign Confidence


Apply these rules:
High confidence:
  • Logo on current website (from scrape_website_logos.py with confidence "high")
  • Published case study or customer story
  • Direct quote or testimonial on the company's site
  • Official partnership page listing
Medium confidence:
  • G2/Capterra review (reviewer's company)
  • Press article mentioning customer relationship
  • Job posting requiring experience with the product
  • YouTube testimonial or video review
  • Logo found only in Wayback Machine (was on site, now removed)
Low confidence:
  • Single social media mention (tweet, Reddit post)
  • Indirect reference ("heard good things about X")
  • BuiltWith detection only (technology on site doesn't mean they're a paying customer)
  • HN discussion mention
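A minimal sketch of these rules as a default mapping from source type to confidence (the keys follow the `source_type` examples from Step 3; flattening the rules into a lookup table is a simplification, and any type not listed falls back to low):

```python
# Default confidence per source type, following the rules above.
CONFIDENCE_BY_SOURCE = {
    # High: direct, first-party evidence
    "logo_wall": "high",
    "case_study": "high",
    "testimonial": "high",
    "partnership_page": "high",
    # Medium: third-party but specific evidence
    "g2_review": "medium",
    "press": "medium",
    "job_posting": "medium",
    "youtube": "medium",
    "wayback_logo": "medium",
    # Low: indirect or single-signal evidence
    "social_mention": "low",
    "builtwith": "low",
    "hn_mention": "low",
}

def assign_confidence(source_type: str) -> str:
    """Unknown source types default to 'low'."""
    return CONFIDENCE_BY_SOURCE.get(source_type, "low")
```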

Step 6: Generate Report

Create two output files.

**`customer-discovery-[company]/report.md`:**

```markdown
# Customer Discovery: [Company Name]

Date: YYYY-MM-DD
Depth: quick | standard | deep
Total customers found: N

## High Confidence (N)

| Customer | Source     | Evidence |
|----------|------------|----------|
| Shopify  | Case study | [link]   |
| ...      | ...        | ...      |

## Medium Confidence (N)

| Customer | Source | Evidence |
|----------|--------|----------|
| ...      | ...    | ...      |

## Low Confidence (N)

| Customer | Source | Evidence |
|----------|--------|----------|
| ...      | ...    | ...      |

## Sources Scanned

- Website logo wall: [url] — N customers found
- G2 reviews: N reviews analyzed — N companies identified
- Wayback Machine: N snapshots checked — N logos found (N removed)
- Web search: N queries — N mentions
- ...

## Methodology

This report was generated using the customer-discovery skill, which scans
public data sources to identify companies that use [Company Name]. Confidence
levels reflect the strength and directness of the evidence found.
```

**`customer-discovery-[company]/customers.csv`:**

CSV with columns: `company_name,confidence,source_type,evidence_url,notes`

Write the CSV using a code block or Python script.
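The CSV can be written with the standard csv module; a minimal sketch (column names come from the spec above, the sample record and output path are illustrative — the real file goes in the `customer-discovery-[company]/` directory):

```python
import csv

COLUMNS = ["company_name", "confidence", "source_type", "evidence_url", "notes"]

# Hypothetical single-record customer list:
customers = [
    {"company_name": "Shopify", "confidence": "high",
     "source_type": "case_study",
     "evidence_url": "https://example.com/customers/shopify",
     "notes": "Dedicated case study page"},
]

# Writes to the current directory for illustration.
with open("customers.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(customers)
```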

Scripts Reference


| Script | Purpose | Key flags |
|--------|---------|-----------|
| scrape_website_logos.py | Extract logos from current website | `--url`, `--output json\|summary` |
| scrape_wayback_logos.py | Find historical logos via Wayback Machine | `--url`, `--paths`, `--output json\|summary` |
| search_builtwith.py | BuiltWith technology detection (deep mode) | `--technology`, `--max-results`, `--output json\|summary` |

All scripts require `requests`: `pip3 install requests`

External skill scripts (use if available):
  • `skills/capabilities/review-scraper/scripts/scrape_reviews.py` — G2/Capterra/Trustpilot reviews (requires Apify token)
  • `skills/capabilities/linkedin-post-research/scripts/search_posts.py` — LinkedIn post search (requires Crustdata API key)

Cost


  • Quick / Standard: Free (uses WebSearch + free APIs like the Wayback Machine CDX API)
  • Deep: Mostly free. The BuiltWith paid API is optional (`--api-key` flag); free scraping is used by default.
  • External skills (review-scraper, linkedin-post-research) may require paid API tokens.