
Customer Discovery


Find all customers of a company by scanning multiple public data sources. Produces a deduplicated report with confidence scoring.

Quick Start


Example prompts:

  • Find all customers of Datadog
  • Who are Notion's customers? Use deep mode.

Inputs


| Input | Required | Default | Description |
|-------|----------|---------|-------------|
| Company name | Yes | | The company to research |
| Website URL | No | Auto-detected | The company's website URL |
| Depth | No | standard | `quick`, `standard`, or `deep` |

Procedure


Step 1: Gather Inputs


Ask the user for:
  1. Company name (required)
  2. Company website URL (optional — if not provided, WebSearch for it)
  3. Depth tier — present these options, default to Standard:
    • Quick (~2-3 min): Website logos, case studies, G2 reviews, press search
    • Standard (~5-8 min): Quick + blog posts, Wayback Machine, LinkedIn, Twitter, Reddit, HN, job postings, YouTube
    • Deep (~10-15 min): Standard + SEC filings, podcasts, GitHub, integration directories, BuiltWith, Crunchbase
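The tier options above can be sketched as a small lookup table (the time estimates and source names are copied from the bullets; the dict layout and key names are illustrative, not part of the skill's API):

```python
# Depth tiers: estimated minutes and the sources each tier ADDS.
# Each tier includes all sources from the tiers before it.
DEPTH_TIERS = {
    "quick": {
        "est_minutes": (2, 3),
        "adds": ["logo_wall", "case_studies", "g2_reviews", "press_search"],
    },
    "standard": {
        "est_minutes": (5, 8),
        "adds": ["blog_posts", "wayback_machine", "linkedin", "twitter",
                 "reddit", "hn", "job_postings", "youtube"],
    },
    "deep": {
        "est_minutes": (10, 15),
        "adds": ["sec_filings", "podcasts", "github",
                 "integration_directories", "builtwith", "crunchbase"],
    },
}
DEFAULT_TIER = "standard"
```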

Step 2: Create Output Directory


```bash
mkdir -p customer-discovery-[company-slug]
```

Step 3: Run Sources for Selected Tier


Collect all results into a running list. For each customer found, record:
  • name: Company name
  • confidence: high / medium / low
  • source_type: e.g., "logo_wall", "case_study", "g2_review", "press", "job_posting"
  • evidence_url: URL where the evidence was found
  • notes: Brief description of the evidence
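As a minimal sketch, each entry in the running list could be a small dataclass (the field names follow the list above; the dataclass itself and the sample values are illustrative):

```python
from dataclasses import dataclass

@dataclass
class CustomerRecord:
    """One piece of evidence that a company is a customer."""
    name: str           # company name
    confidence: str     # "high" / "medium" / "low"
    source_type: str    # e.g. "logo_wall", "case_study", "g2_review"
    evidence_url: str   # URL where the evidence was found
    notes: str = ""     # brief description of the evidence

# Hypothetical example record (URL is a placeholder):
record = CustomerRecord(
    name="Shopify",
    confidence="high",
    source_type="case_study",
    evidence_url="https://example.com/customers/shopify",
    notes="Dedicated case study page",
)
```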

Quick Sources


1. **Website logo wall**
   Run the scrape_website_logos.py script:
   ```bash
   python3 skills/capabilities/customer-discovery/scripts/scrape_website_logos.py \
     --url "[company-url]" --output json
   ```
   Parse the JSON output and add each result to the customer list.
2. **Case studies page**
   Use WebFetch on the company's case studies page (try `/case-studies`, `/customers`, `/resources/case-studies`). Extract customer names from page headings and content.
3. **G2/Capterra reviews**
   First, WebSearch for the company's G2 page: `site:g2.com "[company]"`. Then, if the `review-scraper` skill is available, use it to find reviewer companies:
   ```bash
   python3 skills/capabilities/review-scraper/scripts/scrape_reviews.py \
     --platform g2 --url "[g2-product-url]" --max-reviews 50 --output json
   ```
   Extract reviewer company names from review author info.
4. **Web search for press**
   WebSearch these queries and extract customer mentions from results:
   • `"[company]" customer OR "case study" OR partnership`
   • `"[company]" "we use" OR "switched to" OR "chose"`

Standard Sources (in addition to Quick)


5. **Company blog posts**
   WebSearch: `site:[company-domain] customer OR "case study" OR partnership OR "customer story"`
6. **Wayback Machine logos**
   Run the scrape_wayback_logos.py script:
   ```bash
   python3 skills/capabilities/customer-discovery/scripts/scrape_wayback_logos.py \
     --url "[company-url]" --output json
   ```
   Logos marked `still_present: false` are especially interesting — they indicate former customers.
7. **Founder/exec LinkedIn posts**
   WebSearch: `site:linkedin.com "[company]" customer OR "excited to announce" OR "welcome"`
8. **Twitter/X mentions**
   WebSearch: `site:twitter.com "[company]" "we use" OR "just switched to" OR "loving"`
9. **Reddit/HN mentions**
   WebSearch these queries:
   • `site:reddit.com "we use [company]" OR "[company] customer"`
   • `site:news.ycombinator.com "[company]" customer OR user`
10. **Job postings**
    WebSearch: `"experience with [company]" site:linkedin.com/jobs OR site:greenhouse.io OR site:lever.co`
    Companies requiring experience with the product are likely customers.
11. **YouTube testimonials**
    WebSearch: `site:youtube.com "[company]" customer OR testimonial OR review`

Deep Sources (in addition to Standard)


12. **SEC filings**
    WebSearch: `site:sec.gov "[company]"` — look for mentions in 10-K and 10-Q filings.
13. **Podcast transcripts**
    WebSearch: `"[company]" podcast customer OR transcript OR interview`
14. **GitHub usage signals**
    WebSearch: `site:github.com "[company-package-name]"` to find references in dependency files (package.json, requirements.txt, etc.).
15. **Integration directories**
    WebFetch marketplace pages where the company lists integrations:
    • Salesforce AppExchange
    • Zapier integrations page
    • Slack App Directory
    • Any marketplace relevant to the company
16. **BuiltWith detection**
    ```bash
    python3 skills/capabilities/customer-discovery/scripts/search_builtwith.py \
      --technology "[company-slug]" --max-results 50 --output json
    ```
17. **Crunchbase**
    WebSearch: `site:crunchbase.com "[company]" customers OR partners`

Step 4: Deduplicate Results


Merge results by company name using fuzzy matching:
  • Normalize: lowercase, strip suffixes (Inc, Corp, LLC, Ltd, Co., GmbH)
  • Treat "Acme Inc" = "Acme" = "ACME Corp" = "acme.com" as the same company
  • When merging, keep the highest confidence level and all evidence URLs
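The merge rules above can be sketched as follows. This is a minimal illustration: it normalizes names exactly as the bullets describe (lowercase, strip suffixes, fold `acme.com` to `acme`) and merges on the normalized key; a production version might add real fuzzy matching. Function names and the record shape are hypothetical:

```python
import re

SUFFIXES = {"inc", "corp", "llc", "ltd", "co", "gmbh"}
CONF_RANK = {"low": 0, "medium": 1, "high": 2}

def normalize(name: str) -> str:
    """Lowercase, drop a trailing domain TLD, strip corporate suffixes."""
    name = name.lower().strip()
    name = re.sub(r"\.(com|io|ai|net|org)$", "", name)  # "acme.com" -> "acme"
    words = [w for w in re.findall(r"[a-z0-9]+", name) if w not in SUFFIXES]
    return " ".join(words)

def merge(records: list[dict]) -> dict[str, dict]:
    """Merge records by normalized name; keep the highest confidence
    and accumulate every evidence URL."""
    merged: dict[str, dict] = {}
    for r in records:
        key = normalize(r["name"])
        if key not in merged:
            merged[key] = {**r, "evidence_urls": [r["evidence_url"]]}
        else:
            m = merged[key]
            m["evidence_urls"].append(r["evidence_url"])
            if CONF_RANK[r["confidence"]] > CONF_RANK[m["confidence"]]:
                m["confidence"] = r["confidence"]
    return merged
```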

Step 5: Assign Confidence


Apply these rules:
High confidence:
  • Logo on current website (from scrape_website_logos.py with confidence "high")
  • Published case study or customer story
  • Direct quote or testimonial on the company's site
  • Official partnership page listing
Medium confidence:
  • G2/Capterra review (reviewer's company)
  • Press article mentioning customer relationship
  • Job posting requiring experience with the product
  • YouTube testimonial or video review
  • Logo found only in Wayback Machine (was on site, now removed)
Low confidence:
  • Single social media mention (tweet, Reddit post)
  • Indirect reference ("heard good things about X")
  • BuiltWith detection only (technology on site doesn't mean they're a paying customer)
  • HN discussion mention
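A minimal sketch of these rules as a default mapping from source type to confidence (the keys follow the `source_type` examples from Step 3; flattening the rules into a lookup table is a simplification, and any type not listed falls back to low):

```python
# Default confidence per source type, following the rules above.
CONFIDENCE_BY_SOURCE = {
    # High: direct, first-party evidence
    "logo_wall": "high",
    "case_study": "high",
    "testimonial": "high",
    "partnership_page": "high",
    # Medium: third-party but specific evidence
    "g2_review": "medium",
    "press": "medium",
    "job_posting": "medium",
    "youtube": "medium",
    "wayback_logo": "medium",
    # Low: indirect or single-signal evidence
    "social_mention": "low",
    "builtwith": "low",
    "hn_mention": "low",
}

def assign_confidence(source_type: str) -> str:
    """Unknown source types default to 'low'."""
    return CONFIDENCE_BY_SOURCE.get(source_type, "low")
```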

Step 6: Generate Report

Create two output files.

**`customer-discovery-[company]/report.md`:**

```markdown
# Customer Discovery: [Company Name]

Date: YYYY-MM-DD
Depth: quick | standard | deep
Total customers found: N

## High Confidence (N)

| Customer | Source     | Evidence |
|----------|------------|----------|
| Shopify  | Case study | [link]   |
| ...      | ...        | ...      |

## Medium Confidence (N)

| Customer | Source | Evidence |
|----------|--------|----------|
| ...      | ...    | ...      |

## Low Confidence (N)

| Customer | Source | Evidence |
|----------|--------|----------|
| ...      | ...    | ...      |

## Sources Scanned

- Website logo wall: [url] — N customers found
- G2 reviews: N reviews analyzed — N companies identified
- Wayback Machine: N snapshots checked — N logos found (N removed)
- Web search: N queries — N mentions
- ...

## Methodology

This report was generated using the customer-discovery skill, which scans
public data sources to identify companies that use [Company Name]. Confidence
levels reflect the strength and directness of the evidence found.
```

**`customer-discovery-[company]/customers.csv`:**

CSV with columns: `company_name,confidence,source_type,evidence_url,notes`

Write the CSV using a code block or Python script.
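The CSV can be written with the standard csv module; a minimal sketch (column names come from the spec above, the sample record and output path are illustrative — the real file goes in the `customer-discovery-[company]/` directory):

```python
import csv

COLUMNS = ["company_name", "confidence", "source_type", "evidence_url", "notes"]

# Hypothetical single-record customer list:
customers = [
    {"company_name": "Shopify", "confidence": "high",
     "source_type": "case_study",
     "evidence_url": "https://example.com/customers/shopify",
     "notes": "Dedicated case study page"},
]

# Writes to the current directory for illustration.
with open("customers.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(customers)
```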

Scripts Reference


| Script | Purpose | Key flags |
|--------|---------|-----------|
| scrape_website_logos.py | Extract logos from current website | `--url`, `--output json\|summary` |
| scrape_wayback_logos.py | Find historical logos via Wayback Machine | `--url`, `--paths`, `--output json\|summary` |
| search_builtwith.py | BuiltWith technology detection (deep mode) | `--technology`, `--max-results`, `--output json\|summary` |

All scripts require `requests`: `pip3 install requests`

External skill scripts (use if available):
  • `skills/capabilities/review-scraper/scripts/scrape_reviews.py` — G2/Capterra/Trustpilot reviews (requires Apify token)
  • `skills/capabilities/linkedin-post-research/scripts/search_posts.py` — LinkedIn post search (requires Crustdata API key)

Cost


  • Quick / Standard: Free (uses WebSearch + free APIs like the Wayback Machine CDX API)
  • Deep: Mostly free. The BuiltWith paid API is optional (`--api-key` flag); free scraping is used by default.
  • External skills (review-scraper, linkedin-post-research) may require paid API tokens.