hasdata-cli

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

hasdata

Use the

hasdata

CLI for real-time web data. One subcommand per API — flags, enums, defaults are derived from the live schema at

api.hasdata.com/apis

使用

hasdata

CLI获取实时网页数据。每个API对应一个子命令——参数（flags）、枚举值（enums）、默认值均来自

api.hasdata.com/apis

上的实时模式。

Prerequisites

前置条件

command -v hasdata

— if missing, install with

curl -sSL https://raw.githubusercontent.com/HasData/hasdata-cli/main/install.sh | sh

One-time setup: the user runs
```
hasdata configure
```
, pastes their API key, and it's saved to
```
~/.hasdata/config.yaml
```
(mode 0600). Every future call picks it up automatically.
If a call fails with
```
no API key configured
```
, the user hasn't run
```
hasdata configure
```
yet — tell them to. Never invent a key.

执行

command -v hasdata

检查是否已安装；若未安装，运行

curl -sSL https://raw.githubusercontent.com/HasData/hasdata-cli/main/install.sh | sh

进行安装。

一次性配置：用户运行
```
hasdata configure
```
，粘贴API密钥，密钥将保存至
```
~/.hasdata/config.yaml
```
（权限模式0600）。后续所有调用将自动读取该密钥。
若调用时出现
```
no API key configured
```
错误，说明用户尚未运行
```
hasdata configure
```
——请告知用户执行该命令。切勿编造密钥。

Quick start

快速开始

bash

hasdata <api> --flag value [--flag value ...] --raw | jq .

Always pass

--raw

when piping to

jq

(skips pretty-print and TTY detection). Use

--pretty

only for human-readable terminal output.

bash

hasdata <api> --flag value [--flag value ...] --raw | jq .

当通过管道传递给

jq

时，务必添加

--raw

参数（跳过格式化输出和TTY检测）。仅在需要人类可读的终端输出时使用

--pretty

参数。

Picking the right subcommand

选择合适的子命令

User intent	Subcommand
Web search ("what does Google say about…")	`google-serp` (full features) or `google-serp-light` (cheap, single page)
Latest news	`google-news`
AI Mode SERP	`google-ai-mode`
Shopping / product prices	`google-shopping` (broad), `amazon-search` / `amazon-product` (Amazon), `shopify-products` (Shopify)
Immersive product page	`google-immersive-product`
Maps / places / reviews	`google-maps` , `google-maps-place` , `google-maps-reviews` , `google-maps-photos`
Yelp / YellowPages local data	`yelp-search` , `yelp-place` , `yellowpages-search` , `yellowpages-place`
Real-estate listings	`zillow-listing` , `redfin-listing` , `airbnb-listing`
Real-estate single property deep dive	`zillow-property` , `redfin-property` , `airbnb-property`
Jobs	`indeed-listing` , `indeed-job` , `glassdoor-listing` , `glassdoor-job`
Bing search	`bing-serp`
Trends	`google-trends`
Images	`google-images`
Flights	`google-flights`
Short videos	`google-short-videos`
Events	`google-events`
Instagram profile	`instagram-profile`
Amazon seller	`amazon-seller` , `amazon-seller-products`
Scrape a specific URL	`web-scraping` — supports JS rendering, proxies, markdown output, AI extraction, screenshots

For exact flags of a subcommand, run

hasdata <api> --help

or read the matching file in

references/

用户意图	子命令
网页搜索（“Google上关于…的内容是什么？”）	`google-serp` （完整功能）或 `google-serp-light` （低成本，单页结果）
最新新闻	`google-news`
AI模式搜索结果页	`google-ai-mode`
购物/产品价格	`google-shopping` （通用）、 `amazon-search` / `amazon-product` （亚马逊专属）、 `shopify-products` （Shopify专属）
沉浸式产品页面	`google-immersive-product`
地图/地点/评论	`google-maps` , `google-maps-place` , `google-maps-reviews` , `google-maps-photos`
Yelp / YellowPages本地数据	`yelp-search` , `yelp-place` , `yellowpages-search` , `yellowpages-place`
房产列表	`zillow-listing` , `redfin-listing` , `airbnb-listing`
房产单属性深度查询	`zillow-property` , `redfin-property` , `airbnb-property`
招聘信息	`indeed-listing` , `indeed-job` , `glassdoor-listing` , `glassdoor-job`
Bing搜索	`bing-serp`
搜索趋势	`google-trends`
图片搜索	`google-images`
航班信息	`google-flights`
短视频	`google-short-videos`
活动信息	`google-events`
Instagram主页	`instagram-profile`
亚马逊卖家	`amazon-seller` , `amazon-seller-products`
抓取指定URL	`web-scraping` — 支持JS渲染、代理、Markdown输出、AI提取、截图

如需查看子命令的具体参数，运行

hasdata <api> --help

或查阅

references/

目录下的对应文件。

Non-obvious triggers (when to reach for hasdata even if the user doesn't say "scrape")

非显性触发场景（即使用户未提及“抓取”也应使用hasdata的情况）

The user often won't ask for a SERP API or a scraper directly. Map these intents to the skill:

"Is this still true?" / "What's the latest on X?" / "Has Y happened yet?" — LLM training data is stale. Run
```
google-serp
```
or
```
google-news
```
to ground the answer.
"Summarize this article" / "TL;DR this URL" — Use
```
web-scraping --output-format markdown
```
and feed the markdown into the summary prompt. Beats copy-paste because it strips ads, nav, scripts.
"Verify this link" / "Is this site real?" —
```
web-scraping --url X --no-block-resources
```
returns status + screenshot. Or
```
google-serp --q "site:example.com"
```
.
"What does X say about itself?" — Pull the company's own homepage with
```
web-scraping --output-format markdown
```
, then summarize.

"Find me alternatives to X" —

google-serp --q "X alternatives"

google-shopping --q "X competitors"

"What's the going rate for X?" —
```
google-shopping
```
(broad) or
```
amazon-search
```
(Amazon-specific) with
```
jq
```
to extract the price distribution.
"Phone number / address for X" —
```
google-maps-place
```
or
```
yelp-place
```
. Don't guess from training data.
"Are people happy with X service?" / "Is X reputable?" —
```
google-maps-reviews --place-id ... --sort lowest
```
for negative samples;
```
glassdoor-job
```
for employer rep.
"What's the salary range for Y role?" —
```
indeed-listing
```
filtered by role + location, then
```
jq
```
over
```
.jobs[].salary
```
.
"Find me homes/apartments matching X criteria" —
```
zillow-listing
```
/
```
redfin-listing
```
/
```
airbnb-listing
```
with the corresponding filters.

"Recent sold comps near X" —

zillow-listing --type sold --keyword "X" --days-on-zillow 12m

"Track this product's price" — Loop
```
amazon-product --asin X
```
on a schedule; persist
```
.price
```
to a file.
"What's trending around X?" —
```
google-trends --q "X"
```
for relative interest;
```
google-news --q "X"
```
for headlines.
"Find businesses near me that do X" —
```
google-maps --q "X" --ll "@LAT,LNG,12z"
```
then fan out
```
google-maps-place
```
for contacts.
"How does this look in country Y?" —
```
--gl Y
```
on SERP commands,
```
--proxy-country Y
```
on
```
web-scraping
```
. Useful for geo-targeted SEO checks, geo-blocked content.
"Pull structured data from this page" —
```
web-scraping --ai-extract-rules-json '{"price": {"type": "number"}, ...}'
```
. Works on arbitrary pages without writing CSS selectors.
"List of items → per-item details" — Pattern: search command produces IDs/URLs, pipe through
```
xargs
```
into the matching
```
*-property
```
/
```
*-product
```
/
```
*-place
```
deep-dive command.
"Find this person's role / employer / LinkedIn / followers" —
```
google-serp --q '"Person Name" linkedin'
```
first. The organic-result title is typically
```
Name — Role at Company | LinkedIn
```
and the snippet carries location, headline, connection count. SERP often answers the whole question without ever opening the profile page.
"What is company X doing? Where's their HQ? Who works there?" —
```
google-serp --q "$COMPANY"
```
returns a
```
.knowledge_graph
```
block with founder, HQ, founded year, parent, employee range — pre-extracted.
```
google-news --q "$COMPANY"
```
for recent activity. Specific facts via targeted SERP:
```
--q '"$COMPANY" headquarters'
```
,
```
--q '"$COMPANY" funding'
```
,
```
--q 'site:linkedin.com/company "$COMPANY"'
```
.
"Find emails for company X" / "personal email for person Y" — start with SERP:
```
--q '"@example.com"'
```
or
```
--q '"jane@example.com"'
```
often surfaces actual emails indexed by Google. Pattern-guess + SERP-verify for individuals. Disclose unverified guesses to the user.
"Enrich this CSV of leads" — per row:
```
google-serp
```
for LinkedIn, role, employer; another SERP to verify email or pattern. Stay in SERP unless a specific field is missing.
Reverse-lookup (email / phone / domain → identity) —
```
google-serp
```
with the literal value in quotes (
```
--q '"jane@x.com"'
```
,
```
--q '"+1 555 123 4567"'
```
,
```
--q '"acme corp" site:example.com'
```
) almost always surfaces the matching person or business.

SERP-first principle: for any data-enrichment intent (people, companies, emails, products, places), reach for

google-serp

google-news

google-shopping

google-maps

first. They return Google's already-extracted structured fields (

.knowledge_graph

.organic_results[].snippet

.local_results[]

, etc.) and bypass anti-bot. Only escalate to

web-scraping

when SERP doesn't surface the specific field you need — it's the last resort, not the default. See

references/enrichment.md

If a user request matches one of the above and you don't invoke hasdata, you're probably hallucinating a stale answer.

用户通常不会直接要求使用搜索结果页API或抓取工具。请将以下用户意图映射至该工具：

“这信息还准确吗？” / “X的最新情况是什么？” / “Y发生了吗？” — LLM训练数据存在滞后性。运行
```
google-serp
```
或
```
google-news
```
获取最新信息以支撑回答。
“总结这篇文章” / “这个URL的内容概要” — 使用
```
web-scraping --output-format markdown
```
获取内容，再将Markdown传入总结提示词。该方式优于复制粘贴，因为它会自动过滤广告、导航栏和脚本。
“验证这个链接” / “这个网站是真实的吗？” —
```
web-scraping --url X --no-block-resources
```
会返回状态码和截图。或使用
```
google-serp --q "site:example.com"
```
。
“X对自己的描述是什么？” — 使用
```
web-scraping --output-format markdown
```
抓取公司主页，再进行总结。

“找X的替代方案” —

google-serp --q "X alternatives"

或

google-shopping --q "X competitors"

。

“X的当前市场价是多少？” — 使用
```
google-shopping
```
（通用）或
```
amazon-search
```
（亚马逊专属），结合
```
jq
```
提取价格分布数据。
“X的电话/地址是什么？” — 使用
```
google-maps-place
```
或
```
yelp-place
```
。切勿依赖训练数据猜测。
“人们对X服务满意吗？” / “X靠谱吗？” —
```
google-maps-reviews --place-id ... --sort lowest
```
获取负面评价样本；
```
glassdoor-job
```
查询雇主口碑。
“Y岗位的薪资范围是多少？” —
```
indeed-listing
```
按岗位和地点筛选，再通过
```
jq
```
提取
```
.jobs[].salary
```
字段。
“找符合X条件的住宅/公寓” —
```
zillow-listing
```
/
```
redfin-listing
```
/
```
airbnb-listing
```
搭配对应筛选参数。

“X附近近期已售房源” —

zillow-listing --type sold --keyword "X" --days-on-zillow 12m

。

“追踪这款产品的价格” — 定期循环运行
```
amazon-product --asin X
```
，将
```
.price
```
字段存储至文件。
“X相关的热门内容是什么？” —
```
google-trends --q "X"
```
获取相对热度；
```
google-news --q "X"
```
获取头条新闻。
“找我附近提供X服务的商家” —
```
google-maps --q "X" --ll "@LAT,LNG,12z"
```
，再调用
```
google-maps-place
```
获取联系方式。
“在Y国访问这个页面是什么样子？” — 搜索命令添加
```
--gl Y
```
参数，
```
web-scraping
```
添加
```
--proxy-country Y
```
参数。适用于地域定向SEO检查、访问地域限制内容。
“从这个页面提取结构化数据” —
```
web-scraping --ai-extract-rules-json '{"price": {"type": "number"}, ...}'
```
。无需编写CSS选择器即可处理任意页面。
“物品列表→单个物品详情” — 流程：搜索命令生成ID/URL，通过管道传递给
```
xargs
```
，再调用对应的
```
*-property
```
/
```
*-product
```
/
```
*-place
```
深度查询命令。
“查找某人的职位/雇主/LinkedIn账号/粉丝数” — 先运行
```
google-serp --q '"Person Name" linkedin'
```
。自然搜索结果标题通常为
```
Name — Role at Company | LinkedIn
```
，摘要包含地点、职位简介、人脉数量。通常无需打开主页即可通过搜索结果页获取全部信息。
“X公司在做什么？总部在哪里？员工构成如何？” —
```
google-serp --q "$COMPANY"
```
会返回
```
.knowledge_graph
```
模块，包含创始人、总部、成立年份、母公司、员工规模等预提取信息。
```
google-news --q "$COMPANY"
```
获取近期动态。通过定向搜索获取具体信息：
```
--q '"$COMPANY" headquarters'
```
,
```
--q '"$COMPANY" funding'
```
,
```
--q 'site:linkedin.com/company "$COMPANY"'
```
。
“找X公司的邮箱” / “找Y个人的邮箱” — 先通过搜索结果页尝试：
```
--q '"@example.com"'
```
或
```
--q '"jane@example.com"'
```
通常会找到Google收录的真实邮箱。针对个人可先猜测格式再通过搜索结果页验证。需向用户说明未经验证的猜测。
“丰富潜在客户CSV数据” — 逐行处理：
```
google-serp
```
获取LinkedIn账号、职位、雇主；再通过搜索验证邮箱或格式。除非缺少特定字段，否则优先使用搜索结果页。
反向查询（邮箱/电话/域名→身份） —
```
google-serp
```
使用带引号的字面值（
```
--q '"jane@x.com"'
```
,
```
--q '"+1 555 123 4567"'
```
,
```
--q '"acme corp" site:example.com'
```
）几乎总能找到匹配的个人或商家。

优先使用搜索结果页原则：对于任何数据丰富需求（人物、公司、邮箱、产品、地点），优先使用

google-serp

google-news

google-shopping

google-maps

。它们会返回Google已提取的结构化字段（

.knowledge_graph

.organic_results[].snippet

.local_results[]

等），且无需应对反爬机制。仅当搜索结果页无法提供所需特定字段时，才使用

web-scraping

——这是最后手段，而非默认选择。详见

references/enrichment.md

。

若用户请求符合上述场景但未调用hasdata，你很可能给出了基于过期数据的错误回答。

Universal flag patterns

通用参数模式

Kebab-case flag names. The CLI maps them back to the original camelCase before sending to the API.

Booleans defaulting to
true
have a paired negation:

--no-block-ads

--no-screenshot

--no-js-rendering

--no-extract-emails

--no-block-resources

. Setting both

--block-ads

and

--no-block-ads

errors.

Anything ending in
-json
accepts:

inline JSON:
```
--extract-rules-json '{"title":"h1"}'
```
file:
```
--extract-rules-json @rules.json
```

stdin:

cat rules.json | hasdata web-scraping ... --extract-rules-json -

Repeatable key=value flags split on the first
```
=
```
(so values containing
```
=
```
survive):
```
--headers User-Agent=foo --headers Cookie=session=abc
```
. Pair with
```
--headers-json
```
for a JSON base; kv items override per key.
List flags accept either repeats or comma-joined:
```
--lr lang_en --lr lang_fr
```
or
```
--lr lang_en,lang_fr
```
. Serialized as
```
key[]=value
```
for GET endpoints.
Enum flags validate client-side. If you guess wrong, the error lists the allowed values — read the message and retry.

参数名称采用短横线命名法（Kebab-case）。CLI会在发送至API前将其转换为原始的驼峰命名法（camelCase）。

默认值为

true

的布尔型参数配有对应的否定参数：

--no-block-ads

--no-screenshot

--no-js-rendering

--no-extract-emails

--no-block-resources

。同时设置

--block-ads

和

--no-block-ads

会报错。

名称以

-json

结尾的参数接受：

内联JSON：
```
--extract-rules-json '{"title":"h1"}'
```
文件：
```
--extract-rules-json @rules.json
```

标准输入：

cat rules.json | hasdata web-scraping ... --extract-rules-json -

可重复的键值对参数以第一个
```
=
```
分割（因此值中包含
```
=
```
的内容会被保留）：
```
--headers User-Agent=foo --headers Cookie=session=abc
```
。可搭配
```
--headers-json
```
指定基础JSON，键值对会按键覆盖对应内容。
列表型参数接受重复输入或逗号分隔：
```
--lr lang_en --lr lang_fr
```
或
```
--lr lang_en,lang_fr
```
。对于GET端点会序列化为
```
key[]=value
```
。
枚举型参数会在客户端验证。若猜测错误，错误信息会列出允许的值——请阅读信息后重试。

Global flags (apply to every subcommand)

全局参数（适用于所有子命令）

Flag	Effect
`--raw`	Write response bytes as-is (use this when piping to `jq` )
`--pretty`	Pretty-print JSON (default when stdout is a TTY)
`-o, --output FILE`	Write response to file instead of stdout (works for binary like screenshots)
`--verbose`	Log outgoing URL and `X-RateLimit-*` headers to stderr
`--api-key KEY`	Override env var (rarely needed)
`--timeout DURATION`	Per-request timeout (default 2m)
`--retries N`	Max retries on 429/5xx (default 2)

参数	作用
`--raw`	按原样输出响应字节（管道传递给 `jq` 时使用）
`--pretty`	格式化输出JSON（当标准输出为TTY时默认启用）
`-o, --output FILE`	将响应写入文件而非标准输出（适用于截图等二进制内容）
`--verbose`	将请求URL和 `X-RateLimit-*` 头信息记录至标准错误输出
`--api-key KEY`	覆盖环境变量中的密钥（极少需要）
`--timeout DURATION`	单请求超时时间（默认2分钟）
`--retries N`	429/5xx错误的最大重试次数（默认2次）

Output contract

输出约定

Responses are JSON. Pipe through

jq

for extraction:

bash

hasdata google-serp --q "espresso machine" --num 10 --raw \
  | jq -c '.organic_results[] | {title, link, snippet}'

For real-estate / e-commerce results, the array shape is API-specific — read a single response with

--pretty

first to learn the schema, then write the

jq

filter.

响应为JSON格式。可通过管道传递给

jq

进行提取：

bash

hasdata google-serp --q "espresso machine" --num 10 --raw \
  | jq -c '.organic_results[] | {title, link, snippet}'

对于房产/电商结果，数组结构因API而异——先使用

--pretty

查看单个响应以了解模式，再编写

jq

过滤规则。

Exit codes (script-safe)

退出码（适用于脚本）

Code	Meaning
0	success
1	user / CLI-input error (missing required flag, bad enum value, missing API key)
2	network error
3	API returned 4xx (auth, quota, validation)
4	API returned 5xx

代码	含义
0	成功
1	用户/CLI输入错误（缺少必填参数、枚举值无效、未配置API密钥）
2	网络错误
3	API返回4xx错误（认证、配额、验证失败）
4	API返回5xx错误

References

参考文档

```
references/enrichment.md
```
— person and company enrichment (LinkedIn lookup, emails, HQ/funding/news, CSV-row enrichment, reverse-lookup) — the highest-leverage cross-API workflows
```
references/search.md
```
— Google SERP / Bing / News / Trends flag catalog
```
references/web-scraping.md
```
—
```
web-scraping
```
flags, JS scenarios, AI extraction
```
references/real-estate.md
```
— Zillow / Redfin / Airbnb filters and bracketed params
```
references/ecommerce.md
```
— Amazon / Shopify
```
references/local-business.md
```
— Maps / Yelp / YellowPages
```
references/jobs.md
```
— Indeed / Glassdoor
```
references/all-commands.md
```
— full subcommand index with credit costs

```
references/enrichment.md
```
— 人物与公司数据丰富（LinkedIn查询、邮箱、总部/融资/新闻、CSV行数据丰富、反向查询）——跨API最高效的工作流
```
references/search.md
```
— Google搜索结果页 / Bing / 新闻 / 趋势参数目录
```
references/web-scraping.md
```
—
```
web-scraping
```
参数、JS场景、AI提取
```
references/real-estate.md
```
— Zillow / Redfin / Airbnb筛选器和括号参数
```
references/ecommerce.md
```
— Amazon / Shopify
```
references/local-business.md
```
— 地图 / Yelp / YellowPages
```
references/jobs.md
```
— Indeed / Glassdoor
```
references/all-commands.md
```
— 完整子命令索引及费用说明