hasdata-cli

hasdata

Use the `hasdata` CLI for real-time web data. One subcommand per API: flags, enums, and defaults are derived from the live schema at `api.hasdata.com/apis`.

Prerequisites

  • `command -v hasdata`: if missing, install with `curl -sSL https://raw.githubusercontent.com/HasData/hasdata-cli/main/install.sh | sh`.
  • One-time setup: the user runs `hasdata configure`, pastes their API key, and it's saved to `~/.hasdata/config.yaml` (mode 0600). Every future call picks it up automatically.
  • If a call fails with `no API key configured`, the user hasn't run `hasdata configure` yet; tell them to. Never invent a key.
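
The checks above can be sketched as a small preflight helper. This is a minimal illustration, not part of the CLI: the function names `require_cmd`, `has_config`, and `preflight` are hypothetical, and the sketch only looks for the config file, so it ignores any environment-variable route for supplying the key.

```bash
# Preflight sketch: verify the CLI is on PATH and a config file exists
# before making any API call. All function names are illustrative.

# Return 0 if the named command is available on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Return 0 if the config file exists. Defaults to the path that
# `hasdata configure` is documented to write.
has_config() {
  [ -f "${1:-$HOME/.hasdata/config.yaml}" ]
}

# Report what is missing on stderr; return non-zero if anything is.
preflight() {
  status=0
  if ! require_cmd hasdata; then
    echo "hasdata not found: run the install script first" >&2
    status=1
  fi
  if ! has_config "$@"; then
    echo "no config file: ask the user to run 'hasdata configure'" >&2
    status=1
  fi
  return "$status"
}
```

Calling `preflight` before the first request lets a script fail early with an actionable message instead of surfacing `no API key configured` mid-run.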

Quick start

```bash
hasdata <api> --flag value [--flag value ...] --raw | jq .
```
Always pass `--raw` when piping to `jq` (skips pretty-print and TTY detection). Use `--pretty` only for human-readable terminal output.

Picking the right subcommand

| User intent | Subcommand |
| --- | --- |
| Web search ("what does Google say about…") | `google-serp` (full features) or `google-serp-light` (cheap, single page) |
| Latest news | `google-news` |
| AI Mode SERP | `google-ai-mode` |
| Shopping / product prices | `google-shopping` (broad), `amazon-search` / `amazon-product` (Amazon), `shopify-products` (Shopify) |
| Immersive product page | `google-immersive-product` |
| Maps / places / reviews | `google-maps`, `google-maps-place`, `google-maps-reviews`, `google-maps-photos` |
| Yelp / YellowPages local data | `yelp-search`, `yelp-place`, `yellowpages-search`, `yellowpages-place` |
| Real-estate listings | `zillow-listing`, `redfin-listing`, `airbnb-listing` |
| Real-estate single property deep dive | `zillow-property`, `redfin-property`, `airbnb-property` |
| Jobs | `indeed-listing`, `indeed-job`, `glassdoor-listing`, `glassdoor-job` |
| Bing search | `bing-serp` |
| Trends | `google-trends` |
| Images | `google-images` |
| Flights | `google-flights` |
| Short videos | `google-short-videos` |
| Events | `google-events` |
| Instagram profile | `instagram-profile` |
| Amazon seller | `amazon-seller`, `amazon-seller-products` |
| Scrape a specific URL | `web-scraping`: supports JS rendering, proxies, markdown output, AI extraction, screenshots |

For exact flags of a subcommand, run `hasdata <api> --help` or read the matching file in `references/`.

Non-obvious triggers (when to reach for hasdata even if the user doesn't say "scrape")

The user often won't ask for a SERP API or a scraper directly. Map these intents to the skill:
  • "Is this still true?" / "What's the latest on X?" / "Has Y happened yet?": LLM training data is stale. Run `google-serp` or `google-news` to ground the answer.
  • "Summarize this article" / "TL;DR this URL": Use `web-scraping --output-format markdown` and feed the markdown into the summary prompt. Beats copy-paste because it strips ads, nav, scripts.
  • "Verify this link" / "Is this site real?": `web-scraping --url X --no-block-resources` returns status + screenshot. Or `google-serp --q "site:example.com"`.
  • "What does X say about itself?": Pull the company's own homepage with `web-scraping --output-format markdown`, then summarize.
  • "Find me alternatives to X": `google-serp --q "X alternatives"` or `google-shopping --q "X competitors"`.
  • "What's the going rate for X?": `google-shopping` (broad) or `amazon-search` (Amazon-specific) with `jq` to extract the price distribution.
  • "Phone number / address for X": `google-maps-place` or `yelp-place`. Don't guess from training data.
  • "Are people happy with X service?" / "Is X reputable?": `google-maps-reviews --place-id ... --sort lowest` for negative samples; `glassdoor-job` for employer reputation.
  • "What's the salary range for Y role?": `indeed-listing` filtered by role + location, then `jq` over `.jobs[].salary`.
  • "Find me homes/apartments matching X criteria": `zillow-listing` / `redfin-listing` / `airbnb-listing` with the corresponding filters.
  • "Recent sold comps near X": `zillow-listing --type sold --keyword "X" --days-on-zillow 12m`.
  • "Track this product's price": Loop `amazon-product --asin X` on a schedule; persist `.price` to a file.
  • "What's trending around X?": `google-trends --q "X"` for relative interest; `google-news --q "X"` for headlines.
  • "Find businesses near me that do X": `google-maps --q "X" --ll "@LAT,LNG,12z"`, then fan out `google-maps-place` for contacts.
  • "How does this look in country Y?": `--gl Y` on SERP commands, `--proxy-country Y` on `web-scraping`. Useful for geo-targeted SEO checks, geo-blocked content.
  • "Pull structured data from this page": `web-scraping --ai-extract-rules-json '{"price": {"type": "number"}, ...}'`. Works on arbitrary pages without writing CSS selectors.
  • "List of items → per-item details": the search command produces IDs/URLs; pipe them through `xargs` into the matching `*-property` / `*-product` / `*-place` deep-dive command.
  • "Find this person's role / employer / LinkedIn / followers": `google-serp --q '"Person Name" linkedin'` first. The organic-result title is typically `Name — Role at Company | LinkedIn` and the snippet carries location, headline, connection count. SERP often answers the whole question without ever opening the profile page.
  • "What is company X doing? Where's their HQ? Who works there?": `google-serp --q "$COMPANY"` returns a `.knowledge_graph` block with founder, HQ, founded year, parent, employee range, all pre-extracted. `google-news --q "$COMPANY"` for recent activity. Specific facts via targeted SERP: `--q "\"$COMPANY\" headquarters"`, `--q "\"$COMPANY\" funding"`, `--q "site:linkedin.com/company \"$COMPANY\""` (double-quoted so the shell expands `$COMPANY`).
  • "Find emails for company X" / "personal email for person Y": start with SERP. `--q '"@example.com"'` or `--q '"jane@example.com"'` often surfaces actual emails indexed by Google. Pattern-guess + SERP-verify for individuals. Disclose unverified guesses to the user.
  • "Enrich this CSV of leads": per row, `google-serp` for LinkedIn, role, employer; another SERP to verify email or pattern. Stay in SERP unless a specific field is missing.
  • Reverse-lookup (email / phone / domain → identity): `google-serp` with the literal value in quotes (`--q '"jane@x.com"'`, `--q '"+1 555 123 4567"'`, `--q '"acme corp" site:example.com'`) almost always surfaces the matching person or business.
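
The price-tracking intent in the list above can be sketched as a small logging helper. This is a hedged illustration, not a documented workflow: `record_price`, the CSV layout, and the file name `prices.csv` are hypothetical, and the sketch assumes `.price` sits at the top level of the `amazon-product` response, which should be confirmed against a real response first.

```bash
# Append one "timestamp,asin,price" row to a CSV file.
# The price is passed in, so this function does not call the CLI itself.
record_price() {
  file=$1 asin=$2 price=$3
  # Skip empty or "null" values so a failed lookup never pollutes the log.
  case $price in
    ''|null) echo "no price for $asin; row skipped" >&2; return 1 ;;
  esac
  printf '%s,%s,%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$asin" "$price" >> "$file"
}

# Intended use from cron or a loop (not executed here; the .price path
# is an assumption):
#   price=$(hasdata amazon-product --asin "$ASIN" --raw | jq -r '.price')
#   record_price prices.csv "$ASIN" "$price"
```

Keeping the fetch and the write separate means a failed or empty lookup is reported once and leaves the history file untouched.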
SERP-first principle: for any data-enrichment intent (people, companies, emails, products, places), reach for `google-serp` / `google-news` / `google-shopping` / `google-maps` first. They return Google's already-extracted structured fields (`.knowledge_graph`, `.organic_results[].snippet`, `.local_results[]`, etc.) and bypass anti-bot. Only escalate to `web-scraping` when SERP doesn't surface the specific field you need; it's the last resort, not the default. See `references/enrichment.md`.
If a user request matches one of the above and you don't invoke hasdata, you're probably hallucinating a stale answer.
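
The "list of items → per-item details" pattern from this section can be sketched with a thin `xargs` wrapper. Treat it as an illustration under assumptions: `fan_out` is a hypothetical helper, and the `.place_id` field in the usage comment is a guess at the response shape that should be checked with `--pretty` before relying on it.

```bash
# Read one ID or URL per line on stdin and run the given command once per
# value, appending the value as the final argument.
# Note: xargs splits on whitespace, so this suits IDs and URLs, not free text.
fan_out() {
  xargs -n 1 "$@"
}

# Usage sketch (not executed here; the .place_id field name is an assumption):
#   hasdata google-maps --q "coffee" --raw \
#     | jq -r '.local_results[].place_id' \
#     | fan_out hasdata google-maps-reviews --raw --place-id
```

Because the value lands last, the wrapped command should end with the flag that takes the ID.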

Universal flag patterns

  • Kebab-case flag names. The CLI maps them back to the original camelCase before sending to the API.
  • Booleans defaulting to `true` have a paired negation: `--no-block-ads`, `--no-screenshot`, `--no-js-rendering`, `--no-extract-emails`, `--no-block-resources`. Setting both `--block-ads` and `--no-block-ads` errors.
  • Anything ending in `-json` accepts:
    • inline JSON: `--extract-rules-json '{"title":"h1"}'`
    • file: `--extract-rules-json @rules.json`
    • stdin: `cat rules.json | hasdata web-scraping ... --extract-rules-json -`
  • Repeatable key=value flags split on the first `=` (so values containing `=` survive): `--headers User-Agent=foo --headers Cookie=session=abc`. Pair with `--headers-json` for a JSON base; kv items override per key.
  • List flags accept either repeats or comma-joined values: `--lr lang_en --lr lang_fr` or `--lr lang_en,lang_fr`. Serialized as `key[]=value` for GET endpoints.
  • Enum flags validate client-side. If you guess wrong, the error lists the allowed values; read the message and retry.
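
Three of the conventions above (kebab-case to camelCase, splitting on the first `=`, and `key[]=value` list serialization) can be imitated in plain shell to make the rules concrete. This is a sketch of the described behavior, not the CLI's actual implementation: the helper names are hypothetical, and real requests may also URL-encode the brackets and values.

```bash
# Split "key=value" on the first "=" only, so "Cookie=session=abc"
# yields key "Cookie" and value "session=abc".
kv_key()   { printf '%s\n' "${1%%=*}"; }
kv_value() { printf '%s\n' "${1#*=}"; }

# Convert a kebab-case flag name to camelCase:
# "output-format" becomes "outputFormat".
kebab_to_camel() {
  printf '%s\n' "$1" | awk -F- '{
    out = $1
    for (i = 2; i <= NF; i++)
      out = out toupper(substr($i, 1, 1)) substr($i, 2)
    print out
  }'
}

# Expand a comma-joined list into repeated "key[]=value" query pairs
# joined by "&" (no URL-encoding is applied in this sketch).
list_to_query() {
  key=$1
  printf '%s\n' "$2" | tr ',' '\n' | awk -v k="$key" '
    { printf "%s%s[]=%s", (NR > 1 ? "&" : ""), k, $0 }
    END { print "" }'
}
```

For example, `list_to_query lr lang_en,lang_fr` prints `lr[]=lang_en&lr[]=lang_fr`, matching the serialization the list-flag bullet describes.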

Global flags (apply to every subcommand)

| Flag | Effect |
| --- | --- |
| `--raw` | Write response bytes as-is (use this when piping to `jq`) |
| `--pretty` | Pretty-print JSON (default when stdout is a TTY) |
| `-o, --output FILE` | Write response to file instead of stdout (works for binary like screenshots) |
| `--verbose` | Log outgoing URL and `X-RateLimit-*` headers to stderr |
| `--api-key KEY` | Override env var (rarely needed) |
| `--timeout DURATION` | Per-request timeout (default 2m) |
| `--retries N` | Max retries on 429/5xx (default 2) |

Output contract

Responses are JSON. Pipe through `jq` for extraction:
```bash
hasdata google-serp --q "espresso machine" --num 10 --raw \
  | jq -c '.organic_results[] | {title, link, snippet}'
```
For real-estate / e-commerce results, the array shape is API-specific; read a single response with `--pretty` first to learn the schema, then write the `jq` filter.

Exit codes (script-safe)

| Code | Meaning |
| --- | --- |
| 0 | success |
| 1 | user / CLI-input error (missing required flag, bad enum value, missing API key) |
| 2 | network error |
| 3 | API returned 4xx (auth, quota, validation) |
| 4 | API returned 5xx |
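
A script can branch on these codes with a pair of small helpers. The sketch below is illustrative: `describe_exit_code` and `is_transient` are hypothetical names, and treating only codes 2 and 4 as retryable is a judgment call, since a quota-related 4xx reported under code 3 may also clear on its own.

```bash
# Map a hasdata exit code to the meaning given in the exit-code table.
describe_exit_code() {
  case $1 in
    0) echo "success" ;;
    1) echo "user or CLI-input error" ;;
    2) echo "network error" ;;
    3) echo "API returned 4xx" ;;
    4) echo "API returned 5xx" ;;
    *) echo "unknown exit code $1" ;;
  esac
}

# Decide whether a failed call is worth retrying later.
# Network errors and 5xx are treated as transient; 1 and 3 need a fix first.
is_transient() {
  case $1 in
    2|4) return 0 ;;
    *)   return 1 ;;
  esac
}

# Usage sketch (not executed here):
#   hasdata google-serp --q "espresso machine" --raw > out.json
#   code=$?
#   if [ "$code" -ne 0 ]; then
#     echo "hasdata failed: $(describe_exit_code "$code")" >&2
#     is_transient "$code" && echo "safe to retry later" >&2
#   fi
```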

References

  • `references/enrichment.md`: person and company enrichment (LinkedIn lookup, emails, HQ/funding/news, CSV-row enrichment, reverse-lookup); the highest-leverage cross-API workflows
  • `references/search.md`: Google SERP / Bing / News / Trends flag catalog
  • `references/web-scraping.md`: `web-scraping` flags, JS scenarios, AI extraction
  • `references/real-estate.md`: Zillow / Redfin / Airbnb filters and bracketed params
  • `references/ecommerce.md`: Amazon / Shopify
  • `references/local-business.md`: Maps / Yelp / YellowPages
  • `references/jobs.md`: Indeed / Glassdoor
  • `references/all-commands.md`: full subcommand index with credit costs