exa-websets-search


Exa Websets Search

Comprehensive webset management including creation, search, imports, items, and enrichments.
Use --help to see available commands and verify usage before running:
bash
exa-ai <command> --help

Working with Complex Shell Commands

When using the Bash tool with complex shell syntax, follow these best practices for reliability:
  1. Run commands directly: Save the JSON output to a file rather than nesting command substitutions
  2. Parse in subsequent steps: Use jq to parse the saved output in a follow-up command if needed
  3. Avoid nested substitutions: Complex nested $(...) can be fragile; break into sequential steps
Example:
bash
# Less reliable: nested command substitution
webset_id=$(exa-ai webset-create --search '{"query":"tech startups","count":1}' | jq -r '.webset_id')

# More reliable: run directly and save the output, then parse
exa-ai webset-create --search '{"query":"tech startups","count":1}' > output.json

# Then in a follow-up command if needed:
webset_id=$(jq -r '.webset_id' output.json)
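The run-then-parse pattern above can be sketched as a small wrapper. `run_and_save` is a hypothetical helper, not part of exa-ai, and `output.json` is only an example file name. The point is that the command runs on its own, its output lands in a file, and a failure is reported before any parsing is attempted.

```bash
# Hypothetical helper (not an exa-ai command): run a command, save its
# stdout to a file, and report a failure of the command itself.
run_and_save() {
  out_file=$1
  shift
  if ! "$@" > "$out_file"; then
    echo "command failed: $*" >&2
    return 1
  fi
}

# Usage (commented out so the sketch has no side effects):
# run_and_save output.json exa-ai webset-create --search '{"query":"tech startups","count":1}'
# webset_id=$(jq -r '.webset_id' output.json)
```

Keeping the raw output on disk also makes it easy to inspect when a later jq filter returns nothing.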

Critical Requirements

Universal rules across all operations:
  1. Start with minimal counts (1-5 results): Initial searches are test spikes to validate quality. ALWAYS default to count:1 unless the user explicitly requests more.
  2. Three-step workflow - Validate, Expand, Enrich: (1) Create with count:1 to test search quality, (2) Expand search count if results are good, (3) Add enrichments only after validated, expanded results.
  3. No enrichments during validation: Never add enrichments when testing with count:1. Validate search quality first, expand count second, add enrichments last.
  4. Avoid --wait flag: Do NOT use the --wait flag in commands. It's designed for human interactive use, not automated workflows.
  5. Maintain query AND criteria consistency: When scaling up or appending searches, use the EXACT same query AND criteria that you validated. Omitting criteria causes Exa to regenerate them on-the-fly, producing inconsistent results.

Credit Costs

Pricing: $50/month = 8,000 credits ($0.00625 per credit)
Cost per operation:
  • Each webset item: 10 credits ($0.0625)
  • Standard enrichment: 2 credits ($0.0125)
  • Email enrichment: 5 credits ($0.03125)
Why start with count:1: Testing with 1 result costs 10 credits ($0.0625). A failed search with count:100 wastes 1,000 credits ($6.25) - 100x more expensive.
Why enrich last: Enriching bad results wastes credits. Always validate first, expand second, enrich last.
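The arithmetic behind these figures can be sketched in shell. The function names are hypothetical, and the sketch assumes each enrichment is billed once per item, which the price list above does not state explicitly.

```bash
# Credit prices taken from the list above.
CREDITS_PER_ITEM=10
CREDITS_PER_STANDARD_ENRICHMENT=2
CREDITS_PER_EMAIL_ENRICHMENT=5

# estimate_credits ITEMS [STANDARD_ENRICHMENTS] [EMAIL_ENRICHMENTS]
# Assumption: every enrichment is charged once per item.
estimate_credits() {
  items=$1
  standard=${2:-0}
  email=${3:-0}
  echo $(( items * (CREDITS_PER_ITEM + standard * CREDITS_PER_STANDARD_ENRICHMENT + email * CREDITS_PER_EMAIL_ENRICHMENT) ))
}

# credits_to_usd CREDITS
# One credit is $0.00625, i.e. 625 units of $0.00001, so integer math suffices.
credits_to_usd() {
  units=$(( $1 * 625 ))
  printf '%d.%05d\n' $(( units / 100000 )) $(( units % 100000 ))
}

estimate_credits 1      # validation run with count:1
estimate_credits 100    # a search with count:100
credits_to_usd 1000     # dollar cost of 1,000 credits
```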

Quick Command Reference

exa-ai --help

Output Formats

All exa-ai webset commands support output formats:
  • JSON (default): Pipe to jq to extract specific fields (e.g., | jq -r '.webset_id')
  • toon: Compact, readable format for direct viewing
  • pretty: Human-friendly formatted output
  • text: Plain text output

Webset Management

Core operations for managing webset collections.

Entity Types

  • company: Companies and organizations
  • person: Individual people
  • article: News articles and blog posts
  • research_paper: Academic papers
  • custom: Custom entity types (define with --entity-description)

Create Webset from Search

bash
webset_id=$(exa-ai webset-create \
  --search '{"query":"AI startups in San Francisco","count":1}' | jq -r '.webset_id')

Create with Detailed Search Criteria

bash
exa-ai webset-create \
  --search '{
    "query": "Technology companies focused on developer tools",
    "count": 1,
    "entity": {
      "type": "company"
    },
    "criteria": [
      {
        "description": "Companies with 50-500 employees indicating growth stage"
      },
      {
        "description": "Primary product is developer tools, APIs, or infrastructure"
      }
    ]
  }'

Create with Custom Entity

bash
exa-ai webset-create \
  --search '{
    "query": "Nonprofits focused on economic justice",
    "count": 1,
    "entity": {
      "type": "custom",
      "description": "nonprofit"
    },
    "criteria": [
      {
        "description": "Primary focus on economic justice"
      },
      {
        "description": "Annual operating budget between $1M and $10M"
      }
    ]
  }'

Create from CSV Import

bash
import_id=$(exa-ai import-create companies.csv \
  --count 100 \
  --title "Companies" \
  --format csv \
  --entity-type company | jq -r '.import_id')

exa-ai webset-create --import $import_id

Three-Step Workflow: Validate → Expand → Enrich

Step 1: VALIDATE - Create with count:1 (NO enrichments)

bash
webset_id=$(exa-ai webset-create \
  --search '{"query":"tech startups","count":1}' | jq -r '.webset_id')

exa-ai webset-item-list $webset_id
⚠️ REQUIRED: Manually verify the result is relevant before continuing. If not, adjust the query and start over.

Step 2: EXPAND - Gradually increase count with verification at each stage

bash
# Expand to 2 results (use same query and criteria from validation)
exa-ai webset-search-create $webset_id \
  --query "tech startups" \
  --behavior override \
  --count 2

exa-ai webset-item-list $webset_id
⚠️ REQUIRED: Check quality at this scale. Repeat with larger counts (5, 10, 25, 50, 100) until you reach your target.

Loop this step: Keep expanding gradually (2 → 5 → 10 → 25 → 50 → 100) with verification between each expansion.
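The expansion ladder from Step 2 can be encoded as a lookup, so each round only moves one step up. `next_count` is a hypothetical helper name. Verification between steps stays manual, so the sketch deliberately does not loop on its own.

```bash
# next_count CURRENT: print the next step on the 1 -> 2 -> 5 -> 10 -> 25 -> 50 -> 100
# ladder. Returns non-zero when CURRENT is the top of the ladder or not on it.
next_count() {
  case $1 in
    1)  echo 2 ;;
    2)  echo 5 ;;
    5)  echo 10 ;;
    10) echo 25 ;;
    25) echo 50 ;;
    50) echo 100 ;;
    *)  return 1 ;;
  esac
}

# After manually verifying the results at the current count, look up the next one.
count=1
count=$(next_count "$count")
echo "next expansion: --count $count"
```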

Step 3: ENRICH - Add enrichments only after confirming quality

bash
exa-ai enrichment-create $webset_id \
  --description "Company website" --format url --title "Website"

exa-ai enrichment-create $webset_id \
  --description "Employee count" --format text --title "Team Size"

Interpreting Criterion Success Rates

CRITICAL: Criteria are evaluated conditionally: when one criterion fails, others may not run. A low success rate for a criterion therefore does not by itself show that the criterion is restrictive; it may mean that OTHER criteria are filtering results first. Only interpret a low success rate as "restrictive" when the OTHER criteria have high success rates (>80%).
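The interpretation rule above can be written as a small check. The 80% threshold comes from the text; the 50% cutoff for "low" and the function name are assumptions made only for this sketch. Rates are whole-number percentages.

```bash
LOW_RATE=50    # hypothetical cutoff for "low"; not defined by Exa
HIGH_RATE=80   # from the rule above: other criteria must exceed 80%

# looks_restrictive RATE OTHER_RATE...
# Succeeds only if RATE is low AND every other criterion's rate is high.
looks_restrictive() {
  rate=$1
  shift
  [ "$rate" -lt "$LOW_RATE" ] || return 1
  for other in "$@"; do
    [ "$other" -gt "$HIGH_RATE" ] || return 1
  done
  return 0
}

looks_restrictive 20 95 90 && echo "criterion is likely restrictive"
looks_restrictive 20 95 30 || echo "inconclusive: another criterion may be filtering first"
```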

Manage Websets

bash
exa-ai webset-list
exa-ai webset-get ws_abc123
exa-ai webset-update ws_abc123 --metadata '{"status":"active","owner":"team"}'
exa-ai webset-delete ws_abc123

Search Operations

Run searches within a webset to add new items.

Search Behavior

Control how new search results are combined with existing items:
  • append (default): Add new items to existing collection
    • Requires previous search results to exist
    • Error if webset has no previous search: "No previous search found"
    • Default behavior when --behavior is omitted
  • override: Replace entire collection with search results
    • REQUIRED for first search on a webset
    • Use when starting fresh or completely replacing results
CRITICAL - First search requirement: The first webset-search-create on a webset MUST explicitly use --behavior override. Since the default is append, omitting --behavior will fail with a "No previous search found" error. Subsequent searches can omit the flag (defaults to append).
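One way to avoid the "No previous search found" error is to always pass --behavior explicitly and pick its value from how many searches the webset has already had. In this sketch `behavior_for`, `run_search`, and the `EXA_BIN` variable are hypothetical names, and the caller is assumed to track the number of earlier searches itself.

```bash
# EXA_BIN is a hypothetical override so the command can be dry-run with EXA_BIN=echo.
EXA_BIN=${EXA_BIN:-exa-ai}

# behavior_for SEARCHES_SO_FAR: override for the first search, append afterwards.
behavior_for() {
  if [ "$1" -eq 0 ]; then
    echo override
  else
    echo append
  fi
}

# run_search WEBSET_ID SEARCHES_SO_FAR QUERY COUNT
run_search() {
  behavior=$(behavior_for "$2")
  "$EXA_BIN" webset-search-create "$1" \
    --query "$3" \
    --behavior "$behavior" \
    --count "$4"
}

# Usage (commented out; runs the real CLI unless EXA_BIN=echo is set):
# run_search ws_abc123 0 "AI startups in San Francisco" 1
```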

Query and Criteria Consistency

CRITICAL: When appending or scaling up searches, maintain IDENTICAL query and criteria from your validated search.

Why This Matters

Using different criteria causes Exa to generate new search parameters on-the-fly, which:
  • Violates consistency and produces mismatched results
  • Reduces result quality compared to validated criteria
  • Makes it impossible to reproduce or debug issues

Complete Example

bash
# Step 1: Test search with criteria (MUST use override for first search)
exa-ai webset-search-create ws_abc123 \
  --query "Progressive nonprofits in California" \
  --behavior override \
  --count 1 \
  --criteria '[
    {"description": "Annual budget between $1M and $10M"},
    {"description": "Primary focus on economic justice, affordability, living wages, or worker power"},
    {"description": "Established communications, narrative strategy, or messaging function"}
  ]'

# Verify quality, then append MORE results with IDENTICAL query and criteria
exa-ai webset-search-create ws_abc123 \
  --query "Progressive nonprofits in California" \
  --behavior append \
  --count 5 \
  --criteria '[
    {"description": "Annual budget between $1M and $10M"},
    {"description": "Primary focus on economic justice, affordability, living wages, or worker power"},
    {"description": "Established communications, narrative strategy, or messaging function"}
  ]'

Best Practice: Save Criteria to File

bash
# Create criteria file once
cat > criteria.json <<'EOF'
[
  {"description": "Annual budget between $1M and $10M"},
  {"description": "Primary focus on economic justice, affordability, living wages, or worker power"},
  {"description": "Established communications, narrative strategy, or messaging function"}
]
EOF

# Use consistently across all searches (first search needs override)
exa-ai webset-search-create ws_abc123 \
  --query "Progressive nonprofits in California" \
  --behavior override \
  --count 1 \
  --criteria @criteria.json

exa-ai webset-search-create ws_abc123 \
  --query "Progressive nonprofits in California" \
  --behavior append \
  --count 5 \
  --criteria @criteria.json
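If the criteria live in a file, a checksum taken at validation time gives a cheap guard against accidental edits before later searches. This is a sketch using the POSIX `cksum` utility; `fingerprint` and `criteria_unchanged` are hypothetical helper names, not exa-ai features.

```bash
# fingerprint FILE: print a checksum of the file contents (POSIX cksum).
fingerprint() {
  cksum < "$1"
}

# criteria_unchanged FILE SAVED_FINGERPRINT
# Succeeds only if FILE still has the fingerprint recorded at validation time.
criteria_unchanged() {
  [ "$(fingerprint "$1")" = "$2" ]
}

# Usage (commented out so the sketch has no side effects):
# validated=$(fingerprint criteria.json)   # right after the count:1 search looks good
# criteria_unchanged criteria.json "$validated" || echo "criteria.json changed since validation" >&2
```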

Basic Search Operations

bash
# First search on webset (must use override)
exa-ai webset-search-create ws_abc123 \
  --query "AI startups in San Francisco" \
  --behavior override \
  --count 1

# Append to collection
exa-ai webset-search-create ws_abc123 \
  --query "SaaS companies Series B" \
  --behavior append \
  --count 1

# Override collection
exa-ai webset-search-create ws_abc123 \
  --query "top tech companies" \
  --behavior override \
  --count 1

Monitor Search Progress

bash
webset_id="ws_abc123"
search_id=$(exa-ai webset-search-create $webset_id \
  --query "fintech startups" \
  --behavior override \
  --count 1 | jq -r '.search_id')

exa-ai webset-search-get $webset_id $search_id
exa-ai webset-search-cancel $webset_id $search_id


CSV Imports

Upload CSV files to create websets from existing datasets.

CSV Format Requirements

  1. First row contains column headers
  2. Each row represents one entity
  3. Include at minimum a name or identifier column

Basic Import Workflow

bash
# Create import
import_id=$(exa-ai import-create companies.csv \
  --count 100 \
  --title "Tech Companies" \
  --format csv \
  --entity-type company | jq -r '.import_id')

# Create webset from import
webset_id=$(exa-ai webset-create --import $import_id | jq -r '.webset_id')

Custom Entity Type

bash
exa-ai import-create products.csv \
  --count 5 \
  --title "Product List" \
  --format csv \
  --entity-type custom \
  --entity-description "Consumer electronics products"

Manage Imports

bash
exa-ai import-list
exa-ai import-get imp_abc123

Import vs Search Scope

--import loads data for enrichment. search.scope filters searches to specific sources.
⚠️ NEVER use the same ID in both - returns 400:
bash
# ❌ INVALID
exa-ai webset-create --import import_abc \
  --search '{"scope":[{"source":"import","id":"import_abc"}]}'

# ✅ Scoped search only
exa-ai webset-create \
  --search '{"query":"CEOs","scope":[{"source":"import","id":"import_abc"}]}'

# ✅ Relationship traversal
exa-ai webset-search-create ws_abc --query "investors" --behavior override \
  --scope '[{"source":"webset","id":"webset_abc","relationship":{"definition":"investors of","limit":5}}]'
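A pre-flight check can catch the conflicting-ID case before the request is sent and rejected with a 400. `ids_conflict` is a hypothetical helper; it assumes the caller already has the import ID and the scope IDs as plain strings.

```bash
# ids_conflict IMPORT_ID SCOPE_ID...
# Succeeds (exit 0) if IMPORT_ID also appears among the scope IDs.
ids_conflict() {
  import_id=$1
  shift
  for scope_id in "$@"; do
    [ "$scope_id" = "$import_id" ] && return 0
  done
  return 1
}

if ids_conflict import_abc import_abc; then
  echo "refusing to run: import_abc is used in both --import and search.scope" >&2
fi
```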

Item Management

Manage individual items in websets.

Basic Operations

bash
# List items
exa-ai webset-item-list ws_abc123
exa-ai webset-item-list ws_abc123 --output-format pretty

# Get item details
exa-ai webset-item-get item_xyz789

# Delete item
exa-ai webset-item-delete item_xyz789

Extract Item Data

bash
# Get all item IDs
exa-ai webset-item-list ws_abc123 --output-format json | jq -r '.[].id'

# Count items
exa-ai webset-item-list ws_abc123 --output-format json | jq 'length'

Enrichments

Add structured data fields to all items in a webset using AI extraction.

Enrichment Formats

  • text: Free-form text extraction (employee count, description, technology stack)
  • url: Extract URLs only (website, LinkedIn, GitHub)
  • options: Categorical data with predefined options (industry, funding stage, size range)

Key Concepts

  • description: The primary AI prompt that drives extraction. This tells the enrichment WHAT to extract. (Can be updated)
  • instructions: Optional additional guidance on HOW to extract or format. (Creation-only, cannot be updated)
  • Use exa-ai enrichment-create --help and exa-ai enrichment-update --help to see all available parameters

Create Enrichments

bash
# Text enrichment
exa-ai enrichment-create ws_abc123 \
  --description "Number of employees as of latest data" \
  --format text \
  --title "Team Size"

# URL enrichment
exa-ai enrichment-create ws_abc123 \
  --description "Primary company website URL" \
  --format url \
  --title "Website"

# Options enrichment
exa-ai enrichment-create ws_abc123 \
  --description "Current funding stage" \
  --format options \
  --options '[
    {"label":"Pre-seed"},
    {"label":"Seed"},
    {"label":"Series A"},
    {"label":"Series B"},
    {"label":"Series C+"},
    {"label":"Public"}
  ]' \
  --title "Funding Stage"

Use Options from File

bash
cat > industries.json <<'EOF'
[
  {"label": "SaaS"},
  {"label": "Developer Tools"},
  {"label": "AI/ML"},
  {"label": "Fintech"},
  {"label": "Healthcare"},
  {"label": "Other"}
]
EOF

exa-ai enrichment-create ws_abc123 \
  --description "Primary industry or sector" \
  --format options \
  --options @industries.json \
  --title "Industry"

Add Instructions for Precision

bash
exa-ai enrichment-create ws_abc123 \
  --description "Technology stack" \
  --format text \
  --instructions "Focus only on backend technologies and databases. Ignore frontend frameworks." \
  --title "Backend Tech"

Manage Enrichments

bash
# List enrichments
exa-ai enrichment-list ws_abc123
exa-ai enrichment-list ws_abc123 --output-format pretty

# Get details
exa-ai enrichment-get ws_abc123 enr_xyz789

# Update extraction prompt (description)
exa-ai enrichment-update ws_abc123 enr_xyz789 \
  --description "Exact employee count from most recent source"

# Update format and options
exa-ai enrichment-update ws_abc123 enr_xyz789 \
  --format options \
  --options '[{"label":"Small"},{"label":"Medium"},{"label":"Large"}]'

# Update metadata
exa-ai enrichment-update ws_abc123 enr_xyz789 \
  --metadata '{"source":"manual","updated":"2024-01-15"}'

# Note: Cannot update --instructions or --title (creation-only parameters)
# To change instructions, delete and recreate the enrichment

# Delete
exa-ai enrichment-delete ws_abc123 enr_xyz789

# Cancel running enrichment
exa-ai enrichment-cancel ws_abc123 enr_xyz789
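Since --instructions and --title are creation-only, changing them means a delete followed by a create. The sketch below chains the two commands shown in this section; `recreate_enrichment` and the `EXA_BIN` dry-run variable are hypothetical names. The recreated enrichment will presumably get a new ID, so any saved enrichment ID should be refreshed afterwards.

```bash
# EXA_BIN is a hypothetical override so the sequence can be dry-run with EXA_BIN=echo.
EXA_BIN=${EXA_BIN:-exa-ai}

# recreate_enrichment WEBSET_ID ENRICHMENT_ID DESCRIPTION FORMAT TITLE INSTRUCTIONS
# Deletes the old enrichment, then creates a new one; stops if the delete fails.
recreate_enrichment() {
  "$EXA_BIN" enrichment-delete "$1" "$2" || return 1
  "$EXA_BIN" enrichment-create "$1" \
    --description "$3" \
    --format "$4" \
    --title "$5" \
    --instructions "$6"
}

# Usage (commented out; runs the real CLI unless EXA_BIN=echo is set):
# recreate_enrichment ws_abc123 enr_xyz789 "Technology stack" text "Backend Tech" \
#   "Focus only on backend technologies and databases."
```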

Common Enrichment Patterns

Company websets: Website (url), Team Size (text), Funding Stage (options), Industry (options)
Person websets: LinkedIn (url), Job Title (text), Company (text), Location (text)
Research papers: Publication Year (text), Authors (text), Venue (text), Research Area (options)

Best Practices

  1. Start small, validate, then scale: Always use count:1 for initial searches
  2. Follow three-step workflow: Validate → Expand → Enrich
  3. Never enrich during validation: Only enrich after validated, expanded results
  4. Avoid --wait flag: Do NOT use --wait in commands. It's designed for human interactive use, not automated workflows.
  5. Maintain query AND criteria consistency: When appending or scaling up, use IDENTICAL query and criteria from validated search. Save criteria to file for consistency.
  6. CRITICAL - First search must use override: The library defaults to --behavior append. The first search on a webset MUST explicitly use --behavior override or it will fail with a "No previous search found" error.
  7. Use correct parameter names:
    • Use --behavior append or --behavior override (NOT --mode)
    • Commands like webset-search-get require both webset_id and search_id
  8. Choose specific entity types: Use company, person, etc. for better results
  9. Save IDs: Use jq to extract and save IDs for subsequent commands

Detailed Reference

For complete command references, syntax, and all options, consult REFERENCE.md and component-specific reference files.

Shared Requirements

<shared-requirements>
<shared-requirements>

Schema Design

MUST: Use object wrapper for schemas

Applies to: answer, search, find-similar, get-contents
When using schema parameters (--output-schema or --summary-schema), always wrap properties in an object:
json
{"type":"object","properties":{"field_name":{"type":"string"}}}
DO NOT use bare properties without the object wrapper (❌ missing "type":"object"):
json
{"properties":{"field_name":{"type":"string"}}}
Why: The Exa API requires a valid JSON Schema with an object type at the root level. Omitting this causes validation errors.
Examples:
bash
# ✅ CORRECT - object wrapper included
exa-ai search "AI news" \
  --summary-schema '{"type":"object","properties":{"headline":{"type":"string"}}}'

# ❌ WRONG - missing object wrapper
exa-ai search "AI news" \
  --summary-schema '{"properties":{"headline":{"type":"string"}}}'

Output Format Selection

MUST NOT: Mix toon format with jq

Applies to: answer, context, search, find-similar, get-contents
toon format produces YAML-like output, not JSON. DO NOT pipe toon output to jq for parsing:
bash
# ❌ WRONG - toon is not JSON
exa-ai search "query" --output-format toon | jq -r '.results'

# ✅ CORRECT - use JSON (default) with jq
exa-ai search "query" | jq -r '.results[].title'

# ✅ CORRECT - use toon for direct reading only
exa-ai search "query" --output-format toon
Why: jq expects valid JSON input. toon format is designed for human readability and produces YAML-like output that jq cannot parse.
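In a script that takes the output format as a variable, a small guard can stop toon output from ever reaching jq. `jq_compatible` is a hypothetical helper. It treats only JSON as safe to pipe, which assumes the pretty and text formats are not valid JSON either.

```bash
# jq_compatible FORMAT: succeeds only for JSON output.
# An empty FORMAT means the flag was omitted, which defaults to JSON.
jq_compatible() {
  case ${1:-json} in
    json) return 0 ;;
    *)    return 1 ;;
  esac
}

format=toon
if jq_compatible "$format"; then
  echo "safe to pipe to jq"
else
  echo "read $format output directly; do not pipe it to jq"
fi
```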

SHOULD: Choose one output approach

Applies to: answer, context, search, find-similar, get-contents
Pick one strategy and stick with it throughout your workflow:
  1. Approach 1: toon only - Compact YAML-like output for direct reading
    • Use when: Reading output directly, no further processing needed
    • Token savings: ~40% reduction vs JSON
    • Example: exa-ai search "query" --output-format toon
  2. Approach 2: JSON + jq - Extract specific fields programmatically
    • Use when: Need to extract specific fields or pipe to other commands
    • Token savings: ~80-90% reduction (extracts only needed fields)
    • Example: exa-ai search "query" | jq -r '.results[].title'
  3. Approach 3: Schemas + jq - Structured data extraction with validation
    • Use when: Need consistent structured output across multiple queries
    • Token savings: ~85% reduction + consistent schema
    • Example: exa-ai search "query" --summary-schema '{...}' | jq -r '.results[].summary | fromjson'
Why: Mixing approaches increases complexity and token usage. Choosing one approach optimizes for your use case.

Shell Command Best Practices

MUST: Run commands directly, parse separately

Applies to: monitor, search (websets), research, and all skills using complex commands
When using the Bash tool with complex shell syntax, run commands directly and parse output in separate steps:
bash
# ❌ WRONG - nested command substitution
webset_id=$(exa-ai webset-create --search '{"query":"..."}' | jq -r '.webset_id')

# ✅ CORRECT - run directly and save the output, then parse
exa-ai webset-create --search '{"query":"..."}' > output.json

# Then in a follow-up command:
webset_id=$(jq -r '.webset_id' output.json)
Why: Complex nested $(...) command substitutions can fail unpredictably in shell environments. Running commands directly and parsing separately improves reliability and makes debugging easier.

MUST NOT: Use nested command substitutions

Applies to: All skills when using complex multi-step operations
Avoid nesting multiple levels of command substitution:
bash
# ❌ WRONG - deeply nested
result=$(exa-ai search "$(cat query.txt | tr '\n' ' ')" --num-results $(cat config.json | jq -r '.count'))

# ✅ CORRECT - sequential steps
query=$(tr '\n' ' ' < query.txt)
count=$(jq -r '.count' config.json)
exa-ai search "$query" --num-results "$count"
Why: Nested command substitutions are fragile and hard to debug when they fail. Sequential steps make each operation explicit and easier to troubleshoot.

SHOULD: Break complex commands into sequential steps

Applies to: All skills when working with multi-step workflows
For readability and reliability, break complex operations into clear sequential steps:
bash
# ❌ Less maintainable - everything in one line
exa-ai webset-create --search '{"query":"startups","count":1}' | jq -r '.webset_id' | xargs -I {} exa-ai webset-search-create {} --query "AI" --behavior override

# ✅ More maintainable - clear steps
exa-ai webset-create --search '{"query":"startups","count":1}' > output.json
webset_id=$(jq -r '.webset_id' < output.json)
exa-ai webset-search-create $webset_id --query "AI" --behavior override
Why: Sequential steps are easier to understand, debug, and modify. Each step can be verified independently.

</shared-requirements>

</shared-requirements>