exa-search

Exa AI-Powered Web Search

Search query: $ARGUMENTS

Role & Positioning

Exa is the broad web search source with built-in content extraction:

| Skill | Best for |
|-------|----------|
| /arxiv | Direct preprint search and PDF download |
| /semantic-scholar | Published venue papers (IEEE, ACM, Springer), citation counts |
| /deepxiv | Layered reading: search, brief, section map, section reads |
| /exa-search | Broad web search: blogs, docs, news, companies, research papers — with content extraction |

Use Exa when you need results beyond academic databases, or when you want content (highlights, full text, summaries) extracted alongside search results.

Constants

  • FETCH_SCRIPT — tools/exa_search.py, relative to the current project.
  • MAX_RESULTS = 10 — Default number of results to return.
Overrides (append to arguments):
  • /exa-search "RAG pipelines" — max: 5
    — top 5 results
  • /exa-search "diffusion models" — category: research paper
    — research papers only
  • /exa-search "startup funding" — category: news, start date: 2025-01-01
    — recent news
  • /exa-search "transformer" — content: text, max chars: 8000
    — full text mode
  • /exa-search "transformer" — content: summary
    — LLM-generated summaries
  • /exa-search "transformer" — domains: arxiv.org,huggingface.co
    — domain filter
  • /exa-search "https://arxiv.org/abs/2301.07041" — similar
    — find similar pages

Setup

Exa requires the exa-py SDK and an API key:

```bash
pip install exa-py
```

Set your API key:

```bash
export EXA_API_KEY=your-key-here
```

Get a key from exa.ai.

Workflow

Step 1: Parse Arguments

Parse $ARGUMENTS for:
  • query: The search query (required) or a URL (for find-similar mode)
  • similar: If present, use find-similar mode instead of search
  • max: Override MAX_RESULTS
  • category: research paper, news, company, personal site, financial report, people
  • content: highlights (default), text, summary, none
  • max chars: Max characters for content extraction
  • type: Search type — auto (default), neural, fast, instant
  • domains: Comma-separated include domains
  • exclude domains: Comma-separated exclude domains
  • include text: Phrase that must appear in results
  • exclude text: Phrase to exclude from results
  • start date: ISO 8601 date — only results after this
  • end date: ISO 8601 date — only results before this
  • location: Two-letter ISO country code
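The parsing above can be sketched with a small helper (a hypothetical sketch, not part of exa_search.py): split the arguments at the first " — " separator, then read ", "-separated "key: value" pairs, so comma-joined domain lists stay intact.

```python
# Hypothetical parser for the /exa-search argument syntax shown above.
# Splits 'QUERY — key: value, key: value' into a query plus an options dict.
# A bare flag like "similar" becomes a key with an empty value.
def parse_arguments(raw: str) -> tuple[str, dict[str, str]]:
    query, sep, rest = raw.partition(" — ")
    opts: dict[str, str] = {}
    if sep:
        # Split on ", " (comma + space) so values like
        # "arxiv.org,huggingface.co" are not broken apart.
        for pair in rest.split(", "):
            key, _, value = pair.partition(":")
            opts[key.strip()] = value.strip()
    return query.strip().strip('"'), opts
```

This is only one plausible reading of the override syntax; the actual command handles parsing itself.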

Step 2: Locate Script

```bash
SCRIPT=$(find tools/ -name "exa_search.py" 2>/dev/null | head -1)
```

If not found, tell the user:

```
exa_search.py not found. Make sure tools/exa_search.py exists and exa-py is installed:
pip install exa-py
```

Step 3: Execute Search

Standard search:

```bash
python3 "$SCRIPT" search "QUERY" --max 10 --content highlights
```

With filters:

```bash
python3 "$SCRIPT" search "QUERY" --max 10 \
  --category "research paper" \
  --start-date 2025-01-01 \
  --content text --max-chars 8000
```

Find similar pages:

```bash
python3 "$SCRIPT" find-similar "URL" --max 5 --content highlights
```

Get content for known URLs:

```bash
python3 "$SCRIPT" get-contents "URL1" "URL2" --content text
```
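Mapping parsed overrides onto these CLI flags can be sketched as follows (hypothetical glue code; it assumes flag names are the override keys with spaces turned into hyphens, matching the examples above):

```python
import shlex

# Hypothetical: assemble the exa_search.py command line from parsed
# overrides. "max chars" -> --max-chars, "start date" -> --start-date, etc.
def build_command(script: str, query: str, opts: dict[str, str]) -> str:
    mode = "find-similar" if "similar" in opts else "search"
    args = ["python3", script, mode, query]
    for key, value in opts.items():
        if key == "similar":
            continue  # mode selector, already consumed above
        args += [f"--{key.replace(' ', '-')}", value]
    # shlex.quote keeps queries with spaces safe for the shell
    return " ".join(shlex.quote(a) for a in args)
```

A sketch under the stated assumptions, not the command's actual dispatch logic.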

Step 4: Present Results

Format results as a structured table:

| # | Title | Authors | Venue/Publisher | URL | Date | Key Content |
|---|-------|---------|-----------------|-----|------|-------------|

For each result:
  • Show title and URL
  • Show published date if available
  • Show highlights, text excerpt, or summary depending on content mode
  • Flag particularly relevant results
  • For category: "research paper" hits only — also record authors (from Exa's author/authors fields, or fallback: parse from the result snippet) and venue/publisher (from publisher, source, or the domain hosting the paper). These are needed by Step 6's wiki hook; if either is unavailable for a given hit, skip wiki ingest for that one hit and log a note.
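One way to render a row of that table (an illustration only; the field names title, authors, venue, url, published_date, and highlights are assumptions about the script's output, not a documented schema):

```python
# Hypothetical: render one search result as a markdown table row.
# Missing optional fields fall back to "-"; highlights are joined and
# truncated so the Key Content cell stays short.
def result_row(index: int, hit: dict) -> str:
    cells = [
        str(index),
        hit.get("title", ""),
        hit.get("authors", "-"),
        hit.get("venue", "-"),
        hit.get("url", ""),
        hit.get("published_date", "-"),
        " / ".join(hit.get("highlights", []))[:120],
    ]
    return "| " + " | ".join(cells) + " |"
```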

Step 5: Offer Follow-up

After presenting results, suggest:
  • Deepen: "I can fetch full text for any of these results"
  • Find similar: "I can find pages similar to any result"
  • Narrow: "I can re-search with domain/date/text filters"

Step 6: Update Research Wiki (if active, research-paper results only)

Required when research-wiki/ exists AND the search returned results of category: "research paper"; skip silently otherwise. General web results (blog posts, docs, news) are not ingested — the wiki is for papers only.

For each research paper hit, try to recover an arXiv ID from the URL (arxiv.org/abs/<id>); if present, use --arxiv-id. Otherwise fall back to manual metadata:

```
if [ -d research-wiki/ ] and query category was "research paper":
    for each research-paper hit in results:
        if URL matches arxiv.org/abs/<id>:
            python3 tools/research_wiki.py ingest_paper research-wiki/ \
                --arxiv-id "<id>"
        else:
            python3 tools/research_wiki.py ingest_paper research-wiki/ \
                --title "<title>" --authors "<authors joined by , >" \
                --year <year> --venue "<venue or publisher>"
```

The helper handles slug / dedup / page / index / log — do not handwrite papers/<slug>.md. See shared-references/integration-contract.md.

Key Rules

  • Always check that EXA_API_KEY is set before searching
  • Default to highlights content mode for a good balance of speed and context
  • Use category: "research paper" when the user is clearly looking for academic content
  • Use text content mode when the user needs full page content
  • Combine with /arxiv or /semantic-scholar for comprehensive literature coverage
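The first rule can be enforced with a small guard before any call (a sketch; the error message text is illustrative, not the tool's actual output):

```python
import os

# Sketch: fail fast with a setup hint when EXA_API_KEY is missing or blank.
def require_api_key() -> str:
    key = os.environ.get("EXA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "EXA_API_KEY is not set. Export it first: "
            "export EXA_API_KEY=your-key-here (get a key from exa.ai)"
        )
    return key
```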