exa-search

Exa AI-Powered Web Search

Search query: $ARGUMENTS

Role & Positioning

Exa is the broad web search source with built-in content extraction:

| Skill | Best for |
|-------|----------|
| /arxiv | Direct preprint search and PDF download |
| /semantic-scholar | Published venue papers (IEEE, ACM, Springer), citation counts |
| /deepxiv | Layered reading: search, brief, section map, section reads |
| /exa-search | Broad web search: blogs, docs, news, companies, research papers — with content extraction |

Use Exa when you need results beyond academic databases, or when you want content (highlights, full text, summaries) extracted alongside search results.

Constants

  • FETCH_SCRIPT — tools/exa_search.py, relative to the current project.
  • MAX_RESULTS = 10 — Default number of results to return.
Overrides (append to arguments):
  • /exa-search "RAG pipelines" — max: 5
    — top 5 results
  • /exa-search "diffusion models" — category: research paper
    — research papers only
  • /exa-search "startup funding" — category: news, start date: 2025-01-01
    — recent news
  • /exa-search "transformer" — content: text, max chars: 8000
    — full text mode
  • /exa-search "transformer" — content: summary
    — LLM-generated summaries
  • /exa-search "transformer" — domains: arxiv.org,huggingface.co
    — domain filter
  • /exa-search "https://arxiv.org/abs/2301.07041" — similar
    — find similar pages

Setup

Exa requires the exa-py SDK and an API key:

```bash
pip install exa-py
```

Set your API key:

```bash
export EXA_API_KEY=your-key-here
```

Get a key from exa.ai.

Workflow

Step 1: Parse Arguments

Parse $ARGUMENTS for:
  • query: The search query (required) or a URL (for find-similar mode)
  • similar: If present, use find-similar mode instead of search
  • max: Override MAX_RESULTS
  • category: research paper, news, company, personal site, financial report, people
  • content: highlights (default), text, summary, none
  • max chars: Max characters for content extraction
  • type: Search type — auto (default), neural, fast, instant
  • domains: Comma-separated include domains
  • exclude domains: Comma-separated exclude domains
  • include text: Phrase that must appear in results
  • exclude text: Phrase to exclude from results
  • start date: ISO 8601 date — only results after this
  • end date: ISO 8601 date — only results before this
  • location: Two-letter ISO country code
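The parsing above can be sketched with a small helper (a hypothetical sketch, not part of exa_search.py): split the arguments at the first " — " separator, then read ", "-separated "key: value" pairs, so comma-joined domain lists stay intact.

```python
# Hypothetical parser for the /exa-search argument syntax shown above.
# Splits 'QUERY — key: value, key: value' into a query plus an options dict.
# A bare flag like "similar" becomes a key with an empty value.
def parse_arguments(raw: str) -> tuple[str, dict[str, str]]:
    query, sep, rest = raw.partition(" — ")
    opts: dict[str, str] = {}
    if sep:
        # Split on ", " (comma + space) so values like
        # "arxiv.org,huggingface.co" are not broken apart.
        for pair in rest.split(", "):
            key, _, value = pair.partition(":")
            opts[key.strip()] = value.strip()
    return query.strip().strip('"'), opts
```

This is only one plausible reading of the override syntax; the actual command handles parsing itself.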

Step 2: Locate Script

```bash
SCRIPT=$(find tools/ -name "exa_search.py" 2>/dev/null | head -1)
```

If not found, tell the user:

```
exa_search.py not found. Make sure tools/exa_search.py exists and exa-py is installed:
pip install exa-py
```

Step 3: Execute Search

Standard search:

```bash
python3 "$SCRIPT" search "QUERY" --max 10 --content highlights
```

With filters:

```bash
python3 "$SCRIPT" search "QUERY" --max 10 \
  --category "research paper" \
  --start-date 2025-01-01 \
  --content text --max-chars 8000
```

Find similar pages:

```bash
python3 "$SCRIPT" find-similar "URL" --max 5 --content highlights
```

Get content for known URLs:

```bash
python3 "$SCRIPT" get-contents "URL1" "URL2" --content text
```
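Mapping parsed overrides onto these CLI flags can be sketched as follows (hypothetical glue code; it assumes flag names are the override keys with spaces turned into hyphens, matching the examples above):

```python
import shlex

# Hypothetical: assemble the exa_search.py command line from parsed
# overrides. "max chars" -> --max-chars, "start date" -> --start-date, etc.
def build_command(script: str, query: str, opts: dict[str, str]) -> str:
    mode = "find-similar" if "similar" in opts else "search"
    args = ["python3", script, mode, query]
    for key, value in opts.items():
        if key == "similar":
            continue  # mode selector, already consumed above
        args += [f"--{key.replace(' ', '-')}", value]
    # shlex.quote keeps queries with spaces safe for the shell
    return " ".join(shlex.quote(a) for a in args)
```

A sketch under the stated assumptions, not the command's actual dispatch logic.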

Step 4: Present Results

Format results as a structured table:

| # | Title | Authors | Venue/Publisher | URL | Date | Key Content |
|---|-------|---------|-----------------|-----|------|-------------|

For each result:
  • Show title and URL
  • Show published date if available
  • Show highlights, text excerpt, or summary depending on content mode
  • Flag particularly relevant results
  • For category: "research paper" hits only — also record authors (from Exa's author/authors fields, or fallback: parse from the result snippet) and venue/publisher (from publisher, source, or the domain hosting the paper). These are needed by Step 6's wiki hook; if either is unavailable for a given hit, skip wiki ingest for that one hit and log a note.
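One way to render a row of that table (an illustration only; the field names title, authors, venue, url, published_date, and highlights are assumptions about the script's output, not a documented schema):

```python
# Hypothetical: render one search result as a markdown table row.
# Missing optional fields fall back to "-"; highlights are joined and
# truncated so the Key Content cell stays short.
def result_row(index: int, hit: dict) -> str:
    cells = [
        str(index),
        hit.get("title", ""),
        hit.get("authors", "-"),
        hit.get("venue", "-"),
        hit.get("url", ""),
        hit.get("published_date", "-"),
        " / ".join(hit.get("highlights", []))[:120],
    ]
    return "| " + " | ".join(cells) + " |"
```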

Step 5: Offer Follow-up

After presenting results, suggest:
  • Deepen: "I can fetch full text for any of these results"
  • Find similar: "I can find pages similar to any result"
  • Narrow: "I can re-search with domain/date/text filters"

Step 6: Update Research Wiki (if active, research-paper results only)

Required when research-wiki/ exists AND the search returned results of category: "research paper"; skip silently otherwise. General web results (blog posts, docs, news) are not ingested — the wiki is for papers only.

For each research paper hit, try to recover an arXiv ID from the URL (arxiv.org/abs/<id>); if present, use --arxiv-id. Otherwise fall back to manual metadata:

```
if [ -d research-wiki/ ] and query category was "research paper":
    for each research-paper hit in results:
        if URL matches arxiv.org/abs/<id>:
            python3 tools/research_wiki.py ingest_paper research-wiki/ \
                --arxiv-id "<id>"
        else:
            python3 tools/research_wiki.py ingest_paper research-wiki/ \
                --title "<title>" --authors "<authors joined by , >" \
                --year <year> --venue "<venue or publisher>"
```

The helper handles slug / dedup / page / index / log — do not handwrite papers/<slug>.md. See shared-references/integration-contract.md.

Key Rules

  • Always check that EXA_API_KEY is set before searching
  • Default to highlights content mode for a good balance of speed and context
  • Use category: "research paper" when the user is clearly looking for academic content
  • Use text content mode when the user needs full page content
  • Combine with /arxiv or /semantic-scholar for comprehensive literature coverage
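The first rule can be enforced with a small guard before any call (a sketch; the error message text is illustrative, not the tool's actual output):

```python
import os

# Sketch: fail fast with a setup hint when EXA_API_KEY is missing or blank.
def require_api_key() -> str:
    key = os.environ.get("EXA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "EXA_API_KEY is not set. Export it first: "
            "export EXA_API_KEY=your-key-here (get a key from exa.ai)"
        )
    return key
```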