exa-search
Exa AI-Powered Web Search
Search query: $ARGUMENTS
Role & Positioning
Exa is the broad web search source with built-in content extraction:
| Skill | Best for |
|---|---|
| /arxiv | Direct preprint search and PDF download |
| /semantic-scholar | Published venue papers (IEEE, ACM, Springer), citation counts |
| | Layered reading: search, brief, section map, section reads |
| /exa-search | Broad web search: blogs, docs, news, companies, research papers, with content extraction |
Use Exa when you need results beyond academic databases, or when you want content (highlights, full text, summaries) extracted alongside search results.
Constants
- FETCH_SCRIPT = tools/exa_search.py — relative to the current project.
- MAX_RESULTS = 10 — default number of results to return.
Overrides (append to arguments):
- Top 5 results: /exa-search "RAG pipelines" — max: 5
- Research papers only: /exa-search "diffusion models" — category: research paper
- Recent news: /exa-search "startup funding" — category: news, start date: 2025-01-01
- Full text mode: /exa-search "transformer" — content: text, max chars: 8000
- LLM-generated summaries: /exa-search "transformer" — content: summary
- Domain filter: /exa-search "transformer" — domains: arxiv.org,huggingface.co
- Find similar pages: /exa-search "https://arxiv.org/abs/2301.07041" — similar
Setup
Exa requires the exa-py SDK and an API key:

```bash
pip install exa-py
```

Set your API key:

```bash
export EXA_API_KEY=your-key-here
```

Get a key from exa.ai.
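To confirm the key and SDK are wired up before invoking the script, a quick sanity check can help. This is a minimal sketch assuming the standard exa-py client; the one-result query is arbitrary:

```python
import os

from exa_py import Exa  # installed via: pip install exa-py

# Fail fast if the key is missing rather than deep inside a search call.
api_key = os.environ.get("EXA_API_KEY")
if not api_key:
    raise SystemExit("EXA_API_KEY is not set; export it before searching.")

exa = Exa(api_key)
# A single-result query is enough to prove the key and network path work.
response = exa.search("exa.ai web search API", num_results=1)
print(response.results[0].title, response.results[0].url)
```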
Workflow
Step 1: Parse Arguments
Parse $ARGUMENTS for:

- query: The search query (required) or a URL (for find-similar mode)
- similar: If present, use find-similar mode instead of search
- max: Override MAX_RESULTS
- category: research paper, news, company, personal site, financial report, people
- content: highlights (default), text, summary, none
- max chars: Max characters for content extraction
- type: Search type — auto (default), neural, fast, instant
- domains: Comma-separated include domains
- exclude domains: Comma-separated exclude domains
- include text: Phrase that must appear in results
- exclude text: Phrase to exclude from results
- start date: ISO 8601 date — only results after this
- end date: ISO 8601 date — only results before this
- location: Two-letter ISO country code
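One way this step can end up as a concrete command line is sketched below. The mapping helper and its option names are hypothetical; only --max, --category, --content, --max-chars, and --start-date appear in Step 3, so the remaining flag spellings are assumptions about exa_search.py:

```python
# Hypothetical sketch: turn options parsed from $ARGUMENTS into flags for
# tools/exa_search.py. Flag names beyond those shown in Step 3 are assumed.
def to_cli_args(query: str, opts: dict[str, str]) -> list[str]:
    flag_for = {
        "max": "--max", "category": "--category", "content": "--content",
        "max chars": "--max-chars", "type": "--type",
        "domains": "--domains", "exclude domains": "--exclude-domains",
        "include text": "--include-text", "exclude text": "--exclude-text",
        "start date": "--start-date", "end date": "--end-date",
        "location": "--location",
    }
    # "similar" switches the subcommand; everything else becomes a flag pair.
    args = ["find-similar" if "similar" in opts else "search", query]
    for key, value in opts.items():
        if key in flag_for:
            args += [flag_for[key], value]
    return args

# /exa-search "diffusion models" — category: research paper — max: 5
print(to_cli_args("diffusion models", {"category": "research paper", "max": "5"}))
```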
Step 2: Locate Script
```bash
SCRIPT=$(find tools/ -name "exa_search.py" 2>/dev/null | head -1)
```

If not found, tell the user:

```
exa_search.py not found. Make sure tools/exa_search.py exists and exa-py is installed:
pip install exa-py
```
Step 3: Execute Search
Standard search:
```bash
python3 "$SCRIPT" search "QUERY" --max 10 --content highlights
```

With filters:

```bash
python3 "$SCRIPT" search "QUERY" --max 10 \
  --category "research paper" \
  --start-date 2025-01-01 \
  --content text --max-chars 8000
```

Find similar pages:

```bash
python3 "$SCRIPT" find-similar "URL" --max 5 --content highlights
```

Get content for known URLs:

```bash
python3 "$SCRIPT" get-contents "URL1" "URL2" --content text
```
Step 4: Present Results
Format results as a structured table:
| # | Title | Authors | Venue/Publisher | URL | Date | Key Content |
|---|-------|---------|-----------------|-----|------|-------------|

For each result:
- Show title and URL
- Show published date if available
- Show highlights, text excerpt, or summary depending on content mode
- Flag particularly relevant results
- For category: "research paper" hits only, also record authors (from Exa's author/authors fields, or fallback: parse from the result snippet) and venue/publisher (from publisher, source, or the domain hosting the paper). These are needed by Step 6's wiki hook; if either is unavailable for a given hit, skip wiki ingest for that one hit and log a note.
Step 5: Offer Follow-up
After presenting results, suggest:
- Deepen: "I can fetch full text for any of these results"
- Find similar: "I can find pages similar to any result"
- Narrow: "I can re-search with domain/date/text filters"
Step 6: Update Research Wiki (if active, research-paper results only)
Required when research-wiki/ exists AND the search returned results of category: "research paper"; skip silently otherwise. General web results (blog posts, docs, news) are not ingested — the wiki is for papers only.

For each research paper hit, try to recover an arXiv ID from the URL (arxiv.org/abs/<id>); if present, use --arxiv-id. Otherwise fall back to manual metadata:

```
if [ -d research-wiki/ ] and query category was "research paper":
  for each research-paper hit in results:
    if URL matches arxiv.org/abs/<id>:
      python3 tools/research_wiki.py ingest_paper research-wiki/ \
        --arxiv-id "<id>"
    else:
      python3 tools/research_wiki.py ingest_paper research-wiki/ \
        --title "<title>" --authors "<authors joined by , >" \
        --year <year> --venue "<venue or publisher>"
```

The helper handles slug / dedup / page / index / log — do not handwrite papers/<slug>.md. See shared-references/integration-contract.md.
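The arXiv-ID recovery and helper invocation above can be made concrete with a small routing function. A sketch: the regex only covers modern IDs like 2301.07041 (optionally versioned), so old-style IDs such as cs/0112017 would need a second pattern; the helper's flags are exactly those shown in the block above:

```python
import re
import subprocess

ARXIV_ABS = re.compile(r"arxiv\.org/abs/(\d{4}\.\d{4,5})(?:v\d+)?", re.IGNORECASE)

def ingest_hit(url: str, title: str, authors: list[str], year: int, venue: str) -> None:
    """Route one research-paper hit to the wiki helper, per the logic above."""
    base = ["python3", "tools/research_wiki.py", "ingest_paper", "research-wiki/"]
    match = ARXIV_ABS.search(url)
    if match:
        cmd = base + ["--arxiv-id", match.group(1)]
    else:
        cmd = base + ["--title", title, "--authors", ", ".join(authors),
                      "--year", str(year), "--venue", venue]
    subprocess.run(cmd, check=True)
```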
Key Rules
- Always check that EXA_API_KEY is set before searching
- Default to highlights content mode for a good balance of speed and context
- Use category: "research paper" when the user is clearly looking for academic content
- Use text content mode when the user needs full page content
- Combine with /arxiv or /semantic-scholar for comprehensive literature coverage