search-strategy
Search Strategy
If you see unfamiliar placeholders or need to check which tools are connected, see CONNECTORS.md.
The core intelligence behind enterprise search. Transforms a single natural language question into parallel, source-specific searches and produces ranked, deduplicated results.
The Goal
Turn this:
"What did we decide about the API migration timeline?"
Into targeted searches across every connected source:
~~chat: "API migration timeline decision" (semantic) + "API migration" in:#engineering after:2025-01-01
~~knowledge base: semantic search "API migration timeline decision"
~~project tracker: text search "API migration" in relevant workspace
Then synthesize the results into a single coherent answer.
Query Decomposition
Step 1: Identify Query Type
Classify the user's question to determine search strategy:
| Query Type | Example | Strategy |
|---|---|---|
| Decision | "What did we decide about X?" | Prioritize conversations (~~chat, email), look for conclusion signals |
| Status | "What's the status of Project Y?" | Prioritize recent activity, task trackers, status updates |
| Document | "Where's the spec for Z?" | Prioritize Drive, wiki, shared docs |
| Person | "Who's working on X?" | Search task assignments, message authors, doc collaborators |
| Factual | "What's our policy on X?" | Prioritize wiki, official docs, then confirmatory conversations |
| Temporal | "When did X happen?" | Search with broad date range, look for timestamps |
| Exploratory | "What do we know about X?" | Broad search across all sources, synthesize |
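The classification step can be sketched as a first-match lookup over cue words. This is a minimal illustration, not a definitive classifier: the cue patterns and the `classify_query` name are assumptions chosen to cover the example questions in the table, and a real system would likely need richer signals.

```python
import re

# Hypothetical cue patterns, checked in order; the first match wins.
# Anything that matches no pattern is treated as exploratory.
QUERY_TYPE_CUES = [
    ("decision", r"\b(decide|decided|decision|agreed?|conclusion)\b"),
    ("status", r"\b(status|progress|update|blocked)\b"),
    ("document", r"\b(where'?s|where is|spec|doc|document|link)\b"),
    ("person", r"\bwho('s| is| are)?\b"),
    ("factual", r"\b(policy|rule|guideline|process)\b"),
    ("temporal", r"\bwhen\b"),
]


def classify_query(question: str) -> str:
    """Return the query type whose cue pattern matches first."""
    text = question.lower()
    for query_type, pattern in QUERY_TYPE_CUES:
        if re.search(pattern, text):
            return query_type
    return "exploratory"
```

Ordering matters here: "When did we decide on X?" contains both a temporal and a decision cue, and this sketch resolves it as a decision query because that pattern is checked first.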
Step 2: Extract Search Components
From the query, extract:
- Keywords: Core terms that must appear in results
- Entities: People, projects, teams, tools (use memory system if available)
- Intent signals: Decision words, status words, temporal markers
- Constraints: Time ranges, source hints, author filters
- Negations: Things to exclude
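One way to hold the five extracted components is a small record filled by a token-by-token pass. The sketch below is illustrative only: the `SearchComponents` type, the stopword and intent word lists, the `known_entities` argument (standing in for a memory system), and the `-term` exclusion syntax are all assumptions, not part of the strategy itself.

```python
import re
from dataclasses import dataclass, field

# Hypothetical word lists, kept tiny for illustration.
STOPWORDS = {"what", "did", "we", "about", "the", "a", "an", "is", "of", "for", "on", "our", "do", "to"}
INTENT_WORDS = {"decide": "decision", "decided": "decision", "decision": "decision",
                "status": "status", "when": "temporal"}


@dataclass
class SearchComponents:
    keywords: list[str] = field(default_factory=list)
    entities: list[str] = field(default_factory=list)
    intent_signals: list[str] = field(default_factory=list)
    constraints: dict[str, str] = field(default_factory=dict)
    negations: list[str] = field(default_factory=list)


def extract_components(query: str, known_entities: set[str]) -> SearchComponents:
    """Sort each whitespace-separated token into one of the five buckets."""
    parts = SearchComponents()
    for token in query.split():
        # Constraints: filter tokens such as after:2025-01-01 or in:#engineering.
        filter_match = re.fullmatch(r"(from|in|after|before):(\S+)", token)
        if filter_match:
            parts.constraints[filter_match.group(1)] = filter_match.group(2)
            continue
        # Negations: assumed here to be written as -term.
        if token.startswith("-") and len(token) > 1:
            parts.negations.append(token[1:].lower())
            continue
        word = re.sub(r"[^\w-]", "", token).lower()
        if not word:
            continue
        if word in INTENT_WORDS:
            parts.intent_signals.append(INTENT_WORDS[word])
        elif word in known_entities:
            parts.entities.append(word)
        elif word not in STOPWORDS:
            parts.keywords.append(word)
    return parts
```

For example, `extract_components("What did we decide about the aurora migration after:2025-01-01 -staging", {"aurora"})` separates the entity `aurora`, the keyword `migration`, a decision signal, a date constraint, and the exclusion `staging`.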
Step 3: Generate Sub-Queries Per Source
For each available source, create one or more targeted queries:
Prefer semantic search for:
- Conceptual questions ("What do we think about...")
- Questions where exact keywords are unknown
- Exploratory queries
Prefer keyword search for:
- Known terms, project names, acronyms
- Exact phrases the user quoted
- Filter-heavy queries (from:, in:, after:)
Generate multiple query variants when the topic might be referred to differently:
User: "Kubernetes setup"
Queries: "Kubernetes", "k8s", "cluster", "container orchestration"
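The choice between search modes and the expansion into variants could look roughly like this. Both functions are hypothetical sketches: `choose_search_mode` only detects filters and quoted phrases (recognizing known terms or acronyms would need a vocabulary that is not modeled here), and the `ALIASES` table is an assumed stand-in for whatever synonym source is available.

```python
import re

# Hypothetical alias table; in practice this might come from a glossary or memory system.
ALIASES = {"kubernetes": ["k8s", "cluster", "container orchestration"]}


def choose_search_mode(query: str) -> str:
    """Prefer keyword search for filter-heavy or quoted queries, else semantic."""
    has_filter = re.search(r"\b(from|in|after):\S", query) is not None
    has_quoted_phrase = re.search(r'"[^"]+"', query) is not None
    return "keyword" if has_filter or has_quoted_phrase else "semantic"


def query_variants(user_query: str, aliases: dict[str, list[str]]) -> list[str]:
    """Expand every aliased term found in the query into its known variants."""
    variants: list[str] = []
    lowered = user_query.lower()
    for term, alternates in aliases.items():
        if term in lowered:
            for variant in [term, *alternates]:
                if variant not in variants:
                    variants.append(variant)
    # With no known aliases, fall back to the query as written.
    return variants or [user_query]
```

Each variant would then be issued as its own sub-query, with results merged during ranking.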
Source-Specific Query Translation
~~chat
Semantic search (natural language questions):
query: "What is the status of project aurora?"
Keyword search:
query: "project aurora status update"
query: "aurora in:#engineering after:2025-01-15"
query: "from:<@UserID> aurora"
Filter mapping:
| Enterprise filter | ~~chat syntax |
|---|---|
| |
| |
| |
| |
| |
| |
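A keyword query for ~~chat can be assembled from structured filters. The sketch below uses only the modifiers that appear in the keyword examples above (`in:`, `after:`, `from:`); the function name and its parameters are hypothetical, and any other modifier a given chat tool supports would need to be checked against that tool.

```python
from datetime import date
from typing import Optional


def build_chat_keyword_query(
    terms: str,
    channel: Optional[str] = None,
    after: Optional[date] = None,
    author_id: Optional[str] = None,
) -> str:
    """Join search terms with the in:/after:/from: modifiers shown in the examples."""
    parts = []
    if author_id:
        parts.append(f"from:<@{author_id}>")
    parts.append(terms)
    if channel:
        parts.append(f"in:#{channel}")
    if after:
        parts.append(f"after:{after.isoformat()}")
    return " ".join(parts)
```

Calling `build_chat_keyword_query("aurora", channel="engineering", after=date(2025, 1, 15))` reproduces the second keyword example.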
~~knowledge base (Wiki)
Semantic search (use for conceptual queries):
descriptive_query: "API migration timeline and decision rationale"
Keyword search (use for exact terms):
query: "API migration"
query: "\"API migration timeline\"" (exact phrase)
~~project tracker
Task search:
text: "API migration"
workspace: [workspace_id]
completed: false (for status queries)
assignee_any: "me" (for "my tasks" queries)
Filter mapping:
| Enterprise filter | ~~project tracker parameter |
|---|---|
| |
| |
| |
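The task search parameters listed above can be assembled conditionally. This is a sketch under assumptions: the parameter names (`text`, `workspace`, `completed`, `assignee_any`) are taken from the example, while the `build_task_search` function and its `query_type` and `mine` arguments are hypothetical.

```python
def build_task_search(text: str, workspace_id: str, query_type: str = "exploratory", mine: bool = False) -> dict:
    """Build a ~~project tracker search payload from the example parameters."""
    params: dict = {"text": text, "workspace": workspace_id}
    if query_type == "status":
        # Status queries focus on open work.
        params["completed"] = False
    if mine:
        # "My tasks" queries restrict to the current user.
        params["assignee_any"] = "me"
    return params
```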
Result Ranking
Relevance Scoring
Score each result on these factors (weighted by query type):
| Factor | Weight (Decision) | Weight (Status) | Weight (Document) | Weight (Factual) |
|---|---|---|---|---|
| Keyword match | 0.3 | 0.2 | 0.4 | 0.3 |
| Freshness | 0.3 | 0.4 | 0.2 | 0.1 |
| Authority | 0.2 | 0.1 | 0.3 | 0.4 |
| Completeness | 0.2 | 0.3 | 0.1 | 0.2 |
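The weighted score can be computed as a plain dot product of the table's weights and per-result factor values. The weights below are copied from the table; the assumption that each factor is normalized to the range 0 to 1, and the `relevance_score` name, are illustrative choices rather than something the table specifies.

```python
# Weights copied from the table above; each column sums to 1.0.
WEIGHTS = {
    "decision": {"keyword_match": 0.3, "freshness": 0.3, "authority": 0.2, "completeness": 0.2},
    "status": {"keyword_match": 0.2, "freshness": 0.4, "authority": 0.1, "completeness": 0.3},
    "document": {"keyword_match": 0.4, "freshness": 0.2, "authority": 0.3, "completeness": 0.1},
    "factual": {"keyword_match": 0.3, "freshness": 0.1, "authority": 0.4, "completeness": 0.2},
}


def relevance_score(factors: dict[str, float], query_type: str) -> float:
    """Weighted sum of factor values (assumed to lie in [0, 1]) for one result."""
    weights = WEIGHTS[query_type]
    return sum(weights[name] * factors[name] for name in weights)
```

A fresh but low-authority chat message, for instance, scores higher for a status query than for a factual one, because freshness carries 0.4 of the weight in the first case and only 0.1 in the second.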
Authority Hierarchy
Depends on query type:
For factual/policy questions:
Wiki/Official docs > Shared documents > Email announcements > Chat messages
For "what happened" / decision questions:
Meeting notes > Thread conclusions > Email confirmations > Chat messages
For status questions:
Task tracker > Recent chat > Status docs > Email updates
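These orderings can be turned into a numeric authority factor by position in the list. The source-kind labels and the linear spacing between ranks are assumptions made for this sketch; the hierarchy itself only fixes the order, not the distances.

```python
# Orderings taken from the hierarchy above; the string labels are hypothetical.
AUTHORITY_ORDER = {
    "factual": ["wiki", "shared_doc", "email", "chat"],
    "decision": ["meeting_notes", "thread_conclusion", "email", "chat"],
    "status": ["task_tracker", "chat", "status_doc", "email"],
}


def authority_score(source_kind: str, query_type: str) -> float:
    """Map a source's rank to a score in (0, 1]; unlisted sources get 0.0."""
    order = AUTHORITY_ORDER[query_type]
    if source_kind not in order:
        return 0.0
    rank = order.index(source_kind)
    return 1.0 - rank / len(order)
```

With this spacing a chat message scores 0.25 for a factual query but 0.75 for a status query, matching its different position in each ordering.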
Handling Ambiguity
When a query is ambiguous, prefer asking one focused clarifying question over guessing:
Ambiguous: "search for the migration"
→ "I found references to a few migrations. Are you looking for:
1. The database migration (Project Phoenix)
2. The cloud migration (AWS → GCP)
3. The email migration (Exchange → O365)"
Only ask for clarification when:
- There are genuinely distinct interpretations that would produce very different results
- The ambiguity would significantly affect which sources to search
Do NOT ask for clarification when:
- The query is clear enough to produce useful results
- Minor ambiguity can be resolved by returning results from multiple interpretations
Fallback Strategies
When a source is unavailable or returns no results:
- Source unavailable: Skip it, search remaining sources, note the gap
- No results from a source: Try broader query terms, remove date filters, try alternate keywords
- All sources return nothing: Suggest query modifications to the user
- Rate limited: Note the limitation, return results from other sources, suggest retrying later
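The four fallback cases can be sketched as a planning step over per-source outcomes. The `Outcome` enum, the `plan_fallback` function, and the wording of the notes are all hypothetical; the point is only that each outcome leads to a distinct follow-up.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    EMPTY = "empty"
    UNAVAILABLE = "unavailable"
    RATE_LIMITED = "rate_limited"


def plan_fallback(outcomes: dict[str, Outcome]) -> dict[str, list[str]]:
    """Decide which sources to retry with broader queries and what gaps to report."""
    plan: dict[str, list[str]] = {"retry_broader": [], "notes": []}
    for source, outcome in outcomes.items():
        if outcome is Outcome.UNAVAILABLE:
            plan["notes"].append(f"{source} was unavailable; results may be incomplete")
        elif outcome is Outcome.RATE_LIMITED:
            plan["notes"].append(f"{source} is rate limited; try again later")
        elif outcome is Outcome.EMPTY:
            plan["retry_broader"].append(source)
    # Every source came back empty: suggest rephrasing to the user.
    if outcomes and all(o is Outcome.EMPTY for o in outcomes.values()):
        plan["notes"].append("No results from any source; consider rephrasing the query")
    return plan
```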
Query Broadening
If initial queries return too few results:
Original: "PostgreSQL migration Q2 timeline decision"
Broader: "PostgreSQL migration"
Broader: "database migration"
Broadest: "migration"
Remove constraints in this order:
1. Date filters (search all time)
2. Source/location filters
3. Less important keywords
4. Keep only core entity/topic terms
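The removal order can be sketched as a function that yields progressively broader queries. This is a simplified illustration: the `broaden` name and the `core_terms` argument are assumptions, the filter syntax follows the `in:`/`from:`/`after:` style used elsewhere in this document, and the last two steps are collapsed into one because ranking keyword importance needs information not modeled here.

```python
import re


def broaden(query: str, core_terms: list[str]) -> list[str]:
    """Return the original query followed by progressively broader versions."""
    steps = [query]
    # 1. Drop date filters.
    no_dates = re.sub(r"\b(after|before):\S+", "", query)
    # 2. Drop source/location filters.
    no_filters = re.sub(r"\b(in|from):\S+", "", no_dates)
    # 3-4. Keep only the core entity/topic terms supplied by the caller.
    core_only = " ".join(core_terms)
    for candidate in (no_dates, no_filters, core_only):
        candidate = " ".join(candidate.split())  # normalize whitespace
        if candidate and candidate != steps[-1]:
            steps.append(candidate)
    return steps
```

Each step would be tried only if the previous one still returns too few results, so the narrowest query that yields enough results wins.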
Parallel Execution
Always execute searches across sources in parallel, never sequentially. The total search time should be roughly equal to the slowest single source, not the sum of all sources.
[User query]
↓ decompose
[~~chat query] [~~email query] [~~cloud storage query] [Wiki query] [~~project tracker query]
↓ ↓ ↓ ↓ ↓
(parallel execution)
↓
[Merge + Rank + Deduplicate]
↓
[Synthesized answer]
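The fan-out and merge in the diagram could be sketched with `asyncio.gather`. Everything source-specific here is a stand-in: `search_source` simulates a connector call with a sleep, and the `FAKE_SOURCES` table, the `doc_id` deduplication key, and the fixed scores are assumptions for illustration only.

```python
import asyncio

# Hypothetical sources: name -> (simulated latency in seconds, score of its single hit).
FAKE_SOURCES = {"chat": (0.1, 0.7), "wiki": (0.2, 0.9), "tracker": (0.05, 0.5)}


async def search_source(name: str, delay: float, score: float) -> list[dict]:
    """Stand-in for a real connector call; sleeps to simulate network latency."""
    await asyncio.sleep(delay)
    return [{"source": name, "doc_id": f"{name}-1", "score": score}]


async def search_all(sources: dict[str, tuple[float, float]]) -> list[dict]:
    """Query every source concurrently, then merge, deduplicate, and rank."""
    tasks = [search_source(name, delay, score) for name, (delay, score) in sources.items()]
    # return_exceptions=True keeps one failing source from cancelling the rest.
    batches = await asyncio.gather(*tasks, return_exceptions=True)
    merged: dict[str, dict] = {}
    for batch in batches:
        if isinstance(batch, BaseException):
            continue  # source unavailable: skip it and use the others
        for hit in batch:
            best = merged.get(hit["doc_id"])
            if best is None or hit["score"] > best["score"]:
                merged[hit["doc_id"]] = hit
    return sorted(merged.values(), key=lambda hit: hit["score"], reverse=True)
```

Running `asyncio.run(search_all(FAKE_SOURCES))` should take roughly as long as the slowest simulated source (about 0.2 s) rather than the 0.35 s sum of all three.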