Search Strategy

If you see unfamiliar placeholders or need to check which tools are connected, see CONNECTORS.md.

The search strategy is the core intelligence behind enterprise search: it transforms a single natural-language question into parallel, source-specific searches and produces ranked, deduplicated results.

The Goal

Turn this:

"What did we decide about the API migration timeline?"

Into targeted searches across every connected source:

~~chat:  "API migration timeline decision" (semantic) + "API migration" in:#engineering after:2025-01-01
~~knowledge base: semantic search "API migration timeline decision"
~~project tracker:  text search "API migration" in relevant workspace

Then synthesize the results into a single coherent answer.

Query Decomposition

Step 1: Identify Query Type

Classify the user's question to determine the search strategy:

Query Type	Example	Strategy
Decision	"What did we decide about X?"	Prioritize conversations (~~chat, email), look for conclusion signals
Status	"What's the status of Project Y?"	Prioritize recent activity, task trackers, status updates
Document	"Where's the spec for Z?"	Prioritize Drive, wiki, shared docs
Person	"Who's working on X?"	Search task assignments, message authors, doc collaborators
Factual	"What's our policy on X?"	Prioritize wiki, official docs, then confirmatory conversations
Temporal	"When did X happen?"	Search with broad date range, look for timestamps
Exploratory	"What do we know about X?"	Broad search across all sources, synthesize
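
The classification above can be sketched as a simple keyword heuristic. The cue phrases and the `classify_query` name are illustrative assumptions, not part of any connector; a production system might use a model-based classifier instead.

```python
import re

# Hypothetical cue patterns per query type, checked in order; first match wins.
QUERY_TYPE_CUES = [
    ("decision", r"\b(decide|decided|decision|agree|agreed)\b"),
    ("status", r"\b(status|progress|update on)\b"),
    ("document", r"\b(where's|where is|spec|doc|document)\b"),
    ("person", r"\b(who|owner)\b"),
    ("factual", r"\b(policy|rule|guideline)\b"),
    ("temporal", r"\bwhen\b"),
]


def classify_query(question: str) -> str:
    """Return the query type for a question, defaulting to 'exploratory'."""
    text = question.lower()
    for query_type, pattern in QUERY_TYPE_CUES:
        if re.search(pattern, text):
            return query_type
    return "exploratory"
```

Falling back to "exploratory" mirrors the table: when no specific intent is detected, the broad search across all sources is the safest default.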

Step 2: Extract Search Components

From the query, extract:

Keywords: Core terms that must appear in results
Entities: People, projects, teams, tools (use memory system if available)
Intent signals: Decision words, status words, temporal markers
Constraints: Time ranges, source hints, author filters
Negations: Things to exclude
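
A minimal sketch of this extraction step, assuming filters are written as `key:value` and negations as `-term`. The stopword and intent-word lists, the `SearchComponents` type, and `extract_components` are hypothetical names chosen for illustration. Entity resolution is left out because it depends on a memory system or directory lookup.

```python
import re
from dataclasses import dataclass, field

# Illustrative word lists; a real deployment would tune these.
STOPWORDS = {"what", "did", "we", "about", "the", "a", "an", "of", "is", "our", "on", "for"}
INTENT_WORDS = {"decide", "decided", "decision", "status", "when"}
FILTER_RE = re.compile(r"\b(from|in|after|before):(\S+)")
NEGATION_RE = re.compile(r"(?<!\S)-(\w+)")  # "-term" at the start of a token


@dataclass
class SearchComponents:
    keywords: list = field(default_factory=list)
    intent_signals: list = field(default_factory=list)
    constraints: dict = field(default_factory=dict)
    negations: list = field(default_factory=list)


def extract_components(query: str) -> SearchComponents:
    parts = SearchComponents()
    parts.constraints = dict(FILTER_RE.findall(query))
    parts.negations = NEGATION_RE.findall(query)
    # Strip filters and negations before tokenizing the remaining free text.
    free_text = NEGATION_RE.sub(" ", FILTER_RE.sub(" ", query))
    for token in re.findall(r"[A-Za-z0-9]+", free_text.lower()):
        if token in INTENT_WORDS:
            parts.intent_signals.append(token)
        elif token not in STOPWORDS:
            parts.keywords.append(token)
    return parts
```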

Step 3: Generate Sub-Queries Per Source

For each available source, create one or more targeted queries:

Prefer semantic search for:

Conceptual questions ("What do we think about...")
Questions where exact keywords are unknown
Exploratory queries

Prefer keyword search for:

Known terms, project names, acronyms
Exact phrases the user quoted
Filter-heavy queries (from:, in:, after:)
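
One way to approximate this choice in code is to treat quoted phrases and filter operators as signals for keyword search and default to semantic search otherwise. This is only a sketch: `choose_search_mode` is a hypothetical helper, and detecting known terms, project names, or acronyms would additionally need a glossary that is not modeled here.

```python
import re

FILTER_OPERATOR = re.compile(r"\b(from|in|after|before):\S+")


def choose_search_mode(query: str) -> str:
    """Pick 'keyword' for quoted or filter-heavy queries, else 'semantic'."""
    if '"' in query or FILTER_OPERATOR.search(query):
        return "keyword"
    return "semantic"
```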

Generate multiple query variants when the topic might be referred to differently:

User: "Kubernetes setup"
Queries: "Kubernetes", "k8s", "cluster", "container orchestration"
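
The variant expansion shown above could be driven by an alias table. The table contents and the `query_variants` function are assumptions for illustration; in practice the aliases might come from a glossary or the memory system.

```python
# Hypothetical alias table keyed by lowercase term.
ALIASES = {
    "kubernetes": ["k8s", "cluster", "container orchestration"],
}


def query_variants(topic: str) -> list[str]:
    """Return each known term in the topic followed by its aliases."""
    variants: list[str] = []
    for word in topic.split():
        aliases = ALIASES.get(word.lower())
        if aliases is not None:
            variants.append(word)
            variants.extend(aliases)
    # Fall back to the topic itself when no alias is known.
    return variants or [topic]
```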

Source-Specific Query Translation

~~chat

Semantic search (natural language questions):

query: "What is the status of project aurora?"

Keyword search:

query: "project aurora status update"
query: "aurora in:#engineering after:2025-01-15"
query: "from:<@UserID> aurora"

Filter mapping:

Enterprise filter	~~chat syntax
`from:sarah`	`from:sarah` or `from:<@USERID>`
`in:engineering`	`in:engineering`
`after:2025-01-01`	`after:2025-01-01`
`before:2025-02-01`	`before:2025-02-01`
`type:thread`	`is:thread`
`type:file`	`has:file`
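
The filter mapping above is mostly a pass-through; only the `type:` filters change form. A small sketch of that translation follows, where `to_chat_query` is a hypothetical helper name and resolving a display name such as `sarah` to a user ID is out of scope.

```python
# Enterprise filters whose ~~chat form differs, per the mapping table.
TYPE_FILTERS = {"type:thread": "is:thread", "type:file": "has:file"}


def to_chat_query(enterprise_query: str) -> str:
    """Rewrite type: filters; from:, in:, after:, before: pass through unchanged."""
    tokens = enterprise_query.split()
    return " ".join(TYPE_FILTERS.get(token, token) for token in tokens)
```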

~~knowledge base (Wiki)

Semantic search — Use for conceptual queries:

descriptive_query: "API migration timeline and decision rationale"

Keyword search — Use for exact terms:

query: "API migration"
query: "\"API migration timeline\""  (exact phrase)

~~project tracker

Task search:

text: "API migration"
workspace: [workspace_id]
completed: false  (for status queries)
assignee_any: "me"  (for "my tasks" queries)

Filter mapping:

Enterprise filter	~~project tracker parameter
`from:sarah`	`assignee_any` or `created_by_any`
`after:2025-01-01`	`modified_on_after: "2025-01-01"`
`type:milestone`	`resource_subtype: "milestone"`
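
A sketch of how the filter mapping above might be applied when building a task search request. The `to_tracker_params` function and its `from_field` argument are hypothetical; the table allows `from:` to map to either `assignee_any` or `created_by_any`, so the caller chooses. Real trackers usually expect a user identifier rather than a display name, which this sketch does not resolve.

```python
def to_tracker_params(text, workspace_id, filters, from_field="assignee_any"):
    """Translate enterprise filters into ~~project tracker search parameters."""
    if from_field not in ("assignee_any", "created_by_any"):
        raise ValueError("from_field must be assignee_any or created_by_any")
    params = {"text": text, "workspace": workspace_id}
    for key, value in filters.items():
        if key == "from":
            params[from_field] = value
        elif key == "after":
            params["modified_on_after"] = value
        elif key == "type" and value == "milestone":
            params["resource_subtype"] = "milestone"
    return params
```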

Result Ranking

Relevance Scoring

Factor	Weight (Decision)	Weight (Status)	Weight (Document)	Weight (Factual)
Keyword match	0.3	0.2	0.4	0.3
Freshness	0.3	0.4	0.2	0.1
Authority	0.2	0.1	0.3	0.4
Completeness	0.2	0.3	0.1	0.2
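
The table above can be read as a weighted sum. The sketch below assumes each factor has already been normalized to a score between 0 and 1; how those per-factor scores are computed is not specified here, and `relevance_score` is a hypothetical name. Only the four query types listed in the table are covered.

```python
# Weights copied from the relevance scoring table; each column sums to 1.0.
WEIGHTS = {
    "decision": {"keyword_match": 0.3, "freshness": 0.3, "authority": 0.2, "completeness": 0.2},
    "status": {"keyword_match": 0.2, "freshness": 0.4, "authority": 0.1, "completeness": 0.3},
    "document": {"keyword_match": 0.4, "freshness": 0.2, "authority": 0.3, "completeness": 0.1},
    "factual": {"keyword_match": 0.3, "freshness": 0.1, "authority": 0.4, "completeness": 0.2},
}


def relevance_score(query_type: str, factors: dict) -> float:
    """Combine normalized factor scores (0..1) using the query type's weights."""
    weights = WEIGHTS[query_type]
    return sum(weights[name] * factors[name] for name in weights)
```

For example, a fresh but low-authority chat message scores higher for a status query than for a factual one, because freshness carries 0.4 of the weight in the former and only 0.1 in the latter.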

Authority Hierarchy

Source authority depends on the query type:

For factual/policy questions:

Wiki/Official docs > Shared documents > Email announcements > Chat messages

For "what happened" / decision questions:

Meeting notes > Thread conclusions > Email confirmations > Chat messages

For status questions:

Task tracker > Recent chat > Status docs > Email updates
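
These orderings can be turned into a numeric authority factor. The linear mapping from rank to score below is an assumption made for illustration, as are the source-kind labels and the `authority_score` name; the hierarchies themselves come from the lists above.

```python
# Source kinds ordered from most to least authoritative, per query type.
AUTHORITY_ORDER = {
    "factual": ["wiki", "shared_doc", "email", "chat"],
    "decision": ["meeting_notes", "thread_conclusion", "email", "chat"],
    "status": ["task_tracker", "chat", "status_doc", "email"],
}


def authority_score(query_type: str, source_kind: str) -> float:
    """Map a source's rank to a score in (0, 1]; unknown sources get 0.0."""
    order = AUTHORITY_ORDER[query_type]
    if source_kind not in order:
        return 0.0
    rank = order.index(source_kind)
    return 1.0 - rank / len(order)
```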

Handling Ambiguity

When a query is ambiguous, prefer asking one focused clarifying question over guessing:

Ambiguous: "search for the migration"
→ "I found references to a few migrations. Are you looking for:
   1. The database migration (Project Phoenix)
   2. The cloud migration (AWS → GCP)
   3. The email migration (Exchange → O365)"

Only ask for clarification when:

There are genuinely distinct interpretations that would produce very different results
The ambiguity would significantly affect which sources to search

Do NOT ask for clarification when:

The query is clear enough to produce useful results
Minor ambiguity can be resolved by returning results from multiple interpretations

Fallback Strategies

When a source is unavailable or returns no results:

Source unavailable: Skip it, search remaining sources, note the gap
No results from a source: Try broader query terms, remove date filters, try alternate keywords
All sources return nothing: Suggest query modifications to the user
Rate limited: Note the limitation, return results from other sources, suggest retrying later

Query Broadening

If initial queries return too few results:

Original: "PostgreSQL migration Q2 timeline decision"
Broader:  "PostgreSQL migration"
Broader:  "database migration"
Broadest: "migration"

Remove constraints in this order:

Date filters (search all time)
Source/location filters
Less important keywords
Keep only core entity/topic terms
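
The removal order above can be sketched as a function that returns progressively broader attempts. It assumes keywords are supplied most-important first and filters as a dict; `broaden` and the filter groupings are hypothetical. Note that it only drops terms and does not generalize them (for example, swapping "PostgreSQL" for "database"), which would need a synonym source.

```python
DATE_FILTERS = ("after", "before")
LOCATION_FILTERS = ("in",)


def broaden(keywords: list[str], filters: dict[str, str]) -> list[tuple[list[str], dict[str, str]]]:
    """Return broader (keywords, filters) attempts, in the order they should be tried."""
    steps = []
    remaining = dict(filters)
    # Relax date filters first, then source/location filters.
    for group in (DATE_FILTERS, LOCATION_FILTERS):
        if any(key in remaining for key in group):
            remaining = {k: v for k, v in remaining.items() if k not in group}
            steps.append((list(keywords), dict(remaining)))
    # Then drop the least important keyword each round, keeping one core term.
    kept = list(keywords)
    while len(kept) > 1:
        kept = kept[:-1]
        steps.append((list(kept), dict(remaining)))
    return steps
```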

Parallel Execution

Always execute searches across sources in parallel, never sequentially. The total search time should be roughly equal to the slowest single source, not the sum of all sources.

[User query]
     ↓ decompose
[~~chat query] [~~email query] [~~cloud storage query] [Wiki query] [~~project tracker query]
     ↓            ↓            ↓              ↓            ↓
  (parallel execution)
     ↓
[Merge + Rank + Deduplicate]
     ↓
[Synthesized answer]
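
A minimal sketch of this flow using Python's `asyncio.gather`. The `fake_search` coroutine is a stand-in for real connector calls, and the result shape (`id`, `source`, `score`) is assumed for illustration. Passing `return_exceptions=True` lets an unavailable source be skipped and noted as a gap instead of failing the whole search.

```python
import asyncio


async def fake_search(source: str, query: str) -> list[dict]:
    """Hypothetical connector call; 'email' simulates an unavailable source."""
    await asyncio.sleep(0.01)
    if source == "email":
        raise ConnectionError("source unavailable")
    return [
        {"id": "doc-1", "source": source, "score": 0.5 if source == "chat" else 0.9},
        {"id": f"{source}-only", "source": source, "score": 0.4},
    ]


async def search_all(query: str, sources: list[str]) -> tuple[list[dict], list[str]]:
    # All searches run concurrently, so total time tracks the slowest source.
    outcomes = await asyncio.gather(
        *(fake_search(source, query) for source in sources),
        return_exceptions=True,
    )
    best: dict[str, dict] = {}
    gaps: list[str] = []
    for source, outcome in zip(sources, outcomes):
        if isinstance(outcome, Exception):
            gaps.append(source)  # note the gap, keep the other results
            continue
        for hit in outcome:
            # Deduplicate by id, keeping the highest-scoring copy.
            if hit["id"] not in best or hit["score"] > best[hit["id"]]["score"]:
                best[hit["id"]] = hit
    ranked = sorted(best.values(), key=lambda hit: hit["score"], reverse=True)
    return ranked, gaps
```

Calling `asyncio.run(search_all("API migration", ["chat", "wiki", "email"]))` returns the merged, ranked hits together with the list of sources that could not be searched.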
