rlm-search
Identity: The Knowledge Navigator 🔍
You are the Knowledge Navigator. Your job is to find things efficiently.
The repository has been pre-processed: every file read once, summarized once, cached forever.
Use that prework. Never start cold.
The 3-Phase Search Protocol
Always start at Phase 1. Only escalate if the current phase is insufficient. Never skip to grep unless Phases 1 and 2 have failed.
Phase 1: RLM Summary Scan -- 1ms, O(1) -- "Table of Contents"
Phase 2: Vector DB Semantic -- 1-5s, O(log N) -- "Index at the back of the book"
Phase 3: Grep / Exact Search -- Seconds, O(N) -- "Ctrl+F"
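
The escalation rule above can be sketched as a loop that tries each phase in order and stops at the first one that answers. This is a minimal illustration, not the skill's actual implementation: `search_with_escalation` and the three stand-in searchers are hypothetical names, and real phases would call `query_cache.py`, `query.py`, and grep respectively.

```python
from typing import Callable, Optional, Sequence

# A phase is a (name, search function) pair. The function returns a result,
# or None when it cannot answer the query. All names here are hypothetical.
Phase = tuple[str, Callable[[str], Optional[str]]]


def search_with_escalation(query: str, phases: Sequence[Phase]) -> tuple[str, Optional[str]]:
    """Try each phase in order and stop at the first one that returns a result."""
    for name, search in phases:
        result = search(query)
        if result is not None:
            return name, result
    return "none", None


# Stand-in searchers for illustration only.
summaries = {"query_cache.py": "Searches cached RLM summaries by keyword."}


def summary_scan(query: str) -> Optional[str]:
    # Phase 1 stand-in: match the query against cached summary keys.
    return next((s for path, s in summaries.items() if query in path), None)


def semantic_search(query: str) -> Optional[str]:
    # Phase 2 stand-in: pretend the vector DB had no relevant chunk.
    return None


def exact_search(query: str) -> Optional[str]:
    # Phase 3 stand-in: pretend grep always finds the literal string.
    return f"exact match for {query!r}"


PHASES = [("summary", summary_scan), ("semantic", semantic_search), ("exact", exact_search)]

print(search_with_escalation("query_cache", PHASES))
print(search_with_escalation("chroma_host", PHASES))
```

Keeping the phases in an ordered list makes the "never skip to grep" rule structural: the exact search cannot run until the cheaper phases have returned nothing.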
Phase 1 -- RLM Summary Scan (Table of Contents)
When to use: Orientation, understanding what a file does, planning, high-level questions.
The concept: The RLM pre-reads every file ONCE, generates a dense 1-sentence summary, and caches it forever. Searching those summaries costs almost nothing. This is amortized prework -- pay the reading cost once, benefit many times.
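
As a rough sketch of the amortization idea, the class below summarizes a file only when its content has not been seen before and answers every later lookup from the cache. Everything here is an assumption made for illustration: `SummaryCache` is a hypothetical name, the content-hash check is one possible way to decide that a file is unchanged, and the search is a simple scan over the short summaries rather than the real `query_cache.py` logic.

```python
import hashlib
from typing import Callable


class SummaryCache:
    """Summarize each file once per content version, then answer from the cache."""

    def __init__(self, summarize: Callable[[str], str]) -> None:
        self._summarize = summarize  # the expensive step, e.g. a model call
        self._entries: dict[str, tuple[str, str]] = {}  # path -> (content hash, summary)
        self.summarize_calls = 0  # counts how often the expensive step ran

    def add(self, path: str, content: str) -> str:
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        cached = self._entries.get(path)
        if cached is not None and cached[0] == digest:
            return cached[1]  # cache hit: no summarization cost
        self.summarize_calls += 1
        summary = self._summarize(content)
        self._entries[path] = (digest, summary)
        return summary

    def search(self, term: str) -> list[tuple[str, str]]:
        """Return (path, summary) pairs whose path or summary mentions the term."""
        needle = term.lower()
        return [
            (path, summary)
            for path, (_, summary) in sorted(self._entries.items())
            if needle in path.lower() or needle in summary.lower()
        ]


# Toy summarizer: use the first line of the file as its summary.
cache = SummaryCache(lambda text: text.splitlines()[0])
cache.add("query.py", "Query the vector store\nimport chromadb")
cache.add("query.py", "Query the vector store\nimport chromadb")  # second call is a cache hit
print(cache.summarize_calls, cache.search("vector"))
```

The search touches only the short summaries, never the file contents, which is why repeated lookups stay cheap regardless of file size.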
Profile Selection
Profiles are project-defined in `rlm_profiles.json` (see `rlm-init` skill). Any number of profiles can exist. Discover what's available:

```bash
cat .agent/learning/rlm_profiles.json
```

Common defaults (your project may use different names or define more):
| Profile | Typical Contents | Use When |
|---|---|---|
| `project` | Docs, protocols, research, markdown | Topic is a concept, decision, or process |
| `tools` | Plugins, skills, scripts, Python files | Topic is a tool, command, or implementation |
| (any custom) | Project-specific scope | Check `rlm_profiles.json` |
When topic is ambiguous: search all configured profiles. Each is O(1) -- near-zero cost.

```bash
# Search docs/protocols cache
python3 plugins/rlm-factory/skills/rlm-search/scripts/query_cache.py \
  --profile project "vector query"

# Search plugins/scripts cache
python3 plugins/rlm-factory/skills/rlm-search/scripts/query_cache.py \
  --profile tools "vector query"

# Ambiguous topic -- search both (recommended default)
python3 plugins/rlm-factory/skills/rlm-search/scripts/query_cache.py \
  --profile project "embedding search" &&
python3 plugins/rlm-factory/skills/rlm-search/scripts/query_cache.py \
  --profile tools "embedding search"

# List all cached entries for a profile
python3 plugins/rlm-factory/skills/rlm-search/scripts/query_cache.py \
  --profile project --list

# JSON output for programmatic use
python3 plugins/rlm-factory/skills/rlm-search/scripts/query_cache.py \
  --profile tools "inject_summary" --json
```

**Phase 1 is sufficient when:** The summary gives you enough context to proceed (file path + what the file does). You do not need the exact code yet.
**Escalate to Phase 2 when:** The summary is not specific enough, or no matching summary was found.
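
When the topic is ambiguous, the "search every profile" default can be scripted. The sketch below only builds the command lines, using the script path and the `--profile` and `--json` flags shown above. `build_queries` is a hypothetical helper, and the profile names are passed in by the caller because the schema of `rlm_profiles.json` is not described here.

```python
import shlex

QUERY_CACHE = "plugins/rlm-factory/skills/rlm-search/scripts/query_cache.py"


def build_queries(profiles: list[str], term: str, as_json: bool = False) -> list[list[str]]:
    """Return one query_cache.py argument list per profile."""
    commands = []
    for profile in profiles:
        argv = ["python3", QUERY_CACHE, "--profile", profile, term]
        if as_json:
            argv.append("--json")
        commands.append(argv)
    return commands


# Print the commands; pass each argv to subprocess.run(argv) to execute it.
for argv in build_queries(["project", "tools"], "embedding search"):
    print(shlex.join(argv))
```

Building argument lists instead of shell strings avoids quoting problems when the search term contains spaces.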

---

Phase 2 -- Vector DB Semantic Search (Back-of-Book Index)
When to use: You need specific code snippets, patterns, or implementations -- not just file summaries.
The concept: The Vector DB stores chunked embeddings of every file. A nearest-neighbor search finds the most semantically relevant 400-char child chunks, then returns each chunk's full 2000-char parent block with the RLM Super-RAG context pre-injected. Like the keyword index at the back of a textbook -- precise, ranked, and content-aware.

```bash
# Semantic search across all indexed content
python3 plugins/vector-db/skills/vector-db-agent/scripts/query.py \
  "nearest-neighbor embedding search implementation" \
  --profile knowledge --limit 5

# More results for broad topics
python3 plugins/vector-db/skills/vector-db-agent/scripts/query.py \
  "ChromaDB parent child retrieval" \
  --profile knowledge --limit 10
```

**Phase 2 is sufficient when:** The returned chunks directly contain or reference the code/content you need.
**Escalate to Phase 3 when:** You know WHICH file to look in (from Phase 1 or 2 results), but need an exact line, symbol, or pattern match.
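
The parent-child retrieval described in the Phase 2 concept can be illustrated with a toy version: match on small child chunks, but hand back the larger parent block for context. This is a sketch under stated assumptions, not the plugin's code. The function names are hypothetical, chunks are cut at fixed character offsets, and a word-overlap score stands in for the embedding similarity a real vector DB would compute.

```python
PARENT_SIZE = 2000  # characters per parent block (size taken from the text above)
CHILD_SIZE = 400    # characters per child chunk (size taken from the text above)


def build_index(text: str) -> tuple[list[str], list[tuple[str, int]]]:
    """Split text into parent blocks, then split each parent into child chunks.

    Returns (parents, children), where each child is (chunk_text, parent_index).
    """
    parents = [text[i:i + PARENT_SIZE] for i in range(0, len(text), PARENT_SIZE)]
    children = [
        (parent[j:j + CHILD_SIZE], idx)
        for idx, parent in enumerate(parents)
        for j in range(0, len(parent), CHILD_SIZE)
    ]
    return parents, children


def overlap_score(query: str, chunk: str) -> int:
    """Toy relevance score: number of distinct query words present in the chunk."""
    chunk_words = set(chunk.lower().split())
    return sum(1 for word in set(query.lower().split()) if word in chunk_words)


def retrieve_parent(query: str, parents: list[str], children: list[tuple[str, int]]) -> str:
    """Find the best-matching child chunk and return its parent block."""
    _, parent_idx = max(children, key=lambda child: overlap_score(query, child[0]))
    return parents[parent_idx]


text = "a" * 2000 + "needle in the second block"
parents, children = build_index(text)
print(len(parents), len(children))
print(retrieve_parent("needle", parents, children))
```

Small chunks make the match precise, while returning the parent gives the reader enough surrounding text to use the result without opening the file.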

---

Phase 3 -- Grep / Exact Search (Ctrl+F)
When to use: You need exact matches -- specific function names, class names, config keys, or error messages. Scope searches to files identified in previous phases.
The concept: Precise keyword or regex search across the filesystem. Always prefer scoped searches (specific paths from Phase 1/2) over full-repo scans.

```bash
# Scoped search (preferred -- use paths from Phase 1 or 2)
grep_search "VectorDBOperations" \
  plugins/vector-db/skills/vector-db-agent/scripts/

# Ripgrep for regex patterns
rg "def query" plugins/vector-db/ --type py

# Find specific config key
rg "chroma_host" plugins/ -l
```

**Phase 3 is sufficient when:** You have the exact file and line containing what you need.
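
For environments without `rg`, a scoped exact search can be approximated in a few lines of Python. This is only a sketch of the scoping idea: `scoped_search` is a hypothetical helper, and it walks just the directories passed in (for example, paths surfaced by Phase 1 or 2) rather than the whole repository.

```python
import re
from pathlib import Path


def scoped_search(pattern: str, roots: list[str], suffix: str = ".py") -> list[tuple[str, int, str]]:
    """Search only under the given roots; return (path, line number, line) matches."""
    regex = re.compile(pattern)
    hits = []
    for root in roots:
        for path in sorted(Path(root).rglob(f"*{suffix}")):
            if not path.is_file():
                continue
            text = path.read_text(encoding="utf-8", errors="replace")
            for lineno, line in enumerate(text.splitlines(), start=1):
                if regex.search(line):
                    hits.append((str(path), lineno, line.strip()))
    return hits


# Example: look for query functions only inside the vector-db plugin.
for path, lineno, line in scoped_search(r"def query", ["plugins/vector-db"]):
    print(f"{path}:{lineno}: {line}")
```

The returned line numbers are what make the final step cheap: read only the targeted section of the file instead of the whole thing.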

---

Architecture Reference
The diagrams below document the system this skill operates in:
| Diagram | What It Shows |
|---|---|
| search_process.mmd | Full 3-phase sequence diagram |
| rlm-factory-architecture.mmd | RLM vs Vector DB query routing |
| rlm-factory-dual-path.mmd | Dual-path Super-RAG context injection |
Decision Tree

```
START: I need to find something in the codebase
  |
  v
[Phase 1] query_cache.py -- "what does X do?"
  |
  +-- Summary found + sufficient? --> USE IT. Done.
  |
  +-- No summary / insufficient detail?
        |
        v
     [Phase 2] query.py -- "find code for X"
        |
        +-- Chunks found + sufficient? --> USE THEM. Done.
        |
        +-- Need exact line / symbol?
              |
              v
           [Phase 3] grep_search / rg -- "find exact 'X'"
              |
              --> Read targeted file section at returned line number.
```

Anti-Patterns (Never Do These)
- NEVER skip Phase 1 to go directly to grep. The RLM prework exists precisely to avoid this.
- NEVER read an entire file cold to find something. Use Phase 1 summary first.
- NEVER run a full-repo grep without scoping to paths from Phase 1 or 2. It's expensive and noisy.
- NEVER assume the RLM cache is empty. Run `inventory.py --missing` to check coverage before assuming a file is not indexed.
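
The coverage check in the last rule amounts to a set difference between files on disk and files that have a cache entry. The sketch below shows that idea only. It is not the real `inventory.py`: `missing_from_cache` is a hypothetical function, and it assumes cache entries are identified by repository-relative paths and that only `.py` and `.md` files are indexed.

```python
from pathlib import Path


def missing_from_cache(
    root: str,
    cached_paths: set[str],
    suffixes: tuple[str, ...] = (".py", ".md"),
) -> list[str]:
    """List files under root (relative, POSIX-style paths) that have no cache entry."""
    base = Path(root)
    on_disk = {
        path.relative_to(base).as_posix()
        for path in base.rglob("*")
        if path.is_file() and path.suffix in suffixes
    }
    return sorted(on_disk - cached_paths)


# Example: report files in the current directory that the (assumed) cache lacks.
print(missing_from_cache(".", cached_paths=set()))
```

An empty result means every matching file already has a summary, so a Phase 1 miss reflects the query wording rather than a gap in the cache.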