qmd-search

qmd Search

Search a local markdown knowledge base semantically with `qmd`. Five modes, all running on-device: BM25 keywords, vector similarity, hybrid (expansion + rerank), literal native-script grep, and a fused `find`. The key advantage over Obsidian's built-in search: it matches meaning, finds notes that share no words with the query, and works across languages (e.g. a Russian query retrieves English notes).

When to use which mode

  • hybrid (`query`): the default. A real question or fuzzy intent ("how do I stop overengineering"). Best quality; the first run downloads the reranker/expansion models, which is slow but happens only once.
  • vector (`vsearch`): fast concept lookup ("notes about embodied computing").
  • BM25 (`search`): an exact keyword, name, or filename. Instant, no model.
  • grep (`-m grep`): literal fixed-string ripgrep over the .md files. The audit path for proper nouns, transliterations, exact phrases, Russian stems/inflections, and absence checks. Bypasses the index; matches only the exact script/spelling you type.
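The list above can be condensed into a rough picker. This is a minimal sketch, not part of the skill: `pick_mode` and its optional `fast` argument are hypothetical, and the heuristics (non-ASCII text goes to grep, a single token to BM25, an explicit request for speed to vector, everything else to hybrid) are assumptions about how the guidance might be applied.

```shell
# Hypothetical helper: map the shape of a query to one of the wrapper's -m modes.
# The heuristics are illustrative assumptions, not rules taken from qmd itself.
pick_mode() {
  pm_query=$1
  pm_speed=${2:-}
  # Any byte outside printable ASCII (e.g. Cyrillic) suggests a literal native-script pass.
  if printf '%s' "$pm_query" | LC_ALL=C grep -q '[^ -~]'; then
    echo grep
    return 0
  fi
  # Count whitespace-separated words; globbing is disabled so "?" or "*" stay literal.
  set -f
  set -- $pm_query
  set +f
  if [ "$#" -le 1 ]; then
    echo search      # exact keyword, name, or filename
  elif [ "$pm_speed" = "fast" ]; then
    echo vsearch     # quick concept lookup
  else
    echo query       # real question or fuzzy intent (hybrid, the default)
  fi
}
```

For example, `qmd-search.sh -m "$(pick_mode sensorium)" sensorium` would run a BM25 search. Treat the result as a starting point; the bilingual rule still calls for several passes on names.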

Bilingual / proper-name rule (do not skip)

This vault is bilingual (English/Russian). The embedding model is decent for concepts but weak for proper nouns / specific entities, and BM25 only matches the script you type. So:
Never conclude "it's not in the vault" after one English semantic query. For names, people, pets, places, foreign terms, or bilingual topics:
  1. Search semantically first (`query` / `vsearch`).
  2. Generate likely native-script spellings/stems and try them, e.g. `Ziggy → Зигги/Зиги`, `dog/pet → собак, пёс, щенок, питомц, животн`. Use stems (`собак` catches `собака/собаку/собаки`), not just the nominative.
  3. Run a literal pass before concluding absence: `qmd-search.sh -m grep -n 20 "Зигги"`.
  4. Use literal hits to disambiguate close names (e.g. `Зигги` the pet vs. `Зигмунд` Freud).
  5. If everything fails, say "I didn't find it with these queries: …" and list the terms tried, not "it's not in the vault." Raise `-n` to ~20 for absence checks.
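Steps 2, 3, and 5 lend themselves to a small loop. The following is a sketch under stated assumptions: `literal_pass` is a hypothetical helper, `QMD_SEARCH` is an invented override hook that defaults to the bundled wrapper, and the only wrapper flags used are the `-m grep -n 20` shown in step 3.

```shell
# Hypothetical helper: run the literal (grep) pass for every candidate spelling or stem,
# then print either the hits or the exact list of terms that were tried.
# QMD_SEARCH is an assumed override hook; it defaults to the bundled wrapper.
QMD_SEARCH=${QMD_SEARCH:-qmd-search.sh}

literal_pass() {
  lp_tried=""
  lp_status=1                      # 1 = nothing found yet
  for lp_term in "$@"; do
    lp_tried="${lp_tried:+$lp_tried, }$lp_term"
    # Ignore the exit code: only the presence of output is treated as a hit here.
    lp_hits=$("$QMD_SEARCH" -m grep -n 20 "$lp_term" 2>/dev/null) || true
    if [ -n "$lp_hits" ]; then
      printf '== %s ==\n%s\n' "$lp_term" "$lp_hits"
      lp_status=0
    fi
  done
  if [ "$lp_status" -ne 0 ]; then
    printf "I didn't find it with these queries: %s\n" "$lp_tried"
  fi
  return "$lp_status"
}
```

A call such as `literal_pass "Ziggy" "Зигги" "Зиги" "собак" "пёс"` covers the transliterations and stems in one go, and the failure message already lists everything that was tried.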

Primary usage — the wrapper

Use the bundled wrapper; it suppresses qmd's stderr spinner, formats results as `score  path` (parsing qmd's JSON, so commas in filenames are safe), and makes a best-effort refusal to run during an active `qmd embed` (which would return empty results; override with `--force`):
```bash
~/.claude/skills/qmd-search/scripts/qmd-search.sh [-m query|search|vsearch|grep|find] [-n N] [-c COLLECTION] [--snippet] [--min-score X] [--json] [--full] <query...>
```
Examples:
```bash
qmd-search.sh "what helps with anxiety"                 # hybrid (default)
qmd-search.sh -m vsearch -n 8 "behavioral health from photos"
qmd-search.sh -m search sensorium                       # BM25 keyword
qmd-search.sh -m grep -n 20 "Зигги"                     # literal native-spelling / absence check
qmd-search.sh -m find "Зигги собака"                    # fused: semantic + literal in one call
qmd-search.sh --snippet "agent orchestration"           # rows + matching snippets
qmd-search.sh --min-score 0.5 "quarterly planning"      # drop low-relevance hits
qmd-search.sh --json "agent orchestration"              # structured output for further processing
```
After getting hits, read the top files directly (they are normal vault paths) or fetch slices with `qmd get "<path>:<line>" -l <N>`.
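Since the wrapper prints plain `score  path` rows, picking out the files to read takes very little code. A minimal sketch, assuming each row is a numeric score, some whitespace, then the path (which may itself contain spaces); `top_paths` is a hypothetical helper and the sample rows in the comment are invented.

```shell
# Hypothetical helper: read "score  path" rows on stdin, keep rows whose score is at
# least the given minimum (default 0), and print only the paths.
top_paths() {
  awk -v min="${1:-0}" '
    $1 + 0 >= min + 0 {
      sub(/^[ \t]*[^ \t]+[ \t]+/, "")   # drop the score column, keep the path verbatim
      print
    }'
}

# Example with made-up rows:
#   printf "0.82  notes/agent orchestration.md\n0.41  daily/log.md\n" | top_paths 0.5
```

In practice the input would be piped from the wrapper, e.g. `qmd-search.sh "agent orchestration" | top_paths 0.5`. The wrapper's own `--min-score` does the filtering natively; the threshold here only shows the row format at work.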

Setup / indexing (only if `qmd status` shows the vault is not indexed)

```bash
qmd collection add ~/Brains/brain --name brain        # index the vault
qmd context add qmd://brain "short description of the vault"
qmd embed                                              # build vectors; re-run until status shows 0 pending
qmd cleanup                                            # compact the index
```
Refresh after large edits: `qmd update && qmd embed`. Check health any time with `qmd status`.
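The "re-run until status shows 0 pending" step can be wrapped in a bounded retry loop. This sketch deliberately leaves out how to read the pending count, because the format of `qmd status` output is not described here: `run_until` is a hypothetical helper, and the `embed_done` check in the usage comment is a placeholder you would write yourself after inspecting `qmd status`.

```shell
# Hypothetical helper: repeat an action until a check succeeds or attempts run out.
# Usage: run_until <max_attempts> <check_cmd> <action_cmd>
run_until() {
  ru_max=$1 ru_check=$2 ru_action=$3
  ru_i=0
  while [ "$ru_i" -lt "$ru_max" ]; do
    if "$ru_check"; then
      return 0                 # already done, nothing more to run
    fi
    "$ru_action" || return 1   # stop early if the action itself fails
    ru_i=$((ru_i + 1))
  done
  "$ru_check"                  # final verdict after the last attempt
}

# Usage sketch (embed_done is a placeholder, not a qmd feature):
#   embed_done() { :; }        # replace ":" with your own test on `qmd status` output
#   do_embed()   { qmd embed; }
#   run_until 5 embed_done do_embed && qmd cleanup
```

Capping the attempts matters because, as the operational rules note, an embed that never reaches 0 pending usually points at a different problem, such as a full disk.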

Operational rules (do not skip)

  • One embed at a time, and never search while embedding: both cause empty/garbage results. The wrapper guards searches; for manual `qmd` calls, check `qmd status` first.
  • If embedding never reaches 0 pending, check disk space (`df -h`): a full disk fails writes silently. See `references/cli-reference.md` → "Operational gotchas".
  • Vector scores are modest (~0.4–0.6); judge by ranking, not the absolute number.
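The disk-space check can be run as a guard before embedding. A minimal sketch: `has_free_space` is a hypothetical helper, the 1 GiB default threshold is an arbitrary assumption, and the path to test is whichever filesystem actually holds the index (not specified here).

```shell
# Hypothetical helper: succeed only if the filesystem holding a path has at least
# min_kb kilobytes available. Relies on POSIX `df -Pk`, whose 4th column is free space.
has_free_space() {
  hf_path=${1:-.}
  hf_min_kb=${2:-1048576}    # assumed threshold: 1 GiB, expressed in KiB
  hf_avail=$(df -Pk "$hf_path" | awk 'NR == 2 { print $4 }')
  [ -n "$hf_avail" ] && [ "$hf_avail" -ge "$hf_min_kb" ]
}
```

Used as `has_free_space "$HOME" || echo "low disk space: embed writes may fail silently" >&2`, it turns a silent failure into a visible warning before a long embed run starts.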

MCP (native tools) vs. the CLI wrapper

qmd ships an MCP server (`qmd mcp`, stdio) exposing tools `query`, `get`, `multi_get`, `status`. If it's registered in the host (e.g. `.mcp.json`), prefer the native `query` tool for hybrid search: it returns structured results with no spinner/JSON-parsing/exit-code quirks. Register with:
```json
{ "mcpServers": { "qmd": { "command": "qmd", "args": ["mcp"] } } }
```
Use the wrapper (`scripts/qmd-search.sh`) when you need what MCP doesn't cover: BM25-only (`search`), vector-only (`vsearch`), the literal/native-script `grep` pass, the fused `find` mode, `--snippet`, or `--min-score`. The bilingual/proper-name rule above applies to both paths.

Quality / evals

`evals/fixture.example.json` + `scripts/run-evals.sh` run `qmd bench` to score search quality (precision/recall/MRR per backend). Baseline and interpretation: `evals/BASELINE.md`. Re-run after changing the wrapper, the index, or the embedding model; a drop vs. baseline is a regression.
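The "drop vs. baseline" rule can be made mechanical once you have two numbers to compare. This sketch assumes you copy the metric values by hand, since the output format of `scripts/run-evals.sh` and `qmd bench` is not shown here; `is_regression` and its tolerance argument are hypothetical.

```shell
# Hypothetical helper: exit 0 (regression) when the current score is below the baseline
# by more than a tolerance, otherwise exit 1.
# Usage: is_regression <baseline> <current> [tolerance]
is_regression() {
  awk -v base="$1" -v cur="$2" -v tol="${3:-0}" \
    'BEGIN { exit !((base + 0) - (cur + 0) > (tol + 0)) }'
}

# Example with placeholder numbers (not real benchmark results):
#   is_regression 0.70 0.65 0.01 && echo "MRR regressed vs. baseline"
```

A small tolerance avoids flagging run-to-run noise as a regression; whether any noise exists for a given backend is something to confirm against `evals/BASELINE.md`.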

Reference

Full command surface, query grammar (`lex:` / `vec:` / `hyde:`), output formats, models, and recovery steps are in `references/cli-reference.md`.