duckdb-docs
You are helping the user find relevant DuckDB or DuckLake documentation.
Query: `$@`
Follow these steps in order.
Step 1 — Check DuckDB is installed
```bash
command -v duckdb
```
If not found, delegate to `/duckdb-skills:install-duckdb` and then continue.
Step 2 — Ensure required extensions are installed
```bash
duckdb :memory: -c "INSTALL httpfs; INSTALL fts;"
```
If this fails, report the error and stop.
Step 3 — Choose the data source and extract search terms
The query is: `$@`
Data source selection
There are two search indexes available:
| Index | Remote URL | Local cache filename | Versions | Use when |
|---|---|---|---|---|
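| DuckDB docs + blog | `https://duckdb.org/data/docs-search.duckdb` | `duckdb-docs.duckdb` | | Default: any DuckDB question |
| DuckLake docs | `https://ducklake.select/data/docs-search.duckdb` | `ducklake-docs.duckdb` | | Query mentions DuckLake, catalogs, or DuckLake-specific features |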
Both indexes share the same schema:
| Column | Type | Description |
|---|---|---|
| `chunk_id` | | e.g. |
| `page_title` | | Page title from front matter |
| `section` | | Section heading (null for page intros) |
| `breadcrumb` | | e.g. |
| `url` | | URL path with anchor |
| `version` | | See table above |
| `text` | | Full markdown of the chunk |
By default, search DuckDB docs and filter to `version = 'lts'`. Use different versions when:
- The user explicitly asks about `current`/nightly features → `version = 'current'`
- The user asks about a blog post or wants background/motivation → `version = 'blog'`
- The user asks about DuckLake → search the DuckLake index with `version = 'stable'`
- When unsure, omit the version filter to search across all versions.
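The selection rules above could be sketched as a small shell function. This is only an illustration: the function name `choose_source` and the keyword matching are assumptions, not part of the skill. The URLs are the ones listed in Step 6 and the cache filenames are the ones named in Step 4.

```shell
# Sketch only: keyword matching is an assumed heuristic for the rules above.
choose_source() {
  # Lowercase the query so matching is case-insensitive.
  query=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')

  # Defaults: DuckDB docs, filtered to version = 'lts'.
  REMOTE_URL="https://duckdb.org/data/docs-search.duckdb"
  CACHE_FILENAME="duckdb-docs.duckdb"
  VERSION="lts"

  case "$query" in
    *ducklake*)
      REMOTE_URL="https://ducklake.select/data/docs-search.duckdb"
      CACHE_FILENAME="ducklake-docs.duckdb"
      VERSION="stable"
      ;;
    *nightly*) VERSION="current" ;;
    *blog*)    VERSION="blog" ;;
  esac
}

choose_source "How does DuckLake handle snapshots?"
echo "$CACHE_FILENAME $VERSION"
```

A real agent would judge intent rather than match substrings; when the intent is unclear, leaving `VERSION` empty and dropping the version filter matches the last rule in the list.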
Search terms
If the input is a natural language question (e.g. "how do I find the most frequent value"), extract the key technical terms (nouns, function names, SQL keywords) to form a compact BM25 query string. Drop stop words like "how", "do", "I", "the".
If the input is already a function name or technical term (e.g. `arg_max`, `GROUP BY ALL`), use it as-is.
Use the extracted terms as `SEARCH_QUERY` in the next step.
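As a rough illustration of the stop-word step, the sketch below strips a small, assumed stop-word list from a question. The helper name `extract_terms` and the exact word list are hypothetical; choosing which terms are technical still needs judgment that a filter like this does not provide.

```shell
# Hypothetical helper: drop common stop words to form a compact BM25 query.
extract_terms() {
  printf '%s\n' "$1" \
    | tr -cs '[:alnum:]_' '\n' \
    | grep -viE '^(how|do|does|i|the|a|an|to|of|in|is)?$' \
    | paste -sd ' ' -
}

# Split into words, filter stop words and empty lines, rejoin with spaces.
SEARCH_QUERY=$(extract_terms "how do I find the most frequent value")
echo "$SEARCH_QUERY"
```

Underscores are kept as word characters so that function names such as `arg_max` pass through unchanged.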
Step 4 — Ensure local cache is fresh
The cache lives at `$HOME/.duckdb/docs/CACHE_FILENAME` (where `CACHE_FILENAME` is `duckdb-docs.duckdb` or `ducklake-docs.duckdb` per Step 3).
First, ensure the directory exists:
```bash
mkdir -p "$HOME/.duckdb/docs"
```
Then check whether the cache file exists and is fresh (≤2 days old):
```bash
CACHE_FILE="$HOME/.duckdb/docs/CACHE_FILENAME"
if [ -f "$CACHE_FILE" ]; then
    # Try GNU stat (Linux) first, then BSD stat (macOS). On Linux, `stat -f`
    # means filesystem status and would print unrelated output into MTIME.
    MTIME=$(stat -c %Y "$CACHE_FILE" 2>/dev/null || stat -f %m "$CACHE_FILE")
    CACHE_AGE_DAYS=$(( ( $(date +%s) - MTIME ) / 86400 ))
else
    CACHE_AGE_DAYS=999  # a missing cache counts as stale
fi
echo "Cache age: $CACHE_AGE_DAYS days"
```
If `CACHE_AGE_DAYS` ≤ 2 → skip to Step 5.
Otherwise (stale or missing) → fetch the index:
```bash
# Remove any leftover temp file from an interrupted fetch.
rm -f "$HOME/.duckdb/docs/CACHE_FILENAME.tmp"
duckdb -c "
LOAD httpfs;
LOAD fts;
ATTACH 'REMOTE_URL' AS remote (READ_ONLY);
ATTACH '$HOME/.duckdb/docs/CACHE_FILENAME.tmp' AS tmp;
COPY FROM DATABASE remote TO tmp;
" && mv "$HOME/.duckdb/docs/CACHE_FILENAME.tmp" "$HOME/.duckdb/docs/CACHE_FILENAME"
```
Replace `REMOTE_URL` and `CACHE_FILENAME` per Step 3. If the fetch fails (network error), report the error and stop.
Step 5 — Search the docs
```bash
duckdb "$HOME/.duckdb/docs/CACHE_FILENAME" -readonly -json -c "
LOAD fts;
SELECT
    chunk_id, page_title, section, breadcrumb, url, version, text,
    fts_main_docs_chunks.match_bm25(chunk_id, 'SEARCH_QUERY') AS score
FROM docs_chunks
WHERE score IS NOT NULL
  AND version = 'VERSION'
ORDER BY score DESC
LIMIT 8;
"
```
Replace `CACHE_FILENAME`, `SEARCH_QUERY`, and `VERSION` per Step 3. Remove the `AND version = 'VERSION'` line if searching across all versions.
If the user's question could benefit from both DuckDB docs and blog results, run two queries (one with `version = 'stable'`, one with `version = 'blog'`) or omit the version filter entirely.
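Because `SEARCH_QUERY` is pasted into a single-quoted SQL string, a query containing an apostrophe would break the statement. The sketch below doubles single quotes before substitution; the helper name `sql_escape` is hypothetical and the step is a suggestion, not something the skill prescribes.

```shell
# Hypothetical helper: double single quotes so the value is safe
# inside a single-quoted SQL string literal.
sql_escape() {
  printf '%s' "$1" | sed "s/'/''/g"
}

SEARCH_QUERY="user's arg_max"
ESCAPED=$(sql_escape "$SEARCH_QUERY")
# Show the fragment as it would appear in the Step 5 query.
echo "match_bm25(chunk_id, '$ESCAPED')"
```

The escaped value can then be used in place of the `SEARCH_QUERY` placeholder in the query above.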
Step 6 — Handle errors
- Extension not installed (`httpfs` or `fts` not found): run `duckdb :memory: -c "INSTALL httpfs; INSTALL fts;"` and retry.
- ATTACH fails / network unreachable: inform the user that the docs index is unavailable and suggest checking their internet connection. The DuckDB index is hosted at `https://duckdb.org/data/docs-search.duckdb` and the DuckLake index at `https://ducklake.select/data/docs-search.duckdb`.
- No results (all scores NULL or empty result set): broaden the query by dropping the least specific term or by trying a single-word version, then retry Step 5. If there are still no results, tell the user no matching documentation was found and suggest visiting https://duckdb.org/docs or https://ducklake.select/docs directly.
Step 7 — Present results
For each result chunk returned (ordered by score descending), format as:
```
{section} — {page_title}
{url}
{text}
```
After presenting all chunks, synthesize a concise answer to the user's original question (`$@`) based on the retrieved documentation. If the chunks directly answer the question, lead with the answer before showing the sources.