You are helping the user find relevant DuckDB or DuckLake documentation.
Follow these steps in order.
Step 1 — Check DuckDB is installed
If not found, delegate to
/duckdb-skills:install-duckdb
and then continue.
Step 2 — Ensure required extensions are installed
bash
duckdb :memory: -c "INSTALL httpfs; INSTALL fts;"
If this fails, report the error and stop.
Step 3 — Choose the data source and extract search terms
Data source selection
There are two search indexes available:
| Index | Remote URL | Local cache filename | Versions | Use when |
|---|
| DuckDB docs + blog | https://duckdb.org/data/docs-search.duckdb
| | , , | Default — any DuckDB question |
| DuckLake docs | https://ducklake.select/data/docs-search.duckdb
| | , | Query mentions DuckLake, catalogs, or DuckLake-specific features |
Both indexes share the same schema:
| Column | Type | Description |
|---|
| (PK) | e.g. stable/sql/functions/numeric#absx
|
| | Page title from front matter |
| | Section heading (null for page intros) |
| | e.g. SQL > Functions > Numeric
|
| | URL path with anchor |
| | See table above |
| | Full markdown of the chunk |
By default, search
DuckDB docs and filter to
. Use different versions when:
- The user explicitly asks about /nightly features →
- The user asks about a blog post or wants background/motivation →
- The user asks about DuckLake → search the DuckLake index with
- When unsure, omit the version filter to search across all versions.
Search terms
If the input is a natural language question (e.g. "how do I find the most frequent value"), extract the key technical terms (nouns, function names, SQL keywords) to form a compact BM25 query string. Drop stop words like "how", "do", "I", "the".
If the input is already a
function name or technical term (e.g.
,
), use it as-is.
Use the extracted terms as
in the next step.
Step 4 — Ensure local cache is fresh
The cache lives at
$HOME/.duckdb/docs/CACHE_FILENAME
(where
is
or
per Step 3).
First, ensure the directory exists:
bash
mkdir -p "$HOME/.duckdb/docs"
Then check whether the cache file exists and is fresh (≤2 days old):
bash
CACHE_FILE="$HOME/.duckdb/docs/CACHE_FILENAME"
if [ -f "$CACHE_FILE" ]; then
MTIME=$(stat -f %m "$CACHE_FILE" 2>/dev/null || stat -c %Y "$CACHE_FILE")
CACHE_AGE_DAYS=$(( ( $(date +%s) - MTIME ) / 86400 ))
else
CACHE_AGE_DAYS=999
fi
echo "Cache age: $CACHE_AGE_DAYS days"
Otherwise (stale or missing) → fetch the index:
bash
duckdb -c "
LOAD httpfs;
LOAD fts;
ATTACH 'REMOTE_URL' AS remote (READ_ONLY);
ATTACH '$HOME/.duckdb/docs/CACHE_FILENAME.tmp' AS tmp;
COPY FROM DATABASE remote TO tmp;
" && mv "$HOME/.duckdb/docs/CACHE_FILENAME.tmp" "$HOME/.duckdb/docs/CACHE_FILENAME"
Replace
and
per Step 3. If the fetch fails (network error), report the error and stop.
Step 5 — Search the docs
bash
duckdb "$HOME/.duckdb/docs/CACHE_FILENAME" -readonly -json -c "
LOAD fts;
SELECT
chunk_id, page_title, section, breadcrumb, url, version, text,
fts_main_docs_chunks.match_bm25(chunk_id, 'SEARCH_QUERY') AS score
FROM docs_chunks
WHERE score IS NOT NULL
AND version = 'VERSION'
ORDER BY score DESC
LIMIT 8;
"
Replace
,
, and
per Step 3. Remove the
line if searching across all versions.
If the user's question could benefit from both DuckDB docs and blog results, run two queries (one with
, one with
) or omit the version filter entirely.
Step 6 — Handle errors
- Extension not installed ( or not found): run
duckdb :memory: -c "INSTALL httpfs; INSTALL fts;"
and retry.
- ATTACH fails / network unreachable: inform the user that the docs index is unavailable and suggest checking their internet connection. The DuckDB index is hosted at
https://duckdb.org/data/docs-search.duckdb
and the DuckLake index at https://ducklake.select/data/docs-search.duckdb
.
- No results (all scores NULL or empty result set): try broadening the query — drop the least specific term, or try a single-word version of the query — then retry Step 5. If still no results, tell the user no matching documentation was found and suggest visiting https://duckdb.org/docs or https://ducklake.select/docs directly.
Step 7 — Present results
For each result chunk returned (ordered by score descending), format as:
### {section} — {page_title}
{url}
{text}
---
After presenting all chunks, synthesize a concise answer to the user's original question (
) based on the retrieved documentation. If the chunks directly answer the question, lead with the answer before showing the sources.