postgres-semantic-search
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChinesePostgreSQL Semantic Search
PostgreSQL 语义搜索
Quick Start
快速开始
1. Setup
1. 配置
sql
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE documents (
id SERIAL PRIMARY KEY,
content TEXT NOT NULL,
embedding vector(1536) -- text-embedding-3-small
-- Or: embedding halfvec(3072) -- text-embedding-3-large (50% memory)
);sql
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE documents (
id SERIAL PRIMARY KEY,
content TEXT NOT NULL,
embedding vector(1536) -- text-embedding-3-small
-- Or: embedding halfvec(3072) -- text-embedding-3-large (50% memory)
);2. Basic Semantic Search
2. 基础语义搜索
sql
SELECT id, content, 1 - (embedding <=> query_vec) AS similarity
FROM documents
ORDER BY embedding <=> query_vec
LIMIT 10;sql
SELECT id, content, 1 - (embedding <=> query_vec) AS similarity
FROM documents
ORDER BY embedding <=> query_vec
LIMIT 10;3. Add Index (> 10k documents)
3. 添加索引(数据量>10k文档时)
sql
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);sql
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);Docker Quick Start
Docker快速启动
bash
undefinedbash
undefinedpgvector with PostgreSQL 17
pgvector with PostgreSQL 17
docker run -d --name pgvector-db
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
pgvector/pgvector:pg17
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
pgvector/pgvector:pg17
docker run -d --name pgvector-db
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
pgvector/pgvector:pg17
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
pgvector/pgvector:pg17
Or PostgreSQL 18 (latest)
Or PostgreSQL 18 (latest)
docker run -d --name pgvector-db
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
pgvector/pgvector:pg18
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
pgvector/pgvector:pg18
docker run -d --name pgvector-db
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
pgvector/pgvector:pg18
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
pgvector/pgvector:pg18
ParadeDB (includes pgvector + pg_search + BM25)
ParadeDB (includes pgvector + pg_search + BM25)
docker run -d --name paradedb
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
paradedb/paradedb:latest
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
paradedb/paradedb:latest
Connect: `psql postgresql://postgres:postgres@localhost:5432/postgres`docker run -d --name paradedb
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
paradedb/paradedb:latest
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
paradedb/paradedb:latest
连接命令:`psql postgresql://postgres:postgres@localhost:5432/postgres`Cheat Sheet
速查表
Distance Operators
距离操作符
sql
embedding <=> query -- Cosine distance (1 - similarity)
embedding <-> query -- L2/Euclidean distance
embedding <#> query -- Negative inner productsql
embedding <=> query -- 余弦距离(1 - 相似度)
embedding <-> query -- L2/欧氏距离
embedding <#> query -- 负内积Common Queries
常用查询
sql
-- Top 10 similar (cosine)
SELECT * FROM docs ORDER BY embedding <=> $1 LIMIT 10;
-- With similarity score
SELECT *, 1 - (embedding <=> $1) AS similarity FROM docs ORDER BY 2 DESC LIMIT 10;
-- With threshold
SELECT * FROM docs WHERE embedding <=> $1 < 0.3 ORDER BY 1 LIMIT 10;
-- Preload index (run on startup)
SELECT 1 FROM docs ORDER BY embedding <=> $1 LIMIT 1;sql
-- 相似度Top 10(余弦)
SELECT * FROM docs ORDER BY embedding <=> $1 LIMIT 10;
-- 带相似度分数
SELECT *, 1 - (embedding <=> $1) AS similarity FROM docs ORDER BY 2 DESC LIMIT 10;
-- 带阈值过滤
SELECT * FROM docs WHERE embedding <=> $1 < 0.3 ORDER BY 1 LIMIT 10;
-- 预加载索引(启动时执行)
SELECT 1 FROM docs ORDER BY embedding <=> $1 LIMIT 1;Index Quick Reference
索引速查
sql
-- HNSW (recommended)
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);
-- With tuning
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops)
WITH (m = 24, ef_construction = 200);
-- Query-time recall
SET hnsw.ef_search = 100;
-- Iterative scan for filtered queries (pgvector 0.8+)
SET hnsw.iterative_scan = relaxed_order;
SET ivfflat.iterative_scan = on;sql
-- HNSW(推荐)
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);
-- 带调优参数
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops)
WITH (m = 24, ef_construction = 200);
-- 查询时召回率设置
SET hnsw.ef_search = 100;
-- 过滤查询的迭代扫描(pgvector 0.8+)
SET hnsw.iterative_scan = relaxed_order;
SET ivfflat.iterative_scan = on;Decision Trees
决策树
Choose Search Method
选择搜索方式
Query type?
├─ Conceptual/meaning-based → Pure vector search
├─ Exact terms/names → Pure keyword search (FTS)
├─ Fuzzy/typo-tolerant → pg_trgm trigram similarity
├─ Autocomplete/prefix → pg_trgm + prefix index
├─ Substring (LIKE/ILIKE) → pg_trgm GIN index
└─ Mixed/unknown → Hybrid search
├─ Simple setup → FTS + RRF (no extra extensions)
├─ Better ranking → BM25 + RRF (pg_search extension)
└─ Full-featured → ParadeDB (Elasticsearch alternative)查询类型?
├─ 概念/基于含义的查询 → 纯向量搜索
├─ 精确术语/名称查询 → 纯关键词搜索(FTS)
├─ 模糊/容错输入查询 → pg_trgm三元组相似度搜索
├─ 自动补全/前缀查询 → pg_trgm + 前缀索引
├─ 子串查询(LIKE/ILIKE) → pg_trgm GIN索引
└─ 混合/未知类型查询 → 混合搜索
├─ 简单配置 → FTS + RRF(无需额外扩展)
├─ 更优排序 → BM25 + RRF(需pg_search扩展)
└─ 全功能方案 → ParadeDB(Elasticsearch替代方案)Choose Index Type
选择索引类型
Document count?
├─ < 10,000 → No index needed
├─ 10k - 1M → HNSW (best recall)
└─ > 1M → IVFFlat (less memory) or HNSW文档数量?
├─ < 10,000 → 无需索引
├─ 10k - 1M → HNSW(召回率最佳)
└─ > 1M → IVFFlat(内存占用更低)或HNSWChoose Vector Type
选择向量类型
Embedding model?
├─ text-embedding-3-small (1536) → vector(1536)
├─ text-embedding-3-large (3072) → halfvec(3072) (50% memory savings)
└─ Other models → vector(dimensions)嵌入模型?
├─ text-embedding-3-small(1536维度) → vector(1536)
├─ text-embedding-3-large(3072维度) → halfvec(3072)(节省50%内存)
└─ 其他模型 → vector(对应维度)Operators
操作符
| Operator | Distance | Use Case |
|---|---|---|
| Cosine | Text embeddings (default) |
| L2/Euclidean | Image embeddings |
| Inner product | Normalized vectors |
| 操作符 | 距离类型 | 使用场景 |
|---|---|---|
| 余弦距离 | 文本嵌入(默认) |
| L2/欧氏距离 | 图像嵌入 |
| 内积 | 归一化向量 |
SQL Functions
SQL函数
Semantic Search
语义搜索
- - Basic search
match_documents(query_vec, threshold, limit) - - With JSONB filter
match_documents_filtered(query_vec, metadata_filter, threshold, limit) - - Search document chunks
match_chunks(query_vec, threshold, limit)
- - 基础搜索
match_documents(query_vec, threshold, limit) - - 支持JSONB过滤
match_documents_filtered(query_vec, metadata_filter, threshold, limit) - - 文档片段搜索
match_chunks(query_vec, threshold, limit)
Fuzzy Search (pg_trgm)
模糊搜索(pg_trgm)
- - Trigram similarity search
fuzzy_search_trigram(query_text, threshold, limit) - - Prefix + fuzzy autocomplete
autocomplete_search(prefix, limit) - - Fuzzy + vector RRF
hybrid_search_fuzzy_semantic(query_text, query_vec, limit, rrf_k) - - FTS with title/content weighting
weighted_fts_search(query_text, language, limit)
- - 三元组相似度搜索
fuzzy_search_trigram(query_text, threshold, limit) - - 前缀+模糊自动补全
autocomplete_search(prefix, limit) - - 模糊+向量RRF混合搜索
hybrid_search_fuzzy_semantic(query_text, query_vec, limit, rrf_k) - - 带标题/内容权重的FTS搜索
weighted_fts_search(query_text, language, limit)
Hybrid Search (FTS)
混合搜索(FTS)
- - FTS + RRF
hybrid_search_fts(query_vec, query_text, limit, rrf_k, language) - - Linear combination
hybrid_search_weighted(query_vec, query_text, limit, sem_weight, kw_weight) - - Graceful degradation
hybrid_search_fallback(query_vec, query_text, limit)
- - FTS + RRF混合搜索
hybrid_search_fts(query_vec, query_text, limit, rrf_k, language) - - 线性组合权重
hybrid_search_weighted(query_vec, query_text, limit, sem_weight, kw_weight) - - 优雅降级方案
hybrid_search_fallback(query_vec, query_text, limit)
Hybrid Search (BM25)
混合搜索(BM25)
- - BM25 + RRF
hybrid_search_bm25(query_vec, query_text, limit, rrf_k) - - With snippet highlighting
hybrid_search_bm25_highlighted(...) - - For RAG with chunks
hybrid_search_chunks_bm25(...)
- - BM25 + RRF混合搜索
hybrid_search_bm25(query_vec, query_text, limit, rrf_k) - - 带片段高亮
hybrid_search_bm25_highlighted(...) - - 适用于RAG的文档片段搜索
hybrid_search_chunks_bm25(...)
Re-ranking (Optional)
重排序(可选)
Two-stage retrieval improves precision: fast recall → precise rerank.
两阶段检索提升精度:快速召回 → 精准重排序。
When to Use
适用场景
- Results need higher precision
- Using < 50 candidates after initial search
- Have budget for API calls (Cohere) or compute (local models)
- 结果需要更高精度
- 初始搜索后候选结果少于50条
- 有API调用预算(Cohere)或计算资源(本地模型)
Options
方案对比
| Method | Latency | Quality | Cost |
|---|---|---|---|
| Cohere Rerank v4.0-fast | ~150ms | Excellent | $0.001/query |
| Cohere Rerank v4.0-pro | ~300ms | Best | $0.002/query |
| Zerank 2 | ~100ms | Best | API cost |
| Voyage Rerank 2.5 | ~100ms | Excellent | API cost |
| Cross-encoder (local) | ~500ms | Very Good | Compute |
| 方法 | 延迟 | 质量 | 成本 |
|---|---|---|---|
| Cohere Rerank v4.0-fast | ~150ms | 优秀 | $0.001/次查询 |
| Cohere Rerank v4.0-pro | ~300ms | 最佳 | $0.002/次查询 |
| Zerank 2 | ~100ms | 最佳 | API调用成本 |
| Voyage Rerank 2.5 | ~100ms | 优秀 | API调用成本 |
| Cross-encoder(本地) | ~500ms | 非常好 | 计算资源成本 |
TypeScript Example (Cohere)
TypeScript示例(Cohere)
typescript
import { CohereClient } from 'cohere-ai';
const cohere = new CohereClient({ token: process.env.COHERE_API_KEY });
async function rerankResults(query: string, documents: string[]) {
const response = await cohere.rerank({
model: 'rerank-v4.0-fast', // or 'rerank-v4.0-pro' for best quality
query,
documents,
topN: 10,
});
return response.results;
}- reranking.md - Detailed guide
typescript
import { CohereClient } from 'cohere-ai';
const cohere = new CohereClient({ token: process.env.COHERE_API_KEY });
async function rerankResults(query: string, documents: string[]) {
const response = await cohere.rerank({
model: 'rerank-v4.0-fast', // 或使用'rerank-v4.0-pro'获取最佳质量
query,
documents,
topN: 10,
});
return response.results;
}- reranking.md - 详细指南
References
参考资料
- fuzzy-search.md - pg_trgm, fuzzy matching, LIKE/ILIKE, autocomplete, advanced FTS
- paradedb.md - ParadeDB full-text search (Elasticsearch alternative)
- vector-types.md - vector vs halfvec, dimensions, storage
- indexing.md - HNSW, IVFFlat, GIN parameters
- hybrid-search.md - FTS, BM25, RRF algorithms
- performance.md - Cold-start, memory, HNSW vs IVFFlat
- fuzzy-search.md - pg_trgm、模糊匹配、LIKE/ILIKE、自动补全、高级FTS
- paradedb.md - ParadeDB全文搜索(Elasticsearch替代方案)
- vector-types.md - vector与halfvec对比、维度、存储
- indexing.md - HNSW、IVFFlat、GIN参数
- hybrid-search.md - FTS、BM25、RRF算法
- performance.md - 冷启动、内存、HNSW vs IVFFlat
Scripts
脚本
- setup.sql - Extension and table setup
- semantic_search.sql - Semantic search functions
- hybrid_search_fts.sql - FTS hybrid functions
- hybrid_search_bm25.sql - BM25 hybrid functions
- fuzzy_search.sql - pg_trgm fuzzy search, autocomplete, weighted FTS
- indexes.sql - Index creation scripts
- setup.sql - 扩展与表配置
- semantic_search.sql - 语义搜索函数
- hybrid_search_fts.sql - FTS混合搜索函数
- hybrid_search_bm25.sql - BM25混合搜索函数
- fuzzy_search.sql - pg_trgm模糊搜索、自动补全、带权重FTS
- indexes.sql - 索引创建脚本
Common Patterns
常见模式
TypeScript Integration (Supabase)
TypeScript集成(Supabase)
typescript
// Semantic search
const { data } = await supabase.rpc('match_documents', {
query_embedding: embedding,
match_threshold: 0.7,
match_count: 10
});
// Hybrid search
const { data } = await supabase.rpc('hybrid_search_fts', {
query_embedding: embedding,
query_text: userQuery,
match_count: 10,
rrf_k: 60,
fts_language: 'simple'
});typescript
// 语义搜索
const { data } = await supabase.rpc('match_documents', {
query_embedding: embedding,
match_threshold: 0.7,
match_count: 10
});
// 混合搜索
const { data } = await supabase.rpc('hybrid_search_fts', {
query_embedding: embedding,
query_text: userQuery,
match_count: 10,
rrf_k: 60,
fts_language: 'simple'
});Drizzle ORM
Drizzle ORM
typescript
import { sql } from 'drizzle-orm';
const results = await db.execute(sql`
SELECT * FROM match_documents(
${embedding}::vector(1536),
0.7,
10
)
`);typescript
import { sql } from 'drizzle-orm';
const results = await db.execute(sql`
SELECT * FROM match_documents(
${embedding}::vector(1536),
0.7,
10
)
`);Troubleshooting
故障排除
| Symptom | Cause | Solution |
|---|---|---|
| Index not used | < 10k rows or planner choice | Normal for small tables, check with EXPLAIN |
| Slow first query (30-60s) | HNSW cold-start | |
| Poor recall | Low ef_search | |
| FTS returns nothing | Wrong language config | Use |
| Memory error on index build | maintenance_work_mem too low | Increase to 2GB+ |
| Cosine similarity > 1 | Vectors not normalized | Normalize before insert or use L2 |
| Slow inserts | Index overhead | Batch inserts, consider IVFFlat |
| Fuzzy search slow | Missing trigram index | |
| ILIKE '%x%' slow | No pg_trgm GIN index | Enable pg_trgm + create GIN trigram index |
| pg_trgm not installed | |
| 症状 | 原因 | 解决方案 |
|---|---|---|
| 未使用索引 | 数据量<10k行或查询优化器选择 | 小表正常,使用EXPLAIN检查 |
| 首次查询缓慢(30-60s) | HNSW冷启动 | 执行 |
| 召回率低 | ef_search值过低 | 设置 |
| FTS无结果返回 | 语言配置错误 | 混合/未知语言使用 |
| 索引构建时内存错误 | maintenance_work_mem值过低 | 增加至2GB以上 |
| 余弦相似度>1 | 向量未归一化 | 插入前归一化或使用L2距离 |
| 插入速度慢 | 索引开销 | 批量插入,考虑使用IVFFlat索引 |
| 模糊搜索缓慢 | 缺少三元组索引 | 创建 |
| ILIKE '%x%'查询缓慢 | 未创建pg_trgm GIN索引 | 启用pg_trgm并创建GIN三元组索引 |
| 未安装pg_trgm | 执行 |
Version Info (January 2026)
版本信息(2026年1月)
- PostgreSQL 18.1: Latest maintenance release with security fixes (Nov 2025)
- PostgreSQL 17.7: Stable LTS option
- pgvector 0.8.1: Iterative scans, PostgreSQL 18 support, halfvec up to 4000 dims
- pg_search 0.21.2: MVCC visibility, parallel aggregation, varchar[] indexing
- Cohere Rerank v4.0: 32K context, 100+ languages, self-learning (Dec 2025)
- PostgreSQL 18.1: 最新维护版本,含安全修复(2025年11月)
- PostgreSQL 17.7: 稳定LTS版本
- pgvector 0.8.1: 支持迭代扫描、PostgreSQL 18、halfvec维度最高4000
- pg_search 0.21.2: 支持MVCC可见性、并行聚合、varchar[]索引
- Cohere Rerank v4.0: 支持32K上下文、100+语言、自学习(2025年12月)
External Documentation
外部文档
- pgvector GitHub - Official extension, latest features
- OpenAI Embeddings Guide - Embedding models and best practices
- Supabase Vector Guide - Supabase-specific integration
- ParadeDB pg_search - BM25 extension documentation
- PostgreSQL FTS - Built-in full-text search
- pgvector GitHub - 官方扩展,最新功能
- OpenAI Embeddings Guide - 嵌入模型与最佳实践
- Supabase Vector Guide - Supabase专属集成指南
- ParadeDB pg_search - BM25扩展文档
- PostgreSQL FTS - 内置全文搜索文档