postgres-semantic-search

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

PostgreSQL Semantic Search

PostgreSQL 语义搜索

Quick Start

快速开始

1. Setup

1. 配置

sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    embedding vector(1536)  -- text-embedding-3-small
    -- Or: embedding halfvec(3072)  -- text-embedding-3-large (50% memory)
);
sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    embedding vector(1536)  -- text-embedding-3-small
    -- Or: embedding halfvec(3072)  -- text-embedding-3-large (50% memory)
);

2. Basic Semantic Search

2. 基础语义搜索

sql
SELECT id, content, 1 - (embedding <=> query_vec) AS similarity
FROM documents
ORDER BY embedding <=> query_vec
LIMIT 10;
sql
SELECT id, content, 1 - (embedding <=> query_vec) AS similarity
FROM documents
ORDER BY embedding <=> query_vec
LIMIT 10;

3. Add Index (> 10k documents)

3. 添加索引(数据量>10k文档时)

sql
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
sql
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

Docker Quick Start

Docker快速启动

bash
undefined
bash
undefined

pgvector with PostgreSQL 17

pgvector with PostgreSQL 17

docker run -d --name pgvector-db
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
pgvector/pgvector:pg17
docker run -d --name pgvector-db
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
pgvector/pgvector:pg17

Or PostgreSQL 18 (latest)

Or PostgreSQL 18 (latest)

docker run -d --name pgvector-db
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
pgvector/pgvector:pg18
docker run -d --name pgvector-db
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
pgvector/pgvector:pg18

ParadeDB (includes pgvector + pg_search + BM25)

ParadeDB (includes pgvector + pg_search + BM25)

docker run -d --name paradedb
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
paradedb/paradedb:latest

Connect: `psql postgresql://postgres:postgres@localhost:5432/postgres`
docker run -d --name paradedb
-e POSTGRES_PASSWORD=postgres
-p 5432:5432
paradedb/paradedb:latest

连接命令:`psql postgresql://postgres:postgres@localhost:5432/postgres`

Cheat Sheet

速查表

Distance Operators

距离操作符

sql
embedding <=> query  -- Cosine distance (1 - similarity)
embedding <-> query  -- L2/Euclidean distance
embedding <#> query  -- Negative inner product
sql
embedding <=> query  -- 余弦距离(1 - 相似度)
embedding <-> query  -- L2/欧氏距离
embedding <#> query  -- 负内积

Common Queries

常用查询

sql
-- Top 10 similar (cosine)
SELECT * FROM docs ORDER BY embedding <=> $1 LIMIT 10;

-- With similarity score
SELECT *, 1 - (embedding <=> $1) AS similarity FROM docs ORDER BY 2 DESC LIMIT 10;

-- With threshold
SELECT * FROM docs WHERE embedding <=> $1 < 0.3 ORDER BY 1 LIMIT 10;

-- Preload index (run on startup)
SELECT 1 FROM docs ORDER BY embedding <=> $1 LIMIT 1;
sql
-- 相似度Top 10(余弦)
SELECT * FROM docs ORDER BY embedding <=> $1 LIMIT 10;

-- 带相似度分数
SELECT *, 1 - (embedding <=> $1) AS similarity FROM docs ORDER BY 2 DESC LIMIT 10;

-- 带阈值过滤
SELECT * FROM docs WHERE embedding <=> $1 < 0.3 ORDER BY 1 LIMIT 10;

-- 预加载索引(启动时执行)
SELECT 1 FROM docs ORDER BY embedding <=> $1 LIMIT 1;

Index Quick Reference

索引速查

sql
-- HNSW (recommended)
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);

-- With tuning
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops)
WITH (m = 24, ef_construction = 200);

-- Query-time recall
SET hnsw.ef_search = 100;

-- Iterative scan for filtered queries (pgvector 0.8+)
SET hnsw.iterative_scan = relaxed_order;
SET ivfflat.iterative_scan = on;
sql
-- HNSW(推荐)
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);

-- 带调优参数
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops)
WITH (m = 24, ef_construction = 200);

-- 查询时召回率设置
SET hnsw.ef_search = 100;

-- 过滤查询的迭代扫描(pgvector 0.8+)
SET hnsw.iterative_scan = relaxed_order;
SET ivfflat.iterative_scan = on;

Decision Trees

决策树

Choose Search Method

选择搜索方式

Query type?
├─ Conceptual/meaning-based → Pure vector search
├─ Exact terms/names → Pure keyword search (FTS)
├─ Fuzzy/typo-tolerant → pg_trgm trigram similarity
├─ Autocomplete/prefix → pg_trgm + prefix index
├─ Substring (LIKE/ILIKE) → pg_trgm GIN index
└─ Mixed/unknown → Hybrid search
    ├─ Simple setup → FTS + RRF (no extra extensions)
    ├─ Better ranking → BM25 + RRF (pg_search extension)
    └─ Full-featured → ParadeDB (Elasticsearch alternative)
查询类型?
├─ 概念/基于含义的查询 → 纯向量搜索
├─ 精确术语/名称查询 → 纯关键词搜索(FTS)
├─ 模糊/容错输入查询 → pg_trgm三元组相似度搜索
├─ 自动补全/前缀查询 → pg_trgm + 前缀索引
├─ 子串查询(LIKE/ILIKE) → pg_trgm GIN索引
└─ 混合/未知类型查询 → 混合搜索
    ├─ 简单配置 → FTS + RRF(无需额外扩展)
    ├─ 更优排序 → BM25 + RRF(需pg_search扩展)
    └─ 全功能方案 → ParadeDB(Elasticsearch替代方案)

Choose Index Type

选择索引类型

Document count?
├─ < 10,000 → No index needed
├─ 10k - 1M → HNSW (best recall)
└─ > 1M → IVFFlat (less memory) or HNSW
文档数量?
├─ < 10,000 → 无需索引
├─ 10k - 1M → HNSW(召回率最佳)
└─ > 1M → IVFFlat(内存占用更低)或HNSW

Choose Vector Type

选择向量类型

Embedding model?
├─ text-embedding-3-small (1536) → vector(1536)
├─ text-embedding-3-large (3072) → halfvec(3072) (50% memory savings)
└─ Other models → vector(dimensions)
嵌入模型?
├─ text-embedding-3-small(1536维度) → vector(1536)
├─ text-embedding-3-large(3072维度) → halfvec(3072)(节省50%内存)
└─ 其他模型 → vector(对应维度)

Operators

操作符

OperatorDistanceUse Case
<=>
CosineText embeddings (default)
<->
L2/EuclideanImage embeddings
<#>
Inner productNormalized vectors
操作符距离类型使用场景
<=>
余弦距离文本嵌入(默认)
<->
L2/欧氏距离图像嵌入
<#>
内积归一化向量

SQL Functions

SQL函数

Semantic Search

语义搜索

  • match_documents(query_vec, threshold, limit)
    - Basic search
  • match_documents_filtered(query_vec, metadata_filter, threshold, limit)
    - With JSONB filter
  • match_chunks(query_vec, threshold, limit)
    - Search document chunks
  • match_documents(query_vec, threshold, limit)
    - 基础搜索
  • match_documents_filtered(query_vec, metadata_filter, threshold, limit)
    - 支持JSONB过滤
  • match_chunks(query_vec, threshold, limit)
    - 文档片段搜索

Fuzzy Search (pg_trgm)

模糊搜索(pg_trgm)

  • fuzzy_search_trigram(query_text, threshold, limit)
    - Trigram similarity search
  • autocomplete_search(prefix, limit)
    - Prefix + fuzzy autocomplete
  • hybrid_search_fuzzy_semantic(query_text, query_vec, limit, rrf_k)
    - Fuzzy + vector RRF
  • weighted_fts_search(query_text, language, limit)
    - FTS with title/content weighting
  • fuzzy_search_trigram(query_text, threshold, limit)
    - 三元组相似度搜索
  • autocomplete_search(prefix, limit)
    - 前缀+模糊自动补全
  • hybrid_search_fuzzy_semantic(query_text, query_vec, limit, rrf_k)
    - 模糊+向量RRF混合搜索
  • weighted_fts_search(query_text, language, limit)
    - 带标题/内容权重的FTS搜索

Hybrid Search (FTS)

混合搜索(FTS)

  • hybrid_search_fts(query_vec, query_text, limit, rrf_k, language)
    - FTS + RRF
  • hybrid_search_weighted(query_vec, query_text, limit, sem_weight, kw_weight)
    - Linear combination
  • hybrid_search_fallback(query_vec, query_text, limit)
    - Graceful degradation
  • hybrid_search_fts(query_vec, query_text, limit, rrf_k, language)
    - FTS + RRF混合搜索
  • hybrid_search_weighted(query_vec, query_text, limit, sem_weight, kw_weight)
    - 线性组合权重
  • hybrid_search_fallback(query_vec, query_text, limit)
    - 优雅降级方案

Hybrid Search (BM25)

混合搜索(BM25)

  • hybrid_search_bm25(query_vec, query_text, limit, rrf_k)
    - BM25 + RRF
  • hybrid_search_bm25_highlighted(...)
    - With snippet highlighting
  • hybrid_search_chunks_bm25(...)
    - For RAG with chunks
  • hybrid_search_bm25(query_vec, query_text, limit, rrf_k)
    - BM25 + RRF混合搜索
  • hybrid_search_bm25_highlighted(...)
    - 带片段高亮
  • hybrid_search_chunks_bm25(...)
    - 适用于RAG的文档片段搜索

Re-ranking (Optional)

重排序(可选)

Two-stage retrieval improves precision: fast recall → precise rerank.
两阶段检索提升精度:快速召回 → 精准重排序。

When to Use

适用场景

  • Results need higher precision
  • Using < 50 candidates after initial search
  • Have budget for API calls (Cohere) or compute (local models)
  • 结果需要更高精度
  • 初始搜索后候选结果少于50条
  • 有API调用预算(Cohere)或计算资源(本地模型)

Options

方案对比

MethodLatencyQualityCost
Cohere Rerank v4.0-fast~150msExcellent$0.001/query
Cohere Rerank v4.0-pro~300msBest$0.002/query
Zerank 2~100msBestAPI cost
Voyage Rerank 2.5~100msExcellentAPI cost
Cross-encoder (local)~500msVery GoodCompute
方法延迟质量成本
Cohere Rerank v4.0-fast~150ms优秀$0.001/次查询
Cohere Rerank v4.0-pro~300ms最佳$0.002/次查询
Zerank 2~100ms最佳API调用成本
Voyage Rerank 2.5~100ms优秀API调用成本
Cross-encoder(本地)~500ms非常好计算资源成本

TypeScript Example (Cohere)

TypeScript示例(Cohere)

typescript
import { CohereClient } from 'cohere-ai';

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY });

async function rerankResults(query: string, documents: string[]) {
  const response = await cohere.rerank({
    model: 'rerank-v4.0-fast',  // or 'rerank-v4.0-pro' for best quality
    query,
    documents,
    topN: 10,
  });
  return response.results;
}
  • reranking.md - Detailed guide
typescript
import { CohereClient } from 'cohere-ai';

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY });

async function rerankResults(query: string, documents: string[]) {
  const response = await cohere.rerank({
    model: 'rerank-v4.0-fast',  // 或使用'rerank-v4.0-pro'获取最佳质量
    query,
    documents,
    topN: 10,
  });
  return response.results;
}
  • reranking.md - 详细指南

References

参考资料

  • fuzzy-search.md - pg_trgm, fuzzy matching, LIKE/ILIKE, autocomplete, advanced FTS
  • paradedb.md - ParadeDB full-text search (Elasticsearch alternative)
  • vector-types.md - vector vs halfvec, dimensions, storage
  • indexing.md - HNSW, IVFFlat, GIN parameters
  • hybrid-search.md - FTS, BM25, RRF algorithms
  • performance.md - Cold-start, memory, HNSW vs IVFFlat
  • fuzzy-search.md - pg_trgm、模糊匹配、LIKE/ILIKE、自动补全、高级FTS
  • paradedb.md - ParadeDB全文搜索(Elasticsearch替代方案)
  • vector-types.md - vector与halfvec对比、维度、存储
  • indexing.md - HNSW、IVFFlat、GIN参数
  • hybrid-search.md - FTS、BM25、RRF算法
  • performance.md - 冷启动、内存、HNSW vs IVFFlat

Scripts

脚本

  • setup.sql - Extension and table setup
  • semantic_search.sql - Semantic search functions
  • hybrid_search_fts.sql - FTS hybrid functions
  • hybrid_search_bm25.sql - BM25 hybrid functions
  • fuzzy_search.sql - pg_trgm fuzzy search, autocomplete, weighted FTS
  • indexes.sql - Index creation scripts
  • setup.sql - 扩展与表配置
  • semantic_search.sql - 语义搜索函数
  • hybrid_search_fts.sql - FTS混合搜索函数
  • hybrid_search_bm25.sql - BM25混合搜索函数
  • fuzzy_search.sql - pg_trgm模糊搜索、自动补全、带权重FTS
  • indexes.sql - 索引创建脚本

Common Patterns

常见模式

TypeScript Integration (Supabase)

TypeScript集成(Supabase)

typescript
// Semantic search
const { data } = await supabase.rpc('match_documents', {
  query_embedding: embedding,
  match_threshold: 0.7,
  match_count: 10
});

// Hybrid search
const { data } = await supabase.rpc('hybrid_search_fts', {
  query_embedding: embedding,
  query_text: userQuery,
  match_count: 10,
  rrf_k: 60,
  fts_language: 'simple'
});
typescript
// 语义搜索
const { data } = await supabase.rpc('match_documents', {
  query_embedding: embedding,
  match_threshold: 0.7,
  match_count: 10
});

// 混合搜索
const { data } = await supabase.rpc('hybrid_search_fts', {
  query_embedding: embedding,
  query_text: userQuery,
  match_count: 10,
  rrf_k: 60,
  fts_language: 'simple'
});

Drizzle ORM

Drizzle ORM

typescript
import { sql } from 'drizzle-orm';

const results = await db.execute(sql`
  SELECT * FROM match_documents(
    ${embedding}::vector(1536),
    0.7,
    10
  )
`);
typescript
import { sql } from 'drizzle-orm';

const results = await db.execute(sql`
  SELECT * FROM match_documents(
    ${embedding}::vector(1536),
    0.7,
    10
  )
`);

Troubleshooting

故障排除

SymptomCauseSolution
Index not used< 10k rows or planner choiceNormal for small tables, check with EXPLAIN
Slow first query (30-60s)HNSW cold-start
SELECT pg_prewarm('idx_name')
or preload query
Poor recallLow ef_search
SET hnsw.ef_search = 100
or higher
FTS returns nothingWrong language configUse
'simple'
for mixed/unknown languages
Memory error on index buildmaintenance_work_mem too lowIncrease to 2GB+
Cosine similarity > 1Vectors not normalizedNormalize before insert or use L2
Slow insertsIndex overheadBatch inserts, consider IVFFlat
Fuzzy search slowMissing trigram index
CREATE INDEX USING gin (col gin_trgm_ops)
ILIKE '%x%' slowNo pg_trgm GIN indexEnable pg_trgm + create GIN trigram index
%
operator error
pg_trgm not installed
CREATE EXTENSION IF NOT EXISTS pg_trgm
症状原因解决方案
未使用索引数据量<10k行或查询优化器选择小表正常,使用EXPLAIN检查
首次查询缓慢(30-60s)HNSW冷启动执行
SELECT pg_prewarm('idx_name')
或预加载查询
召回率低ef_search值过低设置
SET hnsw.ef_search = 100
或更高
FTS无结果返回语言配置错误混合/未知语言使用
'simple'
配置
索引构建时内存错误maintenance_work_mem值过低增加至2GB以上
余弦相似度>1向量未归一化插入前归一化或使用L2距离
插入速度慢索引开销批量插入,考虑使用IVFFlat索引
模糊搜索缓慢缺少三元组索引创建
CREATE INDEX USING gin (col gin_trgm_ops)
ILIKE '%x%'查询缓慢未创建pg_trgm GIN索引启用pg_trgm并创建GIN三元组索引
%
操作符错误
未安装pg_trgm执行
CREATE EXTENSION IF NOT EXISTS pg_trgm

Version Info (January 2026)

版本信息(2026年1月)

  • PostgreSQL 18.1: Latest maintenance release with security fixes (Nov 2025)
  • PostgreSQL 17.7: Stable LTS option
  • pgvector 0.8.1: Iterative scans, PostgreSQL 18 support, halfvec up to 4000 dims
  • pg_search 0.21.2: MVCC visibility, parallel aggregation, varchar[] indexing
  • Cohere Rerank v4.0: 32K context, 100+ languages, self-learning (Dec 2025)
  • PostgreSQL 18.1: 最新维护版本,含安全修复(2025年11月)
  • PostgreSQL 17.7: 稳定LTS版本
  • pgvector 0.8.1: 支持迭代扫描、PostgreSQL 18、halfvec维度最高4000
  • pg_search 0.21.2: 支持MVCC可见性、并行聚合、varchar[]索引
  • Cohere Rerank v4.0: 支持32K上下文、100+语言、自学习(2025年12月)

External Documentation

外部文档