pgvector-search

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

PGVector Hybrid Search

PGVector混合搜索

Production-grade semantic + keyword search using PostgreSQL
基于PostgreSQL的生产级语义+关键词搜索方案

Overview

概述

Architecture:
Query
  |
[Generate embedding] --> Vector Search (PGVector) --> Top 30 results
  |
[Generate ts_query]  --> Keyword Search (BM25)    --> Top 30 results
  |
[Reciprocal Rank Fusion (RRF)] --> Merge & re-rank --> Top 10 final results
When to use this skill:
  • Building semantic search (RAG, knowledge bases, recommendations)
  • Implementing hybrid retrieval (vector + keyword)
  • Optimizing PGVector performance
  • Working with large document collections (1M+ chunks)

架构:
查询
  |
[生成嵌入向量] --> 向量搜索(PGVector) --> 前30条结果
  |
[生成ts_query]  --> 关键词搜索(BM25)    --> 前30条结果
  |
[Reciprocal Rank Fusion(RRF)] --> 合并并重排序 --> 最终前10条结果
适用场景:
  • 构建语义搜索(RAG、知识库、推荐系统)
  • 实现混合检索(向量+关键词)
  • 优化PGVector性能
  • 处理大规模文档集合(100万+文本块)

Quick Reference

快速参考

Search Type Comparison

搜索类型对比

AspectSemantic (Vector)Keyword (BM25)
QueryEmbedding similarityExact word matches
StrengthsSynonyms, conceptsExact phrases, rare terms
WeaknessesExact matches, technical termsNo semantic understanding
IndexHNSW (pgvector)GIN (tsvector)
维度语义(向量)搜索关键词(BM25)搜索
查询方式嵌入向量相似度匹配精确词汇匹配
优势支持同义词、概念匹配精确短语、罕见术语匹配
劣势精确匹配、技术术语表现不佳无语义理解能力
索引类型HNSW(pgvector)GIN(tsvector)

Index Comparison

索引对比

MetricIVFFlatHNSW
Query speed50ms3ms (17x faster)
Index time2 min20 min
Best for< 100k vectors100k+ vectors
Recall@100.85-0.950.95-0.99
Recommendation: Use HNSW for production (scales to millions).
指标IVFFlatHNSW
查询速度50ms3ms(快17倍)
索引构建时间2分钟20分钟
最佳适用场景向量数量<10万向量数量>10万
Recall@100.85-0.950.95-0.99
推荐: 生产环境使用HNSW(可扩展至百万级向量)。

RRF Formula

RRF公式

python
rrf_score = 1/(k + vector_rank) + 1/(k + keyword_rank)  # k=60 (standard)

python
rrf_score = 1/(k + vector_rank) + 1/(k + keyword_rank)  # k=60(标准值)

Database Schema

数据库 Schema

sql
CREATE TABLE chunks (
    id UUID PRIMARY KEY,
    document_id UUID REFERENCES documents(id),
    content TEXT NOT NULL,
    embedding vector(1024),  -- PGVector
    content_tsvector tsvector GENERATED ALWAYS AS (
        to_tsvector('english', content)
    ) STORED,
    section_title TEXT,
    content_type TEXT,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Indexes
CREATE INDEX idx_chunks_embedding ON chunks
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

CREATE INDEX idx_chunks_content_tsvector ON chunks
    USING gin (content_tsvector);

sql
CREATE TABLE chunks (
    id UUID PRIMARY KEY,
    document_id UUID REFERENCES documents(id),
    content TEXT NOT NULL,
    embedding vector(1024),  -- PGVector
    content_tsvector tsvector GENERATED ALWAYS AS (
        to_tsvector('english', content)
    ) STORED,
    section_title TEXT,
    content_type TEXT,
    created_at TIMESTAMP DEFAULT NOW()
);

-- 索引
CREATE INDEX idx_chunks_embedding ON chunks
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

CREATE INDEX idx_chunks_content_tsvector ON chunks
    USING gin (content_tsvector);

Hybrid Search Query (SQLAlchemy)

混合搜索查询(SQLAlchemy)

python
async def hybrid_search(
    query: str,
    query_embedding: list[float],
    top_k: int = 10
) -> list[Chunk]:
    FETCH_MULTIPLIER = 3  # Fetch 30 for better RRF coverage
    K = 60  # RRF smoothing constant

    # Vector search subquery
    vector_subq = (
        select(Chunk.id,
            func.row_number().over(
                order_by=Chunk.embedding.cosine_distance(query_embedding)
            ).label("vector_rank"))
        .limit(top_k * FETCH_MULTIPLIER)
        .subquery()
    )

    # Keyword search subquery
    ts_query = func.plainto_tsquery("english", query)
    keyword_subq = (
        select(Chunk.id,
            func.row_number().over(
                order_by=func.ts_rank_cd(Chunk.content_tsvector, ts_query).desc()
            ).label("keyword_rank"))
        .where(Chunk.content_tsvector.op("@@")(ts_query))
        .limit(top_k * FETCH_MULTIPLIER)
        .subquery()
    )

    # RRF fusion with FULL OUTER JOIN
    rrf_subq = (
        select(
            func.coalesce(vector_subq.c.id, keyword_subq.c.id).label("chunk_id"),
            (func.coalesce(1.0 / (K + vector_subq.c.vector_rank), 0.0) +
             func.coalesce(1.0 / (K + keyword_subq.c.keyword_rank), 0.0)
            ).label("rrf_score"))
        .select_from(vector_subq.outerjoin(keyword_subq, ..., full=True))
        .order_by("rrf_score DESC")
        .limit(top_k)
        .subquery()
    )

    return await session.execute(
        select(Chunk).join(rrf_subq, Chunk.id == rrf_subq.c.chunk_id)
    )

python
async def hybrid_search(
    query: str,
    query_embedding: list[float],
    top_k: int = 10
) -> list[Chunk]:
    FETCH_MULTIPLIER = 3  # 取30条结果以提升RRF覆盖度
    K = 60  # RRF平滑常数

    # 向量搜索子查询
    vector_subq = (
        select(Chunk.id,
            func.row_number().over(
                order_by=Chunk.embedding.cosine_distance(query_embedding)
            ).label("vector_rank"))
        .limit(top_k * FETCH_MULTIPLIER)
        .subquery()
    )

    # 关键词搜索子查询
    ts_query = func.plainto_tsquery("english", query)
    keyword_subq = (
        select(Chunk.id,
            func.row_number().over(
                order_by=func.ts_rank_cd(Chunk.content_tsvector, ts_query).desc()
            ).label("keyword_rank"))
        .where(Chunk.content_tsvector.op("@@")(ts_query))
        .limit(top_k * FETCH_MULTIPLIER)
        .subquery()
    )

    # 基于FULL OUTER JOIN的RRF融合
    rrf_subq = (
        select(
            func.coalesce(vector_subq.c.id, keyword_subq.c.id).label("chunk_id"),
            (func.coalesce(1.0 / (K + vector_subq.c.vector_rank), 0.0) +
             func.coalesce(1.0 / (K + keyword_subq.c.keyword_rank), 0.0)
            ).label("rrf_score"))
        .select_from(vector_subq.outerjoin(keyword_subq, ..., full=True))
        .order_by("rrf_score DESC")
        .limit(top_k)
        .subquery()
    )

    return await session.execute(
        select(Chunk).join(rrf_subq, Chunk.id == rrf_subq.c.chunk_id)
    )

Common Patterns

常见模式

Filtered Search

过滤搜索

python
results = await hybrid_search(
    query="binary search",
    query_embedding=embedding,
    content_type_filter=["code_block"]
)
python
results = await hybrid_search(
    query="binary search",
    query_embedding=embedding,
    content_type_filter=["code_block"]
)

Similarity Threshold

相似度阈值过滤

python
results = await hybrid_search(query, embedding, top_k=50)
filtered = [r for r in results if (1 - r.vector_distance) >= 0.75][:10]
python
results = await hybrid_search(query, embedding, top_k=50)
filtered = [r for r in results if (1 - r.vector_distance) >= 0.75][:10]

Multi-Query Retrieval

多查询检索

python
queries = ["machine learning", "ML algorithms", "neural networks"]
all_results = [await hybrid_search(q, embed(q)) for q in queries]
final = deduplicate_and_rerank(all_results)

python
queries = ["machine learning", "ML algorithms", "neural networks"]
all_results = [await hybrid_search(q, embed(q)) for q in queries]
final = deduplicate_and_rerank(all_results)

Performance Tips

性能优化技巧

  1. Pre-compute tsvector - 5-10x faster than
    to_tsvector()
    at query time
  2. Use HNSW index - 17x faster queries than IVFFlat
  3. 3x fetch multiplier - Better RRF coverage (30 results per search for top 10)
  4. Iterative scan for filtered queries - Set
    hnsw.iterative_scan = 'relaxed_order'
  5. Metadata boosting - +6% MRR with title/path matching

  1. 预计算tsvector - 比查询时调用
    to_tsvector()
    快5-10倍
  2. 使用HNSW索引 - 查询速度比IVFFlat快17倍
  3. 3倍结果获取系数 - 提升RRF覆盖度(每个搜索取30条结果以得到最终10条)
  4. 过滤查询使用迭代扫描 - 设置
    hnsw.iterative_scan = 'relaxed_order'
  5. 元数据加权 - 匹配标题/路径可提升6%的MRR(平均倒数排名)

Redis 8 FT.HYBRID Alternative

Redis 8 FT.HYBRID替代方案

Redis 8.4 introduced
FT.HYBRID
with native hybrid search support. Consider this alternative when latency is critical.
Redis 8.4引入了
FT.HYBRID
,支持原生混合搜索。当对延迟要求极高时,可考虑此替代方案。

FT.HYBRID Command Syntax

FT.HYBRID命令语法

redis
FT.HYBRID index query
  VECTOR $embedding AS vec_score
  TEXT "search terms" AS text_score
  COMBINE RRF
  LIMIT 0 10
  RETURN 3 content title score
redis
FT.HYBRID index query
  VECTOR $embedding AS vec_score
  TEXT "search terms" AS text_score
  COMBINE RRF
  LIMIT 0 10
  RETURN 3 content title score

Comparison: pgvector vs Redis 8 FT.HYBRID

对比:pgvector vs Redis 8 FT.HYBRID

AspectpgvectorRedis 8 FT.HYBRID
Setup complexityMedium (extensions, indexes)Low (single FT.CREATE)
RRF implementationManual SQL with FULL OUTER JOINNative
COMBINE RRF
Keyword scoring
ts_rank_cd
(TF-IDF variant)
BM25STD
(configurable)
Typical latency5-20ms2-5ms
PersistenceACID transactionsAOF/RDB (configurable)
Max datasetBillions (with partitioning)Memory-bound (~100M vectors)
Joins/relationsFull SQL supportLimited (hash/JSON docs)
Operational maturityBattle-testedNew (8.4, 2024)
维度pgvectorRedis 8 FT.HYBRID
部署复杂度中等(需扩展插件、创建索引)低(仅需FT.CREATE命令)
RRF实现方式手动通过SQL FULL OUTER JOIN实现原生支持
COMBINE RRF
关键词评分算法
ts_rank_cd
(TF-IDF变体)
BM25STD
(可配置)
典型延迟5-20ms2-5ms
持久性ACID事务支持AOF/RDB(可配置)
最大数据集规模数十亿(支持分区)受内存限制(约1亿向量)
关联查询支持完整SQL支持有限(哈希/JSON文档)
生产成熟度久经考验较新(2024年8.4版本推出)

When to Choose pgvector

何时选择pgvector

  • Already using PostgreSQL - No new infrastructure, single source of truth
  • Need ACID transactions - Atomic updates across search index and relational data
  • Complex joins with relational data - Foreign keys, aggregations, reporting
  • Large datasets - Billions of vectors with table partitioning
  • Compliance requirements - Established PostgreSQL security/audit tooling
  • 已在使用PostgreSQL - 无需新增基础设施,保持单一数据源
  • 需要ACID事务 - 搜索索引与关系型数据可实现原子更新
  • 复杂关联查询 - 支持外键、聚合、报表等操作
  • 超大规模数据集 - 支持表分区,可处理数十亿向量
  • 合规要求 - PostgreSQL拥有成熟的安全/审计工具链

When to Choose Redis 8

何时选择Redis 8

  • Need sub-5ms latency - Real-time autocomplete, typeahead, live recommendations
  • Caching layer with search - Already using Redis for cache, add search capability
  • Simpler deployment - Single binary, no extensions or complex configuration
  • Ephemeral search indexes - Rebuild from source of truth is acceptable
  • Memory budget allows - Dataset fits comfortably in RAM

  • 亚5ms延迟需求 - 实时自动补全、输入提示、实时推荐
  • 已有Redis缓存层 - 在现有缓存基础上添加搜索能力
  • 简化部署 - 单二进制文件,无需扩展插件或复杂配置
  • 临时搜索索引 - 可从源数据重建索引的场景
  • 内存预算充足 - 数据集可完全放入内存

References

参考资料

Detailed Implementation Guides

详细实现指南

ReferenceDescriptionUse When
index-strategies.mdHNSW vs IVFFlat, tuning, iterative scansChoosing/optimizing indexes
hybrid-search-rrf.mdRRF algorithm, SQL implementation, debuggingImplementing hybrid search
metadata-filtering.mdPre/post filtering, score boostingImproving relevance
参考文档描述适用场景
index-strategies.mdHNSW与IVFFlat对比、调优、迭代扫描选择/优化索引时
hybrid-search-rrf.mdRRF算法、SQL实现、调试实现混合搜索时
metadata-filtering.md预过滤/后过滤、评分加权提升搜索相关性时

External Resources

外部资源

Related Skills

相关技能

  • ai-native-development
    - Embeddings and vector concepts
  • database-schema-designer
    - Schema design for vector search

Version: 1.2.0 | Status: Production-ready | Updated: pgvector 0.8.1

  • ai-native-development
    - 嵌入向量与向量相关概念
  • database-schema-designer
    - 向量搜索的Schema设计

版本: 1.2.0 | 状态: 可用于生产环境 | 更新适配: pgvector 0.8.1

Capability Details

能力详情

hybrid-search-rrf

hybrid-search-rrf

Keywords: hybrid search, rrf, reciprocal rank fusion, vector bm25, semantic keyword search Solves:
  • How do I combine vector and keyword search?
  • Implement hybrid retrieval with RRF
  • Merge semantic and BM25 results
关键词: hybrid search, rrf, reciprocal rank fusion, vector bm25, semantic keyword search 解决问题:
  • 如何结合向量与关键词搜索?
  • 基于RRF实现混合检索
  • 合并语义与BM25搜索结果

semantic-search

semantic-search

Keywords: semantic search, vector similarity, embedding, nearest neighbor, cosine distance Solves:
  • How does semantic search work?
  • When to use semantic vs keyword search
  • Semantic search strengths and weaknesses
关键词: semantic search, vector similarity, embedding, nearest neighbor, cosine distance 解决问题:
  • 语义搜索的工作原理是什么?
  • 何时使用语义搜索 vs 关键词搜索
  • 语义搜索的优势与劣势

keyword-search-bm25

keyword-search-bm25

Keywords: bm25, full-text search, tsvector, tsquery, keyword search Solves:
  • How does BM25 keyword search work?
  • Implement PostgreSQL full-text search
  • BM25 vs semantic search trade-offs
关键词: bm25, full-text search, tsvector, tsquery, keyword search 解决问题:
  • BM25关键词搜索的工作原理是什么?
  • 实现PostgreSQL全文搜索
  • BM25与语义搜索的权衡

rrf-algorithm

rrf-algorithm

Keywords: rrf, reciprocal rank fusion, rank-based fusion, score normalization Solves:
  • How does Reciprocal Rank Fusion work?
  • Why use rank instead of scores?
  • RRF smoothing constant (k parameter)
关键词: rrf, reciprocal rank fusion, rank-based fusion, score normalization 解决问题:
  • Reciprocal Rank Fusion的工作原理是什么?
  • 为什么基于排名而非分数进行融合?
  • RRF平滑常数(k参数)

database-schema

database-schema

Keywords: pgvector schema, chunk table, embedding column, tsvector, generated column Solves:
  • How do I design schema for hybrid search?
  • Store embeddings with vector(1024)
  • Pre-compute tsvector for performance
关键词: pgvector schema, chunk table, embedding column, tsvector, generated column 解决问题:
  • 如何设计混合搜索的数据库Schema?
  • 使用vector(1024)存储嵌入向量
  • 预计算tsvector以提升性能

search-query-implementation

search-query-implementation

Keywords: hybrid search query, sqlalchemy, vector distance, ts_rank_cd, full outer join Solves:
  • How do I write hybrid search SQL?
  • Implement RRF in SQLAlchemy
  • Use fetch multiplier for better coverage
关键词: hybrid search query, sqlalchemy, vector distance, ts_rank_cd, full outer join 解决问题:
  • 如何编写混合搜索的SQL语句?
  • 在SQLAlchemy中实现RRF
  • 使用结果获取系数提升覆盖度

indexing-strategies

indexing-strategies

Keywords: pgvector index, hnsw, ivfflat, vector index performance, index tuning Solves:
  • HNSW vs IVFFlat comparison
  • Optimize vector search speed
  • Scale to millions of vectors
关键词: pgvector index, hnsw, ivfflat, vector index performance, index tuning 解决问题:
  • HNSW与IVFFlat对比
  • 优化向量搜索速度
  • 扩展至百万级向量

pre-computed-tsvector

pre-computed-tsvector

Keywords: tsvector, gin index, full-text index, pre-computed column, generated column Solves:
  • Optimize keyword search performance
  • 5-10x speedup with indexed tsvector
关键词: tsvector, gin index, full-text index, pre-computed column, generated column 解决问题:
  • 优化关键词搜索性能
  • 索引化tsvector可提升5-10倍速度

metadata-filtering

metadata-filtering

Keywords: metadata filter, faceted search, content type filter, score boosting Solves:
  • Filter search by metadata
  • Boost results by section title
  • Pre-filter by content type
关键词: metadata filter, faceted search, content type filter, score boosting 解决问题:
  • 基于元数据过滤搜索结果
  • 通过章节标题加权提升结果排名
  • 按内容类型预过滤

common-patterns

common-patterns

Keywords: filtered search, similarity threshold, multi-query retrieval, search patterns Solves:
  • Filter search by content type
  • Set minimum similarity threshold
  • Implement multi-query retrieval
关键词: filtered search, similarity threshold, multi-query retrieval, search patterns 解决问题:
  • 按内容类型过滤搜索结果
  • 设置最小相似度阈值
  • 实现多查询检索

golden-dataset-testing

golden-dataset-testing

Keywords: golden dataset, search evaluation, pass rate, mrr, retrieval testing Solves:
  • Test hybrid search quality
  • Evaluate search with golden queries
  • Calculate pass rate and MRR metrics
关键词: golden dataset, search evaluation, pass rate, mrr, retrieval testing 解决问题:
  • 测试混合搜索的质量
  • 使用标准查询评估搜索效果
  • 计算通过率与MRR指标