
FAISS - Efficient Similarity Search


Facebook AI's library for billion-scale vector similarity search.

When to use FAISS


Use FAISS when:
  • You need fast similarity search over large vector datasets (millions to billions)
  • GPU acceleration is required
  • Pure vector similarity is enough (no metadata filtering needed)
  • High throughput and low latency are critical
  • You do offline/batch processing of embeddings
Metrics:
  • 31,700+ GitHub stars
  • Developed by Meta/Facebook AI Research
  • Handles billions of vectors
  • C++ with Python bindings
Use alternatives instead:
  • Chroma/Pinecone: need metadata filtering
  • Weaviate: need full database features
  • Annoy: simpler, with fewer features
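The decision rules above can be condensed into a rough heuristic. The helper below is hypothetical (`pick_index` is not a FAISS API), and the size thresholds are assumptions drawn from common guidance (Flat below ~10K vectors, IVF up to ~1M, HNSW beyond that):

```python
def pick_index(n_vectors: int, need_exact: bool = False,
               memory_tight: bool = False) -> str:
    """Hypothetical helper: map dataset traits to a FAISS index family."""
    if need_exact or n_vectors < 10_000:
        return "Flat"   # exhaustive search is still fast at this scale
    if memory_tight:
        return "PQ"     # product quantization trades accuracy for memory
    if n_vectors < 1_000_000:
        return "IVF"    # clustered approximate search
    return "HNSW"       # best speed/quality trade-off at large scale


print(pick_index(5_000))       # Flat
print(pick_index(500_000))     # IVF
print(pick_index(50_000_000))  # HNSW
```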

Quick start


Installation


```bash
# CPU only
pip install faiss-cpu

# GPU support (note: recent FAISS releases publish GPU wheels as faiss-gpu-cu12)
pip install faiss-gpu
```

Basic usage

```python
import faiss
import numpy as np

# Create sample data (1000 vectors, 128 dimensions)
d = 128
nb = 1000
vectors = np.random.random((nb, d)).astype('float32')

# Create index
index = faiss.IndexFlatL2(d)  # L2 distance
index.add(vectors)            # add vectors

# Search
k = 5  # find 5 nearest neighbors
query = np.random.random((1, d)).astype('float32')
distances, indices = index.search(query, k)

print(f"Nearest neighbors: {indices}")
print(f"Distances: {distances}")
```

Index types

1. Flat (exact search)

```python
# L2 (Euclidean) distance
index = faiss.IndexFlatL2(d)

# Inner product (cosine similarity if vectors are normalized)
index = faiss.IndexFlatIP(d)

# Flat indexes are the slowest but most accurate
```
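IndexFlatL2 is exhaustive: every query is compared against every stored vector. A numpy sketch of the equivalent computation (illustrative only, not the FAISS implementation; note that FAISS, like this sketch, reports squared L2 distances):

```python
import numpy as np

rng = np.random.default_rng(0)
d, nb, k = 8, 100, 5
vectors = rng.random((nb, d)).astype('float32')
query = rng.random((1, d)).astype('float32')

# Squared L2 distance from the query to every stored vector
dists = ((vectors - query) ** 2).sum(axis=1)

# Indices of the k nearest neighbors, nearest first
indices = np.argsort(dists)[:k]
```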

2. IVF (inverted file) - Fast approximate

```python
# Create quantizer
quantizer = faiss.IndexFlatL2(d)

# IVF index with 100 clusters
nlist = 100
index = faiss.IndexIVFFlat(quantizer, d, nlist)

# Train on data, then add vectors
index.train(vectors)
index.add(vectors)

# Search (nprobe = number of clusters to scan)
index.nprobe = 10
distances, indices = index.search(query, k)
```

3. HNSW (Hierarchical NSW) - Best quality/speed

```python
# HNSW index
M = 32  # number of connections per layer
index = faiss.IndexHNSWFlat(d, M)

# No training needed
index.add(vectors)

# Search
distances, indices = index.search(query, k)
```

4. Product Quantization - Memory efficient

```python
# PQ reduces memory by 16-32x (exact ratio depends on d, m, and nbits)
m = 8      # number of subquantizers
nbits = 8  # bits per subquantizer code
index = faiss.IndexPQ(d, m, nbits)

# Train and add
index.train(vectors)
index.add(vectors)
```
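The compression ratio follows from simple arithmetic: a float32 vector costs 4*d bytes, while a PQ code stores m sub-codes of nbits bits each. With the parameters above the code is 8 bytes per vector (codebook overhead ignored; the exact ratio varies with d, m, and nbits):

```python
d, m, nbits = 128, 8, 8

raw_bytes = 4 * d            # float32 storage: 512 bytes per vector
code_bytes = m * nbits // 8  # PQ code: 8 bytes per vector
ratio = raw_bytes // code_bytes

print(raw_bytes, code_bytes, ratio)  # 512 8 64
```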

Save and load

```python
# Save index
faiss.write_index(index, "large.index")

# Load index
index = faiss.read_index("large.index")

# Continue using
distances, indices = index.search(query, k)
```

GPU acceleration

```python
# Single GPU
res = faiss.StandardGpuResources()
index_cpu = faiss.IndexFlatL2(d)
index_gpu = faiss.index_cpu_to_gpu(res, 0, index_cpu)  # GPU 0

# Multi-GPU
index_gpu = faiss.index_cpu_to_all_gpus(index_cpu)

# 10-100x faster than CPU
```

LangChain integration

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Create FAISS vector store
vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())

# Save
vectorstore.save_local("faiss_index")

# Load
vectorstore = FAISS.load_local(
    "faiss_index",
    OpenAIEmbeddings(),
    allow_dangerous_deserialization=True,
)

# Search
results = vectorstore.similarity_search("query", k=5)
```

LlamaIndex integration

```python
from llama_index.vector_stores.faiss import FaissVectorStore
import faiss

# Create FAISS index (1536 = OpenAI embedding dimension)
d = 1536
faiss_index = faiss.IndexFlatL2(d)
vector_store = FaissVectorStore(faiss_index=faiss_index)
```

Best practices


  1. Choose the right index type - Flat for <10K vectors, IVF for 10K-1M, HNSW for quality
  2. Normalize for cosine similarity - Use IndexFlatIP with L2-normalized vectors
  3. Use GPU for large datasets - 10-100× faster
  4. Save trained indices - Training is expensive
  5. Tune nprobe/efSearch - Balance speed and accuracy
  6. Monitor memory - Use PQ for large datasets
  7. Batch queries - Better GPU utilization
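Practice 2 works because dividing each vector by its L2 norm makes the inner product equal the cosine similarity. A numpy sketch of what `faiss.normalize_L2` does to vectors before they go into an IndexFlatIP (illustrative, not the FAISS implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
vectors = rng.random((100, 64)).astype('float32')

# L2-normalize each row (faiss.normalize_L2 does this in place)
norms = np.linalg.norm(vectors, axis=1, keepdims=True)
unit = vectors / norms

# Inner product of unit vectors == cosine similarity of the originals
a, b = vectors[0], vectors[1]
cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
inner = unit[0] @ unit[1]
```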

Performance


| Index Type | Build Time | Search Time | Memory | Accuracy |
|------------|------------|-------------|--------|----------|
| Flat       | Fast       | Slow        | High   | 100%     |
| IVF        | Medium     | Fast        | Medium | 95-99%   |
| HNSW       | Slow       | Fastest     | High   | 99%      |
| PQ         | Medium     | Fast        | Low    | 90-95%   |

Resources
