FAISS - Efficient Similarity Search

Facebook AI's library for billion-scale vector similarity search.
When to use FAISS
Use FAISS when:
- Need fast similarity search on large vector datasets (millions/billions)
- GPU acceleration required
- Pure vector similarity (no metadata filtering needed)
- High throughput, low latency critical
- Offline/batch processing of embeddings
Metrics:
- 31,700+ GitHub stars
- Meta/Facebook AI Research
- Handles billions of vectors
- C++ with Python bindings
Use alternatives instead:
- Chroma/Pinecone: Need metadata filtering
- Weaviate: Need full database features
- Annoy: Simpler, fewer features
Quick start
Installation
```bash
# CPU only
pip install faiss-cpu

# GPU support
pip install faiss-gpu
```
Basic usage
```python
import faiss
import numpy as np

# Create sample data (1000 vectors, 128 dimensions)
d = 128
nb = 1000
vectors = np.random.random((nb, d)).astype('float32')

# Create index
index = faiss.IndexFlatL2(d)  # L2 distance
index.add(vectors)            # Add vectors

# Search
k = 5  # Find 5 nearest neighbors
query = np.random.random((1, d)).astype('float32')
distances, indices = index.search(query, k)
print(f"Nearest neighbors: {indices}")
print(f"Distances: {distances}")
```
Index types
1. Flat (exact search)
```python
# L2 (Euclidean) distance
index = faiss.IndexFlatL2(d)

# Inner product (cosine similarity if normalized)
index = faiss.IndexFlatIP(d)
```

Slowest to search, but exact (100% accurate).
2. IVF (inverted file) - Fast approximate
```python
# Create quantizer
quantizer = faiss.IndexFlatL2(d)

# IVF index with 100 clusters
nlist = 100
index = faiss.IndexIVFFlat(quantizer, d, nlist)

# Train on data
index.train(vectors)

# Add vectors
index.add(vectors)

# Search (nprobe = clusters to search)
index.nprobe = 10
distances, indices = index.search(query, k)
```
3. HNSW (Hierarchical Navigable Small World) - Best quality/speed trade-off
```python
# HNSW index
M = 32  # Number of connections per layer
index = faiss.IndexHNSWFlat(d, M)

# No training needed
index.add(vectors)

# Search
distances, indices = index.search(query, k)
```
4. Product Quantization - Memory efficient
```python
# PQ reduces memory by 16-32×
m = 8  # Number of subquantizers
nbits = 8
index = faiss.IndexPQ(d, m, nbits)

# Train and add
index.train(vectors)
index.add(vectors)
```
Save and load
```python
# Save index
faiss.write_index(index, "large.index")

# Load index
index = faiss.read_index("large.index")

# Continue using
distances, indices = index.search(query, k)
```
GPU acceleration
```python
# Single GPU
res = faiss.StandardGpuResources()
index_cpu = faiss.IndexFlatL2(d)
index_gpu = faiss.index_cpu_to_gpu(res, 0, index_cpu)  # GPU 0

# Multi-GPU
index_gpu = faiss.index_cpu_to_all_gpus(index_cpu)
```

10-100× faster than CPU.
LangChain integration
```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Create FAISS vector store
vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())

# Save
vectorstore.save_local("faiss_index")

# Load
vectorstore = FAISS.load_local(
    "faiss_index",
    OpenAIEmbeddings(),
    allow_dangerous_deserialization=True
)

# Search
results = vectorstore.similarity_search("query", k=5)
```
LlamaIndex integration
```python
from llama_index.vector_stores.faiss import FaissVectorStore
import faiss

# Create FAISS index
d = 1536
faiss_index = faiss.IndexFlatL2(d)
vector_store = FaissVectorStore(faiss_index=faiss_index)
```
Best practices
- Choose the right index type - Flat for <10K vectors, IVF for 10K-1M, HNSW when quality matters
- Normalize for cosine - Use IndexFlatIP with L2-normalized vectors
- Use GPU for large datasets - 10-100× faster than CPU
- Save trained indices - Training is expensive
- Tune nprobe/efSearch - Balance speed and accuracy
- Monitor memory - Use PQ for very large datasets
- Batch queries - Better GPU utilization
Performance
| Index Type | Build Time | Search Time | Memory | Accuracy |
|---|---|---|---|---|
| Flat | Fast | Slow | High | 100% |
| IVF | Medium | Fast | Medium | 95-99% |
| HNSW | Slow | Fastest | High | 99% |
| PQ | Medium | Fast | Low | 90-95% |
Resources
- GitHub: https://github.com/facebookresearch/faiss ⭐ 31,700+
- Wiki: https://github.com/facebookresearch/faiss/wiki
- License: MIT