# FAISS - Efficient Similarity Search
Facebook AI's library for billion-scale vector similarity search.
## When to use FAISS

Use FAISS when:
- You need fast similarity search over large vector datasets (millions to billions of vectors)
- GPU acceleration is required
- Pure vector similarity is enough (no metadata filtering needed)
- High throughput and low latency are critical
- You do offline/batch processing of embeddings

Metrics:
- 31,700+ GitHub stars
- Developed by Meta/Facebook AI Research
- Handles billions of vectors
- C++ core with Python bindings

Use alternatives instead:
- Chroma/Pinecone: need metadata filtering
- Weaviate: need full database features
- Annoy: simpler tool with fewer features
## Quick start

### Installation

```bash
# CPU only
pip install faiss-cpu

# GPU support
pip install faiss-gpu
```

### Basic usage
```python
import faiss
import numpy as np

# Create sample data (1000 vectors, 128 dimensions)
d = 128
nb = 1000
vectors = np.random.random((nb, d)).astype('float32')

# Create index
index = faiss.IndexFlatL2(d)  # L2 distance
index.add(vectors)            # Add vectors

# Search
k = 5  # Find 5 nearest neighbors
query = np.random.random((1, d)).astype('float32')
distances, indices = index.search(query, k)
print(f"Nearest neighbors: {indices}")
print(f"Distances: {distances}")
```
## Index types

### 1. Flat (exact search)

```python
# L2 (Euclidean) distance
index = faiss.IndexFlatL2(d)

# Inner product (cosine similarity if vectors are normalized)
index = faiss.IndexFlatIP(d)
```

Slowest to search, but exact (100% accuracy).
### 2. IVF (inverted file) - fast approximate

```python
# Create quantizer
quantizer = faiss.IndexFlatL2(d)

# IVF index with 100 clusters
nlist = 100
index = faiss.IndexIVFFlat(quantizer, d, nlist)

# Train on data
index.train(vectors)

# Add vectors
index.add(vectors)

# Search (nprobe = number of clusters to probe)
index.nprobe = 10
distances, indices = index.search(query, k)
```
### 3. HNSW (Hierarchical Navigable Small World) - best quality/speed trade-off

```python
# HNSW index
M = 32  # Number of connections per layer
index = faiss.IndexHNSWFlat(d, M)

# No training needed
index.add(vectors)

# Search
distances, indices = index.search(query, k)
```
### 4. Product Quantization (PQ) - memory efficient

```python
# PQ reduces memory by 16-32×
m = 8      # Number of subquantizers
nbits = 8  # Bits per subquantizer code
index = faiss.IndexPQ(d, m, nbits)

# Train and add
index.train(vectors)
index.add(vectors)
```
## Save and load

```python
# Save index
faiss.write_index(index, "large.index")

# Load index
index = faiss.read_index("large.index")

# Continue using
distances, indices = index.search(query, k)
```
## GPU acceleration

```python
# Single GPU
res = faiss.StandardGpuResources()
index_cpu = faiss.IndexFlatL2(d)
index_gpu = faiss.index_cpu_to_gpu(res, 0, index_cpu)  # GPU 0

# Multi-GPU
index_gpu = faiss.index_cpu_to_all_gpus(index_cpu)
```

Typically 10-100× faster than CPU search.
## LangChain integration

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Create FAISS vector store
vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())

# Save
vectorstore.save_local("faiss_index")

# Load
vectorstore = FAISS.load_local(
    "faiss_index",
    OpenAIEmbeddings(),
    allow_dangerous_deserialization=True
)

# Search
results = vectorstore.similarity_search("query", k=5)
```
## LlamaIndex integration

```python
from llama_index.vector_stores.faiss import FaissVectorStore
import faiss

# Create FAISS index
d = 1536
faiss_index = faiss.IndexFlatL2(d)
vector_store = FaissVectorStore(faiss_index=faiss_index)
```
## Best practices

- Choose the right index type - Flat for <10K vectors, IVF for 10K-1M, HNSW for best quality
- Normalize for cosine similarity - use IndexFlatIP with L2-normalized vectors
- Use GPU for large datasets - 10-100× faster
- Save trained indices - training is expensive
- Tune nprobe/efSearch - balance speed against accuracy
- Monitor memory - use PQ for large datasets
- Batch queries - better GPU utilization
## Performance

| Index Type | Build Time | Search Time | Memory | Accuracy |
|---|---|---|---|---|
| Flat | Fast | Slow | High | 100% |
| IVF | Medium | Fast | Medium | 95-99% |
| HNSW | Slow | Fastest | High | 99% |
| PQ | Medium | Fast | Low | 90-95% |
## Resources

- GitHub: https://github.com/facebookresearch/faiss ⭐ 31,700+
- Wiki: https://github.com/facebookresearch/faiss/wiki
- License: MIT