grepai-storage-qdrant
GrepAI Storage with Qdrant
This skill covers using Qdrant as the storage backend for GrepAI, offering high-performance vector search.
When to Use This Skill
- Need fastest possible search performance
- Very large codebases (50K+ files)
- Already using Qdrant infrastructure
- Want advanced vector search features
What is Qdrant?
Qdrant is a purpose-built vector database offering:
- ⚡ Extremely fast vector similarity search
- 📏 Excellent scalability
- 🔧 Advanced filtering capabilities
- 🐳 Easy Docker deployment
Prerequisites
- Qdrant server running
- Network access to Qdrant
Advantages
| Benefit | Description |
|---|---|
| ⚡ Performance | Fastest vector search |
| 📏 Scalability | Handles millions of vectors |
| 🔍 Advanced | Filtering, payloads, sharding |
| 🐳 Easy deploy | Docker-ready |
| ☁️ Cloud option | Qdrant Cloud available |
Setting Up Qdrant
Option 1: Docker (Recommended)
```bash
# Run Qdrant with persistent storage
docker run -d \
  --name grepai-qdrant \
  -p 6333:6333 \
  -p 6334:6334 \
  -v qdrant_storage:/qdrant/storage \
  qdrant/qdrant
```
Ports:
- `6333`: REST API
- `6334`: gRPC API (used by GrepAI)

Option 2: Docker Compose
```yaml
# docker-compose.yml
version: '3.8'
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - qdrant_storage:/qdrant/storage
    environment:
      - QDRANT__SERVICE__GRPC_PORT=6334
volumes:
  qdrant_storage:
```
```bash
docker-compose up -d
```

Option 3: Qdrant Cloud
- Sign up at cloud.qdrant.io
- Create a cluster
- Get your endpoint and API key
Configuration
Basic Configuration (Local)
```yaml
# .grepai/config.yaml
store:
  backend: qdrant
  qdrant:
    endpoint: localhost
    port: 6334
```

With TLS (Production)
```yaml
store:
  backend: qdrant
  qdrant:
    endpoint: qdrant.company.com
    port: 6334
    use_tls: true
```

With API Key (Qdrant Cloud)
```yaml
store:
  backend: qdrant
  qdrant:
    endpoint: your-cluster.aws.cloud.qdrant.io
    port: 6334
    use_tls: true
    api_key: ${QDRANT_API_KEY}
```
Set the environment variable:
```bash
export QDRANT_API_KEY="your-api-key"
```

Configuration Options
| Option | Default | Description |
|---|---|---|
| `endpoint` | | Qdrant server hostname |
| `port` | | gRPC port |
| `use_tls` | | Enable TLS encryption |
| `api_key` | none | Authentication key |
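To make these options concrete, here is a minimal Python sketch that checks a parsed `store` mapping and expands a `${QDRANT_API_KEY}`-style placeholder from the environment. It is not GrepAI's own loader: the function names, the plain-dictionary input, and the validation rules are assumptions for illustration. It treats `use_tls` and `api_key` as optional because the basic local configuration omits them.

```python
import os
from string import Template


def expand_env(value, env=None):
    """Replace ${NAME} placeholders with values from env (default: os.environ)."""
    env = os.environ if env is None else env
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(value).safe_substitute(env)


def check_store_config(store, env=None):
    """Return a list of problems found in a `store` mapping (hypothetical helper)."""
    problems = []
    if store.get("backend") != "qdrant":
        problems.append("backend must be 'qdrant'")
    qdrant = store.get("qdrant") or {}
    if not qdrant.get("endpoint"):
        problems.append("qdrant.endpoint is missing")
    port = qdrant.get("port")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(port, bool) or not isinstance(port, int) or not 1 <= port <= 65535:
        problems.append("qdrant.port must be an integer between 1 and 65535")
    if not isinstance(qdrant.get("use_tls", False), bool):
        problems.append("qdrant.use_tls must be true or false")
    api_key = qdrant.get("api_key")
    if api_key is not None and "${" in expand_env(api_key, env):
        problems.append("qdrant.api_key references an unset environment variable")
    return problems
```

Calling `check_store_config` on the Qdrant Cloud example before exporting `QDRANT_API_KEY` reports the unset variable, which is a common cause of authentication failures.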
Verifying Setup
Check Qdrant is Running
```bash
# REST API health check
# Expected: {"status":"ok"}
```
Check Collections (after indexing)
```bash
# List collections
# Get collection info
```
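One way to script these checks is to query Qdrant's REST API on port 6333 with Python's standard library. `GET /collections` and `GET /collections/{name}` are Qdrant REST endpoints; the helper names, the default base URL, and the 5-second timeout below are assumptions made for this sketch.

```python
import json
import urllib.request

BASE_URL = "http://localhost:6333"  # assumption: local Qdrant, REST port from this guide


def collection_names(payload):
    """Extract collection names from a parsed GET /collections response."""
    return [c["name"] for c in payload.get("result", {}).get("collections", [])]


def fetch_json(path, base_url=BASE_URL, timeout=5.0):
    """GET base_url + path and decode the JSON body."""
    with urllib.request.urlopen(base_url + path, timeout=timeout) as resp:
        return json.load(resp)


def has_collection(name="grepai", base_url=BASE_URL):
    """True if the named collection exists on the server."""
    return name in collection_names(fetch_json("/collections", base_url))
```

With Qdrant running, `has_collection()` tells you whether indexing has created the `grepai` collection, and `fetch_json("/collections/grepai")` returns its details.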
From GrepAI
```bash
grepai status
# Should show Qdrant backend info
```

Qdrant Dashboard
Access the web dashboard at `http://localhost:6333/dashboard`:
- View collections
- Browse vectors
- Execute queries
- Monitor performance
Performance Characteristics
Search Latency
| Codebase Size | Vectors | Search Time |
|---|---|---|
| Small (1K files) | 5,000 | <10ms |
| Medium (10K files) | 50,000 | <20ms |
| Large (100K files) | 500,000 | <50ms |
Memory Usage
Qdrant loads vectors into memory for fast search:
| Vectors | Dimensions | Memory |
|---|---|---|
| 10,000 | 768 | ~60 MB |
| 100,000 | 768 | ~600 MB |
| 1,000,000 | 768 | ~6 GB |
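These figures are consistent with a simple rule of thumb: each dimension is stored as a 4-byte float, and index structures and payloads roughly double the raw size. The sketch below reproduces that arithmetic. The factor of 2 is an assumption fitted to the table, not a value published by Qdrant, so treat the result as an order-of-magnitude estimate.

```python
BYTES_PER_FLOAT32 = 4


def estimate_memory_mb(num_vectors, dimensions, overhead=2.0):
    """Approximate RAM in MB (10**6 bytes) for float32 vectors.

    overhead is an assumed multiplier for index and payload data.
    """
    raw_bytes = num_vectors * dimensions * BYTES_PER_FLOAT32
    return raw_bytes * overhead / 1_000_000


for n in (10_000, 100_000, 1_000_000):
    print(f"{n:>9,} vectors x 768 dims: ~{estimate_memory_mb(n, 768):,.0f} MB")
```

For 10,000 vectors of 768 dimensions the raw data is 10,000 × 768 × 4 = 30.72 MB, which the assumed overhead brings to about 61 MB, in line with the first row of the table.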
Advanced Configuration
Qdrant Server Configuration
Create `config/production.yaml`:
```yaml
storage:
  storage_path: /qdrant/storage
  # Optimizer settings live under the storage section
  optimizers:
    memmap_threshold_kb: 200000
    indexing_threshold_kb: 50000
service:
  grpc_port: 6334
  http_port: 6333
  max_request_size_mb: 32
```
Mount in Docker:
```bash
# Use an absolute host path so the bind mount works on older Docker versions
docker run -d \
  -v "$(pwd)/config:/qdrant/config" \
  -v qdrant_storage:/qdrant/storage \
  qdrant/qdrant
```

Collection Settings
GrepAI creates a collection named `grepai` with:
- Vector size: matches your embedding dimensions
- Distance: Cosine similarity
- On-disk storage for large datasets
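For readers unfamiliar with the distance metric, here is a minimal pure-Python sketch of cosine similarity: the dot product of two vectors divided by the product of their lengths. It illustrates the metric only. Qdrant computes it internally, and the function name is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Vectors pointing the same way score close to 1 and orthogonal vectors score 0. The length check mirrors the requirement that the collection's vector size match your embedding dimensions.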
Clustering (Advanced)
For very large deployments, Qdrant supports distributed mode:
```yaml
# qdrant config
cluster:
  enabled: true
  p2p:
    port: 6335
```

Backup and Restore
Snapshot Creation
```bash
# Create snapshot via REST API
curl -X POST 'http://localhost:6333/collections/grepai/snapshots'
```

Restore Snapshot
```bash
# Restore from snapshot
curl -X PUT 'http://localhost:6333/collections/grepai/snapshots/recover' \
  -H 'Content-Type: application/json' \
  -d '{"location": "/path/to/snapshot"}'
```
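The same two calls can be prepared from Python, which is convenient in a scheduled backup script. This sketch builds the requests with the standard library and sends them only on demand. The function names and default base URL are assumptions, while the paths and JSON body mirror the curl commands above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:6333"  # assumption: REST port from this guide


def snapshot_request(collection="grepai", base_url=BASE_URL):
    """Build the POST request that asks Qdrant to create a snapshot."""
    url = f"{base_url}/collections/{collection}/snapshots"
    return urllib.request.Request(url, method="POST")


def recover_request(location, collection="grepai", base_url=BASE_URL):
    """Build the PUT request that restores a collection from a snapshot location."""
    url = f"{base_url}/collections/{collection}/snapshots/recover"
    body = json.dumps({"location": location}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


def send(request, timeout=30.0):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

For example, `send(snapshot_request())` creates a snapshot of the `grepai` collection on a local server. Keeping request construction separate from sending makes the script easy to check without a running Qdrant.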
Migrating from GOB
- Start Qdrant:
  ```bash
  docker run -d --name qdrant -p 6333:6333 -p 6334:6334 qdrant/qdrant
  ```
- Update configuration:
  ```yaml
  store:
    backend: qdrant
    qdrant:
      endpoint: localhost
      port: 6334
  ```
- Delete old index:
  ```bash
  rm .grepai/index.gob
  ```
- Re-index:
  ```bash
  grepai watch
  ```

Migrating from PostgreSQL
- Start Qdrant
- Update configuration to use Qdrant
- Re-index (embeddings must be regenerated)
Common Issues
❌ Problem: Connection refused
✅ Solution: Ensure Qdrant is running:
```bash
docker ps | grep qdrant
docker start grepai-qdrant
```

❌ Problem: gRPC connection failed
✅ Solution: Check port 6334 is exposed:
```bash
docker run -p 6334:6334 ...
```

❌ Problem: Authentication failed
✅ Solution: Check API key:
```bash
echo $QDRANT_API_KEY
```

❌ Problem: Out of memory
✅ Solutions:
- Enable on-disk storage in Qdrant config
- Increase Docker memory limit
- Use Qdrant Cloud for managed scaling

❌ Problem: Slow initial indexing
✅ Solution: This is normal; Qdrant optimizes in background. Searches will be fast after indexing completes.
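For the connection problems, a quick TCP probe shows whether anything is listening on the REST and gRPC ports before you dig into Docker. This is a minimal sketch; the helper name and the 2-second timeout are assumptions. An open port only shows that some process accepted the connection, not that it is Qdrant.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in (("REST", 6333), ("gRPC", 6334)):
        state = "open" if port_open("localhost", port) else "closed"
        print(f"{name} port {port}: {state}")
```

If 6333 is open but 6334 is closed, the container is probably running without the gRPC port published, which matches the "gRPC connection failed" case.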
Qdrant vs PostgreSQL
| Feature | Qdrant | PostgreSQL |
|---|---|---|
| Search speed | ⚡⚡⚡ | ⚡⚡ |
| Setup complexity | Easy (Docker) | Medium |
| SQL queries | ❌ | ✅ |
| Scalability | Excellent | Good |
| Memory efficiency | Excellent | Good |
| Team familiarity | Lower | Higher |
Recommendation: Use Qdrant for large codebases or when you need maximum search performance. Use PostgreSQL if you need SQL integration or your team is already familiar with it.
Best Practices
- Use persistent volume: Mount `/qdrant/storage`
- Enable TLS in production: Set `use_tls: true`
- Secure API key: Use environment variables
- Monitor memory: Vector search is memory-intensive
- Regular snapshots: Back up before major changes
Output Format
Qdrant storage status:
```
✅ Qdrant Storage Configured

   Backend: Qdrant
   Endpoint: localhost:6334
   TLS: disabled
   Collection: grepai

   Contents:
   - Files: 5,000
   - Vectors: 25,000
   - Dimensions: 768

   Performance:
   - Connection: OK
   - Indexed: Yes
   - Search latency: ~15ms
```