grepai-storage-qdrant

GrepAI Storage with Qdrant

This skill covers using Qdrant as the storage backend for GrepAI, offering high-performance vector search.

When to Use This Skill

  • Need fastest possible search performance
  • Very large codebases (50K+ files)
  • Already using Qdrant infrastructure
  • Want advanced vector search features

What is Qdrant?

Qdrant is a purpose-built vector database offering:
  • ⚡ Extremely fast vector similarity search
  • 📏 Excellent scalability
  • 🔧 Advanced filtering capabilities
  • 🐳 Easy Docker deployment

Prerequisites

  1. Qdrant server running
  2. Network access to Qdrant

Advantages

| Benefit | Description |
|---|---|
| Performance | Fastest vector search |
| 📏 Scalability | Handles millions of vectors |
| 🔍 Advanced | Filtering, payloads, sharding |
| 🐳 Easy deploy | Docker-ready |
| ☁️ Cloud option | Qdrant Cloud available |

Setting Up Qdrant

Option 1: Docker (Recommended)

bash
# Run Qdrant with persistent storage
docker run -d \
  --name grepai-qdrant \
  -p 6333:6333 \
  -p 6334:6334 \
  -v qdrant_storage:/qdrant/storage \
  qdrant/qdrant

Ports:
- `6333`: REST API
- `6334`: gRPC API (used by GrepAI)

Option 2: Docker Compose

yaml
# docker-compose.yml
version: '3.8'
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - qdrant_storage:/qdrant/storage
    environment:
      - QDRANT__SERVICE__GRPC_PORT=6334

volumes:
  qdrant_storage:
bash
docker-compose up -d

Option 3: Qdrant Cloud

  1. Sign up at cloud.qdrant.io
  2. Create a cluster
  3. Get your endpoint and API key

Configuration

Basic Configuration (Local)

yaml
# .grepai/config.yaml
store:
  backend: qdrant
  qdrant:
    endpoint: localhost
    port: 6334

With TLS (Production)

yaml
store:
  backend: qdrant
  qdrant:
    endpoint: qdrant.company.com
    port: 6334
    use_tls: true

With API Key (Qdrant Cloud)

yaml
store:
  backend: qdrant
  qdrant:
    endpoint: your-cluster.aws.cloud.qdrant.io
    port: 6334
    use_tls: true
    api_key: ${QDRANT_API_KEY}
Set the environment variable:
bash
export QDRANT_API_KEY="your-api-key"
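
Because the config reads the key from `${QDRANT_API_KEY}`, an unset variable is easy to miss until authentication fails. The following is a minimal pre-flight sketch; `require_api_key` is a hypothetical helper name and not part of GrepAI or Qdrant.

```shell
# Hypothetical helper (not part of GrepAI): fail early when the API key
# environment variable referenced by the config is unset or empty.
require_api_key() {
  if [ -z "${QDRANT_API_KEY:-}" ]; then
    echo "QDRANT_API_KEY is not set" >&2
    return 1
  fi
  # Print only the length so the key itself never lands in logs.
  echo "QDRANT_API_KEY is set (${#QDRANT_API_KEY} characters)"
}
```

A typical use would be `require_api_key && grepai status`, so the status check only runs once the key is present.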

Configuration Options

| Option | Default | Description |
|---|---|---|
| `endpoint` | `localhost` | Qdrant server hostname |
| `port` | `6334` | gRPC port |
| `use_tls` | `false` | Enable TLS encryption |
| `api_key` | none | Authentication key |
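
To see how the options listed above combine with their defaults, here is a small sketch that prints a `store` block and falls back to the documented defaults when an argument is omitted. `render_qdrant_store` is a hypothetical helper, not a GrepAI command; it only prints YAML that you could paste into `.grepai/config.yaml`.

```shell
# Hypothetical helper: print a GrepAI "store" block for Qdrant.
# Arguments (all optional): endpoint, port, use_tls.
# Defaults follow the options listed above: localhost, 6334, false.
# The body runs in a subshell so the variables do not leak to the caller.
render_qdrant_store() (
  endpoint="${1:-localhost}"
  port="${2:-6334}"
  use_tls="${3:-false}"
  cat <<EOF
store:
  backend: qdrant
  qdrant:
    endpoint: ${endpoint}
    port: ${port}
    use_tls: ${use_tls}
EOF
)
```

For example, `render_qdrant_store qdrant.company.com 6334 true` prints the TLS variant shown earlier. The `api_key` line is left out on purpose, since it is only needed for authenticated deployments.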

Verifying Setup

Check Qdrant is Running

Run a REST API health check against Qdrant. Expected response: `{"status":"ok"}`

Check Collections (after indexing)

  • List collections
  • Get collection info
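
The two checks above can be run against Qdrant's REST API. The sketch below is one way to do it, assuming the default local REST port from this guide and the `grepai` collection name; the function names and the `QDRANT_URL` variable are hypothetical conveniences, not GrepAI commands.

```shell
# Assumption: Qdrant's REST API is reachable on the default local port.
QDRANT_URL="${QDRANT_URL:-http://localhost:6333}"

# Build the REST path for a collection; "grepai" is the collection name
# this guide says GrepAI creates.
collection_url() {
  echo "${QDRANT_URL}/collections/${1:-grepai}"
}

# List all collections (GET /collections).
list_collections() {
  curl -fsS "${QDRANT_URL}/collections"
}

# Show details for one collection (GET /collections/<name>).
collection_info() {
  curl -fsS "$(collection_url "${1:-grepai}")"
}
```

Call `list_collections` first; once indexing has run, `collection_info` should report the vector count and configuration of the `grepai` collection.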

From GrepAI

bash
grepai status
# Should show Qdrant backend info

Qdrant Dashboard

Access the web dashboard at `http://localhost:6333/dashboard`:
  • View collections
  • Browse vectors
  • Execute queries
  • Monitor performance

Performance Characteristics

Search Latency

| Codebase Size | Vectors | Search Time |
|---|---|---|
| Small (1K files) | 5,000 | <10 ms |
| Medium (10K files) | 50,000 | <20 ms |
| Large (100K files) | 500,000 | <50 ms |

Memory Usage

Qdrant loads vectors into memory for fast search:

| Vectors | Dimensions | Memory |
|---|---|---|
| 10,000 | 768 | ~60 MB |
| 100,000 | 768 | ~600 MB |
| 1,000,000 | 768 | ~6 GB |
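
The figures above are consistent with 4-byte float32 components plus roughly 2x overhead for the index and metadata. That overhead factor is an assumption inferred from these numbers, not a formula stated in this guide, so treat the result as a rough planning estimate only.

```shell
# Rough memory estimate in MB (1 MB = 1,000,000 bytes).
# Assumptions: float32 vectors (4 bytes per dimension) and an overhead
# factor of 2, chosen only because it reproduces the figures above.
# Arguments: vector count, dimensions (default 768), overhead (default 2).
estimate_memory_mb() (
  vectors="$1"
  dimensions="${2:-768}"
  overhead="${3:-2}"
  echo $(( $vectors * $dimensions * 4 * $overhead / 1000000 ))
)
```

For instance, `estimate_memory_mb 10000 768` prints `61`, in line with the ~60 MB row, and `estimate_memory_mb 1000000 768` prints `6144`, in line with the ~6 GB row.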

Advanced Configuration

Qdrant Server Configuration

Create `config/production.yaml`:
yaml
storage:
  storage_path: /qdrant/storage
  # Optimizer settings are nested under storage in Qdrant's config
  optimizers:
    memmap_threshold_kb: 200000
    indexing_threshold_kb: 50000

service:
  grpc_port: 6334
  http_port: 6333
  max_request_size_mb: 32
Mount in Docker:
bash
docker run -d \
  -v ./config:/qdrant/config \
  -v qdrant_storage:/qdrant/storage \
  qdrant/qdrant

Collection Settings

GrepAI creates a collection named `grepai` with:
  • Vector size: matches your embedding dimensions
  • Distance: Cosine similarity
  • On-disk storage for large datasets

Clustering (Advanced)

For very large deployments, Qdrant supports distributed mode:
yaml
# qdrant config
cluster:
  enabled: true
  p2p:
    port: 6335

Backup and Restore

Snapshot Creation

Create a snapshot via the REST API.
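
As a sketch of what snapshot creation can look like, the functions below call Qdrant's collection snapshot endpoint. They assume the default local REST port and the `grepai` collection from this guide; the function names and the `QDRANT_URL` variable are hypothetical.

```shell
# Assumption: Qdrant's REST API is reachable on the default local port.
QDRANT_URL="${QDRANT_URL:-http://localhost:6333}"

# REST path for a collection's snapshots; defaults to the "grepai" collection.
snapshots_url() {
  echo "${QDRANT_URL}/collections/${1:-grepai}/snapshots"
}

# Create a new snapshot of the collection (POST).
create_snapshot() {
  curl -fsS -X POST "$(snapshots_url "${1:-grepai}")"
}

# List existing snapshots of the collection (GET).
list_snapshots() {
  curl -fsS "$(snapshots_url "${1:-grepai}")"
}
```

Running `create_snapshot` followed by `list_snapshots` should show the new snapshot name, which is what a later restore needs to locate.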

Restore Snapshot

bash
# Restore from snapshot
curl -X PUT 'http://localhost:6333/collections/grepai/snapshots/recover' \
  -H 'Content-Type: application/json' \
  -d '{"location": "/path/to/snapshot"}'

Migrating from GOB

  1. Start Qdrant:
bash
docker run -d --name qdrant -p 6333:6333 -p 6334:6334 qdrant/qdrant
  2. Update configuration:
yaml
store:
  backend: qdrant
  qdrant:
    endpoint: localhost
    port: 6334
  3. Delete old index:
bash
rm .grepai/index.gob
  4. Re-index:
bash
grepai watch
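
The delete step above removes the old index outright. If you would rather keep a fallback until the Qdrant index is confirmed working, a cautious variant is to rename the file instead. This is only a sketch, not a GrepAI feature; `backup_gob_index` and the `.bak` suffix are hypothetical.

```shell
# Hypothetical helper: move the old GOB index aside instead of deleting it.
# Argument: project directory (defaults to the current directory).
# The body runs in a subshell, so "exit 1" only ends the function call.
backup_gob_index() (
  index="${1:-.}/.grepai/index.gob"
  if [ ! -f "$index" ]; then
    echo "no GOB index found at $index" >&2
    exit 1
  fi
  mv "$index" "$index.bak"
  echo "moved $index to $index.bak"
)
```

Run `backup_gob_index` from the project root in place of the `rm` step, then delete `.grepai/index.gob.bak` once searches against Qdrant return the results you expect.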

Migrating from PostgreSQL

  1. Start Qdrant
  2. Update configuration to use Qdrant
  3. Re-index (embeddings must be regenerated)

Common Issues

Problem: Connection refused
✅ Solution: Ensure Qdrant is running:
bash
docker ps | grep qdrant
docker start grepai-qdrant

Problem: gRPC connection failed
✅ Solution: Check that port 6334 is exposed:
bash
docker run -p 6334:6334 ...

Problem: Authentication failed
✅ Solution: Check that the API key is set:
bash
echo $QDRANT_API_KEY

Problem: Out of memory
✅ Solutions:
  • Enable on-disk storage in Qdrant config
  • Increase Docker memory limit
  • Use Qdrant Cloud for managed scaling

Problem: Slow initial indexing
✅ Solution: This is normal; Qdrant optimizes in the background. Searches will be fast after indexing completes.
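
For the two connection problems, it can help to confirm that the ports accept TCP connections from the machine running GrepAI before looking at anything else. The sketch below assumes `bash` (for its `/dev/tcp` pseudo-device) and the coreutils `timeout` command are available; `port_open` is a hypothetical helper name.

```shell
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Assumes bash (for /dev/tcp) and coreutils `timeout` are installed.
# Arguments: host (default localhost), port (default 6334, the gRPC port).
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "${1:-localhost}" "${2:-6334}" 2>/dev/null
}
```

For example, `port_open localhost 6334 || echo "gRPC port not reachable"` distinguishes a missing `-p 6334:6334` mapping from a GrepAI configuration problem, and `port_open localhost 6333` does the same for the REST port.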

Qdrant vs PostgreSQL

| Feature | Qdrant | PostgreSQL |
|---|---|---|
| Search speed | ⚡⚡⚡ | ⚡⚡ |
| Setup complexity | Easy (Docker) | Medium |
| SQL queries | ❌ | ✅ |
| Scalability | Excellent | Good |
| Memory efficiency | Excellent | Good |
| Team familiarity | Lower | Higher |

Recommendation: Use Qdrant for large codebases or maximum performance. Use PostgreSQL if you need SQL integration or your team is already familiar with it.

Best Practices

  1. Use persistent volume: Mount `/qdrant/storage`
  2. Enable TLS in production: Set `use_tls: true`
  3. Secure API key: Use environment variables
  4. Monitor memory: Vector search is memory-intensive
  5. Regular snapshots: Back up before major changes

Output Format

Qdrant storage status:
✅ Qdrant Storage Configured

   Backend: Qdrant
   Endpoint: localhost:6334
   TLS: disabled
   Collection: grepai

   Contents:
   - Files: 5,000
   - Vectors: 25,000
   - Dimensions: 768

   Performance:
   - Connection: OK
   - Indexed: Yes
   - Search latency: ~15ms