grepai-storage-postgres
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseGrepAI Storage with PostgreSQL
基于PostgreSQL的GrepAI存储
This skill covers using PostgreSQL with the pgvector extension as the storage backend for GrepAI.
本技能介绍如何使用带有pgvector扩展的PostgreSQL作为GrepAI的存储后端。
When to Use This Skill
何时使用此技能
- Team environments with shared index
- Large codebases (10K+ files)
- Need concurrent access
- Integration with existing PostgreSQL infrastructure
- 带有共享索引的团队环境
- 大型代码库(10K+文件)
- 需要并发访问
- 与现有PostgreSQL基础设施集成
Prerequisites
前提条件
- PostgreSQL 14+ with pgvector extension
- Database user with create table permissions
- Network access to PostgreSQL server
- 带有pgvector扩展的PostgreSQL 14+
- 拥有创建表权限的数据库用户
- 可访问PostgreSQL服务器的网络权限
Advantages
优势
| Benefit | Description |
|---|---|
| 👥 Team sharing | Multiple users can access same index |
| 📏 Scalable | Handles large codebases |
| 🔄 Concurrent | Multiple simultaneous searches |
| 💾 Persistent | Data survives machine restarts |
| 🔧 Familiar | Standard database tooling |
| 优势 | 描述 |
|---|---|
| 👥 团队共享 | 多个用户可访问同一索引 |
| 📏 可扩展 | 支持大型代码库 |
| 🔄 并发处理 | 支持多用户同时搜索 |
| 💾 持久化 | 数据在机器重启后仍保留 |
| 🔧 易上手 | 采用标准数据库工具 |
Setting Up PostgreSQL with pgvector
配置带有pgvector的PostgreSQL
Option 1: Docker (Recommended for Development)
选项1:Docker(开发环境推荐)
bash
undefinedbash
undefinedRun PostgreSQL with pgvector
Run PostgreSQL with pgvector
docker run -d
--name grepai-postgres
-e POSTGRES_USER=grepai
-e POSTGRES_PASSWORD=grepai
-e POSTGRES_DB=grepai
-p 5432:5432
pgvector/pgvector:pg16
--name grepai-postgres
-e POSTGRES_USER=grepai
-e POSTGRES_PASSWORD=grepai
-e POSTGRES_DB=grepai
-p 5432:5432
pgvector/pgvector:pg16
undefineddocker run -d
--name grepai-postgres
-e POSTGRES_USER=grepai
-e POSTGRES_PASSWORD=grepai
-e POSTGRES_DB=grepai
-p 5432:5432
pgvector/pgvector:pg16
--name grepai-postgres
-e POSTGRES_USER=grepai
-e POSTGRES_PASSWORD=grepai
-e POSTGRES_DB=grepai
-p 5432:5432
pgvector/pgvector:pg16
undefinedOption 2: Install on Existing PostgreSQL
选项2:在现有PostgreSQL上安装
bash
undefinedbash
undefinedInstall pgvector extension (Ubuntu/Debian)
Install pgvector extension (Ubuntu/Debian)
sudo apt install postgresql-16-pgvector
sudo apt install postgresql-16-pgvector
Or compile from source
Or compile from source
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install
Then enable the extension:
```sql
-- Connect to your database
CREATE EXTENSION IF NOT EXISTS vector;git clone https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install
然后启用扩展:
```sql
-- Connect to your database
CREATE EXTENSION IF NOT EXISTS vector;Option 3: Managed Services
选项3:托管服务
- Supabase: pgvector included by default
- Neon: pgvector available
- AWS RDS: Install pgvector extension
- Azure Database: pgvector available
- Supabase: 默认包含pgvector
- Neon: 支持pgvector
- AWS RDS: 可安装pgvector扩展
- Azure Database: 支持pgvector
Configuration
配置
Basic Configuration
基础配置
yaml
undefinedyaml
undefined.grepai/config.yaml
.grepai/config.yaml
store:
backend: postgres
postgres:
dsn: postgres://user:password@localhost:5432/grepai
undefinedstore:
backend: postgres
postgres:
dsn: postgres://user:password@localhost:5432/grepai
undefinedWith Environment Variable
使用环境变量
yaml
store:
backend: postgres
postgres:
dsn: ${DATABASE_URL}Set the environment variable:
bash
export DATABASE_URL="postgres://user:password@localhost:5432/grepai"yaml
store:
backend: postgres
postgres:
dsn: ${DATABASE_URL}设置环境变量:
bash
export DATABASE_URL="postgres://user:password@localhost:5432/grepai"Full DSN Options
完整DSN选项
yaml
store:
backend: postgres
postgres:
dsn: postgres://user:password@host:5432/database?sslmode=requireDSN components:
- : Database username
user - : Database password
password - : Server hostname or IP
host - : Port (default: 5432)
5432 - : Database name
database - : SSL mode (disable, require, verify-full)
sslmode
yaml
store:
backend: postgres
postgres:
dsn: postgres://user:password@host:5432/database?sslmode=requireDSN组件:
- :数据库用户名
user - :数据库密码
password - :服务器主机名或IP
host - :端口(默认:5432)
5432 - :数据库名称
database - :SSL模式(disable、require、verify-full)
sslmode
SSL Modes
SSL模式
| Mode | Description | Use Case |
|---|---|---|
| No SSL | Local development |
| SSL required | Production |
| SSL + verify certificate | High security |
yaml
undefined| 模式 | 描述 | 使用场景 |
|---|---|---|
| 不使用SSL | 本地开发 |
| 必须使用SSL | 生产环境 |
| SSL + 证书验证 | 高安全场景 |
yaml
undefinedProduction with SSL
Production with SSL
store:
backend: postgres
postgres:
dsn: postgres://user:pass@prod.db.com:5432/grepai?sslmode=require
undefinedstore:
backend: postgres
postgres:
dsn: postgres://user:pass@prod.db.com:5432/grepai?sslmode=require
undefinedDatabase Schema
数据库架构
GrepAI automatically creates these tables:
sql
-- Vector embeddings table
CREATE TABLE IF NOT EXISTS embeddings (
id SERIAL PRIMARY KEY,
file_path TEXT NOT NULL,
chunk_index INTEGER NOT NULL,
content TEXT NOT NULL,
start_line INTEGER,
end_line INTEGER,
embedding vector(768), -- Dimension matches your model
created_at TIMESTAMP DEFAULT NOW(),
UNIQUE(file_path, chunk_index)
);
-- Index for vector similarity search
CREATE INDEX ON embeddings USING ivfflat (embedding vector_cosine_ops);GrepAI会自动创建以下表:
sql
-- Vector embeddings table
CREATE TABLE IF NOT EXISTS embeddings (
id SERIAL PRIMARY KEY,
file_path TEXT NOT NULL,
chunk_index INTEGER NOT NULL,
content TEXT NOT NULL,
start_line INTEGER,
end_line INTEGER,
embedding vector(768), -- Dimension matches your model
created_at TIMESTAMP DEFAULT NOW(),
UNIQUE(file_path, chunk_index)
);
-- Index for vector similarity search
CREATE INDEX ON embeddings USING ivfflat (embedding vector_cosine_ops);Verifying Setup
验证配置
Check pgvector Extension
检查pgvector扩展
sql
-- Connect to database
psql -U grepai -d grepai
-- Check extension is installed
SELECT * FROM pg_extension WHERE extname = 'vector';
-- Check GrepAI tables exist (after first grepai watch)
\dtsql
-- Connect to database
psql -U grepai -d grepai
-- Check extension is installed
SELECT * FROM pg_extension WHERE extname = 'vector';
-- Check GrepAI tables exist (after first grepai watch)\dtTest Connection from GrepAI
从GrepAI测试连接
bash
undefinedbash
undefinedCheck status
Check status
grepai status
grepai status
Should show PostgreSQL backend info
Should show PostgreSQL backend info
undefinedundefinedPerformance Tuning
性能调优
PostgreSQL Configuration
PostgreSQL配置
For better vector search performance:
sql
-- Increase work memory for vector operations
SET work_mem = '256MB';
-- Adjust for your hardware
SET effective_cache_size = '4GB';
SET shared_buffers = '1GB';为提升向量搜索性能:
sql
-- Increase work memory for vector operations
SET work_mem = '256MB';
-- Adjust for your hardware
SET effective_cache_size = '4GB';
SET shared_buffers = '1GB';Index Tuning
索引调优
For large indices, tune the IVFFlat index:
sql
-- More lists = faster search, more memory
CREATE INDEX ON embeddings
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100); -- Adjust based on row countRule of thumb:
lists = sqrt(rows)对于大型索引,调优IVFFlat索引:
sql
-- More lists = faster search, more memory
CREATE INDEX ON embeddings
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100); -- Adjust based on row count经验法则:
lists = sqrt(rows)Concurrent Access
并发访问
PostgreSQL handles concurrent access automatically:
- Multiple commands work simultaneously
grepai search - One daemon per codebase
grepai watch - Many users can share the same index
PostgreSQL会自动处理并发访问:
- 多个命令可同时运行
grepai search - 每个代码库对应一个守护进程
grepai watch - 多用户可共享同一索引
Team Setup
团队配置
Shared Database
共享数据库
All team members point to the same database:
yaml
undefined所有团队成员指向同一数据库:
yaml
undefinedEach developer's .grepai/config.yaml
Each developer's .grepai/config.yaml
store:
backend: postgres
postgres:
dsn: postgres://team:secret@shared-db.company.com:5432/grepai
undefinedstore:
backend: postgres
postgres:
dsn: postgres://team:secret@shared-db.company.com:5432/grepai
undefinedPer-Project Databases
按项目划分数据库
For isolated projects, use separate databases:
bash
undefined对于隔离的项目,使用独立数据库:
bash
undefinedCreate databases
Create databases
createdb -U postgres grepai_projecta
createdb -U postgres grepai_projectb
```yamlcreatedb -U postgres grepai_projecta
createdb -U postgres grepai_projectb
```yamlProject A config
Project A config
store:
backend: postgres
postgres:
dsn: postgres://user:pass@localhost:5432/grepai_projecta
undefinedstore:
backend: postgres
postgres:
dsn: postgres://user:pass@localhost:5432/grepai_projecta
undefinedBackup and Restore
备份与恢复
Backup
备份
bash
pg_dump -U grepai -d grepai > grepai_backup.sqlbash
pg_dump -U grepai -d grepai > grepai_backup.sqlRestore
恢复
bash
psql -U grepai -d grepai < grepai_backup.sqlbash
psql -U grepai -d grepai < grepai_backup.sqlMigrating from GOB
从GOB迁移
- Set up PostgreSQL with pgvector
- Update configuration:
yaml
store:
backend: postgres
postgres:
dsn: postgres://user:pass@localhost:5432/grepai- Delete old index:
bash
rm .grepai/index.gob- Re-index:
bash
grepai watch- 配置带有pgvector的PostgreSQL
- 更新配置:
yaml
store:
backend: postgres
postgres:
dsn: postgres://user:pass@localhost:5432/grepai- 删除旧索引:
bash
rm .grepai/index.gob- 重新索引:
bash
grepai watchCommon Issues
常见问题
❌ Problem:
✅ Solution: Check DSN credentials and pg_hba.conf
FATAL: password authentication failed❌ Problem:
✅ Solution: Install pgvector:
ERROR: extension "vector" is not availablebash
sudo apt install postgresql-16-pgvector❌ 问题:
✅ 解决方案: 检查DSN凭据和pg_hba.conf
FATAL: password authentication failed❌ 问题:
✅ 解决方案: 安装pgvector:
ERROR: extension "vector" is not availablebash
sudo apt install postgresql-16-pgvectorThen: CREATE EXTENSION vector;
Then: CREATE EXTENSION vector;
❌ **Problem:** `ERROR: type "vector" does not exist`
✅ **Solution:** Enable extension in the database:
```sql
CREATE EXTENSION IF NOT EXISTS vector;❌ Problem: Connection refused
✅ Solution:
- Check PostgreSQL is running
- Verify host and port
- Check firewall rules
❌ Problem: Slow searches
✅ Solution:
- Add IVFFlat index
- Increase
work_mem - Vacuum and analyze tables
❌ **问题:** `ERROR: type "vector" does not exist`
✅ **解决方案:** 在数据库中启用扩展:
```sql
CREATE EXTENSION IF NOT EXISTS vector;❌ 问题: 连接被拒绝
✅ 解决方案:
- 检查PostgreSQL是否正在运行
- 验证主机和端口
- 检查防火墙规则
❌ 问题: 搜索缓慢
✅ 解决方案:
- 添加IVFFlat索引
- 增大
work_mem - 清理并分析表
Best Practices
最佳实践
- Use environment variables: Don't commit credentials
- Enable SSL: For remote databases
- Regular backups: pg_dump before major changes
- Monitor performance: Check query times
- Index maintenance: Regular VACUUM ANALYZE
- 使用环境变量: 不要提交凭据
- 启用SSL: 对于远程数据库
- 定期备份: 重大变更前使用pg_dump
- 监控性能: 检查查询时间
- 索引维护: 定期执行VACUUM ANALYZE
Output Format
输出格式
PostgreSQL storage status:
✅ PostgreSQL Storage Configured
Backend: PostgreSQL + pgvector
Host: localhost:5432
Database: grepai
SSL: disabled
Contents:
- Files: 2,450
- Chunks: 12,340
- Vector dimension: 768
Performance:
- Connection: OK
- IVFFlat index: Yes
- Search latency: ~50msPostgreSQL存储状态:
✅ PostgreSQL Storage Configured
Backend: PostgreSQL + pgvector
Host: localhost:5432
Database: grepai
SSL: disabled
Contents:
- Files: 2,450
- Chunks: 12,340
- Vector dimension: 768
Performance:
- Connection: OK
- IVFFlat index: Yes
- Search latency: ~50ms