grepai-storage-gob
GrepAI Storage with GOB
This skill covers using GOB (Go Binary) as the storage backend for GrepAI, the default and simplest option.
When to Use This Skill
- Single developer projects
- Small to medium codebases
- Simple setup without external dependencies
- Local development environments
What is GOB Storage?
GOB is Go's native binary serialization format. GrepAI uses it to store:
- Vector embeddings
- File metadata
- Chunk information
The index is stored in a single local file, `.grepai/index.gob`; the symbol index used by trace is kept alongside it in `.grepai/symbols.gob`.
Advantages
| Benefit | Description |
|---|---|
| 🚀 Simple | No external services needed |
| ⚡ Fast setup | Works immediately |
| 📁 Portable | Single file, easy to backup |
| 💰 Free | No infrastructure costs |
| 🔒 Private | Data stays local |
Limitations
| Limitation | Description |
|---|---|
| 📏 Scalability | Not ideal for very large codebases |
| 👤 Single user | No concurrent access |
| 🔄 No sharing | Can't share index across machines |
| 💾 Memory | Loads into RAM for searches |
Configuration
Default Configuration
GOB is the default backend. Minimal config:
```yaml
# .grepai/config.yaml
store:
  backend: gob
```
Explicit Configuration
```yaml
store:
  backend: gob
  # Index stored in .grepai/index.gob (automatic)
```
Storage Location
GOB storage creates files in your project's `.grepai/` directory:
```
.grepai/
├── config.yaml    # Configuration
├── index.gob      # Vector embeddings
└── symbols.gob    # Symbol index for trace
```
File Sizes
Approximate `.grepai/index.gob` sizes:
| Codebase | Files | Chunks | Index Size |
|---|---|---|---|
| Small | 100 | 500 | ~5 MB |
| Medium | 1,000 | 5,000 | ~50 MB |
| Large | 10,000 | 50,000 | ~500 MB |
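The table works out to roughly 1 MB of index per 100 chunks. A minimal sketch of that rule of thumb follows; `estimate_index_mb` is a hypothetical helper, not a GrepAI command, and the ratio is only as accurate as the approximate figures above.

```bash
# Rule of thumb read off the table above: about 1 MB of index per 100 chunks.
# This is an approximation, not a figure GrepAI reports.
estimate_index_mb() {
  local chunks="$1"
  # Integer division, rounded up so tiny projects never estimate to 0 MB.
  echo $(( (chunks + 99) / 100 ))
}

estimate_index_mb 5000   # medium codebase from the table, prints 50
```

Actual sizes will vary with chunk length and the embedding model's vector dimension.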
Operations
Creating the Index
```bash
# Initialize project
grepai init

# Start indexing (creates index.gob)
grepai watch
```
Checking Index Status
```bash
grepai status
```
Output:
```
Index: .grepai/index.gob
Files: 245
Chunks: 1,234
Size: 12.5 MB
Last updated: 2025-01-28 10:30:00
```
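If you want a single value from that output in a script, a small filter is enough. The sketch below assumes the `Name: value` layout shown above, which may differ between GrepAI versions; `status_field` is a hypothetical helper.

```bash
# Print the value of one field from `grepai status` output read on stdin.
# Assumes the "Name: value" layout shown above.
status_field() {
  awk -F': ' -v key="$1" '$1 == key { print $2; exit }'
}

# Usage: grepai status | status_field Chunks
```

Splitting on a colon followed by a space keeps timestamps such as `10:30:00` intact in the `Last updated` field.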
Backing Up the Index
```bash
# Simple file copy
cp .grepai/index.gob .grepai/index.gob.backup
```
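A timestamped variant keeps earlier backups from being overwritten. This is a minimal sketch: `backup_index` is a hypothetical helper, the backup file naming is an arbitrary choice, and it assumes nothing is writing to the index while the copy runs (for example, `grepai watch` is stopped).

```bash
# Copy index.gob (and symbols.gob, if present) to timestamped backups.
# Prints the path of the index backup on success.
backup_index() {
  local dir="${1:-.grepai}"
  local stamp
  stamp="$(date +%Y%m%d-%H%M%S)"
  if [ ! -f "$dir/index.gob" ]; then
    echo "no index found at $dir/index.gob" >&2
    return 1
  fi
  cp "$dir/index.gob" "$dir/index.gob.$stamp.backup"
  if [ -f "$dir/symbols.gob" ]; then
    cp "$dir/symbols.gob" "$dir/symbols.gob.$stamp.backup"
  fi
  echo "$dir/index.gob.$stamp.backup"
}

# Usage: backup_index            # defaults to ./.grepai
```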
Clearing the Index
```bash
# Delete and re-index
rm .grepai/index.gob
grepai watch
```
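When resetting from a script, it can help to remove both GOB files while leaving `config.yaml` untouched. The sketch below uses a hypothetical `clear_index` helper; removing `symbols.gob` as well follows the corrupted-index fix under Common Issues rather than the command above.

```bash
# Remove the GOB index files but keep config.yaml, so the next
# `grepai watch` rebuilds the index from scratch.
clear_index() {
  local dir="${1:-.grepai}"
  # -f makes this safe to run when the files are already gone.
  rm -f "$dir/index.gob" "$dir/symbols.gob"
}

# Usage: clear_index && grepai watch
```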
Moving to a New Machine
```bash
# Copy entire .grepai directory
cp -r .grepai /path/to/new/location/

# Note: Only works if using same embedding model
```
Performance Considerations
Memory Usage
GOB loads the entire index into RAM for searches:
| Index Size | RAM Usage |
|---|---|
| 10 MB | ~20 MB |
| 50 MB | ~100 MB |
| 500 MB | ~1 GB |
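The table suggests RAM usage of about twice the on-disk index size. A minimal sketch that applies this ratio to an existing index file; `estimated_ram_mb` is a hypothetical helper and the 2x factor is only the approximation from the table.

```bash
# Estimate search-time RAM as ~2x the on-disk index size, the ratio
# suggested by the table above. Prints whole megabytes, rounded up.
estimated_ram_mb() {
  local file="${1:-.grepai/index.gob}"
  local bytes
  bytes="$(wc -c < "$file")" || return 1
  echo $(( (bytes * 2 + 1048575) / 1048576 ))
}

# Usage: estimated_ram_mb .grepai/index.gob
```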
Search Speed
GOB provides fast searches for typical codebases:
| Codebase Size | Search Time |
|---|---|
| Small (100 files) | <50ms |
| Medium (1K files) | <200ms |
| Large (10K files) | <1s |
When to Upgrade
Consider PostgreSQL or Qdrant when:
- Index exceeds 1 GB
- Need concurrent access
- Want to share index across team
- Codebase has 50K+ files
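The two measurable criteria in this list can be checked mechanically. The sketch below uses a hypothetical `should_upgrade` helper and treats "1 GB" as 1024^3 bytes, which is an assumption; the concurrency and team-sharing criteria remain judgment calls.

```bash
# Return success (0) when the index or the codebase crosses the
# thresholds listed above: index over 1 GB, or 50K+ files.
# "1 GB" is taken as 1024^3 bytes here, which is an assumption.
should_upgrade() {
  local index_bytes="$1" file_count="$2"
  [ "$index_bytes" -gt 1073741824 ] || [ "$file_count" -ge 50000 ]
}

# Usage:
#   should_upgrade "$(wc -c < .grepai/index.gob)" 12000 && echo "consider PostgreSQL or Qdrant"
```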
.gitignore Configuration
Add `.grepai/` to your `.gitignore`:
```gitignore
# GrepAI (machine-specific index)
.grepai/
```
**Why:** The index is machine-specific because:
- Contains binary embeddings
- Tied to the embedding model used
- Each machine should generate its own
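For setup scripts, the ignore rule can be added idempotently so repeated runs do not duplicate it. A minimal sketch with a hypothetical `ignore_grepai` helper; it only detects an exact `.grepai/` line, so equivalent patterns written differently would not be recognized.

```bash
# Append ".grepai/" to a .gitignore file unless that exact line exists.
ignore_grepai() {
  local file="${1:-.gitignore}"
  touch "$file"
  # Nothing to do if an exact ".grepai/" line is already present.
  if grep -qxF '.grepai/' "$file"; then
    return 0
  fi
  # Make sure existing content ends with a newline before appending.
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    echo >> "$file"
  fi
  printf '%s\n' '# GrepAI (machine-specific index)' '.grepai/' >> "$file"
}

# Usage: ignore_grepai            # defaults to ./.gitignore
```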
Sharing Index (Not Recommended)
While you can copy the index file, it's not recommended because:
- Must use identical embedding model
- File paths are absolute
- Different machines may have different code versions
Better approach: Each developer runs their own `grepai watch`.
Migrating to Other Backends
To PostgreSQL
- Update config:
```yaml
store:
  backend: postgres
  postgres:
    dsn: postgres://user:pass@localhost:5432/grepai
```
- Re-index:
```bash
rm .grepai/index.gob
grepai watch
```
To Qdrant
- Update config:
```yaml
store:
  backend: qdrant
  qdrant:
    endpoint: localhost
    port: 6334
```
- Re-index:
```bash
rm .grepai/index.gob
grepai watch
```
Common Issues
❌ Problem: Index file too large
✅ Solution: Add more ignore patterns or migrate to PostgreSQL/Qdrant
❌ Problem: Slow searches on large codebase
✅ Solution: Migrate to Qdrant for better performance
❌ Problem: Corrupted index
✅ Solution: Delete and re-index:
```bash
rm .grepai/index.gob .grepai/symbols.gob
grepai watch
```
❌ Problem: "Index not found" error
✅ Solution: Run `grepai watch` to create the index
Best Practices
- Use for small/medium projects: Up to ~10K files
- Add to .gitignore: Don't commit the index
- Backup before major changes: Copy index.gob before experiments
- Re-index after model changes: If you change embedding models
- Monitor file size: Migrate if index exceeds 1 GB
Output Format
GOB storage status:
```
✅ GOB Storage Configured

Backend: GOB (local file)
Index: .grepai/index.gob
Size: 12.5 MB

Contents:
- Files: 245
- Chunks: 1,234
- Vectors: 1,234 × 768 dimensions

Performance:
- Search latency: <100ms
- Memory usage: ~25 MB
```
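To put the `Vectors` line in context, the raw size of the embeddings alone can be computed from the count and dimension. The sketch below assumes each component is a 4-byte float32, which the source does not state; `raw_vector_bytes` is a hypothetical helper.

```bash
# Raw size of the vectors alone, assuming each component is a 4-byte
# float32 (an assumption; the element type is not stated above).
raw_vector_bytes() {
  local vectors="$1" dims="$2"
  echo $(( vectors * dims * 4 ))
}

raw_vector_bytes 1234 768   # prints 3790848, about 3.6 MiB
```

Under that assumption the vectors account for only part of the 12.5 MB shown; the rest would be whatever else the index stores, such as file metadata and chunk information.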