# GrepAI Embeddings with Ollama
This skill covers using Ollama as the embedding provider for GrepAI, enabling 100% private, local code search.
## When to Use This Skill
- Setting up private, local embeddings
- Choosing the right Ollama model
- Optimizing Ollama performance
- Troubleshooting Ollama connection issues
## Why Ollama?
| Advantage | Description |
|---|---|
| 🔒 Privacy | Code never leaves your machine |
| 💰 Free | No API costs or usage limits |
| ⚡ Speed | No network latency |
| 🔌 Offline | Works without internet |
| 🔧 Control | Choose your model |
## Prerequisites

- Ollama installed and running
- An embedding model downloaded

```bash
# Install Ollama
brew install ollama   # macOS
# or
curl -fsSL https://ollama.com/install.sh | sh   # Linux

# Start Ollama
ollama serve

# Download an embedding model
ollama pull nomic-embed-text
```
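A quick sanity check after installing and pulling (assumes the default local setup):

```bash
ollama --version                      # confirm the CLI is installed
ollama list | grep nomic-embed-text   # confirm the model is present
```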
## Configuration

### Basic Configuration

```yaml
# .grepai/config.yaml
embedder:
  provider: ollama
  model: nomic-embed-text
  endpoint: http://localhost:11434
```
### With Custom Endpoint

```yaml
embedder:
  provider: ollama
  model: nomic-embed-text
  endpoint: http://192.168.1.100:11434  # Remote Ollama server
```

### With Explicit Dimensions
```yaml
embedder:
  provider: ollama
  model: nomic-embed-text
  endpoint: http://localhost:11434
  dimensions: 768  # Usually auto-detected
```

## Available Models
### Recommended: nomic-embed-text

```bash
ollama pull nomic-embed-text
```

| Property | Value |
|---|---|
| Dimensions | 768 |
| Size | ~274 MB |
| Speed | Fast |
| Quality | Excellent for code |
| Language | English-optimized |

Configuration:

```yaml
embedder:
  provider: ollama
  model: nomic-embed-text
```
### Multilingual: nomic-embed-text-v2-moe
```bash
ollama pull nomic-embed-text-v2-moe
```

| Property | Value |
|---|---|
| Dimensions | 768 |
| Size | ~500 MB |
| Speed | Medium |
| Quality | Excellent |
| Language | Multilingual |

Best for codebases with non-English comments/documentation.

Configuration:

```yaml
embedder:
  provider: ollama
  model: nomic-embed-text-v2-moe
```
### High Quality: bge-m3
```bash
ollama pull bge-m3
```

| Property | Value |
|---|---|
| Dimensions | 1024 |
| Size | ~1.2 GB |
| Speed | Slower |
| Quality | Very high |
| Language | Multilingual |

Best for large, complex codebases where accuracy is critical.

Configuration:

```yaml
embedder:
  provider: ollama
  model: bge-m3
  dimensions: 1024
```
### Maximum Quality: mxbai-embed-large
```bash
ollama pull mxbai-embed-large
```

| Property | Value |
|---|---|
| Dimensions | 1024 |
| Size | ~670 MB |
| Speed | Medium |
| Quality | Highest |
| Language | English |

Configuration:

```yaml
embedder:
  provider: ollama
  model: mxbai-embed-large
  dimensions: 1024
```
## Model Comparison
| Model | Dims | Size | Speed | Quality | Use Case |
|---|---|---|---|---|---|
| nomic-embed-text | 768 | 274MB | ⚡⚡⚡ | ⭐⭐⭐ | General use |
| nomic-embed-text-v2-moe | 768 | 500MB | ⚡⚡ | ⭐⭐⭐⭐ | Multilingual |
| bge-m3 | 1024 | 1.2GB | ⚡ | ⭐⭐⭐⭐⭐ | Large codebases |
| mxbai-embed-large | 1024 | 670MB | ⚡⚡ | ⭐⭐⭐⭐⭐ | Maximum accuracy |
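If you are unsure which trade-off suits your machine, one rough check is to time a single embedding request against each model you have pulled. This is a sketch using the same `/api/embeddings` call shown later in this skill; the model list is just an example:

```bash
# Rough latency comparison across pulled embedding models.
# Note: the first request per model also includes model-load time.
for model in nomic-embed-text bge-m3; do
  echo "== $model =="
  time curl -s -o /dev/null http://localhost:11434/api/embeddings \
    -d "{\"model\": \"$model\", \"prompt\": \"function authenticate(user, password)\"}"
done
```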
## Performance Optimization

### Memory Management
Models load into RAM. Ensure sufficient memory:

| Model | RAM Required |
|---|---|
| nomic-embed-text | ~500 MB |
| nomic-embed-text-v2-moe | ~800 MB |
| bge-m3 | ~1.5 GB |
| mxbai-embed-large | ~1 GB |
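Before pulling a larger model, it can help to check headroom with standard OS tools (a sketch; `vm_stat` reports page counts rather than bytes):

```bash
free -h      # Linux: available RAM
vm_stat      # macOS: page counts (multiply by page size)
ollama ps    # size of any models currently loaded
```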
### GPU Acceleration

Ollama automatically uses:

- macOS: Metal (Apple Silicon)
- Linux/Windows: CUDA (NVIDIA GPUs)

Check GPU usage:

```bash
ollama ps
```
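On Linux with an NVIDIA card, `nvidia-smi` gives a second opinion on whether Ollama is actually using the GPU (assumes the NVIDIA drivers are installed):

```bash
nvidia-smi   # Ollama should appear in the process list when using CUDA
```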
### Keeping Model Loaded
By default, Ollama unloads models after 5 minutes of inactivity. Keep loaded:

```bash
# Keep model loaded indefinitely
curl http://localhost:11434/api/generate -d '{
  "model": "nomic-embed-text",
  "keep_alive": -1
}'
```
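Alternatively, Ollama's `OLLAMA_KEEP_ALIVE` environment variable sets the default for all models at server start; a sketch, assuming you launch the server yourself:

```bash
# Keep every loaded model resident indefinitely
OLLAMA_KEEP_ALIVE=-1 ollama serve
```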
## Verifying Connection

### Check Ollama is Running

```bash
curl http://localhost:11434/api/tags
```
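A running server answers with a JSON list of installed models. With `jq` installed (an assumption, not required by GrepAI), the names can be extracted directly:

```bash
curl -s http://localhost:11434/api/tags | jq -r '.models[].name'
```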
### List Available Models
```bash
ollama list
```

### Test Embedding
```bash
curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "function authenticate(user, password)"
}'
```
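A successful response contains an `embedding` array; counting its elements confirms the model's output dimensions (again assuming `jq`):

```bash
curl -s http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "function authenticate(user, password)"}' \
  | jq '.embedding | length'   # expect 768 for nomic-embed-text
```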
## Running Ollama as a Service
### macOS (launchd)

The Ollama app runs automatically on login.
### Linux (systemd)

```bash
# Enable service
sudo systemctl enable ollama

# Start service
sudo systemctl start ollama

# Check status
sudo systemctl status ollama
```
### Manual Background

```bash
nohup ollama serve > /dev/null 2>&1 &
```

## Remote Ollama Server
Run Ollama on a powerful server and connect remotely:

### On the Server

```bash
# Allow remote connections
OLLAMA_HOST=0.0.0.0 ollama serve
```
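When the server runs Ollama under systemd (the Linux install script sets this up), an inline `OLLAMA_HOST` won't persist across restarts. A systemd drop-in is one way to make it stick; a sketch:

```bash
sudo systemctl edit ollama
# In the editor that opens, add:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl restart ollama
```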
### On the Client

```yaml
# .grepai/config.yaml
embedder:
  provider: ollama
  model: nomic-embed-text
  endpoint: http://server-ip:11434
```
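Before re-indexing, confirm the client can actually reach the server (replace `server-ip` as in the config above):

```bash
curl http://server-ip:11434/api/tags
```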
## Common Issues

❌ **Problem:** Connection refused

✅ **Solution:**

```bash
# Start Ollama
ollama serve
```
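If `ollama serve` instead reports that the address is already in use, something else may hold the port; `lsof` can identify it (macOS/Linux):

```bash
lsof -i :11434   # show the process listening on Ollama's default port
```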
❌ **Problem:** Model not found

✅ **Solution:**

```bash
# Pull the model
ollama pull nomic-embed-text
```
❌ **Problem:** Slow embedding generation
✅ **Solutions:**
- Use a smaller model (`nomic-embed-text`)
- Ensure GPU is being used (`ollama ps`)
- Close memory-intensive applications
- Consider a remote server with better hardware
❌ **Problem:** Out of memory
✅ **Solutions:**
- Use a smaller model
- Close other applications
- Upgrade RAM
- Use remote Ollama server
❌ **Problem:** Embeddings differ after model update
✅ **Solution:** Re-index after model updates:
```bash
rm .grepai/index.gob
grepai watch
```
## Best Practices
- Start with `nomic-embed-text`: best balance of speed and quality
- Keep Ollama running: a background service is recommended
- Match dimensions: don't mix models with different dimensions
- Re-index on model change: delete the index and re-run watch
- Monitor memory: embedding models use significant RAM
## Output Format

Successful Ollama configuration:

```
✅ Ollama Embedding Provider Configured

Provider: Ollama
Model: nomic-embed-text
Endpoint: http://localhost:11434
Dimensions: 768 (auto-detected)
Status: Connected

Model Info:
- Size: 274 MB
- Loaded: Yes
- GPU: Apple Metal
```