grepai-ollama-setup

Ollama Setup for GrepAI

This skill covers installing and configuring Ollama as the local embedding provider for GrepAI. Ollama enables 100% private code search where your code never leaves your machine.

When to Use This Skill


  • Setting up GrepAI with local, private embeddings
  • Installing Ollama for the first time
  • Choosing and downloading embedding models
  • Troubleshooting Ollama connection issues

Why Ollama?


| Benefit | Description |
|---------|-------------|
| 🔒 Privacy | Code never leaves your machine |
| 💰 Free | No API costs |
| ⚡ Fast | Local processing, no network latency |
| 🔌 Offline | Works without internet |

Installation


macOS (Homebrew)


```bash
# Install Ollama
brew install ollama

# Start the Ollama service
ollama serve
```

macOS (Direct Download)


  1. Download from ollama.com
  2. Open the `.dmg` and drag to Applications
  3. Launch Ollama from Applications

Linux


```bash
# One-line installer (from ollama.com)
curl -fsSL https://ollama.com/install.sh | sh

# Start the service
ollama serve
```

Windows


  1. Download installer from ollama.com
  2. Run the installer
  3. Ollama starts automatically as a service

Downloading Embedding Models


GrepAI requires an embedding model to convert code into vectors.
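To make "convert code into vectors" concrete, here is a toy sketch of embedding-based similarity: the 3-dimensional vectors below are made up for illustration (real models such as nomic-embed-text produce 768-dimensional vectors), but the comparison is the same.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up 3-dimensional "embeddings", for illustration only
query        = [0.9, 0.1, 0.2]  # "open a file for reading"
file_snippet = [0.8, 0.2, 0.3]  # code that opens a file
net_snippet  = [0.1, 0.9, 0.4]  # unrelated networking code

# Semantic search ranks the related snippet higher
print(cosine_similarity(query, file_snippet) > cosine_similarity(query, net_snippet))  # True
```

Conceptually, this is the comparison a semantic code search runs over every indexed chunk, just at much higher dimensionality.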

Recommended Model: nomic-embed-text


```bash
# Download the recommended model (768 dimensions)
ollama pull nomic-embed-text
```

**Specifications:**
- Dimensions: 768
- Size: ~274 MB
- Performance: Excellent for code search
- Language: English-optimized

Alternative Models


```bash
# Multilingual support (better for non-English code/comments)
ollama pull nomic-embed-text-v2-moe

# Larger, more accurate
ollama pull bge-m3

# Maximum quality
ollama pull mxbai-embed-large
```

| Model | Dimensions | Size | Best For |
|-------|------------|------|----------|
| `nomic-embed-text` | 768 | 274 MB | General code search |
| `nomic-embed-text-v2-moe` | 768 | 500 MB | Multilingual codebases |
| `bge-m3` | 1024 | 1.2 GB | Large codebases |
| `mxbai-embed-large` | 1024 | 670 MB | Maximum accuracy |

Verifying Installation


Check Ollama is Running


```bash
# Check if Ollama server is responding
curl http://localhost:11434/api/tags

# Expected output: JSON with available models
```

List Downloaded Models


```bash
ollama list
```

Output:

```
NAME                       ID          SIZE    MODIFIED
nomic-embed-text:latest    abc123...   274 MB  2 hours ago
```

Test Embedding Generation


```bash
# Quick test (should return embedding vector)
curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "function hello() { return world; }"
}'
```
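The same test can be scripted; below is a minimal Python sketch using only the standard library. The `/api/embeddings` endpoint and the `model`/`prompt` request fields match the curl call above; the `embedding` response field and the helper names are assumptions from Ollama's API, so treat this as a sketch rather than a reference client.

```python
import json
import urllib.request

def build_payload(prompt, model="nomic-embed-text"):
    """Build the JSON body for Ollama's /api/embeddings endpoint."""
    return json.dumps({"model": model, "prompt": prompt}).encode()

def embed(prompt, endpoint="http://localhost:11434"):
    """POST a prompt to a local Ollama server and return the embedding vector."""
    req = urllib.request.Request(
        endpoint + "/api/embeddings",
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

if __name__ == "__main__":
    # Requires `ollama serve` to be running with the model pulled
    vec = embed("function hello() { return world; }")
    print(f"Got {len(vec)}-dimensional embedding")
```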

Configuring GrepAI for Ollama


After installing Ollama, configure GrepAI to use it:

```yaml
# .grepai/config.yaml
embedder:
  provider: ollama
  model: nomic-embed-text
  endpoint: http://localhost:11434
```

This is the **default configuration** when you run `grepai init`, so no changes are needed if using `nomic-embed-text`.
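If you want to sanity-check the config file from a script, here is a minimal sketch. The hand-rolled parser handles only the flat two-level `embedder:` block shown above; it is not GrepAI's actual config loader, just an illustration of the expected keys.

```python
def parse_embedder_config(text):
    """Extract key/value pairs nested one level under 'embedder:'.

    Not a general YAML parser -- just enough for the flat block above.
    """
    config = {}
    in_embedder = False
    for line in text.splitlines():
        if not line.strip() or line.strip().startswith("#"):
            continue  # skip blanks and comments
        if not line.startswith(" "):
            in_embedder = line.rstrip() == "embedder:"
            continue
        if in_embedder:
            key, _, value = line.strip().partition(":")
            config[key] = value.strip()
    return config

sample = """\
embedder:
  provider: ollama
  model: nomic-embed-text
  endpoint: http://localhost:11434
"""
cfg = parse_embedder_config(sample)
print(cfg["provider"], cfg["model"])  # ollama nomic-embed-text
```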

Running Ollama


Foreground (Development)


```bash
# Run in current terminal (see logs)
ollama serve
```

Background (macOS/Linux)


```bash
# Using nohup
nohup ollama serve &

# Or as a systemd service (Linux)
sudo systemctl enable ollama
sudo systemctl start ollama
```

Check Status


```bash
# Check if running
pgrep -f ollama

# Or test the API
curl http://localhost:11434/api/tags
```
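The same status check can be done from a script before launching GrepAI; a small sketch using Python's standard library (port 11434 matches Ollama's default endpoint; the function name is our own):

```python
import socket

def ollama_running(host="localhost", port=11434, timeout=1.0):
    """Return True if something is accepting TCP connections on the Ollama port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    if ollama_running():
        print("Ollama is running")
    else:
        print("Ollama is NOT running -- try 'ollama serve'")
```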

Resource Considerations


Memory Usage


Embedding models load into RAM:
  • nomic-embed-text: ~500 MB RAM
  • bge-m3: ~1.5 GB RAM
  • mxbai-embed-large: ~1 GB RAM

CPU vs GPU


Ollama uses CPU by default. For faster embeddings:
  • macOS: Uses Metal (Apple Silicon) automatically
  • Linux/Windows: Install CUDA for NVIDIA GPU support

Common Issues


Problem: `connection refused` to localhost:11434
✅ Solution: Start Ollama:

```bash
ollama serve
```

Problem: Model not found
✅ Solution: Pull the model first:

```bash
ollama pull nomic-embed-text
```

Problem: Slow embedding generation
✅ Solution:
  • Use a smaller model
  • Ensure Ollama is using GPU (check `ollama ps`)
  • Close other memory-intensive applications

Problem: Out of memory
✅ Solution: Use a smaller model or increase system RAM

Best Practices


  1. Start Ollama before GrepAI: Ensure `ollama serve` is running
  2. Use the recommended model: `nomic-embed-text` offers the best balance
  3. Keep Ollama running: Leave it as a background service
  4. Update periodically: Run `ollama pull nomic-embed-text` for updates

Output Format


After successful setup:

```
✅ Ollama Setup Complete

   Ollama Version: 0.1.x
   Endpoint: http://localhost:11434
   Model: nomic-embed-text (768 dimensions)
   Status: Running

   GrepAI is ready to use with local embeddings.
   Your code will never leave your machine.
```