RLAMA - Local RAG System

RLAMA (Retrieval-Augmented Language Model Adapter) provides fully local, offline RAG for semantic search over your documents.

When to Use This Skill

- Building knowledge bases from local documents
- Searching personal notes, research papers, or code documentation
- Document-based Q&A without sending data to the cloud
- Indexing project documentation for quick semantic lookup
- Creating searchable archives of PDFs, markdown, or code files

Prerequisites

RLAMA requires Ollama running locally:

```bash
# Verify Ollama is running
ollama list

# If not running, start it
brew services start ollama   # macOS
# or: ollama serve
```
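
When driving RLAMA from scripts, a quick preflight check helps fail fast. A minimal sketch, assuming `ollama` is on `PATH` (the injectable `run` parameter is only there to make the helper easy to test):

```python
import subprocess

def ollama_running(run=subprocess.run) -> bool:
    """True if the Ollama CLI is installed and the daemon responds to `ollama list`."""
    try:
        return run(["ollama", "list"], capture_output=True).returncode == 0
    except FileNotFoundError:
        return False  # ollama binary not installed / not on PATH

if __name__ == "__main__":
    if not ollama_running():
        print("Start Ollama first: brew services start ollama (or: ollama serve)")
```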

Quick Reference

Query a RAG (Most Common)

Query an existing RAG system with a natural language question:

```bash
# Non-interactive query (returns answer and exits)
rlama run <rag-name> --query "your question here"

# With more context chunks for complex questions
rlama run <rag-name> --query "explain the authentication flow" --context-size 30

# Show which documents contributed to the answer
rlama run <rag-name> --query "what are the API endpoints?" --show-context

# Use a different model for answering
rlama run <rag-name> --query "summarize the architecture" -m deepseek-r1:8b
```

**Script wrapper** for cleaner output:

```bash
python3 ~/.claude/skills/rlama/scripts/rlama_query.py <rag-name> "your query"
python3 ~/.claude/skills/rlama/scripts/rlama_query.py my-docs "what is the main idea?" --show-sources
```

Retrieve-Only Mode (Claude Synthesizes)

Get raw chunks without local LLM generation. Claude reads the chunks directly and synthesizes a stronger answer than local models can produce.

When to use retrieve vs standard query:

| Scenario | Use |
|----------|-----|
| Quick lookup, local model sufficient | `rlama_query.py` (standard) |
| Complex synthesis, nuanced reasoning | `rlama_retrieve.py` (retrieve-only) |
| Claude needs raw evidence to cite | `rlama_retrieve.py` (retrieve-only) |
| Offline/no Ollama for generation | `rlama_retrieve.py` (retrieve-only) |

```bash
# Retrieve top 10 chunks (human-readable)
python3 ~/.claude/skills/rlama/scripts/rlama_retrieve.py <rag-name> "your query"

# Retrieve as JSON for programmatic use
python3 ~/.claude/skills/rlama/scripts/rlama_retrieve.py <rag-name> "your query" --json

# More chunks for broad queries
python3 ~/.claude/skills/rlama/scripts/rlama_retrieve.py <rag-name> "your query" -k 20

# Force rebuild embedding cache
python3 ~/.claude/skills/rlama/scripts/rlama_retrieve.py <rag-name> "your query" --rebuild-cache

# List RAGs with cache status
python3 ~/.claude/skills/rlama/scripts/rlama_retrieve.py --list
```
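
The `--json` flag makes the chunks easy to consume programmatically. A sketch of post-processing them in Python; note that the field names used here (`source`, `score`, `text`) are assumptions for illustration, so check the script's actual output schema before relying on them:

```python
import json

# Hypothetical --json output; the real schema emitted by rlama_retrieve.py may differ
SAMPLE = json.dumps([
    {"source": "docs/auth.md", "score": 0.82, "text": "Tokens are issued by the gateway..."},
    {"source": "docs/api.md", "score": 0.74, "text": "All endpoints require a bearer token."},
])

def top_sources(raw: str, n: int = 5):
    """Return the n highest-scoring source files from retrieved chunks."""
    chunks = json.loads(raw)
    ranked = sorted(chunks, key=lambda c: c["score"], reverse=True)
    return [c["source"] for c in ranked[:n]]

print(top_sources(SAMPLE, 1))  # ['docs/auth.md']
```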

**External LLM Synthesis** (optional—retrieve chunks AND synthesize via OpenRouter, TogetherAI, Ollama, or any OpenAI-compatible endpoint):

```bash
# Synthesize via OpenRouter (auto-detected from model with /)
python3 ~/.claude/skills/rlama/scripts/rlama_retrieve.py <rag-name> "your query" --synthesize --synth-model anthropic/claude-sonnet-4

# Synthesize via TogetherAI
python3 ~/.claude/skills/rlama/scripts/rlama_retrieve.py <rag-name> "your query" --synthesize --provider togetherai

# Synthesize via local Ollama (fully offline, uses research-grade system prompt)
python3 ~/.claude/skills/rlama/scripts/rlama_retrieve.py <rag-name> "your query" --synthesize --provider ollama

# Synthesize via custom endpoint
python3 ~/.claude/skills/rlama/scripts/rlama_retrieve.py <rag-name> "your query" --synthesize --endpoint https://my-api.com/v1/chat/completions
```

**Environment variables for synthesis:**

| Variable | Provider |
|----------|----------|
| `OPENROUTER_API_KEY` | OpenRouter (default, auto-detected first) |
| `TOGETHER_API_KEY` | TogetherAI |
| `SYNTH_API_KEY` | Custom endpoint (via `--endpoint`) |
| *(none needed)* | Ollama (local, no auth) |

Provider auto-detection: model names with `/` → OpenRouter, otherwise → TogetherAI. Falls back to whichever API key is set.
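
The auto-detection rule above can be sketched in Python. This is an illustration of the documented rule, not the script's actual code; the environment variable names match the table above:

```python
def detect_provider(model: str, env: dict) -> str:
    """Documented rule: model names with '/' prefer OpenRouter, otherwise TogetherAI,
    falling back to whichever provider actually has an API key set."""
    preferred = "openrouter" if "/" in model else "togetherai"
    keys = {"openrouter": "OPENROUTER_API_KEY", "togetherai": "TOGETHER_API_KEY"}
    if env.get(keys[preferred]):
        return preferred
    # Fall back to whichever provider has a key configured
    for provider, key in keys.items():
        if env.get(key):
            return provider
    return preferred

print(detect_provider("anthropic/claude-sonnet-4", {"OPENROUTER_API_KEY": "k"}))  # openrouter
```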

**Quality tiers:**

| Tier | Method | Quality | Latency |
|------|--------|---------|---------|
| Best | Retrieve-only → Claude synthesizes | Strongest synthesis | ~1s retrieve |
| Good | `--synthesize --synth-model anthropic/claude-sonnet-4` | Strong, cited | ~3s |
| Decent | `--synthesize --provider togetherai` (Llama 70B) | Solid for factual | ~2s |
| Local | `--synthesize --provider ollama` (Qwen 7B) | Basic, may hedge | ~5s |
| Baseline | `rlama_query.py` (RLAMA built-in) | Weakest, no prompt control | ~3s |

Small local models (7B) use a tuned prompt optimized for Qwen (structured output, anti-hedge, domain-keyword aware). Cloud providers use a strict research-grade prompt with mandatory citations.

First run builds an embedding cache (~30s for 3K chunks, ~10min for 25K chunks). Subsequent queries are <1s. Large RAGs use incremental checkpointing—if Ollama crashes mid-build, re-run to resume from the last checkpoint. Individual chunks are truncated to 5K chars to stay within nomic-embed-text's context window.

**Benchmarking:**

```bash
# Retrieval quality only
python3 ~/.claude/skills/rlama/scripts/rlama_bench.py <rag-name> --retrieval-only

# Full synthesis benchmark (8 test cases)
python3 ~/.claude/skills/rlama/scripts/rlama_bench.py <rag-name> --provider ollama --verbose

# Single test case
python3 ~/.claude/skills/rlama/scripts/rlama_bench.py <rag-name> --provider ollama --case 0

# JSON output for analysis
python3 ~/.claude/skills/rlama/scripts/rlama_bench.py <rag-name> --provider ollama --json
```

Scores: retrieval precision, topic coverage, grounding, directness (anti-hedge), composite (0-100).

Create a RAG

Index documents from a folder into a new RAG system:

```bash
# Basic creation (uses llama3.2 by default)
rlama rag llama3.2 <rag-name> <folder-path>

# Examples
rlama rag llama3.2 my-notes ~/Notes
rlama rag llama3.2 project-docs ./docs
rlama rag llama3.2 research-papers ~/Papers

# With exclusions
rlama rag llama3.2 codebase ./src --exclude-dir=node_modules,dist,.git --exclude-ext=.log,.tmp

# Only specific file types
rlama rag llama3.2 markdown-docs ./docs --process-ext=.md,.txt

# Custom chunking strategy
rlama rag llama3.2 my-rag ./docs --chunking=semantic --chunk-size=1500 --chunk-overlap=300
```

**Chunking strategies:**
- `hybrid` (default) - Combines semantic and fixed chunking
- `semantic` - Respects document structure (paragraphs, sections)
- `fixed` - Fixed character count chunks
- `hierarchical` - Preserves document hierarchy
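
To reason about `--chunk-size` and `--chunk-overlap` values before indexing, the arithmetic for fixed chunking can be sketched as follows (illustrative only, not RLAMA's actual chunker):

```python
import math

def split_fixed(text: str, size: int, overlap: int):
    """Split text into fixed-size chunks with the given character overlap."""
    chunks, start, stride = [], 0, size - overlap
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
        start += stride
    return chunks

def chunk_count(doc_len: int, size: int, overlap: int) -> int:
    """Approximate number of chunks for a document of doc_len characters."""
    if doc_len <= size:
        return 1
    return math.ceil((doc_len - size) / (size - overlap)) + 1

# A 6,000-char document with the 1500/300 settings shown above:
print(chunk_count(6000, 1500, 300))  # 5
```

Larger overlap improves recall at chunk boundaries but multiplies chunk count (and indexing time), which is worth checking before indexing a big corpus.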

List RAG Systems

```bash
# List all RAGs
rlama list

# List documents in a specific RAG
rlama list-docs <rag-name>

# Inspect chunks (debugging)
rlama list-chunks <rag-name> --document=filename.pdf
```

Manage Documents

Add documents to an existing RAG:

```bash
rlama add-docs <rag-name> <folder-or-file>

# Examples
rlama add-docs my-notes ~/Notes/new-notes
rlama add-docs research ./papers/new-paper.pdf
```

**Remove a document:**

```bash
rlama remove-doc <rag-name> <document-id>

# Document ID is typically the filename
rlama remove-doc my-notes old-note.md
rlama remove-doc research outdated-paper.pdf

# Force remove without confirmation
rlama remove-doc my-notes old-note.md --force
```

Delete a RAG

```bash
rlama delete <rag-name>

# Or manually remove the data directory
rm -rf ~/.rlama/<rag-name>
```

Advanced Features

Web Crawling

Create a RAG from website content:

```bash
# Crawl a website and create RAG
rlama crawl-rag llama3.2 docs-rag https://docs.example.com

# Add web content to existing RAG
rlama crawl-add-docs my-rag https://blog.example.com
```

Directory Watching

Automatically update a RAG when files change:

```bash
# Enable watching
rlama watch <rag-name> <folder-path>

# Check for new files manually
rlama check-watched <rag-name>

# Disable watching
rlama watch-off <rag-name>
```

Website Watching

Monitor websites for content updates:

```bash
rlama web-watch <rag-name> https://docs.example.com
rlama check-web-watched <rag-name>
rlama web-watch-off <rag-name>
```

Reranking

Improve result relevance with reranking:

```bash
# Add reranker to existing RAG
rlama add-reranker <rag-name>

# Configure reranker weight (0-1, default 0.7)
rlama update-reranker <rag-name> --reranker-weight=0.8

# Disable reranking at creation time
rlama rag llama3.2 my-rag ./docs --disable-reranker
```

API Server

Run RLAMA as an API server for programmatic access:

```bash
# Start API server
rlama api --port 11249

# Query via API
curl -X POST http://localhost:11249/rag \
  -H "Content-Type: application/json" \
  -d '{"rag_name": "my-docs", "prompt": "What are the key points?", "context_size": 20}'
```
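
The same request can be issued from Python with the standard library. A sketch, not an official client; the endpoint path and payload fields mirror the curl example above, and the response is returned as a raw string because the response schema is not documented here:

```python
import json
import urllib.request

def build_rag_request(rag_name: str, prompt: str, context_size: int = 20,
                      host: str = "http://localhost:11249"):
    """Build a POST request for a running `rlama api` server's /rag endpoint."""
    payload = {"rag_name": rag_name, "prompt": prompt, "context_size": context_size}
    return urllib.request.Request(
        f"{host}/rag",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def query(rag_name: str, prompt: str) -> str:
    """Send the query and return the raw response body (requires a running server)."""
    with urllib.request.urlopen(build_rag_request(rag_name, prompt)) as resp:
        return resp.read().decode("utf-8")

# Example: print(query("my-docs", "What are the key points?"))
```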

Model Management

```bash
# Update the model used by a RAG
rlama update-model <rag-name> <new-model>

# Example: switch to a more powerful model
rlama update-model my-rag deepseek-r1:8b

# Use Hugging Face models
rlama rag hf.co/username/repo my-rag ./docs
rlama rag hf.co/username/repo:Q4_K_M my-rag ./docs

# Use OpenAI models (requires OPENAI_API_KEY)
export OPENAI_API_KEY="your-key"
rlama rag gpt-4-turbo my-openai-rag ./docs
```

Configuration

Data Directory

By default, RLAMA stores data in `~/.rlama/`. Change this with `--data-dir`:

```bash
# Use custom data directory
rlama --data-dir=/path/to/custom list
rlama --data-dir=/projects/rag-data rag llama3.2 project-rag ./docs

# Or set via environment (add to ~/.zshrc)
export RLAMA_DATA_DIR="/path/to/custom"
```

Ollama Configuration

```bash
# Custom Ollama host
rlama --host=192.168.1.100 --port=11434 run my-rag

# Or via environment
export OLLAMA_HOST="http://192.168.1.100:11434"
```

Default Model

The skill uses `qwen2.5:7b` by default (changed from llama3.2 in Jan 2026). For legacy mode:

```bash
# Use the old llama3.2 default
python3 ~/.claude/skills/rlama/scripts/rlama_manage.py create my-rag ./docs --legacy

# Per-command model override
rlama rag deepseek-r1:8b my-rag ./docs

# For queries
rlama run my-rag --query "question" -m deepseek-r1:8b
```

**Recommended models:**

| Model | Size | Best For |
|-------|------|----------|
| `qwen2.5:7b` | 7B | Default - better reasoning (recommended) |
| `llama3.2` | 3B | Fast, legacy default (use `--legacy`) |
| `deepseek-r1:8b` | 8B | Complex questions |
| `llama3.3:70b` | 70B | Highest quality (slow) |

Supported File Types

RLAMA indexes these formats:
- Text: `.txt`, `.md`, `.markdown`
- Documents: `.pdf`, `.docx`, `.doc`
- Code: `.py`, `.js`, `.ts`, `.go`, `.rs`, `.java`, `.rb`, `.cpp`, `.c`, `.h`
- Data: `.json`, `.yaml`, `.yml`, `.csv`
- Web: `.html`, `.htm`
- Org-mode: `.org`

Example Workflows

Personal Knowledge Base

```bash
# Create from multiple folders
rlama rag llama3.2 personal-kb ~/Documents
rlama add-docs personal-kb ~/Notes
rlama add-docs personal-kb ~/Downloads/papers

# Query
rlama run personal-kb --query "what did I write about project management?"
```

Code Documentation

```bash
# Index project docs
rlama rag llama3.2 project-docs ./docs ./README.md

# Query architecture
rlama run project-docs --query "how does authentication work?" --context-size 25
```

Research Papers

```bash
# Create research RAG
rlama rag llama3.2 papers ~/Papers --exclude-ext=.bib

# Add specific paper
rlama add-docs papers ./new-paper.pdf

# Query with high context
rlama run papers --query "what methods are used for evaluation?" --context-size 30
```

Interactive Wizard

For guided RAG creation:

```bash
rlama wizard
```

Resilient Indexing (Skip Problem Files)

For folders with mixed content where some files may exceed embedding context limits (e.g., large PDFs), use the resilient script, which processes files individually and skips failures:

```bash
# Create RAG, skipping files that fail
python3 ~/.claude/skills/rlama/scripts/rlama_resilient.py create my-rag ~/Documents

# Add to existing RAG, skipping failures
python3 ~/.claude/skills/rlama/scripts/rlama_resilient.py add my-rag ~/MoreDocs

# With docs-only filter
python3 ~/.claude/skills/rlama/scripts/rlama_resilient.py create research ~/Papers --docs-only

# With legacy model
python3 ~/.claude/skills/rlama/scripts/rlama_resilient.py create my-rag ~/Docs --legacy
```

The script reports which files were added and which were skipped due to errors.

Progress Monitoring

Monitor long-running RLAMA operations in real-time using the logging system.

Tail the Log File

```bash
# Watch all operations in real-time
tail -f ~/.rlama/logs/rlama.log

# Filter by RAG name
tail -f ~/.rlama/logs/rlama.log | grep my-rag

# Pretty-print with jq
tail -f ~/.rlama/logs/rlama.log | jq -r '"\(.ts) [\(.cat)] \(.msg)"'

# Show only progress updates
tail -f ~/.rlama/logs/rlama.log | jq -r 'select(.data.i) | "\(.ts) [\(.cat)] \(.data.i)/\(.data.total) \(.data.file // .data.status)"'
```

Check Operation Status

```bash
# Show active operations
python3 ~/.claude/skills/rlama/scripts/rlama_status.py

# Show recent completed operations
python3 ~/.claude/skills/rlama/scripts/rlama_status.py --recent

# Show both active and recent
python3 ~/.claude/skills/rlama/scripts/rlama_status.py --all

# Follow mode (formatted tail -f)
python3 ~/.claude/skills/rlama/scripts/rlama_status.py --follow

# JSON output
python3 ~/.claude/skills/rlama/scripts/rlama_status.py --json
```

Log File Format

Logs are written in JSON Lines format to `~/.rlama/logs/rlama.log`:

```json
{"ts": "2026-02-03T12:34:56.789", "level": "info", "cat": "INGEST", "msg": "Progress 45/100", "data": {"op_id": "ingest_abc123", "i": 45, "total": 100, "file": "doc.pdf", "eta_sec": 85}}
```
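
Because each line is a standalone JSON object, custom monitoring is a few lines of Python. A minimal sketch that assumes only the fields shown in the sample line above:

```python
import json

# The sample log line from above
SAMPLE = '{"ts": "2026-02-03T12:34:56.789", "level": "info", "cat": "INGEST", "msg": "Progress 45/100", "data": {"op_id": "ingest_abc123", "i": 45, "total": 100, "file": "doc.pdf", "eta_sec": 85}}'

def progress_line(raw):
    """Format a progress update from one JSON Lines entry, or None if it isn't one."""
    entry = json.loads(raw)
    data = entry.get("data", {})
    if "i" not in data or "total" not in data:
        return None  # not a progress record
    pct = 100 * data["i"] / data["total"]
    return f"[{entry['cat']}] {data['i']}/{data['total']} ({pct:.0f}%) eta {data.get('eta_sec', '?')}s"

print(progress_line(SAMPLE))  # [INGEST] 45/100 (45%) eta 85s
```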

Operations State

Active and recent operations are tracked in `~/.rlama/logs/operations.json`:

```json
{
  "active": {
    "ingest_abc123": {
      "type": "ingest",
      "rag_name": "my-docs",
      "started": "2026-02-03T12:30:00",
      "processed": 45,
      "total": 100,
      "eta_sec": 85
    }
  },
  "recent": [...]
}
```
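
A small sketch of summarizing active operations from that file, using only the fields documented above (read the real file from `~/.rlama/logs/operations.json` in practice):

```python
import json

# Matches the documented operations.json structure
SAMPLE = """
{"active": {"ingest_abc123": {"type": "ingest", "rag_name": "my-docs",
 "started": "2026-02-03T12:30:00", "processed": 45, "total": 100, "eta_sec": 85}},
 "recent": []}
"""

def summarize_active(ops_json):
    """One-line summaries of active operations from operations.json content."""
    ops = json.loads(ops_json)
    return [
        f"{op_id}: {op['type']} {op['rag_name']} {op['processed']}/{op['total']} (eta {op['eta_sec']}s)"
        for op_id, op in ops.get("active", {}).items()
    ]

print(summarize_active(SAMPLE))
```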

Troubleshooting

"Ollama not found"

```bash
# Check Ollama status
ollama --version
ollama list

# Start Ollama
brew services start ollama   # macOS
ollama serve                 # Manual start
```

"Model not found"

```bash
# Pull the required model
ollama pull llama3.2
ollama pull nomic-embed-text   # Embedding model
```

Slow Indexing

索引速度慢

  • Use smaller embedding models
  • Exclude large binary files:
    --exclude-ext=.bin,.zip,.tar
  • Exclude build directories:
    --exclude-dir=node_modules,dist,build
  • 使用更小的嵌入模型
  • 排除大型二进制文件:
    --exclude-ext=.bin,.zip,.tar
  • 排除构建目录:
    --exclude-dir=node_modules,dist,build

Poor Query Results

1. Increase context size: `--context-size=30`
2. Use a better model: `-m deepseek-r1:8b`
3. Re-index with semantic chunking: `--chunking=semantic`
4. Enable reranking: `rlama add-reranker <rag-name>`

Index Corruption

```bash
# Delete and recreate
rm -rf ~/.rlama/<rag-name>
rlama rag llama3.2 <rag-name> <folder-path>
```

CLI Reference

Full command reference:

```bash
rlama --help
rlama <command> --help
```

Or see `references/rlama-commands.md` for complete documentation.