rlm
Recursive Language Models (RLM) CLI - enables LLMs to recursively process large contexts by decomposing inputs and calling themselves over parts. Use for code analysis, diff reviews, codebase exploration. Triggers on "rlm ask", "rlm complete", "rlm search", "rlm index".
Source: rawwerks/rlm-cli
NPX Install
npx skill4agent add rawwerks/rlm-cli rlm
RLM CLI
Recursive Language Models (RLM) CLI - enables LLMs to handle near-infinite context by recursively decomposing inputs and calling themselves over parts. Supports files, directories, URLs, and stdin.
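The decomposition idea described above can be illustrated with a toy sketch. This is not how rlm-cli is implemented: `recursive_query`, `fake_llm`, and the `max_chars` threshold are hypothetical names invented for this illustration, and the "model" is a plain keyword filter so the example runs offline.

```python
from typing import Callable


def recursive_query(
    context: str,
    question: str,
    llm: Callable[[str, str], str],
    max_chars: int = 200,
    depth: int = 1,
    max_depth: int = 3,
) -> str:
    """Toy recursion: split an oversized context, answer each half, then combine."""
    lines = context.splitlines()
    # Base case: the context fits, the depth limit is reached, or it cannot be split.
    if len(context) <= max_chars or depth >= max_depth or len(lines) < 2:
        return llm(context, question)
    # Recursive case: split on a line boundary and query each half separately.
    mid = len(lines) // 2
    halves = ["\n".join(lines[:mid]), "\n".join(lines[mid:])]
    partial = [
        recursive_query(half, question, llm, max_chars, depth + 1, max_depth)
        for half in halves
    ]
    # Combine the non-empty partial answers with one more call over them.
    return llm("\n".join(p for p in partial if p), question)


def fake_llm(context: str, question: str) -> str:
    """Stand-in for a model call: return the lines mentioning the question's last word."""
    keyword = question.split()[-1]
    return "\n".join(line for line in context.splitlines() if keyword in line)


log = "\n".join(
    f"ERROR at step {i}" if i % 10 == 0 else f"step {i} ok" for i in range(40)
)
answer = recursive_query(log, "List lines containing ERROR", fake_llm)
print(answer)
```

With `max_depth=1` the function makes a single direct call, which loosely mirrors the "1 = no recursion" default of `--max-depth` listed under Options.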
Installation
```bash
pip install rlm-cli    # or: pipx install rlm-cli
uvx rlm-cli ask ...    # run without installing
```
Set an API key for your backend (openrouter is default):
```bash
export OPENROUTER_API_KEY=...  # default backend
export OPENAI_API_KEY=...      # for --backend openai
export ANTHROPIC_API_KEY=...   # for --backend anthropic
```
Commands
ask - Query with context
```bash
rlm ask <inputs> -q "question"
```
Inputs (combinable):
| Type | Example | Notes |
|---|---|---|
| Directory | | Recursive, respects .gitignore |
| File | | Single file |
| URL | | Auto-converts to markdown |
| stdin | | |
| Literal | | Treat as raw text |
| Multiple | | Combine any types |
Options:
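| Flag | Description |
|---|---|
| `-q` | Question/prompt (required) |
| `--backend` | Provider: `openrouter` (default), `openai`, `anthropic` |
| | Model override (e.g. `google/gemini-3-flash-preview`) |
| `--json` | Machine-readable output |
| `--output-format` | Output format (e.g. `json-tree`) |
| `--summary` | Show execution summary with depth statistics |
| `--extensions` | Filter by extension |
| | Glob patterns |
| `--max-iterations` | Limit REPL iterations (default: 30) |
| `--max-depth` | Recursive RLM depth (default: 1 = no recursion) |
| `--max-budget` | Spending limit in USD (requires OpenRouter) |
| `--max-timeout` | Time limit in seconds |
| `--max-tokens` | Total token limit (input + output) |
| `--max-errors` | Consecutive error limit before stopping |
| | Skip auto-indexing |
| `--exa` | Enable Exa web search (requires `EXA_API_KEY`) |
| `--inject-file` | Execute Python code between iterations |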
JSON output structure:
```json
{"ok": true, "exit_code": 0, "result": {"response": "..."}, "stats": {...}}
```
JSON-tree output (`--output-format=json-tree`):
Adds execution tree showing nested RLM calls:
```json
{
  "result": {
    "response": "...",
    "tree": {
      "depth": 0,
      "model": "openai/gpt-4",
      "duration": 2.3,
      "cost": 0.05,
      "iterations": [...],
      "children": [...]
    }
  }
}
```
Summary output (`--summary`):
Shows depth-wise statistics after completion:
- JSON mode: adds `summary` field to `stats`
- Text mode: prints summary to stderr
```
=== RLM Execution Summary ===
Total depth: 2 | Nodes: 3 | Cost: $0.0054 | Duration: 17.38s
Depth 0: 1 call(s) ($0.0047, 13.94s)
Depth 1: 2 call(s) ($0.0007, 3.44s)
```
complete - Query without context
```bash
rlm complete "prompt text"
rlm complete "Generate SQL" --json --backend openai
```
search - Search indexed files
```bash
rlm search "query" [options]
```
| Flag | Description |
|---|---|
| | Max results (default: 20) |
| | Filter by language |
| `--paths-only` | Output file paths only |
| `--json` | JSON output |

Auto-indexes on first use. Manual index: `rlm index .`
index - Build search index
```bash
rlm index .              # Index current dir
rlm index ./src --force  # Force full reindex
```
doctor - Check setup
```bash
rlm doctor         # Check config, API keys, deps
rlm doctor --json
```
Workflows
Git diff review:
```bash
git diff | rlm ask - -q "Review for bugs"
git diff --cached | rlm ask - -q "Ready to commit?"
git diff HEAD~3 | rlm ask - -q "Summarize changes"
```
Codebase analysis:
```bash
rlm ask . -q "Explain architecture"
rlm ask src/ -q "How does auth work?" --extensions .py
```
Search + analyze:
```bash
rlm search "database" --paths-only
rlm ask src/db.py -q "How is connection pooling done?"
```
Compare files:
```bash
rlm ask old.py new.py -q "What changed?"
```
Configuration
Precedence: CLI flags > env vars > config file > defaults
Config locations: `./rlm.yaml`, `./.rlm.yaml`, `~/.config/rlm/config.yaml`
```yaml
backend: openrouter
model: google/gemini-3-flash-preview
max_iterations: 30
```
Environment variables:
- `RLM_BACKEND` - Default backend
- `RLM_MODEL` - Default model
- `RLM_CONFIG` - Config file path
- `RLM_JSON=1` - Always output JSON
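The precedence rule above (CLI flags > env vars > config file > defaults) can be sketched as a small lookup. This is an illustration of the ordering only, not rlm-cli's actual loader: `resolve_setting` and `DEFAULTS` are hypothetical names, and only the `RLM_BACKEND` and `RLM_MODEL` variables documented above are exercised.

```python
import os
from typing import Any, Mapping, Optional

# Defaults stated in this document; anything else resolves to None in this sketch.
DEFAULTS = {"backend": "openrouter", "max_iterations": 30}


def resolve_setting(
    name: str,
    cli_args: Mapping[str, Any],
    config_file: Mapping[str, Any],
    env: Optional[Mapping[str, str]] = None,
) -> Any:
    """Return the first value found, checking sources from highest to lowest priority."""
    env = os.environ if env is None else env
    # 1. An explicitly passed CLI flag always wins.
    if cli_args.get(name) is not None:
        return cli_args[name]
    # 2. Environment variable, assumed here to follow the RLM_<NAME> pattern.
    env_key = f"RLM_{name.upper()}"
    if env_key in env:
        return env[env_key]
    # 3. Value from the parsed config file.
    if name in config_file:
        return config_file[name]
    # 4. Built-in default.
    return DEFAULTS.get(name)


config = {"backend": "openai", "model": "google/gemini-3-flash-preview"}
env = {"RLM_BACKEND": "anthropic"}

backend = resolve_setting("backend", {}, config, env)
model = resolve_setting("model", {}, config, env)
forced = resolve_setting("backend", {"backend": "openrouter"}, config, env)
print(backend, model, forced)
```

Here the environment variable overrides the config file for `backend`, the config file supplies `model`, and an explicit CLI value overrides both.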
Recursion and Budget Limits
Recursive RLM (`--max-depth`)
Enable recursive `llm_query()` calls where child RLMs process sub-tasks:
```bash
# 2 levels of recursion
rlm ask . -q "Research thoroughly" --max-depth 2
# With budget cap
rlm ask . -q "Analyze codebase" --max-depth 3 --max-budget 0.50
```
Budget Control (`--max-budget`)
Limit spending per completion. Raises `BudgetExceededError` when exceeded:
```bash
# Cap at $1.00
rlm complete "Complex task" --max-budget 1.00
# Very low budget (will likely exceed)
rlm ask . -q "Analyze everything" --max-budget 0.001
```
Requirements: OpenRouter backend (returns cost data in responses).
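To see where a recursive run spent its budget, the `json-tree` output can be aggregated per depth after the fact. The sketch below assumes child nodes carry the same `depth`, `cost`, and `children` fields as the root node shown in the JSON-tree example; the sample tree and its cost values are made up for illustration, and `summarize_tree` is a hypothetical helper, not part of rlm-cli.

```python
from collections import defaultdict
from typing import Any, Dict, Optional


def summarize_tree(
    node: Dict[str, Any],
    stats: Optional[Dict[int, Dict[str, float]]] = None,
) -> Dict[int, Dict[str, float]]:
    """Walk the execution tree and accumulate call count and cost per depth."""
    if stats is None:
        stats = defaultdict(lambda: {"calls": 0, "cost": 0.0})
    entry = stats[node["depth"]]
    entry["calls"] += 1
    entry["cost"] += node.get("cost") or 0.0  # treat a missing cost as zero
    for child in node.get("children") or []:
        summarize_tree(child, stats)
    return stats


# Made-up tree: one root call that spawned two child calls.
tree = {
    "depth": 0,
    "cost": 0.0047,
    "children": [
        {"depth": 1, "cost": 0.0004, "children": []},
        {"depth": 1, "cost": 0.0003, "children": []},
    ],
}

stats = summarize_tree(tree)
total = sum(entry["cost"] for entry in stats.values())
over_budget = total > 0.50  # compare against the value passed to --max-budget

for depth in sorted(stats):
    print(f"Depth {depth}: {stats[depth]['calls']} call(s) (${stats[depth]['cost']:.4f})")
print(f"Total: ${total:.4f} | over budget: {over_budget}")
```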
Other Limits
Timeout (`--max-timeout`) - Stop after N seconds:
```bash
rlm complete "Complex task" --max-timeout 30
```
Token limit (`--max-tokens`) - Stop after N total tokens:
```bash
rlm ask . -q "Analyze" --max-tokens 10000
```
Error threshold (`--max-errors`) - Stop after N consecutive code errors:
```bash
rlm complete "Write code" --max-errors 3
```
Stop Conditions
RLM execution stops when any of these occur:
- Final answer - LLM calls `FINAL_VAR("variable_name")` with the NAME of a variable (as a string)
- Max iterations - Exceeds `--max-iterations` (exit code 0, graceful - forces final answer)

FINAL_VAR usage (common mistake - pass variable NAME, not value):
```python
# CORRECT:
result = {"answer": "hello", "score": 42}
FINAL_VAR("result")  # pass the variable NAME as a string
# WRONG:
FINAL_VAR(result)    # passing the dict directly causes AttributeError
```
- Max budget exceeded - Spending > `--max-budget` (exit code 20, error)
- Max timeout exceeded - Time > `--max-timeout` (exit code 20, error with partial answer)
- Max tokens exceeded - Tokens > `--max-tokens` (exit code 20, error with partial answer)
- Max errors exceeded - Consecutive errors > `--max-errors` (exit code 20, error with partial answer)
- User cancellation - Ctrl+C or SIGUSR1 (exit code 0, returns partial answer as success)
- Max depth reached - Child RLM at depth 0 cannot recurse further
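One plausible reason `FINAL_VAR` needs a name rather than a value is that it looks the variable up in the REPL namespace by string. The sketch below is a guess at that mechanism for illustration only: `final_var` and `repl_vars` are hypothetical stand-ins, not rlm-cli's real implementation.

```python
from typing import Any, Dict


def final_var(name: str, namespace: Dict[str, Any]) -> Any:
    """Hypothetical stand-in for FINAL_VAR: resolve a variable by its name."""
    # String methods are called on the argument, so a dict fails here with AttributeError.
    cleaned = name.strip().strip("\"'")
    if cleaned not in namespace:
        raise KeyError(f"no variable named {cleaned!r} in the REPL namespace")
    return namespace[cleaned]


repl_vars = {"result": {"answer": "hello", "score": 42}}

# Correct: pass the variable NAME.
final = final_var("result", repl_vars)

# Wrong: pass the value itself.
try:
    final_var(repl_vars["result"], repl_vars)
    error = None
except AttributeError as exc:
    error = type(exc).__name__

print(final, error)
```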
Note on max iterations: This is a soft limit. When exceeded, RLM prompts the LLM one more time to provide a final answer. Modern LLMs typically complete in 1-2 iterations.
Partial answers: When timeout, tokens, or errors stop execution, the error includes `partial_answer` if any response was generated before stopping.
Early exit (Ctrl+C): Pressing Ctrl+C (or sending SIGUSR1) returns the partial answer as success (exit code 0) with `early_exit: true` in the result.
Inject File (`--inject-file`)
Update REPL variables mid-run by modifying an inject file:
```bash
# Create inject file
echo 'focus = "authentication"' > inject.py
# Run with inject file
rlm ask . -q "Analyze based on 'focus'" --inject-file inject.py
# In another terminal, update mid-run
echo 'focus = "authorization"' > inject.py
```
The file is checked before each iteration and executed if modified.
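The check-then-execute behaviour described above can be sketched as follows. This is a minimal illustration, not rlm-cli's code: the `InjectFile` class is hypothetical, and it detects changes by hashing the file contents, which is an assumption (the real tool may compare modification times instead).

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional


class InjectFile:
    """Re-run a Python file in a namespace whenever its contents change."""

    def __init__(self, path: Path) -> None:
        self.path = Path(path)
        self._last_digest: Optional[str] = None

    def apply_if_modified(self, namespace: Dict[str, Any]) -> bool:
        """Execute the file if it changed since the last call; return True if it ran."""
        if not self.path.exists():
            return False
        source = self.path.read_text()
        digest = hashlib.sha256(source.encode()).hexdigest()
        if digest == self._last_digest:
            return False  # unchanged since the previous iteration
        exec(source, namespace)  # runs arbitrary code: only use with trusted files
        self._last_digest = digest
        return True


with tempfile.TemporaryDirectory() as tmp:
    inject_path = Path(tmp) / "inject.py"
    inject = InjectFile(inject_path)
    repl_vars: Dict[str, Any] = {}

    inject_path.write_text('focus = "authentication"\n')
    first = inject.apply_if_modified(repl_vars)      # file is new, so it runs

    unchanged = inject.apply_if_modified(repl_vars)  # same contents, skipped

    inject_path.write_text('focus = "authorization"\n')
    second = inject.apply_if_modified(repl_vars)     # contents changed, runs again

print(first, unchanged, second, repl_vars["focus"])
```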
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success |
| 2 | CLI usage error |
| 10 | Input error (file not found) |
| 11 | Config error (missing API key) |
| 20 | Backend/API error (includes budget exceeded) |
| 30 | Runtime error |
| 40 | Index/search error |
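A calling script can branch on these exit codes before touching the JSON payload. The sketch below uses the exit-code table above and the `--json` structure shown earlier; `interpret` is a hypothetical helper, and since this document does not specify what is written to stdout on failure, the sketch does not parse stdout for non-zero codes.

```python
import json

# Meanings copied from the exit-code table above.
EXIT_CODES = {
    0: "Success",
    2: "CLI usage error",
    10: "Input error (file not found)",
    11: "Config error (missing API key)",
    20: "Backend/API error (includes budget exceeded)",
    30: "Runtime error",
    40: "Index/search error",
}


def interpret(stdout: str, returncode: int) -> str:
    """Return the model response on success, or a readable failure message."""
    meaning = EXIT_CODES.get(returncode, f"Unknown exit code {returncode}")
    if returncode != 0:
        return f"rlm failed: {meaning}"
    payload = json.loads(stdout)
    return payload["result"]["response"]


# Sample payload shaped like the documented --json output.
sample = '{"ok": true, "exit_code": 0, "result": {"response": "hi"}, "stats": {}}'
print(interpret(sample, 0))
print(interpret("", 20))
```

In practice the two arguments would come from something like `subprocess.run(["rlm", "ask", ".", "-q", "Explain architecture", "--json"], capture_output=True, text=True)`, passing its `stdout` and `returncode`.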
LLM Search Tools
When `rlm ask` runs on a directory, the LLM gets search tools:
| Tool | Cost | Privacy | Use For |
|---|---|---|---|
| `rg.search()` | Free | Local | Exact patterns, function names, imports |
| `tv.search()` | Free | Local | Topics, concepts, related files |
| `exa.search()` | $ | API | Web search (requires `--exa` flag) |
| `pi.*` | $$$ | API | Hierarchical PDF/document navigation |
Free Local Tools (auto-loaded)
- rg.search(pattern, paths, globs) - ripgrep for exact patterns
- tv.search(query, limit) - Tantivy BM25 for concepts
Exa Web Search (--exa flag, Costs Money)
⚠️ Opt-in: Requires `--exa` flag and `EXA_API_KEY` environment variable.
Setup:
```bash
export EXA_API_KEY=...  # Get from https://exa.ai
```
Usage in REPL:
```python
from rlm_cli.tools_search import exa, web

# Basic search
results = exa.search(query="Python async patterns", limit=5)
for r in results:
    print(f"{r['title']}: {r['url']}")

# With highlights (relevant excerpts)
results = exa.search(
    query="error handling best practices",
    limit=3,
    include_highlights=True
)

# Semantic alias
results = web(query="machine learning tutorial", limit=5)

# Find similar pages
results = exa.find_similar(url="https://example.com/article", limit=5)
```
`exa.search()` parameters:
| Param | Default | Description |
|---|---|---|
| `query` | required | Search query |
| `limit` | 10 | Max results |
| | "auto" | "auto", "neural", or "keyword" |
| | None | Only these domains |
| | None | Exclude these domains |
| | False | Include full page text |
| `include_highlights` | True | Include relevant excerpts |
| | None | "company", "research paper", "news", etc. |
When to use exa.search() / web():
- Finding external documentation, tutorials, articles
- Researching topics beyond the local codebase
- Finding similar pages to a reference URL
PageIndex (pi.* - Opt-in, Costs Money)
⚠️ WARNING: PageIndex sends document content to LLM APIs and costs money.
Only use when:
- User explicitly requests document/PDF analysis
- Document has hierarchical structure (reports, manuals)
- User accepts cost/privacy tradeoffs
Prerequisites:
- `OPENROUTER_API_KEY` (or other backend key) must be set in environment
- PageIndex submodule must be initialized
- Run within rlm-cli's virtual environment (has required dependencies)

Setup (REQUIRED before any pi.* operation):
```python
import sys
sys.path.insert(0, "/path/to/rlm-cli/rlm")        # rlm submodule
sys.path.insert(0, "/path/to/rlm-cli/pageindex")  # pageindex submodule

from rlm.clients import get_client
from rlm_cli.tools_pageindex import pi

# Configure with existing rlm backend
client = get_client(backend="openrouter", backend_kwargs={"model_name": "google/gemini-2.0-flash-001"})
pi.configure(client)
```
Indexing (costs $$$):
```python
# Build tree index - THIS COSTS MONEY (no caching, re-indexes each call)
tree = pi.index(path="report.pdf")
# Returns: PITree object with doc_name, nodes, doc_description, raw
```
Viewing structure (free after indexing):
```python
# Display table of contents
print(pi.toc(tree))

# Get section by node_id (IDs are "0000", "0001", "0002", etc.)
section = pi.get_section(tree, "0003")
# Returns: PINode with title, node_id, start_index, end_index, summary, children
# Returns: None if not found
if section:
    print(f"{section.title}: pages {section.start_index}-{section.end_index}")
```
Finding node IDs:
Node IDs are assigned sequentially ("0000", "0001", ...) in tree traversal order.
To see all node IDs, access the raw tree structure:
```python
import json
print(json.dumps(tree.raw["structure"], indent=2))
# Each node has: title, node_id, start_index, end_index
```
pi.* API Reference:
| Method | Cost | Returns | Description |
|---|---|---|---|
| `pi.configure(client)` | Free | None | Set rlm backend (REQUIRED first) |
| | Free | dict | Check availability, config, warning |
| `pi.index(path)` | $$$ | PITree | Build tree from PDF |
| `pi.toc(tree)` | Free | str | Formatted table of contents |
| `pi.get_section(tree, node_id)` | Free | PINode or None | Get section by ID |
| | Free | bool | Check if PageIndex installed |
| | Free | bool | Check if client configured |
PITree attributes: `doc_name`, `nodes` (list of PINode), `doc_description`, `raw` (dict)
PINode attributes: `title`, `node_id`, `start_index`, `end_index`, `summary` (may be None), `children` (may be None)
Notes:
- `summary` is only populated if `add_summaries=True` in `pi.index()`
- `children` is None for leaf nodes (sections with no subsections)
- `tree.raw["structure"]` is a flat list; hierarchy is in PINode.children
- PageIndex extracts document structure (TOC), not content. Use page numbers to locate sections in the original PDF.
Example output from `pi.toc()`:
```
📄 annual_report.pdf
• Executive Summary (p.1-5)
• Financial Overview (p.6-20)
  • Revenue (p.6-10)
  • Expenses (p.11-15)
  • Projections (p.16-20)
• Risk Factors (p.21-35)
```
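The notes above say the hierarchy lives in `PINode.children`, with `None` for leaf nodes. A recursive walk over that shape might look like the sketch below. `Node` is a stand-in dataclass that mirrors the documented PINode attributes so the example runs without PageIndex installed, and `outline` is a hypothetical helper, not the implementation of `pi.toc()`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Stand-in with the PINode attributes listed above (not the real class)."""
    title: str
    node_id: str
    start_index: int
    end_index: int
    summary: Optional[str] = None
    children: Optional[List["Node"]] = None


def outline(nodes: List[Node], level: int = 0) -> List[str]:
    """Return one indented line per section, depth-first."""
    lines: List[str] = []
    for node in nodes:
        indent = "  " * level
        lines.append(f"{indent}• {node.title} (p.{node.start_index}-{node.end_index})")
        # Leaf nodes have children=None, so fall back to an empty list.
        lines.extend(outline(node.children or [], level + 1))
    return lines


# Sample data modelled on the example output above.
nodes = [
    Node("Executive Summary", "0000", 1, 5),
    Node("Financial Overview", "0001", 6, 20, children=[
        Node("Revenue", "0002", 6, 10),
        Node("Expenses", "0003", 11, 15),
        Node("Projections", "0004", 16, 20),
    ]),
    Node("Risk Factors", "0005", 21, 35),
]

lines = outline(nodes)
print("\n".join(lines))
```

With a real tree, the same function could be called as `outline(tree.nodes)`, assuming `tree.nodes` holds the top-level PINode objects.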