# RLM CLI

Recursive Language Models (RLM) CLI - enables LLMs to handle near-infinite context by recursively decomposing inputs and calling themselves over parts. Supports files, directories, URLs, and stdin.

## Installation

```bash
pip install rlm-cli    # or: pipx install rlm-cli
uvx rlm-cli ask ...    # run without installing
```

Set an API key for your backend (`openrouter` is the default):

```bash
export OPENROUTER_API_KEY=...  # default backend
export OPENAI_API_KEY=...      # for --backend openai
export ANTHROPIC_API_KEY=...   # for --backend anthropic
```

## Commands

### ask - Query with context

```bash
rlm ask <inputs> -q "question"
```

Inputs (combinable):

| Type | Example | Notes |
|------|---------|-------|
| Directory | `rlm ask . -q "..."` | Recursive, respects `.gitignore` |
| File | `rlm ask main.py -q "..."` | Single file |
| URL | `rlm ask https://x.com -q "..."` | Auto-converts to markdown |
| stdin | `git diff \| rlm ask - -q "..."` | `-` reads from pipe |
| Literal | `rlm ask "text" -q "..." --literal` | Treat as raw text |
| Multiple | `rlm ask a.py b.py -q "..."` | Combine any types |

Options:

| Flag | Description |
|------|-------------|
| `-q "..."` | Question/prompt (required) |
| `--backend` | Provider: `openrouter` (default), `openai`, `anthropic` |
| `--model NAME` | Model override (format: `provider/model` or just `model`) |
| `--json` | Machine-readable output |
| `--output-format` | Output format: `text`, `json`, or `json-tree` |
| `--summary` | Show execution summary with depth statistics |
| `--extensions .py .ts` | Filter by extension |
| `--include/--exclude` | Glob patterns |
| `--max-iterations N` | Limit REPL iterations (default: 30) |
| `--max-depth N` | Recursive RLM depth (default: 1 = no recursion) |
| `--max-budget N.NN` | Spending limit in USD (requires OpenRouter) |
| `--max-timeout N` | Time limit in seconds |
| `--max-tokens N` | Total token limit (input + output) |
| `--max-errors N` | Consecutive error limit before stopping |
| `--no-index` | Skip auto-indexing |
| `--exa` | Enable Exa web search (requires `EXA_API_KEY`) |
| `--inject-file FILE` | Execute Python code between iterations |

JSON output structure:

```json
{"ok": true, "exit_code": 0, "result": {"response": "..."}, "stats": {...}}
```

JSON-tree output (`--output-format=json-tree`) adds an execution tree showing nested RLM calls:

```json
{
  "result": {
    "response": "...",
    "tree": {
      "depth": 0,
      "model": "openai/gpt-4",
      "duration": 2.3,
      "cost": 0.05,
      "iterations": [...],
      "children": [...]
    }
  }
}
```

Summary output (`--summary`) shows depth-wise statistics after completion:

- JSON mode: adds a `summary` field to `stats`
- Text mode: prints the summary to stderr

```
=== RLM Execution Summary ===
Total depth: 2 | Nodes: 3 | Cost: $0.0054 | Duration: 17.38s
Depth 0: 1 call(s) ($0.0047, 13.94s)
Depth 1: 2 call(s) ($0.0007, 3.44s)
```
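For scripting, the `--json` envelope can be consumed directly. A minimal sketch that parses the documented shape; the sample payload below is illustrative, not live `rlm` output:

```python
import json

# Sample payload in the documented envelope shape (not live rlm output)
raw = '{"ok": true, "exit_code": 0, "result": {"response": "3 modules found"}, "stats": {"cost": 0.0054}}'

payload = json.loads(raw)
if payload["ok"] and payload["exit_code"] == 0:
    print(payload["result"]["response"])   # the model's answer
    print(payload["stats"].get("cost"))    # cost, if the backend reports it
```

In practice the payload would come from `rlm ask ... --json`, e.g. read from a pipe or `subprocess.run(...).stdout`.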

### complete - Query without context

```bash
rlm complete "prompt text"
rlm complete "Generate SQL" --json --backend openai
```

### search - Search indexed files

```bash
rlm search "query" [options]
```

| Flag | Description |
|------|-------------|
| `--limit N` | Max results (default: 20) |
| `--language python` | Filter by language |
| `--paths-only` | Output file paths only |
| `--json` | JSON output |

Auto-indexes on first use. Manual index: `rlm index .`

### index - Build search index

```bash
rlm index .              # Index current dir
rlm index ./src --force  # Force full reindex
```

### doctor - Check setup

```bash
rlm doctor       # Check config, API keys, deps
rlm doctor --json
```

## Workflows

Git diff review:

```bash
git diff | rlm ask - -q "Review for bugs"
git diff --cached | rlm ask - -q "Ready to commit?"
git diff HEAD~3 | rlm ask - -q "Summarize changes"
```

Codebase analysis:

```bash
rlm ask . -q "Explain architecture"
rlm ask src/ -q "How does auth work?" --extensions .py
```

Search + analyze:

```bash
rlm search "database" --paths-only
rlm ask src/db.py -q "How is connection pooling done?"
```

Compare files:

```bash
rlm ask old.py new.py -q "What changed?"
```

## Configuration

Precedence: CLI flags > env vars > config file > defaults

Config locations: `./rlm.yaml`, `./.rlm.yaml`, `~/.config/rlm/config.yaml`

```yaml
backend: openrouter
model: google/gemini-3-flash-preview
max_iterations: 30
```

Environment variables:

- `RLM_BACKEND` - Default backend
- `RLM_MODEL` - Default model
- `RLM_CONFIG` - Config file path
- `RLM_JSON=1` - Always output JSON
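The precedence chain can be sketched as a single lookup. This is an illustration of the documented order, not rlm's actual resolver (the `resolve` helper is hypothetical):

```python
import os

def resolve(cli_value, env_var, config, key, default):
    """Return the first value found: CLI flag > env var > config file > default."""
    if cli_value is not None:
        return cli_value
    env_value = os.environ.get(env_var)
    if env_value:
        return env_value
    if key in config:
        return config[key]
    return default

config = {"backend": "openrouter"}  # as parsed from rlm.yaml
# Config wins when no flag is given and RLM_BACKEND is unset:
print(resolve(None, "RLM_BACKEND", config, "backend", "openrouter"))
# A CLI flag beats everything:
print(resolve("openai", "RLM_BACKEND", config, "backend", "openrouter"))
```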

## Recursion and Budget Limits

### Recursive RLM (--max-depth)

Enable recursive `llm_query()` calls where child RLMs process sub-tasks:

```bash
# 2 levels of recursion
rlm ask . -q "Research thoroughly" --max-depth 2

# With budget cap
rlm ask . -q "Analyze codebase" --max-depth 3 --max-budget 0.50
```

### Budget Control (--max-budget)

Limit spending per completion. Raises `BudgetExceededError` when the limit is exceeded:

```bash
# Cap at $1.00
rlm complete "Complex task" --max-budget 1.00

# Very low budget (will likely exceed)
rlm ask . -q "Analyze everything" --max-budget 0.001
```

**Requirements:** OpenRouter backend (returns cost data in responses).
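Conceptually, the budget check accumulates each completion's reported cost and raises once it passes the cap. A toy sketch of that behavior (the `BudgetGuard` class is hypothetical; only the `BudgetExceededError` name comes from the docs):

```python
class BudgetExceededError(RuntimeError):
    """Raised when accumulated spend passes the --max-budget cap."""

class BudgetGuard:
    def __init__(self, max_budget):
        self.max_budget = max_budget
        self.spent = 0.0

    def record(self, cost):
        """Add one completion's reported cost; raise if over budget."""
        self.spent += cost
        if self.spent > self.max_budget:
            raise BudgetExceededError(
                f"spent ${self.spent:.4f}, cap is ${self.max_budget:.2f}"
            )

guard = BudgetGuard(max_budget=0.50)
guard.record(0.30)  # under budget, fine
try:
    guard.record(0.30)  # total exceeds 0.50
except BudgetExceededError as e:
    print("stopped:", e)
```

This also shows why OpenRouter is required: the guard needs a per-response cost figure to accumulate.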

## Other Limits

Timeout (`--max-timeout`) - Stop after N seconds:

```bash
rlm complete "Complex task" --max-timeout 30
```

Token limit (`--max-tokens`) - Stop after N total tokens:

```bash
rlm ask . -q "Analyze" --max-tokens 10000
```

Error threshold (`--max-errors`) - Stop after N consecutive code errors:

```bash
rlm complete "Write code" --max-errors 3
```

## Stop Conditions

RLM execution stops when any of these occur:

1. **Final answer** - LLM calls `FINAL_VAR("variable_name")` with the NAME of a variable (as a string)
2. **Max iterations** - Exceeds `--max-iterations` (exit code 0, graceful - forces a final answer)
3. **Max budget exceeded** - Spending > `--max-budget` (exit code 20, error)
4. **Max timeout exceeded** - Time > `--max-timeout` (exit code 20, error with partial answer)
5. **Max tokens exceeded** - Tokens > `--max-tokens` (exit code 20, error with partial answer)
6. **Max errors exceeded** - Consecutive errors > `--max-errors` (exit code 20, error with partial answer)
7. **User cancellation** - Ctrl+C or SIGUSR1 (exit code 0, returns partial answer as success)
8. **Max depth reached** - Child RLM at depth 0 cannot recurse further

FINAL_VAR usage (common mistake - pass the variable NAME, not its value):

```python
# CORRECT: pass the variable NAME as a string
result = {"answer": "hello", "score": 42}
FINAL_VAR("result")

# WRONG: passing the dict directly causes AttributeError
FINAL_VAR(result)
```

**Note on max iterations:** This is a soft limit. When exceeded, RLM prompts the LLM one more time to provide a final answer. Modern LLMs typically complete in 1-2 iterations.

**Partial answers:** When timeout, tokens, or errors stop execution, the error includes `partial_answer` if any response was generated before stopping.

**Early exit (Ctrl+C):** Pressing Ctrl+C (or sending SIGUSR1) returns the partial answer as success (exit code 0) with `early_exit: true` in the result.
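The name-vs-value distinction exists because the REPL resolves the string against its own variable namespace. A toy model of that lookup (illustrative only, not rlm-cli's implementation; the exact exception raised on misuse may differ):

```python
# The REPL keeps a namespace of variables the LLM has defined
namespace = {"result": {"answer": "hello", "score": 42}}

def final_var(name):
    """Resolve a variable NAME (a string) in the REPL namespace."""
    if not isinstance(name, str):
        raise TypeError("pass the variable NAME as a string, not the value itself")
    return namespace[name]

print(final_var("result"))  # works: looks up 'result' by name
# final_var(namespace["result"])  # misuse: passes the dict itself, not its name
```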

## Inject File (--inject-file)

Update REPL variables mid-run by modifying an inject file:

```bash
# Create inject file
echo 'focus = "authentication"' > inject.py

# Run with inject file
rlm ask . -q "Analyze based on 'focus'" --inject-file inject.py

# In another terminal, update mid-run
echo 'focus = "authorization"' > inject.py
```

The file is checked before each iteration and executed if modified.
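The check-before-each-iteration behavior can be sketched as an mtime comparison plus `exec`. This is a toy model of the documented mechanism, not rlm-cli's source (`maybe_inject` is a hypothetical helper):

```python
import os
import tempfile

def maybe_inject(path, namespace, last_mtime):
    """Re-execute the inject file into the namespace if it changed on disk."""
    mtime = os.path.getmtime(path)
    if mtime != last_mtime:
        with open(path) as f:
            exec(f.read(), namespace)
    return mtime

ns = {}
fd, path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "w") as f:
    f.write('focus = "authentication"')

seen = maybe_inject(path, ns, last_mtime=None)   # first check: file is new, executes
print(ns["focus"])
seen = maybe_inject(path, ns, last_mtime=seen)   # unchanged: skipped
os.unlink(path)
```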

## Exit Codes

| Code | Meaning |
|------|---------|
| 0 | Success |
| 2 | CLI usage error |
| 10 | Input error (file not found) |
| 11 | Config error (missing API key) |
| 20 | Backend/API error (includes budget exceeded) |
| 30 | Runtime error |
| 40 | Index/search error |
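Wrapper scripts can branch on these codes. A minimal lookup built from the table above (the `describe` helper is illustrative):

```python
# Documented rlm exit codes
EXIT_CODES = {
    0: "Success",
    2: "CLI usage error",
    10: "Input error (file not found)",
    11: "Config error (missing API key)",
    20: "Backend/API error (includes budget exceeded)",
    30: "Runtime error",
    40: "Index/search error",
}

def describe(code):
    return EXIT_CODES.get(code, f"Unknown exit code {code}")

print(describe(20))
```

In a wrapper, pair it with the return code of the subprocess, e.g. `describe(subprocess.run([...]).returncode)`.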

## LLM Search Tools

When `rlm ask` runs on a directory, the LLM gets search tools:

| Tool | Cost | Privacy | Use For |
|------|------|---------|---------|
| `rg.search()` | Free | Local | Exact patterns, function names, imports |
| `tv.search()` | Free | Local | Topics, concepts, related files |
| `exa.search()` | $ | API | Web search (requires `--exa` flag) |
| `pi.*` | $$$ | API | Hierarchical PDF/document navigation |

### Free Local Tools (auto-loaded)

- `rg.search(pattern, paths, globs)` - ripgrep for exact patterns
- `tv.search(query, limit)` - Tantivy BM25 for concepts

### Exa Web Search (--exa flag, Costs Money)

⚠️ Opt-in: Requires the `--exa` flag and the `EXA_API_KEY` environment variable.

Setup:

```bash
export EXA_API_KEY=...  # Get from https://exa.ai
```

Usage in REPL:

```python
from rlm_cli.tools_search import exa, web

# Basic search
results = exa.search(query="Python async patterns", limit=5)
for r in results:
    print(f"{r['title']}: {r['url']}")

# With highlights (relevant excerpts)
results = exa.search(
    query="error handling best practices",
    limit=3,
    include_highlights=True,
)

# Semantic alias
results = web(query="machine learning tutorial", limit=5)

# Find similar pages
results = exa.find_similar(url="https://example.com/article", limit=5)
```

**exa.search() parameters:**
| Param | Default | Description |
|-------|---------|-------------|
| `query` | required | Search query |
| `limit` | 10 | Max results |
| `search_type` | "auto" | "auto", "neural", or "keyword" |
| `include_domains` | None | Only these domains |
| `exclude_domains` | None | Exclude these domains |
| `include_text` | False | Include full page text |
| `include_highlights` | True | Include relevant excerpts |
| `category` | None | "company", "research paper", "news", etc. |

**When to use exa.search() / web():**
- Finding external documentation, tutorials, articles
- Researching topics beyond the local codebase
- Finding similar pages to a reference URL
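When `include_domains` wasn't passed up front, results can still be post-filtered client-side via the documented `url` key. A small sketch over sample data (not live exa output; `from_domain` is a hypothetical helper):

```python
from urllib.parse import urlparse

# Sample results in the documented shape (title/url keys)
results = [
    {"title": "asyncio docs", "url": "https://docs.python.org/3/library/asyncio.html"},
    {"title": "Random blog", "url": "https://example.com/async-post"},
]

def from_domain(results, suffix):
    """Keep results whose host ends with the given domain suffix."""
    return [r for r in results if urlparse(r["url"]).netloc.endswith(suffix)]

print([r["title"] for r in from_domain(results, "python.org")])
```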

### PageIndex (pi.* - Opt-in, Costs Money)

⚠️ WARNING: PageIndex sends document content to LLM APIs and costs money.

Only use when:

1. User explicitly requests document/PDF analysis
2. Document has hierarchical structure (reports, manuals)
3. User accepts cost/privacy tradeoffs

Prerequisites:

- `OPENROUTER_API_KEY` (or other backend key) must be set in the environment
- PageIndex submodule must be initialized
- Run within rlm-cli's virtual environment (has required dependencies)

**Setup (REQUIRED before any pi.* operation):**

```python
import sys
sys.path.insert(0, "/path/to/rlm-cli/rlm")        # rlm submodule
sys.path.insert(0, "/path/to/rlm-cli/pageindex")  # pageindex submodule

from rlm.clients import get_client
from rlm_cli.tools_pageindex import pi

# Configure with existing rlm backend
client = get_client(backend="openrouter", backend_kwargs={"model_name": "google/gemini-2.0-flash-001"})
pi.configure(client)
```

**Indexing (costs $$$):**

```python
# Build tree index - THIS COSTS MONEY (no caching, re-indexes each call)
tree = pi.index(path="report.pdf")
# Returns: PITree object with doc_name, nodes, doc_description, raw
```

**Viewing structure (free after indexing):**

```python
# Display table of contents
print(pi.toc(tree))

# Get section by node_id (IDs are "0000", "0001", "0002", etc.)
# Returns: PINode with title, node_id, start_index, end_index, summary, children
# Returns: None if not found
section = pi.get_section(tree, "0003")
if section:
    print(f"{section.title}: pages {section.start_index}-{section.end_index}")
```

**Finding node IDs:**
Node IDs are assigned sequentially ("0000", "0001", ...) in tree traversal order. To see all node IDs, access the raw tree structure:

```python
import json
print(json.dumps(tree.raw["structure"], indent=2))
# Each node has: title, node_id, start_index, end_index
```


**pi.* API Reference:**
| Method | Cost | Returns | Description |
|--------|------|---------|-------------|
| `pi.configure(client)` | Free | None | Set rlm backend (REQUIRED first) |
| `pi.status()` | Free | dict | Check availability, config, warning |
| `pi.index(path=str)` | $$$ | PITree | Build tree from PDF |
| `pi.toc(tree, max_depth=3)` | Free | str | Formatted table of contents |
| `pi.get_section(tree, node_id)` | Free | PINode or None | Get section by ID |
| `pi.available()` | Free | bool | Check if PageIndex installed |
| `pi.configured()` | Free | bool | Check if client configured |

**PITree attributes:** `doc_name`, `nodes` (list of PINode), `doc_description`, `raw` (dict)
**PINode attributes:** `title`, `node_id`, `start_index`, `end_index`, `summary` (may be None), `children` (may be None)

**Notes:**
- `summary` is only populated if `add_summaries=True` in `pi.index()`
- `children` is None for leaf nodes (sections with no subsections)
- `tree.raw["structure"]` is a flat list; hierarchy is in PINode.children
- PageIndex extracts document structure (TOC), not content. Use page numbers to locate sections in the original PDF.

**Example output from pi.toc():**

```
📄 annual_report.pdf
• Executive Summary (p.1-5)
• Financial Overview (p.6-20)
  • Revenue (p.6-10)
  • Expenses (p.11-15)
  • Projections (p.16-20)
• Risk Factors (p.21-35)
```
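Since hierarchy lives in `PINode.children`, listing every section with its depth is a short recursive walk. A sketch using stand-in nodes (a toy `Node` class instead of a real `pi.index()` result; only the attribute names come from the reference above):

```python
class Node:
    """Stand-in for PINode: title, node_id, children (None for leaf sections)."""
    def __init__(self, title, node_id, children=None):
        self.title = title
        self.node_id = node_id
        self.children = children

def walk(node, depth=0):
    """Yield (depth, node_id, title) for a node and all its descendants."""
    yield depth, node.node_id, node.title
    for child in node.children or []:
        yield from walk(child, depth + 1)

root = Node("Financial Overview", "0001", [
    Node("Revenue", "0002"),
    Node("Expenses", "0003"),
])
for depth, nid, title in walk(root):
    print("  " * depth + f"{nid} {title}")
```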
