letta-development-guide

Letta Development Guide

Comprehensive guide for designing and building effective Letta agents with appropriate architectures, memory configurations, model selection, and tool setups.

When to Use This Skill

Use this skill when:

Starting a new Letta agent project
Choosing between agent architectures (letta_v1_agent vs memgpt_v2_agent)
Designing memory block structure and architecture
Selecting appropriate models for your use case
Planning tool configurations
Optimizing memory management and performance
Implementing shared memory between agents
Debugging memory-related issues

Quick Start Guide

Minimal Working Example

```python
from letta_client import Letta

client = Letta()
agent = client.agents.create(
    name="my-assistant",
    model="openai/gpt-4o",
    embedding="openai/text-embedding-3-small",
    memory_blocks=[
        {"label": "persona", "value": "You are a helpful assistant."},
        {"label": "human", "value": "The user's name and preferences."},
    ],
)

# Send a message
response = client.agents.messages.create(
    agent_id=agent.id,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.messages[-1].content)
```

1. Architecture Selection

Use letta_v1_agent when:

Building new agents (recommended default)
Need compatibility with reasoning models (GPT-4o, Claude Sonnet 4)
Want simpler system prompts and direct message generation

Use memgpt_v2_agent when:

Maintaining legacy agents
Require specific tool patterns not yet supported in v1

For a detailed comparison, see `references/architectures.md`.

2. Memory Architecture Design

Memory is the foundation of effective agents. Letta provides three memory types:

Core Memory (in-context):

Always accessible in agent's context window
Use for: current state, active context, frequently referenced information
Limit: Keep total core memory under 80% of context window

Archival Memory (out-of-context):

Semantic search over vector database
Use for: historical records, large knowledge bases, past interactions
Access: Agent must explicitly call archival_memory_search
Note: NOT automatically populated from context overflow

Conversation History:

Past messages from current conversation
Retrieved via conversation_search tool
Use for: referencing earlier discussion, tracking conversation flow
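
The 80% guideline for core memory can be checked before an agent is created. The sketch below is a rough, hypothetical helper and not part of the Letta SDK. It assumes about 4 characters per token, which is only a heuristic, and it treats the context window size as a value you supply.

```python
# Hypothetical helper; not part of the Letta SDK.
CHARS_PER_TOKEN = 4  # rough heuristic (an assumption); real tokenizers vary


def core_memory_usage(memory_blocks, context_window_tokens):
    """Return the estimated fraction of the context window used by core memory."""
    total_chars = sum(len(block["value"]) for block in memory_blocks)
    estimated_tokens = total_chars / CHARS_PER_TOKEN
    return estimated_tokens / context_window_tokens


def within_budget(memory_blocks, context_window_tokens, limit=0.8):
    """True if estimated core memory stays under the given fraction (80% by default)."""
    return core_memory_usage(memory_blocks, context_window_tokens) < limit


blocks = [
    {"label": "persona", "value": "You are a helpful assistant."},
    {"label": "human", "value": "The user's name and preferences."},
]
print(within_budget(blocks, context_window_tokens=8000))
```

Because the estimate is character-based, treat the result as an early warning rather than an exact measurement.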

See `references/memory-architecture.md` for detailed guidance.

3. Memory Block Design

Core principle: One block per distinct functional unit.

Essential blocks:

`persona`: Agent identity, behavioral guidelines, capabilities
`human`: User information, preferences, context

Add domain-specific blocks based on use case:

Customer support: `company_policies`, `product_knowledge`, `customer`
Coding assistant: `project_context`, `coding_standards`, `current_task`
Personal assistant: `schedule`, `preferences`, `contacts`

Memory block guidelines:

Keep blocks focused and purpose-specific
Use clear, instructional descriptions
Monitor size limits (typically 2000-5000 characters per block)
Design for append operations when sharing memory between agents
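
As a sketch of these guidelines, the blocks for a customer support agent might look like the following. The `description` and `limit` fields and the 5000-character ceiling are assumptions for illustration, so check the block schema of your Letta version before relying on them. `oversized_blocks` is a hypothetical helper.

```python
# Illustrative block definitions. Fields beyond "label" and "value"
# ("description", "limit") are assumptions, not confirmed by this guide.
support_blocks = [
    {
        "label": "persona",
        "description": "Agent identity and behavioral guidelines. Update only when the role changes.",
        "value": "You are a support agent for an online store.",
        "limit": 5000,
    },
    {
        "label": "company_policies",
        "description": "Refund and shipping rules. Consult before promising anything to a customer.",
        "value": "Refunds are handled case by case.",
        "limit": 5000,
    },
    {
        "label": "customer",
        "description": "Facts about the current customer. Append new facts as they come up.",
        "value": "",
        "limit": 5000,
    },
]


def oversized_blocks(blocks, warn_ratio=0.9):
    """Return labels of blocks whose value is at or above warn_ratio of their limit."""
    return [b["label"] for b in blocks if len(b["value"]) >= warn_ratio * b["limit"]]


print(oversized_blocks(support_blocks))
```

Each description tells the agent when to read or update the block, which is the point of "clear, instructional descriptions".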

See `references/memory-patterns.md` for domain examples and `references/description-patterns.md` for writing effective descriptions.

4. Model Selection

Match model capabilities to agent requirements:

For production agents:

GPT-4o or Claude Sonnet 4 for complex reasoning
GPT-4o-mini for cost-efficient general tasks
Claude Haiku 3.5 for fast, lightweight operations
Gemini 2.0 Flash for balanced speed/capability

Avoid for production:

Small Ollama models (<7B parameters) - poor tool calling
Models without reliable function calling support

See `references/model-recommendations.md` for detailed guidance.

5. Tool Configuration

Start minimal: Attach only tools the agent will actively use.

Common starting points:

Memory tools (memory_insert, memory_replace, memory_rethink): Core for most agents
File system tools: Auto-attached when folders are connected
Custom tools: For domain-specific operations (databases, APIs, etc.)

Tool Rules: Use to enforce sequencing when needed (e.g., "always call search before answer")
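
To make the sequencing idea concrete, here is a minimal sketch of what a rule like "always call search before answer" enforces. It is plain Python and deliberately does not use Letta's tool-rule API. `violates_order` and the tool names `search` and `answer` are hypothetical.

```python
def violates_order(tool_calls, required_first, then):
    """Return True if `then` is called before any call to `required_first`."""
    seen_required = False
    for name in tool_calls:
        if name == required_first:
            seen_required = True
        elif name == then and not seen_required:
            return True
    return False


print(violates_order(["search", "answer"], "search", "answer"))
print(violates_order(["answer"], "search", "answer"))
```

A check like this can be run over recorded tool-call sequences during testing to confirm that a configured rule has the intended effect.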

Consult `references/tool-patterns.md` for common configurations.

Advanced Topics

Memory Size Management

When approaching character limits:

Split by topic: `customer_profile` → `customer_business`, `customer_preferences`
Split by time: `interaction_history` → `recent_interactions`, archive older entries to archival memory
Archive historical data: Move old information to archival memory
Consolidate with `memory_rethink`: Summarize and rewrite the block
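
The split-by-time strategy can be sketched as below, assuming the block stores one interaction per line with the newest entry last. `split_recent` is a hypothetical helper. How the older lines are written to archival memory depends on your setup and is left out.

```python
def split_recent(block_value, keep_last=3):
    """Split a line-per-entry block into (older_lines, recent_value)."""
    lines = [line for line in block_value.splitlines() if line.strip()]
    if keep_last <= 0:
        # Nothing is kept in the block; every entry is a candidate for archiving.
        return lines, ""
    older, recent = lines[:-keep_last], lines[-keep_last:]
    return older, "\n".join(recent)


history = "\n".join(f"entry {i}" for i in range(1, 6))
older, recent_value = split_recent(history, keep_last=2)
print(older)         # candidates to move into archival memory
print(recent_value)  # new value for the recent_interactions block
```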

See `references/size-management.md` for strategies.

Concurrency Patterns

When multiple agents share memory blocks or an agent processes concurrent requests:

Safest operations:

`memory_insert`: Append-only, minimal race conditions
Database uses PostgreSQL row-level locking

Risk of race conditions:

`memory_replace`: Target string may change before write
`memory_rethink`: Last-writer-wins, no merge

Best practices:

Design for append operations when possible
Use memory_insert for concurrent writes
Reserve memory_rethink for single-agent exclusive access
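
The difference between append and rewrite under concurrency can be shown with a small in-memory simulation. This is only an illustration with a hypothetical `Block` class. It does not call Letta and says nothing about how the server actually schedules writes.

```python
class Block:
    """Toy stand-in for a shared memory block (hypothetical, in-memory only)."""

    def __init__(self, value=""):
        self.value = value

    def insert(self, text):
        # Append-style write: builds on the current value, so nothing is dropped.
        self.value = f"{self.value}\n{text}" if self.value else text

    def rethink(self, new_value):
        # Full rewrite: whatever was there before is replaced.
        self.value = new_value


# Two agents read the same snapshot, then each rewrites the whole block.
shared = Block("status: open")
snapshot_a = shared.value
snapshot_b = shared.value
shared.rethink(snapshot_a + "\nagent A: contacted customer")
shared.rethink(snapshot_b + "\nagent B: issued refund")
rewrite_result = shared.value
print(rewrite_result)  # agent A's note is lost: last writer wins

# The same two updates as appends keep both notes.
shared = Block("status: open")
shared.insert("agent A: contacted customer")
shared.insert("agent B: issued refund")
append_result = shared.value
print(append_result)
```

The rewrite path loses an update because each writer started from a stale snapshot, which is why full rewrites are best reserved for blocks with a single writer.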

Consult `references/concurrency.md` for detailed patterns.

Validation Checklist

Common Antipatterns

**Too few memory blocks:**
```yaml
# Bad: Everything in one block
agent_memory: "Agent is helpful. User is John..."
```
Split into focused blocks instead.

**Too many memory blocks:**
Creating 10+ blocks when 3-4 would suffice. Start minimal, expand as needed.

**Poor descriptions:**
```yaml
# Bad
data: "Contains data"
```
Provide actionable guidance instead. See `references/description-patterns.md`.

**Ignoring size limits:**
Letting blocks grow indefinitely until they hit limits. Monitor and manage proactively.

Implementation Steps

1. Design Phase

Choose architecture based on requirements
Design memory block structure
Select appropriate model
Plan tool configuration

2. Creation Phase (SDK)

**Python:**
```python
from letta_client import Letta

client = Letta()  # Uses LETTA_API_KEY env var

# Create agent with custom memory blocks
agent = client.agents.create(
    name="my-agent",
    model="openai/gpt-4o",  # or "anthropic/claude-sonnet-4-20250514"
    embedding="openai/text-embedding-3-small",
    memory_blocks=[
        {"label": "persona", "value": "You are a helpful assistant..."},
        {"label": "human", "value": "User preferences and context..."},
        {"label": "project", "value": "Current project details..."},
    ],
    description="Agent for helping with X",
)
print(f"Created agent: {agent.id}")
```

**TypeScript:**
```typescript
import Letta from "letta-client";

const client = new Letta();

const agent = await client.agents.create({
  name: "my-agent",
  model: "openai/gpt-4o",
  embedding: "openai/text-embedding-3-small",
  memoryBlocks: [
    { label: "persona", value: "You are a helpful assistant..." },
    { label: "human", value: "User preferences and context..." },
    { label: "project", value: "Current project details..." },
  ],
  description: "Agent for helping with X",
});
console.log(`Created agent: ${agent.id}`);
```

Note: Letta Code CLI (`letta` command) creates agents interactively. Use `letta --new-agent` to start fresh, then `/rename` and `/description` to configure.

3. Testing Phase

Test with representative queries
Monitor memory tool usage patterns
Verify tool calling behavior
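
One way to monitor tool usage during testing is to tally tool calls across responses. The sketch below works on plain dictionaries. The `message_type` and `tool_call` keys are assumptions about the message shape, so adapt the accessors to whatever your SDK version returns.

```python
from collections import Counter


def tally_tool_calls(messages):
    """Count tool calls by name in a list of message dictionaries."""
    counts = Counter()
    for message in messages:
        # Assumed shape: tool calls are marked by message_type and carry a name.
        if message.get("message_type") == "tool_call_message":
            counts[message["tool_call"]["name"]] += 1
    return counts


sample = [
    {"message_type": "tool_call_message", "tool_call": {"name": "memory_insert"}},
    {"message_type": "assistant_message", "content": "Noted."},
    {"message_type": "tool_call_message", "tool_call": {"name": "memory_insert"}},
    {"message_type": "tool_call_message", "tool_call": {"name": "conversation_search"}},
]
usage = tally_tool_calls(sample)
print(usage.most_common())
```

A tool that never appears in the tally across representative queries is a candidate for removal, in line with the "start minimal" advice.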

4. Iteration Phase

Refine memory block structure based on actual usage
Optimize system instructions
Adjust tool configurations

References

For detailed information on specific topics, consult the reference materials:

`references/architectures.md` - Architecture comparison and selection
`references/memory-architecture.md` - Memory types and when to use them
`references/memory-patterns.md` - Domain-specific memory block examples
`references/description-patterns.md` - Writing effective block descriptions
`references/size-management.md` - Managing memory block size limits
`references/concurrency.md` - Multi-agent memory sharing patterns
`references/model-recommendations.md` - Model selection guidance
`references/tool-patterns.md` - Common tool configurations