letta-development-guide

Letta Development Guide

Comprehensive guide for designing and building effective Letta agents with appropriate architectures, memory configurations, model selection, and tool setups.

When to Use This Skill

Use this skill when:
  • Starting a new Letta agent project
  • Choosing between agent architectures (letta_v1_agent vs memgpt_v2_agent)
  • Designing memory block structure and architecture
  • Selecting appropriate models for your use case
  • Planning tool configurations
  • Optimizing memory management and performance
  • Implementing shared memory between agents
  • Debugging memory-related issues

Quick Start Guide

Minimal Working Example

```python
from letta_client import Letta

client = Letta()
agent = client.agents.create(
    name="my-assistant",
    model="openai/gpt-4o",
    embedding="openai/text-embedding-3-small",
    memory_blocks=[
        {"label": "persona", "value": "You are a helpful assistant."},
        {"label": "human", "value": "The user's name and preferences."},
    ],
)

# Send a message
response = client.agents.messages.create(
    agent_id=agent.id,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.messages[-1].content)
```

1. Architecture Selection

Use letta_v1_agent when:
  • Building new agents (recommended default)
  • Need compatibility with reasoning models (GPT-4o, Claude Sonnet 4)
  • Want simpler system prompts and direct message generation
Use memgpt_v2_agent when:
  • Maintaining legacy agents
  • Require specific tool patterns not yet supported in v1
For a detailed comparison, see references/architectures.md.

2. Memory Architecture Design

Memory is the foundation of effective agents. Letta provides three memory types:
Core Memory (in-context):
  • Always accessible in agent's context window
  • Use for: current state, active context, frequently referenced information
  • Limit: Keep total core memory under 80% of context window
Archival Memory (out-of-context):
  • Semantic search over vector database
  • Use for: historical records, large knowledge bases, past interactions
  • Access: Agent must explicitly call archival_memory_search
  • Note: NOT automatically populated from context overflow
Conversation History:
  • Past messages from current conversation
  • Retrieved via conversation_search tool
  • Use for: referencing earlier discussion, tracking conversation flow
See references/memory-architecture.md for detailed guidance.
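
The 80% guideline for core memory can be checked with a rough estimate before an agent is created. The sketch below is a minimal illustration and not part of the Letta SDK: `estimate_tokens`, `core_memory_within_budget`, and the 4-characters-per-token ratio are assumptions made for this example, and a real tokenizer would give more accurate counts.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; the 4 chars/token ratio is a heuristic assumption."""
    return int(len(text) / chars_per_token) + 1


def core_memory_within_budget(blocks, context_window_tokens, max_fraction=0.8):
    """Return (ok, used_tokens, budget_tokens) for a list of block dicts."""
    used = sum(estimate_tokens(b["value"]) for b in blocks)
    budget = int(context_window_tokens * max_fraction)
    return used <= budget, used, budget


# Same block shape as the quick-start example; the 8000-token window is hypothetical.
blocks = [
    {"label": "persona", "value": "You are a helpful assistant."},
    {"label": "human", "value": "The user's name and preferences."},
]
ok, used, budget = core_memory_within_budget(blocks, context_window_tokens=8000)
print(ok, used, budget)
```

Running a check like this whenever blocks are added or resized gives early warning before core memory crowds out the conversation itself.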

3. Memory Block Design

Core principle: One block per distinct functional unit.
Essential blocks:
  • persona: Agent identity, behavioral guidelines, capabilities
  • human: User information, preferences, context
Add domain-specific blocks based on use case:
  • Customer support: company_policies, product_knowledge, customer
  • Coding assistant: project_context, coding_standards, current_task
  • Personal assistant: schedule, preferences, contacts
Memory block guidelines:
  • Keep blocks focused and purpose-specific
  • Use clear, instructional descriptions
  • Monitor size limits (typically 2000-5000 characters per block)
  • Design for append operations when sharing memory between agents
See references/memory-patterns.md for domain examples and references/description-patterns.md for writing effective descriptions.
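
As an illustration of these guidelines, the sketch below defines blocks for a hypothetical customer-support agent and checks them against per-block character limits. Only `label` and `value` appear in this guide's examples; the `description` and `limit` fields, the company name, and the `oversized_blocks` helper are assumptions for the example and should be checked against the SDK reference before use.

```python
# Block definitions for a hypothetical customer-support agent.
# "label" and "value" appear in the guide; "description" and "limit" are
# assumed field names here, not confirmed by this document.
support_blocks = [
    {
        "label": "persona",
        "value": "You are a support agent for Acme. Be concise and polite.",
        "description": "Agent identity and tone. Read every turn; rarely edit.",
        "limit": 2000,
    },
    {
        "label": "company_policies",
        "value": "Refunds within 30 days with receipt.",
        "description": "Policy facts. Read before answering policy questions.",
        "limit": 5000,
    },
    {
        "label": "customer",
        "value": "",
        "description": "Facts about the current customer. Append new details.",
        "limit": 3000,
    },
]


def oversized_blocks(blocks):
    """Return labels of blocks whose value exceeds their character limit."""
    return [b["label"] for b in blocks if len(b["value"]) > b["limit"]]


labels = [b["label"] for b in support_blocks]
assert len(labels) == len(set(labels)), "block labels should be unique"
print(oversized_blocks(support_blocks))
```

Each description says when the block should be read or written, which is the kind of actionable guidance the description guidelines call for.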

4. Model Selection

Match model capabilities to agent requirements:
For production agents:
  • GPT-4o or Claude Sonnet 4 for complex reasoning
  • GPT-4o-mini for cost-efficient general tasks
  • Claude Haiku 3.5 for fast, lightweight operations
  • Gemini 2.0 Flash for balanced speed/capability
Avoid for production:
  • Small Ollama models (<7B parameters) - poor tool calling
  • Models without reliable function calling support
See references/model-recommendations.md for detailed guidance.

5. Tool Configuration

Start minimal: Attach only tools the agent will actively use.
Common starting points:
  • Memory tools (memory_insert, memory_replace, memory_rethink): Core for most agents
  • File system tools: Auto-attached when folders are connected
  • Custom tools: For domain-specific operations (databases, APIs, etc.)
Tool rules: Use them to enforce call sequencing when needed (e.g., "always call search before answering").
Consult references/tool-patterns.md for common configurations.
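
A sequencing rule such as "always call search before answer" can be stated precisely as a check over the order of tool calls. The sketch below is plain Python for illustration only: it does not use Letta's tool-rule API, and `respects_order` and the tool names `search` and `answer` are hypothetical.

```python
def respects_order(calls, first, then):
    """True if every call to `then` is preceded by at least one call to `first`."""
    seen_first = False
    for name in calls:
        if name == first:
            seen_first = True
        elif name == then and not seen_first:
            return False
    return True


# Hypothetical traces of tool names in the order the agent called them.
good_trace = ["search", "memory_insert", "answer"]
bad_trace = ["answer", "search"]
print(respects_order(good_trace, "search", "answer"))
print(respects_order(bad_trace, "search", "answer"))
```

A check like this can be run over traces collected during testing to confirm that a configured rule is actually being honored.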

Advanced Topics

Memory Size Management

When approaching character limits:
  1. Split by topic: customer_profile → customer_business, customer_preferences
  2. Split by time: interaction_history → recent_interactions, with older entries moved to archival memory
  3. Archive historical data: Move old information to archival memory
  4. Consolidate with memory_rethink: Summarize and rewrite block
See references/size-management.md for strategies.
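
The "split by time" strategy can be sketched as a helper that keeps the most recent entries in the block and returns the older ones for archiving. This is a minimal sketch under stated assumptions: `split_by_recency` and the sample entries are hypothetical, and the step that actually writes the older entries to archival memory is left out.

```python
def split_by_recency(entries, keep_recent):
    """Split entries (oldest first) into (to_archive, to_keep)."""
    if keep_recent <= 0:
        return list(entries), []
    return list(entries[:-keep_recent]), list(entries[-keep_recent:])


# Hypothetical interaction log, oldest entry first.
interaction_history = [
    "asked about pricing",
    "reported login issue",
    "requested invoice copy",
    "asked about upgrade options",
]
to_archive, recent_interactions = split_by_recency(interaction_history, keep_recent=2)
print(to_archive)
print(recent_interactions)
```

The block then holds only `recent_interactions`, while `to_archive` is what would be moved to archival memory.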

Concurrency Patterns

When multiple agents share memory blocks or an agent processes concurrent requests:
Safest operations:
  • memory_insert: Append-only, minimal race conditions
  • Database uses PostgreSQL row-level locking
Risk of race conditions:
  • memory_replace: Target string may change before write
  • memory_rethink: Last-writer-wins, no merge
Best practices:
  • Design for append operations when possible
  • Use memory_insert for concurrent writes
  • Reserve memory_rethink for single-agent exclusive access
Consult references/concurrency.md for detailed patterns.
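
The difference between the two kinds of write can be seen in a small deterministic simulation. The sketch below is plain Python and does not model how Letta stores blocks or real concurrency; it only shows why whole-value rewrites from a stale snapshot lose updates while appends do not. `replace_write` and the sample strings are hypothetical.

```python
def replace_write(snapshot, old, new):
    """Replace-style write computed from a (possibly stale) snapshot."""
    return snapshot.replace(old, new)


# Two writers read the same snapshot of a shared block.
block = "status: open"
snapshot_a = block
snapshot_b = block

# Replace-style: each writer rewrites the whole value from its own snapshot,
# so the second write silently discards the first (last writer wins).
block = replace_write(snapshot_a, "open", "open; A contacted customer")
block = replace_write(snapshot_b, "open", "open; B issued refund")
replace_result = block

# Append-style: each writer adds a line to the current value,
# so both updates survive regardless of ordering.
log = "status: open"
log += "\nA contacted customer"
log += "\nB issued refund"
append_result = log

print(replace_result)
print(append_result)
```

Writer A's update is missing from `replace_result` but present in `append_result`, which is why append-oriented block designs are the safer default for shared memory.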

Validation Checklist

Before finalizing your agent design:
Architecture:
  • Does the architecture match the model's capabilities?
  • Is the model appropriate for expected workload and latency requirements?
Memory:
  • Is core memory total under 80% of context window?
  • Is each block focused on one functional area?
  • Are descriptions clear about when to read/write?
  • Have you planned for size growth and overflow?
  • If multi-agent, are concurrency patterns considered?
Tools:
  • Are tools necessary and properly configured?
  • Are memory blocks granular enough for effective updates?

Common Antipatterns

**Too few memory blocks:**
```yaml
# Bad: Everything in one block
agent_memory: "Agent is helpful. User is John..."
```
Split into focused blocks instead.

**Too many memory blocks:**
Creating 10+ blocks when 3-4 would suffice. Start minimal, expand as needed.

**Poor descriptions:**
```yaml
# Bad
data: "Contains data"
```
Provide actionable guidance instead. See `references/description-patterns.md`.

**Ignoring size limits:**
Letting blocks grow indefinitely until they hit limits. Monitor and manage proactively.

Implementation Steps

1. Design Phase

  • Choose architecture based on requirements
  • Design memory block structure
  • Select appropriate model
  • Plan tool configuration

2. Creation Phase (SDK)

**Python:**
```python
from letta_client import Letta

client = Letta()  # Uses LETTA_API_KEY env var

# Create agent with custom memory blocks
agent = client.agents.create(
    name="my-agent",
    model="openai/gpt-4o",  # or "anthropic/claude-sonnet-4-20250514"
    embedding="openai/text-embedding-3-small",
    memory_blocks=[
        {"label": "persona", "value": "You are a helpful assistant..."},
        {"label": "human", "value": "User preferences and context..."},
        {"label": "project", "value": "Current project details..."},
    ],
    description="Agent for helping with X",
)
print(f"Created agent: {agent.id}")
```

**TypeScript:**
```typescript
import Letta from "letta-client";

const client = new Letta();

const agent = await client.agents.create({
  name: "my-agent",
  model: "openai/gpt-4o",
  embedding: "openai/text-embedding-3-small",
  memoryBlocks: [
    { label: "persona", value: "You are a helpful assistant..." },
    { label: "human", value: "User preferences and context..." },
    { label: "project", value: "Current project details..." },
  ],
  description: "Agent for helping with X",
});
console.log(`Created agent: ${agent.id}`);
```

Note: The Letta Code CLI (`letta` command) creates agents interactively. Use `letta --new-agent` to start fresh, then `/rename` and `/description` to configure.

3. Testing Phase

  • Test with representative queries
  • Monitor memory tool usage patterns
  • Verify tool calling behavior
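
One way to monitor memory tool usage during testing is to tally tool calls by name across a test conversation. The sketch below is a rough illustration on plain dictionaries: the message shape (`message_type`, `tool_call`, `name`) is an assumption about what a response contains, and real SDK response objects may expose these fields differently. `count_tool_calls` and the sample messages are hypothetical.

```python
from collections import Counter


def count_tool_calls(messages):
    """Count tool calls by name in a list of message-like dicts."""
    counts = Counter()
    for msg in messages:
        # Assumed shape: tool calls are marked by message_type and carry a name.
        if msg.get("message_type") == "tool_call_message":
            counts[msg["tool_call"]["name"]] += 1
    return counts


# Hypothetical messages collected from a test conversation.
sample_messages = [
    {"message_type": "tool_call_message", "tool_call": {"name": "memory_insert"}},
    {"message_type": "assistant_message", "content": "Noted."},
    {"message_type": "tool_call_message", "tool_call": {"name": "memory_replace"}},
    {"message_type": "tool_call_message", "tool_call": {"name": "memory_insert"}},
]
usage = count_tool_calls(sample_messages)
print(dict(usage))
```

A tally dominated by `memory_replace` or `memory_rethink` on shared blocks, for example, would be a prompt to revisit the block design.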

4. Iteration Phase

  • Refine memory block structure based on actual usage
  • Optimize system instructions
  • Adjust tool configurations

References

For detailed information on specific topics, consult the reference materials:
  • references/architectures.md - Architecture comparison and selection
  • references/memory-architecture.md - Memory types and when to use them
  • references/memory-patterns.md - Domain-specific memory block examples
  • references/description-patterns.md - Writing effective block descriptions
  • references/size-management.md - Managing memory block size limits
  • references/concurrency.md - Multi-agent memory sharing patterns
  • references/model-recommendations.md - Model selection guidance
  • references/tool-patterns.md - Common tool configurations