moai-foundation-context

Quick Reference

Enterprise Context and Session Management - Unified context optimization and session state management for Claude Code with 200K token budget management, session persistence, and multi-agent handoff protocols.

Core Capabilities:

200K token budget allocation and monitoring
Session state tracking with persistence
Context-aware token optimization
Multi-agent handoff protocols
Progressive disclosure and memory management
Session forking for parallel exploration

When to Use:

Session initialization and cleanup
Long-running workflows exceeding 10 minutes
Multi-agent orchestration
Context usage approaching the window limit (above 150K tokens)
Model switches between Haiku and Sonnet
Workflow phase transitions

Key Principles:

Avoid Last 20%: Performance degrades in the final fifth of the context window.

Aggressive Clearing: Execute /clear every 1-3 messages for SPEC workflows.

Lean Memory Files: Keep each file under 500 lines.

Disable Unused MCPs: Minimize tool definition overhead.

Quality Over Quantity: 10% relevant context beats 90% noise.

Implementation Guide

Features

Intelligent context window management for Claude Code sessions
Progressive file loading with priority-based caching
Token budget tracking and optimization alerts
Selective context preservation across /clear boundaries
MCP integration context persistence

When to Use

Managing large codebases whose context exceeds 150K tokens
Optimizing token usage in long-running development sessions
Preserving critical context across session resets
Coordinating multi-agent workflows with shared context
Debugging context-related issues in Claude Code

Core Patterns

Pattern 1 - Progressive File Loading:

Load files by priority tiers. Tier 1 includes CLAUDE.md and config.json which are always loaded. Tier 2 includes current SPEC and implementation files. Tier 3 includes related modules and dependencies. Tier 4 includes reference documentation loaded on-demand.
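The tier ordering above can be sketched as a small selection function. This is a minimal illustration, not the skill's actual loader: `TIERS`, `plan_load`, and every file path other than CLAUDE.md and config.json are hypothetical names chosen for the example.

```python
# Tier layout follows the text. File names in tiers 2-4 are hypothetical
# placeholders, not paths defined by the skill.
TIERS = {
    1: ["CLAUDE.md", "config.json"],             # always loaded
    2: ["specs/SPEC-001.md", "src/feature.py"],  # current SPEC and implementation
    3: ["src/utils.py"],                         # related modules and dependencies
    4: ["docs/reference.md"],                    # reference docs, on demand only
}


def plan_load(tiers, max_tier, on_demand=()):
    """Return the files to load, lowest tier first.

    Tiers above max_tier are skipped. Tier 4 files are included only when
    they are explicitly listed in on_demand.
    """
    selected = []
    for tier in sorted(tiers):
        if tier > max_tier:
            break
        for path in tiers[tier]:
            if tier == 4 and path not in on_demand:
                continue
            selected.append(path)
    return selected
```

Calling `plan_load(TIERS, 2)` yields only the Tier 1 and Tier 2 files, which is also the set that the checkpointing pattern reloads after a reset.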

Pattern 2 - Context Checkpointing:

Monitor token usage with warning at 150K and critical at 180K. Identify essential context to preserve. Execute /clear to reset session. Reload Tier 1 and Tier 2 files automatically. Resume work with preserved context.
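A minimal sketch of the checkpoint decision, assuming token usage is available as a plain integer. The names `usage_level` and `checkpoint` are hypothetical, and the sketch only decides what to keep and what to reload; issuing /clear itself is left to the session.

```python
WARNING_TOKENS = 150_000   # warning level from the text
CRITICAL_TOKENS = 180_000  # critical level from the text


def usage_level(used_tokens: int) -> str:
    """Classify current usage as 'ok', 'warning', or 'critical'."""
    if used_tokens >= CRITICAL_TOKENS:
        return "critical"
    if used_tokens >= WARNING_TOKENS:
        return "warning"
    return "ok"


def checkpoint(context: dict, essential_keys: set) -> dict:
    """Build a checkpoint: keep only essential entries and name the tiers to reload.

    Entries outside essential_keys are intentionally dropped, since they
    would not survive the reset anyway.
    """
    preserved = {k: v for k, v in context.items() if k in essential_keys}
    return {"preserved": preserved, "reload_tiers": [1, 2]}
```

The checkpoint would be written somewhere durable before the reset, then read back once the Tier 1 and Tier 2 files are reloaded.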

Pattern 3 - MCP Context Continuity:

Preserve MCP agent context across /clear by storing the agent_id before clearing. After /clear, initialize a fresh MCP agent and use the stored agent_id to restore its context.

Core Patterns Detail

Pattern 1: Token Budget Management

Concept: Strategic allocation and monitoring of 200K token context window.

Budget Breakdown (200K tokens total):

System Prompt and Instructions: approximately 15K tokens (7.5%), covering CLAUDE.md at 8K, command definitions at 4K, and Skill metadata at 3K.
Active Conversation: approximately 80K tokens (40%), covering recent messages at 50K, context cache at 20K, and active references at 10K.
Reference Context with Progressive Disclosure: approximately 50K tokens (25%), covering project structure at 15K, related Skills at 20K, and tool definitions at 15K.
Reserve for Emergency Recovery: approximately 55K tokens (27.5%), covering session state snapshot at 10K, TAGs and cross-references at 15K, error recovery context at 20K, and free buffer at 10K.
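The breakdown can be written down as data and checked for internal consistency. The sketch below uses the figures from the text; the dictionary keys and helper names are hypothetical.

```python
TOTAL_TOKENS = 200_000

# Figures come from the budget breakdown in the text; key names are illustrative.
BUDGET = {
    "system_prompt_and_instructions": {
        "claude_md": 8_000,
        "command_definitions": 4_000,
        "skill_metadata": 3_000,
    },
    "active_conversation": {
        "recent_messages": 50_000,
        "context_cache": 20_000,
        "active_references": 10_000,
    },
    "reference_context": {
        "project_structure": 15_000,
        "related_skills": 20_000,
        "tool_definitions": 15_000,
    },
    "emergency_reserve": {
        "session_state_snapshot": 10_000,
        "tags_and_cross_references": 15_000,
        "error_recovery_context": 20_000,
        "free_buffer": 10_000,
    },
}


def category_totals(budget):
    """Sum the sub-allocations of each category."""
    return {name: sum(parts.values()) for name, parts in budget.items()}


def category_shares(budget, total=TOTAL_TOKENS):
    """Express each category total as a percentage of the window."""
    return {name: 100 * t / total for name, t in category_totals(budget).items()}
```

The four category totals (15K, 80K, 50K, 55K) add up to exactly 200K, so the percentages in the text account for the whole window.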

Monitoring Thresholds:

Above 60% usage (120K tokens): track context growth patterns.
Above 75% usage (150K tokens): defer non-critical context and warn the user of the approaching limit.
Above 85% usage (170K tokens): trigger emergency compression and execute the /clear command.
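One way to express the three thresholds is a function that maps current usage to a list of actions. The action labels and the function name are hypothetical; integer arithmetic is used so that usage exactly at a threshold is not misclassified by floating-point rounding.

```python
def monitoring_actions(used_tokens: int, total_tokens: int = 200_000) -> list:
    """Return the actions for the highest threshold that usage strictly exceeds."""
    # Compare used/total against each percentage without dividing.
    def exceeds(percent: int) -> bool:
        return used_tokens * 100 > percent * total_tokens

    if exceeds(85):
        return ["emergency_compression", "clear"]
    if exceeds(75):
        return ["defer_non_critical", "warn_user"]
    if exceeds(60):
        return ["track_growth"]
    return []
```

With a 200K window, the thresholds fall at 120K, 150K, and 170K tokens.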

Use Case: Prevent context overflow in long-running SPEC-First workflows.

Pattern 2: Aggressive /clear Strategy

Concept: Proactive context clearing at strategic checkpoints to maintain efficiency.

Mandatory /clear Points:

After /moai:1-plan completes, to save 45-50K tokens.
When context exceeds 150K tokens, to prevent overflow.
When the conversation exceeds 50 messages, to remove stale history.
Before major phase transitions, for a clean slate.
During model switches, for Haiku to Sonnet handoffs.
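The mandatory points can be checked mechanically. The following sketch assumes the session exposes a few counters and flags; `SessionSnapshot`, its fields, and the reason labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SessionSnapshot:
    context_tokens: int
    message_count: int
    plan_just_completed: bool = False       # /moai:1-plan has just finished
    phase_transition_pending: bool = False  # about to enter a new workflow phase
    model_switch_pending: bool = False      # e.g. a Haiku to Sonnet handoff


def clear_reasons(s: SessionSnapshot) -> list:
    """List every mandatory /clear condition that currently holds."""
    reasons = []
    if s.plan_just_completed:
        reasons.append("plan_completed")
    if s.context_tokens > 150_000:
        reasons.append("context_over_150k")
    if s.message_count > 50:
        reasons.append("over_50_messages")
    if s.phase_transition_pending:
        reasons.append("phase_transition")
    if s.model_switch_pending:
        reasons.append("model_switch")
    return reasons
```

An empty list means no mandatory point applies; any non-empty result is a signal to checkpoint and run /clear.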

Use Case: Maximize token efficiency across SPEC-Run-Sync cycles.

Pattern 3: Session State Persistence

Concept: Maintain session continuity across interruptions with state snapshots.

Session State Layers:

L1, Context-Aware Layer (Claude 4.5+): token budget tracking, context window position, auto-summarization triggers, and model-specific optimizations.
L2, Active Context: current task, variables, and scope.
L3, Session History: recent actions and decisions.
L4, Project State: SPEC progress and milestones.
L5, User Context: preferences, language, and expertise.
L6, System State: tools, permissions, and environment.
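A snapshot of the six layers could be modeled as a dataclass that round-trips through JSON. This is a sketch under the assumption that each layer holds JSON-serializable data; the class, field, and function names are hypothetical and do not reflect the skill's actual storage format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SessionState:
    context_aware: dict = field(default_factory=dict)    # L1: token budget, window position
    active_context: dict = field(default_factory=dict)   # L2: current task, variables, scope
    session_history: list = field(default_factory=list)  # L3: recent actions and decisions
    project_state: dict = field(default_factory=dict)    # L4: SPEC progress and milestones
    user_context: dict = field(default_factory=dict)     # L5: preferences, language, expertise
    system_state: dict = field(default_factory=dict)     # L6: tools, permissions, environment


def save_state(state: SessionState) -> str:
    """Serialize a snapshot to a JSON string."""
    return json.dumps(asdict(state), sort_keys=True)


def load_state(payload: str) -> SessionState:
    """Rebuild a snapshot from a JSON string produced by save_state."""
    return SessionState(**json.loads(payload))
```

Writing the string returned by `save_state` to disk before an interruption, and passing it to `load_state` afterwards, restores an equal `SessionState` object.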

Use Case: Resume long-running tasks after interruptions without context loss.

Pattern 4: Multi-Agent Handoff Protocols

Concept: Seamless context transfer between agents with minimal token overhead.

Handoff Package Contents:

handoff_id, from_agent, and to_agent
session_context: session_id, model, context_position, available_tokens, and user_language
task_context: spec_id, current_phase, completed_steps, and next_step
recovery_info: last_checkpoint, recovery_tokens_reserved, and session_fork_available

Handoff Validation: Check token budget with minimum 30K available buffer. Verify agent compatibility. Trigger context compression if needed.
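The package fields and the validation steps can be sketched together. The field names follow the text; the `HandoffPackage` class, the `validate_handoff` function, the compatibility table, and the returned labels are assumptions made for illustration.

```python
from dataclasses import dataclass

MIN_HANDOFF_BUFFER = 30_000  # minimum available tokens required by the text


@dataclass
class HandoffPackage:
    handoff_id: str
    from_agent: str
    to_agent: str
    session_context: dict  # session_id, model, context_position, available_tokens, user_language
    task_context: dict     # spec_id, current_phase, completed_steps, next_step
    recovery_info: dict    # last_checkpoint, recovery_tokens_reserved, session_fork_available


def validate_handoff(pkg: HandoffPackage, compatible: dict) -> list:
    """Return the follow-up actions a handoff needs; an empty list means it may proceed.

    compatible maps a sending agent to the set of agents it may hand off to
    (a hypothetical representation of agent compatibility).
    """
    actions = []
    if pkg.session_context.get("available_tokens", 0) < MIN_HANDOFF_BUFFER:
        actions.append("compress_context")
    if pkg.to_agent not in compatible.get(pkg.from_agent, set()):
        actions.append("reject_incompatible_agent")
    return actions
```

A result containing "compress_context" corresponds to the text's instruction to trigger context compression before retrying the handoff.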

Use Case: Efficient Plan to Run to Sync workflow execution.

Pattern 5: Progressive Disclosure and Memory Optimization

Concept: Load context progressively based on relevance and need.

Progressive Summarization: Extract key sentences to compress 50K to 15K at target ratio of 0.3. Add pointers to original content for reference. Store original in session archive for recovery. Result saves approximately 35K tokens.
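The summarization steps can be illustrated with a simple extractive sketch. Several simplifications are assumed here: character counts stand in for token counts, sentence importance is scored by keyword hits, and the archive is a plain dictionary. None of these choices come from the skill itself, and `summarize` is a hypothetical name.

```python
import re


def summarize(text: str, archive: dict, key: str, keywords: list, ratio: float = 0.3) -> str:
    """Keep the highest-scoring sentences within ratio * len(text) characters.

    The original is stored in archive[key], and the summary ends with a
    pointer to it so the full content stays recoverable.
    """
    archive[key] = text  # store the original for recovery
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    budget = int(len(text) * ratio)

    def score(sentence: str) -> int:
        lowered = sentence.lower()
        return sum(lowered.count(k.lower()) for k in keywords)

    # Rank by score; ties keep document order because the sort is stable.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    kept, used = [], 0
    for i in ranked:
        if used + len(sentences[i]) <= budget:
            kept.append(i)
            used += len(sentences[i])
    summary = " ".join(sentences[i] for i in sorted(kept))
    return f"{summary} [full text: @{key}]"
```

At the target ratio of 0.3, a 50K input is cut to roughly 15K, which is where the approximate 35K saving in the text comes from.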

Context Tagging: Avoid high token cost phrases like "The user configuration from the previous 20 messages..." and use efficient references like "Refer to @CONFIG-001 for user preferences".
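Tag references such as @CONFIG-001 imply a lookup table that maps a tag to the content it stands for. A minimal sketch follows, assuming tags have the form of uppercase letters, a hyphen, and digits; the `TagRegistry` class and the tag pattern are hypothetical.

```python
import re

# Assumed tag shape: "@" + uppercase letters + "-" + digits, e.g. @CONFIG-001.
TAG_PATTERN = re.compile(r"@([A-Z]+-\d+)")


class TagRegistry:
    """Maps short tags to the longer context they refer to."""

    def __init__(self):
        self._entries = {}

    def register(self, tag: str, content: str) -> None:
        self._entries[tag] = content

    def expand(self, text: str) -> str:
        """Replace each known @TAG with its content; unknown tags are left as-is."""
        return TAG_PATTERN.sub(
            lambda m: self._entries.get(m.group(1), m.group(0)), text
        )
```

The short form stays in the conversation, and `expand` is called only when the full content is actually needed.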

Use Case: Maintain context continuity while minimizing token overhead.

Advanced Documentation

For detailed patterns and implementation strategies:

modules/token-budget-allocation.md - Budget breakdown, allocation strategies, monitoring thresholds
modules/session-state-management.md - State layers, persistence, resumption patterns
modules/context-optimization.md - Progressive disclosure, summarization, memory management
modules/handoff-protocols.md - Inter-agent communication, package format, validation
modules/memory-mcp-optimization.md - Memory file structure, MCP server configuration
modules/reference.md - API reference, troubleshooting, best practices

Best Practices

Recommended Practices:

Execute /clear immediately after SPEC creation
Monitor token usage and plan accordingly
Use context-aware token budget tracking
Create checkpoints before major operations
Apply progressive summarization for long workflows
Enable session persistence for recovery
Use session forking for parallel exploration
Keep memory files under 500 lines each
Disable unused MCP servers to reduce overhead
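The 500-line limit on memory files is easy to check automatically. The sketch below assumes memory files are Markdown files under a single directory; the function name and that layout are hypothetical.

```python
from pathlib import Path

MAX_LINES = 500  # each memory file should stay under this many lines


def oversized_memory_files(directory, max_lines: int = MAX_LINES) -> dict:
    """Return {relative path: line count} for Markdown files at or over the limit."""
    root = Path(directory)
    results = {}
    for path in sorted(root.rglob("*.md")):
        with path.open(encoding="utf-8") as fh:
            count = sum(1 for _ in fh)
        if count >= max_lines:
            results[path.relative_to(root).as_posix()] = count
    return results
```

An empty result means every memory file is within the limit; anything reported is a candidate for splitting or trimming.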

Required Practices:

Maintain bounded context history with regular clearing cycles. Unbounded context accumulation degrades performance and raises token costs with every additional turn. Regular clearing prevents context overflow, maintains consistent response quality, and reduces token waste by 60-70%.

Respond to token budget warnings immediately when usage exceeds 150K tokens. Operating in the final 20% of the context window (above 160K of 200K tokens) causes significant performance degradation.

Execute state validation checks during session recovery operations. Invalid state can cause workflow failures and data loss in multi-step processes.

Persist session identifiers before any context clearing operations. Session IDs are the only reliable mechanism for resuming interrupted workflows.

Execute context compression or clearing when usage reaches the 85% threshold (170K of 200K tokens). At that point only about 30K tokens remain free, so the 55K emergency reserve is already being consumed; acting here prevents forced interruptions.

Works Well With

moai-cc-memory - Memory management and context persistence
moai-cc-configuration - Session configuration and preferences
moai-core-workflow - Workflow state persistence and recovery
moai-cc-agents - Agent state management across sessions
moai-foundation-trust - Quality gate integration

Workflow Integration

Session Initialization: Initialize token budget with Pattern 1, load session state with Pattern 3, setup progressive disclosure with Pattern 5, configure handoff protocols with Pattern 4.

SPEC-First Workflow: Execute /moai:1-plan, then mandatory /clear to save 45-50K tokens, then /moai:2-run SPEC-XXX, then multi-agent handoffs with Pattern 4, then /moai:3-sync SPEC-XXX, then session state persistence with Pattern 3.

Context Monitoring: Continuously track token usage with Pattern 1, apply progressive disclosure with Pattern 5, execute /clear at thresholds with Pattern 2, validate handoffs with Pattern 4.

Success Metrics

Token Efficiency: 60-70% reduction through aggressive clearing
Context Overhead: Less than 15K tokens for system/skill metadata
Handoff Success Rate: Greater than 95% with validation
Session Recovery: Less than 5 seconds with state persistence
Memory Optimization: Less than 500 lines per memory file

Status: Production Ready (Enterprise)
Modular Architecture: SKILL.md + 6 modules
Integration: Plan-Run-Sync workflow optimized
Generated with: MoAI-ADK Skill Factory
