context-manager

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Context Manager

上下文管理器

Purpose

用途

Provides expertise in AI context management, memory architectures, and context window optimization. Handles conversation history, RAG memory systems, and efficient context utilization for LLM applications.
提供AI上下文管理、内存架构以及上下文窗口优化的专业能力,负责处理LLM应用中的对话历史、RAG内存系统以及上下文高效利用问题。

When to Use

适用场景

  • Designing AI memory and context systems
  • Optimizing context window usage
  • Implementing conversation history management
  • Building long-term memory for AI agents
  • Managing RAG retrieval context
  • Reducing token usage while preserving quality
  • Designing multi-session memory persistence
  • 设计AI内存与上下文系统
  • 优化上下文窗口使用效率
  • 实现对话历史管理
  • 为AI Agent构建长期内存
  • 管理RAG检索上下文
  • 在保证质量的前提下减少Token消耗
  • 设计多会话内存持久化方案

Quick Start

快速开始

Invoke this skill when:
  • Designing AI memory and context systems
  • Optimizing context window usage
  • Implementing conversation history management
  • Building long-term memory for AI agents
  • Reducing token usage while preserving quality
Do NOT invoke when:
  • Building full RAG pipelines (use ai-engineer)
  • Managing vector databases (use data-engineer)
  • Coordinating multiple agents (use agent-organizer)
  • Training embedding models (use ml-engineer)
在以下场景调用此技能:
  • 设计AI内存与上下文系统
  • 优化上下文窗口使用效率
  • 实现对话历史管理
  • 为AI Agent构建长期内存
  • 在保证质量的前提下减少Token消耗
请勿在以下场景调用:
  • 构建完整RAG流水线(请使用ai-engineer)
  • 管理向量数据库(请使用data-engineer)
  • 协调多个Agent(请使用agent-organizer)
  • 训练嵌入模型(请使用ml-engineer)

Decision Framework

决策框架

Memory Type Selection:
├── Single conversation → Sliding window context
├── Multi-session user → Persistent memory store
├── Knowledge-heavy → RAG with vector DB
├── Task-oriented → Working memory + tool results
└── Long-running agent
    ├── Episodic memory → Event summaries
    ├── Semantic memory → Knowledge graph
    └── Procedural memory → Learned patterns
Memory Type Selection:
├── Single conversation → Sliding window context
├── Multi-session user → Persistent memory store
├── Knowledge-heavy → RAG with vector DB
├── Task-oriented → Working memory + tool results
└── Long-running agent
    ├── Episodic memory → Event summaries
    ├── Semantic memory → Knowledge graph
    └── Procedural memory → Learned patterns

Core Workflows

核心工作流程

1. Context Window Optimization

1. 上下文窗口优化

  1. Measure current token usage
  2. Identify redundant or verbose content
  3. Implement summarization for old messages
  4. Prioritize recent and relevant context
  5. Use compression techniques
  6. Monitor quality vs. token tradeoff
  1. 测量当前Token使用量
  2. 识别冗余或冗长内容
  3. 对旧消息实现摘要处理
  4. 优先保留近期且相关的上下文
  5. 使用压缩技术
  6. 监控质量与Token消耗的平衡

2. Conversation Memory Design

2. 对话内存设计

  1. Define memory retention requirements
  2. Choose storage strategy (in-memory, DB)
  3. Implement message windowing
  4. Add summarization for overflow
  5. Design retrieval for relevant history
  6. Handle session boundaries
  1. 定义内存保留需求
  2. 选择存储策略(内存内、数据库)
  3. 实现消息窗口机制
  4. 为溢出内容添加摘要处理
  5. 设计相关历史的检索方案
  6. 处理会话边界

3. Long-term Memory Implementation

3. 长期内存实现

  1. Define memory types needed
  2. Design memory storage schema
  3. Implement memory write triggers
  4. Build retrieval mechanisms
  5. Add memory consolidation
  6. Implement forgetting policies
  1. 确定所需的内存类型
  2. 设计内存存储架构
  3. 实现内存写入触发机制
  4. 构建检索机制
  5. 添加内存整合功能
  6. 实现遗忘策略

Best Practices

最佳实践

  • Summarize old context rather than truncating
  • Use semantic search for relevant history retrieval
  • Separate system instructions from conversation
  • Cache frequently accessed context
  • Monitor context utilization metrics
  • Implement graceful degradation at limits
  • 对旧上下文进行摘要而非直接截断
  • 使用语义搜索检索相关历史
  • 将系统指令与对话内容分离
  • 缓存频繁访问的上下文
  • 监控上下文使用指标
  • 在达到限制时实现优雅降级

Anti-Patterns

反模式

Anti-PatternProblemCorrect Approach
Full history alwaysExceeds context limitsSliding window + summaries
No summarizationLost important contextSummarize before eviction
Equal priorityWastes tokens on irrelevantWeight recent/relevant higher
No persistenceLost memory across sessionsStore important memories
Ignoring token costsExpensive API callsMonitor and optimize usage
反模式问题正确方案
始终保留完整历史超出上下文限制滑动窗口 + 摘要
不做摘要处理丢失重要上下文在淘汰前进行摘要
所有内容优先级相同浪费Token在无关内容上为近期/相关内容赋予更高权重
无持久化机制跨会话丢失内存存储重要内存内容
忽略Token成本API调用成本高昂监控并优化使用量