Context Compression Strategies

When agent sessions generate millions of tokens, compression becomes mandatory. Optimize for tokens-per-task (total tokens to complete a task), not tokens-per-request.

Compression Approaches

1. Anchored Iterative Summarization (Recommended)

  • Maintain structured summaries with explicit sections
  • On compression, summarize only newly-truncated content
  • Merge with existing summary instead of regenerating
  • Structure forces preservation of critical info
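The merge step above can be sketched as follows. This is a minimal illustration, not a definitive implementation: `summarize` is a placeholder for an LLM summarization call, and the section names mirror the structured format described later in this document.

```python
# Anchored iterative summarization sketch: summarize only newly-truncated
# turns and merge them into the existing structured summary.

SECTIONS = ["Session Intent", "Files Modified", "Decisions Made",
            "Current State", "Next Steps"]

def summarize(messages):
    # Placeholder for an LLM call: returns new bullets per section
    # extracted from the truncated messages.
    return {s: [f"(from {len(messages)} truncated messages)"] for s in SECTIONS}

def compress(summary, history, keep_last=10):
    """Summarize only the newly-truncated turns and merge them into the
    anchored summary instead of regenerating the whole thing."""
    truncated, recent = history[:-keep_last], history[-keep_last:]
    if truncated:
        delta = summarize(truncated)
        for section in SECTIONS:
            # Merge: append new bullets; never discard anchored ones.
            summary.setdefault(section, []).extend(delta.get(section, []))
    return summary, recent
```

Because existing bullets are only ever appended to, the explicit section structure forces critical information to survive repeated compression cycles.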

2. Opaque Compression

  • Highest compression ratios (99%+)
  • Sacrifices interpretability
  • Cannot verify what was preserved

3. Regenerative Full Summary

  • Generate detailed summary on each compression
  • Readable but may lose details across cycles
  • Full regeneration rather than merging

Structured Summary Format

```markdown
## Session Intent

[What the user is trying to accomplish]

## Files Modified

- auth.controller.ts: Fixed JWT token generation
- config/redis.ts: Updated connection pooling

## Decisions Made

- Using Redis connection pool instead of per-request
- Retry logic with exponential backoff

## Current State

- 14 tests passing, 2 failing
- Remaining: mock setup for session service tests

## Next Steps

1. Fix remaining test failures
2. Run full test suite
3. Update documentation
```

Compression Triggers

| Strategy | Trigger | Trade-off |
|---|---|---|
| Fixed threshold | 70-80% context | Simple but may compress early |
| Sliding window | Last N turns + summary | Predictable size |
| Importance-based | Low-relevance first | Complex but preserves signal |
| Task-boundary | At task completions | Clean but unpredictable |
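A fixed-threshold trigger is the simplest of these to sketch. The token counter below is a naive whitespace stand-in (a real system would use the model's own tokenizer); the function names are illustrative assumptions.

```python
# Fixed-threshold compression trigger: fire once context utilization
# reaches 70-80% of the window.

def count_tokens(messages):
    # Naive stand-in: real systems would use the model's tokenizer.
    return sum(len(m.split()) for m in messages)

def should_compress(messages, context_limit, threshold=0.75):
    """Return True once utilization reaches the threshold fraction
    of the context window."""
    return count_tokens(messages) / context_limit >= threshold
```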

The Artifact Trail Problem

File tracking is the weakest dimension (2.2-2.5/5.0 in evaluations). Coding agents need:
  • Which files were created
  • Which files were modified and what changed
  • Which files were read but not changed
  • Function names, variable names, error messages
Solution: Separate artifact index or explicit file-state tracking.
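A separate artifact index might look like the sketch below. Everything here is a hypothetical illustration of the idea, not an established API; the file paths come from the example summary earlier in this document.

```python
# Artifact index sketch: track file state (created / modified / read)
# outside the conversational summary so the trail survives compression.

from dataclasses import dataclass, field

@dataclass
class ArtifactIndex:
    created: set = field(default_factory=set)
    modified: dict = field(default_factory=dict)  # path -> change note
    read_only: set = field(default_factory=set)

    def record_create(self, path):
        self.created.add(path)

    def record_modify(self, path, note):
        self.modified[path] = note
        self.read_only.discard(path)  # a modified file is no longer read-only

    def record_read(self, path):
        if path not in self.created and path not in self.modified:
            self.read_only.add(path)

    def render(self):
        """Render as markdown bullets for injection into the summary."""
        lines = [f"- {p} (created)" for p in sorted(self.created)]
        lines += [f"- {p}: {note}" for p, note in sorted(self.modified.items())]
        lines += [f"- {p} (read only)" for p in sorted(self.read_only)]
        return "\n".join(lines)
```

Keeping this index outside the summary means file state is updated deterministically by tool calls rather than reconstructed by the summarizer, which is exactly where evaluations show summaries are weakest.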

Probe-Based Evaluation

Test compression quality with probes:
| Probe Type | Tests | Example |
|---|---|---|
| Recall | Factual retention | "What was the original error?" |
| Artifact | File tracking | "Which files have we modified?" |
| Continuation | Task planning | "What should we do next?" |
| Decision | Reasoning chain | "What did we decide about Redis?" |
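A probe harness can be as simple as keyword checks over the agent's answers after compression. This is a rough sketch under stated assumptions: `ask_agent` is a placeholder for whatever query interface the agent exposes, and the expected keywords reuse facts from this document's example session.

```python
# Probe-based evaluation sketch: after compressing, ask probe questions
# and check that expected facts survive in the answers.

PROBES = [
    ("recall",       "What was the original error?",    "JWT"),
    ("artifact",     "Which files have we modified?",   "auth.controller.ts"),
    ("continuation", "What should we do next?",         "test"),
    ("decision",     "What did we decide about Redis?", "connection pool"),
]

def score_probes(ask_agent):
    """Return the fraction of probes whose expected keyword appears
    (case-insensitively) in the agent's post-compression answer."""
    hits = sum(expected.lower() in ask_agent(question).lower()
               for _, question, expected in PROBES)
    return hits / len(PROBES)
```

Real evaluations would grade answers with a judge model rather than keyword matching, but even this crude version catches summaries that silently drop the artifact trail.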

Compression Ratios

| Method | Compression | Quality | Trade-off |
|---|---|---|---|
| Anchored Iterative | 98.6% | 3.70 | Best quality |
| Regenerative | 98.7% | 3.44 | Moderate |
| Opaque | 99.3% | 3.35 | Best compression |

The extra 0.7% of tokens buys 0.35 quality points, worth it when re-fetching costs matter.

Three-Phase Workflow (Large Codebases)

  1. Research Phase: Explore and compress into structured analysis
  2. Planning Phase: Convert to implementation spec (~2,000 words for 5M tokens)
  3. Implementation Phase: Execute against the spec

Best Practices

  1. Optimize for tokens-per-task, not tokens-per-request
  2. Use structured summaries with explicit file sections
  3. Trigger compression at 70-80% utilization
  4. Implement incremental merging over regeneration
  5. Test with probe-based evaluation
  6. Track artifact trail separately if critical
  7. Monitor re-fetching frequency as quality signal