Context Compression Strategies
When agent sessions generate millions of tokens, compression becomes mandatory. Optimize for tokens-per-task (total tokens to complete a task), not tokens-per-request.
Compression Approaches
1. Anchored Iterative Summarization (Recommended)
- Maintain structured summaries with explicit sections
- On compression, summarize only newly-truncated content
- Merge with existing summary instead of regenerating
- Structure forces preservation of critical info
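A minimal sketch of the anchored approach, assuming a summary held as a dict of sections; the section names mirror the template below, but the merge rules (append-only vs. replace-on-update) are illustrative assumptions, not a prescribed schema:

```python
# Anchored iterative summarization: keep a structured summary with fixed
# sections, and on each compression merge only a summary of the newly
# truncated turns into it instead of regenerating the whole thing.

SECTIONS = ["Session Intent", "Files Modified", "Decisions Made",
            "Current State", "Next Steps"]

def merge_summary(existing: dict, new_chunk: dict) -> dict:
    """Merge a summary of newly truncated content into the anchor summary."""
    merged = {}
    for section in SECTIONS:
        old = existing.get(section, [])
        new = new_chunk.get(section, [])
        if section == "Current State":
            # State is replaced, not appended: the latest snapshot
            # supersedes the old one.
            merged[section] = new or old
        else:
            # Append-only sections accumulate, skipping exact duplicates,
            # so early facts survive every later compression cycle.
            merged[section] = old + [item for item in new if item not in old]
    return merged

anchor = {"Files Modified": ["auth.controller.ts: Fixed JWT token generation"],
          "Current State": ["14 tests passing, 2 failing"]}
update = {"Files Modified": ["config/redis.ts: Updated connection pooling"],
          "Current State": ["16 tests passing, 0 failing"]}
print(merge_summary(anchor, update))
```

Because the fixed section list is always iterated, a section the model "forgets" to mention stays present (empty) rather than silently vanishing, which is what "structure forces preservation" means in practice.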
2. Opaque Compression
- Highest compression ratios (99%+)
- Sacrifices interpretability
- Cannot verify what was preserved
3. Regenerative Full Summary
- Generate detailed summary on each compression
- Readable but may lose details across cycles
- Full regeneration rather than merging
Structured Summary Format
```markdown
## Session Intent
[What the user is trying to accomplish]

## Files Modified
- auth.controller.ts: Fixed JWT token generation
- config/redis.ts: Updated connection pooling

## Decisions Made
- Using Redis connection pool instead of per-request
- Retry logic with exponential backoff

## Current State
- 14 tests passing, 2 failing
- Remaining: mock setup for session service tests

## Next Steps
- Fix remaining test failures
- Run full test suite
- Update documentation
```
Compression Triggers
| Strategy | Trigger | Trade-off |
|---|---|---|
| Fixed threshold | 70-80% context | Simple but may compress early |
| Sliding window | Last N turns + summary | Predictable size |
| Importance-based | Low-relevance first | Complex but preserves signal |
| Task-boundary | At task completions | Clean but unpredictable |
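As one concrete instance, the fixed-threshold strategy from the table is only a few lines; the 80% default and the token counts here are illustrative assumptions:

```python
def should_compress(used_tokens: int, context_limit: int,
                    threshold: float = 0.8) -> bool:
    """Fixed-threshold trigger: compress once utilization crosses threshold."""
    return used_tokens / context_limit >= threshold

# With a hypothetical 128k-token context and the default 80% threshold,
# compression fires at ~102k tokens.
print(should_compress(90_000, 128_000))   # ~70% utilization, below threshold
print(should_compress(105_000, 128_000))  # ~82% utilization, above threshold
```

The "may compress early" trade-off is visible here: a long tool result can push utilization past the threshold mid-task, which is what the task-boundary strategy avoids at the cost of unpredictable timing.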
The Artifact Trail Problem
File tracking is the weakest dimension (2.2-2.5/5.0 in evaluations). Coding agents need:
- Which files were created
- Which files were modified and what changed
- Which files were read but not changed
- Function names, variable names, error messages
Solution: Separate artifact index or explicit file-state tracking.
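One way to keep the artifact trail out of the lossy summary is a small explicit index updated on every file operation. The class and field names below are illustrative, not a fixed API:

```python
from dataclasses import dataclass, field

@dataclass
class ArtifactIndex:
    """Explicit file-state tracking, kept separate from the compressed summary."""
    created: set = field(default_factory=set)
    modified: dict = field(default_factory=dict)  # path -> last change note
    read_only: set = field(default_factory=set)   # read but never changed

    def record(self, path: str, op: str, note: str = "") -> None:
        if op == "create":
            self.created.add(path)
        elif op == "modify":
            self.modified[path] = note
        elif op == "read" and path not in self.created and path not in self.modified:
            self.read_only.add(path)

index = ArtifactIndex()
index.record("auth.controller.ts", "modify", "Fixed JWT token generation")
index.record("config/redis.ts", "modify", "Updated connection pooling")
index.record("session.service.ts", "read")
```

Since the index is plain structured data rather than summarized prose, it survives every compression cycle verbatim and can be re-injected into the prompt whenever the agent asks which files it has touched.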
Probe-Based Evaluation
Test compression quality with probes:
| Probe Type | Tests | Example |
|---|---|---|
| Recall | Factual retention | "What was the original error?" |
| Artifact | File tracking | "Which files have we modified?" |
| Continuation | Task planning | "What should we do next?" |
| Decision | Reasoning chain | "What did we decide about Redis?" |
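A probe harness can be as simple as (question, expected fact) pairs scored against the compressed summary. The error message and summary text below are invented for illustration, and substring matching stands in for the LLM-judged answers a real harness would use:

```python
# Probe-based evaluation: ask questions only answerable from the compressed
# summary, and score whether the expected fact survived compression.
PROBES = [
    ("recall",       "What was the original error?",    "ECONNREFUSED"),
    ("artifact",     "Which files have we modified?",   "auth.controller.ts"),
    ("continuation", "What should we do next?",         "Fix remaining test failures"),
    ("decision",     "What did we decide about Redis?", "connection pool"),
]

def score_probes(summary: str, probes=PROBES) -> float:
    """Fraction of probes whose expected fact appears in the summary."""
    hits = sum(1 for _, _, fact in probes if fact in summary)
    return hits / len(probes)

summary = ("Fixed ECONNREFUSED by moving auth.controller.ts to a Redis "
           "connection pool. Next Steps: Fix remaining test failures.")
print(score_probes(summary))  # all 4 facts retained -> 1.0
```

Running the same probes against each compression method is how the per-method quality scores in the next table can be produced.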
Compression Ratios
| Method | Compression | Quality | Trade-off |
|---|---|---|---|
| Anchored Iterative | 98.6% | 3.70 | Best quality |
| Regenerative | 98.7% | 3.44 | Moderate |
| Opaque | 99.3% | 3.35 | Best compression |
The extra 0.7 percentage points of retained tokens buys 0.35 quality points; that trade is worth it when re-fetching costs matter.
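A back-of-envelope tokens-per-task comparison makes the trade concrete. The compression ratios come from the table above; the session size and re-fetch costs are illustrative assumptions:

```python
# Tokens-per-task comparison: anchored iterative vs. opaque compression.
SESSION_TOKENS = 1_000_000  # assumed session size

def retained(ratio: float) -> int:
    """Tokens kept after compression at the given ratio."""
    return round(SESSION_TOKENS * (1 - ratio))

anchored = retained(0.986)              # 14,000 tokens kept
opaque = retained(0.993)                #  7,000 tokens kept
extra_summary_cost = anchored - opaque  #  7,000 extra tokens per compression

# Assume the lower-quality summary makes the agent re-read files it forgot
# about: e.g. two extra 5,000-token file reads per task.
refetch_cost = 2 * 5_000

print(extra_summary_cost)  # 7000
print(refetch_cost)        # 10000: re-fetching outweighs the richer summary
```

Under these assumptions a single pair of avoided re-reads already pays for the larger anchored summary, which is the tokens-per-task argument in miniature.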
Three-Phase Workflow (Large Codebases)
- Research Phase: Explore and compress into structured analysis
- Planning Phase: Convert to implementation spec (~2,000 words for 5M tokens)
- Implementation Phase: Execute against the spec
Best Practices
- Optimize for tokens-per-task, not tokens-per-request
- Use structured summaries with explicit file sections
- Trigger compression at 70-80% utilization
- Implement incremental merging over regeneration
- Test with probe-based evaluation
- Track artifact trail separately if critical
- Monitor re-fetching frequency as quality signal
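The last practice, using re-fetching frequency as a quality signal, can be a simple counter over the session's tool calls; the tool-call record shape here is an assumption:

```python
from collections import Counter

def refetch_rate(tool_calls: list) -> float:
    """Fraction of file reads that re-read an already-read path.
    A rising rate after compression suggests the summary lost file state."""
    reads = [c["path"] for c in tool_calls if c["tool"] == "read_file"]
    counts = Counter(reads)
    repeats = sum(n - 1 for n in counts.values())
    return repeats / len(reads) if reads else 0.0

calls = [{"tool": "read_file", "path": "auth.controller.ts"},
         {"tool": "read_file", "path": "config/redis.ts"},
         {"tool": "read_file", "path": "auth.controller.ts"}]  # a re-fetch
print(refetch_rate(calls))  # 1 repeat in 3 reads -> ~0.33
```

Comparing this rate before and after each compression event gives a cheap, probe-free health check on what the summary actually preserved.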