Context Compression Strategies
When agent sessions generate millions of tokens, compression becomes mandatory. Optimize for tokens-per-task (total tokens to complete a task), not tokens-per-request.
Compression Approaches
1. Anchored Iterative Summarization (Recommended)
- Maintain structured summaries with explicit sections
- On compression, summarize only newly-truncated content
- Merge with existing summary instead of regenerating
- Structure forces preservation of critical info
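A minimal sketch of the anchored approach, assuming a summary held as a dict of sections; the section names mirror the template below, but the merge rules (append-only vs. replace-on-update) are illustrative assumptions, not a prescribed schema:

```python
# Anchored iterative summarization: keep a structured summary with fixed
# sections, and on each compression merge only a summary of the newly
# truncated turns into it instead of regenerating the whole thing.

SECTIONS = ["Session Intent", "Files Modified", "Decisions Made",
            "Current State", "Next Steps"]

def merge_summary(existing: dict, new_chunk: dict) -> dict:
    """Merge a summary of newly truncated content into the anchor summary."""
    merged = {}
    for section in SECTIONS:
        old = existing.get(section, [])
        new = new_chunk.get(section, [])
        if section == "Current State":
            # State is replaced, not appended: the latest snapshot
            # supersedes the old one.
            merged[section] = new or old
        else:
            # Append-only sections accumulate, skipping exact duplicates,
            # so early facts survive every later compression cycle.
            merged[section] = old + [item for item in new if item not in old]
    return merged

anchor = {"Files Modified": ["auth.controller.ts: Fixed JWT token generation"],
          "Current State": ["14 tests passing, 2 failing"]}
update = {"Files Modified": ["config/redis.ts: Updated connection pooling"],
          "Current State": ["16 tests passing, 0 failing"]}
print(merge_summary(anchor, update))
```

Because the fixed section list is always iterated, a section the model "forgets" to mention stays present (empty) rather than silently vanishing, which is what "structure forces preservation" means in practice.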
2. Opaque Compression
- Highest compression ratios (99%+)
- Sacrifices interpretability
- Cannot verify what was preserved
3. Regenerative Full Summary
- Generate detailed summary on each compression
- Readable but may lose details across cycles
- Full regeneration rather than merging
Structured Summary Format
```markdown
## Session Intent
[What the user is trying to accomplish]

## Files Modified
- auth.controller.ts: Fixed JWT token generation
- config/redis.ts: Updated connection pooling

## Decisions Made
- Using Redis connection pool instead of per-request
- Retry logic with exponential backoff

## Current State
- 14 tests passing, 2 failing
- Remaining: mock setup for session service tests

## Next Steps
- Fix remaining test failures
- Run full test suite
- Update documentation
```
Compression Triggers
| Strategy | Trigger | Trade-off |
|---|---|---|
| Fixed threshold | 70-80% context | Simple but may compress early |
| Sliding window | Last N turns + summary | Predictable size |
| Importance-based | Low-relevance first | Complex but preserves signal |
| Task-boundary | At task completions | Clean but unpredictable |
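As one concrete instance, the fixed-threshold strategy from the table is only a few lines; the 80% default and the token counts here are illustrative assumptions:

```python
def should_compress(used_tokens: int, context_limit: int,
                    threshold: float = 0.8) -> bool:
    """Fixed-threshold trigger: compress once utilization crosses threshold."""
    return used_tokens / context_limit >= threshold

# With a hypothetical 128k-token context and the default 80% threshold,
# compression fires at ~102k tokens.
print(should_compress(90_000, 128_000))   # ~70% utilization, below threshold
print(should_compress(105_000, 128_000))  # ~82% utilization, above threshold
```

The "may compress early" trade-off is visible here: a long tool result can push utilization past the threshold mid-task, which is what the task-boundary strategy avoids at the cost of unpredictable timing.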
The Artifact Trail Problem
File tracking is the weakest dimension (2.2-2.5/5.0 in evaluations). Coding agents need:
- Which files were created
- Which files were modified and what changed
- Which files were read but not changed
- Function names, variable names, error messages
Solution: Separate artifact index or explicit file-state tracking.
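One way to keep the artifact trail out of the lossy summary is a small explicit index updated on every file operation. The class and field names below are illustrative, not a fixed API:

```python
from dataclasses import dataclass, field

@dataclass
class ArtifactIndex:
    """Explicit file-state tracking, kept separate from the compressed summary."""
    created: set = field(default_factory=set)
    modified: dict = field(default_factory=dict)  # path -> last change note
    read_only: set = field(default_factory=set)   # read but never changed

    def record(self, path: str, op: str, note: str = "") -> None:
        if op == "create":
            self.created.add(path)
        elif op == "modify":
            self.modified[path] = note
        elif op == "read" and path not in self.created and path not in self.modified:
            self.read_only.add(path)

index = ArtifactIndex()
index.record("auth.controller.ts", "modify", "Fixed JWT token generation")
index.record("config/redis.ts", "modify", "Updated connection pooling")
index.record("session.service.ts", "read")
```

Since the index is plain structured data rather than summarized prose, it survives every compression cycle verbatim and can be re-injected into the prompt whenever the agent asks which files it has touched.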
Probe-Based Evaluation
Test compression quality with probes:
| Probe Type | Tests | Example |
|---|---|---|
| Recall | Factual retention | "What was the original error?" |
| Artifact | File tracking | "Which files have we modified?" |
| Continuation | Task planning | "What should we do next?" |
| Decision | Reasoning chain | "What did we decide about Redis?" |
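A probe harness can be as simple as (question, expected fact) pairs scored against the compressed summary. The error message and summary text below are invented for illustration, and substring matching stands in for the LLM-judged answers a real harness would use:

```python
# Probe-based evaluation: ask questions only answerable from the compressed
# summary, and score whether the expected fact survived compression.
PROBES = [
    ("recall",       "What was the original error?",    "ECONNREFUSED"),
    ("artifact",     "Which files have we modified?",   "auth.controller.ts"),
    ("continuation", "What should we do next?",         "Fix remaining test failures"),
    ("decision",     "What did we decide about Redis?", "connection pool"),
]

def score_probes(summary: str, probes=PROBES) -> float:
    """Fraction of probes whose expected fact appears in the summary."""
    hits = sum(1 for _, _, fact in probes if fact in summary)
    return hits / len(probes)

summary = ("Fixed ECONNREFUSED by moving auth.controller.ts to a Redis "
           "connection pool. Next Steps: Fix remaining test failures.")
print(score_probes(summary))  # all 4 facts retained -> 1.0
```

Running the same probes against each compression method is how the per-method quality scores in the next table can be produced.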
Compression Ratios
| Method | Compression | Quality | Trade-off |
|---|---|---|---|
| Anchored Iterative | 98.6% | 3.70 | Best quality |
| Regenerative | 98.7% | 3.44 | Moderate |
| Opaque | 99.3% | 3.35 | Best compression |
The extra 0.7 percentage points of retained tokens buys 0.35 quality points; that trade is worth it when re-fetching costs matter.
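A back-of-envelope tokens-per-task comparison makes the trade concrete. The compression ratios come from the table above; the session size and re-fetch costs are illustrative assumptions:

```python
# Tokens-per-task comparison: anchored iterative vs. opaque compression.
SESSION_TOKENS = 1_000_000  # assumed session size

def retained(ratio: float) -> int:
    """Tokens kept after compression at the given ratio."""
    return round(SESSION_TOKENS * (1 - ratio))

anchored = retained(0.986)              # 14,000 tokens kept
opaque = retained(0.993)                #  7,000 tokens kept
extra_summary_cost = anchored - opaque  #  7,000 extra tokens per compression

# Assume the lower-quality summary makes the agent re-read files it forgot
# about: e.g. two extra 5,000-token file reads per task.
refetch_cost = 2 * 5_000

print(extra_summary_cost)  # 7000
print(refetch_cost)        # 10000: re-fetching outweighs the richer summary
```

Under these assumptions a single pair of avoided re-reads already pays for the larger anchored summary, which is the tokens-per-task argument in miniature.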
Three-Phase Workflow (Large Codebases)
- Research Phase: Explore and compress into structured analysis
- Planning Phase: Convert to implementation spec (~2,000 words for 5M tokens)
- Implementation Phase: Execute against the spec
Best Practices
- Optimize for tokens-per-task, not tokens-per-request
- Use structured summaries with explicit file sections
- Trigger compression at 70-80% utilization
- Implement incremental merging over regeneration
- Test with probe-based evaluation
- Track artifact trail separately if critical
- Monitor re-fetching frequency as quality signal
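The last practice, using re-fetching frequency as a quality signal, can be a simple counter over the session's tool calls; the tool-call record shape here is an assumption:

```python
from collections import Counter

def refetch_rate(tool_calls: list) -> float:
    """Fraction of file reads that re-read an already-read path.
    A rising rate after compression suggests the summary lost file state."""
    reads = [c["path"] for c in tool_calls if c["tool"] == "read_file"]
    counts = Counter(reads)
    repeats = sum(n - 1 for n in counts.values())
    return repeats / len(reads) if reads else 0.0

calls = [{"tool": "read_file", "path": "auth.controller.ts"},
         {"tool": "read_file", "path": "config/redis.ts"},
         {"tool": "read_file", "path": "auth.controller.ts"}]  # a re-fetch
print(refetch_rate(calls))  # 1 repeat in 3 reads -> ~0.33
```

Comparing this rate before and after each compression event gives a cheap, probe-free health check on what the summary actually preserved.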