Context Degradation Patterns
Language models exhibit predictable degradation as context grows. Understanding these patterns is essential for diagnosing failures and designing resilient systems.
Degradation Patterns
| Pattern | Cause | Symptoms |
|---|---|---|
| Lost-in-Middle | Attention mechanics | 10-40% lower recall for middle content |
| Context Poisoning | Errors compound | Tool misalignment, persistent hallucinations |
| Context Distraction | Irrelevant info | Uses wrong information for decisions |
| Context Confusion | Mixed tasks | Responses address wrong aspects |
| Context Clash | Conflicting info | Contradictory guidance derails reasoning |
Lost-in-Middle
Information at the beginning and end of the context receives reliable attention; middle content suffers dramatically reduced recall.
Mitigation:
```markdown
[CURRENT TASK]  # At start (high attention)
- Goal: Generate quarterly report
- Deadline: End of week

[DETAILED CONTEXT]  # Middle (less attention)
- 50 pages of data
- Supporting evidence

[KEY FINDINGS]  # At end (high attention)
- Revenue up 15%
- Growth in Region A
```
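This layout can also be assembled programmatically. A minimal sketch, where the section names and the `build_prompt` helper are illustrative assumptions, not a real API:

```python
def build_prompt(task: str, details: str, findings: str) -> str:
    """Assemble context so high-priority content sits at the edges.

    The task goes first and the key findings go last, where attention
    is most reliable; bulky supporting detail is relegated to the
    middle, where reduced recall is least harmful.
    """
    return "\n\n".join([
        "[CURRENT TASK]\n" + task,          # start: high attention
        "[DETAILED CONTEXT]\n" + details,   # middle: less attention
        "[KEY FINDINGS]\n" + findings,      # end: high attention
    ])

prompt = build_prompt(
    task="Generate quarterly report by end of week",
    details="50 pages of data and supporting evidence",
    findings="Revenue up 15%; growth in Region A",
)
```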
Context Poisoning
Once errors enter context, they compound through repeated reference.
Entry pathways:
- Tool outputs with errors
- Retrieved docs with incorrect info
- Model-generated summaries with hallucinations
Symptoms:
- Tool calls with wrong parameters
- Strategies that take effort to undo
- Hallucinations that persist despite correction
Recovery:
- Truncate to before poisoning point
- Explicitly note poisoning and re-evaluate
- Restart with clean context
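The first recovery option, truncating to before the poisoning point, can be sketched over a plain message list. The message format and the `recover` helper are assumptions for illustration, not a specific framework's API:

```python
def recover(messages: list[dict], poisoned_index: int) -> list[dict]:
    """Drop everything from the first poisoned message onward.

    A note is appended so the model explicitly re-evaluates instead
    of silently reconstructing the bad trajectory from memory of the
    conversation shape.
    """
    clean = messages[:poisoned_index]
    clean.append({
        "role": "user",
        "content": "Note: an earlier tool output was incorrect and has "
                   "been removed. Re-evaluate before continuing.",
    })
    return clean

history = [
    {"role": "user", "content": "Summarize Q3 revenue."},
    {"role": "tool", "content": "Revenue: $-15B"},  # erroneous tool output
    {"role": "assistant", "content": "Revenue collapsed to -$15B..."},
]
history = recover(history, poisoned_index=1)
```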
Context Distraction
Even a single irrelevant document reduces performance. Models must attend to everything in the window; they cannot "skip" irrelevant content.
Mitigation:
- Filter for relevance before loading
- Use namespacing for organization
- Access via tools instead of context
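The first mitigation, filtering before loading, can be sketched with a crude token-overlap score. In practice an embedding-based relevance model would replace this; the scoring and threshold here are illustrative assumptions:

```python
def filter_relevant(docs: list[str], query: str,
                    min_overlap: float = 0.2) -> list[str]:
    """Keep only documents sharing enough vocabulary with the query.

    Overlap is the fraction of query tokens that appear in the doc;
    anything below the threshold never enters the context window.
    """
    query_tokens = set(query.lower().split())
    kept = []
    for doc in docs:
        doc_tokens = set(doc.lower().split())
        overlap = len(query_tokens & doc_tokens) / max(len(query_tokens), 1)
        if overlap >= min_overlap:
            kept.append(doc)
    return kept

docs = [
    "quarterly revenue grew in region a",
    "cafeteria menu for next tuesday",
]
relevant = filter_relevant(docs, "revenue growth by region")
```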
Degradation Thresholds
| Model | Degradation Onset | Severe Degradation |
|---|---|---|
| GPT-5.2 | ~64K tokens | ~200K tokens |
| Claude Opus 4.5 | ~100K tokens | ~180K tokens |
| Claude Sonnet 4.5 | ~80K tokens | ~150K tokens |
| Gemini 3 Pro | ~500K tokens | ~800K tokens |
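A compaction trigger can be keyed off these onset figures. A minimal sketch, assuming the table above and a caller-supplied token count; the 80% margin and the fallback value are design assumptions:

```python
# Degradation onset per model (from the table above), in tokens.
DEGRADATION_ONSET = {
    "gpt-5.2": 64_000,
    "claude-opus-4.5": 100_000,
    "claude-sonnet-4.5": 80_000,
    "gemini-3-pro": 500_000,
}

def should_compact(model: str, token_count: int,
                   margin: float = 0.8) -> bool:
    """Trigger compaction before the onset threshold, not at it.

    margin=0.8 fires at 80% of onset, so summarization happens while
    the model is still reliable enough to summarize accurately.
    """
    onset = DEGRADATION_ONSET.get(model)
    if onset is None:
        return token_count > 32_000  # conservative fallback for unknown models
    return token_count > onset * margin

should_compact("gpt-5.2", 60_000)  # True: past 80% of the 64K onset
should_compact("gpt-5.2", 40_000)  # False: still under the margin
```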
The Four-Bucket Approach
| Strategy | Purpose |
|---|---|
| Write | Save context outside window |
| Select | Pull relevant context in |
| Compress | Reduce tokens, preserve info |
| Isolate | Split across sub-agents |
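The four buckets can be sketched as operations on a shared context state. Everything here (the `ContextState` shape, the external scratchpad, the summary placeholder) is an illustrative assumption, not a specific framework's API:

```python
from dataclasses import dataclass, field

@dataclass
class ContextState:
    window: list[str] = field(default_factory=list)          # in the prompt
    external: dict[str, str] = field(default_factory=dict)   # outside it

def write(ctx: ContextState, key: str, note: str) -> None:
    """Write: save context outside the window (scratchpad/memory)."""
    ctx.external[key] = note

def select(ctx: ContextState, key: str) -> None:
    """Select: pull relevant saved context back into the window."""
    if key in ctx.external:
        ctx.window.append(ctx.external[key])

def compress(ctx: ContextState, max_items: int) -> None:
    """Compress: collapse older entries into a one-line summary stub."""
    if len(ctx.window) > max_items:
        old, ctx.window = ctx.window[:-max_items], ctx.window[-max_items:]
        ctx.window.insert(0, f"[summary of {len(old)} earlier entries]")

def isolate(task: str) -> ContextState:
    """Isolate: give a sub-task its own fresh context window."""
    return ContextState(window=[task])
```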
Counterintuitive Findings
- Shuffled haystacks outperform coherent ones: coherent context creates false associations
- Single distractors have outsized impact: degradation is a step function, not proportional to distractor count
- Needle-question similarity matters: needles dissimilar to the question degrade faster
When Larger Contexts Hurt
- Performance degrades non-linearly after threshold
- Cost grows quadratically with context length (attention scales as O(n²))
- Cognitive bottleneck remains regardless of size
Best Practices
- Monitor context length and performance correlation
- Place critical information at beginning or end
- Implement compaction triggers before degradation
- Validate retrieved documents for accuracy
- Use versioning to prevent outdated info clash
- Segment tasks to prevent confusion
- Design for graceful degradation
- Test with progressively larger contexts
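The last practice, testing with progressively larger contexts, can be sketched as a needle-in-a-haystack harness. The `recalls_needle` probe below is a trivial stand-in for a real model call, and all names are illustrative:

```python
def make_haystack(needle: str, filler: str, n_filler: int,
                  position: float) -> str:
    """Build a test context of n_filler filler lines with the needle
    inserted at a relative position (0.0 = start, 1.0 = end)."""
    lines = [filler] * n_filler
    lines.insert(int(position * n_filler), needle)
    return "\n".join(lines)

def sweep(needle: str, sizes: list[int], recalls_needle) -> dict[int, bool]:
    """Probe recall at increasing context sizes, with the needle
    buried mid-context, where degradation hits first."""
    return {
        n: recalls_needle(make_haystack(needle, "irrelevant filler.", n, 0.5))
        for n in sizes
    }

# Stand-in probe: a real harness would send the haystack to the model
# and check its answer; substring search here just keeps this runnable.
results = sweep(
    "The launch code is 7421.",
    sizes=[100, 1_000, 10_000],
    recalls_needle=lambda ctx: "7421" in ctx,
)
```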