context-compression


Use when compressing agent context, implementing conversation summarization, reducing token usage in long sessions, or asking about "context compression", "conversation history", "token optimization", "context limits", "summarization strategies"


NPX Install

```
npx skill4agent add eyadsibai/ltk context-compression
```

Context Compression Strategies

When agent sessions generate millions of tokens, compression becomes mandatory. Optimize for tokens-per-task (total tokens to complete a task), not tokens-per-request.

Compression Approaches

1. Anchored Iterative Summarization (Recommended)

  • Maintain structured summaries with explicit sections
  • On compression, summarize only newly-truncated content
  • Merge with existing summary instead of regenerating
  • Structure forces preservation of critical info

2. Opaque Compression

  • Highest compression ratios (99%+)
  • Sacrifices interpretability
  • Cannot verify what was preserved

3. Regenerative Full Summary

  • Generate detailed summary on each compression
  • Readable but may lose details across cycles
  • Full regeneration rather than merging
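The merge step in approach 1 can be sketched in Python. Section names follow the structured summary template below; `merge_summaries` and its append-vs-overwrite split are illustrative assumptions, not a fixed API:

```python
# Anchored iterative summarization: merge a summary of newly-truncated
# turns into the existing anchored summary instead of regenerating it.
# The split between accumulating and overwriting sections is an
# assumption for illustration.

SECTIONS = ["Session Intent", "Files Modified", "Decisions Made",
            "Current State", "Next Steps"]
STATE_SECTIONS = {"Current State", "Next Steps"}  # latest snapshot wins

def merge_summaries(existing: dict, delta: dict) -> dict:
    """Merge `delta` (summary of newly-truncated content) into `existing`."""
    merged = dict(existing)
    for section in SECTIONS:
        new_items = delta.get(section, [])
        if section in STATE_SECTIONS:
            # Overwrite: stale state or next steps would mislead the agent.
            merged[section] = new_items or existing.get(section, [])
        else:
            # Accumulate: intent, file changes, and decisions must survive
            # every compression cycle.
            merged[section] = existing.get(section, []) + new_items
    return merged
```

Because only the delta is summarized and merged, each compression cycle touches a bounded amount of new content rather than re-deriving the whole history.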

Structured Summary Format

```markdown
## Session Intent
[What the user is trying to accomplish]

## Files Modified
- auth.controller.ts: Fixed JWT token generation
- config/redis.ts: Updated connection pooling

## Decisions Made
- Using Redis connection pool instead of per-request
- Retry logic with exponential backoff

## Current State
- 14 tests passing, 2 failing
- Remaining: mock setup for session service tests

## Next Steps
1. Fix remaining test failures
2. Run full test suite
3. Update documentation
```
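As a minimal sketch, the template above can be emitted from a plain dict. This renders every section as a bullet list, whereas the template uses free text for Session Intent and a numbered list for Next Steps; adapting those is left out for brevity:

```python
# Render an anchored summary dict into a markdown block matching the
# template above. Everything here is an illustrative sketch, not an API.

def render_summary(sections: dict[str, list[str]]) -> str:
    """Emit '## <title>' headings with one bullet per item."""
    lines: list[str] = []
    for title, items in sections.items():
        lines.append(f"## {title}")
        lines.extend(f"- {item}" for item in items)
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip() + "\n"
```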

Compression Triggers

| Strategy | Trigger | Trade-off |
|---|---|---|
| Fixed threshold | 70-80% of context | Simple but may compress early |
| Sliding window | Last N turns + summary | Predictable size |
| Importance-based | Low-relevance first | Complex but preserves signal |
| Task-boundary | At task completions | Clean but unpredictable |
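A fixed-threshold trigger (the first row) reduces to a one-line check. The 0.75 default below sits in the 70-80% band; token counts are illustrative:

```python
# Fixed-threshold compression trigger: compress once context utilization
# crosses the threshold. Defaults and numbers are illustrative.

def should_compress(used_tokens: int, context_window: int,
                    threshold: float = 0.75) -> bool:
    """Return True once used_tokens / context_window reaches the threshold."""
    return used_tokens / context_window >= threshold
```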

The Artifact Trail Problem

File tracking is the weakest dimension (2.2-2.5/5.0 in evaluations). Coding agents need:
  • Which files were created
  • Which files were modified and what changed
  • Which files were read but not changed
  • Function names, variable names, error messages
Solution: Separate artifact index or explicit file-state tracking.
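One way to track artifacts separately is a small index updated on every tool call; the class and method names here are hypothetical, shown only to make the file-state distinctions concrete:

```python
# Hypothetical artifact index kept outside the summary, tracking the three
# file states the skill calls out: created, modified (with what changed),
# and read but not changed.

from dataclasses import dataclass, field

@dataclass
class ArtifactIndex:
    created: set = field(default_factory=set)
    modified: dict = field(default_factory=dict)   # path -> what changed
    read_only: set = field(default_factory=set)    # read but never written

    def record_write(self, path: str, change: str, new_file: bool = False):
        if new_file:
            self.created.add(path)
        self.modified[path] = change
        self.read_only.discard(path)  # no longer merely read

    def record_read(self, path: str):
        if path not in self.created and path not in self.modified:
            self.read_only.add(path)
```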

Probe-Based Evaluation

Test compression quality with probes:
| Probe Type | Tests | Example |
|---|---|---|
| Recall | Factual retention | "What was the original error?" |
| Artifact | File tracking | "Which files have we modified?" |
| Continuation | Task planning | "What should we do next?" |
| Decision | Reasoning chain | "What did we decide about Redis?" |
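A probe harness can be as small as the sketch below. `ask_model` stands in for whatever LLM call your agent uses, and scoring by substring match is a deliberately naive assumption; real harnesses typically use an LLM judge:

```python
# Probe-based evaluation sketch: ask each probe question against the
# compressed context and check whether the expected fact survived.
# `ask_model` and the substring scoring are stand-ins, not a real API.

PROBES = [
    ("recall", "What was the original error?"),
    ("artifact", "Which files have we modified?"),
    ("continuation", "What should we do next?"),
    ("decision", "What did we decide about Redis?"),
]

def score_probes(ask_model, compressed_context: str,
                 expected: dict) -> float:
    """Fraction of probes whose expected fact appears in the answer."""
    hits = 0
    for probe_type, question in PROBES:
        answer = ask_model(compressed_context, question)
        if expected[probe_type].lower() in answer.lower():
            hits += 1
    return hits / len(PROBES)
```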

Compression Ratios

| Method | Compression | Quality (/5.0) | Trade-off |
|---|---|---|---|
| Anchored Iterative | 98.6% | 3.70 | Best quality |
| Regenerative | 98.7% | 3.44 | Moderate |
| Opaque | 99.3% | 3.35 | Best compression |

The extra 0.7% of tokens kept by anchored iterative buys 0.35 quality points over opaque compression, which is worth it when re-fetching costs matter.

Three-Phase Workflow (Large Codebases)

  1. Research Phase: Explore and compress into structured analysis
  2. Planning Phase: Convert the analysis into an implementation spec (a spec of roughly 2,000 words can stand in for ~5M tokens of research)
  3. Implementation Phase: Execute against the spec

Best Practices

  1. Optimize for tokens-per-task, not tokens-per-request
  2. Use structured summaries with explicit file sections
  3. Trigger compression at 70-80% utilization
  4. Implement incremental merging over regeneration
  5. Test with probe-based evaluation
  6. Track artifact trail separately if critical
  7. Monitor re-fetching frequency as quality signal