token-saver-context-compression


Token Saver Context Compression


Use this skill to reduce token usage while preserving grounded evidence. It integrates:
  • pnpm search:code (hybrid retrieval)
  • token-saver Python compression scripts
  • MemoryRecord persistence into framework memory
  • spawn prompt evidence injection ([mem:*] / [rag:*])
When to Use


  • pnpm search:tokens shows a file/directory exceeds 32K tokens
  • Context is large or expensive and you need a compressed summary
  • You need query-targeted compression before synthesis
  • You need hard evidence sufficiency gating before persisting memory
  • You're building a prompt and search:code results alone aren't enough context

Iron Law


Do not persist compressed content directly to memory files from a subprocess. Emit MemoryRecord payloads and let framework hooks handle sync and indexing.
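The rule above can be sketched as follows. This is illustrative only: the payload fields shown here are assumptions, not the framework's exact MemoryRecord schema. The point is that the subprocess writes structured output to stdout and never opens memory files itself.

```python
import json
import sys

# Hypothetical MemoryRecord payload shape (field names are assumptions).
# The subprocess emits this on stdout; framework hooks do sync/indexing.
record = {
    "target": "patterns.json",  # routing target, per the Mapping Rule section
    "text": "Compression preserves citation ids verbatim",
    "source": "token-saver",
}

# Emit the payload; do NOT write to .claude/context/memory/* directly.
json.dump({"memoryRecords": [record]}, sys.stdout)
```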

Workflow


  1. Retrieve candidate context (pnpm search:code "<query>").
  2. Compress using token-saver in JSON mode (run_skill_workflow.py --output-format json).
  3. If evidence is insufficient and the fail gate is on, stop.
  4. Map distilled insights into MemoryRecord-ready payloads.
  5. Persist through MemoryRecord so .claude/hooks/memory/sync-memory-index.cjs runs.
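The five steps can be sketched end to end. Here search_code and compress are hypothetical stand-ins for `pnpm search:code` and run_skill_workflow.py; a real run shells out to those tools instead.

```python
# Stubbed sketch of the workflow; the stand-in functions and their return
# shapes are assumptions for illustration, not the real tool contracts.
def search_code(query: str) -> list[dict]:
    # step 1: retrieve candidate context (stand-in for `pnpm search:code`)
    return [{"file": "lib/memory/store.ts", "text": "gotcha: hook ordering matters"}]

def compress(chunks: list[dict], query: str) -> dict:
    # step 2: compress in JSON mode (stand-in for run_skill_workflow.py)
    return {"insights": [c["text"] for c in chunks], "evidence_sufficient": True}

def run(query: str, fail_gate: bool = True) -> list[dict]:
    result = compress(search_code(query), query)
    if fail_gate and not result["evidence_sufficient"]:
        raise SystemExit("insufficient evidence")        # step 3: hard stop
    # step 4: map insights into MemoryRecord-ready payloads (routing elided)
    records = [{"target": "gotchas.json", "text": t} for t in result["insights"]]
    return records                                       # step 5: emit; hooks persist

print(run("how does memory persistence work"))
```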

Mapping Rule (Deterministic)


  • gotchas.json: text contains gotcha|pitfall|anti-pattern|risk|warning|failure
  • issues.md: text contains issue|bug|error|incident|defect|gap
  • decisions.md: text contains decision|tradeoff|choose|selected|rationale
  • patterns.json: default fallback for all remaining distilled evidence
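The deterministic rule above amounts to first-match keyword routing with patterns.json as the fallback. A minimal sketch (rule order is an assumption about precedence; the real engine may differ):

```python
import re

# Keyword routing of distilled evidence to memory targets.
# First matching rule wins; patterns.json is the default fallback.
RULES = [
    ("gotchas.json",  re.compile(r"gotcha|pitfall|anti-pattern|risk|warning|failure", re.I)),
    ("issues.md",     re.compile(r"issue|bug|error|incident|defect|gap", re.I)),
    ("decisions.md",  re.compile(r"decision|tradeoff|choose|selected|rationale", re.I)),
]

def route_evidence(text: str) -> str:
    for target, pattern in RULES:
        if pattern.search(text):
            return target
    return "patterns.json"  # default fallback

print(route_evidence("Known pitfall: sync hook races"))       # gotchas.json
print(route_evidence("We selected SQLite for simplicity"))    # decisions.md
print(route_evidence("Helper caches embeddings per file"))    # patterns.json
```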

Tooling Commands


Preferred wrapper entrypoint:

```bash
node .claude/skills/token-saver-context-compression/scripts/main.cjs \
  --query "<question>" --mode evidence_aware --limit 20 --fail-on-insufficient-evidence
```

Direct Python engine (advanced):

```bash
python .claude/skills/token-saver-context-compression/scripts/run_skill_workflow.py \
  --file <path> --mode evidence_aware --query "<question>" \
  --output-format json --fail-on-insufficient-evidence
```

Output Contract


  • Wrapper emits JSON with:
    • search summary
    • compression summary
    • memoryRecords grouped by target (patterns, gotchas, issues, decisions)
    • evidence sufficiency status
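A consumer can validate the contract before acting on it. The concrete field shapes below are assumptions for illustration; only the four top-level keys come from the contract above.

```python
import json

# Hypothetical wrapper output matching the contract's top-level keys;
# the nested shapes are illustrative assumptions.
raw = json.dumps({
    "search": {"query": "memory persistence", "hits": 12},
    "compression": {"inputTokens": 128000, "outputTokens": 9000},
    "memoryRecords": {"patterns": [], "gotchas": [], "issues": [], "decisions": []},
    "evidence": {"sufficient": True},
})

payload = json.loads(raw)
# Guard: all four routing targets must be present before persisting anything.
assert set(payload["memoryRecords"]) == {"patterns", "gotchas", "issues", "decisions"}
if not payload["evidence"]["sufficient"]:
    raise SystemExit("insufficient evidence; do not persist MemoryRecords")
print("contract ok")
```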

Workflow References


  • Skill workflow: .claude/workflows/token-saver-context-compression-skill-workflow.md
  • Companion tool: .claude/tools/token-saver-context-compression/token-saver-context-compression.cjs
  • Command surface: .claude/skills/token-saver-context-compression/commands/token-saver-context-compression.md
  • Citation format is unchanged:
    • memory entries become [mem:xxxxxxxx]
    • RAG entries remain [rag:xxxxxxxx]

Integration with search:tokens


Use pnpm search:tokens to decide when to invoke this skill:

```bash
# Check if you need compression
pnpm search:tokens .claude/lib/memory
# Output: 60 files, 500KB, ~128K tokens ⚠ OVER CONTEXT

# Then compress with a targeted query
node .claude/skills/token-saver-context-compression/scripts/main.cjs \
  --query "how does memory persistence work" --mode evidence_aware --limit 10
```

The tool reads actual file content from search results (not just file paths), compresses via the Python engine, and extracts memory records classified by type (patterns, gotchas, issues, decisions).

Adaptive Compression

自适应压缩

Adaptive compression (adjusting compression ratio based on corpus size) is automatic and requires no env var configuration. When the input corpus is small, compression is lighter; when it is large, compression is more aggressive. This is controlled internally by the Python engine based on token counts.
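One way such corpus-size-based selection could look is sketched below. The thresholds and ratios here are invented for illustration; the real engine's internals are not documented here and may differ entirely.

```python
# Illustrative only: hypothetical token-count thresholds and target ratios.
def compression_ratio(token_count: int) -> float:
    """Return the fraction of input tokens to keep for a given corpus size."""
    if token_count < 8_000:
        return 0.8   # small corpus: light compression, keep most detail
    if token_count < 32_000:
        return 0.5   # medium corpus: moderate compression
    return 0.2       # large corpus: aggressive compression

print(compression_ratio(4_000))    # 0.8
print(compression_ratio(128_000))  # 0.2
```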

Requirements


  • Node.js 18+
  • Python 3.10+

Memory Protocol (MANDATORY)


Before work:

```bash
cat .claude/context/memory/learnings.md
```

After work:
  • Add integration learnings to .claude/context/memory/learnings.md
  • Add integration risks to .claude/context/memory/issues.md