doc-condenser

Document Condenser

Transform verbose technical documentation into concise, developer-focused specs.

Core Principles

  1. Paths first - Every file reference includes full/relative path
  2. Tables over prose - Use tables for metrics, coverage, file lists
  3. Code samples stay - Keep small, illustrative snippets; condense verbose examples to a representative snippet
  4. Commentary, not explanation - Brief context sentences, not paragraphs
  5. One-line history - Reference legacy docs, don't preserve their content

Output Structure

[path/to/output.md]

[Title] - [Subtitle if needed]

Purpose

[2-3 sentences: what this is, why it exists, key design principle]

Status

[Table: metrics, rates, performance]

Architecture Overview

[Optional diagram or brief flow description] [Only if it aids understanding]

Implementation Files

[Grouped by category with paths and one-line descriptions]

[Domain-Specific Sections]

[Tables, code snippets, brief commentary as needed]

Quick Reference

[Box or code block with key stats for scanning]

See `assets/template.md` for a copy-ready scaffold of this structure.

Transformation Rules

KEEP

  • File paths (always full or project-relative)
  • Metrics and measurements
  • Code snippets under 15 lines that illustrate patterns
  • Schema examples and data structures
  • Coverage/status tables

CONDENSE

  • Multi-paragraph explanations → 1-2 sentences
  • Verbose examples → representative snippet + "see X for more"
  • Implementation checklists → completion status table
  • Long rationales → single "Design principle: X" line
  • Code snippets longer than 20 lines → condense to the core pattern + a reference comment pointing to the source file
  • Rationale sections where the same point is restated across more than 3 sentences → collapse to one "Design principle:" line
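
The line-count thresholds in KEEP and CONDENSE can be sketched as a small triage helper. This is a minimal illustration, not part of the skill itself: the function names `triage_snippet` and `condense_snippet`, the `keep_lines` default, and the "review" outcome for snippets of 15 to 20 lines (a range the rules do not cover) are all assumptions. Truncating to the first lines is only a stand-in for choosing the core pattern by hand.

```python
def triage_snippet(snippet: str) -> str:
    """Return 'keep', 'condense', or 'review' based on the snippet's line count."""
    line_count = len(snippet.splitlines())
    if line_count < 15:
        return "keep"  # KEEP: snippets under 15 lines
    if line_count > 20:
        return "condense"  # CONDENSE: snippets longer than 20 lines
    # 15-20 lines is not covered by the rules; flagging for review is an assumption.
    return "review"


def condense_snippet(snippet: str, source_path: str, keep_lines: int = 10) -> str:
    """Truncate to the first keep_lines lines and append a source reference comment.

    Head truncation is only a stand-in for picking the core pattern by hand.
    """
    lines = snippet.splitlines()
    if len(lines) <= keep_lines:
        return snippet
    # '#' matches Python-style comments; adjust for the snippet's language.
    reference = f"# ... see {source_path} for the full version"
    return "\n".join(lines[:keep_lines] + [reference])
```

A caller would run `triage_snippet` first and pass only the "condense" cases to `condense_snippet`, so short snippets are left unchanged.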

REMOVE

  • Historical context beyond one reference line
  • Celebratory language about achieved or completed goals
  • Redundant explanations of the same concept
  • Step-by-step tutorials (link to them instead)
  • "What we learned" retrospectives

FORMAT

  • Use `code blocks` for paths and commands
  • Group related files under headers
  • Prefer tables over bullet lists for structured data
  • End with quick-reference block for scanning

Working with Existing Documents

When condensing an existing verbose doc:
  1. Identify the core purpose (first paragraph of output)
  2. Extract all file paths into grouped tables
  3. Preserve code samples that show patterns
  4. Convert prose sections to tables where possible
  5. Add single history reference line
  6. Verify no information loss on key technical details
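
Step 2 above (extracting file paths into grouped tables) might look roughly like the following sketch. The regular expression is a heuristic assumption about what counts as a path (slash-separated segments ending in a file extension), grouping by top-level directory is one possible choice, and the `###` group heading level is illustrative. The Description column is left empty to be filled in by hand.

```python
import re
from collections import defaultdict

# Heuristic (assumption): a path is one or more "segment/" parts followed by
# a file name with an extension, not preceded by another path-like character.
PATH_RE = re.compile(r"(?<![\w/.-])(?:[\w.-]+/)+[\w.-]+\.\w+")


def extract_paths(text: str) -> list[str]:
    """Return the unique paths found in text, in order of first appearance."""
    seen: list[str] = []
    for match in PATH_RE.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


def paths_to_tables(paths: list[str]) -> str:
    """Render one Markdown table per top-level directory."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path in paths:
        groups[path.split("/", 1)[0]].append(path)

    lines: list[str] = []
    for group, members in groups.items():
        lines.append(f"### {group}/")  # heading level is an assumption
        lines.append("| Path | Description |")
        lines.append("|------|-------------|")
        lines.extend(f"| `{path}` | |" for path in members)
        lines.append("")
    return "\n".join(lines)
```

The list returned by `extract_paths` can also be reused later to confirm that no path from the source was dropped.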

Style Guide

See `references/style-guide.md` for detailed formatting rules, table patterns, and code sample guidelines.

Example Transformation

Before (verbose):
We have successfully achieved and EXCEEDED the original goals of this specification!
After many iterations and improvements, our automation rate reached 96.6% which is
above our target of 95%. The team worked hard on this and we're very proud...
After (concise):
**v31 PRODUCTION** | 96.6% automation (target: 95%)

Calibration Rules

  • Condensed output must not exceed 40% of source document length measured in words.
  • All file paths present in the original must appear in the condensed output — paths are never dropped.
  • Code blocks are never removed outright; reduce length by extracting the representative pattern and adding a source reference comment.
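
The first two calibration rules are mechanical enough to check automatically. The sketch below assumes that "words" means whitespace-separated tokens and that the caller supplies the list of paths found in the source; `check_calibration` and its parameters are hypothetical names, not part of the skill.

```python
def check_calibration(
    source: str,
    condensed: str,
    source_paths: list[str],
    max_ratio: float = 0.40,
) -> list[str]:
    """Return a list of rule violations; an empty list means both checks pass."""
    problems: list[str] = []

    # Rule 1: condensed length must not exceed 40% of the source, in words.
    # Word counting by whitespace split is an assumption.
    source_words = len(source.split())
    condensed_words = len(condensed.split())
    if source_words and condensed_words > max_ratio * source_words:
        problems.append(
            f"length: {condensed_words}/{source_words} words exceeds {max_ratio:.0%} cap"
        )

    # Rule 2: every path from the source must appear in the condensed output.
    for path in source_paths:
        if path not in condensed:
            problems.append(f"missing path: {path}")

    return problems
```

The third rule (code blocks are condensed, never removed) needs a judgment about what the representative pattern is, so it is left out of this check.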

Error Handling

  • Source document has no clear purpose: ask for one sentence of context before condensing — do not infer a purpose and proceed.
  • A section contains items that are ambiguous between KEEP and REMOVE: default to KEEP and flag the section with a `<!-- review: ambiguous -->` comment in the output.
  • Condensed result loses required technical detail identified during verification: restore the omitted detail and re-measure against the 40% length cap; if the cap cannot be met, document the exception inline.
  • Source contains no file paths: skip the Implementation Files section entirely rather than generating placeholder paths.
  • Source is a non-text format (image, diagram, spreadsheet): report the format is unsupported and return without output.
  • Style guide conflicts with source formatting conventions: follow `references/style-guide.md` and note the override at the top of the output.
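
Two of the cases above lend themselves to a short sketch: rejecting non-text sources and flagging ambiguous sections. The suffix list is illustrative and incomplete, detecting format by file extension is an assumption, and placing the marker on the line before the section is one possible reading of "flag the section". The helper names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional

AMBIGUOUS_MARKER = "<!-- review: ambiguous -->"

# Illustrative, non-exhaustive set of non-text formats (assumption).
UNSUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".xls", ".xlsx"}


def preflight(filename: str) -> Optional[str]:
    """Return an error message for unsupported formats, or None to proceed."""
    suffix = PurePosixPath(filename).suffix.lower()
    if suffix in UNSUPPORTED_SUFFIXES:
        return f"unsupported format: {suffix}"
    return None


def flag_ambiguous(section: str) -> str:
    """Keep the section and prepend the review marker, at most once."""
    if AMBIGUOUS_MARKER in section:
        return section
    return f"{AMBIGUOUS_MARKER}\n{section}"
```

When `preflight` returns a message, the caller reports it and stops without producing condensed output, matching the unsupported-format rule.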

Limitations

  • Works on text documents only; images, diagrams, and binary files cannot be condensed.
  • Condensation ratio depends on source verbosity — highly structured sources yield less reduction.
  • The style guide (`references/style-guide.md`) takes precedence over source formatting, which can alter heading levels and table layouts.
  • Does not follow hyperlinks or fetch referenced external documents; referenced content is noted but not inlined.