draft-polisher

Draft Polisher (Audit-style editing)

Goal: turn a first-pass draft into readable survey prose without breaking the evidence contract.

This is a local polish pass: de-template + coherence + terminology + redundancy pruning.

Note: if the main issue is structural redundancy from section accumulation, push the change upstream to `sections/` and use `paragraph-curator` before merge. `draft-polisher` should not be the primary place where you decide which paragraphs to keep.

Role cards (use explicitly)

Style Harmonizer (editor)

Mission: remove generator voice and make prose read like one author wrote it.

Do:

Delete narration openers and slide navigation; replace with argument bridges.
Vary rhythm; remove repeated template stems.
Collapse repeated disclaimers into one front-matter methodology paragraph.

Avoid:

Adding or removing citation keys.
Moving citations across subsections.

Evidence Contract Guard (skeptic)

Mission: prevent polishing from inflating claims beyond evidence.

Do:

Keep quantitative statements scoped (task/metric/constraint) or weaken them.
Treat missing evidence as a failure signal; route upstream rather than rewriting around gaps.

Avoid:

Overconfident language when evidence is abstract-only.

Role prompt: Style Harmonizer (editor expert)

You are the style and coherence editor for a technical survey.

Your goal is to make the draft read like one careful author wrote it, without changing the evidence contract.

Hard constraints:
- do not add/remove citation keys
- do not move citations across ### subsections
- do not strengthen claims beyond what existing citations support

High-leverage edits:
- delete generator voice (This subsection..., Next we move..., We now turn...)
- replace navigation with argument bridges (content-bearing handoffs)
- collapse repeated disclaimers into one methodology paragraph in front matter
- keep quantitative statements well-scoped (task/metric/constraint in the same sentence)

Working style:
- rewrite sentences so they carry content, not process
- vary rhythm, but avoid “template stems” repeating across H3s

Inputs

- `output/DRAFT.md`

Optional context (read-only; helps avoid “polish drift”):

- `outline/outline.yml`
- `outline/subsection_briefs.jsonl`
- `outline/evidence_drafts.jsonl`
- `citations/ref.bib`

Outputs

- `output/DRAFT.md` (in-place refinement)
- `output/citation_anchors.prepolish.jsonl` (baseline, generated on first run by the script)

Non-negotiables (hard rules)

Citation keys are immutable

Do not add new `[@BibKey]` keys.
Do not delete citation markers.
If `citations/ref.bib` exists, do not introduce any key that is not defined there.
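
A minimal sketch of how the key rules could be checked by comparing the draft before and after polishing. The function names and the regexes are illustrative assumptions, not part of the `draft-polisher` script, and the BibTeX parsing is deliberately crude (entry type, opening brace, key, comma).

```python
import re
from typing import Optional

# Pandoc-style citation blocks such as [@a] or [@a; @b] (assumed syntax).
CITE_BLOCK = re.compile(r"\[(@[^\[\]]+)\]")
CITE_KEY = re.compile(r"@([A-Za-z0-9_:.\-]+)")
# Hypothetical, simplified BibTeX entry matcher: @type{key,
BIB_ENTRY = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def cited_keys(markdown: str) -> set:
    """Collect every citation key that appears inside a [@...] block."""
    keys = set()
    for block in CITE_BLOCK.findall(markdown):
        keys.update(CITE_KEY.findall(block))
    return keys


def key_violations(before: str, after: str, bib_text: Optional[str] = None) -> dict:
    """Report keys added, removed, or (when a bib is given) undefined."""
    old, new = cited_keys(before), cited_keys(after)
    report = {
        "added": sorted(new - old),
        "removed": sorted(old - new),
        "undefined": [],
    }
    if bib_text is not None:
        defined = set(BIB_ENTRY.findall(bib_text))
        report["undefined"] = sorted(new - defined)
    return report
```

An empty `added`, `removed`, and `undefined` list is the expected outcome of a clean polish pass. The check works on key sets, so it does not notice a key that loses one of several markers.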

Citation anchoring is immutable

Do not move citations across `###` subsections.
If you must restructure across subsections, stop and push the change upstream (outline/briefs/evidence), then regenerate.
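
The anchoring rule can be illustrated with a small drift check. This is a sketch under assumptions: it does not reproduce the format of `output/citation_anchors.prepolish.jsonl`, the helper names are hypothetical, and a subsection is simplified to "everything from one `###` heading to the next".

```python
import re

H3 = re.compile(r"^###[ \t]+(.*\S)\s*$")
CITE_BLOCK = re.compile(r"\[(@[^\[\]]+)\]")
CITE_KEY = re.compile(r"@([A-Za-z0-9_:.\-]+)")


def anchors(markdown: str) -> dict:
    """Map each ### heading to the set of keys cited beneath it."""
    current = "(before first H3)"
    result = {}
    for line in markdown.splitlines():
        heading = H3.match(line)
        if heading:
            current = heading.group(1)
            result.setdefault(current, set())
            continue
        for block in CITE_BLOCK.findall(line):
            result.setdefault(current, set()).update(CITE_KEY.findall(block))
    return result


def drifted(before: str, after: str) -> dict:
    """Return keys that appear under a heading where they were not cited before."""
    base, now = anchors(before), anchors(after)
    report = {}
    for heading, keys in now.items():
        extra = keys - base.get(heading, set())
        if extra:
            report[heading] = sorted(extra)
    return report
```

An empty result means no citation moved into a subsection where it was not already anchored. Renaming a heading would show up as drift here, which is one reason heading changes belong upstream.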

No evidence inflation

If a sentence sounds stronger than its evidence level supports (for example, abstract-only evidence), rewrite it into a qualified statement.
When in doubt, check the subsection’s evidence pack in `outline/evidence_drafts.jsonl` and keep claims aligned to snippets.

Citation shape normalization

Merge adjacent citation blocks in the same sentence (avoid `[@a] [@b]`).
Deduplicate keys inside one block (avoid `[@a; @a]`).
Avoid tail-only citation dumps: keep some citations in the claim sentence itself (mid-sentence), not only paragraph end.
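
The first two shape rules are mechanical enough to sketch in code. The following is an illustrative helper, not the script's implementation; it assumes Pandoc-style `[@key; @key]` blocks and treats blocks separated only by spaces or tabs as adjacent.

```python
import re

# Two citation blocks separated only by spaces/tabs (assumed to be one sentence).
ADJACENT = re.compile(r"\[(@[^\[\]]+)\][ \t]*\[(@[^\[\]]+)\]")
BLOCK = re.compile(r"\[(@[^\[\]]+)\]")


def merge_adjacent(text: str) -> str:
    """Collapse runs like [@a] [@b] [@c] into one block."""
    while True:
        merged = ADJACENT.sub(r"[\1; \2]", text)
        if merged == text:
            return text
        text = merged


def dedupe_block(match: re.Match) -> str:
    """Drop repeated keys inside one block, keeping first-seen order."""
    seen = []
    for part in match.group(1).split(";"):
        part = part.strip()
        if part and part not in seen:
            seen.append(part)
    return "[" + "; ".join(seen) + "]"


def normalize_citations(text: str) -> str:
    return BLOCK.sub(dedupe_block, merge_adjacent(text))
```

The set of keys in each sentence is unchanged by this pass, so it stays within the immutability rules above. Tail-only citation dumps still need a human rewrite.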

Quantitative claim hygiene

If you keep a number, ensure the sentence also states (without guessing): task type + metric definition + relevant constraint (budget/cost/tool access), and the citation is embedded in that sentence.
Avoid ambiguous model naming (e.g., “GPT-5”) unless the cited paper uses that exact label; otherwise use the paper’s naming or a neutral description.

No pipeline voice

Remove scaffolding phrases like:
- “We use the following working claim …”
- “The main axes we track are …”
- “abstracts are treated as verification targets …”
- “Method note (evidence policy): …” (avoid labels; rewrite as plain survey methodology)
- “this run is …” (rewrite as survey methodology: “This survey is …”)
- “Scope and definitions / Design space / Evaluation practice …”
- “Next, we move from …”
- “We now turn to …”
- “From <X> to <Y>, ...” (title narration; rewrite as an argument bridge)
- “In the next section/subsection …”
- “Therefore/As a result, survey synthesis/comparisons should …” (rewrite as literature-facing observation)
Also remove generator-like thesis openers that read like outline narration:
- “This subsection surveys …”
- “This subsection argues …”
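
A rough way to surface candidates for this cleanup is a stem scan. The sketch below is only a flagging aid under assumptions: the pattern list is a partial, hypothetical encoding of the phrases above, and every hit still needs a human rewrite into an argument bridge.

```python
import re

# Partial, illustrative encoding of the scaffolding phrases listed above.
PIPELINE_STEMS = [
    r"\bThis subsection (surveys|argues)\b",
    r"\bIn (this|the next) (section|subsection)\b",
    r"\bNext, we move from\b",
    r"\bWe now turn to\b",
    r"\bWe use the following working claim\b",
    r"\bThe main axes we track are\b",
    r"\bMethod note\b",
    r"\bthis run is\b",
]
PATTERN = re.compile("|".join(PIPELINE_STEMS), re.IGNORECASE)


def flag_pipeline_voice(markdown: str) -> list:
    """Return (line_number, matched_phrase) pairs for review."""
    hits = []
    for number, line in enumerate(markdown.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((number, match.group(0)))
    return hits
```

The scan reports locations only; it never edits the draft, because the replacement has to carry content that a pattern cannot supply.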

Three passes (recommended)

Pass 1 — Subsection polish (structure + de-template)

Best-of-2 micro-polish (recommended):

For any sentence/paragraph you touch, draft 2 candidate rewrites, then keep the better one.
Choose with a simple rubric: move clarity, no template stem, citations stay anchored, and citation shape stays reader-facing (no adjacent cite blocks / dup keys).
Do not keep both candidates. Pick one and move on (the goal is convergence, not endless rewriting).

Role split:

Editor: rewrites sentences for clarity and flow.
Skeptic: deletes any generic/template sentence.

Targets:

Each H3 reads like: tension → contrast → evidence → limitation.
Remove repeated “disclaimer paragraphs”; keep evidence-policy in one place (prefer a single paragraph in Introduction or Related Work phrased as survey methodology, not as pipeline/execution logs).
Use `outline/outline.yml` (if present) to avoid heading drift during edits.
If present, use `outline/subsection_briefs.jsonl` to keep each H3’s scope/RQ consistent while improving flow.
Do a quick “pattern sweep” (semantic, not mechanical):
- delete outline narration: `This subsection ...`, `In this subsection ...`
- delete slide navigation: `Next, we move from ...`, `We now turn to ...`, `In the next section ...`
- delete title narration: `From <X> to <Y>, ...`
- replace with: content claims + argument bridges + organization sentences (no new facts/citations)
If `citation-injector` was used, smooth any budget-injection sentences so they read paper-like:
- Keep the citation keys unchanged.
- Avoid list-injection stems (e.g., “A few representative references include …”, “Notable lines of work include …”, “Concrete examples ... include ...”).
- Prefer integrating the added citations into an existing argument sentence, or rewrite as a short parenthetical `e.g., ...` clause tied to the subsection’s lens (no new facts).
- Vary phrasing; avoid repeating the same opener stem across many H3s.
Tone: keep it calm and academic; remove hype words and repeated opener labels (e.g., literal `Key takeaway:` across many H3s).
Reduce repeated synthesis stems (e.g., many paragraphs starting with `Taken together, ...`); vary synthesis phrasing and keep it content-bearing.
- Treat repeated "Taken together," as a generator-voice smell. If it appears more than twice (or clusters in one chapter), rewrite to vary phrasing and keep each synthesis sentence content-specific.
- Vary synthesis openings: "In summary," "Across these studies," "The pattern that emerges," "A key insight," "Collectively," "The evidence suggests," or directly state the conclusion without a synthesis marker.
- Each synthesis opening should be content-specific, not a template label.
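
The "more than twice" threshold for repeated stems can be approximated by counting paragraph openers. This is a sketch with assumed parameters (`words`, `threshold`) and a hypothetical function name; it only counts, and deciding how to rephrase remains an editorial call.

```python
import re
from collections import Counter


def repeated_openers(markdown: str, words: int = 2, threshold: int = 2) -> dict:
    """Count paragraph-opening stems and return those used more than `threshold` times."""
    openers = Counter()
    for paragraph in re.split(r"\n\s*\n", markdown):
        paragraph = paragraph.strip()
        if not paragraph or paragraph.startswith("#"):
            continue  # skip blank chunks and headings
        stem = " ".join(paragraph.split()[:words]).rstrip(",:").lower()
        openers[stem] += 1
    return {stem: count for stem, count in openers.items() if count > threshold}
```

Swapping every flagged opener for another fixed phrase would only move the template elsewhere; the count is a prompt to rewrite each synthesis sentence around its own content.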

Rewrite recipe for subsection openers (paper voice, no new facts):

Delete:
- `This subsection surveys/argues...`
- `In this subsection, we...`

Replace with a compact opener that does 2–3 of these (no labels; vary across subsections):
- Content claim: the subsection-specific tension/trade-off (optionally with 1–2 embedded citations)
- Why it matters: link the claim to evaluation/engineering constraints (benchmark/protocol/cost/tool access)
- Preview: what you will contrast next and on what lens (A vs B; then evaluation anchors; then limitations)

Example skeletons (paraphrase; don’t reuse verbatim):

- Tension-first: `A central tension is ...; ...; we contrast ...`
- Decision-first: `For builders, the crux is ...; ...`
- Lens-first: `Seen through the lens of ..., ...`

Pass 2 — Terminology normalization

Role split:

Taxonomist: chooses canonical terms and synonym policy.
Integrator: applies consistent replacements across the draft.

Targets:

One concept = one name across sections.
Headings, tables, and prose use the same canonical terms.
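
One way the Integrator step could be mechanized is a canonical-term map applied to prose only. The sketch below is an assumption-laden illustration: the map format and function name are hypothetical, and citation blocks are skipped so that keys stay untouched.

```python
import re

# Capturing group so re.split keeps the citation blocks in the result.
CITE_BLOCK = re.compile(r"(\[@[^\[\]]+\])")


def normalize_terms(text: str, canonical: dict) -> str:
    """Replace each synonym with its canonical term, outside [@...] blocks.

    `canonical` maps a canonical term to a list of synonyms (hypothetical format).
    """
    parts = CITE_BLOCK.split(text)
    for index in range(0, len(parts), 2):  # even indexes are prose, odd are citations
        for term, synonyms in canonical.items():
            # Longest synonyms first, so a short synonym cannot split a longer one.
            for synonym in sorted(synonyms, key=len, reverse=True):
                pattern = rf"\b{re.escape(synonym)}\b"
                parts[index] = re.sub(
                    pattern, lambda _m, t=term: t, parts[index], flags=re.IGNORECASE
                )
    return "".join(parts)
```

Replacements are case-insensitive and insert the canonical term as written, so sentence-initial capitalization needs a manual follow-up. Review each change in context, since a synonym may be the right word in a quoted title or a definition.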

Pass 3 — Redundancy pruning (global repetition)

Role split:

Compressor: collapses repeated boilerplate.
Narrative keeper: ensures removing repetition does not break the argument chain.

Targets:

Cross-section repeated intros/outros are removed.
Only subsection-specific content remains inside subsections.

Script

Quick Start

python .codex/skills/draft-polisher/scripts/run.py --help

python .codex/skills/draft-polisher/scripts/run.py --workspace workspaces/<ws>

All Options

- `--workspace <dir>`: workspace root
- `--unit-id <U###>`: unit id (optional; for logs)
- `--inputs <semicolon-separated>`: override inputs (rare; prefer defaults)
- `--outputs <semicolon-separated>`: override outputs (rare; prefer defaults)
- `--checkpoint <C#>`: checkpoint id (optional; for logs)

Examples

First polish pass (creates the anchoring baseline `output/citation_anchors.prepolish.jsonl`):

python .codex/skills/draft-polisher/scripts/run.py --workspace workspaces/<ws>

Reset the anchoring baseline (only if you intentionally accept citation drift):
- Delete `output/citation_anchors.prepolish.jsonl`, then rerun the polisher.

Acceptance checklist

No `TODO/TBD/FIXME/(placeholder)`.
No `…` or `...` truncation.
No repeated boilerplate sentence across many subsections.
Citation anchoring passes (no cross-subsection drift).
Each H3 has at least one cross-paper synthesis paragraph (>=2 citations).
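
Three of these checks (placeholders, truncation, and the per-H3 synthesis paragraph) lend themselves to a quick self-check before handing the draft on. The sketch below is a hypothetical helper, not the script's acceptance logic; it assumes Pandoc-style citations and treats a paragraph with two or more distinct keys as a synthesis paragraph.

```python
import re

PLACEHOLDER = re.compile(r"\b(TODO|TBD|FIXME)\b|\(placeholder\)")
ELLIPSIS = re.compile(r"…|\.\.\.")
CITE_BLOCK = re.compile(r"\[(@[^\[\]]+)\]")
CITE_KEY = re.compile(r"@([A-Za-z0-9_:.\-]+)")


def checklist(markdown: str) -> list:
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    if PLACEHOLDER.search(markdown):
        problems.append("placeholder marker present")
    if ELLIPSIS.search(markdown):
        problems.append("ellipsis truncation present")
    # Splitting on ### headings yields: preamble, title, body, title, body, ...
    sections = re.split(r"(?m)^###[ \t]+(.*)$", markdown)
    for title, body in zip(sections[1::2], sections[2::2]):
        has_synthesis = False
        for paragraph in re.split(r"\n\s*\n", body):
            keys = set()
            for block in CITE_BLOCK.findall(paragraph):
                keys.update(CITE_KEY.findall(block))
            if len(keys) >= 2:
                has_synthesis = True
                break
        if not has_synthesis:
            problems.append(f"H3 '{title.strip()}' lacks a paragraph with >=2 citations")
    return problems
```

Counting keys is only a proxy: a paragraph can cite two papers without synthesizing them, so a pass here does not replace reading the paragraph. Repeated boilerplate and anchoring drift are not covered by this helper.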

Troubleshooting

Issue: polishing causes citation drift across subsections

Fix:

Keep citations inside the same `###` subsection; if restructuring is intentional, delete `output/citation_anchors.prepolish.jsonl` and regenerate a new baseline.

Issue: draft polishing is requested before writing approval

Fix:

Record the relevant approval in `DECISIONS.md` (typically `Approve C2`) before doing prose-level edits.
