paragraph-curator

Paragraph Curator (select -> evaluate -> subset -> fuse)

Purpose: turn “keep rewriting and getting longer” into a controlled convergence step.

This skill adds a decision layer between “draft paragraphs” and “polish voice”:

- keep the best paragraphs
- merge redundant ones
- rewrite for clearer argument moves
- expand only when coverage is missing (using existing evidence cards)

This is a content-structure pass (not a style pass). Run `style-harmonizer` and `opener-variator` after curation.

Inputs

Required:

- `sections/` (especially H3 bodies: `sections/S<sub_id>.md`)
- `outline/writer_context_packs.jsonl` (what each H3 must cover + allowed citations)
- `output/ARGUMENT_SKELETON.md` (single source of truth for terminology + premises)

Recommended:

- `output/SECTION_ARGUMENT_SUMMARIES.jsonl` (paragraph moves + outputs)
- `output/SECTION_LOGIC_REPORT.md` (paragraph linkage risks)
- `output/WRITER_SELFLOOP_TODO.md` (style smells / scope/citation warnings)

Outputs

- Updated `sections/*.md` (same filenames; body-only; no headings)
- `output/PARAGRAPH_CURATION_REPORT.md` (short; PASS/FAIL + what changed)
- Create `sections/paragraphs_curated.refined.ok` when done (empty file; pipeline contract signal)

What this skill optimizes (rubric)

You are not trying to “shorten”. You are trying to increase information density while keeping the section verifiable.

Score each paragraph on a simple 0-2 rubric:

| Criterion | 0 (bad) | 1 (ok) | 2 (good) |
| --- | --- | --- | --- |
| Coverage | does not match any required axis/card | matches one axis, thin | directly executes a must-use card/comparison |
| Novelty | repeats nearby content | partially redundant | adds a distinct comparison/insight |
| Move clarity | unclear what it does | move exists, weak output | clear move + reusable output |
| Consistency | premise/term drift vs skeleton | minor mismatch | fully aligned with Consistency Contract |
| Citation hygiene | uncited when it should be; cite-dump vibe | acceptable | citations are local and anchored (not just tail) |
| Fusion readiness | cannot merge; tangled | mergeable with edits | clean unit that can be fused or kept |

Decision labels:

- `KEEP`: keep mostly as-is
- `REWRITE`: keep content, rewrite for clearer move/output
- `FUSE`: merge with neighbor(s) and rewrite into one stronger paragraph
- `REPLACE`: keep the slot, but rewrite using existing evidence cards (when coverage is missing)
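
A minimal sketch of how the rubric and labels could be recorded per paragraph. The six fields mirror the rubric criteria. The `suggest_label` heuristic is an assumption for illustration only: the skill does not define a score-to-label mapping, so treat the result as a starting point for your own judgment.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RubricScore:
    """One paragraph's scores; each criterion is 0 (bad), 1 (ok), or 2 (good)."""
    coverage: int
    novelty: int
    move_clarity: int
    consistency: int
    citation_hygiene: int
    fusion_readiness: int

    def __post_init__(self):
        # Reject anything outside the 0-2 scale.
        for f in fields(self):
            value = getattr(self, f.name)
            if value not in (0, 1, 2):
                raise ValueError(f"{f.name} must be 0, 1, or 2, got {value!r}")

    def total(self) -> int:
        """Sum of all six criteria (maximum 12)."""
        return sum(getattr(self, f.name) for f in fields(self))


def suggest_label(score: RubricScore) -> str:
    """Hypothetical heuristic (not defined by the skill) mapping scores to a label."""
    if score.coverage == 0:
        return "REPLACE"  # slot covers no required axis/card
    if score.novelty == 0 and score.fusion_readiness >= 1:
        return "FUSE"     # redundant but mergeable
    if min(score.move_clarity, score.consistency, score.citation_hygiene) < 2:
        return "REWRITE"  # content is fine; move, terms, or citations need work
    return "KEEP"
```

For example, `suggest_label(RubricScore(2, 0, 2, 2, 2, 1))` suggests `FUSE`, because the paragraph repeats nearby content but can be merged with edits.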

Paragraph budget (profile-aware)

Default per-H3 target:

- `draft_profile=survey`: 10-12 paragraphs
- `draft_profile=deep`: 11-13 paragraphs

If you exceed the budget, do not delete content blindly. Prefer `FUSE` (merge redundancy) and make the fused paragraph denser.
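
The budget check can be sketched as below. It assumes paragraphs in a `sections/S<sub_id>.md` body are separated by blank lines, which the skill does not state explicitly; `count_paragraphs` and `budget_status` are hypothetical helper names.

```python
import re

# Per-H3 paragraph budgets from the defaults above: (minimum, maximum).
PARAGRAPH_BUDGET = {"survey": (10, 12), "deep": (11, 13)}


def count_paragraphs(body: str) -> int:
    """Count blank-line-separated blocks (assumed paragraph convention)."""
    blocks = re.split(r"\n\s*\n", body.strip())
    return sum(1 for block in blocks if block.strip())


def budget_status(body: str, draft_profile: str) -> str:
    """Return 'under', 'within', or 'over' for the given draft profile."""
    low, high = PARAGRAPH_BUDGET[draft_profile]
    n = count_paragraphs(body)
    if n > high:
        return "over"    # candidate for FUSE, not blind deletion
    if n < low:
        return "under"
    return "within"
```

An `over` result is a prompt to look for fusable redundancy, not a license to cut paragraphs until the count fits.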

Must-have coverage checklist (per H3)

Each H3 must contain at least:

- 1x `Definition/Setup` (only if this H3 introduces a new term/protocol field)
- 2x concrete `Contrast` paragraphs (A-vs-B comparisons; not just “many papers do...”)
- 1x `Evaluation anchor` paragraph (task + metric + constraint/budget/tool access; cite-backed)
- 1x cross-paper `Synthesis` paragraph (what generalizes, what does not; cite-backed)
- 1x `Boundary/Failure` paragraph (limitations; threats to validity; cite-backed when possible)
- 1x `Local conclusion` (a reusable takeaway used downstream)

If any item is missing, use `REPLACE` to write that paragraph from the writer context pack (do not invent new facts).
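
One way to check the list above mechanically, assuming you have already tagged each paragraph with one of the role names during the inventory step. The tagging and the `missing_coverage` helper are assumptions; the skill only defines the required counts.

```python
from collections import Counter

# Minimum paragraph counts per role, taken from the checklist above.
REQUIRED_MINIMUMS = {
    "Contrast": 2,
    "Evaluation anchor": 1,
    "Synthesis": 1,
    "Boundary/Failure": 1,
    "Local conclusion": 1,
}


def missing_coverage(roles, introduces_new_term=False):
    """Return {role: how many more paragraphs are needed} for unmet items."""
    required = dict(REQUIRED_MINIMUMS)
    if introduces_new_term:
        # Definition/Setup is only required when the H3 introduces a new term/field.
        required["Definition/Setup"] = 1
    counts = Counter(roles)
    return {
        role: need - counts[role]
        for role, need in required.items()
        if counts[role] < need
    }
```

Each key in the result marks a slot to fill with `REPLACE`, using the writer context pack rather than new facts.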

Workflow (minimal)

Pick the target set

- Start with the H3 bodies listed in `output/SECTION_LOGIC_REPORT.md`, plus any H3 flagged in `output/WRITER_SELFLOOP_TODO.md` as repetitive/template-y, plus any H3 that keeps growing across edits.
- Work file-by-file: each target is a concrete `sections/S<sub_id>.md`.

Build a paragraph inventory (scratch only; do not paste into the paper)

- If `output/SECTION_ARGUMENT_SUMMARIES.jsonl` exists, use its per-paragraph `moves`/`output` as the first draft of your inventory, then reconcile with the actual text.
- For each paragraph, write one line:

P<i> :: move(s) -> output (1 sentence) :: citations (keys)
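
A small sketch for producing and reading inventory lines in the format above. It assumes moves and citation keys are comma-separated inside their fields, which the skill does not specify; `format_inventory_line` and `parse_inventory_line` are hypothetical names.

```python
import re

# P<i> :: move(s) -> output :: citations
LINE_RE = re.compile(
    r"^P(?P<index>\d+) :: (?P<moves>.+?) -> (?P<output>.+) :: (?P<cites>.*)$"
)


def format_inventory_line(index, moves, output, citations):
    """Render one scratch inventory line for paragraph P<index>."""
    return f"P{index} :: {', '.join(moves)} -> {output} :: {', '.join(citations)}"


def parse_inventory_line(line):
    """Parse a line back into (index, moves, output, citations)."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"not an inventory line: {line!r}")
    split = lambda s: [part.strip() for part in s.split(",") if part.strip()]
    return (
        int(match["index"]),
        split(match["moves"]),
        match["output"],
        split(match["cites"]),
    )
```

The inventory stays in scratch space; none of these lines belong in the section files.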

Apply the rubric and label each paragraph

- Mark `KEEP/REWRITE/FUSE/REPLACE`.
- If two adjacent paragraphs repeat the same axis, `FUSE`.
- For any paragraph you plan to change (`REWRITE`/`REPLACE`/`FUSE`), draft 2-3 candidate rewrites in parallel (different angles: contrast-first / protocol-first / synthesis-first).
  - Score candidates quickly with the rubric; keep one winner (or fuse two if they cover complementary axes).
  - Keep citation keys unchanged while sampling; you are choosing surface form + structure, not changing the evidence set.

Construct the curated set

- Use `outline/writer_context_packs.jsonl` to enforce must-have coverage (paragraph_plan/must_use/comparison_cards/limitation_hooks) without inventing new content.
- Enforce the must-have coverage checklist.
- Enforce the paragraph budget by fusing redundancy rather than deleting substance.

Fuse + rewrite (keep citation keys fixed)

Rules that keep the pipeline stable:

- Do not add/remove citation keys; when fusing, carry citations forward and re-anchor them to the right sentence.
- Do not move citations across subsections.
- Avoid adjacent citation blocks (e.g., `[@a] [@b]`) and duplicate keys in one block (e.g., `[@a; @a]`).
- When fusing, it is often faster to write two fused candidates (one contrast-heavy, one synthesis-heavy) and pick the better one.
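
The citation rules above lend themselves to a quick automated check. The sketch below assumes Pandoc-style blocks such as `[@a; @b]`, as in the examples; the regexes and helper names are illustrative, not part of the pipeline.

```python
import re

CITE_BLOCK = re.compile(r"\[(@[^\[\]]+)\]")                       # one [@...] block
ADJACENT = re.compile(r"\[@[^\[\]]+\]\s*\[@[^\[\]]+\]")           # [@a] [@b]
KEY = re.compile(r"@([\w:.-]+)")                                  # a key inside a block


def citation_keys(text):
    """Set of all citation keys that appear in [@...] blocks."""
    keys = set()
    for block in CITE_BLOCK.finditer(text):
        keys.update(KEY.findall(block.group(1)))
    return keys


def keys_unchanged(before, after):
    """True if the rewrite neither added nor removed citation keys."""
    return citation_keys(before) == citation_keys(after)


def hygiene_issues(text):
    """List adjacent citation blocks and duplicate keys within one block."""
    issues = []
    for match in ADJACENT.finditer(text):
        issues.append(f"adjacent citation blocks: {match.group(0)}")
    for block in CITE_BLOCK.finditer(text):
        keys = KEY.findall(block.group(1))
        dupes = sorted({k for k in keys if keys.count(k) > 1})
        if dupes:
            issues.append(f"duplicate keys {dupes} in block: {block.group(0)}")
    return issues
```

Run `keys_unchanged` on each H3 body before and after fusing. Because the comparison is per file, it also catches citations that were moved across subsections.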

Write the report + marker

- `output/PARAGRAPH_CURATION_REPORT.md` should be short and actionable:
  - `- Status: PASS|FAIL`
  - per H3: paragraph count before/after; what was fused; any remaining gaps
  - (minimal) how many candidates you tried for the main rewrites (e.g., 2-3), so future passes can see whether this was a real selection step
- Create `sections/paragraphs_curated.refined.ok`.
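
A minimal sketch of this last step. Only the `- Status: PASS|FAIL` line and the two file paths come from the skill; the per-H3 line layout and the `write_curation_outputs` helper are assumptions. The skill does not say whether the marker should be withheld on `FAIL`, so this sketch always creates it.

```python
from pathlib import Path


def write_curation_outputs(root, status, per_h3):
    """Write the curation report and the empty marker file under `root`.

    per_h3 maps a sub_id to a dict with keys: before, after, fused, gaps, candidates.
    """
    if status not in ("PASS", "FAIL"):
        raise ValueError("status must be PASS or FAIL")
    root = Path(root)

    lines = [f"- Status: {status}"]
    for sub_id, info in per_h3.items():
        # Hypothetical one-line-per-H3 layout; adapt to your report conventions.
        lines.append(
            f"- {sub_id}: paragraphs {info['before']} -> {info['after']}; "
            f"fused: {info['fused']}; remaining gaps: {info['gaps']}; "
            f"candidates tried: {info['candidates']}"
        )

    report = root / "output" / "PARAGRAPH_CURATION_REPORT.md"
    report.parent.mkdir(parents=True, exist_ok=True)
    report.write_text("\n".join(lines) + "\n", encoding="utf-8")

    # Empty file: the pipeline contract signal.
    marker = root / "sections" / "paragraphs_curated.refined.ok"
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text("", encoding="utf-8")
    return report, marker
```

Write the report first and the marker last, so downstream steps never see the marker without a report.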

Routing rules

- If you cannot fill a missing must-have paragraph without new evidence: stop and route upstream (`evidence-selfloop` / C3-C4). Do not pad.
- If you feel forced to change a definition or evaluation premise: update `output/ARGUMENT_SKELETON.md# Consistency Contract` first, then rerun `argument-selfloop`.
- If the only issue is surface cadence/openers: do not overwork curation; run `style-harmonizer` / `opener-variator`.

Done checklist

- Each targeted H3 stays within its paragraph budget (survey 10-12; deep 11-13) without losing required moves.
- Redundant paragraphs are fused into denser, clearer ones (not just deleted).
- No citation keys were added/removed; citation shape is reader-facing (no adjacent blocks, no dup keys).
- `output/PARAGRAPH_CURATION_REPORT.md` exists and is understandable.
- `sections/paragraphs_curated.refined.ok` exists.
