nsfc-length-aligner
Goal: Turn "page length" from a subjective feeling into a quantifiable, closed-loop indicator, and guide expansion/compression based on budget.
Applicable Scenarios
- You have an NSFC proposal and want to quickly determine if "certain sections are too short/too long"
- You need to align with the mandatory page length requirements (page count/word count/character count) of the template
- You want to expand or compress content while preserving the original meaning as much as possible (maintaining the argumentation mainline and evidence chain)
Inapplicable Scenarios
- You only need a word count and do not care about budgets or the rewrite-and-recheck loop (a simpler script will do)
- The proposal is not available locally (you cannot provide text, files, or paths)
Workflow (Strongly Recommended to Follow in Order)
1) Requirement Confirmation (Budget Basis)
First confirm which "hard standards" you need to align with:
- 2026 research consensus "golden ratio" (for General/Young Investigator Category C; a reference for cross-checking): Rationale for the Project 30% (6–10 pages, approx. 8000–10000 words) / Research Content 50% (12–15 pages, approx. 12000–15000 words) / Research Basis 20% (5–8 pages, approx. 5000–6000 words); recommended total ≤28 pages to leave a buffer (in principle no more than 30 pages)
- Page count (hard constraint): After the 2026+ revision, "in principle no more than 30 pages"; practical suggestion: ≤28 pages to leave a buffer; do not "squeeze pages" by reducing font size/line spacing
- Character budget (proxy indicator): Chinese characters / total characters, used for the deterministic closed-loop of "rewrite → recheck" (page count must be verified with the final PDF)
- Budget scope: Total length + budget for each section/key chapter (at least covering: Rationale for the Project, Research Content, Research Basis)
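The character budget above is a proxy that can be computed deterministically. A minimal sketch of such a count follows. It is not the logic of check_length.py: the function name, the choice of the CJK Unified Ideographs range for "Chinese characters", and the exclusion of whitespace from "total characters" are all assumptions made for illustration.

```python
import re

# Assumption: "Chinese characters" = code points in U+4E00..U+9FFF.
# CJK punctuation such as "。" is NOT counted as a Chinese character here.
CJK_PATTERN = re.compile(r"[\u4e00-\u9fff]")


def count_characters(text: str) -> dict:
    """Return Chinese-character and total (non-whitespace) character counts."""
    chinese = len(CJK_PATTERN.findall(text))
    total = sum(1 for ch in text if not ch.isspace())
    return {"chinese": chinese, "total": total}


# Hypothetical sample line mixing Chinese and Latin text.
sample = "研究内容 covers 3 aims。"
counts = count_characters(sample)
print(counts)
```

Because the count depends only on the text, running it before and after a rewrite gives a repeatable measure of how much was added or removed, while the page count still has to be verified against the final PDF.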
Note: This skill defaults to the sample standard in config.yaml:length_standard (aligned with the 2026 research recommendations). Verify it against the annual guidelines/templates before use.
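As a worked example of the 30% / 50% / 20% split, the sketch below turns a total page target into per-section page budgets. The ratios and the 28-page working target come from the text above. The function name, the dictionary layout, and rounding to one decimal place are illustrative assumptions and do not describe the schema of config.yaml.

```python
# Ratios taken from the "golden ratio" described above.
RATIOS = {
    "Rationale for the Project": 0.30,
    "Research Content": 0.50,
    "Research Basis": 0.20,
}


def allocate_pages(total_pages: float, ratios: dict) -> dict:
    """Split a total page target across sections according to their ratios."""
    if abs(sum(ratios.values()) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    # Rounding to one decimal is an assumption made for readability.
    return {name: round(total_pages * share, 1) for name, share in ratios.items()}


# 28 pages is the recommended working target (buffer below the 30-page limit).
page_budget = allocate_pages(28, RATIOS)
print(page_budget)
```

With a 28-page target this gives 8.4, 14.0, and 5.6 pages, each of which falls inside the corresponding page range quoted above.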
2) Run Length Check (Deterministic)
Run the check script on the target proposal directory (or single file) to generate a report:
bash
python3 scripts/check_length.py --input <target proposal path> --config config.yaml
If your proposal is based on a template project (one whose root directory contains the main file), it is recommended to point --input to the project root directory. The script will then collect the "files that will actually be compiled into the PDF" by following the dependency tree of the main file, and will ignore commented-out includes (to avoid counting optional chapters by mistake).
If you have compiled the final PDF (recommended; page count is a hard constraint), pass the PDF along to count pages:
bash
python3 scripts/check_length.py --input <target proposal path> --config config.yaml --pdf <proposal.pdf>
Output:
- Console summary (total length, items over/under budget)
- <input>/_artifacts/nsfc-length-aligner/length_report.md (default output directory; customizable)
- <input>/_artifacts/nsfc-length-aligner/length_report.json (default output directory; customizable)
Note: If the input directory is not writable (e.g., you set the template repository as read-only), be sure to point the output directory to a writable location.
After running, you must read the generated report (consulting the JSON version if necessary), and use the "file-level deviation table + (optional) section-level statistics" as input for Step 3.
3) Interpret Gaps (Where the Gaps Lie)
Do 3 things based on the report:
- Locate files or chapters that are "too long/too short"
- Determine the type of gap:
- Insufficient evidence chain (needs supplementary data/controls/limitations)
- Logical jumps (needs supplementary transitions/definitions/hypotheses)
- Redundancy and repetition (needs merging/deletion)
- Generate an action list (priority for expansion/compression)
Section-level data usage (more precise positioning):
- If the report contains a chapter table (or the corresponding field exists in the JSON), first locate the specific chapters that contribute most within the "too long/too short" files, then perform targeted rewrites instead of spreading deletions/expansions evenly across the file
- When a file is too long/too short: Compare its section statistics; if the gap is mainly concentrated in 1–2 chapters, prioritize modifying only those 1–2 sections (easier to preserve original meaning and structural stability)
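The "gap concentrated in 1–2 chapters" check can be sketched as follows. This is an illustration under assumptions, not part of the skill's scripts: the function name, the input shape (section name mapped to a signed character gap), and the 80% concentration threshold are all hypothetical.

```python
def dominant_sections(section_gaps: dict, max_sections: int = 2, share: float = 0.8) -> list:
    """Return the top sections if they explain at least `share` of the total gap.

    section_gaps maps a section name to its signed gap (actual minus budget).
    An empty list means the gap is spread out and no 1-2 sections dominate.
    """
    total = sum(abs(gap) for gap in section_gaps.values())
    if total == 0:
        return []
    # Rank sections by the size of their gap, largest first.
    ranked = sorted(section_gaps, key=lambda name: abs(section_gaps[name]), reverse=True)
    top = ranked[:max_sections]
    covered = sum(abs(section_gaps[name]) for name in top)
    return top if covered / total >= share else []


# Hypothetical section names and gaps for one over-length file.
concentrated = dominant_sections(
    {"2.1 Aims": 150, "2.2 Methods": 1900, "2.3 Feasibility": 1200, "2.4 Timeline": 100}
)
spread_out = dominant_sections({"a": 500, "b": 500, "c": 500, "d": 500})
print(concentrated, spread_out)
```

When the function returns one or two names, only those sections need rewriting. An empty result suggests the file needs a broader pass.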
Reference:
references/MEANING_PRESERVING_REWRITE_RUBRIC.md
4) Expand/Compress (Preserve Original Meaning as Much as Possible)
Expansion Strategies (When Too Short)
- First supplement "verifiable information density": definitions, hypotheses, controls, ablation studies, risks and alternative solutions
- Then supplement "argumentation closed-loop": Why do it → How to do it → How to verify expectations → What to do if it fails
- Avoid vague expansion: Do not introduce new claims, do not stack adjectives
Compression Strategies (When Too Long)
- Remove repetitions: Keep only the strongest expression for the same argument
- Cut background: Condense general background into 1-2 sentences, allocate more space to "problem-method-verification"
- Structured rewriting: Split long paragraphs into bullet points (without changing the order of facts)
⚠️ After rewriting, you must run the Step 5 recheck to confirm that the gaps have been eliminated. A rewrite without a recheck is considered incomplete.
2026 "Trim/Enrich" List for Three Core Sections (for Priority Setting)
Usage (turn "static suggestions" into "gap-triggered actions"):
- First check the deviation of the corresponding file in the report: a file flagged as over-length should prioritize "trim"; a file flagged as under-length should prioritize "enrich"; a file within budget needs no changes for budget reasons
- The larger the absolute value of the deviation, the higher the priority; recommended processing order: first modify the file with the largest absolute deviation, then the next largest
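The gap-triggered ordering above can be sketched in a few lines. Everything here is an assumption for illustration: the sign convention (positive means over budget, negative means under budget), the tolerance parameter, the file names, and the action labels are hypothetical and are not taken from the report format.

```python
def plan_actions(deviations: dict, tolerance: int = 0) -> list:
    """Map each file's signed deviation to an action, largest |deviation| first.

    Assumed convention: deviation = actual - budget, so positive is over-length.
    """
    actions = []
    for name, deviation in deviations.items():
        if deviation > tolerance:
            action = "trim"
        elif deviation < -tolerance:
            action = "enrich"
        else:
            action = "keep"  # within budget: no change needed for budget reasons
        actions.append((name, deviation, action))
    # Larger absolute deviation means higher priority.
    return sorted(actions, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical file names and character deviations.
plan = plan_actions(
    {"rationale.tex": 2400, "content.tex": -3100, "basis.tex": 50},
    tolerance=200,
)
for name, deviation, action in plan:
    print(f"{name}: {deviation:+d} -> {action}")
```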
Rationale for the Project (Why do it):
- Trim: Textbook-style popular science, generalized literature reviews, weakly relevant "national needs" elaboration, repeated significance, filler literature
- Enrich: Gap (bottleneck) → Key Idea (breakthrough) → Value argumentation (why it is worth doing)
Research Content (What to do/How to do it):
- Trim: Repetitive statements, overly detailed operational details, list-style method stacking
- Enrich: Logical framework, key experimental design with controls/ablation studies, expected results and verifiable indicators, use figures to illustrate points
Research Basis (Why you can do it):
- Trim: Unrelated achievement stacking, excessive background setup
- Enrich: Strongly relevant pre-experimental data, core technical capabilities, platform conditions (aligned with research content)
5) Recheck for Closed Loop
After revisions, you must run the script again to confirm that the proposal "meets standards and does not exceed limits":
bash
python3 scripts/check_length.py --input <target proposal path> --config config.yaml
Format Red Lines (Common for 2026+)
- Do not reduce font size or line spacing to "squeeze pages" (page count requirements are a review risk point)
- Do not fill up to exactly 30 pages: Recommended ≤28 pages for buffer
- If the annual guidelines require declaring generative AI usage: Be sure to truthfully explain as required (compliance item)
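The two page thresholds above (30 as the limit, 28 as the buffered target) can be expressed as a small check. The function name and the status labels are hypothetical, and the page count itself is assumed to come from the final compiled PDF.

```python
HARD_LIMIT = 30     # "in principle no more than 30 pages"
BUFFER_TARGET = 28  # recommended working target that leaves a buffer


def page_status(pages: int) -> str:
    """Classify a page count against the hard limit and the buffered target."""
    if pages > HARD_LIMIT:
        return "over_limit"
    if pages > BUFFER_TARGET:
        return "within_limit_no_buffer"
    return "ok"


statuses = {pages: page_status(pages) for pages in (27, 28, 29, 30, 31)}
print(statuses)
```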
Conventions and Output Formats
- Reports are presented at "file-level + (optional) section-level"
- Budget takes config.yaml:length_standard as the sole source of truth
- All rewrites should follow the principle of "minimal changes, preserve original meaning" (see references)