kimi-docx


Part 1: Goals

⚠️ When to Unzip vs Read

To preserve ANY formatting from the source document, MUST unzip and parse XML.
Read tool returns plain text only — fonts, colors, alignment, borders, styles are lost.

| Need | Method |
|---|---|
| Text content only (summarize, analyze, translate) | Read tool is fine |
| Formatting info (copy styles, preserve layout, template filling) | Unzip and parse XML |
| Structure + comments/track changes | `pandoc input.docx -t markdown` |
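
As a minimal sketch of the "unzip and parse XML" route, the snippet below opens a .docx as a zip archive and lists each run's text together with its bold flag and ASCII font. It uses Python's standard-library `zipfile` and `xml.etree.ElementTree` only to stay self-contained (the editing guide referenced later uses lxml), and the function name `list_runs` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def list_runs(docx_path):
    """Return (text, is_bold, ascii_font) for every run in word/document.xml."""
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    runs = []
    for run in root.iter(f"{{{W}}}r"):
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        rpr = run.find("w:rPr", NS)
        b = rpr.find("w:b", NS) if rpr is not None else None
        # <w:b/> means bold unless its w:val explicitly switches it off.
        bold = b is not None and b.get(f"{{{W}}}val") not in ("0", "false")
        fonts = rpr.find("w:rFonts", NS) if rpr is not None else None
        font = fonts.get(f"{{{W}}}ascii") if fonts is not None else None
        runs.append((text, bold, font))
    return runs
```

Only direct run formatting is inspected here; formatting inherited from styles lives in `word/styles.xml` and would need a separate lookup.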

Core Principles

  1. Preserve formatting — When editing existing documents, retain original formatting. Clone and modify, never recreate.
  2. Correct feature implementation — Comments need multi-file sync. Track Changes need revision marks. Use the right structure.
Never use python-docx/docx-js as fallback. These libraries produce lower quality output than direct XML manipulation.

Source Principle

Template provided = Act as form-filler, not designer.
  • Format is the user's decision
  • Task: replace placeholders, not redesign
  • Like filling a PDF form—do not redesign
No template = Act as designer. Design freely based on scenario.
For .doc (legacy format), first convert with `libreoffice --headless --convert-to docx`.

Part 2: Execution

File Structure

docx/
├── SKILL.md                      ← This file (entry point + reference)
├── references/
│   └── EditingGuide.md           → Complete Python editing tutorial (comments, track changes, 5-file sync)
├── scripts/
│   ├── docx                      → Unified entry (the only script to call)
│   ├── fix_element_order.py      → Auto-fix XML element ordering
│   ├── validate_docx.py          → Business rule validation
│   ├── generate_backgrounds.py       → Morandi style backgrounds
│   ├── generate_inkwash_backgrounds.py → Ink-wash style backgrounds
│   └── generate_chart.py         → matplotlib (only for heatmaps/3D/radar; simple charts must use native)
├── assets/
│   └── templates/
│       ├── KimiDocx.csproj       → Project file template (for creating new docs)
│       ├── Program.cs            → Program entry template
│       ├── Example.cs            → Complete example (cover+TOC+charts+back cover)
│       ├── CJKExample.cs         → CJK content patterns (quote escaping, fonts)
│       └── xml/                  → XML templates for comments infrastructure
└── validator/                    → OpenXML validator (pre-compiled binary, AI does not modify)
Creating new documents: Use the C# SDK with `./scripts/docx build`. See `Example.cs` for patterns and `CJKExample.cs` for CJK content.
Editing existing documents: Use Python + lxml. See `references/EditingGuide.md` for the complete tutorial.
⚠️ Do NOT mix these approaches. C# SDK for creation, Python for editing. Never use python-docx/docx-js.

Environment Setup

First time, execute in the SKILL directory:
bash
cd /app/.kimi/skills/kimi-docx/
./scripts/docx init
Fixed Path Conventions (cannot be changed):

| Path | Purpose |
|---|---|
| `/app/.kimi/skills/kimi-docx/` | SKILL directory, where commands are executed |
| `/tmp/docx-work/` | Working directory, edit `Program.cs` here |
| `/mnt/okcomputer/output/` | Output directory, final deliverables |
| `/mnt/okcomputer/upload/` | User upload location (input files) |

Script Commands (`./scripts/docx <cmd>`):

| Command | Purpose |
|---|---|
| `env` | Show environment status (no changes) |
| `init` | Setup dependencies + workspace |
| `build [out]` | Compile, run, validate (default: output/output.docx) |
| `validate FILE` | Validate existing docx |

The script automatically handles:
  • Detects dotnet/python3 (required), pandoc/playwright/matplotlib (optional)
  • Installed → use directly; Not installed → auto-install; Broken → repair
  • Initializes working directory, copies template files

Build Process

Must use `./scripts/docx build`; do not execute `dotnet build && dotnet run` separately (this skips validation).

Program.cs Output Path Convention (Critical)

Program.cs must get output path from command line arguments, otherwise build script cannot find the generated file:
csharp
// Correct - get output path from command line arguments
string outputFile = args.Length > 0 ? args[0] : "/mnt/okcomputer/output/output.docx";

// Wrong - hardcoded path causes build failure
string outputFile = "my_document.docx";  // Script can't find file!

| Step | Action | Notes |
|---|---|---|
| 1. Compile | `dotnet build` | Provides fix suggestions on failure |
| 2. Generate | `dotnet run -- <output path>` | Path passed via command line args |
| 3. Auto-fix | `fix_element_order.py` | Fixes XML element ordering issues |
| 4. OpenXML validation | `validator/` | Mandatory |
| 5. Business rules | `validate_docx.py` | Mandatory |
| 6. Statistics | Character + word count | Optional (requires pandoc) |

Validation is mandatory: On failure, file is kept but warnings are shown. Check error messages to fix issues.

Standalone Validation

bash
cd /app/.kimi/skills/kimi-docx/
./scripts/docx validate /mnt/okcomputer/output/report.docx

Content Verification (Mandatory)

pandoc is the SOURCE OF TRUTH. OpenXML validator checks structure; pandoc shows actual content.
Before delivery, verify with pandoc:
  • `pandoc output.docx -t plain`: check text completeness
  • For revisions/comments: add `--track-changes=all` to verify marker positions
⚠️ Critical: `comments.xml` exists ≠ comments visible. Count mismatch = `doc_tree` not saved. See `references/EditingGuide.md` §5.3.
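
One way to catch the "comments.xml exists but comments are not visible" case is to compare the number of comments defined in `word/comments.xml` with the number of comment anchors in `word/document.xml`. The sketch below is an assumption about what such a check could look like, not the procedure from `references/EditingGuide.md`; `comment_counts` is a hypothetical helper and it uses the standard library instead of lxml.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def comment_counts(docx_path):
    """Return (comments defined in comments.xml, comment anchors in document.xml)."""
    with zipfile.ZipFile(docx_path) as zf:
        names = set(zf.namelist())
        doc = ET.fromstring(zf.read("word/document.xml"))
        defined = 0
        if "word/comments.xml" in names:
            comments = ET.fromstring(zf.read("word/comments.xml"))
            defined = len(comments.findall(f"{{{W}}}comment"))
    # Each visible comment needs a <w:commentReference> in the document body.
    anchored = sum(1 for _ in doc.iter(f"{{{W}}}commentReference"))
    return defined, anchored
```

If the two numbers differ, the document tree was most likely not written back after the comments part was updated.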

Part 3: Quality Standards

Delivery Standard

Generic styling and mediocre aesthetics = mediocre delivery.
Deliver studio-quality Word documents with deep thought on content, functionality, and styling. Users often don't explicitly request advanced features (covers, TOC, backgrounds, back covers, footnotes, charts)—deeply understand needs and proactively extend.

Language Consistency

Document language = User conversation language (including filename, body text, headings, headers, TOC hints, chart labels, and all other text).

Headers and Footers - REQUIRED BY DEFAULT

Most documents MUST include headers and footers. The specific style (alignment, format, content) should match the document's overall design.
  • Header: Typically document title, company name, or chapter name
  • Footer: Typically page numbers (format flexible: "X / Y", "Page X", "— X —", etc.)
  • Cover/Back cover: Use the `TitlePage` setting to hide header/footer on first page

Professional Elements (Critical)

Create documents that exceed user expectations, proactively add professional elements, don't wait for users to ask. Delivery standard: Visual quality of a top designer in 2024.
Cover & Visual:
  • Formal documents (proposals, reports, financials, bids, contracts) / creative documents (invitations, greeting cards) must have cover and back cover
  • Covers must have designer-quality background images
  • Body pages can optionally include backgrounds to enhance visual appeal
Structure:
  • Long documents (3+ sections) add TOC, must add refresh hint after TOC
Data Presentation:
  • When comparing data or showing trends, use charts instead of plain text lists
  • Tables use light gray headers or three-line style, avoid Word default blue
Links & References:
  • URLs must be clickable hyperlinks
  • Multiple figures/tables add numbering and cross-references ("see Figure 1", "as shown in Table 2")
  • Academic/legal/data analysis citation scenarios implement correct in-text click-to-jump references with corresponding footnotes/endnotes

TOC Refresh Hint

Word TOC is field code, page numbers may be inaccurate when generated. Must add gray hint text after TOC, informing users to manually refresh:
Table of Contents
─────────────────
Chapter 1 Overview .......................... 1
Chapter 2 Methods ........................... 3
...

(Hint: On first open, right-click the TOC and select "Update Field" to show correct page numbers)
Hint text requirements:
  • Visually subtle — gray color, smaller font size, should not compete with actual TOC entries
  • Language: Matches user conversation language

Only When User Explicitly Requests

| Feature | Reason |
|---|---|
| Watermark | Changes visual state. SDK limitation: VML watermark classes don't serialize correctly; must write raw XML to header. |
| Document protection | Restricts editing |
| Mail merge fields | Requires data source |

Chart Selection Strategy (Critical)

Default to native Word charts, editable, small file size, professional.

| Chart Type | Method | Notes |
|---|---|---|
| Pie chart | Native | `Example.cs` `AddPieChart()` |
| Bar chart | Native | `Example.cs` `AddBarChart()` |
| Line chart | Native | Reference bar chart structure, use `c:lineChart` |
| Horizontal bar | Native | Reference bar chart structure, use `barDir="bar"` |
| Heatmap, 3D, radar | matplotlib | Word native doesn't support |
| Complex statistics (box plot, etc.) | matplotlib | Word native doesn't support |

Native charts are preferred (editable, smaller files), but matplotlib is acceptable for data analysis scenarios.

Inserting Images/Charts

Any PNG (matplotlib charts, backgrounds, photos) must be inserted using `AddInlineImage()`:
csharp
AddInlineImage(body, mainPart, "/path/to/image.png", "Description", docPrId++);
Critical:
  • Chart labels/titles must match document language (e.g., Chinese labels for Chinese docs)
  • Build output shows `X images`; if 0, images were not inserted

Content Constraints

Word/Page Count Requirements

| User Request | Execution Standard |
|---|---|
| Specific word count (e.g., "3000 words") | Actual output within ±20% |
| Specific page count (e.g., "5 pages") | Exact match |
| Range (e.g., "2000-3000 words") | Within range |
| Minimum (e.g., "at least 5000 words") | No more than 2x the requirement |

Forbidden: Padding word count with excessive bullet point lists. Maintain information density.
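
The word-count standards above reduce to three simple bound checks. The helpers below are a hedged sketch with hypothetical names; they assume the actual count comes from elsewhere (for example the statistics step of the build) and only encode the tolerances listed in the table.

```python
def within_target(actual, target, tolerance=0.20):
    """Specific word count: accept target plus or minus the tolerance (20%)."""
    return target * (1 - tolerance) <= actual <= target * (1 + tolerance)


def within_range(actual, low, high):
    """Range request: both bounds are inclusive."""
    return low <= actual <= high


def within_minimum(actual, minimum, cap=2):
    """'At least N' request: at least N, but no more than cap times N."""
    return minimum <= actual <= minimum * cap
```

For a "3000 words" request, `within_target(2500, 3000)` passes while `within_target(3700, 3000)` does not.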

Outline Adherence

  • User provides outline: Follow strictly, no additions, deletions, or reordering
  • No outline provided: Use standard structure
    • Academic: Introduction → Literature → Methods → Results → Discussion → Conclusion
    • Business: Executive Summary → Analysis → Recommendations
    • Technical: Overview → Principles → Usage → Examples → FAQ

Scene Completeness

Think one step ahead of the user, complete elements the scenario needs. Examples below are not exhaustive — apply this principle to ALL document types:
  • Exam paper → Name/class/ID fill areas, point allocation per question (consider total), grading section
  • Contract → Signature and seal areas for both parties, date, contract number, attachment list
  • Meeting minutes → Attendees, absentees, action items with owners, next meeting time

Design Philosophy

Color Scheme

Low saturation tones, avoid Word default blue and matplotlib default high saturation.
Flexibly choose color schemes based on document scenario:

| Style | Palette | Suitable Scenarios |
|---|---|---|
| Morandi | Soft muted tones | Artistic, editorial |
| Earth tones | Brown, olive, natural | Environmental, organic |
| Nordic | Cool gray, misty blue | Minimalist, tech |
| Japanese Wabi-sabi | Gray, raw wood, zen | Traditional, contemplative |
| French elegance | Off-white, dusty pink | Luxury, feminine |
| Industrial | Charcoal, rust, concrete | Manufacturing, engineering |
| Academic | Navy, burgundy, ivory | Research, education |
| Ocean mist | Misty blue, sand | Marine, wellness |
| Forest moss | Olive, moss green | Nature, sustainability |
| Desert dusk | Ochre, sandy gold | Warm, regional |

Color scheme must be consistent within the same document.

Layout

White space (margins, paragraph spacing), clear hierarchy (H1 > H2 > body), proper padding (text shouldn't touch borders).

Pagination Control

Word uses flow layout, not fixed pages. Control pagination with these properties:

| Property | XML | Effect |
|---|---|---|
| Keep with next | `<w:keepNext/>` | Heading stays on same page as following paragraph |
| Keep lines together | `<w:keepLines/>` | Paragraph won't break across pages |
| Page break before | `<w:pageBreakBefore/>` | Force new page (for H1) |
| Widow/orphan control | `<w:widowControl/>` | Prevent single lines at top/bottom of page |

csharp
// Example: H1 always starts on new page, stays with next paragraph
// Children listed in schema order: pStyle, keepNext, keepLines, pageBreakBefore
new ParagraphProperties(
    new ParagraphStyleId { Val = "Heading1" },
    new KeepNext(),
    new KeepLines(),
    new PageBreakBefore()
)
Table pagination:
csharp
// Allow row to break across pages (avoid large blank areas)
new TableRowProperties(
    new CantSplit { Val = false }  // false = can split
)

// Repeat header row on each page
new TableRowProperties(
    new TableHeader()
)

Part 4: Technical Reference

Choose your path:

| Task | Stack | Reference |
|---|---|---|
| Create new document | C# + OpenXML SDK | 4.1-4.6 + `Example.cs` |
| Edit existing document | Python + lxml | 4.7 + `references/EditingGuide.md` |

4.1 SDK Fundamentals

Schema Compliance (MEMORIZE THESE)

OpenXML has strict element ordering requirements. Wrong order = Word cannot open the file.

Required Styles

csharp
// Normal style must exist - all Heading styles use basedOn="Normal"
styles.Append(new Style(
    new StyleName { Val = "Normal" },
    new StyleParagraphProperties(
        new SpacingBetweenLines { After = "200", Line = "276", LineRule = LineSpacingRuleValues.Auto }
    ),
    new StyleRunProperties(
        new RunFonts { Ascii = "Calibri", HighAnsi = "Calibri" },
        new FontSize { Val = "22" },
        new FontSizeComplexScript { Val = "22" }
    )
) { Type = StyleValues.Paragraph, StyleId = "Normal", Default = true });

Element Order Rules

Most ordering issues are auto-fixed by `fix_element_order.py`. Key rules to remember:

| Parent | Key Rule |
|---|---|
| `sectPr` | `headerRef`, `footerRef` must come before `pgSz`, `pgMar` |
| Table | Must have `tblGrid` between `tblPr` and `tr` (see below) |
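
To make the `sectPr` rule concrete, here is a small sketch that checks and repairs the order of header/footer references relative to `pgSz`/`pgMar`. It is not the actual `fix_element_order.py` (whose implementation is not shown here); the function names are hypothetical, and the XML element names `w:headerReference` and `w:footerReference` are the serialized forms of the references named in the table.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
REFS = ("headerReference", "footerReference")


def local(tag):
    """Strip the '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def sectpr_order_ok(sect_pr):
    """True if every header/footer reference precedes pgSz and pgMar."""
    names = [local(child.tag) for child in sect_pr]
    refs = [i for i, n in enumerate(names) if n in REFS]
    page = [i for i, n in enumerate(names) if n in ("pgSz", "pgMar")]
    return not refs or not page or max(refs) < min(page)


def fix_sectpr_order(sect_pr):
    """Move header/footer references to the front, keeping their relative order."""
    refs = [c for c in sect_pr if local(c.tag) in REFS]
    for child in refs:
        sect_pr.remove(child)
    for position, child in enumerate(refs):
        sect_pr.insert(position, child)
```

The repair only handles this one rule; a full fixer would sort all children against the schema's declared sequence.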

Tables Must Have tblGrid

csharp
// Correct - table must define grid
var table = new Table();
table.Append(new TableProperties(...));
table.Append(new TableGrid(           // Required!
    new GridColumn { Width = "4680" },
    new GridColumn { Width = "4680" }
));
table.Append(new TableRow(...));

// Wrong - missing tblGrid, Word cannot open
var table = new Table();
table.Append(new TableProperties(...));
table.Append(new TableRow(...));  // Adding rows directly

Table Column Width Consistency

Main cause of skewed tables: the `gridCol` width in `tblGrid` doesn't match the cell's `tcW` width.
csharp
// Correct - gridCol and tcW match exactly
table.Append(new TableGrid(
    new GridColumn { Width = "3600" },  // First column
    new GridColumn { Width = "5400" }   // Second column
));

var row = new TableRow(
    new TableCell(
        new TableCellProperties(
            new TableCellWidth { Width = "3600", Type = TableWidthUnitValues.Dxa }  // Matches gridCol!
        ),
        new Paragraph(new Run(new Text("Content")))
    ),
    new TableCell(
        new TableCellProperties(
            new TableCellWidth { Width = "5400", Type = TableWidthUnitValues.Dxa }  // Matches gridCol!
        ),
        new Paragraph(new Run(new Text("Content")))
    )
);

| Rule | Reason |
|---|---|
| gridCol count = table column count | Otherwise column width calculation fails |
| gridCol.Width = tcW.Width | Mismatch causes skewing (checked during validation) |
| All rows in same column use same tcW | Maintains column width consistency |
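
The width rules can be checked mechanically on a parsed `w:tbl` element. The sketch below is illustrative only: `width_mismatches` is a hypothetical helper, it assumes no merged cells (`gridSpan`), and it is not the check performed by `validate_docx.py`.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def width_mismatches(tbl):
    """Return (row, col, gridCol width, tcW width) for cells that disagree with tblGrid."""
    grid = [gc.get(f"{{{W}}}w") for gc in tbl.findall("w:tblGrid/w:gridCol", NS)]
    problems = []
    for r, row in enumerate(tbl.findall("w:tr", NS)):
        for c, cell in enumerate(row.findall("w:tc", NS)):
            tcw = cell.find("w:tcPr/w:tcW", NS)
            width = tcw.get(f"{{{W}}}w") if tcw is not None else None
            # A cell beyond the declared grid has no expected width at all.
            expected = grid[c] if c < len(grid) else None
            if width != expected:
                problems.append((r, c, expected, width))
    return problems
```

An empty result means every cell width matches its grid column.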

Value Limits

  • `paraId` must be < `0x80000000` (for comment paragraph IDs)
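
A minimal sketch of generating IDs that respect this limit, assuming the usual 8-digit uppercase hex form; `new_para_id` is a hypothetical helper, and starting the range at 1 (to avoid an all-zero ID) is an extra assumption rather than a stated rule.

```python
import random


def new_para_id(rng=random):
    """Return an 8-digit uppercase hex paraId in the range 1 .. 0x7FFFFFFF."""
    return f"{rng.randrange(1, 0x80000000):08X}"
```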

Creation vs Editing

| Task | Method | Why |
|---|---|---|
| Create new document | C# OpenXML SDK | Handles package structure, rels, Content_Types automatically |
| Edit existing document | Python + lxml | Transparent, no black box, full control |

For creating new documents: Use `Example.cs` patterns with SDK.
For editing existing documents: See `references/EditingGuide.md` for complete Python workflow.

Example.cs

Read the entire file to understand the overall structure, not just individual functions. The file demonstrates how sections connect (cover → TOC → body → back cover).
The "Project Proposal", "[Company Name]", etc. in Example are example content only, and the color scheme is for reference only.

| What to Learn | What NOT to Learn |
|---|---|
| Section division (cover → TOC → body → back cover) | Specific color values |
| Floating background insertion code | Business content from the example |
| Chart creation API calls | Copy/wording from the example |
| Style definition structure | Hardcoded data from the example |

⚠️ Do NOT copy the Example's color scheme. Redesign visual style based on YOUR document's scenario, like a top designer.
Function Index (read source for implementation details):

| Feature | Function | Line # |
|---|---|---|
| **Document Structure** | | |
| Styles (Normal, Heading1-3) | `AddStyles()` | 85-203 |
| Cover page | `AddCoverSection()` | 369-453 |
| Table of contents | `AddTocSection()` | 458-526 |
| Body section | `AddContentSection()` | 531-729 |
| Back cover | `AddBackcoverSection()` | 734-794 |
| **Visual Elements** | | |
| Floating background | `CreateFloatingBackground()` | 228-279 |
| Proportional inline image | `AddInlineImage()` | 285-364 |
| **Tables** | | |
| Three-line table | `CreateDataTable()` | 853-888 |
| Header row (gray bg) | `CreateSimpleHeaderRow()` | 933-971 |
| Data row | `CreateSimpleDataRow()` | 976-1008 |
| **Charts** | | |
| Pie chart | `AddPieChart()` | 1013-1049 |
| Bar chart | `AddBarChart()` | 1133-1169 |
| **Page Elements** | | |
| Header with background | within `AddContentSection()` | 534-575 |
| Footer with page numbers | within `AddContentSection()` | 578-588 |
| Page number field | `CreatePageNumberField()` | 1345-1354 |
| Total pages field | `CreateTotalPagesField()` | 1356-1365 |
| **Advanced Features** | | |
| Footnote | `AddFootnote()` | 1370-1410 |
| Cross-reference | `CreateCrossReference()` | 1415-1425 |
| Numbering/lists | `CreateBasicNumbering()` | 1327-1340 |

CJKExample.cs

CJK documents must read `CJKExample.cs` only. Reading `Example.cs` instead will cause errors (missing font config, quote escaping). It handles:
  • Quote escaping (`""`, `\u201c`, `\u201d`)
  • CJK font configuration (SimHei, Microsoft YaHei)
  • Paragraph indentation for CJK text
Structure is identical to `Example.cs`, so there is no need to read both.

4.2 Content Elements

Field Codes

PAGE/NUMPAGES/DATE/TOC structure: `FieldChar(Begin)` → `FieldCode(" PAGE ")` → `FieldChar(Separate)` → `Text` → `FieldChar(End)`. Results are cached; WPS doesn't support `UpdateFieldsOnOpen`.
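
At the XML level the same five-part structure is a sequence of runs. The sketch below builds them with the standard library to show the shape; `field_runs` is a hypothetical helper, and the cached text is whatever placeholder the reader should see before fields are refreshed.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("w", W)


def w(tag):
    """Qualify a tag or attribute name with the WordprocessingML namespace."""
    return f"{{{W}}}{tag}"


def field_runs(instruction, cached_text):
    """Build the five runs of a complex field: begin, instrText, separate, result, end."""
    def run(child):
        r = ET.Element(w("r"))
        r.append(child)
        return r

    def fld_char(kind):
        return ET.Element(w("fldChar"), {w("fldCharType"): kind})

    instr = ET.Element(w("instrText"), {XML_SPACE: "preserve"})
    instr.text = f" {instruction} "  # field codes are padded with spaces
    result = ET.Element(w("t"))
    result.text = cached_text        # cached value shown until fields update
    return [run(fld_char("begin")), run(instr), run(fld_char("separate")),
            run(result), run(fld_char("end"))]
```

Calling `field_runs("PAGE", "1")` yields the runs to append to a footer paragraph.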

Bookmarks and Cross-References

Bookmarks mark positions (`BookmarkStart`/`BookmarkEnd` with matching IDs); cross-references link via REF field (`" REF bookmarkName \\h "`).
Pitfall: Deleting bookmarked text deletes bookmark → "Error! Reference source not found".
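
The pitfall can be detected before delivery by listing REF fields whose bookmark no longer exists. This is a sketch under assumptions: `dangling_refs` is a hypothetical helper, it takes a parsed `word/document.xml` root, and it only reads field instructions stored in a single `w:instrText` element.

```python
import re
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def dangling_refs(root):
    """Return bookmark names used by REF fields that have no matching w:bookmarkStart."""
    defined = {b.get(f"{{{W}}}name") for b in root.iter(f"{{{W}}}bookmarkStart")}
    used = set()
    for instr in root.iter(f"{{{W}}}instrText"):
        # Field instruction looks like " REF bookmarkName \h ".
        match = re.match(r"\s*REF\s+(\S+)", instr.text or "")
        if match:
            used.add(match.group(1))
    return sorted(used - defined)
```

Any name returned here would render as "Error! Reference source not found" once fields are updated.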

4.3 Visual Design

Background Image Design

Cover/back cover must have background. Background images should have center white space, use low saturation colors. Background images must NOT contain any text; text should be implemented in Word for user editability.

Design Flow

  1. Read example: Read
    scripts/generate_backgrounds.py
    for HTML/CSS techniques (radial-gradient, transparency, positioning)
  2. Choose direction: Select a style direction from the table below based on document scenario
  3. Create original: Write new HTML/CSS from scratch—the example shows ONE style, yours should be different
⚠️ Copying the example = all documents look the same = mediocre delivery. Each document deserves a unique visual identity matching its content and purpose.

Style Reference

| Style | Key Elements | Scenarios |
|---|---|---|
| MUJI | Thin borders + white space | Minimalist, Japanese, lifestyle |
| Bauhaus | Scattered geometric shapes | Art, design, creative |
| Swiss Style | Grid lines + accent bars | Professional, corporate |
| Soft Blocks | Soft color rectangles, overlapping transparent | Warm, education, healthcare |
| Rounded Geometry | Rounded rectangles, pill shapes | Tech, internet, youthful |
| Frosted Glass | Blur + transparency + subtle borders | Modern, premium, tech |
| Gradient Ribbons | Soft gradient ellipses + small dots | Feminine, beauty, soft |
| Dot Matrix | Regular dot pattern texture | Technical, data, engineering |
| Double Border | Nested borders + corner decorations | Traditional, formal, legal |
| Waves | Bottom SVG waves + gradient background | Ocean, environmental, flowing |
| Warm Natural | Earth tones + organic shapes | Environmental, agriculture, natural |

Technical: Playwright generates 794×1123px (`device_scale_factor=2`), insert as floating Anchor with `BehindDoc=true`. See `Example.cs:CreateFloatingBackground()`.
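
The 794×1123 px viewport matches an A4 page (210×297 mm) at 96 DPI; that reading is an inference from the numbers, since the page size is not stated here. The arithmetic below is a small worked sketch with hypothetical helper names, including the conversion to EMU that a floating anchor's extent would use.

```python
MM_PER_INCH = 25.4
EMU_PER_INCH = 914400  # English Metric Units used by DrawingML extents


def mm_to_px(mm, dpi=96):
    """Convert millimetres to whole CSS pixels at the given DPI."""
    return round(mm / MM_PER_INCH * dpi)


def px_to_emu(px, dpi=96):
    """Convert CSS pixels to EMU (9525 EMU per pixel at 96 DPI)."""
    return px * EMU_PER_INCH // dpi


a4_px = (mm_to_px(210), mm_to_px(297))                # (794, 1123)
png_px = (a4_px[0] * 2, a4_px[1] * 2)                 # (1588, 2246) at scale factor 2
a4_emu = (px_to_emu(a4_px[0]), px_to_emu(a4_px[1]))   # (7562850, 10696575)
```

The doubled pixel size only affects image sharpness; the extent placed in the document stays at the page-sized EMU values.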

Letterhead (Business Documents)

For formal business letters, consider adding a letterhead in the header area. Common patterns:
  • Full letterhead on first page (logo + company name + contact info), simplified or hidden on subsequent pages
  • Use `TitlePage` in `SectionProperties` to enable different first-page header
  • Design flexibly based on the specific business context—no fixed rules

Two-Column Layout

Use `sectPr` with `Columns`. This affects the entire section until the next `sectPr`.

4.4 Special Content

Math Formulas (OMML)

Core pattern: `<m:e>` is the universal content container. Almost all elements wrap content in `<m:e>`.
Text: Always `<m:r><m:t>text</m:t></m:r>`, never bare text.
Root: `<m:oMath>` (inline) or `<m:oMathPara>` (display). Do NOT nest `<m:oMath>` inside another.
Structure examples:

| Element | Structure |
|---|---|
| Fraction | `<m:f><m:num><m:e>…</m:e></m:num><m:den><m:e>…</m:e></m:den></m:f>` |
| Subscript | `<m:sSub><m:e>base</m:e><m:sub><m:e>…</m:e></m:sub></m:sSub>` |
| Superscript | `<m:sSup><m:e>base</m:e><m:sup><m:e>…</m:e></m:sup></m:sSup>` |
| Radical | `<m:rad><m:deg><m:e>n</m:e></m:deg><m:e>radicand</m:e></m:rad>` |
| Matrix | `<m:m><m:mr><m:e>cell</m:e><m:e>cell</m:e></m:mr></m:m>` |
| Nary (∑∫) | `<m:nary><m:sub><m:e>…</m:e></m:sub><m:sup><m:e>…</m:e></m:sup><m:e>body</m:e></m:nary>` |
| Delimiter | `<m:d><m:dPr><m:begChr m:val="("/><m:endChr m:val=")"/></m:dPr><m:e>…</m:e></m:d>` |
| Equation array | `<m:eqArr><m:e>eq1</m:e><m:e>eq2</m:e></m:eqArr>` |

Trap: Matrix uses `<m:e>` for cells, NOT `<m:mc>` (which is for column properties).
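
The matrix trap is easiest to see in code. The following sketch builds a small matrix with the standard library, putting each cell in an `<m:e>` and each piece of text in `<m:r><m:t>`; the helper names are hypothetical and matrix properties such as column alignment are left out.

```python
import xml.etree.ElementTree as ET

M = "http://schemas.openxmlformats.org/officeDocument/2006/math"
ET.register_namespace("m", M)


def m(tag):
    """Qualify a tag name with the Office Math namespace."""
    return f"{{{M}}}{tag}"


def text_run(text):
    """Wrap text as <m:r><m:t>text</m:t></m:r>; OMML never holds bare text."""
    r = ET.Element(m("r"))
    ET.SubElement(r, m("t")).text = text
    return r


def matrix(rows):
    """Build <m:oMath><m:m>...</m:m></m:oMath> where every cell is an <m:e>."""
    omath = ET.Element(m("oMath"))
    mat = ET.SubElement(omath, m("m"))
    for row in rows:
        mr = ET.SubElement(mat, m("mr"))
        for cell in row:
            ET.SubElement(mr, m("e")).append(text_run(cell))
    return omath
```

`ET.tostring(matrix([["a", "b"], ["c", "d"]]), encoding="unicode")` gives the fragment to place inside a paragraph.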
核心模式
<m:e>
是通用内容容器。几乎所有元素都将内容包裹在
<m:e>
中。
文本:始终使用
<m:r><m:t>text</m:t></m:r>
,切勿使用裸文本。
根元素
<m:oMath>
(行内公式)或
<m:oMathPara>
(显示公式)。请勿嵌套
<m:oMath>
结构示例
元素结构
分数
<m:f><m:num><m:e>…</m:e></m:num><m:den><m:e>…</m:e></m:den></m:f>
下标
<m:sSub><m:e>base</m:e><m:sub><m:e>…</m:e></m:sub></m:sSub>
上标
<m:sSup><m:e>base</m:e><m:sup><m:e>…</m:e></m:sup></m:sSup>
根式
<m:rad><m:deg><m:e>n</m:e></m:deg><m:e>radicand</m:e></m:rad>
矩阵
<m:m><m:mr><m:e>cell</m:e><m:e>cell</m:e></m:mr></m:m>
累加/积分(∑∫)
<m:nary><m:sub><m:e>…</m:e></m:sub><m:sup><m:e>…</m:e></m:sup><m:e>body</m:e></m:nary>
分隔符
<m:d><m:dPr><m:begChr m:val="("/><m:endChr m:val=")"/></m:dPr><m:e>…</m:e></m:d>
方程组
<m:eqArr><m:e>eq1</m:e><m:e>eq2</m:e></m:eqArr>
陷阱:矩阵使用
<m:e>
表示单元格,而非
<m:mc>
<m:mc>
用于列属性)。
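To make the container rules concrete, here is a hedged sketch that builds a parenthesized 2×2 matrix as inline OMML with Python's standard library. The helper names (`run`, `matrix`, `parenthesized`) are invented for this example. It shows text wrapped in `<m:r><m:t>`, matrix cells as `<m:e>`, and a single `<m:oMath>` root.

```python
import xml.etree.ElementTree as ET

M = "http://schemas.openxmlformats.org/officeDocument/2006/math"
ET.register_namespace("m", M)


def m(tag: str) -> str:
    return f"{{{M}}}{tag}"


def run(text: str) -> ET.Element:
    """Math text is always <m:r><m:t>text</m:t></m:r>."""
    r = ET.Element(m("r"))
    ET.SubElement(r, m("t")).text = text
    return r


def matrix(rows) -> ET.Element:
    """Each row is <m:mr>; each cell is an <m:e>, never <m:mc>."""
    mat = ET.Element(m("m"))
    for row in rows:
        mr = ET.SubElement(mat, m("mr"))
        for cell in row:
            ET.SubElement(mr, m("e")).append(run(cell))
    return mat


def parenthesized(inner: ET.Element) -> ET.Element:
    """Wrap content in a delimiter with explicit ( and ) characters."""
    d = ET.Element(m("d"))
    d_pr = ET.SubElement(d, m("dPr"))
    ET.SubElement(d_pr, m("begChr")).set(m("val"), "(")
    ET.SubElement(d_pr, m("endChr")).set(m("val"), ")")
    ET.SubElement(d, m("e")).append(inner)
    return d


o_math = ET.Element(m("oMath"))  # single inline root, nothing nested inside another oMath
o_math.append(parenthesized(matrix([["a", "b"], ["c", "d"]])))
omml = ET.tostring(o_math, encoding="unicode")
print(omml)
```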

Curly Quotes in C# Strings

Curly quotes “ ” inside a C# string literal are easily written out as straight `"`, which C# treats as string delimiters → CS1003. Simplest fix: use escaped straight quotes `\"` in string literals. If curly quotes are required, use the Unicode escapes listed below in C# strings; in raw XML, the numeric character references `&#8220;` `&#8221;` (doubles) or `&#8216;` `&#8217;` (singles) also work.
Chinese quote handling (see `CJKExample.cs` for complete patterns):
```csharp
// ❌ Wrong - the straight quotes around 确定 end the string early
new Text("请点击"确定"按钮")  // CS1003!

// ✓ Correct - use Unicode escapes
new Text("请点击\u201c确定\u201d按钮")
```
| Character | Unicode | Usage |
| --- | --- | --- |
| “ (left double) | `\u201c` | Opening quote |
| ” (right double) | `\u201d` | Closing quote |
| ‘ (left single) | `\u2018` | Opening single |
| ’ (right single) | `\u2019` | Closing single |
⚠️ Do NOT use verbatim strings `@""`: `\u` escapes don't work in verbatim strings:
```csharp
// ❌ WRONG - @"" verbatim string, \u is NOT processed, outputs literal "\u201c"
string wrong = @"她说\u201c你好\u201d";  // Outputs: 她说\u201c你好\u201d

// ✓ CORRECT - regular string, \u IS processed
string right = "她说\u201c你好\u201d";   // Outputs: 她说“你好”

// ✓ For long text, use + concatenation
string para = "第一段内容," +
              "她说\u201c这是引用\u201d," +
              "继续写第二段。";
```

Units

Twips = 1/20 pt = 1/1440 inch (11906 = A4 width). Font sizes are in half-points (24 = 12pt). EMU = 914400 per inch (12700 per pt).
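The conversions above are simple enough to keep in a few helpers. This is an illustrative sketch; the function names are not part of any library mentioned in this guide.

```python
TWIPS_PER_PT = 20
TWIPS_PER_INCH = 1440
EMU_PER_INCH = 914400
EMU_PER_TWIP = EMU_PER_INCH // TWIPS_PER_INCH  # 635
MM_PER_INCH = 25.4


def pt_to_twips(points: float) -> int:
    return round(points * TWIPS_PER_PT)


def mm_to_twips(millimetres: float) -> int:
    return round(millimetres / MM_PER_INCH * TWIPS_PER_INCH)


def pt_to_half_points(points: float) -> int:
    """Value for w:sz, which stores font size in half-points."""
    return round(points * 2)


def inches_to_emu(inches: float) -> int:
    return round(inches * EMU_PER_INCH)


def twips_to_emu(twips: int) -> int:
    return twips * EMU_PER_TWIP


print(mm_to_twips(210), mm_to_twips(297))  # A4 page width and height in twips
print(pt_to_half_points(12))               # w:sz value for 12pt text
```

Rounding to whole units matters because these attributes are stored as integers in the XML.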

4.5 Page Layout

Image Size

The `Cx`/`Cy` values of `wp:extent` and `a:ext` must match. For proportional scaling: read the PNG header (bytes 16-23) for the pixel dimensions, then calculate `cy = cx * height / width`.
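A sketch of that calculation in Python, assuming a standard PNG whose first chunk is IHDR (width in bytes 16-19, height in bytes 20-23, big-endian). The function names and the 4-inch target width are illustrative.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
EMU_PER_INCH = 914400


def png_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) in pixels from the IHDR chunk of a PNG."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def scaled_extent(data: bytes, target_width_inches: float) -> tuple[int, int]:
    """Return (cx, cy) in EMU, keeping the image's aspect ratio."""
    width, height = png_size(data)
    cx = round(target_width_inches * EMU_PER_INCH)
    cy = round(cx * height / width)
    return cx, cy


# Synthetic 800x600 header, just enough bytes for the demonstration.
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 800, 600)
print(scaled_extent(header, 4.0))
```

Write the same `(cx, cy)` pair into both `wp:extent` and `a:ext` so the two stay in sync.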

Pagination Control

Add `KeepNext` to heading and chart paragraphs so that a title is never stranded at the bottom of a page and a chart stays with its caption.

Section Breaks

A `sectPr` inside `pPr` marks the last paragraph of its section. Avoid combining a `PageBreak` with a `Continuous` section break (it produces a blank page); use a `NextPage` section break instead.
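As a rough illustration, the paragraph that ends a section can be built like this with Python's standard library. The function name is hypothetical, and a real `sectPr` would normally also carry page size and margins.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)


def w(tag: str) -> str:
    return f"{{{W}}}{tag}"


def section_end_paragraph(break_type: str = "nextPage") -> ET.Element:
    """Last paragraph of a section: its w:pPr holds the section's w:sectPr."""
    para = ET.Element(w("p"))
    p_pr = ET.SubElement(para, w("pPr"))
    sect_pr = ET.SubElement(p_pr, w("sectPr"))
    # The next section starts on a new page; no separate page-break run is needed.
    ET.SubElement(sect_pr, w("type")).set(w("val"), break_type)
    return para


section_end = section_end_paragraph()
print(ET.tostring(section_end, encoding="unicode"))
```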

Table of Contents (TOC)

WPS doesn't support `UpdateFieldsOnOpen` → must pre-populate TOC entries using field code structure: `FieldChar(Begin)` → `FieldCode(" TOC ...")` → `FieldChar(Separate)` → placeholder entries (hyperlinked text + page numbers) → `FieldChar(End)`. The placeholder entries between Separate and End allow Word to display a TOC immediately; users refresh to get accurate page numbers. Never use static text paragraphs to simulate a TOC: without the field code structure it cannot be refreshed. See `Example.cs:AddTocSection()`.
Parameters: `\o "1-3"` (heading levels), `\h` (hyperlinks), `\z` (hide page# in web), `\u` (outline level).
Headings must use the built-in `Heading1`/`Heading2` styles (custom styles are not recognized).
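The field structure can be sketched at the XML level with Python's standard library. This is a simplified illustration, not the code in `Example.cs`: the helper names are invented, the placeholder entries are plain title/tab/page runs, and the hyperlinks, bookmarks and TOC paragraph styles of a full implementation are left out.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("w", W)


def w(tag: str) -> str:
    return f"{{{W}}}{tag}"


def field_char(kind: str) -> ET.Element:
    """Run holding one w:fldChar: 'begin', 'separate' or 'end'."""
    run = ET.Element(w("r"))
    ET.SubElement(run, w("fldChar")).set(w("fldCharType"), kind)
    return run


def text_run(text: str) -> ET.Element:
    run = ET.Element(w("r"))
    t = ET.SubElement(run, w("t"))
    t.set(XML_SPACE, "preserve")
    t.text = text
    return run


def toc_paragraphs(entries, switches=r'\o "1-3" \h \z \u'):
    """Return TOC paragraphs; entries is a list of placeholder (title, page) pairs."""
    if not entries:
        raise ValueError("at least one placeholder entry is needed")
    paragraphs = []
    for index, (title, page) in enumerate(entries):
        para = ET.Element(w("p"))
        if index == 0:
            # The field opens in the first entry paragraph: begin, code, separate.
            para.append(field_char("begin"))
            instr = ET.SubElement(ET.SubElement(para, w("r")), w("instrText"))
            instr.set(XML_SPACE, "preserve")
            instr.text = f" TOC {switches} "
            para.append(field_char("separate"))
        para.append(text_run(title))
        ET.SubElement(ET.SubElement(para, w("r")), w("tab"))
        para.append(text_run(str(page)))
        paragraphs.append(para)
    closing = ET.Element(w("p"))  # the field closes after the last entry
    closing.append(field_char("end"))
    paragraphs.append(closing)
    return paragraphs


toc = toc_paragraphs([("Introduction", 1), ("Methods", 3)])
toc_xml = "".join(ET.tostring(p, encoding="unicode") for p in toc)
print(toc_xml)
```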

Alignment and Typography

CJK body: justified + 2-character first-line indent. English: left-aligned. Table numbers: right-aligned. Headings: no indent.

4.6 Page Elements

Headers and Footers

```csharp
// Assumes the DocumentFormat.OpenXml, .Packaging and .Wordprocessing namespaces
// are imported and `mainPart` is the document's MainDocumentPart.

// 1. Create header part
var headerPart = mainPart.AddNewPart<HeaderPart>();
var headerId = mainPart.GetIdOfPart(headerPart);

headerPart.Header = new Header(
    new Paragraph(
        new ParagraphProperties(
            new ParagraphStyleId { Val = "Header" },
            new Justification { Val = JustificationValues.Center }
        ),
        new Run(new Text("Document Title"))
    )
);

// 2. Create footer part (with page numbers)
var footerPart = mainPart.AddNewPart<FooterPart>();
var footerId = mainPart.GetIdOfPart(footerPart);

var footerPara = new Paragraph(
    new ParagraphProperties(
        new Justification { Val = JustificationValues.Center }
    )
);
// PAGE field: Begin → FieldCode → Separate → Text → End
footerPara.Append(new Run(new FieldChar { FieldCharType = FieldCharValues.Begin }));
footerPara.Append(new Run(new FieldCode(" PAGE ")));
footerPara.Append(new Run(new FieldChar { FieldCharType = FieldCharValues.Separate }));
footerPara.Append(new Run(new Text("1")));  // Placeholder, updated on open
footerPara.Append(new Run(new FieldChar { FieldCharType = FieldCharValues.End }));
footerPara.Append(new Run(new Text(" / ") { Space = SpaceProcessingModeValues.Preserve }));
// NUMPAGES field (same structure)
footerPara.Append(new Run(new FieldChar { FieldCharType = FieldCharValues.Begin }));
footerPara.Append(new Run(new FieldCode(" NUMPAGES ")));
footerPara.Append(new Run(new FieldChar { FieldCharType = FieldCharValues.Separate }));
footerPara.Append(new Run(new Text("1")));
footerPara.Append(new Run(new FieldChar { FieldCharType = FieldCharValues.End }));
footerPart.Footer = new Footer(footerPara);

// 3. Reference both parts in SectionProperties (A4 page, 1-inch margins)
var sectionProps = new SectionProperties(
    new HeaderReference { Type = HeaderFooterValues.Default, Id = headerId },
    new FooterReference { Type = HeaderFooterValues.Default, Id = footerId },
    new PageSize { Width = 11906, Height = 16838 },
    new PageMargin { Top = 1440, Right = 1440, Bottom = 1440, Left = 1440, Header = 720, Footer = 720 }
);
```
Header/Footer Types:
| Type | HeaderFooterValues | Purpose |
| --- | --- | --- |
| Default | `.Default` | Odd pages (or all pages) |
| Even | `.Even` | Even pages |
| First | `.First` | First page |
Different first page (for cover): add `TitlePage()` to sectPr.
Different odd/even pages: add `<w:evenAndOddHeaders/>` in settings.xml.

Footnotes and Endnotes

Separator trap: FootnotesPart/EndnotesPart must include Id=-1 (Separator) and Id=0 (ContinuationSeparator) before any user notes. Missing these → Word fails to render.
```xml
<!-- Required in footnotes.xml / endnotes.xml before user notes -->
<!-- (endnotes.xml uses <w:endnote> with the same types and ids) -->
<w:footnote w:type="separator" w:id="-1">
  <w:p><w:r><w:separator/></w:r></w:p>
</w:footnote>
<w:footnote w:type="continuationSeparator" w:id="0">
  <w:p><w:r><w:continuationSeparator/></w:r></w:p>
</w:footnote>
<!-- User notes start from id="1" -->
```
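A quick sanity check for this trap can be written with Python's standard library. The sketch below only tests that both separator notes exist with the expected ids; it does not check their position relative to user notes, and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
REQUIRED = (("separator", "-1"), ("continuationSeparator", "0"))


def missing_separators(notes_xml: str, note_tag: str = "footnote") -> list[str]:
    """Return the required separator types absent from footnotes/endnotes XML.

    Pass note_tag="endnote" when checking endnotes.xml.
    """
    root = ET.fromstring(notes_xml)
    present = {
        (note.get(f"{{{W}}}type"), note.get(f"{{{W}}}id"))
        for note in root.iter(f"{{{W}}}{note_tag}")
    }
    return [kind for kind, note_id in REQUIRED if (kind, note_id) not in present]


# Sample part that has the separator but lacks the continuation separator.
sample = (
    f'<w:footnotes xmlns:w="{W}">'
    '<w:footnote w:type="separator" w:id="-1">'
    "<w:p><w:r><w:separator/></w:r></w:p></w:footnote>"
    '<w:footnote w:id="1"><w:p><w:r><w:t>First user note</w:t></w:r></w:p></w:footnote>'
    "</w:footnotes>"
)
print(missing_separators(sample))
```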

Lists

Requires `NumberingDefinitionsPart` with `AbstractNum` + `NumberingInstance`. Apply via `NumberingProperties` in the paragraph.
Multi-level: create an `AbstractNum` with multiple `Level`s. Formats: `Decimal`, `UpperLetter`, `LowerRoman`, `Bullet`, `ChineseCounting`.
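In raw XML these classes map to `w:abstractNum`, `w:num` and `w:numPr`. The sketch below builds a two-level numbering definition and the matching paragraph property with Python's standard library. The helper names, ids and level texts are illustrative assumptions, and indentation settings are omitted.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)


def w(tag: str) -> str:
    return f"{{{W}}}{tag}"


def val_child(parent: ET.Element, tag: str, value) -> ET.Element:
    """Append <w:tag w:val="value"/> to parent."""
    child = ET.SubElement(parent, w(tag))
    child.set(w("val"), str(value))
    return child


def numbering_root(levels, abstract_id: int = 0, num_id: int = 1) -> ET.Element:
    """levels: list of (numFmt, lvlText) pairs, one per list level."""
    root = ET.Element(w("numbering"))
    abstract = ET.SubElement(root, w("abstractNum"))
    abstract.set(w("abstractNumId"), str(abstract_id))
    val_child(abstract, "multiLevelType", "multilevel")
    for ilvl, (num_fmt, lvl_text) in enumerate(levels):
        lvl = ET.SubElement(abstract, w("lvl"))
        lvl.set(w("ilvl"), str(ilvl))
        val_child(lvl, "start", 1)
        val_child(lvl, "numFmt", num_fmt)
        val_child(lvl, "lvlText", lvl_text)
    # The w:num instance points at the abstract definition; paragraphs use its numId.
    num = ET.SubElement(root, w("num"))
    num.set(w("numId"), str(num_id))
    val_child(num, "abstractNumId", abstract_id)
    return root


def num_pr(ilvl: int, num_id: int = 1) -> ET.Element:
    """w:numPr for a paragraph's w:pPr (NumberingProperties in the SDK)."""
    props = ET.Element(w("numPr"))
    val_child(props, "ilvl", ilvl)
    val_child(props, "numId", num_id)
    return props


numbering = numbering_root([("decimal", "%1."), ("lowerRoman", "%1.%2.")])
print(ET.tostring(numbering, encoding="unicode"))
print(ET.tostring(num_pr(1), encoding="unicode"))
```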

Hyperlinks

Must use the `<w:hyperlink>` element, not plain text.
Requires a relationship first:
```csharp
// Register the external target, then reference it from the Hyperlink by relationship id
var relId = mainPart.AddHyperlinkRelationship(new Uri("https://example.com"), true).Id;
paragraph.Append(new Hyperlink(new Run(
    new RunProperties(new Color { Val = "0563C1" }, new Underline { Val = UnderlineValues.Single }),
    new Text("Click here")
)) { Id = relId });
```

Charts and Visualization

| Requirement | Preferred | Alternative |
| --- | --- | --- |
| Data charts | Word native | matplotlib PNG |
| Flowcharts | DrawingML Shapes | Table layout |
| Illustrations | Image generation | Image search |
Word Chart: Use `NumberLiteral` (no Excel), `DataPoint` for colors. See Example.
matplotlib: `dpi=300`, `axes.unicode_minus=False`. Font/labels must match document language.

4.7 Editing Operations (Python API)

Use `docx_lib.editing` for comments and track changes:
```python
from scripts.docx_lib.editing import (
    DocxContext,
    add_comment, reply_comment, resolve_comment, delete_comment,
    insert_paragraph, insert_text, propose_deletion,
    reject_insertion, restore_deletion, enable_track_changes
)

with DocxContext("input.docx", "output.docx") as ctx:
    # add_comment(ctx, para_text, comment, highlight=None)
    # - para_text: text to locate paragraph
    # - comment: comment content
    # - highlight: text to highlight (omit to highlight entire paragraph)
    add_comment(ctx, "M-SVI index", "Please define", highlight="M-SVI")
    insert_text(ctx, "The method", after="method", new_text=" and materials")
```
Complete guide: `references/EditingGuide.md`

4.8 XML Quick Reference

Text Formatting (rPr)

```xml
<w:r>
  <w:rPr>
    <w:rFonts w:ascii="Times New Roman" w:eastAsia="SimSun"/>
    <w:sz w:val="24"/>  <!-- 12pt = 24 half-points -->
    <w:b/><w:i/><w:u w:val="single"/>
    <w:color w:val="FF0000"/>
  </w:rPr>
  <w:t>text</w:t>
</w:r>
```
Font sizes (`w:sz`, in half-points): 21=10.5pt, 24=12pt, 28=14pt, 32=16pt, 44=22pt

Track Changes Structure

```xml
<!-- Insertion: <w:ins> wraps <w:r> -->
<w:ins w:id="1" w:author="..." w:date="...">
  <w:r><w:rPr>...</w:rPr><w:t>text</w:t></w:r>
</w:ins>

<!-- Deletion: <w:del> wraps <w:r> (same pattern as ins!) -->
<w:del w:id="2" w:author="..." w:date="...">
  <w:r><w:rPr>...</w:rPr><w:delText>text</w:delText></w:r>
</w:del>
```
Key: Both `<w:ins>` and `<w:del>` wrap `<w:r>`; they never go inside it. Use `<w:delText>` instead of `<w:t>` for deletions.

Schema Constraints

| Rule | Requirement |
| --- | --- |
| RSID values | 8-digit uppercase hex: `00A1B2C3` |
| Whitespace | `xml:space="preserve"` for leading/trailing spaces |
| Revision structure | `<w:ins>`/`<w:del>` wrap `<w:r>`, must have `w:id` attribute |

Complete examples: See `references/EditingGuide.md` for full working code.
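The three constraints can be captured in a small standard-library sketch. This is not the `docx_lib.editing` implementation: the function names are invented here, and the author and date values are placeholders.

```python
import re
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("w", W)

RSID_PATTERN = re.compile(r"[0-9A-F]{8}")


def w(tag: str) -> str:
    return f"{{{W}}}{tag}"


def is_valid_rsid(value: str) -> bool:
    """RSID values are exactly 8 uppercase hex digits."""
    return RSID_PATTERN.fullmatch(value) is not None


def tracked_run(kind: str, text: str, rev_id: int, author: str, date: str) -> ET.Element:
    """Wrap a run in w:ins or w:del; deletions carry w:delText instead of w:t."""
    if kind not in ("ins", "del"):
        raise ValueError("kind must be 'ins' or 'del'")
    wrapper = ET.Element(w(kind))
    wrapper.set(w("id"), str(rev_id))  # w:id is mandatory on revisions
    wrapper.set(w("author"), author)
    wrapper.set(w("date"), date)
    run = ET.SubElement(wrapper, w("r"))
    text_el = ET.SubElement(run, w("t" if kind == "ins" else "delText"))
    if text != text.strip():
        # Leading/trailing spaces are dropped unless xml:space="preserve" is set.
        text_el.set(XML_SPACE, "preserve")
    text_el.text = text
    return wrapper


# Placeholder author and timestamp, for illustration only.
deletion = tracked_run("del", " old text", 2, "Reviewer", "2024-01-01T00:00:00Z")
deletion_xml = ET.tostring(deletion, encoding="unicode")
print(deletion_xml)
print(is_valid_rsid("00A1B2C3"), is_valid_rsid("00a1b2c3"))
```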