STOP — Do NOT begin implementation until the approach (scratch vs template) is decided and data sources are confirmed.
STOP — Do NOT skip to validation until all document sections are implemented.
| Scenario | Approach | Library | Why |
|---|---|---|---|
| One-off report generation | From scratch | python-docx | Full programmatic control |
| Recurring reports with fixed layout | Template | docxtpl | Design layout in Word, fill with data |
| Bulk letter generation (mail merge) | Template | docxtpl | One template, many outputs |
| Complex formatting, custom styles | From scratch | python-docx | Direct access to document model |
| Non-technical users design template | Template | docxtpl | Users edit in Word, developers bind data |
| PDF output required | Either + conversion | libreoffice / docx2pdf | Post-processing step |
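
The PDF row treats conversion as a post-processing step. A minimal sketch of that step, assuming LibreOffice is installed and its executable is reachable as `soffice` (on some systems the name is `libreoffice`). The helper names `build_pdf_command` and `convert_to_pdf` are hypothetical and not part of any library.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdf_command(docx_path, out_dir, soffice="soffice"):
    """Return the LibreOffice command line that converts one DOCX to PDF."""
    return [
        soffice,
        "--headless",            # run without a GUI
        "--convert-to", "pdf",   # target format
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def convert_to_pdf(docx_path, out_dir, soffice="soffice", timeout=120):
    """Run the conversion and return the path where the PDF is expected."""
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice!r} not found on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if LibreOffice exits with an error
    subprocess.run(
        build_pdf_command(docx_path, out_dir, soffice),
        check=True,
        timeout=timeout,
    )
    return out_dir / (Path(docx_path).stem + ".pdf")
```

Keeping the command construction separate from the subprocess call makes it possible to inspect or log the exact command without LibreOffice being present.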
from docx import Document
from docx.shared import Inches, Pt, Cm, RGBColor
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.enum.table import WD_TABLE_ALIGNMENT
doc = Document()

from docx.enum.section import WD_ORIENT
from docx.oxml.ns import qn
from docx.oxml import OxmlElement
section = doc.sections[0]

from docx.shared import Cm, Pt
from docx.oxml.ns import nsdecls
from docx.oxml import parse_xml

Template file (template.docx) contains:
{{ company_name }}
Date: {{ report_date }}
Dear {{ recipient_name }},
{% for item in items %}
- {{ item.name }}: ${{ item.price }}
{% endfor %}
Total: ${{ total }}
{% if urgent %}
URGENT: This requires immediate attention.
{% endif %}

from docxtpl import DocxTemplate, InlineImage
from docx.shared import Mm

tpl = DocxTemplate('template.docx')

# Keys must match the placeholder names used in template.docx
context = {
    'company_name': 'Acme Corp',
    'report_date': '2025-03-15',
    'recipient_name': 'Alice Johnson',
    'items': [
        {'name': 'Widget A', 'price': '29.99'},
        {'name': 'Widget B', 'price': '49.99'},
    ],
    'total': '79.98',
    'urgent': True,
    'chart': InlineImage(tpl, 'chart.png', width=Mm(120)),
}
tpl.render(context)
tpl.save('output.docx')

from docxtpl import RichText
rt = RichText()
rt.add('Normal text ')
rt.add('bold text', bold=True)
rt.add(' and ')
rt.add('red text', color='FF0000')
rt.add(' with ')
rt.add('a link', url_id=tpl.build_url_id('https://example.com'))
context = {'formatted_text': rt}

Template table row with loop:
{%tr for row in table_data %}
{{ row.name }} | {{ row.value }} | {{ row.status }}
{%tr endfor %}

from docxtpl import DocxTemplate
import csv

template = DocxTemplate('letter_template.docx')
with open('recipients.csv') as f:
    reader = csv.DictReader(f)
    for i, row in enumerate(reader):
        context = {
            'name': row['name'],
            'address': row['address'],
            'amount': row['amount'],
            'due_date': row['due_date'],
        }
        template.render(context)
        template.save(f'letters/letter_{i:04d}_{row["name"]}.docx')
        template = DocxTemplate('letter_template.docx')  # Re-load for next iteration

from docx.enum.style import WD_STYLE_TYPE

Normal → Heading 1 → Heading 2 → ...
Normal → Body Text → List Paragraph
Normal → Table Normal → Table Grid

import jinja2
from docxtpl import DocxTemplate

def safe_generate_document(template_path, context, output_path):
    """Render a template and save it. Return True on success, False on failure."""
    try:
        tpl = DocxTemplate(template_path)
        tpl.render(context)
        tpl.save(output_path)
        return True
    except jinja2.UndefinedError as e:
        print(f"Missing template variable: {e}")
        return False
    except FileNotFoundError as e:
        print(f"Template not found: {e}")
        return False
    except Exception as e:
        # Catch-all so one bad document does not stop a batch run
        print(f"Document generation failed: {e}")
        return False

| Anti-Pattern | Why It Fails | What To Do Instead |
|---|---|---|
| Hardcoding font sizes instead of styles | Inconsistent formatting, hard to maintain | Define styles once, apply everywhere |
| Not handling missing template variables | Runtime crashes on incomplete data | Use |
| Huge tables without pagination | Unreadable output, broken layouts | Break tables across pages or summarize |
| Absolute image paths | Breaks portability across environments | Use relative paths or embed images |
| Not testing with different Word versions | Formatting breaks silently | Test in Word, LibreOffice, and Google Docs |
| Modifying XML directly when API exists | Fragile, version-dependent code | Use python-docx API methods first |
| All direct formatting, no styles | Impossible to maintain consistency | Create and apply named styles |
| Ignoring Unicode characters | Mojibake in generated documents | Test with accented characters, CJK, symbols |
| Not re-loading template in mail merge | Corrupted output after first render | Re-instantiate DocxTemplate per iteration |
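
The missing-variable row can also be handled before rendering starts. Jinja2's default behaviour is to render an undefined variable as an empty string, so incomplete data can slip into a document unnoticed. A minimal sketch of a pre-render check, assuming the placeholder names from the example template shown earlier; `REQUIRED_KEYS`, `missing_keys`, and `check_context` are hypothetical names chosen for illustration.

```python
# Placeholder names taken from the example template; adjust to the real template.
REQUIRED_KEYS = {"company_name", "report_date", "recipient_name", "items", "total"}


def missing_keys(context, required=REQUIRED_KEYS):
    """Return the required keys that are absent or None, sorted by name."""
    return sorted(key for key in required if context.get(key) is None)


def check_context(context, required=REQUIRED_KEYS):
    """Raise ValueError if any required key is missing; otherwise return context."""
    missing = missing_keys(context, required)
    if missing:
        raise ValueError("Missing template variables: " + ", ".join(missing))
    return context
```

Calling `check_context(context)` just before `tpl.render(context)` turns a silent gap in the output into an explicit error that names the missing fields.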
| Skill | How It Connects |
|---|---|
| DOCX-to-PDF conversion, or choosing PDF generation directly |
| Data from Excel feeds into document generation contexts |
| Generated documents attach to professional emails |
| Research content formatted into whitepapers and reports |
| Output file naming and directory structure conventions |
| Document generation pipelines in CI/CD or server environments |
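
On output file naming: the mail-merge loop above builds file names directly from `row["name"]`, which can contain spaces, slashes, or other characters that are unsafe in paths. A minimal sketch of one way to sanitize such names, using only the standard library; `safe_filename` and `letter_path` are hypothetical helpers, and the set of replaced characters is an assumption to adapt to the target file system.

```python
import re
import unicodedata


def safe_filename(name, fallback="recipient"):
    """Return a file-system-friendly version of name, keeping non-ASCII letters."""
    name = unicodedata.normalize("NFC", name).strip()
    # Replace path separators, reserved characters, and control characters
    name = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "_", name)
    # Collapse whitespace runs into a single underscore
    name = re.sub(r"\s+", "_", name)
    # Drop leading/trailing dots and underscores (avoids names like "..")
    name = name.strip("._")
    return name or fallback


def letter_path(index, name, directory="letters"):
    """Build an output path in the same pattern as the mail-merge example."""
    return f"{directory}/letter_{index:04d}_{safe_filename(name)}.docx"
```

The zero-padded index keeps names unique even when two recipients sanitize to the same string.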