DOCX Skill
When to use
- Read or review DOCX content where layout matters (tables, diagrams, pagination).
- Create or edit DOCX files with professional formatting.
- Validate visual layout before delivery.
Workflow
- Prefer visual review (layout, tables, diagrams).
  - If `soffice` and `pdftoppm` are available, convert DOCX -> PDF -> PNGs.
  - Or use `scripts/render_docx.py` (requires `pdf2image` and Poppler).
  - If these tools are missing, install them or ask the user to review rendered pages locally.
- Use `python-docx` for edits and structured creation (headings, styles, tables, lists).
- After each meaningful change, re-render and inspect the pages.
- If visual review is not possible, extract text with `python-docx` as a fallback and call out layout risk.
- Keep intermediate outputs organized and clean up after final approval.
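
The choice between review paths can be sketched as a small probe. This is only an illustration, not part of the skill itself: the function name `available_review_paths` and the labels `"cli"`, `"helper"`, and `"text-fallback"` are hypothetical, and treating `pdftoppm` on the PATH as evidence that Poppler is installed is an assumption.

```python
import importlib.util
import shutil
from pathlib import Path


def available_review_paths(has_tool, has_module, helper_present):
    """List the usable review paths in order of preference.

    has_tool(name) and has_module(name) are callables returning bool,
    so the decision logic can be exercised without touching the system.
    """
    paths = []
    # Direct CLI route: DOCX -> PDF -> PNGs.
    if has_tool("soffice") and has_tool("pdftoppm"):
        paths.append("cli")
    # Bundled helper route (assumption: pdftoppm on PATH implies Poppler).
    if helper_present and has_module("pdf2image") and has_tool("pdftoppm"):
        paths.append("helper")
    # Text extraction is always possible in principle, but loses layout.
    paths.append("text-fallback")
    return paths


def probe_environment():
    """Run the same decision against the real environment."""
    return available_review_paths(
        has_tool=lambda name: shutil.which(name) is not None,
        has_module=lambda name: importlib.util.find_spec(name) is not None,
        helper_present=Path("scripts/render_docx.py").is_file(),
    )
```

If `probe_environment()` returns only `["text-fallback"]`, that is the point at which the workflow says to install the tools or ask the user to review rendered pages locally.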
Temp and output conventions
- Use `tmp/docs/` for intermediate files; delete when done.
- Write final artifacts under `output/doc/` when working in this repo.
- Keep filenames stable and descriptive.
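
A minimal sketch of these conventions, assuming paths are resolved relative to the repository root. The helper names `scratch_path`, `final_path`, and `cleanup_scratch` are hypothetical and exist only for this illustration.

```python
import shutil
from pathlib import Path

TMP_DIR = Path("tmp/docs")    # intermediate files, deleted when done
OUT_DIR = Path("output/doc")  # final artifacts


def scratch_path(name, root=Path(".")):
    """Return a path for an intermediate file, creating tmp/docs/ if needed."""
    directory = root / TMP_DIR
    directory.mkdir(parents=True, exist_ok=True)
    return directory / name


def final_path(name, root=Path(".")):
    """Return a path for a final artifact, creating output/doc/ if needed."""
    directory = root / OUT_DIR
    directory.mkdir(parents=True, exist_ok=True)
    return directory / name


def cleanup_scratch(root=Path(".")):
    """Remove tmp/docs/ and everything in it; final artifacts are untouched."""
    shutil.rmtree(root / TMP_DIR, ignore_errors=True)
```

Passing a stable, descriptive `name` (for example `quarterly_report.docx` rather than a timestamped one) keeps reruns from piling up near-duplicate files.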
Dependencies (install if missing)
Prefer `uv` for dependency management.

Python packages:

    uv pip install python-docx pdf2image

If `uv` is unavailable:

    python3 -m pip install python-docx pdf2image

System tools (for rendering):

macOS (Homebrew)

    brew install libreoffice poppler

Ubuntu/Debian

    sudo apt-get install -y libreoffice poppler-utils

If installation isn't possible in this environment, tell the user which dependency is missing and how to install it locally.
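
One way to work out which dependency to report is sketched below. The function names and the `INSTALL_HINTS` table are hypothetical; the hint strings simply repeat the commands listed above. The only outside fact relied on is that the `python-docx` package is imported as `docx`.

```python
import importlib.util
import shutil

# Package name -> importable module name.
PY_MODULES = {"python-docx": "docx", "pdf2image": "pdf2image"}
SYSTEM_TOOLS = ("soffice", "pdftoppm")

# Install commands copied from the section above.
INSTALL_HINTS = {
    "python-docx": "uv pip install python-docx",
    "pdf2image": "uv pip install pdf2image",
    "soffice": "brew install libreoffice (or: sudo apt-get install -y libreoffice)",
    "pdftoppm": "brew install poppler (or: sudo apt-get install -y poppler-utils)",
}


def _module_installed(name):
    return importlib.util.find_spec(name) is not None


def _tool_on_path(name):
    return shutil.which(name) is not None


def missing_dependencies(has_module=_module_installed, has_tool=_tool_on_path):
    """Return the names of missing Python packages and system tools."""
    missing = [pkg for pkg, mod in PY_MODULES.items() if not has_module(mod)]
    missing += [tool for tool in SYSTEM_TOOLS if not has_tool(tool)]
    return missing


def install_report(missing):
    """Format one line per missing dependency, with its install command."""
    return "\n".join(f"missing {name}: {INSTALL_HINTS[name]}" for name in missing)
```

Calling `print(install_report(missing_dependencies()))` yields a message that can be passed to the user as-is when installation is not possible in the current environment.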
Environment
No required environment variables.
Rendering commands
DOCX -> PDF:

    soffice -env:UserInstallation=file:///tmp/lo_profile_$$ --headless --convert-to pdf --outdir $OUTDIR $INPUT_DOCX

PDF -> PNGs:

    pdftoppm -png $OUTDIR/$BASENAME.pdf $OUTDIR/$BASENAME

Bundled helper:

    python3 scripts/render_docx.py /path/to/file.docx --output_dir /tmp/docx_pages
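
For scripted use, the two shell commands above can be wrapped roughly as follows. This is a sketch, not the contents of `scripts/render_docx.py` (which is not shown here); `build_render_commands` and `render` are hypothetical names. Passing argument lists to `subprocess.run` avoids shell quoting problems when paths contain spaces, and the process ID stands in for the shell's `$$`.

```python
import os
import subprocess
from pathlib import Path


def build_render_commands(input_docx, outdir, profile_id=None):
    """Build the soffice and pdftoppm argument lists shown above."""
    input_docx = Path(input_docx)
    outdir = Path(outdir)
    # Mirrors $$ in the shell command: a per-process LibreOffice profile.
    pid = os.getpid() if profile_id is None else profile_id
    pdf = outdir / f"{input_docx.stem}.pdf"
    to_pdf = [
        "soffice",
        f"-env:UserInstallation=file:///tmp/lo_profile_{pid}",
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(outdir),
        str(input_docx),
    ]
    to_png = ["pdftoppm", "-png", str(pdf), str(outdir / input_docx.stem)]
    return to_pdf, to_png


def render(input_docx, outdir):
    """Run both steps and return the PNG pages found in outdir."""
    Path(outdir).mkdir(parents=True, exist_ok=True)
    for cmd in build_render_commands(input_docx, outdir):
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return sorted(Path(outdir).glob(Path(input_docx).stem + "-*.png"))
```

`render` requires `soffice` and `pdftoppm` on the PATH; `build_render_commands` can be inspected without either.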
Quality expectations
- Deliver a client-ready document: consistent typography, spacing, margins, and clear hierarchy.
- Avoid formatting defects: clipped/overlapping text, broken tables, unreadable characters, or default-template styling.
- Charts, tables, and visuals must be legible in rendered pages with correct alignment.
- Use ASCII hyphens only. Avoid U+2011 (non-breaking hyphen) and other Unicode dashes.
- Citations and references must be human-readable; never leave tool tokens or placeholder strings.
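
The hyphen rule lends itself to an automated check. The sketch below is one possible approach, assuming that "other Unicode dashes" means characters in Unicode general category `Pd` (dash punctuation) other than the ASCII hyphen-minus; the function names are hypothetical.

```python
import unicodedata


def find_unicode_dashes(text):
    """Return (index, code point, name) for each non-ASCII dash in text."""
    hits = []
    for index, ch in enumerate(text):
        # Category "Pd" covers U+2010..U+2015 and similar; "-" itself is allowed.
        if ch != "-" and unicodedata.category(ch) == "Pd":
            hits.append((index, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return hits


def to_ascii_hyphens(text):
    """Replace every non-ASCII dash with an ASCII hyphen."""
    return "".join(
        "-" if ch != "-" and unicodedata.category(ch) == "Pd" else ch
        for ch in text
    )
```

With `python-docx`, the same check could be applied to each `paragraph.text` of a loaded `Document`. Text inside tables, headers, and footers would need to be visited separately.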
Final checks
- Re-render and inspect every page at 100% zoom before final delivery.
- Fix any spacing, alignment, or pagination issues and repeat the render loop.
- Confirm there are no leftovers (temp files, duplicate renders) unless the user asks to keep them.