pptx

PPTX creation, editing, and analysis

Overview

A user may ask you to create, edit, or analyze the contents of a .pptx file. A .pptx file is essentially a ZIP archive containing XML files and other resources that you can read or edit. You have different tools and workflows available for different tasks.

Reading and analyzing content

Text extraction

If you just need to read the text contents of a presentation, convert the document to markdown:
bash
# Convert document to markdown
python -m markitdown path-to-file.pptx

Raw XML access

You need raw XML access for: comments, speaker notes, slide layouts, animations, design elements, and complex formatting. For any of these features, you'll need to unpack a presentation and read its raw XML contents.

Unpacking a file

bash
python ooxml/scripts/unpack.py <office_file> <output_dir>
Note: The unpack.py script is located at skills/pptx/ooxml/scripts/unpack.py relative to the project root. If the script doesn't exist at this path, run find . -name "unpack.py" to locate it.

Key file structures

  • ppt/presentation.xml - Main presentation metadata and slide references
  • ppt/slides/slide{N}.xml - Individual slide contents (slide1.xml, slide2.xml, etc.)
  • ppt/notesSlides/notesSlide{N}.xml - Speaker notes for each slide
  • ppt/comments/modernComment_*.xml - Comments for specific slides
  • ppt/slideLayouts/ - Layout templates for slides
  • ppt/slideMasters/ - Master slide templates
  • ppt/theme/ - Theme and styling information
  • ppt/media/ - Images and other media files
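Because a .pptx file is a ZIP archive, the parts listed above can also be enumerated without unpacking to disk. The following is a minimal sketch using only Python's standard zipfile module. The helper name slide_parts and the in-memory demo archive are illustrative assumptions and are not part of the skill's scripts.

```python
import io
import re
import zipfile

SLIDE_RE = re.compile(r"^ppt/slides/slide(\d+)\.xml$")

def slide_parts(pptx):
    """Return slide XML part names sorted by the number in the file name.

    `pptx` may be a path or a file-like object. Hypothetical helper.
    """
    with zipfile.ZipFile(pptx) as zf:
        hits = [(int(m.group(1)), name)
                for name in zf.namelist()
                if (m := SLIDE_RE.match(name))]
    return [name for _, name in sorted(hits)]

# Demo: a tiny stand-in archive built in memory (not a valid presentation).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    for part in ("ppt/presentation.xml", "ppt/slides/slide10.xml",
                 "ppt/slides/slide2.xml", "ppt/media/image1.png"):
        zf.writestr(part, "")
demo.seek(0)
print(slide_parts(demo))
```

Sorting is numeric so that slide10.xml follows slide2.xml. File numbering does not necessarily match the order in which slides are shown; that order is recorded in ppt/presentation.xml.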

Typography and color extraction

When given an example design to emulate, always analyze the presentation's typography and colors first using the methods below:
  1. Read theme file: Check ppt/theme/theme1.xml for colors (<a:clrScheme>) and fonts (<a:fontScheme>)
  2. Sample slide content: Examine ppt/slides/slide1.xml for actual font usage (<a:rPr>) and colors
  3. Search for patterns: Use grep to find color (<a:solidFill>, <a:srgbClr>) and font references across all XML files
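The theme-file step can be sketched in Python as follows. This is an illustration under assumptions, not one of the skill's scripts: the function names and the SAMPLE theme fragment are made up for the demo, and the standard-library parser is used for brevity (for untrusted files, the defusedxml package named among the dependencies is the safer choice).

```python
import xml.etree.ElementTree as ET

NS = {"a": "http://schemas.openxmlformats.org/drawingml/2006/main"}

def theme_colors(root):
    """Map each <a:clrScheme> slot (dk1, lt1, accent1, ...) to a hex value."""
    colors = {}
    scheme = root.find(".//a:clrScheme", NS)
    for slot in (scheme if scheme is not None else []):
        name = slot.tag.split("}")[-1]  # strip the "{namespace}" prefix
        for child in slot:
            # <a:srgbClr val="..."> holds the hex directly;
            # <a:sysClr lastClr="..."> holds the last resolved system color.
            colors[name] = child.get("lastClr") or child.get("val")
    return colors

def theme_fonts(root):
    """Return the Latin typeface of the major (heading) and minor (body) fonts."""
    fonts = {}
    for kind in ("majorFont", "minorFont"):
        latin = root.find(f".//a:fontScheme/a:{kind}/a:latin", NS)
        if latin is not None:
            fonts[kind] = latin.get("typeface")
    return fonts

# Made-up fragment standing in for ppt/theme/theme1.xml.
SAMPLE = """<a:theme xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
  <a:themeElements>
    <a:clrScheme name="Demo">
      <a:dk1><a:sysClr val="windowText" lastClr="000000"/></a:dk1>
      <a:accent1><a:srgbClr val="277884"/></a:accent1>
    </a:clrScheme>
    <a:fontScheme name="Demo">
      <a:majorFont><a:latin typeface="Georgia"/></a:majorFont>
      <a:minorFont><a:latin typeface="Arial"/></a:minorFont>
    </a:fontScheme>
  </a:themeElements>
</a:theme>"""

root = ET.fromstring(SAMPLE)
print(theme_colors(root))
print(theme_fonts(root))
```

For a real deck, replace ET.fromstring(SAMPLE) with ET.parse on the unpacked ppt/theme/theme1.xml and call getroot() on the result.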

Creating a new PowerPoint presentation without a template

When creating a new PowerPoint presentation from scratch, use the html2pptx workflow to convert HTML slides to PowerPoint with accurate positioning.

Design Principles

CRITICAL: Before creating any presentation, analyze the content and choose appropriate design elements:
  1. Consider the subject matter: What is this presentation about? What tone, industry, or mood does it suggest?
  2. Check for branding: If the user mentions a company/organization, consider their brand colors and identity
  3. Match palette to content: Select colors that reflect the subject
  4. State your approach: Explain your design choices before writing code
Requirements:
  • ✅ State your content-informed design approach BEFORE writing code
  • ✅ Use web-safe fonts only: Arial, Helvetica, Times New Roman, Georgia, Courier New, Verdana, Tahoma, Trebuchet MS, Impact
  • ✅ Create clear visual hierarchy through size, weight, and color
  • ✅ Ensure readability: strong contrast, appropriately sized text, clean alignment
  • ✅ Be consistent: repeat patterns, spacing, and visual language across slides

Color Palette Selection

Choosing colors creatively:
  • Think beyond defaults: What colors genuinely match this specific topic? Avoid autopilot choices.
  • Consider multiple angles: Topic, industry, mood, energy level, target audience, brand identity (if mentioned)
  • Be adventurous: Try unexpected combinations - a healthcare presentation doesn't have to be green, finance doesn't have to be navy
  • Build your palette: Pick 3-5 colors that work together (dominant colors + supporting tones + accent)
  • Ensure contrast: Text must be clearly readable on backgrounds
Example color palettes (use these to spark creativity - choose one, adapt it, or create your own):
  1. Classic Blue: Deep navy (#1C2833), slate gray (#2E4053), silver (#AAB7B8), off-white (#F4F6F6)
  2. Teal & Coral: Teal (#5EA8A7), deep teal (#277884), coral (#FE4447), white (#FFFFFF)
  3. Bold Red: Red (#C0392B), bright red (#E74C3C), orange (#F39C12), yellow (#F1C40F), green (#2ECC71)
  4. Warm Blush: Mauve (#A49393), blush (#EED6D3), rose (#E8B4B8), cream (#FAF7F2)
  5. Burgundy Luxury: Burgundy (#5D1D2E), crimson (#951233), rust (#C15937), gold (#997929)
  6. Deep Purple & Emerald: Purple (#B165FB), dark blue (#181B24), emerald (#40695B), white (#FFFFFF)
  7. Cream & Forest Green: Cream (#FFE1C7), forest green (#40695B), white (#FCFCFC)
  8. Pink & Purple: Pink (#F8275B), coral (#FF574A), rose (#FF737D), purple (#3D2F68)
  9. Lime & Plum: Lime (#C5DE82), plum (#7C3A5F), coral (#FD8C6E), blue-gray (#98ACB5)
  10. Black & Gold: Gold (#BF9A4A), black (#000000), cream (#F4F6F6)
  11. Sage & Terracotta: Sage (#87A96B), terracotta (#E07A5F), cream (#F4F1DE), charcoal (#2C2C2C)
  12. Charcoal & Red: Charcoal (#292929), red (#E33737), light gray (#CCCBCB)
  13. Vibrant Orange: Orange (#F96D00), light gray (#F2F2F2), charcoal (#222831)
  14. Forest Green: Black (#191A19), green (#4E9F3D), dark green (#1E5128), white (#FFFFFF)
  15. Retro Rainbow: Purple (#722880), pink (#D72D51), orange (#EB5C18), amber (#F08800), gold (#DEB600)
  16. Vintage Earthy: Mustard (#E3B448), sage (#CBD18F), forest green (#3A6B35), cream (#F4F1DE)
  17. Coastal Rose: Old rose (#AD7670), beaver (#B49886), eggshell (#F3ECDC), ash gray (#BFD5BE)
  18. Orange & Turquoise: Light orange (#FC993E), grayish turquoise (#667C6F), white (#FCFCFC)
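The "ensure contrast" guideline can be checked numerically. The sketch below uses the WCAG 2.x contrast-ratio formula, which is a technique brought in for illustration and is not something this document prescribes; the function names are hypothetical. WCAG suggests at least 4.5:1 for normal text and 3:1 for large text.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def luminance(hex_color):
    """Relative luminance of a color given as 'RRGGBB' or '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)

def contrast_ratio(fg, bg):
    """Contrast ratio between two colors, from 1.0 (identical) up to 21.0."""
    hi, lo = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)

# Two pairings drawn from the Classic Blue palette above.
for fg, bg in [("#F4F6F6", "#1C2833"), ("#AAB7B8", "#F4F6F6")]:
    print(fg, "on", bg, round(contrast_ratio(fg, bg), 2))
```

In this palette, off-white on deep navy clears the 4.5:1 threshold comfortably, while silver on off-white falls below 3:1 and is better kept for decoration than for text.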

Visual Details Options

Geometric Patterns:
  • Diagonal section dividers instead of horizontal
  • Asymmetric column widths (30/70, 40/60, 25/75)
  • Rotated text headers at 90° or 270°
  • Circular/hexagonal frames for images
  • Triangular accent shapes in corners
  • Overlapping shapes for depth
Border & Frame Treatments:
  • Thick single-color borders (10-20pt) on one side only
  • Double-line borders with contrasting colors
  • Corner brackets instead of full frames
  • L-shaped borders (top+left or bottom+right)
  • Underline accents beneath headers (3-5pt thick)
Typography Treatments:
  • Extreme size contrast (72pt headlines vs 11pt body)
  • All-caps headers with wide letter spacing
  • Numbered sections in oversized display type
  • Monospace (Courier New) for data/stats/technical content
  • Condensed fonts (Arial Narrow) for dense information
  • Outlined text for emphasis
Chart & Data Styling:
  • Monochrome charts with single accent color for key data
  • Horizontal bar charts instead of vertical
  • Dot plots instead of bar charts
  • Minimal gridlines or none at all
  • Data labels directly on elements (no legends)
  • Oversized numbers for key metrics
Layout Innovations:
  • Full-bleed images with text overlays
  • Sidebar column (20-30% width) for navigation/context
  • Modular grid systems (3×3, 4×4 blocks)
  • Z-pattern or F-pattern content flow
  • Floating text boxes over colored shapes
  • Magazine-style multi-column layouts
Background Treatments:
  • Solid color blocks occupying 40-60% of slide
  • Gradient fills (vertical or diagonal only)
  • Split backgrounds (two colors, diagonal or vertical)
  • Edge-to-edge color bands
  • Negative space as a design element

Layout Tips

When creating slides with charts or tables:
  • Two-column layout (PREFERRED): Use a header spanning the full width, then two columns below - text/bullets in one column and the featured content in the other. This provides better balance and makes charts/tables more readable. Use flexbox with unequal column widths (e.g., 40%/60% split) to optimize space for each content type.
  • Full-slide layout: Let the featured content (chart/table) take up the entire slide for maximum impact and readability
  • NEVER vertically stack: Do not place charts/tables below text in a single column - this causes poor readability and layout issues
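As a rough illustration of the preferred two-column layout, the sketch below generates the HTML for one slide from Python. It assumes the 720pt × 405pt 16:9 dimensions and the class="placeholder" convention mentioned in this document; the function name, margins, and inline styles are invented for the example, and the exact markup rules that html2pptx enforces are defined in html2pptx.md, not here.

```python
import html

def two_column_slide(title, bullets, text_pct=40):
    """Return HTML for one slide: full-width header, then text and placeholder columns."""
    items = "\n".join(f"        <li>{html.escape(b)}</li>" for b in bullets)
    return f"""<!DOCTYPE html>
<html>
<body style="width: 720pt; height: 405pt; margin: 0; display: flex; flex-direction: column; font-family: Arial;">
  <h1 style="margin: 24pt 36pt 12pt;">{html.escape(title)}</h1>
  <div style="display: flex; flex: 1; gap: 24pt; padding: 0 36pt 24pt;">
    <div style="flex: 0 0 {text_pct}%;">
      <ul>
{items}
      </ul>
    </div>
    <div class="placeholder" style="flex: 1; background: #CCCCCC;"></div>
  </div>
</body>
</html>
"""

markup = two_column_slide("Q3 Results", ["Revenue grew", "Costs held flat"])
print(markup)
```

The text column gets a fixed share (40% by default) and the placeholder takes the remaining width, which leaves the larger area for the chart or table.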

Workflow

  1. MANDATORY - READ ENTIRE FILE: Read html2pptx.md completely from start to finish. NEVER set any range limits when reading this file. Read the full file content for detailed syntax, critical formatting rules, and best practices before proceeding with presentation creation.
  2. Create an HTML file for each slide with proper dimensions (e.g., 720pt × 405pt for 16:9)
    • Use <p>, <h1>-<h6>, <ul>, <ol> for all text content
    • Use class="placeholder" for areas where charts/tables will be added (render with gray background for visibility)
    • CRITICAL: Rasterize gradients and icons as PNG images FIRST using Sharp, then reference in HTML
    • LAYOUT: For slides with charts/tables/images, use either full-slide layout or two-column layout for better readability
  3. Create and run a JavaScript file using the html2pptx.js library to convert HTML slides to PowerPoint and save the presentation
    • Use the html2pptx() function to process each HTML file
    • Add charts and tables to placeholder areas using PptxGenJS API
    • Save the presentation using pptx.writeFile()
  4. Visual validation: Generate thumbnails and inspect for layout issues
    • Create thumbnail grid: python scripts/thumbnail.py output.pptx workspace/thumbnails --cols 4
    • Read and carefully examine the thumbnail image for:
      • Text cutoff: Text being cut off by header bars, shapes, or slide edges
      • Text overlap: Text overlapping with other text or shapes
      • Positioning issues: Content too close to slide boundaries or other elements
      • Contrast issues: Text colors not readable against background colors
    • If issues are found, fix the HTML and regenerate the PPTX and thumbnails
  5. Save the presentation using pptx.writeFile('presentation.pptx')

Creating a new PowerPoint presentation with a template

This workflow is for creating presentations from templates that contain placeholder text to be replaced with real content.

Workflow

  1. MANDATORY - READ ENTIRE FILE: Read ooxml.md completely first for XML reference.
  2. Copy the template to a working file. Create a working copy of the template to modify:
    bash
    cp template.pptx working.pptx
  3. Create a thumbnail grid of the template to see all available slides
    bash
    python scripts/thumbnail.py template.pptx workspace/thumbnails --cols 4
  4. Rearrange slides based on content requirements
    bash
    python scripts/rearrange.py template.pptx working.pptx 0,5,12,12,23
    • Specify 0-indexed slide numbers in desired order
    • Slides can be duplicated (e.g., 12,12 uses slide 12 twice)
    • Slides whose numbers are omitted are left out entirely
  5. Extract text inventory from the working presentation
    bash
    python scripts/inventory.py working.pptx text-inventory.json
    • Read text-inventory.json: Read the entire text-inventory.json file to understand all shapes and their properties. NEVER set any range limits when reading this file.
    • The inventory JSON structure:
      json
        {
          "slide-0": {
            "shape-0": {
              "placeholder_type": "TITLE",  // or null for non-placeholders
              "left": 1.5,                  // position in inches
              "top": 2.0,
              "width": 7.5,
              "height": 1.2,
              "paragraphs": [
                {
                  "text": "Paragraph text",
                  // Optional properties (only included when non-default):
                  "bullet": true,           // explicit bullet detected
                  "level": 0,               // only included when bullet is true
                  "alignment": "CENTER",    // CENTER, RIGHT (not LEFT)
                  "space_before": 10.0,     // space before paragraph in points
                  "space_after": 6.0,       // space after paragraph in points
                  "line_spacing": 22.4,     // line spacing in points
                  "font_name": "Arial",     // from first run
                  "font_size": 14.0,        // in points
                  "bold": true,
                  "italic": false,
                  "underline": false,
                  "color": "FF0000"         // RGB color
                }
              ]
            }
          }
        }
    • Key features:
      • Slides: Named as "slide-0", "slide-1", etc.
      • Shapes: Ordered by visual position (top-to-bottom, left-to-right) as "shape-0", "shape-1", etc.
      • Placeholder types: TITLE, CENTER_TITLE, SUBTITLE, BODY, OBJECT, or null
      • Default font size: default_font_size in points extracted from layout placeholders (when available)
      • Slide numbers are filtered: Shapes with SLIDE_NUMBER placeholder type are automatically excluded from inventory
      • Bullets: When bullet: true, level is always included (even if 0)
      • Spacing: space_before, space_after, and line_spacing in points (only included when set)
      • Colors: color for RGB (e.g., "FF0000"), theme_color for theme colors (e.g., "DARK_1")
      • Properties: Only non-default values are included in the output
  6. Generate replacement text and save the data to a JSON file. Based on the text inventory from the previous step:
    • CRITICAL: First verify which shapes exist in the inventory - only reference shapes that are actually present
    • VALIDATION: The replace.py script will validate that all shapes in your replacement JSON exist in the inventory
      • If you reference a non-existent shape, you'll get an error showing available shapes
      • If you reference a non-existent slide, you'll get an error indicating the slide doesn't exist
      • All validation errors are shown at once before the script exits
    • IMPORTANT: The replace.py script uses inventory.py internally to identify ALL text shapes
    • AUTOMATIC CLEARING: ALL text shapes from the inventory will be cleared unless you provide "paragraphs" for them
    • Add a "paragraphs" field to shapes that need content (not "replacement_paragraphs")
    • Shapes without "paragraphs" in the replacement JSON will have their text cleared automatically
    • Paragraphs with bullets will be automatically left aligned. Don't set the alignment property when "bullet": true
    • Generate appropriate replacement content for placeholder text
    • Use shape size to determine appropriate content length
    • CRITICAL: Include paragraph properties from the original inventory - don't just provide text
    • IMPORTANT: When bullet: true, do NOT include bullet symbols (•, -, *) in text - they're added automatically
    • ESSENTIAL FORMATTING RULES:
      • Headers/titles should typically have "bold": true
      • List items should have "bullet": true, "level": 0 (level is required when bullet is true)
      • Preserve any alignment properties (e.g., "alignment": "CENTER" for centered text)
      • Include font properties when different from default (e.g., "font_size": 14.0, "font_name": "Lora")
      • Colors: Use "color": "FF0000" for RGB or "theme_color": "DARK_1" for theme colors
      • The replacement script expects properly formatted paragraphs, not just text strings
      • Overlapping shapes: Prefer shapes with larger default_font_size or more appropriate placeholder_type
    • Save the updated inventory with replacements to replacement-text.json
    • WARNING: Different template layouts have different shape counts - always check the actual inventory before creating replacements
    Example paragraphs field showing proper formatting:
    json
    "paragraphs": [
      {
        "text": "New presentation title text",
        "alignment": "CENTER",
        "bold": true
      },
      {
        "text": "Section Header",
        "bold": true
      },
      {
        "text": "First bullet point without bullet symbol",
        "bullet": true,
        "level": 0
      },
      {
        "text": "Red colored text",
        "color": "FF0000"
      },
      {
        "text": "Theme colored text",
        "theme_color": "DARK_1"
      },
      {
        "text": "Regular paragraph text without special formatting"
      }
    ]
    Shapes not listed in the replacement JSON are automatically cleared:
    json
    {
      "slide-0": {
        "shape-0": {
          "paragraphs": [...] // This shape gets new text
        }
        // shape-1 and shape-2 from inventory will be cleared automatically
      }
    }
    Common formatting patterns for presentations:
    • Title slides: Bold text, sometimes centered
    • Section headers within slides: Bold text
    • Bullet lists: Each item needs "bullet": true, "level": 0
    • Body text: Usually no special properties needed
    • Quotes: May have special alignment or font properties
  7. Apply replacements using the replace.py script
    bash
    python scripts/replace.py working.pptx replacement-text.json output.pptx
    The script will:
    • First extract the inventory of ALL text shapes using functions from inventory.py
    • Validate that all shapes in the replacement JSON exist in the inventory
    • Clear text from ALL shapes identified in the inventory
    • Apply new text only to shapes with "paragraphs" defined in the replacement JSON
    • Preserve formatting by applying paragraph properties from the JSON
    • Handle bullets, alignment, font properties, and colors automatically
    • Save the updated presentation
    Example validation errors:
    ERROR: Invalid shapes in replacement JSON:
      - Shape 'shape-99' not found on 'slide-0'. Available shapes: shape-0, shape-1, shape-4
      - Slide 'slide-999' not found in inventory
    ERROR: Replacement text made overflow worse in these shapes:
      - slide-0/shape-2: overflow worsened by 1.25" (was 0.00", now 1.25")
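Before running the rearrange step, it can help to sanity-check the slide sequence. The following is a small sketch, assuming the template's slide count is already known (for example from the thumbnail grid). Both function names are hypothetical, and this document does not say how rearrange.py itself reacts to out-of-range numbers.

```python
def parse_sequence(spec, slide_count):
    """Parse a spec such as '0,5,12,12,23' into 0-indexed slide numbers, checking range."""
    order = [int(tok) for tok in spec.split(",") if tok.strip()]
    bad = [n for n in order if not 0 <= n < slide_count]
    if bad:
        raise ValueError(f"slide numbers out of range 0..{slide_count - 1}: {bad}")
    return order

def summarize(order, slide_count):
    """Report which template slides are reused and which are left out."""
    used = set(order)
    return {
        "duplicated": sorted({n for n in order if order.count(n) > 1}),
        "skipped": [n for n in range(slide_count) if n not in used],
    }

order = parse_sequence("0,5,12,12,23", slide_count=24)
print(order)
print(summarize(order, slide_count=24))
```

The summary makes it easy to confirm that a duplicate such as 12,12 is intentional and that no needed slide was dropped by accident.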
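The replacement rules in this workflow (keep paragraph properties from the inventory, drop bullet symbols, omit alignment on bulleted paragraphs, only reference shapes that exist) can be sketched as a small Python helper. Everything here is an assumption made for illustration: the function names, the (slide, shape) keyed input, and the demo inventory are hypothetical, and replace.py performs its own validation regardless.

```python
import json

BULLET_MARKS = "•-* \t"  # heuristic: also strips a leading minus sign

def clean_paragraph(par):
    """Copy a paragraph dict and apply the bullet rules from the workflow."""
    out = dict(par)
    if out.get("bullet"):
        out["text"] = out["text"].lstrip(BULLET_MARKS)  # symbols are added automatically
        out.setdefault("level", 0)                      # level is required with bullet
        out.pop("alignment", None)                      # bulleted text is left aligned
    return out

def missing_targets(inventory, keys):
    """List the (slide, shape) pairs in `keys` that the inventory lacks."""
    problems = []
    for slide, shape in keys:
        if slide not in inventory:
            problems.append(slide)
        elif shape not in inventory[slide]:
            problems.append(f"{slide}/{shape}")
    return problems

def build_replacements(inventory, new_text):
    """new_text maps (slide, shape) to a list of strings, one per paragraph.

    Each new paragraph inherits properties from the inventory paragraph at the
    same index (or the last available one). Shapes absent from new_text are
    omitted, which the workflow says results in their text being cleared.
    """
    result = {}
    for (slide, shape), texts in new_text.items():
        originals = inventory[slide][shape]["paragraphs"]
        pars = []
        for i, text in enumerate(texts):
            base = originals[min(i, len(originals) - 1)] if originals else {}
            pars.append(clean_paragraph({**base, "text": text}))
        result.setdefault(slide, {})[shape] = {"paragraphs": pars}
    return result

# Made-up inventory in the shape described by the workflow.
inventory = {"slide-0": {
    "shape-0": {"placeholder_type": "TITLE", "paragraphs": [
        {"text": "Click to add title", "alignment": "CENTER", "bold": True}]},
    "shape-1": {"placeholder_type": "BODY", "paragraphs": [
        {"text": "Item", "bullet": True, "level": 0}]},
}}
new_text = {("slide-0", "shape-0"): ["Quarterly Review"],
            ("slide-0", "shape-1"): ["• Revenue up", "- Costs flat"]}

assert not missing_targets(inventory, new_text)
replacements = build_replacements(inventory, new_text)
print(json.dumps(replacements, indent=2, ensure_ascii=False))
```

Writing the result with json.dump to replacement-text.json would produce the file that the replace.py step expects as its second argument.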

Creating Thumbnail Grids

To create visual thumbnail grids of PowerPoint slides for quick analysis and reference:
bash
python scripts/thumbnail.py template.pptx [output_prefix]
Features:
  • Creates: thumbnails.jpg (or thumbnails-1.jpg, thumbnails-2.jpg, etc. for large decks)
  • Default: 5 columns, max 30 slides per grid (5×6)
  • Custom prefix: python scripts/thumbnail.py template.pptx my-grid
    • Note: The output prefix should include the path if you want output in a specific directory (e.g., workspace/my-grid)
  • Adjust columns: --cols 4 (range: 3-6, affects slides per grid)
  • Grid limits: 3 cols = 12 slides/grid, 4 cols = 20, 5 cols = 30, 6 cols = 42
  • Slides are zero-indexed (Slide 0, Slide 1, etc.)
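The grid limits listed above all fit the pattern cols × (cols + 1). The sketch below uses that pattern to estimate how many grid images a deck will produce. The formula is inferred from the four listed values, not read from thumbnail.py, and the function names are hypothetical.

```python
import math

def slides_per_grid(cols):
    """Grid capacity inferred from the documented limits: cols x (cols + 1)."""
    if not 3 <= cols <= 6:
        raise ValueError("--cols is documented as accepting 3-6")
    return cols * (cols + 1)

def grid_count(slide_count, cols=5):
    """Number of grid images needed for a deck of `slide_count` slides."""
    return math.ceil(slide_count / slides_per_grid(cols))

for cols in (3, 4, 5, 6):
    print(cols, "cols:", slides_per_grid(cols), "slides per grid")
print("45 slides at 4 cols:", grid_count(45, cols=4), "grids")
```

Knowing the grid count in advance tells you how many thumbnail files to expect and read.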
Use cases:
  • Template analysis: Quickly understand slide layouts and design patterns
  • Content review: Visual overview of entire presentation
  • Navigation reference: Find specific slides by their visual appearance
  • Quality check: Verify all slides are properly formatted
Examples:
bash
# Basic usage
python scripts/thumbnail.py presentation.pptx

# Combine options: custom name, columns
python scripts/thumbnail.py template.pptx analysis --cols 4

Converting Slides to Images

To visually analyze PowerPoint slides, convert them to images using a two-step process:
  1. Convert PPTX to PDF:
    bash
    soffice --headless --convert-to pdf template.pptx
  2. Convert PDF pages to JPEG images:
    bash
    pdftoppm -jpeg -r 150 template.pdf slide
    This creates files like slide-1.jpg, slide-2.jpg, etc.
Options:
  • -r 150: Sets resolution to 150 DPI (adjust for quality/size balance)
  • -jpeg: Output JPEG format (use -png for PNG if preferred)
  • -f N: First page to convert (e.g., -f 2 starts from page 2)
  • -l N: Last page to convert (e.g., -l 5 stops at page 5)
  • slide: Prefix for output files
Example for specific range:
bash
pdftoppm -jpeg -r 150 -f 2 -l 5 template.pdf slide  # Converts only pages 2-5
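The two-step conversion can be wrapped in a short Python script. This is a sketch, assuming soffice and pdftoppm are on the PATH; the function names are hypothetical, and the --outdir option passed to soffice is an addition beyond the commands shown above, used here so the PDF lands in a known directory.

```python
import subprocess
from pathlib import Path

def soffice_cmd(pptx, outdir="."):
    """Command list for step 1: PPTX to PDF via headless LibreOffice."""
    return ["soffice", "--headless", "--convert-to", "pdf",
            "--outdir", str(outdir), str(pptx)]

def pdftoppm_cmd(pdf, prefix="slide", dpi=150, first=None, last=None, fmt="jpeg"):
    """Command list for step 2: PDF pages to images, optionally a page range."""
    cmd = ["pdftoppm", f"-{fmt}", "-r", str(dpi)]
    if first is not None:
        cmd += ["-f", str(first)]
    if last is not None:
        cmd += ["-l", str(last)]
    return cmd + [str(pdf), str(prefix)]

def pptx_to_images(pptx, outdir="."):
    """Run both steps and return the generated JPEG paths in sorted order."""
    pptx, outdir = Path(pptx), Path(outdir)
    subprocess.run(soffice_cmd(pptx, outdir), check=True)
    pdf = outdir / (pptx.stem + ".pdf")
    subprocess.run(pdftoppm_cmd(pdf, outdir / "slide"), check=True)
    return sorted(outdir.glob("slide-*.jpg"))
```

Calling pptx_to_images("template.pptx") runs both tools in sequence. Building the commands as lists, not shell strings, avoids quoting problems with file names that contain spaces.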

Code Style Guidelines

IMPORTANT: When generating code for PPTX operations:
  • Write concise code
  • Avoid verbose variable names and redundant operations
  • Avoid unnecessary print statements

Dependencies

Required dependencies (should already be installed):
  • markitdown: pip install "markitdown[pptx]" (for text extraction from presentations)
  • pptxgenjs: npm install -g pptxgenjs (for creating presentations via html2pptx)
  • playwright: npm install -g playwright (for HTML rendering in html2pptx)
  • react-icons: npm install -g react-icons react react-dom (for icons)
  • sharp: npm install -g sharp (for SVG rasterization and image processing)
  • LibreOffice: sudo apt-get install libreoffice (for PDF conversion)
  • Poppler: sudo apt-get install poppler-utils (for pdftoppm to convert PDF to images)
  • defusedxml: pip install defusedxml (for secure XML parsing)
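Since these dependencies are only expected to be installed, a quick check before starting can save a failed run. The sketch below covers the command-line tools and Python packages only. The tool list (including node, which the npm packages imply but this document does not name) and the function name are assumptions, and globally installed npm packages are not checked.

```python
import importlib.util
import shutil

CLI_TOOLS = ("soffice", "pdftoppm", "node")  # node is an assumption
PY_MODULES = ("markitdown", "defusedxml")

def missing_dependencies(tools=CLI_TOOLS, modules=PY_MODULES):
    """Return names of CLI tools not on PATH and Python modules not importable."""
    missing = [t for t in tools if shutil.which(t) is None]
    missing += [m for m in modules if importlib.util.find_spec(m) is None]
    return missing

print(missing_dependencies() or "all checked dependencies found")
```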