pptx
PPTX Creation, Editing, and Analysis
Overview
Users may ask you to create, edit, or analyze the content of .pptx files. A .pptx file is essentially a ZIP archive containing XML files and other resources that you can read or edit. Different tasks have different available tools and workflows.
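Because a .pptx file is a ZIP archive, Python's standard `zipfile` module can list its parts without any PowerPoint-specific library. The sketch below is illustrative only: `list_parts` and `make_demo_archive` are hypothetical helper names, and the demo archive merely mimics the layout of a deck so the example runs without a real file.

```python
import io
import zipfile


def list_parts(pptx):
    """Return the sorted names of all parts inside a .pptx (path or file-like object)."""
    with zipfile.ZipFile(pptx) as archive:
        return sorted(archive.namelist())


def make_demo_archive():
    """Build a tiny in-memory ZIP that only mimics a .pptx layout (not a valid deck)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("ppt/presentation.xml", "<p:presentation/>")
        archive.writestr("ppt/slides/slide1.xml", "<p:sld/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Replace make_demo_archive() with a real path such as "deck.pptx".
    for name in list_parts(make_demo_archive()):
        print(name)
```

Passing a real path instead of the demo buffer lists the actual parts of a presentation.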
Reading and Analyzing Content
Text Extraction
If you only need to read the text content of a presentation, convert the document to markdown:
```bash
# Convert document to markdown
python -m markitdown path-to-file.pptx
```
Raw XML Access
You need raw XML access for comments, speaker notes, slide layouts, animations, design elements, and complex formatting. For any of these features, unpack the presentation and read its raw XML contents.
Unpacking Files
`python ooxml/scripts/unpack.py <office_file> <output_dir>`
Note: The unpack.py script is located at `skills/pptx/ooxml/scripts/unpack.py` relative to the project root. If the script is not at this path, use `find . -name "unpack.py"` to locate it.
Key File Structure
- `ppt/presentation.xml` - Main presentation metadata and slide references
- `ppt/slides/slide{N}.xml` - Individual slide contents (slide1.xml, slide2.xml, etc.)
- `ppt/notesSlides/notesSlide{N}.xml` - Speaker notes for each slide
- `ppt/comments/modernComment_*.xml` - Comments for specific slides
- `ppt/slideLayouts/` - Slide layout templates
- `ppt/slideMasters/` - Master slide templates
- `ppt/theme/` - Theme and style information
- `ppt/media/` - Images and other media files
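As a rough illustration of the structure above, the following sketch picks the slide parts out of a list of archive names and orders them numerically. `slide_numbers` and `group_by_folder` are hypothetical helpers. Note that the file number is not guaranteed to match the on-screen slide order, which is governed by the slide references in `ppt/presentation.xml`.

```python
import re
from collections import defaultdict

# Matches only the slide parts themselves, not their _rels/ companions.
SLIDE_PART = re.compile(r"^ppt/slides/slide(\d+)\.xml$")


def slide_numbers(part_names):
    """Return slide file numbers, sorted numerically (so 10 follows 2)."""
    numbers = []
    for name in part_names:
        match = SLIDE_PART.match(name)
        if match:
            numbers.append(int(match.group(1)))
    return sorted(numbers)


def group_by_folder(part_names):
    """Group part names by their folder, e.g. 'ppt/media' or 'ppt/theme'."""
    groups = defaultdict(list)
    for name in part_names:
        folder = name.rsplit("/", 1)[0] if "/" in name else ""
        groups[folder].append(name)
    return dict(groups)


if __name__ == "__main__":
    names = [
        "ppt/slides/slide10.xml",
        "ppt/slides/slide2.xml",
        "ppt/slides/_rels/slide2.xml.rels",
        "ppt/media/image1.png",
    ]
    print(slide_numbers(names))
    print(group_by_folder(names))
```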
Typography and Color Extraction
When given a sample design to emulate: Always analyze the presentation's typography and colors first, using the methods below:
- Read the theme file: Check `ppt/theme/theme1.xml` for colors (`<a:clrScheme>`) and fonts (`<a:fontScheme>`)
- Sample slide content: Examine `ppt/slides/slide1.xml` for actual font usage (`<a:rPr>`) and colors
- Search for patterns: Use grep to find color (`<a:solidFill>`, `<a:srgbClr>`) and font references across all XML files
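The first step above can be sketched in Python. This is a minimal example under stated assumptions: `SAMPLE_THEME` is a hand-written stand-in for `ppt/theme/theme1.xml`, the helper names are hypothetical, and the standard-library parser is used only because the input is a trusted inline string (for files from unknown sources, the `defusedxml` dependency is the safer choice).

```python
import xml.etree.ElementTree as ET

# DrawingML namespace used by the a: prefix in theme and slide XML.
NS = {"a": "http://schemas.openxmlformats.org/drawingml/2006/main"}

SAMPLE_THEME = """<a:theme xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" name="Demo">
  <a:themeElements>
    <a:clrScheme name="Demo">
      <a:dk1><a:srgbClr val="1C2833"/></a:dk1>
      <a:lt1><a:srgbClr val="F4F6F6"/></a:lt1>
      <a:accent1><a:srgbClr val="2E4053"/></a:accent1>
    </a:clrScheme>
    <a:fontScheme name="Demo">
      <a:majorFont><a:latin typeface="Georgia"/></a:majorFont>
      <a:minorFont><a:latin typeface="Arial"/></a:minorFont>
    </a:fontScheme>
  </a:themeElements>
</a:theme>"""


def theme_colors(root):
    """Map each slot in <a:clrScheme> (dk1, lt1, accent1, ...) to its sRGB hex value."""
    colors = {}
    scheme = root.find(".//a:clrScheme", NS)
    if scheme is None:
        return colors
    for slot in scheme:
        rgb = slot.find("a:srgbClr", NS)
        if rgb is not None:  # slots defined with <a:sysClr> are skipped in this sketch
            colors[slot.tag.split("}")[1]] = rgb.get("val")
    return colors


def theme_fonts(root):
    """Return the Latin typefaces of the major (heading) and minor (body) fonts."""
    fonts = {}
    for kind in ("majorFont", "minorFont"):
        latin = root.find(f".//a:fontScheme/a:{kind}/a:latin", NS)
        if latin is not None:
            fonts[kind] = latin.get("typeface")
    return fonts


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE_THEME)
    print(theme_colors(root))
    print(theme_fonts(root))
```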
Creating New PowerPoint Presentations Without Templates
When creating a new PowerPoint presentation from scratch, use the html2pptx workflow to convert HTML slides to PowerPoint with precise positioning.
Design Principles
Key: Before creating any presentation, analyze the content and select appropriate design elements:
- Consider the theme: What is this presentation about? What tone, industry, or mood does it imply?
- Check branding: If the user mentions a company/organization, consider its brand colors and image
- Match color palette to content: Choose colors that reflect the theme
- Explain your approach: Explain your design choices before writing code
Requirements:
- ✅ Explain your content-based design approach before writing code
- ✅ Only use web-safe fonts: Arial, Helvetica, Times New Roman, Georgia, Courier New, Verdana, Tahoma, Trebuchet MS, Impact
- ✅ Create clear visual hierarchy through size, weight, and color
- ✅ Ensure readability: strong contrast, appropriately sized text, neat alignment
- ✅ Maintain consistency: repeat patterns, spacing, and visual language across slides
Color Palette Selection
Choose colors creatively:
- Think beyond defaults: What colors truly match this specific theme? Avoid automatic selection.
- Consider multiple angles: theme, industry, mood, energy level, target audience, brand image (if mentioned)
- Be bold: Try unexpected combinations - healthcare presentations don't have to be green, finance doesn't have to be navy blue
- Build a palette: Choose 3-5 colors that work well together (primary + secondary + accent colors)
- Ensure contrast: Text must be clearly readable against the background
Sample Palettes (for inspiration - choose one, adjust it, or create your own):
- Classic Blue: Deep Navy (#1C2833), Slate Gray (#2E4053), Silver (#AAB7B8), Off-White (#F4F6F6)
- Teal & Coral: Teal (#5EA8A7), Deep Teal (#277884), Coral (#FE4447), White (#FFFFFF)
- Bold Red: Red (#C0392B), Bright Red (#E74C3C), Orange (#F39C12), Yellow (#F1C40F), Green (#2ECC71)
- Warm Blush: Taupe (#A49393), Blush (#EED6D3), Rose (#E8B4B8), Cream (#FAF7F2)
- Burgundy Luxury: Burgundy (#5D1D2E), Deep Red (#951233), Rust (#C15937), Gold (#997929)
- Deep Purple & Emerald: Purple (#B165FB), Dark Blue (#181B24), Emerald (#40695B), White (#FFFFFF)
- Cream & Forest Green: Cream (#FFE1C7), Forest Green (#40695B), White (#FCFCFC)
- Pink & Purple: Pink (#F8275B), Coral (#FF574A), Rose (#FF737D), Purple (#3D2F68)
- Lime & Plum: Lime (#C5DE82), Plum (#7C3A5F), Coral (#FD8C6E), Slate Blue (#98ACB5)
- Black & Gold: Gold (#BF9A4A), Black (#000000), Cream (#F4F6F6)
- Sage & Terracotta: Sage (#87A96B), Terracotta (#E07A5F), Cream (#F4F1DE), Charcoal (#2C2C2C)
- Charcoal & Red: Charcoal (#292929), Red (#E33737), Light Gray (#CCCBCB)
- Vibrant Orange: Orange (#F96D00), Light Gray (#F2F2F2), Charcoal (#222831)
- Forest Green: Black (#191A19), Green (#4E9F3D), Deep Green (#1E5128), White (#FFFFFF)
- Vintage Rainbow: Purple (#722880), Pink (#D72D51), Orange (#EB5C18), Amber (#F08800), Gold (#DEB600)
- Vintage Earth: Mustard (#E3B448), Sage (#CBD18F), Forest Green (#3A6B35), Cream (#F4F1DE)
- Coastal Rose: Old Rose (#AD7670), Beaver (#B49886), Eggshell (#F3ECDC), Gray Green (#BFD5BE)
- Orange & Teal: Light Orange (#FC993E), Gray Teal (#667C6F), White (#FCFCFC)
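One way to check the "ensure contrast" requirement against a candidate palette is to compute a contrast ratio between text and background colors. The sketch below uses the WCAG 2.x relative-luminance formula, which is an outside technique brought in for illustration and not something this guide prescribes. The 4.5:1 threshold is the WCAG AA level for normal-size text.

```python
def relative_luminance(hex_color):
    """WCAG 2.x relative luminance of an sRGB color given as '#RRGGBB' or 'RRGGBB'."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma curve before weighting the channels.
    linear = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4 for c in channels]
    red, green, blue = linear
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1.0 (identical) up to 21.0 (black on white)."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    # Deep Navy text on Off-White, both from the Classic Blue palette above.
    ratio = contrast_ratio("#1C2833", "#F4F6F6")
    print(f"{ratio:.1f}:1", "OK" if ratio >= 4.5 else "too low for body text")
```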
Visual Detail Options
Geometric Patterns:
- Diagonal separators instead of horizontal separators
- Asymmetric column widths (30/70, 40/60, 25/75)
- Text titles rotated 90° or 270°
- Circular/hexagonal frames for images
- Triangle decorative shapes in corners
- Overlapping shapes to add depth
Border and Frame Treatments:
- Thick solid border on only one side (10-20pt)
- Double-line border with contrasting colors
- Angle brackets instead of full frames
- L-shaped borders (top+left or bottom+right)
- Underline emphasis below titles (3-5pt thick)
Typography Treatments:
- Extreme size contrast (72pt title vs 11pt body text)
- Uppercase titles with wide letter spacing
- Oversized display font for numbered sections
- Monospace font (Courier New) for data/statistics/technical content
- Narrow font (Arial Narrow) for dense information
- Outline text for emphasis
Chart and Data Styles:
- Monochrome charts with single accent color for key data
- Horizontal bar charts instead of vertical bar charts
- Dot plots instead of bar charts
- Minimal or no grid lines
- Data labels directly on elements (no legend)
- Oversized numbers for key metrics
Layout Innovations:
- Full-page images with text overlays
- Sidebar columns (20-30% width) for navigation/context
- Modular grid systems (3×3, 4×4 blocks)
- Z-shaped or F-shaped content flow
- Floating text boxes over colored shapes
- Magazine-style multi-column layouts
Background Treatments:
- Solid color blocks covering 40-60% of the slide
- Gradient fills (only vertical or diagonal)
- Split backgrounds (two colors, diagonal or vertical)
- Edge-to-edge color bands
- Negative space as a design element
Layout Tips
When creating slides with charts or tables:
- Two-column layout (recommended): Use a full-width title, then two columns below - one for text/bullets, the other for main content. This provides better balance and makes charts/tables more readable. Use flexbox with unequal column widths (e.g., 40%/60% split) to optimize space for each content type.
- Full-slide layout: Let main content (charts/tables) occupy the entire slide for maximum impact and readability
- Never stack vertically: Do not place charts/tables below text in a single column - this causes poor readability and layout issues
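The arithmetic behind an unequal two-column split can be sketched as follows. The 720pt slide width is the 16:9 width used elsewhere in this guide. The 36pt margin and 24pt gap are arbitrary illustrative values, and `two_column_boxes` is a hypothetical helper.

```python
def two_column_boxes(slide_width=720.0, margin=36.0, gap=24.0, text_share=0.4):
    """Return ((x, width), (x, width)) in points for the text column and the chart/table column."""
    usable = slide_width - 2 * margin - gap  # width left after margins and the gutter
    text_width = usable * text_share
    content_width = usable - text_width
    text_box = (margin, text_width)
    content_box = (margin + text_width + gap, content_width)
    return text_box, content_box


if __name__ == "__main__":
    text_box, content_box = two_column_boxes()  # 40%/60% split
    print("text column:   x=%.1fpt width=%.1fpt" % text_box)
    print("content column: x=%.1fpt width=%.1fpt" % content_box)
```

In the HTML itself the same split would normally be expressed with flexbox percentages; the numbers here only show how much room each column ends up with.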
Workflow
1. Mandatory - Read the entire file: Read `html2pptx.md` completely, from start to finish. Never set any range limits when reading this file. Read the full file before you start creating the presentation to learn the detailed syntax, critical formatting rules, and best practices.
2. Create an HTML file for each slide with the correct dimensions (e.g., 720pt × 405pt for 16:9)
   - Use `<p>`, `<h1>`-`<h6>`, `<ul>`, `<ol>` for all text content
   - Use `class="placeholder"` for areas where charts/tables will be added (rendered with a gray background for visibility)
   - Critical: First rasterize gradients and icons to PNG images using Sharp, then reference them in the HTML
   - Layout: For slides with charts/tables/images, use a full-slide layout or a two-column layout for better readability
3. Create and run a JavaScript file that uses the `html2pptx.js` library to convert the HTML slides to PowerPoint and save the presentation
   - Use the `html2pptx()` function to process each HTML file
   - Use the PptxGenJS API to add charts and tables to the placeholder areas
   - Use `pptx.writeFile()` to save the presentation
4. Visual validation: Generate thumbnails and check for layout issues
   - Create a thumbnail grid: `python scripts/thumbnail.py output.pptx workspace/thumbnails --cols 4`
   - Read and carefully inspect the thumbnail images for:
     - Text truncation: Text cut off by title bars, shapes, or slide edges
     - Text overlap: Text overlapping other text or shapes
     - Positioning issues: Content too close to the slide boundaries or to other elements
     - Contrast issues: Insufficient contrast between text and background
   - If you find issues, adjust the HTML margins/spacing/colors and regenerate the presentation
   - Repeat until all slides are visually correct
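Step 2 of the workflow above can be sketched in Python, although the conversion itself is done in JavaScript. The markup below is an assumption based only on the rules listed here (slide dimensions, semantic text tags, `class="placeholder"`). The authoritative rules live in `html2pptx.md`, which is not reproduced in this document, so treat `slide_html` and the `id="chart"` attribute as hypothetical.

```python
from html import escape

SLIDE_WIDTH_PT, SLIDE_HEIGHT_PT = 720, 405  # 16:9, as stated in the workflow


def slide_html(title, bullets, with_placeholder=True):
    """Return HTML for one slide: a title, a bullet list, and an optional chart placeholder."""
    items = "\n".join(f"    <li>{escape(text)}</li>" for text in bullets)
    placeholder = '  <div class="placeholder" id="chart"></div>\n' if with_placeholder else ""
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n<style>\n"
        f"  body {{ width: {SLIDE_WIDTH_PT}pt; height: {SLIDE_HEIGHT_PT}pt; "
        "margin: 0; font-family: Arial; }\n"
        "</style>\n</head>\n<body>\n"
        f"  <h1>{escape(title)}</h1>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        f"{placeholder}"
        "</body>\n</html>\n"
    )


if __name__ == "__main__":
    # Write one HTML file per slide; the JavaScript step would then read these files.
    with open("slide1.html", "w", encoding="utf-8") as handle:
        handle.write(slide_html("Revenue & Growth", ["Q1 summary", "Q2 outlook"]))
```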
Editing Existing PowerPoint Presentations
When editing slides in an existing PowerPoint presentation, you need to work with the raw Office Open XML (OOXML) format. This involves unpacking the .pptx file, editing the XML content, and repacking it.
Workflow
1. Mandatory - Read the entire file: Read `ooxml.md` (approximately 500 lines) completely, from start to finish. Never set any range limits when reading this file. Read the full file before editing any presentation to get detailed guidance on the OOXML structure and editing workflows.
2. Unpack the presentation: `python ooxml/scripts/unpack.py <office_file> <output_dir>`
3. Edit the XML files (mainly `ppt/slides/slide{N}.xml` and related files)
4. Critical: Validate immediately after each edit and fix any validation errors before proceeding: `python ooxml/scripts/validate.py <dir> --original <file>`
5. Pack the final presentation: `python ooxml/scripts/pack.py <input_directory> <office_file>`
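To give a feel for step 3, here is a minimal sketch of changing the text of a run in slide XML using only the standard library. It is not the procedure from `ooxml.md` (which is not reproduced here): `SAMPLE_SLIDE` is a stripped-down stand-in for a real `slide{N}.xml`, `replace_run_text` is a hypothetical helper, and real slides often split one visible sentence across several `<a:r>` runs, which this exact-match approach does not handle.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
}
for prefix, uri in NAMESPACES.items():
    ET.register_namespace(prefix, uri)  # keep the a:/p: prefixes when serializing

SAMPLE_SLIDE = (
    '<p:sld xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" '
    'xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main">'
    "<p:cSld><p:spTree><p:sp><p:txBody>"
    "<a:p><a:r><a:t>Old title</a:t></a:r></a:p>"
    "</p:txBody></p:sp></p:spTree></p:cSld></p:sld>"
)


def replace_run_text(slide_xml, old, new):
    """Replace exact-match text in <a:t> elements; return (new_xml, runs_changed)."""
    root = ET.fromstring(slide_xml)
    changed = 0
    for run_text in root.iter(f"{{{NAMESPACES['a']}}}t"):
        if run_text.text == old:
            run_text.text = new
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


if __name__ == "__main__":
    new_xml, count = replace_run_text(SAMPLE_SLIDE, "Old title", "New title")
    print(count, "run(s) changed")
    print(new_xml)
```

After any such edit, the validation command in step 4 remains the real check that the result is still a well-formed presentation part.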
Creating New PowerPoint Presentations Using Templates
When you need to create a presentation that follows an existing template design, you need to copy and rearrange template slides, then replace placeholder content.
Workflow
1. Extract the template text and create a visual thumbnail grid:
   - Extract text: `python -m markitdown template.pptx > template-content.md`
   - Read `template-content.md`: Read the entire file to understand the contents of the template presentation. Never set any range limits when reading this file.
   - Create thumbnail grid: `python scripts/thumbnail.py template.pptx`
   - For more details, see the Creating Thumbnail Grids section
2. Analyze the template and save the inventory to a file:
   - Visual analysis: Review the thumbnail grid to understand slide layouts, design patterns, and visual structure
   - Create and save a template inventory file at `template-inventory.md` containing:
     ```markdown
     # Template Inventory Analysis
     **Total Slides: [count]**
     **Important: Slides are indexed from 0 (first slide = 0, last slide = count-1)**

     ## [Category Name]
     - Slide 0: [Layout code (if any)] - Description/Purpose
     - Slide 1: [Layout code] - Description/Purpose
     - Slide 2: [Layout code] - Description/Purpose
     [... List each slide individually with its index ...]
     ```
   - Use the thumbnail grid: Reference the visual thumbnails to identify:
     - Layout patterns (title slides, content layouts, section dividers)
     - Image placeholder positions and quantities
     - Design consistency between slide groups
     - Visual hierarchy and structure
   - This inventory file is required for selecting appropriate templates in the next step
3. Create a presentation outline based on the template inventory:
   - Review the available templates from step 2.
   - Select an introduction or title template for the first slide. This should be one of the first templates.
   - Select safe, text-based layouts for the other slides.
   - Critical: Match the layout structure to the actual content:
     - Single-column layouts: For unified narratives or single topics
     - Two-column layouts: Use only when you have exactly 2 distinct items/concepts
     - Three-column layouts: Use only when you have exactly 3 distinct items/concepts
     - Image+text layouts: Use only when you have actual images to insert
     - Quote layouts: Use only for actual quotes from people (with attribution), never for emphasis
     - Never use a layout with more placeholders than you have content
     - If you have 2 items, don't force them into a 3-column layout
     - If you have 4+ items, consider splitting them across multiple slides or using a list format
   - Count your actual content pieces before selecting a layout
   - Verify that each placeholder in the selected layout will be filled with meaningful content
   - Select the one option that represents the best layout for each content section.
   - Save `outline.md` containing the content and the template mapping that makes use of the available designs
   - Example template mapping:
     ```
     # Template slides to use (0-based index)
     # Warning: Verify indexes are within range! A template with 73 slides has indexes 0-72
     # Mapping: slide number in outline -> template slide index
     template_mapping = [
         0,   # Use slide 0 (Title/Cover)
         34,  # Use slide 34 (B1: Title and Body)
         34,  # Use slide 34 again (copy for second B1)
         50,  # Use slide 50 (E1: Quote)
         54,  # Use slide 54 (F2: Closing + Text)
     ]
     ```
4. Duplicate, reorder, and delete slides using `rearrange.py`:
   - Use the `scripts/rearrange.py` script to create a new presentation with the slides in the desired order:
     ```bash
     python scripts/rearrange.py template.pptx working.pptx 0,34,34,50,54
     ```
   - The script automatically handles duplicating repeated slides, deleting unused slides, and reordering
   - Slide indexes start at 0 (the first slide is 0, the second is 1, etc.)
   - The same slide index can appear multiple times to duplicate that slide
5. Extract all text using the `inventory.py` script:
   - Run the inventory extraction:
     ```bash
     python scripts/inventory.py working.pptx text-inventory.json
     ```
   - Read text-inventory.json: Read the entire text-inventory.json file to understand all shapes and their properties. Never set any range limits when reading this file.
   - Inventory JSON structure:
     ```json
     {
       "slide-0": {
         "shape-0": {
           "placeholder_type": "TITLE",  // or null for non-placeholders
           "left": 1.5,                  // Position (inches)
           "top": 2.0,
           "width": 7.5,
           "height": 1.2,
           "paragraphs": [
             {
               "text": "Paragraph text",
               // Optional properties (only included if non-default):
               "bullet": true,           // Explicit bullet detected
               "level": 0,               // Only included if bullet is true
               "alignment": "CENTER",    // CENTER, RIGHT (not LEFT)
               "space_before": 10.0,     // Space before paragraph (points)
               "space_after": 6.0,       // Space after paragraph (points)
               "line_spacing": 22.4,     // Line spacing (points)
               "font_name": "Arial",     // From first run
               "font_size": 14.0,        // Points
               "bold": true,
               "italic": false,
               "underline": false,
               "color": "FF0000"         // RGB color
             }
           ]
         }
       }
     }
     ```
   - Key features:
     - Slides: Named "slide-0", "slide-1", etc.
     - Shapes: Ordered by visual position (top to bottom, left to right) as "shape-0", "shape-1", etc.
     - Placeholder types: TITLE, CENTER_TITLE, SUBTITLE, BODY, OBJECT, or null
     - Default font size: `default_font_size` extracted from the layout placeholders (points, if available)
     - Slide numbers are filtered: Shapes with the SLIDE_NUMBER placeholder type are automatically excluded from the inventory
     - Bullets: When `bullet: true`, `level` is always included (even if 0)
     - Spacing: `space_before`, `space_after`, and `line_spacing` (points, only included if set)
     - Colors: `color` for RGB (e.g., "FF0000"), `theme_color` for theme colors (e.g., "DARK_1")
     - Properties: Only non-default values are included in the output
6. Generate replacement text and save the data to a JSON file. Based on the text inventory from the previous step:
   - Critical: First verify which shapes exist in the inventory - only reference shapes that are actually present
   - Validation: The replace.py script validates that every shape in the replacement JSON exists in the inventory
     - If you reference a non-existent shape, you get an error listing the available shapes
     - If you reference a non-existent slide, you get an error indicating that the slide does not exist
     - All validation errors are shown at once before the script exits
   - Important: The replace.py script uses inventory.py internally to identify all text shapes
   - Automatic clearing: All text shapes in the inventory are cleared unless you provide "paragraphs" for them
   - Add a "paragraphs" field (not "replacement_paragraphs") to shapes that need content
   - Shapes without "paragraphs" in the replacement JSON have their text cleared automatically
   - Bulleted paragraphs are left-aligned automatically. Do not set the `alignment` property when `"bullet": true`
   - Generate appropriate replacement content for placeholder text
   - Use the shape size to determine an appropriate content length
   - Critical: Include the paragraph properties from the original inventory - don't provide text alone
   - Important: When bullet: true, do not include bullet symbols (•, -, *) in the text - they are added automatically
   - Essential formatting rules:
     - Titles should typically have `"bold": true`
     - List items should have `"bullet": true, "level": 0` (level is required when bullet is true)
     - Preserve any alignment properties (e.g., `"alignment": "CENTER"` for centered text)
     - Include font properties when they differ from the defaults (e.g., `"font_size": 14.0`, `"font_name": "Lora"`)
     - Colors: Use `"color": "FF0000"` for RGB or `"theme_color": "DARK_1"` for theme colors
     - The replacement script expects properly formatted paragraphs, not just text strings
     - Overlapping shapes: Prefer shapes with a larger default_font_size or a more appropriate placeholder_type
   - Save the updated inventory with the replacements to `replacement-text.json`
   - Warning: Different template layouts have different numbers of shapes - always check the actual inventory before creating replacements

   Example paragraphs field showing the correct format:
   ```json
   "paragraphs": [
     { "text": "New presentation title text", "alignment": "CENTER", "bold": true },
     { "text": "Section title", "bold": true },
     { "text": "First bullet point, no bullet symbol", "bullet": true, "level": 0 },
     { "text": "Red text", "color": "FF0000" },
     { "text": "Theme color text", "theme_color": "DARK_1" },
     { "text": "Normal paragraph text without special formatting" }
   ]
   ```
   Shapes not listed in the replacement JSON are cleared automatically:
   ```json
   {
     "slide-0": {
       "shape-0": {
         "paragraphs": [...]  // This shape gets new text
       }
       // shape-1 and shape-2 from the inventory will be cleared automatically
     }
   }
   ```
   Common formatting patterns for presentations:
   - Title slides: Bold text, sometimes centered
   - Section titles within slides: Bold text
   - Bullet lists: Each item needs `"bullet": true, "level": 0`
   - Body text: Usually no special properties needed
   - Quotes: May have special alignment or font properties
7. Apply the replacements using the `replace.py` script:
   ```bash
   python scripts/replace.py working.pptx replacement-text.json output.pptx
   ```
   The script will:
   - First extract an inventory of all text shapes using functions from inventory.py
   - Validate that all shapes in the replacement JSON exist in the inventory
   - Clear the text from all shapes identified in the inventory
   - Apply new text only to shapes that have "paragraphs" defined in the replacement JSON
   - Preserve formatting by applying the paragraph properties from the JSON
   - Automatically handle bullets, alignment, font properties, and colors
   - Save the updated presentation

   Example validation errors:
   ```
   Error: Invalid shapes in replacement JSON:
   - Shape 'shape-99' not found on 'slide-0'. Available shapes: shape-0, shape-1, shape-4
   - Slide 'slide-999' not found in inventory
   ```
   ```
   Error: Replacement text worsens overflow for the following shapes:
   - slide-0/shape-2: overflow increased by 1.25" (was 0.00", now 1.25")
   ```
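The bookkeeping in steps 3 to 6 can be sketched in Python. This is a rough illustration under stated assumptions, not the logic inside `rearrange.py` or `replace.py`: the three helper functions are hypothetical, and the inventory dictionary is a hand-written fragment in the JSON shape described in step 5.

```python
import json


def rearrange_argument(mapping, slide_count):
    """Validate 0-based template indexes and join them into the comma-separated CLI argument."""
    bad = [index for index in mapping if not 0 <= index < slide_count]
    if bad:
        raise ValueError(f"indexes out of range 0-{slide_count - 1}: {bad}")
    return ",".join(str(index) for index in mapping)


def invalid_references(inventory, replacements):
    """List slides and 'slide/shape' keys in the replacements that the inventory lacks."""
    problems = []
    for slide, shapes in replacements.items():
        if slide not in inventory:
            problems.append(slide)
            continue
        problems += [f"{slide}/{shape}" for shape in shapes if shape not in inventory[slide]]
    return problems


def retitle(inventory, slide, shape, text):
    """Build a replacement entry that keeps the first paragraph's properties but swaps its text."""
    original = inventory[slide][shape]["paragraphs"][0]
    paragraph = {**original, "text": text}
    return {slide: {shape: {"paragraphs": [paragraph]}}}


if __name__ == "__main__":
    inventory = {
        "slide-0": {
            "shape-0": {
                "placeholder_type": "TITLE",
                "paragraphs": [{"text": "Click to add title", "alignment": "CENTER", "bold": True}],
            }
        }
    }
    replacements = retitle(inventory, "slide-0", "shape-0", "Quarterly Review")
    assert invalid_references(inventory, replacements) == []
    print(rearrange_argument([0, 34, 34, 50, 54], 73))
    print(json.dumps(replacements, indent=2))
```

Copying the original paragraph and overriding only `text` is one simple way to follow the rule about carrying paragraph properties over from the inventory.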
Creating Thumbnail Grids
To create a visual thumbnail grid of PowerPoint slides for quick analysis and reference:
```bash
python scripts/thumbnail.py template.pptx [output_prefix]
```
Features:
- Creates: `thumbnails.jpg` (or `thumbnails-1.jpg`, `thumbnails-2.jpg`, etc. for large presentations)
- Default: 5 columns, max 30 slides per grid (5×6)
- Custom prefix: `python scripts/thumbnail.py template.pptx my-grid`
  - Note: The output prefix should include the path if you want the output in a specific directory (e.g., `workspace/my-grid`)
- Adjust column count: `--cols 4` (range: 3-6, affects the number of slides per grid)
- Grid limits: 3 columns = 12 slides/grid, 4 columns = 20, 5 columns = 30, 6 columns = 42
- Slides are zero-indexed (slide 0, slide 1, etc.)
Use Cases:
- Template analysis: Quickly understand slide layouts and design patterns
- Content review: Visual overview of the entire presentation
- Navigation reference: Find specific slides by appearance
- Quality check: Verify that all slides are formatted correctly
Examples:
```bash
# Basic usage
python scripts/thumbnail.py presentation.pptx

# Combined options: custom name, column count
python scripts/thumbnail.py template.pptx analysis --cols 4
```
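The grid limits listed above all fit the pattern of `cols` columns by `cols + 1` rows. That pattern is inferred from the listed numbers rather than taken from `thumbnail.py` itself, so the sketch below, with its hypothetical helper names, should be read as an estimate of how many grid images to expect.

```python
import math


def slides_per_grid(cols):
    """Capacity implied by the documented limits: cols columns by (cols + 1) rows."""
    if not 3 <= cols <= 6:
        raise ValueError("cols must be between 3 and 6")
    return cols * (cols + 1)


def grids_needed(slide_count, cols=5):
    """Number of grid images needed to show every slide at the given column count."""
    return math.ceil(slide_count / slides_per_grid(cols))


if __name__ == "__main__":
    for cols in range(3, 7):
        print(cols, "columns:", slides_per_grid(cols), "slides per grid")
    print("73 slides at 5 columns:", grids_needed(73), "grids")
```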
Converting Slides to Images
To visually analyze PowerPoint slides, convert them to images with a two-step process:
1. Convert PPTX to PDF:
   ```bash
   soffice --headless --convert-to pdf template.pptx
   ```
2. Convert PDF pages to JPEG images:
   ```bash
   pdftoppm -jpeg -r 150 template.pdf slide
   ```
   This creates files such as `slide-1.jpg`, `slide-2.jpg`, etc.

Options:
- `-r 150`: Sets the resolution to 150 DPI (adjust to balance quality and size)
- `-jpeg`: Outputs JPEG format (use `-png` for PNG output if needed)
- `-f N`: First page to convert (e.g., `-f 2` starts from page 2)
- `-l N`: Last page to convert (e.g., `-l 5` stops at page 5)
- `slide`: Prefix for the output files

Example for a specific range:
```bash
pdftoppm -jpeg -r 150 -f 2 -l 5 template.pdf slide  # Convert only pages 2-5
```
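If the two steps need to be scripted, a thin Python wrapper around the same commands might look like the sketch below. The helper names are hypothetical, and `convert` assumes that `soffice` writes the PDF into the current working directory, which is its behavior when no output directory is given.

```python
import subprocess
from pathlib import Path


def pdf_command(pptx_path):
    """Command line for step 1: PPTX to PDF via headless LibreOffice."""
    return ["soffice", "--headless", "--convert-to", "pdf", str(pptx_path)]


def image_command(pdf_path, prefix="slide", dpi=150, first=None, last=None):
    """Command line for step 2: PDF pages to JPEG images via pdftoppm."""
    command = ["pdftoppm", "-jpeg", "-r", str(dpi)]
    if first is not None:
        command += ["-f", str(first)]
    if last is not None:
        command += ["-l", str(last)]
    return command + [str(pdf_path), prefix]


def convert(pptx_path, prefix="slide", **page_options):
    """Run both steps in order; raises CalledProcessError if either tool fails."""
    subprocess.run(pdf_command(pptx_path), check=True)
    pdf_path = Path(pptx_path).with_suffix(".pdf").name  # PDF lands in the working directory
    subprocess.run(image_command(pdf_path, prefix, **page_options), check=True)


if __name__ == "__main__":
    # Only prints the commands; call convert("template.pptx", first=2, last=5) to run them.
    print(" ".join(pdf_command("template.pptx")))
    print(" ".join(image_command("template.pdf", first=2, last=5)))
```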
Code Style Guide
Important: When generating code for PPTX operations:
- Write concise code
- Avoid verbose variable names and redundant operations
- Avoid unnecessary print statements
Dependencies
Required dependencies (should already be installed):
- markitdown: `pip install "markitdown[pptx]"` (for extracting text from presentations)
- pptxgenjs: `npm install -g pptxgenjs` (for creating presentations via html2pptx)
- playwright: `npm install -g playwright` (for HTML rendering in html2pptx)
- react-icons: `npm install -g react-icons react react-dom` (for icons)
- sharp: `npm install -g sharp` (for SVG rasterization and image processing)
- LibreOffice: `sudo apt-get install libreoffice` (for PDF conversion)
- Poppler: `sudo apt-get install poppler-utils` (for pdftoppm to convert PDF to images)
- defusedxml: `pip install defusedxml` (for safe XML parsing)
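A quick way to confirm that the command-line tools and Python packages above are present is sketched below. The helper names are hypothetical, and checking for a `node` executable is an assumption made because the npm packages need Node.js. The globally installed npm packages themselves are not checked here.

```python
import importlib.util
import shutil

COMMANDS = ("soffice", "pdftoppm", "node")  # "node" is an assumption, see the note above
PYTHON_MODULES = ("markitdown", "defusedxml")


def missing_commands(commands=COMMANDS):
    """Return the command names that are not found on PATH."""
    return [name for name in commands if shutil.which(name) is None]


def missing_modules(modules=PYTHON_MODULES):
    """Return the top-level Python modules that cannot be located for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing commands:", missing_commands() or "none")
    print("missing modules:", missing_modules() or "none")
```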