paper-slide-deck
Paper Slide Deck Generator
Transform academic papers and content into professional slide deck images with automatic figure extraction.
Usage
bash
/paper-slide-deck path/to/paper.pdf
/paper-slide-deck path/to/paper.pdf --style academic-paper
/paper-slide-deck path/to/content.md --style sketch-notes
/paper-slide-deck path/to/content.md --audience executives
/paper-slide-deck path/to/content.md --lang zh
/paper-slide-deck path/to/content.md --slides 10
/paper-slide-deck path/to/content.md --outline-only
/paper-slide-deck  # Then paste content
Script Directory
Important: All scripts are located in the `scripts/` subdirectory of this skill.
Agent Execution Instructions:
- Determine this SKILL.md file's directory path as `SKILL_DIR`
- Script path = `${SKILL_DIR}/scripts/<script-name>.ts`
- Replace all `${SKILL_DIR}` in this document with the actual path
Script Reference:
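| Script | Purpose |
|---|---|
| `scripts/generate-slides.py` | Generate AI slides via Gemini API (Python) |
| `scripts/merge-to-pptx.ts` | Merge slides into PowerPoint |
| `scripts/merge-to-pdf.ts` | Merge slides into PDF |
| `scripts/detect-figures.ts` | Auto-detect figures/tables in PDF |
| `scripts/extract-figure.ts` | Extract figure from PDF page (uses PyMuPDF fallback) |
| `scripts/apply-template.ts` | Apply figure container template |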
Options
| Option | Description |
|---|---|
| `--style` | Visual style (see Style Gallery) |
| `--audience` | Target audience: beginners, intermediate, experts, executives, general |
| `--lang` | Output language (en, zh, ja, etc.) |
| `--slides` | Target slide count |
| `--outline-only` | Generate outline only, skip image generation |
Style Gallery
| Style | Description | Best For |
|---|---|---|
| Clean professional, precise charts | Conference talks, thesis defense |
| Technical schematics, grid texture | Architecture, system design |
| Black chalkboard, colorful chalk | Education, tutorials, classroom |
| SaaS dashboard, card-based layouts | Product demos, SaaS, B2B |
| Magazine cover, bold typography, dark | Product launches, keynotes |
| Navy/gold, structured layouts | Investor decks, proposals |
| Cinematic dark mode, glowing accents | Entertainment, gaming |
| Magazine explainers, flat illustrations | Tech explainers, research |
| Ghibli/Disney style, hand-drawn | Educational, storytelling |
| Technical briefing, bilingual labels | Technical docs, academic |
| Ultra-clean, maximum whitespace | Executive briefings, premium |
| Retro 8-bit, chunky pixels | Gaming, developer talks |
| Academic diagrams, precise labeling | Biology, chemistry, medical |
| Hand-drawn, warm & friendly | Educational, tutorials |
| Flat vector, retro & cute | Creative, children's content |
| Aged-paper, historical styling | Historical, heritage, biography |
| Hand-painted textures, natural warmth | Lifestyle, wellness, travel |
Auto Style Selection
| Content Signals | Selected Style |
|---|---|
| paper, thesis, defense, conference, ieee, acm, icml, neurips, cvpr, acl, aaai, iclr | |
| tutorial, learn, education, guide, intro, beginner | |
| classroom, teaching, school, chalkboard, blackboard | |
| architecture, system, data, analysis, technical | |
| creative, children, kids, cute, illustration | |
| briefing, bilingual, infographic, concept | |
| executive, minimal, clean, simple, elegant | |
| saas, product, dashboard, metrics, productivity | |
| investor, quarterly, business, corporate, proposal | |
| launch, marketing, keynote, bold, impact, magazine | |
| entertainment, music, gaming, creative, atmospheric | |
| explainer, journalism, science communication | |
| story, fantasy, animation, magical, whimsical | |
| gaming, retro, pixel, developer, nostalgia | |
| biology, chemistry, medical, pathway, scientific | |
| history, heritage, vintage, expedition, historical | |
| lifestyle, wellness, travel, artistic, natural | |
| Default | |
Layout Gallery
Optional layout hints for individual slides. Specify in the outline's `// LAYOUT` section.
Slide-Specific Layouts
| Layout | Description | Best For |
|---|---|---|
| Large centered title + subtitle | Cover slides, section breaks |
| Featured quote with attribution | Testimonials, key insights |
| Single large number as focal point | Impact statistics, metrics |
| Half image, half text | Feature highlights, comparisons |
| Grid of icons with labels | Features, capabilities, benefits |
| Content in balanced columns | Paired information, dual points |
| Content in three columns | Triple comparisons, categories |
| Full-bleed image + text overlay | Visual storytelling, emotional |
| Numbered list with highlights | Session overview, roadmap |
| Structured bullet points | Simple content, lists |
Infographic-Derived Layouts
| Layout | Description | Best For |
|---|---|---|
| Sequential flow left-to-right | Timelines, step-by-step |
| Side-by-side A vs B | Before/after, pros-cons |
| Multi-factor grid | Feature comparisons |
| Pyramid or stacked levels | Priority, importance |
| Central node with radiating items | Concept maps, ecosystems |
| Varied-size tiles | Overview, summary |
| Narrowing stages | Conversion, filtering |
| Metrics with charts/numbers | KPIs, data display |
| Overlapping circles | Relationships, intersections |
| Continuous cycle | Recurring processes |
| Curved path with milestones | Journey, timeline |
| Parent-child hierarchy | Org charts, taxonomies |
| Visible vs hidden layers | Surface vs depth |
| Gap with connection | Problem-solution |
Academic-Specific Layouts
| Layout | Description | Best For |
|---|---|---|
| Title, authors, affiliations, venue | Conference paper cover |
| Numbered section list with highlights | Talk structure overview |
| Central architecture/pipeline diagram | Methods, system design |
| Chart area + data annotations | Quantitative results |
| Centered equation + variable definitions | Mathematical derivations |
| 2x2 or 3x2 image comparison grid | Visual results, ablations |
| Numbered citation list | Key references slide |
| Numbered contribution points | Contributions summary |
Usage: Add `Layout: <name>` in the slide's `// LAYOUT` section to guide visual composition.
Design Philosophy
This deck is designed for reading and sharing, not live presentation:
- Each slide must be self-explanatory without verbal commentary
- Structure content for logical flow when scrolling
- Include all necessary context within each slide
- Optimize for social media sharing and offline reading
File Management
Output Directory
Each session creates an independent directory named by content slug:
slide-deck/{topic-slug}/
├── source-{slug}.{ext} # Source files (text, images, etc.)
├── outline.md
├── outline-{style}.md # Style variant outlines
├── prompts/
│ └── 01-slide-cover.md, 02-slide-{slug}.md, ...
├── 01-slide-cover.png, 02-slide-{slug}.png, ...
├── {topic-slug}.pptx
└── {topic-slug}.pdf
Slug Generation:
- Extract main topic from content (2-4 words, kebab-case)
- Example: "Introduction to Machine Learning" → `intro-machine-learning`
Conflict Resolution
If `slide-deck/{topic-slug}/` already exists:
- Append timestamp: `{topic-slug}-YYYYMMDD-HHMMSS`
- Example: `intro-ml` exists → `intro-ml-20260118-143052`
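The conflict rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the skill's actual implementation: the helper name `resolve_output_dir` and its parameters are hypothetical, and only the `YYYYMMDD-HHMMSS` suffix format comes from the text above.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def resolve_output_dir(base: Path, slug: str, now: Optional[datetime] = None) -> Path:
    """Return base/slug, or base/slug-YYYYMMDD-HHMMSS if base/slug already exists.

    Hypothetical helper: the name and signature are illustrative only.
    """
    candidate = base / slug
    if not candidate.exists():
        return candidate
    # Append a timestamp suffix so an earlier deck is never overwritten.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return base / f"{slug}-{stamp}"
```

Passing `now` explicitly keeps the function deterministic in tests; in normal use it can be omitted.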
Source Files
Copy all sources with naming `source-{slug}.{ext}`:
- `source-article.md` (main text content)
- `source-diagram.png` (image from conversation)
- `source-data.xlsx` (additional file)
Multiple sources supported: text, images, files from conversation.
Workflow
Step 1: Analyze Content
- Save source content (if pasted, save as `source.md`)
- Follow `references/analysis-framework.md` for deep content analysis
- Determine style (use `--style` or auto-select from signals)
- Detect languages (source vs. user preference)
- Plan slide count (`--slides` or dynamic)
- For academic papers (PDF with figures): Run automatic figure detection:
bash
npx -y bun ${SKILL_DIR}/scripts/detect-figures.ts --pdf source-paper.pdf --output figures.json
This outputs a JSON file with all detected figures/tables, their page numbers, and captions.
Step 2: Generate Outline Variants
- Generate 3 style variant outlines based on content analysis
- Follow `references/outline-template.md` for structure
- Auto-populate IMAGE_SOURCE for academic papers:
  - Read `figures.json` from Step 1
  - Map figures to slides using rules in `references/analysis-framework.md` Section 8
  - Automatically add `// IMAGE_SOURCE` blocks to appropriate slides:
    - Architecture/pipeline figures → Methods slides (`Source: extract`)
    - Results tables → Quantitative results slides (`Source: extract`)
    - Comparison images → Qualitative results slides (`Source: extract`)
    - Conceptual/simple diagrams → Leave for AI generation (`Source: generate` or omit)
- Save as `outline-{style}.md` for each variant
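The figure-to-source mapping in this step can be read as a small lookup table. The sketch below is only an illustration: the category keys and the function name `image_source_for` are hypothetical labels, and the actual rules live in `references/analysis-framework.md`, which is not reproduced here.

```python
from typing import Dict

# Mapping taken from the Step 2 bullets; the category keys are hypothetical labels.
SOURCE_BY_CATEGORY: Dict[str, str] = {
    "architecture": "extract",   # architecture/pipeline figures -> Methods slides
    "results-table": "extract",  # results tables -> Quantitative results slides
    "comparison": "extract",     # comparison images -> Qualitative results slides
    "conceptual": "generate",    # conceptual/simple diagrams -> AI generation
}


def image_source_for(category: str) -> str:
    """Return the Source value for a figure category, defaulting to AI generation."""
    return SOURCE_BY_CATEGORY.get(category, "generate")
```

Defaulting unknown categories to `generate` mirrors the rule that slides without an IMAGE_SOURCE block go to AI generation.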
Step 3: User Confirmation
Single AskUserQuestion with all applicable options:
| Question | When to Ask |
|---|---|
| Style variant | Always (3 options + custom) |
| Language | Only if source ≠ user language |
After selection:
- Copy selected `outline-{style}.md` to `outline.md`
- Regenerate in different language if requested
- User may edit `outline.md` for fine-tuning
If `--outline-only`, stop here.
Step 4: Generate Prompts
- Read `references/base-prompt.md`
- Combine with style instructions from outline
- Add slide-specific content
- If `Layout:` is specified in outline, include layout guidance in prompt:
  - Reference layout characteristics for image composition
  - Example: `Layout: hub-spoke` → "Central concept in middle with related items radiating outward"
- Save to `prompts/` directory
Step 5: Image Generation Method Selection
Before generating images, ask user to choose generation method:
Use AskUserQuestion with options:
| Option | Label | Description |
|---|---|---|
| 1 | Gemini API (Recommended) | Official Google API via Python. Requires GOOGLE_API_KEY env var. |
| 2 | Gemini Web (Browser-based) | ⚠️ Uses reverse-engineered web API. No API key needed but may break. |
Based on selection:
Option 1: Gemini API (Python)
- Verify API key: Check `GOOGLE_API_KEY` or `GEMINI_API_KEY` environment variable
- Run generation script:
bash
python ${SKILL_DIR}/scripts/generate-slides.py <slide-deck-dir> --model gemini-3-pro-image-preview
Script Features:
- Auto-installs `google-genai` package if missing
- Retry logic with exponential backoff (3 retries)
- Skips already-generated slides (> 10KB)
- Supports custom model via `--model` flag
- Outputs to `slides/` subdirectory
Troubleshooting:
- If server disconnection errors occur, script auto-retries
- For persistent failures, re-run the script (it skips completed slides)
- Check API quota if many failures occur
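The retry and skip behavior listed under Script Features can be sketched as below. This is not the code of `generate-slides.py`; the function names, the 1-second base delay, and the reading of "3 retries" as one initial attempt plus three retries are all assumptions made for illustration.

```python
import time
from pathlib import Path
from typing import Callable

MIN_VALID_BYTES = 10 * 1024  # slides larger than 10KB count as already generated
MAX_RETRIES = 3              # assumed: one initial attempt plus three retries


def is_already_generated(path: Path) -> bool:
    """True if the slide image exists and is larger than 10KB."""
    return path.is_file() and path.stat().st_size > MIN_VALID_BYTES


def generate_with_retry(
    generate: Callable[[], bytes],
    out_path: Path,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Write the slide image, retrying failed calls with exponential backoff.

    Returns False when the slide was skipped, True when it was written.
    """
    if is_already_generated(out_path):
        return False
    for attempt in range(MAX_RETRIES + 1):
        try:
            out_path.write_bytes(generate())
            return True
        except Exception:
            if attempt == MAX_RETRIES:
                raise  # retries exhausted; surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s with the default delay
    return True  # not reached; the loop always returns or raises
```

Because completed slides are skipped on size, re-running after a persistent failure only regenerates the missing or truncated images.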
Option 2: Gemini Web Skill
- Consent Check: Read consent file at:
  - Windows: `$APPDATA/baoyu-skills/gemini-web/consent.json`
  - macOS: `~/Library/Application Support/baoyu-skills/gemini-web/consent.json`
  - Linux: `~/.local/share/baoyu-skills/gemini-web/consent.json`
- If no consent or version mismatch, display disclaimer and ask:
⚠️ DISCLAIMER: This uses a reverse-engineered Gemini Web API (NOT official).
Risks: May break anytime, no support, possible account risk.
- For each slide, run:
bash
npx -y bun ${GEMINI_WEB_SKILL_DIR}/scripts/main.ts \
  --promptfiles prompts/01-slide-cover.md \
  --image 01-slide-cover.png \
  --sessionId slides-{topic-slug}-{timestamp}
Where `GEMINI_WEB_SKILL_DIR` = path to `baoyu-danger-gemini-web` skill directory.
- Proxy support: If user is in restricted network, prepend:
bash
HTTP_PROXY=http://127.0.0.1:7890 HTTPS_PROXY=http://127.0.0.1:7890
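The per-platform consent file location can be resolved with a short helper. The sketch below only computes the path; it does not read or validate the file, since the consent file's schema is not described here. The function name `consent_path` is hypothetical, and falling back to the current directory when `APPDATA` is unset is an assumption.

```python
import os
import sys
from pathlib import Path

CONSENT_SUFFIX = Path("baoyu-skills") / "gemini-web" / "consent.json"


def consent_path(platform: str = sys.platform) -> Path:
    """Return the consent file location for the given sys.platform value."""
    if platform.startswith("win"):
        # Windows: $APPDATA (assumed fallback: current directory if unset)
        root = Path(os.environ.get("APPDATA", ""))
    elif platform == "darwin":
        # macOS: ~/Library/Application Support
        root = Path.home() / "Library" / "Application Support"
    else:
        # Linux and other platforms: ~/.local/share
        root = Path.home() / ".local" / "share"
    return root / CONSENT_SUFFIX
```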
Step 5.5: Process IMAGE_SOURCE (Automatic Figure Extraction)
For academic presentations, IMAGE_SOURCE metadata was auto-populated in Step 2 based on figure detection from Step 1.
Automatic Execution:
- Parse outline to identify slides with `Source: extract`
- Create figures directory: `mkdir -p figures`
- For each extract slide, automatically:
  - Read the Figure number, Page, and Caption from metadata
  - Run figure extraction script:
bash
npx -y bun ${SKILL_DIR}/scripts/extract-figure.ts \
  --pdf source-paper.pdf \
  --page <page-number> \
  --output figures/figure-<N>.png
  - Run template application script:
bash
npx -y bun ${SKILL_DIR}/scripts/apply-template.ts \
  --figure figures/figure-<N>.png \
  --title "<slide-headline>" \
  --caption "Figure <N>: <caption-text>" \
  --output <NN>-slide-<slug>.png
  - Report: "Extracted: Figure N → slide NN"
- For slides with `Source: generate` (or no IMAGE_SOURCE):
  - Proceed to Step 6 for AI generation
Note: Source PDF must be saved as `source-paper.pdf` in output directory.
Troubleshooting:
- If figure detection missed a figure: manually add `// IMAGE_SOURCE` block to outline
- If wrong figure mapped: edit the `Figure:` and `Page:` values in outline
- If extraction fails: check PDF page number (1-indexed)
PyMuPDF Fallback for Page Extraction:
If `extract-figure.ts` fails with "Image or Canvas expected" error (common with complex PDFs), use PyMuPDF:
python
import os
import fitz  # PyMuPDF

page_num = 1  # example value; use the 1-indexed page from the outline's Page: field
os.makedirs("extracted", exist_ok=True)  # pix.save does not create directories
doc = fitz.open("source-paper.pdf")
page = doc[page_num - 1]  # PyMuPDF pages are 0-indexed
mat = fitz.Matrix(3, 3)  # 3x scale for high resolution
pix = page.get_pixmap(matrix=mat)
pix.save(f"extracted/page-{page_num}.png")
Then apply template using `apply-template.ts`.
Step 6: Generate Images
- Use selected method from Step 5
- Skip slides already processed in Step 5.5 (those with `Source: extract`)
- Generate session ID: `slides-{topic-slug}-{timestamp}`
- Generate each remaining slide with same session ID
- Report progress: "Generated X/N"
- Auto-retry once on generation failure
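The session ID and the skip rule can be sketched as two small helpers. This is an illustration under assumptions: the function names are hypothetical, slides are modeled as plain dicts with hypothetical `name` and `source` keys, and the timestamp format is assumed to match the `YYYYMMDD-HHMMSS` pattern used for output directories, which is not stated for session IDs.

```python
from datetime import datetime
from typing import Dict, List, Optional


def make_session_id(topic_slug: str, now: Optional[datetime] = None) -> str:
    """Build slides-{topic-slug}-{timestamp}; the timestamp format is assumed."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return f"slides-{topic_slug}-{stamp}"


def slides_to_generate(slides: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep slides that were not already produced by figure extraction in Step 5.5."""
    return [s for s in slides if s.get("source", "generate") != "extract"]
```

Generating the ID once and reusing it for every remaining slide is what keeps the style consistent across the deck.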
Step 7: Merge to PPTX and PDF
bash
npx -y bun ${SKILL_DIR}/scripts/merge-to-pptx.ts <slide-deck-dir>
npx -y bun ${SKILL_DIR}/scripts/merge-to-pdf.ts <slide-deck-dir>
Step 8: Output Summary
Slide Deck Complete!
Topic: [topic]
Style: [style name]
Location: [directory path]
Slides: N total
- 01-slide-cover.png ✓ Cover
- 02-slide-intro.png ✓ Content
- ...
- {NN}-slide-back-cover.png ✓ Back Cover
Outline: outline.md
PPTX: {topic-slug}.pptx
PDF: {topic-slug}.pdf
Slide Modification
See `references/modification-guide.md` for:
- Edit single slide workflow
- Add new slide (with renumbering)
- Delete slide (with renumbering)
- File naming conventions
Image Generation Dependencies
Gemini API (Option 1 - Recommended)
Requires:
- `GOOGLE_API_KEY` or `GEMINI_API_KEY` environment variable
- Python 3.8+ with pip
- `google-genai` package (auto-installed by script)
Model: `gemini-3-pro-image-preview` (default)
Gemini Web Skill (Option 2)
Requires:
- `baoyu-danger-gemini-web` skill installed at `.claude/skills/baoyu-danger-gemini-web`
- Google Chrome browser with logged-in Google account
- User consent for reverse-engineered API disclaimer
PDF Figure Extraction
Requires:
- Primary: `pdfjs-dist` npm package (use legacy build for Node.js)
- Fallback: `pymupdf` Python package (more reliable for complex PDFs)
- `canvas` npm package for apply-template.ts
References
| File | Content |
|---|---|
| Deep content analysis for presentations |
| Outline structure and STYLE_INSTRUCTIONS format |
| Edit, add, delete slide workflows |
| Content and style guidelines |
| Base prompt for image generation |
| Visual specs for extracted figure containers |
| Full style specifications |
Notes
Image Generation
- Nano Banana Pro API: Recommended. Stable, reliable, requires API key
- Gemini Web: No API key needed, but uses reverse-engineered API with account risk
- Generation time: 10-30 seconds per slide
- Auto-retry once on generation failure
- Maintain style consistency via session ID
Content Guidelines
- Use stylized alternatives for sensitive public figures
- Both methods use the same underlying Gemini model for image generation
Extension Support
Custom styles and configurations via EXTEND.md.
Check paths (priority order):
- `.paper-skills/paper-slide-deck/EXTEND.md` (project)
- `~/.paper-skills/paper-slide-deck/EXTEND.md` (user)
If found, load before Step 1. Extension content overrides defaults.
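The lookup order can be sketched as a first-match search. This is a minimal illustration, not the skill's loader: the function name `find_extend_file` and its parameters are hypothetical, and only the two paths and their priority come from the text above.

```python
from pathlib import Path
from typing import Optional

EXTEND_RELATIVE = Path(".paper-skills") / "paper-slide-deck" / "EXTEND.md"


def find_extend_file(project_dir: Path, home_dir: Optional[Path] = None) -> Optional[Path]:
    """Return the first existing EXTEND.md: project-level first, then user-level."""
    home = home_dir if home_dir is not None else Path.home()
    for root in (project_dir, home):
        candidate = root / EXTEND_RELATIVE
        if candidate.is_file():
            return candidate
    return None  # no extension file; defaults apply
```

A project-level file shadows the user-level one entirely in this sketch; merging the two would be a different design and is not described here.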