paper-slide-deck

Paper Slide Deck Generator

Transform academic papers and content into professional slide deck images with automatic figure extraction.

Usage

bash
/paper-slide-deck path/to/paper.pdf
/paper-slide-deck path/to/paper.pdf --style academic-paper
/paper-slide-deck path/to/content.md --style sketch-notes
/paper-slide-deck path/to/content.md --audience executives
/paper-slide-deck path/to/content.md --lang zh
/paper-slide-deck path/to/content.md --slides 10
/paper-slide-deck path/to/content.md --outline-only
/paper-slide-deck  # Then paste content

Script Directory

Important: All scripts are located in the `scripts/` subdirectory of this skill.
Agent Execution Instructions:
  1. Determine this SKILL.md file's directory path as `SKILL_DIR`
  2. Script path = `${SKILL_DIR}/scripts/<script-name>.ts` (`generate-slides.py` is the one Python script)
  3. Replace all `${SKILL_DIR}` in this document with the actual path
Script Reference:
| Script | Purpose |
| --- | --- |
| `scripts/generate-slides.py` | Generate AI slides via Gemini API (Python) |
| `scripts/merge-to-pptx.ts` | Merge slides into PowerPoint |
| `scripts/merge-to-pdf.ts` | Merge slides into PDF |
| `scripts/detect-figures.ts` | Auto-detect figures/tables in PDF |
| `scripts/extract-figure.ts` | Extract figure from PDF page (uses PyMuPDF fallback) |
| `scripts/apply-template.ts` | Apply figure container template |

Options

| Option | Description |
| --- | --- |
| `--style <name>` | Visual style (see Style Gallery) |
| `--audience <type>` | Target audience: beginners, intermediate, experts, executives, general |
| `--lang <code>` | Output language (en, zh, ja, etc.) |
| `--slides <number>` | Target slide count |
| `--outline-only` | Generate outline only, skip image generation |
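
The skill is invoked as a slash command, so it may not use a real argument parser at all. Purely to illustrate the option semantics listed above, here is a minimal sketch using Python's `argparse`; the `build_parser` helper and the positional `source` name are assumptions, not part of the skill.

```python
import argparse

# Audience values are taken from the options table above.
AUDIENCES = ["beginners", "intermediate", "experts", "executives", "general"]


def build_parser():
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(prog="paper-slide-deck")
    # Omitting the source corresponds to the "then paste content" usage.
    parser.add_argument("source", nargs="?", help="Path to a PDF or Markdown file")
    parser.add_argument("--style", help="Visual style (see Style Gallery)")
    parser.add_argument("--audience", choices=AUDIENCES, help="Target audience")
    parser.add_argument("--lang", help="Output language code, e.g. en, zh, ja")
    parser.add_argument("--slides", type=int, help="Target slide count")
    parser.add_argument("--outline-only", action="store_true",
                        help="Generate outline only, skip image generation")
    return parser


args = build_parser().parse_args(["paper.pdf", "--style", "academic-paper", "--slides", "10"])
print(args.style, args.slides, args.outline_only)  # academic-paper 10 False
```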

Style Gallery

| Style | Description | Best For |
| --- | --- | --- |
| `academic-paper` | Clean professional, precise charts | Conference talks, thesis defense |
| `blueprint` (Default) | Technical schematics, grid texture | Architecture, system design |
| `chalkboard` | Black chalkboard, colorful chalk | Education, tutorials, classroom |
| `notion` | SaaS dashboard, card-based layouts | Product demos, SaaS, B2B |
| `bold-editorial` | Magazine cover, bold typography, dark | Product launches, keynotes |
| `corporate` | Navy/gold, structured layouts | Investor decks, proposals |
| `dark-atmospheric` | Cinematic dark mode, glowing accents | Entertainment, gaming |
| `editorial-infographic` | Magazine explainers, flat illustrations | Tech explainers, research |
| `fantasy-animation` | Ghibli/Disney style, hand-drawn | Educational, storytelling |
| `intuition-machine` | Technical briefing, bilingual labels | Technical docs, academic |
| `minimal` | Ultra-clean, maximum whitespace | Executive briefings, premium |
| `pixel-art` | Retro 8-bit, chunky pixels | Gaming, developer talks |
| `scientific` | Academic diagrams, precise labeling | Biology, chemistry, medical |
| `sketch-notes` | Hand-drawn, warm & friendly | Educational, tutorials |
| `vector-illustration` | Flat vector, retro & cute | Creative, children's content |
| `vintage` | Aged-paper, historical styling | Historical, heritage, biography |
| `watercolor` | Hand-painted textures, natural warmth | Lifestyle, wellness, travel |

Auto Style Selection

| Content Signals | Selected Style |
| --- | --- |
| paper, thesis, defense, conference, ieee, acm, icml, neurips, cvpr, acl, aaai, iclr | `academic-paper` |
| tutorial, learn, education, guide, intro, beginner | `sketch-notes` |
| classroom, teaching, school, chalkboard, blackboard | `chalkboard` |
| architecture, system, data, analysis, technical | `blueprint` |
| creative, children, kids, cute, illustration | `vector-illustration` |
| briefing, bilingual, infographic, concept | `intuition-machine` |
| executive, minimal, clean, simple, elegant | `minimal` |
| saas, product, dashboard, metrics, productivity | `notion` |
| investor, quarterly, business, corporate, proposal | `corporate` |
| launch, marketing, keynote, bold, impact, magazine | `bold-editorial` |
| entertainment, music, gaming, creative, atmospheric | `dark-atmospheric` |
| explainer, journalism, science communication | `editorial-infographic` |
| story, fantasy, animation, magical, whimsical | `fantasy-animation` |
| gaming, retro, pixel, developer, nostalgia | `pixel-art` |
| biology, chemistry, medical, pathway, scientific | `scientific` |
| history, heritage, vintage, expedition, historical | `vintage` |
| lifestyle, wellness, travel, artistic, natural | `watercolor` |
| Default | `blueprint` |
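
The table does not say how competing signals are weighed (for example, `gaming` appears under two styles). One possible reading, sketched below, counts keyword hits per style and keeps the first style with the highest count, falling back to `blueprint`. The scoring rule and the `select_style` name are assumptions, not the skill's documented behavior.

```python
import re

# (style, signals) pairs in the same order as the table above.
STYLE_SIGNALS = [
    ("academic-paper", ["paper", "thesis", "defense", "conference", "ieee", "acm",
                        "icml", "neurips", "cvpr", "acl", "aaai", "iclr"]),
    ("sketch-notes", ["tutorial", "learn", "education", "guide", "intro", "beginner"]),
    ("chalkboard", ["classroom", "teaching", "school", "chalkboard", "blackboard"]),
    ("blueprint", ["architecture", "system", "data", "analysis", "technical"]),
    ("vector-illustration", ["creative", "children", "kids", "cute", "illustration"]),
    ("intuition-machine", ["briefing", "bilingual", "infographic", "concept"]),
    ("minimal", ["executive", "minimal", "clean", "simple", "elegant"]),
    ("notion", ["saas", "product", "dashboard", "metrics", "productivity"]),
    ("corporate", ["investor", "quarterly", "business", "corporate", "proposal"]),
    ("bold-editorial", ["launch", "marketing", "keynote", "bold", "impact", "magazine"]),
    ("dark-atmospheric", ["entertainment", "music", "gaming", "creative", "atmospheric"]),
    ("editorial-infographic", ["explainer", "journalism", "science communication"]),
    ("fantasy-animation", ["story", "fantasy", "animation", "magical", "whimsical"]),
    ("pixel-art", ["gaming", "retro", "pixel", "developer", "nostalgia"]),
    ("scientific", ["biology", "chemistry", "medical", "pathway", "scientific"]),
    ("vintage", ["history", "heritage", "vintage", "expedition", "historical"]),
    ("watercolor", ["lifestyle", "wellness", "travel", "artistic", "natural"]),
]
DEFAULT_STYLE = "blueprint"


def select_style(text):
    """Return the style with the most signal hits (assumed rule); ties keep table order."""
    lowered = text.lower()
    words = set(re.findall(r"[a-z0-9]+", lowered))
    best_style, best_hits = DEFAULT_STYLE, 0
    for style, signals in STYLE_SIGNALS:
        # Multi-word signals are matched as substrings, single words as whole words.
        hits = sum(1 for s in signals if (s in lowered if " " in s else s in words))
        if hits > best_hits:
            best_style, best_hits = style, hits
    return best_style


print(select_style("A NeurIPS conference paper on diffusion models"))  # academic-paper
```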

Layout Gallery

Optional layout hints for individual slides. Specify in the outline's `// LAYOUT` section.

Slide-Specific Layouts

| Layout | Description | Best For |
| --- | --- | --- |
| `title-hero` | Large centered title + subtitle | Cover slides, section breaks |
| `quote-callout` | Featured quote with attribution | Testimonials, key insights |
| `key-stat` | Single large number as focal point | Impact statistics, metrics |
| `split-screen` | Half image, half text | Feature highlights, comparisons |
| `icon-grid` | Grid of icons with labels | Features, capabilities, benefits |
| `two-columns` | Content in balanced columns | Paired information, dual points |
| `three-columns` | Content in three columns | Triple comparisons, categories |
| `image-caption` | Full-bleed image + text overlay | Visual storytelling, emotional |
| `agenda` | Numbered list with highlights | Session overview, roadmap |
| `bullet-list` | Structured bullet points | Simple content, lists |

Infographic-Derived Layouts

| Layout | Description | Best For |
| --- | --- | --- |
| `linear-progression` | Sequential flow left-to-right | Timelines, step-by-step |
| `binary-comparison` | Side-by-side A vs B | Before/after, pros-cons |
| `comparison-matrix` | Multi-factor grid | Feature comparisons |
| `hierarchical-layers` | Pyramid or stacked levels | Priority, importance |
| `hub-spoke` | Central node with radiating items | Concept maps, ecosystems |
| `bento-grid` | Varied-size tiles | Overview, summary |
| `funnel` | Narrowing stages | Conversion, filtering |
| `dashboard` | Metrics with charts/numbers | KPIs, data display |
| `venn-diagram` | Overlapping circles | Relationships, intersections |
| `circular-flow` | Continuous cycle | Recurring processes |
| `winding-roadmap` | Curved path with milestones | Journey, timeline |
| `tree-branching` | Parent-child hierarchy | Org charts, taxonomies |
| `iceberg` | Visible vs hidden layers | Surface vs depth |
| `bridge` | Gap with connection | Problem-solution |

Academic-Specific Layouts

| Layout | Description | Best For |
| --- | --- | --- |
| `paper-title` | Title, authors, affiliations, venue | Conference paper cover |
| `outline-agenda` | Numbered section list with highlights | Talk structure overview |
| `methods-diagram` | Central architecture/pipeline diagram | Methods, system design |
| `results-chart` | Chart area + data annotations | Quantitative results |
| `equation-focus` | Centered equation + variable definitions | Mathematical derivations |
| `qualitative-grid` | 2x2 or 3x2 image comparison grid | Visual results, ablations |
| `references-list` | Numbered citation list | Key references slide |
| `contributions` | Numbered contribution points | Contributions summary |
Usage: Add `Layout: <name>` in the slide's `// LAYOUT` section to guide visual composition.

Design Philosophy

This deck is designed for reading and sharing, not live presentation:
  • Each slide must be self-explanatory without verbal commentary
  • Structure content for logical flow when scrolling
  • Include all necessary context within each slide
  • Optimize for social media sharing and offline reading

File Management

Output Directory

Each session creates an independent directory named by content slug:
slide-deck/{topic-slug}/
├── source-{slug}.{ext}    # Source files (text, images, etc.)
├── outline.md
├── outline-{style}.md     # Style variant outlines
├── prompts/
│   └── 01-slide-cover.md, 02-slide-{slug}.md, ...
├── 01-slide-cover.png, 02-slide-{slug}.png, ...
├── {topic-slug}.pptx
└── {topic-slug}.pdf
Slug Generation:
  1. Extract main topic from content (2-4 words, kebab-case)
  2. Example: "Introduction to Machine Learning" → `intro-machine-learning`
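
In the skill, the agent picks the topic words itself, so there is no fixed algorithm for slug generation. As a rough illustration of the kebab-case rule, the sketch below drops common stop words, applies a small abbreviation map, and caps the result at four words. The `make_slug` name, the stop-word list, and the abbreviation map are all hypothetical.

```python
import re

# Hypothetical helper data; the skill does not define either list.
STOP_WORDS = {"a", "an", "the", "to", "of", "for", "and", "in", "on", "with"}
ABBREVIATIONS = {"introduction": "intro"}


def make_slug(title, max_words=4):
    """Build a kebab-case slug of at most `max_words` words from a title."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    kept = [ABBREVIATIONS.get(w, w) for w in words if w not in STOP_WORDS]
    return "-".join(kept[:max_words])


print(make_slug("Introduction to Machine Learning"))  # intro-machine-learning
```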

Conflict Resolution

If `slide-deck/{topic-slug}/` already exists:
  • Append timestamp: `{topic-slug}-YYYYMMDD-HHMMSS`
  • Example: `intro-ml` exists → `intro-ml-20260118-143052`
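
The timestamp rule can be sketched as follows. The `resolve_output_dir` helper and its parameters are illustrative assumptions; only the `YYYYMMDD-HHMMSS` suffix format comes from the text above.

```python
from datetime import datetime
from pathlib import Path


def resolve_output_dir(base, slug, now=None):
    """Return base/slug, or base/slug-YYYYMMDD-HHMMSS if base/slug already exists."""
    target = Path(base) / slug
    if not target.exists():
        return target
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return Path(base) / f"{slug}-{stamp}"
```

Passing `now` explicitly keeps the function deterministic in tests; in normal use it defaults to the current time.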

Source Files

Copy all sources using the naming pattern `source-{slug}.{ext}`:
  • `source-article.md` (main text content)
  • `source-diagram.png` (image from conversation)
  • `source-data.xlsx` (additional file)
Multiple sources supported: text, images, files from conversation.

Workflow

Step 1: Analyze Content

  1. Save source content (if pasted, save as `source.md`)
  2. Follow `references/analysis-framework.md` for deep content analysis
  3. Determine style (use `--style` or auto-select from signals)
  4. Detect languages (source vs. user preference)
  5. Plan slide count (`--slides` or dynamic)
  6. For academic papers (PDF with figures): Run automatic figure detection:
    bash
    npx -y bun ${SKILL_DIR}/scripts/detect-figures.ts --pdf source-paper.pdf --output figures.json
    This outputs a JSON file with all detected figures/tables, their page numbers, and captions.

Step 2: Generate Outline Variants

  1. Generate 3 style variant outlines based on content analysis
  2. Follow `references/outline-template.md` for structure
  3. Auto-populate IMAGE_SOURCE for academic papers:
    • Read `figures.json` from Step 1
    • Map figures to slides using rules in `references/analysis-framework.md` Section 8
    • Automatically add `// IMAGE_SOURCE` blocks to appropriate slides:
      • Architecture/pipeline figures → Methods slides (`Source: extract`)
      • Results tables → Quantitative results slides (`Source: extract`)
      • Comparison images → Qualitative results slides (`Source: extract`)
      • Conceptual/simple diagrams → Leave for AI generation (`Source: generate` or omit)
  4. Save as `outline-{style}.md` for each variant
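
The figure-to-slide mapping in item 3 amounts to a small lookup. The sketch below shows one way to express it; the category keys and slide labels are hypothetical, since the actual fields in `figures.json` and the rules in `references/analysis-framework.md` are not shown here.

```python
# Hypothetical figure categories mapped to the slide type that should show them.
EXTRACT_TARGETS = {
    "architecture": "methods",
    "pipeline": "methods",
    "results-table": "quantitative-results",
    "comparison": "qualitative-results",
}


def image_source_for(kind):
    """Return (target_slide, source) for a figure category."""
    target = EXTRACT_TARGETS.get(kind)
    if target is None:
        # Conceptual or simple diagrams are left for AI generation.
        return (None, "generate")
    return (target, "extract")


print(image_source_for("pipeline"))    # ('methods', 'extract')
print(image_source_for("conceptual"))  # (None, 'generate')
```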

Step 3: User Confirmation

Single AskUserQuestion with all applicable options:
| Question | When to Ask |
| --- | --- |
| Style variant | Always (3 options + custom) |
| Language | Only if source ≠ user language |
After selection:
  • Copy selected `outline-{style}.md` to `outline.md`
  • Regenerate in different language if requested
  • User may edit `outline.md` for fine-tuning
If `--outline-only`, stop here.

Step 4: Generate Prompts

  1. Read `references/base-prompt.md`
  2. Combine with style instructions from outline
  3. Add slide-specific content
  4. If `Layout:` is specified in the outline, include layout guidance in the prompt:
    • Reference layout characteristics for image composition
    • Example: `Layout: hub-spoke` → "Central concept in middle with related items radiating outward"
  5. Save to `prompts/` directory
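
The prompt assembly above can be pictured as simple string concatenation. This is a sketch under assumptions: the `build_prompt` helper, the section ordering, and the blank-line separators are illustrative, and only the `hub-spoke` hint is taken from the text.

```python
# Only the hub-spoke hint comes from the document; other layouts would need their own text.
LAYOUT_HINTS = {
    "hub-spoke": "Central concept in middle with related items radiating outward",
}


def build_prompt(base_prompt, style_instructions, slide_content, layout=None):
    """Join base prompt, style instructions, and slide content; add layout guidance if known."""
    parts = [base_prompt.strip(), style_instructions.strip(), slide_content.strip()]
    hint = LAYOUT_HINTS.get(layout) if layout else None
    if hint:
        parts.append(f"Layout: {layout}\n{hint}")
    return "\n\n".join(parts) + "\n"
```

The result would then be written to a file such as `prompts/01-slide-cover.md`.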

Step 5: Image Generation Method Selection

Before generating images, ask user to choose generation method:
Use AskUserQuestion with options:
| Option | Label | Description |
| --- | --- | --- |
| 1 | Gemini API (Recommended) | Official Google API via Python. Requires `GOOGLE_API_KEY` env var. |
| 2 | Gemini Web (Browser-based) | ⚠️ Uses reverse-engineered web API. No API key needed but may break. |
Based on selection:

Option 1: Gemini API (Python)

  1. Verify API key: Check `GOOGLE_API_KEY` or `GEMINI_API_KEY` environment variable
  2. Run generation script:
    bash
    python ${SKILL_DIR}/scripts/generate-slides.py <slide-deck-dir> --model gemini-3-pro-image-preview
Script Features:
  • Auto-installs `google-genai` package if missing
  • Retry logic with exponential backoff (3 retries)
  • Skips already-generated slides (> 10KB)
  • Supports custom model via `--model` flag
  • Outputs to `slides/` subdirectory
Troubleshooting:
  • If server disconnection errors occur, script auto-retries
  • For persistent failures, re-run the script (it skips completed slides)
  • Check API quota if many failures occur
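
The retry and skip behavior listed under Script Features might look roughly like the sketch below. It is not the code in `generate-slides.py`: the helper names, the 1-second base delay, and the doubling schedule are assumptions, and only the 3 retries and the 10KB threshold come from the text.

```python
import time
from pathlib import Path

MIN_BYTES = 10 * 1024  # slides larger than 10KB count as already generated


def needs_generation(png_path):
    """True unless a slide image larger than 10KB already exists at png_path."""
    p = Path(png_path)
    return not (p.exists() and p.stat().st_size > MIN_BYTES)


def with_retries(fn, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); after a failure wait base_delay * 2**attempt, for up to `retries` retries."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)
```

Because completed slides are skipped, re-running after a persistent failure only regenerates the slides that are still missing.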

Option 2: Gemini Web Skill

  1. Consent Check: Read consent file at:
    • Windows: `$APPDATA/baoyu-skills/gemini-web/consent.json`
    • macOS: `~/Library/Application Support/baoyu-skills/gemini-web/consent.json`
    • Linux: `~/.local/share/baoyu-skills/gemini-web/consent.json`
  2. If no consent or version mismatch, display disclaimer and ask:
    ⚠️ DISCLAIMER: This uses a reverse-engineered Gemini Web API (NOT official).
    Risks: May break anytime, no support, possible account risk.
  3. For each slide, run:
    bash
    npx -y bun ${GEMINI_WEB_SKILL_DIR}/scripts/main.ts \
      --promptfiles prompts/01-slide-cover.md \
      --image 01-slide-cover.png \
      --sessionId slides-{topic-slug}-{timestamp}
    Where `GEMINI_WEB_SKILL_DIR` = path to the `baoyu-danger-gemini-web` skill directory.
  4. Proxy support: If user is in restricted network, prepend:
    bash
    HTTP_PROXY=http://127.0.0.1:7890 HTTPS_PROXY=http://127.0.0.1:7890
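
Resolving the consent file location per platform can be sketched as below. The `consent_path` helper and its parameters are assumptions for illustration; the three paths are the ones listed in step 1. The contents of `consent.json` are not described here, so the sketch stops at locating the file.

```python
import os
import sys
from pathlib import Path


def consent_path(platform=sys.platform, env=os.environ, home=None):
    """Return the platform-specific location of the gemini-web consent file."""
    home = Path(home) if home else Path.home()
    if platform.startswith("win"):
        base = Path(env["APPDATA"])
    elif platform == "darwin":
        base = home / "Library" / "Application Support"
    else:
        base = home / ".local" / "share"
    return base / "baoyu-skills" / "gemini-web" / "consent.json"
```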

Step 5.5: Process IMAGE_SOURCE (Automatic Figure Extraction)

For academic presentations, IMAGE_SOURCE metadata was auto-populated in Step 2 based on figure detection from Step 1.
Automatic Execution:
  1. Parse outline to identify slides with `Source: extract`
  2. Create figures directory:
    mkdir -p figures
  3. For each extract slide, automatically:
    • Read the Figure number, Page, and Caption from metadata
    • Run figure extraction script:
      bash
      npx -y bun ${SKILL_DIR}/scripts/extract-figure.ts \
        --pdf source-paper.pdf \
        --page <page-number> \
        --output figures/figure-<N>.png
    • Run template application script:
      bash
      npx -y bun ${SKILL_DIR}/scripts/apply-template.ts \
        --figure figures/figure-<N>.png \
        --title "<slide-headline>" \
        --caption "Figure <N>: <caption-text>" \
        --output <NN>-slide-<slug>.png
    • Report: "Extracted: Figure N → slide NN"
  4. For slides with `Source: generate` (or no IMAGE_SOURCE):
    • Proceed to Step 6 for AI generation
Note: Source PDF must be saved as `source-paper.pdf` in output directory.
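
For one extracted figure, the two commands in step 3 can be assembled from the slide metadata as argument lists. The `extraction_commands` helper and its parameter names are hypothetical; the script names and flags are the ones shown in the commands above.

```python
def extraction_commands(skill_dir, figure_n, page, headline, caption, slide_file,
                        pdf="source-paper.pdf"):
    """Return (extract_cmd, template_cmd) argument lists for one figure."""
    figure_png = f"figures/figure-{figure_n}.png"
    extract = ["npx", "-y", "bun", f"{skill_dir}/scripts/extract-figure.ts",
               "--pdf", pdf, "--page", str(page), "--output", figure_png]
    template = ["npx", "-y", "bun", f"{skill_dir}/scripts/apply-template.ts",
                "--figure", figure_png, "--title", headline,
                "--caption", f"Figure {figure_n}: {caption}", "--output", slide_file]
    return extract, template
```

Each list could be passed to `subprocess.run(cmd, check=True)` from the output directory, which avoids shell quoting problems when the title or caption contains spaces or quotes.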
Troubleshooting:
  • If figure detection missed a figure: manually add `// IMAGE_SOURCE` block to outline
  • If wrong figure mapped: edit the `Figure:` and `Page:` values in outline
  • If extraction fails: check PDF page number (1-indexed)
PyMuPDF Fallback for Page Extraction: If `extract-figure.ts` fails with an "Image or Canvas expected" error (common with complex PDFs), use PyMuPDF:
python
from pathlib import Path

import fitz  # PyMuPDF (pip install pymupdf)

page_num = 3  # example value: 1-indexed page that contains the figure
Path("extracted").mkdir(exist_ok=True)  # output directory must exist before saving

doc = fitz.open("source-paper.pdf")
page = doc[page_num - 1]  # PyMuPDF pages are 0-indexed
mat = fitz.Matrix(3, 3)  # 3x scale for high resolution
pix = page.get_pixmap(matrix=mat)
pix.save(f"extracted/page-{page_num}.png")
Then apply the template using `apply-template.ts`.

Step 6: Generate Images

  1. Use selected method from Step 5
  2. Skip slides already processed in Step 5.5 (those with `Source: extract`)
  3. Generate session ID: `slides-{topic-slug}-{timestamp}`
  4. Generate each remaining slide with same session ID
  5. Report progress: "Generated X/N"
  6. Auto-retry once on generation failure
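
The session ID and the skip rule from the steps above can be sketched as follows. The timestamp format is not specified in this step, so the sketch assumes the `YYYYMMDD-HHMMSS` form used for directory names; both helper names are hypothetical.

```python
from datetime import datetime


def session_id(topic_slug, now=None):
    """Build slides-{topic-slug}-{timestamp}; the timestamp format is an assumption."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return f"slides-{topic_slug}-{stamp}"


def remaining_slides(slides):
    """slides: list of (filename, source) pairs; keep everything not marked 'extract'."""
    return [name for name, source in slides if source != "extract"]


deck = [("01-slide-cover.png", None),
        ("02-slide-methods.png", "extract"),
        ("03-slide-idea.png", "generate")]
todo = remaining_slides(deck)
for done, name in enumerate(todo, start=1):
    print(f"Generated {done}/{len(todo)}: {name}")
```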

Step 7: Merge to PPTX and PDF

bash
npx -y bun ${SKILL_DIR}/scripts/merge-to-pptx.ts <slide-deck-dir>
npx -y bun ${SKILL_DIR}/scripts/merge-to-pdf.ts <slide-deck-dir>

Step 8: Output Summary

Slide Deck Complete!

Topic: [topic]
Style: [style name]
Location: [directory path]
Slides: N total

- 01-slide-cover.png ✓ Cover
- 02-slide-intro.png ✓ Content
- ...
- {NN}-slide-back-cover.png ✓ Back Cover

Outline: outline.md
PPTX: {topic-slug}.pptx
PDF: {topic-slug}.pdf

Slide Modification

See `references/modification-guide.md` for:
  • Edit single slide workflow
  • Add new slide (with renumbering)
  • Delete slide (with renumbering)
  • File naming conventions

Image Generation Dependencies

Gemini API (Option 1 - Recommended)

Requires:
  • `GOOGLE_API_KEY` or `GEMINI_API_KEY` environment variable
  • Python 3.8+ with pip
  • `google-genai` package (auto-installed by script)
Model: `gemini-3-pro-image-preview` (default)
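
Before running the generation script, the key check can be done with a few lines of Python. The `find_api_key` helper is hypothetical, and checking `GOOGLE_API_KEY` before `GEMINI_API_KEY` is an assumption; the document does not say which one wins when both are set.

```python
import os


def find_api_key(env=os.environ):
    """Return (variable_name, value) for the first non-empty key, or None if neither is set."""
    for name in ("GOOGLE_API_KEY", "GEMINI_API_KEY"):
        value = env.get(name)
        if value:
            return name, value
    return None


if find_api_key() is None:
    print("Set GOOGLE_API_KEY or GEMINI_API_KEY before running generate-slides.py")
```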

Gemini Web Skill (Option 2)

Requires:
  • `baoyu-danger-gemini-web` skill installed at `.claude/skills/baoyu-danger-gemini-web`
  • Google Chrome browser with logged-in Google account
  • User consent for reverse-engineered API disclaimer

PDF Figure Extraction

Requires:
  • Primary: `pdfjs-dist` npm package (use legacy build for Node.js)
  • Fallback: `pymupdf` Python package (more reliable for complex PDFs)
  • `canvas` npm package for `apply-template.ts`

References

| File | Content |
| --- | --- |
| `references/analysis-framework.md` | Deep content analysis for presentations |
| `references/outline-template.md` | Outline structure and STYLE_INSTRUCTIONS format |
| `references/modification-guide.md` | Edit, add, delete slide workflows |
| `references/content-rules.md` | Content and style guidelines |
| `references/base-prompt.md` | Base prompt for image generation |
| `references/figure-container-template.md` | Visual specs for extracted figure containers |
| `references/styles/<style>.md` | Full style specifications |

Notes

Image Generation

  • Gemini API (Nano Banana Pro): Recommended. Stable, reliable, requires API key
  • Gemini Web: No API key needed, but uses reverse-engineered API with account risk
  • Generation time: 10-30 seconds per slide
  • Auto-retry once on generation failure
  • Maintain style consistency via session ID

Content Guidelines

  • Use stylized alternatives for sensitive public figures
  • Both methods use the same underlying Gemini model for image generation

Extension Support

Custom styles and configurations via EXTEND.md.
Check paths (priority order):
  1. `.paper-skills/paper-slide-deck/EXTEND.md` (project)
  2. `~/.paper-skills/paper-slide-deck/EXTEND.md` (user)
If found, load before Step 1. Extension content overrides defaults.
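
The priority lookup can be sketched as below. The `find_extend_file` helper is illustrative, not part of the skill; the two paths and their order come from the list above.

```python
from pathlib import Path


def find_extend_file(project_root=".", home=None):
    """Return the first existing EXTEND.md (project first, then user), or None."""
    home = Path(home) if home else Path.home()
    candidates = [
        Path(project_root) / ".paper-skills" / "paper-slide-deck" / "EXTEND.md",
        home / ".paper-skills" / "paper-slide-deck" / "EXTEND.md",
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None
```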