paper-slide-deck

Paper Slide Deck Generator

Transform academic papers and content into professional slide deck images with automatic figure extraction.

Usage

```bash
/paper-slide-deck path/to/paper.pdf
/paper-slide-deck path/to/paper.pdf --style academic-paper
/paper-slide-deck path/to/content.md --style sketch-notes
/paper-slide-deck path/to/content.md --audience executives
/paper-slide-deck path/to/content.md --lang zh
/paper-slide-deck path/to/content.md --slides 10
/paper-slide-deck path/to/content.md --outline-only
/paper-slide-deck  # Then paste content
```

Script Directory

Important: All scripts are located in the `scripts/` subdirectory of this skill.

Agent Execution Instructions:

- Determine this SKILL.md file's directory path as `SKILL_DIR`
- Script path = `${SKILL_DIR}/scripts/<script-name>.ts` (`generate-slides.py` keeps its `.py` extension)
- Replace all `${SKILL_DIR}` in this document with the actual path

Script Reference:

Script	Purpose
`scripts/generate-slides.py`	Generate AI slides via Gemini API (Python)
`scripts/merge-to-pptx.ts`	Merge slides into PowerPoint
`scripts/merge-to-pdf.ts`	Merge slides into PDF
`scripts/detect-figures.ts`	Auto-detect figures/tables in PDF
`scripts/extract-figure.ts`	Extract figure from PDF page (uses PyMuPDF fallback)
`scripts/apply-template.ts`	Apply figure container template
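
The path substitution described above can be sketched as follows. This is a minimal illustration, not part of the skill itself: the helper name `resolve_script` is hypothetical, and it assumes `SKILL_DIR` is simply the directory that contains SKILL.md.

```python
from pathlib import Path
from string import Template


def resolve_script(skill_md: str, script_name: str) -> str:
    """Return a script path with ${SKILL_DIR} replaced by SKILL.md's directory.

    `resolve_script` is a hypothetical helper used only for illustration.
    """
    skill_dir = Path(skill_md).resolve().parent
    template = Template("${SKILL_DIR}/scripts/" + script_name)
    return template.substitute(SKILL_DIR=skill_dir.as_posix())


if __name__ == "__main__":
    # Prints the absolute path to merge-to-pdf.ts next to the given SKILL.md.
    print(resolve_script("paper-slide-deck/SKILL.md", "merge-to-pdf.ts"))
```

The same helper covers the Python script, for example `resolve_script(path, "generate-slides.py")`.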

Options

Option	Description
`--style <name>`	Visual style (see Style Gallery)
`--audience <type>`	Target audience: beginners, intermediate, experts, executives, general
`--lang <code>`	Output language (en, zh, ja, etc.)
`--slides <number>`	Target slide count
`--outline-only`	Generate outline only, skip image generation

Style Gallery

Style	Description	Best For
`academic-paper`	Clean professional, precise charts	Conference talks, thesis defense
`blueprint` (Default)	Technical schematics, grid texture	Architecture, system design
`chalkboard`	Black chalkboard, colorful chalk	Education, tutorials, classroom
`notion`	SaaS dashboard, card-based layouts	Product demos, SaaS, B2B
`bold-editorial`	Magazine cover, bold typography, dark	Product launches, keynotes
`corporate`	Navy/gold, structured layouts	Investor decks, proposals
`dark-atmospheric`	Cinematic dark mode, glowing accents	Entertainment, gaming
`editorial-infographic`	Magazine explainers, flat illustrations	Tech explainers, research
`fantasy-animation`	Ghibli/Disney style, hand-drawn	Educational, storytelling
`intuition-machine`	Technical briefing, bilingual labels	Technical docs, academic
`minimal`	Ultra-clean, maximum whitespace	Executive briefings, premium
`pixel-art`	Retro 8-bit, chunky pixels	Gaming, developer talks
`scientific`	Academic diagrams, precise labeling	Biology, chemistry, medical
`sketch-notes`	Hand-drawn, warm & friendly	Educational, tutorials
`vector-illustration`	Flat vector, retro & cute	Creative, children's content
`vintage`	Aged-paper, historical styling	Historical, heritage, biography
`watercolor`	Hand-painted textures, natural warmth	Lifestyle, wellness, travel

Auto Style Selection

Content Signals	Selected Style
paper, thesis, defense, conference, ieee, acm, icml, neurips, cvpr, acl, aaai, iclr	`academic-paper`
tutorial, learn, education, guide, intro, beginner	`sketch-notes`
classroom, teaching, school, chalkboard, blackboard	`chalkboard`
architecture, system, data, analysis, technical	`blueprint`
creative, children, kids, cute, illustration	`vector-illustration`
briefing, bilingual, infographic, concept	`intuition-machine`
executive, minimal, clean, simple, elegant	`minimal`
saas, product, dashboard, metrics, productivity	`notion`
investor, quarterly, business, corporate, proposal	`corporate`
launch, marketing, keynote, bold, impact, magazine	`bold-editorial`
entertainment, music, gaming, creative, atmospheric	`dark-atmospheric`
explainer, journalism, science communication	`editorial-infographic`
story, fantasy, animation, magical, whimsical	`fantasy-animation`
gaming, retro, pixel, developer, nostalgia	`pixel-art`
biology, chemistry, medical, pathway, scientific	`scientific`
history, heritage, vintage, expedition, historical	`vintage`
lifestyle, wellness, travel, artistic, natural	`watercolor`
Default	`blueprint`
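
One way such signal matching could work is sketched below. The document does not say how competing signals are resolved, so the scoring rule here (most matching keywords wins, ties go to the earlier table row) is an assumption, and the table is abbreviated to single-word signals from a few rows.

```python
import re

# Abbreviated copy of the signal table above, in table order.
STYLE_SIGNALS = [
    ("academic-paper", ["paper", "thesis", "defense", "conference", "neurips"]),
    ("sketch-notes", ["tutorial", "learn", "education", "guide", "beginner"]),
    ("chalkboard", ["classroom", "teaching", "school", "blackboard"]),
    ("blueprint", ["architecture", "system", "data", "analysis", "technical"]),
    ("corporate", ["investor", "quarterly", "business", "proposal"]),
    ("pixel-art", ["gaming", "retro", "pixel", "developer", "nostalgia"]),
]
DEFAULT_STYLE = "blueprint"


def select_style(text: str) -> str:
    """Pick the style whose signals match the most words in `text`.

    Assumed rule: highest hit count wins, ties keep the earlier row,
    and no hits at all falls back to the default style.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    best_style, best_hits = DEFAULT_STYLE, 0
    for style, signals in STYLE_SIGNALS:
        hits = sum(1 for signal in signals if signal in words)
        if hits > best_hits:
            best_style, best_hits = style, hits
    return best_style


if __name__ == "__main__":
    print(select_style("A NeurIPS conference paper on diffusion models"))
```

An explicit `--style` flag would bypass this selection entirely.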

Layout Gallery

Optional layout hints for individual slides. Specify them in the outline's `// LAYOUT` section.

Slide-Specific Layouts

Layout	Description	Best For
`title-hero`	Large centered title + subtitle	Cover slides, section breaks
`quote-callout`	Featured quote with attribution	Testimonials, key insights
`key-stat`	Single large number as focal point	Impact statistics, metrics
`split-screen`	Half image, half text	Feature highlights, comparisons
`icon-grid`	Grid of icons with labels	Features, capabilities, benefits
`two-columns`	Content in balanced columns	Paired information, dual points
`three-columns`	Content in three columns	Triple comparisons, categories
`image-caption`	Full-bleed image + text overlay	Visual storytelling, emotional
`agenda`	Numbered list with highlights	Session overview, roadmap
`bullet-list`	Structured bullet points	Simple content, lists

Infographic-Derived Layouts

Layout	Description	Best For
`linear-progression`	Sequential flow left-to-right	Timelines, step-by-step
`binary-comparison`	Side-by-side A vs B	Before/after, pros-cons
`comparison-matrix`	Multi-factor grid	Feature comparisons
`hierarchical-layers`	Pyramid or stacked levels	Priority, importance
`hub-spoke`	Central node with radiating items	Concept maps, ecosystems
`bento-grid`	Varied-size tiles	Overview, summary
`funnel`	Narrowing stages	Conversion, filtering
`dashboard`	Metrics with charts/numbers	KPIs, data display
`venn-diagram`	Overlapping circles	Relationships, intersections
`circular-flow`	Continuous cycle	Recurring processes
`winding-roadmap`	Curved path with milestones	Journey, timeline
`tree-branching`	Parent-child hierarchy	Org charts, taxonomies
`iceberg`	Visible vs hidden layers	Surface vs depth
`bridge`	Gap with connection	Problem-solution

Academic-Specific Layouts

Layout	Description	Best For
`paper-title`	Title, authors, affiliations, venue	Conference paper cover
`outline-agenda`	Numbered section list with highlights	Talk structure overview
`methods-diagram`	Central architecture/pipeline diagram	Methods, system design
`results-chart`	Chart area + data annotations	Quantitative results
`equation-focus`	Centered equation + variable definitions	Mathematical derivations
`qualitative-grid`	2x2 or 3x2 image comparison grid	Visual results, ablations
`references-list`	Numbered citation list	Key references slide
`contributions`	Numbered contribution points	Contributions summary

Usage: Add `Layout: <name>` in the slide's `// LAYOUT` section to guide visual composition.

Design Philosophy

This deck is designed for reading and sharing, not live presentation:

- Each slide must be self-explanatory without verbal commentary
- Structure content for logical flow when scrolling
- Include all necessary context within each slide
- Optimize for social media sharing and offline reading

File Management

Output Directory

Each session creates an independent directory named by content slug:

slide-deck/{topic-slug}/
├── source-{slug}.{ext}    # Source files (text, images, etc.)
├── outline.md
├── outline-{style}.md     # Style variant outlines
├── prompts/
│   └── 01-slide-cover.md, 02-slide-{slug}.md, ...
├── 01-slide-cover.png, 02-slide-{slug}.png, ...
├── {topic-slug}.pptx
└── {topic-slug}.pdf

Slug Generation:

- Extract main topic from content (2-4 words, kebab-case)
- Example: "Introduction to Machine Learning" → `intro-machine-learning`

Conflict Resolution

If `slide-deck/{topic-slug}/` already exists:

- Append timestamp: `{topic-slug}-YYYYMMDD-HHMMSS`
- Example: `intro-ml` exists → `intro-ml-20260118-143052`
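
The conflict rule can be sketched in a few lines. The function name `resolve_output_dir` and the `now` parameter are hypothetical, added only so the example is deterministic; the timestamp format follows the `YYYYMMDD-HHMMSS` pattern given above.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def resolve_output_dir(base: Path, topic_slug: str,
                       now: Optional[datetime] = None) -> Path:
    """Return base/topic_slug, or a timestamped variant if that path exists.

    Hypothetical helper: `now` can be injected so the result is predictable.
    """
    candidate = base / topic_slug
    if not candidate.exists():
        return candidate
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return base / f"{topic_slug}-{stamp}"


if __name__ == "__main__":
    print(resolve_output_dir(Path("slide-deck"), "intro-ml"))
```

The function only computes the name; creating the directory is left to the caller.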

Source Files

Copy all sources with naming `source-{slug}.{ext}`:

- `source-article.md` (main text content)
- `source-diagram.png` (image from conversation)
- `source-data.xlsx` (additional file)

Multiple sources supported: text, images, files from conversation.

Workflow

Step 1: Analyze Content

- Save source content (if pasted, save as `source.md`)
- Follow `references/analysis-framework.md` for deep content analysis
- Determine style (use `--style` or auto-select from signals)
- Detect languages (source vs. user preference)
- Plan slide count (`--slides` or dynamic)
- For academic papers (PDF with figures), run automatic figure detection:

```bash
npx -y bun ${SKILL_DIR}/scripts/detect-figures.ts --pdf source-paper.pdf --output figures.json
```

This outputs a JSON file with all detected figures/tables, their page numbers, and captions.

Step 2: Generate Outline Variants

- Generate 3 style variant outlines based on content analysis
- Follow `references/outline-template.md` for structure
- Auto-populate IMAGE_SOURCE for academic papers:
  - Read `figures.json` from Step 1
  - Map figures to slides using rules in `references/analysis-framework.md` Section 8
  - Automatically add `// IMAGE_SOURCE` blocks to appropriate slides:
    - Architecture/pipeline figures → Methods slides (`Source: extract`)
    - Results tables → Quantitative results slides (`Source: extract`)
    - Comparison images → Qualitative results slides (`Source: extract`)
    - Conceptual/simple diagrams → Leave for AI generation (`Source: generate` or omit)
- Save as `outline-{style}.md` for each variant
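
The figure-to-slide mapping can be pictured with the sketch below. It is illustrative only: the actual rules live in `references/analysis-framework.md` Section 8, and the schema of `figures.json` is not shown in this document, so the `number`, `page`, `caption`, and `kind` fields and the `kind` values are all assumptions.

```python
import json

# Hypothetical figure kinds mapped to (target slide type, Source value).
# The real categories and rules are defined in analysis-framework.md.
MAPPING = {
    "architecture": ("methods", "extract"),
    "pipeline": ("methods", "extract"),
    "results-table": ("quantitative-results", "extract"),
    "comparison": ("qualitative-results", "extract"),
    "conceptual": (None, "generate"),
}


def plan_image_sources(figures_json: str) -> list:
    """Turn an (assumed) figures.json payload into per-figure source decisions."""
    plan = []
    for fig in json.loads(figures_json):
        # Unknown kinds fall back to AI generation rather than extraction.
        slide_type, source = MAPPING.get(fig.get("kind"), (None, "generate"))
        plan.append({
            "figure": fig["number"],
            "page": fig["page"],
            "slide_type": slide_type,
            "source": source,
        })
    return plan


if __name__ == "__main__":
    sample = '[{"number": 1, "page": 3, "caption": "Overview", "kind": "pipeline"}]'
    print(plan_image_sources(sample))
```

Each entry marked `extract` would then become a `// IMAGE_SOURCE` block in the matching slide of the outline.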

Step 3: User Confirmation

Single AskUserQuestion with all applicable options:

Question	When to Ask
Style variant	Always (3 options + custom)
Language	Only if source ≠ user language

After selection:

- Copy selected `outline-{style}.md` to `outline.md`
- Regenerate in different language if requested
- User may edit `outline.md` for fine-tuning

If `--outline-only` is set, stop here.

Step 4: Generate Prompts

- Read `references/base-prompt.md`
- Combine with style instructions from outline
- Add slide-specific content
- If `Layout:` is specified in the outline, include layout guidance in the prompt:
  - Reference layout characteristics for image composition
  - Example: `Layout: hub-spoke` → "Central concept in middle with related items radiating outward"
- Save to `prompts/` directory

Step 5: Image Generation Method Selection

Before generating images, ask user to choose generation method:

Use AskUserQuestion with options:

Option	Label	Description
1	Gemini API (Recommended)	Official Google API via Python. Requires GOOGLE_API_KEY env var.
2	Gemini Web (Browser-based)	⚠️ Uses reverse-engineered web API. No API key needed but may break.

Based on selection:

Option 1: Gemini API (Python)

Verify API key: Check the `GOOGLE_API_KEY` or `GEMINI_API_KEY` environment variable.

Run generation script:

```bash
python ${SKILL_DIR}/scripts/generate-slides.py <slide-deck-dir> --model gemini-3-pro-image-preview
```

Script Features:

- Auto-installs `google-genai` package if missing
- Retry logic with exponential backoff (3 retries)
- Skips already-generated slides (> 10KB)
- Supports custom model via `--model` flag
- Outputs to `slides/` subdirectory

Troubleshooting:

If server disconnection errors occur, script auto-retries
For persistent failures, re-run the script (it skips completed slides)
Check API quota if many failures occur
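
The retry and skip behavior listed above could look roughly like this. It is a sketch of the general technique, not the code in `generate-slides.py`: the function names, the 1-second base delay, and the injectable `sleep` are assumptions, while the 3 retries and the 10KB threshold come from the feature list.

```python
import os
import time
from typing import Callable

MIN_BYTES = 10 * 1024  # slides larger than 10KB count as already generated


def already_generated(path: str) -> bool:
    """True if a slide image exists and is larger than the 10KB threshold."""
    return os.path.exists(path) and os.path.getsize(path) > MIN_BYTES


def generate_with_retry(generate: Callable[[], bytes], out_path: str,
                        retries: int = 3, base_delay: float = 1.0,
                        sleep: Callable[[float], None] = time.sleep) -> bool:
    """Write one slide, retrying with exponential backoff on failure.

    `generate` stands in for the real API call (hypothetical). Returns True
    if the slide exists afterwards, False if every attempt failed.
    """
    if already_generated(out_path):
        return True  # re-runs skip completed slides
    for attempt in range(retries + 1):  # first try plus `retries` retries
        try:
            data = generate()
            with open(out_path, "wb") as handle:
                handle.write(data)
            return True
        except Exception:
            if attempt == retries:
                return False
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s with the defaults
    return False
```

Because completed slides are skipped, re-running after a persistent failure only regenerates the slides that are still missing.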

Option 2: Gemini Web Skill

Consent Check: Read consent file at:

- Windows: `$APPDATA/baoyu-skills/gemini-web/consent.json`
- macOS: `~/Library/Application Support/baoyu-skills/gemini-web/consent.json`
- Linux: `~/.local/share/baoyu-skills/gemini-web/consent.json`

If no consent or version mismatch, display disclaimer and ask:

⚠️ DISCLAIMER: This uses a reverse-engineered Gemini Web API (NOT official).
Risks: May break anytime, no support, possible account risk.
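
Locating the consent file on each platform can be sketched as below. The helper name `consent_path` and its injectable parameters are hypothetical; the three locations are the ones listed above. The contents and version field of `consent.json` are not specified here, so the sketch resolves the path only.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def consent_path(platform: str = sys.platform,
                 env: Mapping[str, str] = os.environ,
                 home: Optional[Path] = None) -> Path:
    """Return the expected consent.json location for the given platform.

    Hypothetical helper; parameters are injectable for testing.
    """
    home = home or Path.home()
    tail = Path("baoyu-skills") / "gemini-web" / "consent.json"
    if platform.startswith("win"):
        return Path(env["APPDATA"]) / tail
    if platform == "darwin":
        return home / "Library" / "Application Support" / tail
    # Everything else is treated as Linux-like.
    return home / ".local" / "share" / tail


if __name__ == "__main__":
    path = consent_path()
    print(path, "(exists)" if path.is_file() else "(missing: show disclaimer)")
```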

For each slide, run:

```bash
npx -y bun ${GEMINI_WEB_SKILL_DIR}/scripts/main.ts \
  --promptfiles prompts/01-slide-cover.md \
  --image 01-slide-cover.png \
  --sessionId slides-{topic-slug}-{timestamp}
```

Where `GEMINI_WEB_SKILL_DIR` = path to the `baoyu-danger-gemini-web` skill directory.

Proxy support: If the user is in a restricted network, prepend:

```bash
HTTP_PROXY=http://127.0.0.1:7890 HTTPS_PROXY=http://127.0.0.1:7890
```

Step 5.5: Process IMAGE_SOURCE (Automatic Figure Extraction)

For academic presentations, IMAGE_SOURCE metadata was auto-populated in Step 2 based on figure detection from Step 1.

Automatic Execution:

- Parse outline to identify slides with `Source: extract`
- Create figures directory: `mkdir -p figures`

For each extract slide, automatically:

- Read the Figure number, Page, and Caption from metadata
- Run figure extraction script:

```bash
npx -y bun ${SKILL_DIR}/scripts/extract-figure.ts \
  --pdf source-paper.pdf \
  --page <page-number> \
  --output figures/figure-<N>.png
```

- Run template application script:

```bash
npx -y bun ${SKILL_DIR}/scripts/apply-template.ts \
  --figure figures/figure-<N>.png \
  --title "<slide-headline>" \
  --caption "Figure <N>: <caption-text>" \
  --output <NN>-slide-<slug>.png
```

- Report: "Extracted: Figure N → slide NN"

For slides with `Source: generate` (or no IMAGE_SOURCE):
- Proceed to Step 6 for AI generation

Note: Source PDF must be saved as `source-paper.pdf` in the output directory.

Troubleshooting:

- If figure detection missed a figure: manually add `// IMAGE_SOURCE` block to outline
- If wrong figure mapped: edit the `Figure:` and `Page:` values in outline
- If extraction fails: check PDF page number (1-indexed)

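PyMuPDF Fallback for Page Extraction: If `extract-figure.ts` fails with an "Image or Canvas expected" error (common with complex PDFs), use PyMuPDF:

```python
from pathlib import Path

import fitz  # PyMuPDF

page_num = 3  # example value: the 1-indexed page from the outline's Page: field
Path("extracted").mkdir(exist_ok=True)  # save() does not create the directory
doc = fitz.open("source-paper.pdf")
page = doc[page_num - 1]  # PyMuPDF pages are 0-indexed
mat = fitz.Matrix(3, 3)  # 3x scale for high resolution
pix = page.get_pixmap(matrix=mat)
pix.save(f"extracted/page-{page_num}.png")
```

Then apply the template using `apply-template.ts`.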

Step 6: Generate Images

- Use selected method from Step 5
- Skip slides already processed in Step 5.5 (those with `Source: extract`)
- Generate session ID: `slides-{topic-slug}-{timestamp}`
- Generate each remaining slide with same session ID
- Report progress: "Generated X/N"
- Auto-retry once on generation failure

Step 7: Merge to PPTX and PDF

```bash
npx -y bun ${SKILL_DIR}/scripts/merge-to-pptx.ts <slide-deck-dir>
npx -y bun ${SKILL_DIR}/scripts/merge-to-pdf.ts <slide-deck-dir>
```

Step 8: Output Summary

Slide Deck Complete!

Topic: [topic]
Style: [style name]
Location: [directory path]
Slides: N total

- 01-slide-cover.png ✓ Cover
- 02-slide-intro.png ✓ Content
- ...
- {NN}-slide-back-cover.png ✓ Back Cover

Outline: outline.md
PPTX: {topic-slug}.pptx
PDF: {topic-slug}.pdf

Slide Modification

See `references/modification-guide.md` for:

- Edit single slide workflow
- Add new slide (with renumbering)
- Delete slide (with renumbering)
- File naming conventions

Image Generation Dependencies

Gemini API (Option 1 - Recommended)

Requires:

- `GOOGLE_API_KEY` or `GEMINI_API_KEY` environment variable
- Python 3.8+ with pip
- `google-genai` package (auto-installed by script)

Model: `gemini-3-pro-image-preview` (default)

Gemini Web Skill (Option 2)

Requires:

- `baoyu-danger-gemini-web` skill installed at `.claude/skills/baoyu-danger-gemini-web`
- Google Chrome browser with logged-in Google account
- User consent for reverse-engineered API disclaimer

PDF Figure Extraction

Requires:

- Primary: `pdfjs-dist` npm package (use legacy build for Node.js)
- Fallback: `pymupdf` Python package (more reliable for complex PDFs)
- `canvas` npm package for `apply-template.ts`

References

File	Content
`references/analysis-framework.md`	Deep content analysis for presentations
`references/outline-template.md`	Outline structure and STYLE_INSTRUCTIONS format
`references/modification-guide.md`	Edit, add, delete slide workflows
`references/content-rules.md`	Content and style guidelines
`references/base-prompt.md`	Base prompt for image generation
`references/figure-container-template.md`	Visual specs for extracted figure containers
`references/styles/<style>.md`	Full style specifications

Notes

Image Generation

- Nano Banana Pro API (the Gemini API, Option 1): Recommended. Stable, reliable, requires API key
- Gemini Web (Option 2): No API key needed, but uses reverse-engineered API with account risk
- Generation time: 10-30 seconds per slide
- Auto-retry once on generation failure
- Maintain style consistency via session ID

Content Guidelines

Use stylized alternatives for sensitive public figures
Both methods use the same underlying Gemini model for image generation

Extension Support

Custom styles and configurations via EXTEND.md.

Check paths (priority order):

- `.paper-skills/paper-slide-deck/EXTEND.md` (project)
- `~/.paper-skills/paper-slide-deck/EXTEND.md` (user)

If found, load before Step 1. Extension content overrides defaults.
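
The priority lookup can be sketched as follows. The function name `find_extension` is hypothetical, and how the loaded content overrides defaults is not specified here, so the sketch stops at locating the file.

```python
from pathlib import Path
from typing import Optional

# Relative location shared by the project-level and user-level paths.
REL = Path(".paper-skills") / "paper-slide-deck" / "EXTEND.md"


def find_extension(project_dir: Path, home_dir: Path) -> Optional[Path]:
    """Return the first EXTEND.md found, checking project before user."""
    for root in (project_dir, home_dir):
        candidate = root / REL
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_extension(Path.cwd(), Path.home())
    print(found if found else "No EXTEND.md found; using defaults")
```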
