nano-banana

Nano Banana

Generate high-quality presentation slides as images using Gemini's image generation API, review them interactively in a browser, and iteratively edit based on feedback.

When to Use This Skill

- User asks to create a presentation, slide deck, or PPT
- User wants to generate visual slides for a talk or lecture
- User has a document or outline and wants slides based on it
- User says "make me a PPT", "generate slides", "create a presentation"
- User wants to edit or refine existing generated slides
- User needs high-quality figures, diagrams, or illustrations for papers or documents
- User asks to generate research figures, architecture diagrams, or concept illustrations

Do NOT use for:

- Writing academic papers → use `paper-writing`
- Planning academic conference talk narrative structure → use `academic-slides`

Before You Start: Prerequisites

Before proceeding with any slide generation, verify these prerequisites:

API Key: Check that a Google API key is available. Run:
```bash
echo $GOOGLE_API_KEY
```
If empty, ask the user to provide one. They can either:
- Set it via config: `EvoSci config set google_api_key <key>`
- Provide it directly (pass via the `--api-key` argument)
- If the user provides the key in conversation, pass it to scripts with `--api-key`

Language: Ask the user what language the slide content should be in. This affects the content you write in `slides_plan.json`, not the style template.
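
The key check above can be sketched in Python as follows. This is a minimal illustration, not part of the skill's scripts: the helper names `resolve_api_key` and `api_key_args` are hypothetical, and the rule that an explicitly provided key takes precedence over `GOOGLE_API_KEY` is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(cli_key: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the key to use, or None if the user still has to provide one."""
    env = os.environ if env is None else env
    # Assumption: a key supplied by the user wins over the environment variable.
    if cli_key and cli_key.strip():
        return cli_key.strip()
    env_key = env.get("GOOGLE_API_KEY", "").strip()
    return env_key or None


def api_key_args(key_from_user: Optional[str]) -> list:
    """Extra argv entries: add --api-key only when the user supplied a key."""
    return ["--api-key", key_from_user] if key_from_user else []
```

When `resolve_api_key` returns `None`, that corresponds to the "If empty, ask the user to provide one" branch.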

Core Workflow

Phase 1: Content Planning Conversation     ← most important phase
Phase 2: Generate slides_plan.json
Phase 3: Select Style & Generate Slides
Phase 4: Launch Review Server
Phase 5: Apply Feedback Edits              ← repeat Phase 4-5 until satisfied
Phase 6: Package as PPTX
Phase 7: Cleanup

Follow these phases in order. Do NOT skip Phase 1 — the quality of generated slides depends directly on planning depth.

Phase 1: Content Planning Conversation

This is the most critical phase. Rushing to generation without proper planning produces mediocre slides. Engage the user in a structured conversation:

Step 1 — Understand the context:

What is the topic of the presentation?
Who is the audience? (technical peers, executives, students, general public)
How long is the talk? (this determines page count)
What is the occasion? (conference, internal talk, lecture, pitch)

Step 2 — Define the storyline:

What is the opening hook? (a surprising fact, a question, a trend)
What are the 3-5 main sections or arguments?
What is the key takeaway the audience should remember?
What is the closing message?

Step 3 — Outline per-page content:

For each slide, agree on: title + 2-4 key points + visual description
Identify which slides are cover, content, or data type
Ensure logical flow between pages

Duration-to-page-count guidance:

| Duration | Pages | Structure |
| --- | --- | --- |
| 5 min | 5 | Cover + 3 content + closing |
| 10-15 min | 8-12 | Cover + intro + 3-4 sections + summary + closing |
| 20-30 min | 15-20 | Cover + intro + 5-6 sections + summary + closing |
| 45-60 min | 25-30 | Cover + intro + 7-9 sections (2-3 pages each) + summary + closing |
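
As a rough sketch, the duration guidance can be turned into a lookup. The function name `suggest_page_range` is hypothetical. Two behaviours are assumptions, since the table does not cover them: a duration that falls between rows (for example 18 minutes) is rounded up to the next row, and anything longer than 60 minutes uses the last row.

```python
# (max_minutes, min_pages, max_pages), copied from the guidance table.
PAGE_GUIDANCE = [
    (5, 5, 5),
    (15, 8, 12),
    (30, 15, 20),
    (60, 25, 30),
]


def suggest_page_range(minutes: float) -> tuple:
    """Return (min_pages, max_pages) for a talk of the given length."""
    if minutes <= 0:
        raise ValueError("talk length must be positive")
    for max_minutes, low, high in PAGE_GUIDANCE:
        # Assumption: durations between rows round up to the next row.
        if minutes <= max_minutes:
            return (low, high)
    # Assumption: talks longer than 60 minutes reuse the last row.
    return PAGE_GUIDANCE[-1][1:]
```

The result is only a starting point for the conversation; the agreed page count still comes from the outline worked out with the user.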

If the user provides a document or outline, read it thoroughly, then propose a slide breakdown for approval before proceeding.

Phase 2: Generate slides_plan.json

Create a `slides_plan.json` file in the workspace root with this schema:

```json
{
  "title": "Presentation Title",
  "total_slides": 10,
  "slides": [
    {
      "slide_number": 1,
      "page_type": "cover",
      "content": "Title: My Presentation\nSubtitle: A subtitle here\nLabel: 2026 Edition"
    },
    {
      "slide_number": 2,
      "page_type": "content",
      "content": "Title: First Topic\nKey points:\n- Point one\n- Point two\n- Point three"
    },
    {
      "slide_number": 3,
      "page_type": "data",
      "content": "Title: Key Metrics\nMetric 1: 95% accuracy\nMetric 2: 3x faster\nMetric 3: 10k users"
    }
  ]
}
```

page_type values: `cover`, `content`, `data`
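
Before generating, it may help to sanity-check the plan. The sketch below is not part of the skill's scripts; `validate_plan` is a hypothetical helper. The specific checks are assumptions inferred from the schema: `total_slides` matches the number of entries, `slide_number` runs 1, 2, 3, ... in order, and `page_type` is one of the three listed values.

```python
ALLOWED_PAGE_TYPES = {"cover", "content", "data"}


def validate_plan(plan: dict) -> list:
    """Return a list of problems; an empty list means the plan looks consistent."""
    problems = []
    slides = plan.get("slides", [])
    if not plan.get("title"):
        problems.append("missing title")
    if plan.get("total_slides") != len(slides):
        problems.append(
            f"total_slides is {plan.get('total_slides')} but {len(slides)} slides are listed"
        )
    for position, slide in enumerate(slides, start=1):
        if slide.get("slide_number") != position:
            problems.append(
                f"slide at position {position} has slide_number {slide.get('slide_number')}"
            )
        if slide.get("page_type") not in ALLOWED_PAGE_TYPES:
            problems.append(
                f"slide {position} has unknown page_type {slide.get('page_type')!r}"
            )
        if not str(slide.get("content", "")).strip():
            problems.append(f"slide {position} has empty content")
    return problems
```

Note that the schema example above lists `total_slides` as 10 while showing only three slides, presumably because it is abbreviated; a real plan would be expected to list every slide.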

Critical Content Field Rules

The `content` field is what gets passed to the image generation model. Follow these rules strictly:

DO write descriptive titles and bullet points
DO describe the visual layout you want (e.g., "left-right comparison", "4 icon cards")
DO NOT prefix lines with "Slogan:", "Visual:", "Points:", or any meta-labels — the model will render these as visible text on the slide
DO NOT put the same sentence in both the title area and the bottom of the content — it causes duplication
DO NOT include footer text, page numbers, or watermark instructions

Bad example (meta-labels leak as visible text):

Title: Why AI Matters
Visual: left-right comparison chart
Points:
- Point one
- Point two
Slogan: AI changes everything

Good example (clean, no meta-labels):

Title: Why AI Matters
Visual layout: left-right comparison chart showing traditional vs AI approach
Key points:
- Point one with brief explanation
- Point two with brief explanation
Bottom tagline: AI changes everything
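
A small lint pass can catch the meta-labels named in the rules before a plan is sent for generation. This is an illustrative sketch: `find_meta_labels` is a hypothetical helper, and it only checks the three labels the rules mention, matched case-sensitively at the start of a line.

```python
# Labels the rules call out by name; extend the tuple as needed.
META_LABELS = ("Slogan:", "Visual:", "Points:")


def find_meta_labels(content: str) -> list:
    """Return (line_number, line) pairs for lines that start with a listed meta-label."""
    hits = []
    for line_number, line in enumerate(content.splitlines(), start=1):
        stripped = line.strip()
        # str.startswith accepts a tuple and matches any of its entries.
        if stripped.startswith(META_LABELS):
            hits.append((line_number, stripped))
    return hits
```

Run against the two examples above, this flags three lines in the bad example and none in the good one, because "Visual layout:" and "Key points:" do not begin with any of the listed labels.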

Phase 3: Select Style & Generate Slides

Available Styles

| Style | File | Visual Characteristics | Best For |
| --- | --- | --- | --- |
| Lineal Color | `styles/lineal-color.md` | White background, teal accents, flat 2D icons, info cards | Technical talks, lectures, educational |
| Gradient Glass | `styles/gradient-glass.md` | Light pastel background, frosted glass cards, Apple Keynote feel | Product launches, pitches, SaaS |
| Vector Illustration | `styles/vector-illustration.md` | Cream background, black outlines, retro colors, toy-model charm | Educational, children's content, brand stories |

Present the styles to the user and let them choose. If unsure, recommend Lineal Color as the default.

Available Models

| Model | Speed | Quality | When to Use |
| --- | --- | --- | --- |
| `gemini-3-pro-image-preview` | Moderate | Best | Final version, important presentations |
| `gemini-3.1-flash-image-preview` | Fast | Good | Drafts, rapid iteration, large decks |
| `gemini-2.5-flash-image` | Fastest | Basic | Quick prototypes, bulk generation |

For first-time generation, recommend `gemini-3.1-flash-image-preview` (fast iteration). Switch to `gemini-3-pro-image-preview` for the final version.

Generate Command

```bash
python /skills/nano-banana/scripts/generate_ppt.py \
  --plan slides_plan.json \
  --style /skills/nano-banana/styles/lineal-color.md \
  --model gemini-3.1-flash-image-preview \
  --output ppt_output
```

Arguments:

- `--plan` (required): Path to slides_plan.json
- `--style` (required): Path to style template
- `--model`: Image generation model (default: `gemini-3-pro-image-preview`)
- `--resolution`: `2K` (default) or `4K`
- `--output`: Output directory (default: `ppt_output/TIMESTAMP`)
- `--api-key`: Google API key (if not in environment)
- `--workers`: Number of parallel workers (default: 1, recommended: 3-5 for large decks)

Output structure:

ppt_output/
├── images/
│   ├── slide-01.png
│   ├── slide-02.png
│   └── ...
├── prompts.json    # All prompts used (for debugging)
└── index.html      # Browser viewer
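
After a generation run, the output layout above suggests a quick completeness check. The sketch below is hypothetical (`expected_image_names` and `missing_images` are not part of the skill), and it assumes slide images are always named `slide-NN.png` with a two-digit, zero-padded number, as in the listing.

```python
from pathlib import Path


def expected_image_names(total_slides: int) -> list:
    """File names the generator is assumed to write for slides 1..total_slides."""
    return [f"slide-{number:02d}.png" for number in range(1, total_slides + 1)]


def missing_images(output_dir: str, total_slides: int) -> list:
    """Return the expected image names that are not present under <output_dir>/images."""
    images_dir = Path(output_dir) / "images"
    return [name for name in expected_image_names(total_slides)
            if not (images_dir / name).is_file()]
```

A non-empty result from `missing_images("ppt_output", total_slides)` would indicate slides that failed to generate and may need another run before the review server is started.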

Phase 4: Launch Review Server

Start the interactive review server so the user can review slides and write feedback:

```bash
python /skills/nano-banana/scripts/serve_viewer.py \
  --dir ppt_output \
  --plan slides_plan.json \
  --port 8080 \
  --pid-file .viewer.pid
```

Tell the user:

Review server is running at http://localhost:8080. Open it in your browser to review each slide. Write feedback in the text box below any slide that needs changes, then click "Save Feedback". Tell me when you're done.

The server saves feedback directly into `slides_plan.json` as a `feedback` field on each slide.

Wait for the user to confirm they have saved their feedback before proceeding.

Phase 5: Apply Feedback Edits

Read `slides_plan.json` and find all slides with a non-empty `feedback` field. For each one, run the edit script:

```bash
python /skills/nano-banana/scripts/edit_slide.py \
  --input ppt_output/images/slide-{NUMBER}.png \
  --instruction "{FEEDBACK_TEXT}" \
  --output ppt_output/images/slide-{NUMBER}.png \
  --model gemini-3.1-flash-image-preview
```

Arguments:

- `--input` (required): Path to the original slide image
- `--instruction` (required): The edit instruction (from feedback field)
- `--output`: Output path (default: overwrite input)
- `--model`: Image generation model
- `--api-key`: Google API key (if not in environment)

After editing all slides with feedback, clear the `feedback` fields from `slides_plan.json` and tell the user to refresh the browser to see updated slides.

If the user has more feedback, repeat Phase 4-5. This review-edit cycle continues until the user is satisfied.
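
One pass of the feedback step can be sketched as below. The helper names are hypothetical and this is not the skill's own code. Two details are assumptions: `{NUMBER}` is zero-padded to two digits to match `slide-01.png`, and "clearing" feedback means removing the key rather than setting it to an empty string.

```python
import copy

EDIT_SCRIPT = "/skills/nano-banana/scripts/edit_slide.py"


def slides_with_feedback(plan: dict) -> list:
    """Slides whose feedback field is present and not just whitespace."""
    return [s for s in plan.get("slides", []) if str(s.get("feedback", "")).strip()]


def edit_command(slide: dict, output_dir: str = "ppt_output",
                 model: str = "gemini-3.1-flash-image-preview") -> list:
    """Build the edit_slide.py argv for one slide, overwriting the image in place."""
    # Assumption: two-digit zero padding, as in slide-01.png.
    image = f"{output_dir}/images/slide-{slide['slide_number']:02d}.png"
    return ["python", EDIT_SCRIPT,
            "--input", image,
            "--instruction", slide["feedback"].strip(),
            "--output", image,
            "--model", model]


def clear_feedback(plan: dict) -> dict:
    """Return a copy of the plan with every feedback field removed."""
    cleaned = copy.deepcopy(plan)
    for slide in cleaned.get("slides", []):
        slide.pop("feedback", None)
    return cleaned
```

Passing each command to `subprocess.run` as an argument list, rather than as a shell string, avoids quoting problems when the feedback text contains quotes or other shell metacharacters.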

Phase 6: Package as PPTX

Once the user approves all slides, ask for the desired filename and package them:

```bash
python /skills/nano-banana/scripts/package_pptx.py \
  --dir ppt_output/images \
  --output presentation.pptx \
  --kill-server .viewer.pid
```

Arguments:

- `--dir` (required): Directory containing slide-XX.png images
- `--output` (required): Output .pptx file path
- `--kill-server`: PID file from serve_viewer.py; when given, the review server is stopped automatically after packaging

Phase 7: Cleanup

- The review server is automatically stopped by `package_pptx.py --kill-server`
- Ask the user whether they want to keep the `ppt_output/` directory or clean it up
- The `slides_plan.json` file can be kept for future re-generation

Counterintuitive Rules

- Never include meta-labels in content: Words like "Slogan:", "Visual:", "Points:" will be rendered as visible text on the slide. Describe what you want without prefixes.
- Content describes WHAT, not HOW: The style template handles visual layout. The content field should focus on text and logical structure, not colors or positioning.
- More planning = better slides: Spending 10 minutes on Phase 1 conversation saves hours of re-generation. Do not rush to Phase 3.
- Edit, don't regenerate: When a slide needs minor changes (text fix, color change, remove footer), use `edit_slide.py` instead of regenerating from scratch. Editing preserves visual consistency.
- Use flash model for drafts: `gemini-3.1-flash-image-preview` is fast enough for iteration. Only switch to `gemini-3-pro-image-preview` for the final version after all feedback is addressed.
- Never read generated images yourself: Not all models support multimodal input. Do NOT use `read_file` on generated PNG images to check quality. Always launch the review server and let the user inspect slides visually in the browser. The user's feedback is your only quality signal.
- One idea per slide: Do not pack multiple concepts into a single slide. If a slide has more than 4 bullet points, split it into two slides.
- Bottom taglines should not repeat the title: If the title says "Why AI Matters", the bottom tagline should add new insight, not restate the title.

Scripts Reference

| Script | Purpose | Key Arguments |
| --- | --- | --- |
| `scripts/generate_ppt.py` | Batch generate all slides from plan | `--plan`, `--style`, `--model`, `--output`, `--resolution`, `--api-key`, `--workers` |
| `scripts/edit_slide.py` | Edit a single slide based on instruction | `--input`, `--instruction`, `--output`, `--model`, `--api-key` |
| `scripts/serve_viewer.py` | Local review server with feedback | `--dir`, `--plan`, `--port`, `--no-open`, `--pid-file` |
| `scripts/package_pptx.py` | Package slide images into .pptx | `--dir`, `--output`, `--kill-server` |

Style Template Format

Style templates are markdown files in `styles/` with a fixed structure that `generate_ppt.py` parses:

| Section | Purpose | Parsed by Code |
| --- | --- | --- |
| `## Base Prompt` | Visual specifications shared by all slides | Yes, injected into every prompt |
| `## Page Templates` | Layout descriptions per page type | Fallback only |
| `## Examples` | Actual prompt templates with `{Base Prompt}` and `[Content]` placeholders | Yes, as the primary templates |
| Other sections | Documentation only | No |

To create a new style, copy an existing `.md` file and modify the `## Base Prompt` and `## Examples` sections. The code extracts the `### Cover`, `### Content`, and `### Data` code blocks from `## Examples`.
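
To make the placeholder mechanism concrete, here is a minimal sketch of how an example template might be filled in. It is an illustration only and may differ from what `generate_ppt.py` actually does; `build_prompt` is a hypothetical name, and requiring both placeholders in every template is an assumption.

```python
def build_prompt(example_template: str, base_prompt: str, content: str) -> str:
    """Fill the {Base Prompt} and [Content] placeholders of one example template."""
    for placeholder in ("{Base Prompt}", "[Content]"):
        # Assumption: a template without both placeholders is treated as an error.
        if placeholder not in example_template:
            raise ValueError(f"template is missing the {placeholder} placeholder")
    filled = example_template.replace("{Base Prompt}", base_prompt.strip())
    return filled.replace("[Content]", content.strip())
```

Here `base_prompt` would come from the style's `## Base Prompt` section, `example_template` from the `### Cover`, `### Content`, or `### Data` block matching the slide's `page_type`, and `content` from the slide's `content` field in `slides_plan.json`.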
