nano-banana


Nano Banana


Generate high-quality presentation slides as images using Gemini's image generation API, review them interactively in a browser, and iteratively edit based on feedback.

When to Use This Skill


  • User asks to create a presentation, slide deck, or PPT
  • User wants to generate visual slides for a talk or lecture
  • User has a document or outline and wants slides based on it
  • User says "make me a PPT", "generate slides", "create a presentation"
  • User wants to edit or refine existing generated slides
  • User needs high-quality figures, diagrams, or illustrations for papers or documents
  • User asks to generate research figures, architecture diagrams, or concept illustrations
Do NOT use for:
  • Writing academic papers → use paper-writing
  • Planning academic conference talk narrative structure → use academic-slides


Before You Start: Prerequisites


Before proceeding with any slide generation, verify these prerequisites:
  1. API Key: Check that a Google API key is available. Run:
    bash
    echo $GOOGLE_API_KEY
    If empty, ask the user to provide one. They can either:
    • Set it via config: EvoSci config set google_api_key <key>
    • Provide it directly (for example, in conversation), in which case pass it to the scripts with the --api-key argument
  2. Language: Ask the user what language the slide content should be in. This affects the content you write in slides_plan.json, not the style template.
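
The API key check can also be done programmatically. The following is a minimal sketch, not part of the skill's scripts: `resolve_api_key` and `api_key_args` are hypothetical helper names, and the precedence (an explicitly provided key wins over the environment variable) is an assumption.

```python
import os
from typing import List, Mapping, Optional


def resolve_api_key(cli_key: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the key to use, or None if the user still has to provide one."""
    env = os.environ if env is None else env
    # Assumption: a key supplied by the user takes precedence over GOOGLE_API_KEY.
    for candidate in (cli_key, env.get("GOOGLE_API_KEY")):
        if candidate and candidate.strip():
            return candidate.strip()
    return None


def api_key_args(cli_key: Optional[str]) -> List[str]:
    """Extra argv entries: pass --api-key only when the user supplied a key."""
    return ["--api-key", cli_key] if cli_key else []


if resolve_api_key() is None:
    print("No Google API key found; ask the user to provide one.")
```

If `resolve_api_key()` returns `None`, stop and ask the user before running any generation script.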


Core Workflow


Phase 1: Content Planning Conversation     ← most important phase
Phase 2: Generate slides_plan.json
Phase 3: Select Style & Generate Slides
Phase 4: Launch Review Server
Phase 5: Apply Feedback Edits              ← repeat Phase 4-5 until satisfied
Phase 6: Package as PPTX
Phase 7: Cleanup
Follow these phases in order. Do NOT skip Phase 1 — the quality of generated slides depends directly on planning depth.


Phase 1: Content Planning Conversation


This is the most critical phase. Rushing to generation without proper planning produces mediocre slides. Engage the user in a structured conversation:
Step 1 — Understand the context:
  • What is the topic of the presentation?
  • Who is the audience? (technical peers, executives, students, general public)
  • How long is the talk? (this determines page count)
  • What is the occasion? (conference, internal talk, lecture, pitch)
Step 2 — Define the storyline:
  • What is the opening hook? (a surprising fact, a question, a trend)
  • What are the 3-5 main sections or arguments?
  • What is the key takeaway the audience should remember?
  • What is the closing message?
Step 3 — Outline per-page content:
  • For each slide, agree on: title + 2-4 key points + visual description
  • Identify which slides are cover, content, or data type
  • Ensure logical flow between pages
Duration-to-page-count guidance:
| Duration | Pages | Structure |
| --- | --- | --- |
| 5 min | 5 | Cover + 3 content + closing |
| 10-15 min | 8-12 | Cover + intro + 3-4 sections + summary + closing |
| 20-30 min | 15-20 | Cover + intro + 5-6 sections + summary + closing |
| 45-60 min | 25-30 | Cover + intro + 7-9 sections (2-3 pages each) + summary + closing |
If the user provides a document or outline, read it thoroughly, then propose a slide breakdown for approval before proceeding.
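
The duration guidance can be encoded as a small lookup. This is an illustrative sketch only; `PAGE_GUIDANCE` and `suggest_page_range` are hypothetical names. Durations that fall between the listed ranges (for example 40 minutes) are not covered by the guidance, so the sketch returns `None` for them rather than guessing.

```python
from typing import Optional, Tuple

# (min_minutes, max_minutes, min_pages, max_pages), copied from the guidance table.
PAGE_GUIDANCE = [
    (5, 5, 5, 5),
    (10, 15, 8, 12),
    (20, 30, 15, 20),
    (45, 60, 25, 30),
]


def suggest_page_range(minutes: int) -> Optional[Tuple[int, int]]:
    """Return (min_pages, max_pages) for a talk length, or None if no row covers it."""
    for lo, hi, pages_lo, pages_hi in PAGE_GUIDANCE:
        if lo <= minutes <= hi:
            return (pages_lo, pages_hi)
    return None  # not covered by the table: agree on a page count with the user


print(suggest_page_range(12))  # (8, 12)
print(suggest_page_range(40))  # None
```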


Phase 2: Generate slides_plan.json


Create a slides_plan.json file in the workspace root with this schema:
json
{
  "title": "Presentation Title",
  "total_slides": 10,
  "slides": [
    {
      "slide_number": 1,
      "page_type": "cover",
      "content": "Title: My Presentation\nSubtitle: A subtitle here\nLabel: 2026 Edition"
    },
    {
      "slide_number": 2,
      "page_type": "content",
      "content": "Title: First Topic\nKey points:\n- Point one\n- Point two\n- Point three"
    },
    {
      "slide_number": 3,
      "page_type": "data",
      "content": "Title: Key Metrics\nMetric 1: 95% accuracy\nMetric 2: 3x faster\nMetric 3: 10k users"
    }
  ]
}
page_type values: cover, content, data
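
Before generating, it may help to sanity-check the plan file. The sketch below is not part of the skill's scripts; `validate_plan` is a hypothetical helper, and it assumes that `total_slides` should equal the number of entries in `slides` and that `slide_number` counts up from 1.

```python
import json

ALLOWED_PAGE_TYPES = {"cover", "content", "data"}


def validate_plan(plan: dict) -> list:
    """Return a list of problems; an empty list means the plan looks consistent."""
    problems = []
    slides = plan.get("slides", [])
    if not plan.get("title"):
        problems.append("missing title")
    # Assumption: total_slides is meant to match the number of slide entries.
    if plan.get("total_slides") != len(slides):
        problems.append(
            f"total_slides is {plan.get('total_slides')} but {len(slides)} slides are listed"
        )
    for index, slide in enumerate(slides, start=1):
        if slide.get("slide_number") != index:
            problems.append(f"slide at position {index} has slide_number {slide.get('slide_number')}")
        if slide.get("page_type") not in ALLOWED_PAGE_TYPES:
            problems.append(f"slide {index} has unknown page_type {slide.get('page_type')!r}")
        if not str(slide.get("content", "")).strip():
            problems.append(f"slide {index} has empty content")
    return problems


example = json.loads("""
{"title": "Demo", "total_slides": 2, "slides": [
  {"slide_number": 1, "page_type": "cover", "content": "Title: Demo"},
  {"slide_number": 2, "page_type": "chart", "content": "Title: Metrics"}
]}
""")
print(validate_plan(example))  # ["slide 2 has unknown page_type 'chart'"]
```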

Critical Content Field Rules


The content field is what gets passed to the image generation model. Follow these rules strictly:
  1. DO write descriptive titles and bullet points
  2. DO describe the visual layout you want (e.g., "left-right comparison", "4 icon cards")
  3. DO NOT prefix lines with "Slogan:", "Visual:", "Points:", or any meta-labels — the model will render these as visible text on the slide
  4. DO NOT put the same sentence in both the title area and the bottom of the content — it causes duplication
  5. DO NOT include footer text, page numbers, or watermark instructions
Bad example (meta-labels leak as visible text):
Title: Why AI Matters
Visual: left-right comparison chart
Points:
- Point one
- Point two
Slogan: AI changes everything
Good example (clean, no meta-labels):
Title: Why AI Matters
Visual layout: left-right comparison chart showing traditional vs AI approach
Key points:
- Point one with brief explanation
- Point two with brief explanation
Bottom tagline: AI changes everything
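
Rule 3 can be checked mechanically before generation. The following is a rough sketch: `find_meta_labels` is a hypothetical helper, and it flags only the three labels that the rule names explicitly ("Slogan:", "Visual:", "Points:") at the start of a line.

```python
import re

# Only the three labels named in rule 3; extend the pattern if others leak in practice.
META_LABEL = re.compile(r"^\s*(Slogan|Visual|Points)\s*:", re.IGNORECASE)


def find_meta_labels(content: str) -> list:
    """Return the lines of a content field that start with a flagged meta-label."""
    return [line for line in content.splitlines() if META_LABEL.match(line)]


bad = (
    "Title: Why AI Matters\n"
    "Visual: left-right comparison chart\n"
    "Points:\n"
    "- Point one\n"
    "Slogan: AI changes everything"
)
good = (
    "Title: Why AI Matters\n"
    "Visual layout: left-right comparison chart\n"
    "Key points:\n"
    "- Point one\n"
    "Bottom tagline: AI changes everything"
)

print(find_meta_labels(bad))   # three flagged lines
print(find_meta_labels(good))  # []
```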


Phase 3: Select Style & Generate Slides


Available Styles


| Style | File | Visual Characteristics | Best For |
| --- | --- | --- | --- |
| Lineal Color | styles/lineal-color.md | White background, teal accents, flat 2D icons, info cards | Technical talks, lectures, educational |
| Gradient Glass | styles/gradient-glass.md | Light pastel background, frosted glass cards, Apple Keynote feel | Product launches, pitches, SaaS |
| Vector Illustration | styles/vector-illustration.md | Cream background, black outlines, retro colors, toy-model charm | Educational, children's content, brand stories |
Present the styles to the user and let them choose. If unsure, recommend Lineal Color as the default.

Available Models


| Model | Speed | Quality | When to Use |
| --- | --- | --- | --- |
| gemini-3-pro-image-preview | Moderate | Best | Final version, important presentations |
| gemini-3.1-flash-image-preview | Fast | Good | Drafts, rapid iteration, large decks |
| gemini-2.5-flash-image | Fastest | Basic | Quick prototypes, bulk generation |
For first-time generation, recommend gemini-3.1-flash-image-preview (fast iteration). Switch to gemini-3-pro-image-preview for the final version.

Generate Command


bash
python /skills/nano-banana/scripts/generate_ppt.py \
  --plan slides_plan.json \
  --style /skills/nano-banana/styles/lineal-color.md \
  --model gemini-3.1-flash-image-preview \
  --output ppt_output
Arguments:
  • --plan (required): Path to slides_plan.json
  • --style (required): Path to style template
  • --model: Image generation model (default: gemini-3-pro-image-preview)
  • --resolution: 2K (default) or 4K
  • --output: Output directory (default: ppt_output/TIMESTAMP)
  • --api-key: Google API key (if not in environment)
  • --workers: Number of parallel workers (default: 1, recommended: 3-5 for large decks)
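
When the command is assembled from a script rather than typed by hand, building the argument list explicitly avoids quoting mistakes. This is a sketch under assumptions: `build_generate_command` is a hypothetical helper that uses only the flags listed above and leaves out any option that was not set, so the script's own defaults apply.

```python
import shlex
from typing import List, Optional

SCRIPT = "/skills/nano-banana/scripts/generate_ppt.py"


def build_generate_command(plan: str, style: str, model: Optional[str] = None,
                           resolution: Optional[str] = None, output: Optional[str] = None,
                           api_key: Optional[str] = None,
                           workers: Optional[int] = None) -> List[str]:
    """Build argv for generate_ppt.py; omitted options fall back to the script defaults."""
    if resolution not in (None, "2K", "4K"):
        raise ValueError("resolution must be 2K or 4K")
    argv = ["python", SCRIPT, "--plan", plan, "--style", style]
    optional = [("--model", model), ("--resolution", resolution), ("--output", output),
                ("--api-key", api_key), ("--workers", workers)]
    for flag, value in optional:
        if value is not None:
            argv += [flag, str(value)]
    return argv


cmd = build_generate_command(
    "slides_plan.json",
    "/skills/nano-banana/styles/lineal-color.md",
    model="gemini-3.1-flash-image-preview",
    output="ppt_output",
    workers=4,
)
print(shlex.join(cmd))  # to execute: subprocess.run(cmd, check=True)
```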
Output structure:
ppt_output/
├── images/
│   ├── slide-01.png
│   ├── slide-02.png
│   └── ...
├── prompts.json    # All prompts used (for debugging)
└── index.html      # Browser viewer
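
After generation, a quick check that every expected image exists can catch partial failures before the review step. A minimal sketch, assuming file names use two-digit zero padding as in the tree above; `missing_slides` is a hypothetical helper, not part of the skill.

```python
from pathlib import Path
from typing import List


def missing_slides(output_dir: str, total_slides: int) -> List[str]:
    """Return expected image names that are absent from <output_dir>/images."""
    images = Path(output_dir) / "images"
    # Assumption: two-digit zero padding, as in slide-01.png and slide-02.png.
    expected = [f"slide-{n:02d}.png" for n in range(1, total_slides + 1)]
    return [name for name in expected if not (images / name).is_file()]


print(missing_slides("ppt_output", 3))
```

An empty list means all slides were written; otherwise the returned names show which slides to retry.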


Phase 4: Launch Review Server


Start the interactive review server so the user can review slides and write feedback:
bash
python /skills/nano-banana/scripts/serve_viewer.py \
  --dir ppt_output \
  --plan slides_plan.json \
  --port 8080 \
  --pid-file .viewer.pid
Tell the user:
Review server is running at http://localhost:8080. Open it in your browser to review each slide. Write feedback in the text box below any slide that needs changes, then click "Save Feedback". Tell me when you're done.
The server saves feedback directly into slides_plan.json as a feedback field on each slide.
Wait for the user to confirm they have saved their feedback before proceeding.


Phase 5: Apply Feedback Edits


Read slides_plan.json and find all slides with a non-empty feedback field. For each one, run the edit script:
bash
python /skills/nano-banana/scripts/edit_slide.py \
  --input ppt_output/images/slide-{NUMBER}.png \
  --instruction "{FEEDBACK_TEXT}" \
  --output ppt_output/images/slide-{NUMBER}.png \
  --model gemini-3.1-flash-image-preview
Arguments:
  • --input (required): Path to the original slide image
  • --instruction (required): The edit instruction (from feedback field)
  • --output: Output path (default: overwrite input)
  • --model: Image generation model
  • --api-key: Google API key (if not in environment)
After editing all slides with feedback, clear the feedback fields from slides_plan.json and tell the user to refresh the browser to see updated slides.
If the user has more feedback, repeat Phase 4-5. This review-edit cycle continues until the user is satisfied.
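
The feedback pass described in this phase can be sketched as two small helpers. This is an illustration, not the skill's implementation: `pending_edits` and `clear_feedback` are hypothetical names, the image path assumes two-digit zero padding, and "clearing" is assumed to mean removing the feedback key (setting it to an empty string may be equally valid).

```python
from typing import List

EDIT_SCRIPT = "/skills/nano-banana/scripts/edit_slide.py"


def pending_edits(plan: dict, images_dir: str = "ppt_output/images",
                  model: str = "gemini-3.1-flash-image-preview") -> List[List[str]]:
    """Return one edit_slide.py argv per slide whose feedback field is non-empty."""
    commands = []
    for slide in plan.get("slides", []):
        feedback = str(slide.get("feedback") or "").strip()
        if not feedback:
            continue
        image = f"{images_dir}/slide-{slide['slide_number']:02d}.png"
        commands.append(["python", EDIT_SCRIPT, "--input", image,
                         "--instruction", feedback, "--output", image, "--model", model])
    return commands


def clear_feedback(plan: dict) -> dict:
    """Drop the feedback field from every slide once the edits have been applied."""
    for slide in plan.get("slides", []):
        slide.pop("feedback", None)
    return plan


plan = {"slides": [
    {"slide_number": 1, "content": "Title: Intro", "feedback": ""},
    {"slide_number": 2, "content": "Title: Results", "feedback": "Make the title larger"},
]}
for argv in pending_edits(plan):
    print(argv)  # to execute: subprocess.run(argv, check=True)
clear_feedback(plan)
```

Passing the feedback as a separate list element, rather than interpolating it into a shell string, keeps quotes and special characters in the user's text intact.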


Phase 6: Package as PPTX


Once the user approves all slides, ask for the desired filename and package them:
bash
python /skills/nano-banana/scripts/package_pptx.py \
  --dir ppt_output/images \
  --output presentation.pptx \
  --kill-server .viewer.pid
Arguments:
  • --dir (required): Directory containing slide-XX.png images
  • --output (required): Output .pptx file path
  • --kill-server: PID file from serve_viewer.py; automatically stops the review server after packaging


Phase 7: Cleanup


  • The review server is automatically stopped by package_pptx.py --kill-server
  • Ask the user if they want to keep the ppt_output/ directory or clean it up
  • The slides_plan.json can be kept for future re-generation


Counterintuitive Rules


  1. Never include meta-labels in content. Words like "Slogan:", "Visual:", "Points:" will be rendered as visible text on the slide. Describe what you want without prefixes.
  2. Content describes WHAT, not HOW. The style template handles visual layout. The content field should focus on text and logical structure, not colors or positioning.
  3. More planning = better slides. Spending 10 minutes on the Phase 1 conversation saves hours of re-generation. Do not rush to Phase 3.
  4. Edit, don't regenerate. When a slide needs minor changes (text fix, color change, remove footer), use edit_slide.py instead of regenerating from scratch. Editing preserves visual consistency.
  5. Use the flash model for drafts. gemini-3.1-flash-image-preview is fast enough for iteration. Only switch to gemini-3-pro-image-preview for the final version after all feedback is addressed.
  6. Never read generated images yourself. Not all models support multimodal input. Do NOT use read_file on generated PNG images to check quality. Always launch the review server and let the user inspect slides visually in the browser. The user's feedback is your only quality signal.
  7. One idea per slide. Do not pack multiple concepts into a single slide. If a slide has more than 4 bullet points, split it into two slides.
  8. Bottom taglines should not repeat the title. If the title says "Why AI Matters", the bottom tagline should add new insight, not restate the title.
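
Rule 7 lends itself to an automated check on the plan. A small sketch, assuming bullet points in the content field are lines that start with "- " as in the schema example; `count_bullets` and `slides_to_split` are hypothetical helpers.

```python
from typing import List

MAX_BULLETS = 4  # threshold taken from rule 7


def count_bullets(content: str) -> int:
    """Count lines that look like bullet points (assumed to start with '- ')."""
    return sum(1 for line in content.splitlines() if line.lstrip().startswith("- "))


def slides_to_split(plan: dict) -> List[int]:
    """Return slide numbers whose content has more than MAX_BULLETS bullet points."""
    return [slide["slide_number"] for slide in plan.get("slides", [])
            if count_bullets(slide.get("content", "")) > MAX_BULLETS]


plan = {"slides": [
    {"slide_number": 1, "content": "Title: A\n- one\n- two"},
    {"slide_number": 2, "content": "Title: B\n- 1\n- 2\n- 3\n- 4\n- 5"},
]}
print(slides_to_split(plan))  # [2]
```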


Scripts Reference


| Script | Purpose | Key Arguments |
| --- | --- | --- |
| scripts/generate_ppt.py | Batch generate all slides from plan | --plan, --style, --model, --output, --resolution, --api-key, --workers |
| scripts/edit_slide.py | Edit a single slide based on instruction | --input, --instruction, --output, --model, --api-key |
| scripts/serve_viewer.py | Local review server with feedback | --dir, --plan, --port, --no-open, --pid-file |
| scripts/package_pptx.py | Package slide images into .pptx | --dir, --output, --kill-server |


Style Template Format


Style templates are markdown files in styles/ with a fixed structure that generate_ppt.py parses:
| Section | Purpose | Parsed by Code |
| --- | --- | --- |
| ## Base Prompt | Visual specifications shared by all slides | Yes, injected into every prompt |
| ## Page Templates | Layout descriptions per page type | Fallback only |
| ## Examples | Actual prompt templates with {Base Prompt} and [Content] placeholders | Yes, primary templates |
| Other sections | Documentation only | No |
To create a new style: copy an existing .md file and modify the ## Base Prompt and ## Examples sections. The code extracts ### Cover, ### Content, and ### Data code blocks from ## Examples.
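
To make the placeholder mechanism concrete, here is a sketch of how a template with {Base Prompt} and [Content] might be filled. It is an assumption about the substitution step, not the actual code in generate_ppt.py; `fill_template` and the sample template text are hypothetical.

```python
def fill_template(template: str, base_prompt: str, content: str) -> str:
    """Substitute the two placeholders used by the style templates."""
    return template.replace("{Base Prompt}", base_prompt).replace("[Content]", content)


# Hypothetical template text; real templates live under "## Examples" in a style file.
cover_template = "{Base Prompt}\n\nCreate a cover slide with the following content:\n[Content]"

prompt = fill_template(
    cover_template,
    base_prompt="White background, teal accents, flat 2D icons.",
    content="Title: My Presentation\nSubtitle: A subtitle here",
)
print(prompt)
```

Because the content is pasted into the prompt verbatim, any stray label in the content field ends up in front of the image model, which is why the content rules matter.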