nano-banana-builder
Nano Banana Builder
Build production-ready web applications powered by Google's Nano Banana image generation APIs—creating everything from simple text-to-image generators to sophisticated iterative editors with multi-turn conversation.
CRITICAL: Exact Model Names
Use ONLY these exact model strings. Do not invent, guess, or add date suffixes.
| Model String (use exactly) | Alias | Use Case |
|---|---|---|
| `gemini-2.5-flash-image` | Nano Banana | Fast iterations, drafts, high volume |
| `gemini-3-pro-image-preview` | Nano Banana Pro | Quality output, text rendering, 2K |
Common mistakes to avoid:
- ❌ `gemini-2.5-flash-preview-05-20`: wrong, date suffixes are for text models
- ❌ `gemini-2.5-pro-image`: wrong, 2.5 Pro doesn't do image generation
- ❌ `gemini-3-flash-image`: wrong, doesn't exist
- ❌ `gemini-pro-vision`: wrong, that's for image input, not generation
The only valid image generation models are `gemini-2.5-flash-image` and `gemini-3-pro-image-preview`.
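One way to enforce the exact-string rule in code is to keep both model strings in a single constant and reject anything else before a request is sent. The sketch below is illustrative only: `IMAGE_MODELS`, `isImageModelId`, and `requireImageModelId` are hypothetical names, not part of any SDK.

```typescript
// The two model strings from the table above, keyed by their aliases.
const IMAGE_MODELS = {
  'Nano Banana': 'gemini-2.5-flash-image',
  'Nano Banana Pro': 'gemini-3-pro-image-preview',
} as const;

type ImageModelId = (typeof IMAGE_MODELS)[keyof typeof IMAGE_MODELS];

// Hypothetical helper: true only for an exact match, so near-misses such as
// 'gemini-2.5-pro-image' are rejected.
function isImageModelId(value: string): value is ImageModelId {
  return (Object.values(IMAGE_MODELS) as readonly string[]).includes(value);
}

// Hypothetical helper: narrows untrusted input (form field, query param) or throws.
function requireImageModelId(value: string): ImageModelId {
  if (!isImageModelId(value)) {
    throw new Error(`Unsupported image model: ${value}`);
  }
  return value;
}
```

Calling `requireImageModelId` at the boundary where a model name enters the app (for example, a model picker's submitted value) keeps guessed or suffixed names from reaching the API.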
Philosophy: Conversational Image Generation
Nano Banana isn't just another image API—it's conversational by design. The core insight is that image generation works best as a dialogue, not a one-shot prompt.
Think of it as working with an AI art director:
- Iterative refinement → Build up images through conversation, not perfection in one prompt
- Context awareness → The model "remembers" previous generations and edits
- Natural language editing → Describe changes conversationally, not with parameters
Before Building, Ask
- What's the primary use case? Text-to-image generation? Image editing? Multi-image composition? Style transfer?
- Which model fits the need? Nano Banana (speed/iterations) or Nano Banana Pro (quality/complex prompts)?
- What's the user journey? Single generation? Iterative refinement? Gallery browsing?
- What are production constraints? Rate limits? Storage? Cost per image? User volume?
Core Principles
- Conversation over configuration: Leverage Nano Banana's iterative editing rather than complex parameter UIs
- Model selection matters: Use `gemini-2.5-flash-image` for speed/iterations, `gemini-3-pro-image-preview` for quality/complexity
- State as conversation history: Track generations as chat messages to enable multi-turn editing
- Rate limit awareness: Image generation has strict quotas—implement queuing and caching
- Storage strategy: Store generated images (Vercel Blob/S3), not just inline base64
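The "state as conversation history" principle can be sketched as a small data structure: each user instruction and each generated image becomes one turn, and the latest image is what the next edit applies to. The `Turn` and `ImageRef` shapes and both helpers are hypothetical, chosen for illustration rather than taken from any SDK.

```typescript
// Hypothetical shapes for illustration; not types from any SDK.
type ImageRef = { url: string; mediaType: string };

type Turn =
  | { role: 'user'; text: string }
  | { role: 'assistant'; image: ImageRef };

// Append a turn without mutating the previous history, which keeps
// React state updates straightforward.
function addTurn(history: readonly Turn[], turn: Turn): Turn[] {
  return [...history, turn];
}

// The most recent generated image is the one the next edit instruction refers to.
function latestImage(history: readonly Turn[]): ImageRef | undefined {
  for (let i = history.length - 1; i >= 0; i--) {
    const turn = history[i];
    if (turn && turn.role === 'assistant') return turn.image;
  }
  return undefined;
}
```

Storing a URL in each assistant turn, rather than inline base64, also fits the storage principle in the same list.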
Model Selection Framework
Choose based on use case:
| Use Case | Model | Why |
|---|---|---|
| Rapid iterations, drafts | `gemini-2.5-flash-image` | Fast (2-5s), lower cost per image |
| Final output, quality | `gemini-3-pro-image-preview` | Superior quality, thinking, text rendering |
| Text-heavy images | `gemini-3-pro-image-preview` | Best typography, 2K resolution |
| Multi-turn editing | Either | Both support conversational editing |
| High volume | `gemini-2.5-flash-image` | Lower cost, faster throughput |
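The table can be expressed as a small lookup function. This is a minimal sketch: the `UseCase` labels and `chooseModel` are hypothetical names, and defaulting multi-turn editing to the faster model is an assumption, since the table allows either.

```typescript
type UseCase = 'draft' | 'final' | 'text-heavy' | 'multi-turn' | 'high-volume';
type ModelChoice = 'gemini-2.5-flash-image' | 'gemini-3-pro-image-preview';

// Mirrors the table above. 'multi-turn' works with either model, so the
// caller's preference decides; the faster model is the assumed default.
function chooseModel(useCase: UseCase, preferQuality = false): ModelChoice {
  switch (useCase) {
    case 'draft':
    case 'high-volume':
      return 'gemini-2.5-flash-image';
    case 'final':
    case 'text-heavy':
      return 'gemini-3-pro-image-preview';
    case 'multi-turn':
      return preferQuality ? 'gemini-3-pro-image-preview' : 'gemini-2.5-flash-image';
  }
}
```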
Quick Start
Basic Server Action
```typescript
// app/actions/generate.ts
'use server'

import { google } from '@ai-sdk/google'
import { generateText } from 'ai'

export async function generateImage(prompt: string) {
  const result = await generateText({
    model: google('gemini-2.5-flash-image'),
    prompt,
    providerOptions: {
      google: {
        // Request image output only and set the output shape
        responseModalities: ['IMAGE'],
        imageConfig: { aspectRatio: '16:9' }
      }
    }
  })

  // Generated images are returned in result.files
  return result.files[0] // { base64, uint8Array, mediaType }
}
```
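For a quick preview, the file returned by the server action above can be turned into a data URL for an `<img>` element. The sketch below assumes only the `base64` and `mediaType` fields noted in the comment on the return value; `GeneratedImage`, `firstImage`, and `toDataUrl` are hypothetical names.

```typescript
// Minimal shape of the returned file; only the fields used here.
type GeneratedImage = { base64: string; mediaType: string };

// Hypothetical helper: pick the first image file, or fail loudly when the
// model returned none (for example, a text-only response).
function firstImage(files: readonly GeneratedImage[]): GeneratedImage {
  const image = files.find((file) => file.mediaType.startsWith('image/'));
  if (!image) throw new Error('The model returned no image');
  return image;
}

// Hypothetical helper: build a data URL usable as an <img> src.
function toDataUrl(file: GeneratedImage): string {
  return `data:${file.mediaType};base64,${file.base64}`;
}
```

Data URLs suit previews only. For anything persisted, the storage principle above still applies: upload the bytes to object storage and keep the URL.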
Client Component with useChat
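```typescript
// app/components/ImageGenerator.tsx
'use client'

import { useChat } from '@ai-sdk/react'
import { DefaultChatTransport } from 'ai'

// Written against the AI SDK 5 useChat API (sendMessage, status, file parts)
export function ImageGenerator() {
  const { messages, sendMessage, status } = useChat({
    transport: new DefaultChatTransport({ api: '/api/generate' })
  })
  const isLoading = status === 'submitted' || status === 'streaming'

  return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          {/* Generated images arrive as file parts with an image media type */}
          {m.parts.map((part, i) =>
            part.type === 'file' && part.mediaType.startsWith('image/') && (
              <img key={i} src={part.url} alt="Generated" />
            )
          )}
        </div>
      ))}
      <button
        disabled={isLoading}
        onClick={() => sendMessage({ text: 'A futuristic cityscape at dusk' })}
      >
        Generate
      </button>
    </div>
  )
}
```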
Advanced Implementation
For complete implementations including:
- Server Actions with model selection, storage, and error handling
- API Routes with streaming responses
- Client Components with iterative editing and galleries
- Advanced Patterns like multi-image composition and batch generation
See references/advanced-patterns.md
Configuration & Operations
For detailed configuration and operational concerns:
- Provider Options (responseModalities, imageConfig, thinkingConfig)
- Storage Strategy (Vercel Blob, S3/R2 implementations)
- Rate Limiting (Upstash Redis patterns, quota management)
- Cost Optimization strategies
See references/configuration.md
Anti-Patterns to Avoid
❌ Inventing model names or adding date suffixes:
Why wrong: Image generation models have specific names; date suffixes like `-preview-05-20` are for text models only
Better: Use exactly `gemini-2.5-flash-image` or `gemini-3-pro-image-preview`, with no variations
❌ Using Gemini 2.5 Pro for images:
Why wrong: Gemini 2.5 Pro doesn't generate images directly
Better: Use `gemini-2.5-flash-image` or `gemini-3-pro-image-preview`
❌ Storing only base64 in database:
Why wrong: Bloats the database, expensive storage, slow retrieval
Better: Store in object storage (Vercel Blob/S3), save URL only
❌ No rate limit handling:
Why wrong: Will hit 429 errors in production, poor UX
Better: Implement rate limiting with user-friendly error messages
❌ Ignoring multi-turn context:
Why wrong: Wastes Nano Banana's conversational editing strength
Better: Track chat history for iterative refinement
❌ Hardcoding API keys client-side:
Why wrong: Exposes credentials, security risk
Better: Use server actions / API routes with environment variables
❌ Using wrong aspect ratio:
Why wrong: Requesting 21:9 for a 1:1 slot wastes tokens and leads to unexpected cropping
Better: Match aspect ratio to intended use case
❌ No loading states:
Why wrong: Image generation takes 5-30s, users think it's broken
Better: Show progress indicators and estimated wait time
❌ Generating on every keystroke:
Why wrong: Wastes quota, slow response
Better: Debounce prompts, require explicit action
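As an illustration of the rate-limit item above, a generation call can be wrapped in a retry helper that backs off on 429 responses and rethrows everything else. This is a sketch under stated assumptions: `withBackoff` and `isRateLimited` are hypothetical helpers, and the check assumes the thrown error exposes the HTTP status as `status` or `statusCode`, which should be confirmed against the client library in use.

```typescript
// Assumption: the thrown error carries the HTTP status as `status` or `statusCode`.
function isRateLimited(error: unknown): boolean {
  if (typeof error !== 'object' || error === null) return false;
  const e = error as { status?: unknown; statusCode?: unknown };
  return e.status === 429 || e.statusCode === 429;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry `task` on 429 with exponential backoff; rethrow any other error at once.
async function withBackoff<T>(
  task: () => Promise<T>,
  { retries = 3, baseDelayMs = 500 }: { retries?: number; baseDelayMs?: number } = {},
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      if (!isRateLimited(error) || attempt >= retries) throw error;
      // Delay doubles on each attempt: baseDelayMs, then 2x, then 4x, and so on
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

A call such as `withBackoff(() => generateImage(prompt))` keeps the retry policy in one place. When the retries run out, the final 429 still surfaces to the caller, which is the point at which to show a user-friendly message.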
Variation Guidance
IMPORTANT: Every app should feel uniquely designed for its specific purpose.
Vary across dimensions:
- UI Style: Minimal, brutalist, playful, professional, dark, light
- Color Scheme: Warm, cool, monochrome, vibrant, muted
- Layout: Single page, multi-step wizard, sidebar, grid, list
- Interaction: Click-to-generate, drag-and-drop, real-time typing, batch
Avoid overused patterns:
- ❌ Default Tailwind purple gradients
- ❌ Generic "AI startup" aesthetic
- ❌ Same component libraries for every project
- ❌ Inter/Roboto fonts without thought
Context should drive design:
- Meme generator → Bold, fun, casual
- Product mockup tool → Clean, professional, grid-based
- Art exploration → Gallery-first, visual-heavy
- Brand asset creator → Polished, template-guided
Environment Setup
```bash
# .env.local
GEMINI_API_KEY=your_api_key_here

# For Vercel Blob storage
BLOB_READ_WRITE_TOKEN=your_vercel_token

# For S3 (optional)
S3_BUCKET=your-bucket
S3_ENDPOINT=https://your-endpoint.r2.cloudflarestorage.com
S3_ACCESS_KEY_ID=your_key
S3_SECRET_ACCESS_KEY=your_secret

# For Upstash rate limiting (optional)
UPSTASH_REDIS_REST_URL=your_url
UPSTASH_REDIS_REST_TOKEN=your_token
```

```bash
# Install dependencies
npm install @ai-sdk/google ai @ai-sdk/react @vercel/blob

# Or if using the Google Gen AI SDK directly
npm install @google/genai
```
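The variables above can be checked once at startup so that a missing key fails fast instead of surfacing as a failed generation. The sketch below is one possible shape: `loadConfig` and `AppConfig` are hypothetical names, and preferring Vercel Blob when both storage groups are set is an arbitrary assumption.

```typescript
type Env = Record<string, string | undefined>;

interface AppConfig {
  geminiApiKey: string;
  storage: 'vercel-blob' | 's3' | 'none';
  rateLimiting: boolean;
}

// Hypothetical helper: read the variables listed above and fail fast when the
// one required key is missing. Optional groups only count when complete.
function loadConfig(env: Env): AppConfig {
  const geminiApiKey = env.GEMINI_API_KEY;
  if (!geminiApiKey) throw new Error('GEMINI_API_KEY is not set');

  const hasBlob = Boolean(env.BLOB_READ_WRITE_TOKEN);
  const hasS3 = ['S3_BUCKET', 'S3_ENDPOINT', 'S3_ACCESS_KEY_ID', 'S3_SECRET_ACCESS_KEY']
    .every((name) => Boolean(env[name]));
  const hasUpstash = Boolean(env.UPSTASH_REDIS_REST_URL && env.UPSTASH_REDIS_REST_TOKEN);

  return {
    geminiApiKey,
    // Assumption: prefer Vercel Blob when both storage backends are configured
    storage: hasBlob ? 'vercel-blob' : hasS3 ? 's3' : 'none',
    rateLimiting: hasUpstash,
  };
}
```

On the server this would be called as `loadConfig(process.env)`. Note that `@ai-sdk/google` looks for `GOOGLE_GENERATIVE_AI_API_KEY` by default, so a key stored under `GEMINI_API_KEY` likely needs to be passed explicitly, for example through `createGoogleGenerativeAI({ apiKey })`.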
---
Remember
Nano Banana enables conversational image generation that feels like working with a creative partner, not a tool.
The best apps:
- Leverage multi-turn editing for refinement
- Choose models intentionally (speed vs quality)
- Handle rate limits gracefully
- Store images efficiently
- Provide great loading states
- Feel uniquely designed for their purpose
You're building more than an image generator—you're creating a creative experience. Design it thoughtfully.