nano-banana-builder

Nano Banana Builder

Build production-ready web applications powered by Google's Nano Banana image generation APIs—creating everything from simple text-to-image generators to sophisticated iterative editors with multi-turn conversation.

CRITICAL: Exact Model Names

Use ONLY these exact model strings. Do not invent, guess, or add date suffixes.

| Model String (use exactly) | Alias | Use Case |
| --- | --- | --- |
| `gemini-2.5-flash-image` | Nano Banana | Fast iterations, drafts, high volume |
| `gemini-3-pro-image-preview` | Nano Banana Pro | Quality output, text rendering, 2K |

Common mistakes to avoid:

❌ `gemini-2.5-flash-preview-05-20`: wrong, date suffixes are for text models
❌ `gemini-2.5-pro-image`: wrong, 2.5 Pro doesn't do image generation
❌ `gemini-3-flash-image`: wrong, doesn't exist
❌ `gemini-pro-vision`: wrong, that's for image input, not generation

The only valid image generation models are `gemini-2.5-flash-image` and `gemini-3-pro-image-preview`.
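
One way to enforce the rule above is to validate model names before any request is sent. The following is a minimal sketch; `isImageModel` and `assertImageModel` are hypothetical helper names, not part of any SDK.

```typescript
// The two model strings listed above; nothing else is accepted.
const IMAGE_MODELS = ['gemini-2.5-flash-image', 'gemini-3-pro-image-preview'] as const

type ImageModel = (typeof IMAGE_MODELS)[number]

// Hypothetical helper: narrows an arbitrary string to a known image model.
function isImageModel(name: string): name is ImageModel {
  return IMAGE_MODELS.some(model => model === name)
}

// Hypothetical helper: fails early instead of sending a bad model name to the API.
function assertImageModel(name: string): ImageModel {
  if (!isImageModel(name)) {
    throw new Error(`Unknown image model "${name}". Use one of: ${IMAGE_MODELS.join(', ')}`)
  }
  return name
}
```

Calling `assertImageModel` on user or config input at the server boundary keeps a mistyped name from reaching the API as a confusing runtime error.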

Philosophy: Conversational Image Generation

Nano Banana isn't just another image API—it's conversational by design. The core insight is that image generation works best as a dialogue, not a one-shot prompt.

Think of it as working with an AI art director:

Iterative refinement → Build up images through conversation, not perfection in one prompt
Context awareness → The model "remembers" previous generations and edits
Natural language editing → Describe changes conversationally, not with parameters
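
The dialogue idea can be sketched as an append-only list of turns. This is an illustration only: `Turn`, `ImageRef`, `addEdit`, `addResult`, and `latestImage` are hypothetical names and shapes, not the message types of any particular SDK.

```typescript
// Hypothetical shapes for illustration; not the types of any particular SDK.
type ImageRef = { url: string; mediaType: string }

type Turn =
  | { role: 'user'; text: string }
  | { role: 'assistant'; image: ImageRef }

// Record a user instruction, e.g. "make the sky more orange".
function addEdit(history: readonly Turn[], instruction: string): Turn[] {
  return [...history, { role: 'user', text: instruction }]
}

// Record the image the model produced in response.
function addResult(history: readonly Turn[], image: ImageRef): Turn[] {
  return [...history, { role: 'assistant', image }]
}

// The most recent generated image is the one the next instruction refers to.
function latestImage(history: readonly Turn[]): ImageRef | undefined {
  for (let i = history.length - 1; i >= 0; i--) {
    const turn = history[i]
    if (turn.role === 'assistant') return turn.image
  }
  return undefined
}
```

Sending the whole list with each request is what lets a short follow-up such as "make it warmer" refer to earlier prompts and results.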

Before Building, Ask

What's the primary use case? Text-to-image generation? Image editing? Multi-image composition? Style transfer?
Which model fits the need? Nano Banana (speed/iterations) or Nano Banana Pro (quality/complex prompts)?
What's the user journey? Single generation? Iterative refinement? Gallery browsing?
What are production constraints? Rate limits? Storage? Cost per image? User volume?

Core Principles

Conversation over configuration: Leverage Nano Banana's iterative editing rather than complex parameter UIs
Model selection matters: Use `gemini-2.5-flash-image` for speed/iterations, `gemini-3-pro-image-preview` for quality/complexity
State as conversation history: Track generations as chat messages to enable multi-turn editing
Rate limit awareness: Image generation has strict quotas—implement queuing and caching
Storage strategy: Store generated images (Vercel Blob/S3), not just inline base64
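
As one possible reading of the caching advice above, identical requests can be served from a lookup table of stored image URLs instead of spending quota again. This is a sketch under stated assumptions: `GenerationRequest`, `cacheKey`, and `ImageUrlCache` are hypothetical names, and a production app would likely back the map with Redis or a database.

```typescript
// Hypothetical request shape used only for this sketch.
type GenerationRequest = { model: string; prompt: string; aspectRatio: string }

// Collapse whitespace so trivially different prompts share one entry.
// Case is preserved because it can matter for text rendered inside the image.
function cacheKey(req: GenerationRequest): string {
  const prompt = req.prompt.trim().replace(/\s+/g, ' ')
  return `${req.model}|${req.aspectRatio}|${prompt}`
}

// In-memory cache mapping a request to the URL of an already stored image.
class ImageUrlCache {
  private readonly urls = new Map<string, string>()

  get(req: GenerationRequest): string | undefined {
    return this.urls.get(cacheKey(req))
  }

  set(req: GenerationRequest, url: string): void {
    this.urls.set(cacheKey(req), url)
  }
}
```

Caching only makes sense for first-turn prompts; a follow-up edit depends on the conversation so far, so its key would also need to include that history.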

Model Selection Framework

Choose based on use case:

| Use Case | Model | Why |
| --- | --- | --- |
| Rapid iterations, drafts | `gemini-2.5-flash-image` | Fast (2-5s), lower cost per image |
| Final output, quality | `gemini-3-pro-image-preview` | Superior quality, thinking, text rendering |
| Text-heavy images | `gemini-3-pro-image-preview` | Best typography, 2K resolution |
| Multi-turn editing | Either | Both support conversational editing |
| High volume | `gemini-2.5-flash-image` | Lower cost, faster throughput |
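
The table can be encoded as a small lookup so that model choice lives in one place. This is a sketch; the `UseCase` labels and `pickModel` are hypothetical names chosen to mirror the table rows.

```typescript
type ImageModel = 'gemini-2.5-flash-image' | 'gemini-3-pro-image-preview'

// Hypothetical labels mirroring the rows of the table above.
type UseCase = 'draft' | 'final' | 'text-heavy' | 'multi-turn' | 'high-volume'

// Map a use case to a model; multi-turn editing works with either model,
// so the caller's preference (defaulting to the faster model) decides.
function pickModel(useCase: UseCase, preferred: ImageModel = 'gemini-2.5-flash-image'): ImageModel {
  switch (useCase) {
    case 'draft':
    case 'high-volume':
      return 'gemini-2.5-flash-image'
    case 'final':
    case 'text-heavy':
      return 'gemini-3-pro-image-preview'
    case 'multi-turn':
      return preferred
  }
}
```

A common arrangement is to iterate with `pickModel('draft')` and switch to `pickModel('final')` only when the user asks for the finished image.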

Quick Start

Basic Server Action

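```typescript
// app/actions/generate.ts
'use server'

import { google } from '@ai-sdk/google'
import { generateText } from 'ai'

export async function generateImage(prompt: string) {
  const result = await generateText({
    model: google('gemini-2.5-flash-image'),
    prompt,
    providerOptions: {
      google: {
        responseModalities: ['IMAGE'],
        imageConfig: { aspectRatio: '16:9' }
      }
    }
  })

  // The response can come back without an image (for example, after a refusal).
  const image = result.files.find(file => file.mediaType.startsWith('image/'))
  if (!image) throw new Error('No image was returned for this prompt')

  // Return a plain object so the value serializes across the server action boundary.
  return { base64: image.base64, mediaType: image.mediaType }
}
```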

Client Component with useChat

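```typescript
// app/components/ImageGenerator.tsx
'use client'

import { useChat } from '@ai-sdk/react'
import { DefaultChatTransport } from 'ai'

// Written against AI SDK v5, matching the `mediaType` field used by the server action.
// Earlier SDK versions expose `append` and `isLoading` instead of `sendMessage` and `status`.
export function ImageGenerator() {
  const { messages, sendMessage, status } = useChat({
    transport: new DefaultChatTransport({ api: '/api/generate' })
  })
  const isLoading = status === 'submitted' || status === 'streaming'

  return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          {/* Generated images arrive as file parts with an image media type */}
          {m.parts.map((part, i) =>
            part.type === 'file' && part.mediaType.startsWith('image/') && (
              <img key={i} src={part.url} alt="Generated" />
            )
          )}
        </div>
      ))}

      <button
        disabled={isLoading}
        onClick={() => sendMessage({ text: 'A futuristic cityscape at dusk' })}
      >
        Generate
      </button>
    </div>
  )
}
```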

Advanced Implementation

For complete implementations including:

Server Actions with model selection, storage, and error handling
API Routes with streaming responses
Client Components with iterative editing and galleries
Advanced Patterns like multi-image composition and batch generation

See references/advanced-patterns.md

Configuration & Operations

For detailed configuration and operational concerns:

Provider Options (responseModalities, imageConfig, thinkingConfig)
Storage Strategy (Vercel Blob, S3/R2 implementations)
Rate Limiting (Upstash Redis patterns, quota management)
Cost Optimization strategies

See references/configuration.md
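
As a rough illustration of the storage strategy item, the sketch below turns a generated image's base64 payload into bytes and a storage key before handing it to an uploader. All names (`Upload`, `base64ToBytes`, `storageKey`, `storeImage`) and the `generations/` key prefix are assumptions for this example; the uploader is injected so the sketch does not depend on a specific storage SDK.

```typescript
// Hypothetical uploader signature: stores the bytes and resolves to a public URL.
type Upload = (key: string, bytes: Uint8Array, contentType: string) => Promise<string>

// File extensions for the media types this sketch expects; others fall back to "bin".
const EXTENSIONS: Record<string, string> = {
  'image/png': 'png',
  'image/jpeg': 'jpg',
  'image/webp': 'webp'
}

// Decode a base64 string into raw bytes.
function base64ToBytes(base64: string): Uint8Array {
  const binary = atob(base64)
  const bytes = new Uint8Array(binary.length)
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i)
  return bytes
}

// Build a storage key such as "generations/abc.png" (the prefix is an assumption).
function storageKey(id: string, mediaType: string): string {
  const ext = EXTENSIONS[mediaType] ?? 'bin'
  return `generations/${id}.${ext}`
}

// Upload the image and return its URL, which is what the database should store.
async function storeImage(
  upload: Upload,
  id: string,
  image: { base64: string; mediaType: string }
): Promise<string> {
  return upload(storageKey(id, image.mediaType), base64ToBytes(image.base64), image.mediaType)
}
```

In practice `upload` would be a thin wrapper around whichever storage client the app uses, such as Vercel Blob or an S3-compatible client.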

Anti-Patterns to Avoid

❌ Inventing model names or adding date suffixes. Why wrong: Image generation models have specific names; date suffixes like `-preview-05-20` are for text models only. Better: Use exactly `gemini-2.5-flash-image` or `gemini-3-pro-image-preview`, with no variations.

❌ Using Gemini 2.5 Pro for images. Why wrong: Gemini 2.5 Pro doesn't generate images directly. Better: Use `gemini-2.5-flash-image` or `gemini-3-pro-image-preview`.

❌ Storing only base64 in the database. Why wrong: Bloats the database, makes storage expensive, and slows retrieval. Better: Store images in object storage (Vercel Blob/S3) and save only the URL.

❌ No rate limit handling. Why wrong: The app will hit 429 errors in production, which makes for poor UX. Better: Implement rate limiting with user-friendly error messages.

❌ Ignoring multi-turn context. Why wrong: Wastes Nano Banana's conversational editing strength. Better: Track chat history for iterative refinement.

❌ Hardcoding API keys client-side. Why wrong: Exposes credentials, which is a security risk. Better: Use server actions or API routes with environment variables.

❌ Using the wrong aspect ratio. Why wrong: Generating 21:9 when the layout needs 1:1 wastes tokens and forces an unexpected crop. Better: Match the aspect ratio to the intended use case.

❌ No loading states. Why wrong: Image generation takes 5-30s, so users think the app is broken. Better: Show progress indicators and an estimated wait time.

❌ Generating on every keystroke. Why wrong: Wastes quota and responds slowly. Better: Debounce prompts and require an explicit action.
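
For the rate limit item above, one common approach is to retry 429 responses with exponential backoff. The sketch below is an illustration, not the SDK's own retry mechanism: `backoffDelayMs`, `isRateLimited`, and `withRetry` are hypothetical helpers, and the check assumes the thrown error exposes the HTTP status as `status` or `statusCode`.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(capMs, baseMs * 2 ** attempt)
}

// Assumption: the error carries the HTTP status as `status` or `statusCode`.
function isRateLimited(error: unknown): boolean {
  const e = error as { status?: number; statusCode?: number } | null
  return e?.status === 429 || e?.statusCode === 429
}

// Retry `fn` on rate limit errors only; any other error is rethrown immediately.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = ms => new Promise(resolve => setTimeout(resolve, ms))
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (error) {
      if (!isRateLimited(error) || attempt >= maxRetries) throw error
      await sleep(backoffDelayMs(attempt))
    }
  }
}
```

Wrapping the generation call as `withRetry(() => generateImage(prompt))` keeps retries on the server; once retries are exhausted, the error should still surface to the user as a friendly "try again shortly" message.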

Variation Guidance

IMPORTANT: Every app should feel uniquely designed for its specific purpose.

Vary across dimensions:

UI Style: Minimal, brutalist, playful, professional, dark, light
Color Scheme: Warm, cool, monochrome, vibrant, muted
Layout: Single page, multi-step wizard, sidebar, grid, list
Interaction: Click-to-generate, drag-and-drop, real-time typing, batch

Avoid overused patterns:

❌ Default Tailwind purple gradients
❌ Generic "AI startup" aesthetic
❌ Same component libraries for every project
❌ Inter/Roboto fonts without thought

Context should drive design:

Meme generator → Bold, fun, casual
Product mockup tool → Clean, professional, grid-based
Art exploration → Gallery-first, visual-heavy
Brand asset creator → Polished, template-guided

Environment Setup

```bash
# .env.local
# Note: the default `google` provider from @ai-sdk/google reads
# GOOGLE_GENERATIVE_AI_API_KEY. To keep the name below, pass it explicitly
# with createGoogleGenerativeAI({ apiKey: process.env.GEMINI_API_KEY }).
GEMINI_API_KEY=your_api_key_here

# For Vercel Blob storage
BLOB_READ_WRITE_TOKEN=your_vercel_token

# For S3 (optional)
S3_BUCKET=your-bucket
S3_ENDPOINT=https://your-endpoint.r2.cloudflarestorage.com
S3_ACCESS_KEY_ID=your_key
S3_SECRET_ACCESS_KEY=your_secret

# For Upstash rate limiting (optional)
UPSTASH_REDIS_REST_URL=your_url
UPSTASH_REDIS_REST_TOKEN=your_token
```

```bash
# Install dependencies
npm install @ai-sdk/google ai @ai-sdk/react @vercel/blob

# Or, to call the Gemini API directly with Google's JavaScript SDK
npm install @google/genai
```

---

Remember

Nano Banana enables conversational image generation that feels like working with a creative partner, not a tool.

The best apps:

Leverage multi-turn editing for refinement
Choose models intentionally (speed vs quality)
Handle rate limits gracefully
Store images efficiently
Provide great loading states
Feel uniquely designed for their purpose

You're building more than an image generator—you're creating a creative experience. Design it thoughtfully.
