image-gen

Image Generation Skill

Generate and edit website images using Gemini Native Image Generation.

⚠️ Critical: SDK Migration Required

IMPORTANT: The `@google/generative-ai` package is deprecated as of November 30, 2025. All new projects must use `@google/genai`.

Migration Required:

```typescript
// ❌ OLD (deprecated, support ended Nov 30, 2025)
import { GoogleGenerativeAI } from "@google/generative-ai";
const genAI = new GoogleGenerativeAI(API_KEY);

// ✅ NEW (required)
import { GoogleGenAI } from "@google/genai";
const ai = new GoogleGenAI({ apiKey: API_KEY });
```

Source: GitHub Repository Migration Notice

Models

| Model | ID | Status | Best For |
| --- | --- | --- | --- |
| Gemini 3 Pro Image | `gemini-3-pro-image-preview` | Preview (Nov 20, 2025) | 4K, complex prompts, text |
| Gemini 2.5 Flash Image | `gemini-2.5-flash-image` | GA (Oct 2, 2025) | Fast iteration, general use |
| Imagen 4.0 | `imagen-4.0-generate-001` | GA (Aug 14, 2025) | Alternative platform |

Deprecated Models (do not use):

- `gemini-2.0-flash-exp-image-generation` - Shut down Nov 11, 2025
- `gemini-2.0-flash-preview-image-generation` - Shut down Nov 11, 2025
- `gemini-2.5-flash-image-preview` - Scheduled shutdown Jan 15, 2026
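
One way to keep a deprecated ID from slipping into a request is a small guard that runs before the API call. The sketch below is illustrative only: `assertModelNotDeprecated` is a hypothetical helper, not part of any SDK, and its table simply copies the IDs and dates listed above.

```typescript
// Deprecated model IDs and their status, copied from the list above.
const DEPRECATED_MODELS = new Map<string, string>([
  ["gemini-2.0-flash-exp-image-generation", "shut down Nov 11, 2025"],
  ["gemini-2.0-flash-preview-image-generation", "shut down Nov 11, 2025"],
  ["gemini-2.5-flash-image-preview", "scheduled shutdown Jan 15, 2026"],
]);

// Hypothetical helper: returns the ID unchanged, or throws if it is deprecated.
function assertModelNotDeprecated(modelId: string): string {
  const status = DEPRECATED_MODELS.get(modelId);
  if (status !== undefined) {
    throw new Error(`Model "${modelId}" is deprecated (${status})`);
  }
  return modelId;
}
```

Calling `assertModelNotDeprecated("gemini-2.5-flash-image")` returns the ID, while any of the three listed IDs throws before a request is sent.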

Source: Google AI Changelog

Capabilities

| Feature | Supported |
| --- | --- |
| Generate from text | ✅ |
| Edit existing images | ✅ |
| Change aspect ratio | ✅ |
| Widen/extend images | ✅ |
| Style transfer | ✅ |
| Change colours | ✅ |
| Add/remove elements | ✅ |
| Text in images | ✅ (legible!) |
| Multiple reference images | ✅ (up to 14: max 5 humans, 9 objects) |
| 4K resolution | ✅ (Pro only) |

Note: Exceeding 5 human reference images causes unpredictable character consistency. Keep human images ≤ 5 for reliable results.

Aspect Ratios

1:1   | 2:3  | 3:2  | 3:4  | 4:3
4:5   | 5:4  | 9:16 | 16:9 | 21:9
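
The ratio list above can be checked on the client before a request is built. This is a minimal sketch; `isSupportedAspectRatio` is a hypothetical helper name, and the list is copied from this section rather than read from the API.

```typescript
// Supported ratios, copied from the list above.
const SUPPORTED_ASPECT_RATIOS = [
  "1:1", "2:3", "3:2", "3:4", "4:3",
  "4:5", "5:4", "9:16", "16:9", "21:9",
] as const;

type AspectRatio = (typeof SUPPORTED_ASPECT_RATIOS)[number];

// Hypothetical type guard: true only for ratios in the list above.
function isSupportedAspectRatio(value: string): value is AspectRatio {
  return (SUPPORTED_ASPECT_RATIOS as readonly string[]).includes(value);
}
```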

Resolutions (Pro only)

| Size | 1:1 | 16:9 | 4:3 |
| --- | --- | --- | --- |
| 1K | 1024x1024 | 1376x768 | 1184x880 |
| 2K | 2048x2048 | 2752x1536 | 2368x1760 |
| 4K | 4096x4096 | 5504x3072 | 4736x3520 |

Quick Start

```typescript
import * as fs from "node:fs";
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Generate new image
const response = await ai.models.generateContent({
  model: "gemini-2.5-flash-image",
  contents: "A professional plumber in hi-vis working in modern Australian home",
  config: {
    responseModalities: ["TEXT", "IMAGE"],  // BOTH required - cannot use ["IMAGE"] alone
    imageConfig: {
      aspectRatio: "16:9",
    },
  },
});

// Extract image (guards against a response with no candidates or parts)
for (const part of response.candidates?.[0]?.content?.parts ?? []) {
  if (part.inlineData?.data) {
    const buffer = Buffer.from(part.inlineData.data, "base64");
    fs.writeFileSync("hero.png", buffer);
  }
}
```

Important: `responseModalities` must include both `["TEXT", "IMAGE"]`. Using `["IMAGE"]` alone may fail or produce unexpected results.
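
Because the response can mix text and image parts, it can help to separate extraction from file writing. The sketch below mirrors the Quick Start loop as a pure function. The `ResponseLike` and `Part` interfaces are assumptions that describe only the fields the loop reads; they are not imported from the SDK, and `extractImages` is a hypothetical helper.

```typescript
// Structural types covering only the fields this sketch reads (assumed shape).
interface InlineData { data?: string; mimeType?: string }
interface Part { text?: string; inlineData?: InlineData }
interface ResponseLike { candidates?: { content?: { parts?: Part[] } }[] }

interface ExtractedImage { base64: string; mimeType: string }

// Collects every image part from the first candidate; returns [] when there is none.
function extractImages(response: ResponseLike): ExtractedImage[] {
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  const images: ExtractedImage[] = [];
  for (const part of parts) {
    const data = part.inlineData?.data;
    if (data) {
      // Falls back to image/png when the part carries no MIME type.
      images.push({ base64: data, mimeType: part.inlineData?.mimeType ?? "image/png" });
    }
  }
  return images;
}
```

An empty result is a signal to inspect the text parts, which may explain why no image was produced.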

Model Selection

| Requirement | Use |
| --- | --- |
| Fast iteration | Gemini 2.5 Flash Image |
| 4K resolution | Gemini 3 Pro Image Preview |
| Text in images | Gemini 3 Pro (94% legibility at 4K) |
| Simple edits | Gemini 2.5 Flash Image |
| Complex compositions | Gemini 3 Pro Image Preview |
| Infographics/diagrams | Gemini 3 Pro Image Preview |

Text Rendering Benchmarks (4K resolution):

- Gemini 3 Pro Image: 94% legible text
- DALL-E 3: 78% legible text
- Midjourney: Decorative pseudo-text only
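
The selection table above can be sketched as a lookup. The requirement keys and the `pickModel` helper are hypothetical names chosen for illustration; the rule that any Pro-only requirement selects the Pro model is an assumption layered on the table, not documented behaviour.

```typescript
type Requirement =
  | "fast-iteration"
  | "4k-resolution"
  | "text-in-images"
  | "simple-edits"
  | "complex-compositions"
  | "infographics";

const FLASH = "gemini-2.5-flash-image";
const PRO = "gemini-3-pro-image-preview";

// One entry per row of the selection table.
const MODEL_FOR: Record<Requirement, string> = {
  "fast-iteration": FLASH,
  "4k-resolution": PRO,
  "text-in-images": PRO,
  "simple-edits": FLASH,
  "complex-compositions": PRO,
  "infographics": PRO,
};

// Assumption: if any requirement maps to Pro, use Pro; otherwise use Flash.
function pickModel(requirements: Requirement[]): string {
  return requirements.some((r) => MODEL_FOR[r] === PRO) ? PRO : FLASH;
}
```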

When to Use

Use Gemini Image Gen when:

- Stock photos don't fit brand/context
- Need Australian-specific imagery
- Need text in images (infographics, diagrams)
- Need consistent style across multiple images
- Need to edit/modify existing images
- Client has no photos of their work

Don't use when:

- Client has good photos of actual work
- Real team photos needed (discuss first)
- Product shots (use real products)
- Legal/compliance concerns

Known Issues Prevention

This skill prevents 5 documented issues:

Issue #1: Resolution Parameter Case Sensitivity

Error: Request fails with an invalid parameter error.

Source: Google AI Image Generation Docs

Why It Happens: Resolution values are case-sensitive and must use an uppercase 'K'.

Prevention: Always use `"4K"`, `"2K"`, or `"1K"`, never lowercase `"4k"`.

```typescript
// ❌ WRONG - causes request failure
config: { imageConfig: { imageSize: "4k" } }

// ✅ CORRECT - uppercase required
config: { imageConfig: { imageSize: "4K" } }
```
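
When the resolution comes from user input or a config file, normalising it before the request avoids the lowercase failure. This is a minimal sketch; `normalizeImageSize` is a hypothetical helper and the accepted values are the three named in this section.

```typescript
type ImageSize = "1K" | "2K" | "4K";
const VALID_SIZES: readonly ImageSize[] = ["1K", "2K", "4K"];

// Hypothetical helper: uppercases the input and rejects anything outside 1K/2K/4K.
function normalizeImageSize(input: string): ImageSize {
  const upper = input.trim().toUpperCase();
  const match = VALID_SIZES.find((size) => size === upper);
  if (match === undefined) {
    throw new Error(`Unsupported resolution "${input}"; expected one of ${VALID_SIZES.join(", ")}`);
  }
  return match;
}
```

For example, `normalizeImageSize("4k")` returns `"4K"`, while `normalizeImageSize("8K")` throws.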

Issue #2: Aspect Ratio May Be Ignored (Sept 2025+)

Error: Returns a 1:1 square image despite requesting 16:9 or another ratio.

Source: Google Support Thread

Why It Happens: A backend update in September 2025 affected the Gemini 2.5 Flash Image model's aspect ratio handling.

Prevention: Use Gemini 3 Pro Image Preview for reliable aspect ratio control, or generate 1:1 and use multi-turn editing to extend.

```typescript
// May ignore aspectRatio on Gemini 2.5 Flash Image
model: "gemini-2.5-flash-image",
config: { imageConfig: { aspectRatio: "16:9" } }

// More reliable for aspect ratio control
model: "gemini-3-pro-image-preview",
config: { imageConfig: { aspectRatio: "16:9" } }
```

Status: Google confirmed working on fix (Sept 2025).
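
Until the fix lands, the returned image can be checked against the requested ratio before it is used. The sketch below assumes the decoded bytes are a PNG (the Quick Start saves the output as `hero.png`) and reads the width and height from the IHDR header. `readPngSize`, `matchesAspectRatio`, and the 2% tolerance are illustrative choices, not part of any SDK.

```typescript
// Reads width and height from a PNG's IHDR chunk (bytes 16-23, big-endian).
function readPngSize(bytes: Uint8Array): { width: number; height: number } {
  const signature = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
  if (bytes.length < 24 || !signature.every((b, i) => bytes[i] === b)) {
    throw new Error("Not a PNG file");
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return { width: view.getUint32(16), height: view.getUint32(20) };
}

// True when width/height is within `tolerance` (relative) of the requested "W:H" ratio.
function matchesAspectRatio(width: number, height: number, ratio: string, tolerance = 0.02): boolean {
  const [w, h] = ratio.split(":").map(Number);
  if (!(w > 0) || !(h > 0) || !(width > 0) || !(height > 0)) return false;
  const expected = w / h;
  return Math.abs(width / height - expected) / expected <= tolerance;
}
```

A 1024x1024 result for a 16:9 request fails the check, which is the symptom described above; a caller could then retry with the Pro model or extend the image through editing.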

Issue #3: Exceeding 5 Human Reference Images

Error: Unpredictable character consistency in generated images.

Source: Google AI Image Generation Docs

Why It Happens: Gemini 3 Pro Image supports up to 14 reference images in total, but only 5 can be human images for character consistency.

Prevention: Limit human images to 5 or fewer. Use the remaining slots (up to 14 total) for objects/scenes.

```typescript
// img1..img7 and images are base64-encoded PNG strings defined elsewhere
const toPart = (img: string) => ({ inlineData: { data: img, mimeType: "image/png" } });

// ❌ WRONG - 7 human images exceeds limit
const tooManyHumans = [img1, img2, img3, img4, img5, img6, img7];
const badPrompt = [
  { text: "Generate consistent characters" },
  ...tooManyHumans.map(toPart),
];

// ✅ CORRECT - max 5 human images
// Assumes `images` lists human references first, then object references
const humanImages = images.slice(0, 5);  // Limit to 5
const objectImages = images.slice(5, 14);  // Up to 9 more for objects
const prompt = [
  { text: "Generate consistent characters" },
  ...humanImages.map(toPart),
  ...objectImages.map(toPart),
];
```
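
Slicing by position only works when the array is ordered with humans first. A sketch that tags each reference image explicitly and enforces both limits is shown below; `RefImage`, its `kind` field, and `buildReferenceParts` are hypothetical names, and the limits are the ones stated in this section.

```typescript
interface RefImage { base64: string; mimeType: string; kind: "human" | "object" }
type PromptPart = { text: string } | { inlineData: { data: string; mimeType: string } };

const MAX_HUMAN_REFS = 5;   // limit on human references stated above
const MAX_TOTAL_REFS = 14;  // limit on total references stated above

// Builds the parts array, throwing instead of silently exceeding either limit.
function buildReferenceParts(promptText: string, refs: RefImage[]): PromptPart[] {
  const humanCount = refs.filter((ref) => ref.kind === "human").length;
  if (humanCount > MAX_HUMAN_REFS) {
    throw new Error(`Too many human reference images: ${humanCount} (max ${MAX_HUMAN_REFS})`);
  }
  if (refs.length > MAX_TOTAL_REFS) {
    throw new Error(`Too many reference images: ${refs.length} (max ${MAX_TOTAL_REFS})`);
  }
  const imageParts: PromptPart[] = refs.map((ref) => ({
    inlineData: { data: ref.base64, mimeType: ref.mimeType },
  }));
  return [{ text: promptText }, ...imageParts];
}
```

Throwing early makes the limit visible at build time rather than surfacing later as inconsistent characters in the output.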

Issue #4: SynthID Watermark Cannot Be Disabled

Error: N/A (documented limitation)

Source: Google AI Image Generation Docs

Why It Happens: All generated images automatically include a SynthID watermark for content authenticity tracking.

Prevention: Be aware of this limitation for commercial use cases. The watermark cannot be disabled by developers.

Issue #5: Google Search Grounding Excludes Image Results

Error: Generated images don't reflect visual search results, only text.

Source: Google AI Image Generation Docs

Why It Happens: When using the Google Search tool with image generation, "image-based search results are not passed to the generation model."

Prevention: Only text-based search results inform the visual output. Don't expect the model to reference images from search results.

```typescript
// Google Search tool enabled
const response = await ai.models.generateContent({
  model: "gemini-3-pro-image-preview",
  contents: "Generate image of latest iPhone design",
  config: {
    tools: [{ googleSearch: {} }],  // in @google/genai, tools are passed inside config
    responseModalities: ["TEXT", "IMAGE"],
  },
});
// Result: Only text search results used, not image results from web search
```

Pricing

Current Pricing (as of November 2025):

Gemini 2.5 Flash Image: ~$0.039 per image (1290 output tokens at $30.00 per 1M output tokens)
- Input: 258 tokens per image
- Output: 1290 tokens per image
- Rate: $30.00 per 1M output tokens

Note: The `generateImages` API (Imagen models) does not return `usageMetadata` in responses. Track costs manually based on the pricing above.
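
Manual tracking for Gemini 2.5 Flash Image can be sketched as a small calculation from the token count and rate listed above. The constants are copied from this section; `estimateOutputCostUsd` is a hypothetical helper that covers output tokens only and ignores input tokens.

```typescript
// Figures copied from the pricing list above.
const OUTPUT_TOKENS_PER_IMAGE = 1290;
const USD_PER_MILLION_OUTPUT_TOKENS = 30;

// Estimated output cost in USD for a number of generated images.
function estimateOutputCostUsd(imageCount: number): number {
  const outputTokens = imageCount * OUTPUT_TOKENS_PER_IMAGE;
  return (outputTokens * USD_PER_MILLION_OUTPUT_TOKENS) / 1_000_000;
}
```

One image works out to 1290 × 30 / 1,000,000 = $0.0387, and 100 images to $3.87.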

Source: Google Developers Blog - Gemini 2.5 Flash Image

Reference Files

- `references/prompting.md` - Effective prompt patterns
- `references/website-images.md` - Hero, service, background templates
- `references/editing.md` - Multi-turn editing patterns
- `references/local-imagery.md` - Australian-specific details
- `references/integration.md` - API code examples

Last verified: 2026-01-21 | Skill version: 2.0.0 | Changes: Added SDK migration notice (critical), updated to current model names (gemini-3-pro-image-preview, gemini-2.5-flash-image), added 5 Known Issues (resolution case sensitivity, aspect ratio bug, reference image limits, SynthID watermark, Google Search grounding), added pricing section, added text rendering benchmarks.
