gemini-image
Gemini Image Generation
Generate high-quality images from text prompts using Google's Gemini and Imagen models through executable scripts.
When to Use This Skill
Use this skill when you need to:
- Create visual content from text descriptions
- Generate multiple image variations
- Create images at specific resolutions (1K, 2K, 4K)
- Produce images for different aspect ratios (social media, banners, etc.)
- Generate photorealistic images or artistic visuals
- Create images with person generation controls
- Batch generate multiple images at once
- Combine with text generation for complete content creation
Available Scripts
scripts/generate_image.js
Purpose: Generate images using Gemini image models (such as Gemini 3 Pro Image) or Imagen 4
When to use:
- Any image generation task
- Multiple image generation (1-4 per request)
- Custom resolution and aspect ratio needs
- Professional asset creation
- Photorealistic or artistic image generation
Key parameters:
| Parameter | Description | Example |
|---|---|---|
| prompt (positional) | Text description (required) | |
| `--model` | Model to use | |
| `--output-dir` | Output directory for images | |
| `--name` | Base name for output files | |
| `--no-timestamp` | Disable auto timestamp | Flag |
| `--aspect` | Aspect ratio | |
| `--size` | Resolution | |
| `--num` | Number of images (1-4) | |
| | Person generation policy | |

Output: List of saved PNG file paths
Workflows
Workflow 1: Basic Image Generation
```bash
node scripts/generate_image.js "A futuristic city at sunset with flying cars"
```
- Best for: Quick image generation, prototypes
- Model: `gemini-3.1-flash-image-preview` (default, Nano Banana 2)
- Output: `images/generated_image_YYYYMMDD_HHMMSS.png`
Workflow 2: Social Media (Instagram, Facebook)
```bash
node scripts/generate_image.js "Minimalist coffee shop interior" --aspect 1:1 --size 2K --name coffee-shop
```
- Best for: Instagram posts, profile pictures
- Aspect: 1:1 (square format)
- Resolution: 2K (2048x2048)
- Output: `images/coffee-shop_YYYYMMDD_HHMMSS.png`
Workflow 3: YouTube Thumbnails (16:9)
```bash
node scripts/generate_image.js "Tech gadget review thumbnail with vibrant colors" --aspect 16:9 --size 2K --name thumbnail
```
- Best for: YouTube, video thumbnails
- Aspect: 16:9 (widescreen)
- Resolution: 2K (2752x1536)
- Output: `images/thumbnail_YYYYMMDD_HHMMSS.png`
Workflow 4: Multiple Variations
```bash
node scripts/generate_image.js "Abstract geometric patterns in blue and gold" --num 4 --name abstract
```
- Best for: A/B testing, design options
- Generates: 4 distinct variations
- Output: `images/abstract_YYYYMMDD_HHMMSS_0.png`, `images/abstract_YYYYMMDD_HHMMSS_1.png`, etc.
Workflow 5: Custom Output Directory
```bash
node scripts/generate_image.js "Detailed architectural rendering of modern museum" --aspect 16:9 --size 4K --output-dir ./professional/ --name museum
```
- Best for: Print materials, high-end assets, organized projects
- Model: `gemini-3.1-flash-image-preview` or `gemini-3-pro-image-preview` (for 4K)
- Resolution: 4K (5504x3072 for 16:9)
- The directory is created automatically if it doesn't exist
Workflow 6: Photorealistic Images (Imagen 4)
```bash
node scripts/generate_image.js "Robot holding a red skateboard in urban setting" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --num 2 --name robot-skate
```
- Best for: Realistic photos, product shots
- Model: `imagen-4.0-generate-001` (photorealistic)
- Notes: English prompts only
- Max 4 images per request
Workflow 7: Blog Post Featured Image
```bash
node scripts/generate_image.js "Serene mountain lake at sunrise with reflections" --aspect 16:9 --size 2K --output-dir ./blog-images/ --name featured-image
```
- Best for: Blog headers, article images
- Combines well with: gemini-text for blog content generation
Workflow 8: Content Creation Pipeline (Text + Image)
```bash
# 1. Generate content (gemini-text skill)
node skills/gemini-text/scripts/generate.js "Write a product description for smart home device"

# 2. Generate product image (this skill)
node scripts/generate_image.js "Sleek modern smart home device on white background" --aspect 4:3 --size 2K --name product

# 3. Create social media post
```
- Best for: E-commerce, marketing campaigns
- Combines with: gemini-text, gemini-batch for batch production
Workflow 9: Disable Timestamp
```bash
node scripts/generate_image.js "Fixed filename image" --name my-image --no-timestamp
```
- Best for: When you want complete control over the filename
- Output: `images/my-image.png` (no timestamp)
- Use when: Generating files for specific naming schemes or automated pipelines
Parameters Reference
Model Selection
| Model | Nickname | Quality | Max Size | Best For |
|---|---|---|---|---|
| `gemini-3.1-flash-image-preview` | Nano Banana 2 | Pro-level | 4K | New default, fast + strong quality |
| `gemini-3-pro-image-preview` | Nano Banana Pro | Highest | 4K | Maximum quality and complex text rendering |
| `gemini-2.5-flash-image` | Nano Banana | Good | 2K | High-volume, low-latency |
| `imagen-4.0-generate-001` | Imagen 4 | Photorealistic | 2K | Realistic photos, product shots |

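The Max Size column can be checked before a request is sent, which avoids an "Unsupported size" error. This is a minimal sketch under stated assumptions: `supportsSize` and `clampSize` are hypothetical helpers, and the model IDs are paired with the table's nicknames based on the Model Selection notes under Best Practices.

```javascript
// Maximum size per model, taken from the Model Selection table.
// Assumption: model IDs are matched to nicknames as described in Best Practices.
const MAX_SIZE = {
  "gemini-3.1-flash-image-preview": "4K",
  "gemini-3-pro-image-preview": "4K",
  "gemini-2.5-flash-image": "2K",
  "imagen-4.0-generate-001": "2K",
};
const SIZE_ORDER = ["1K", "2K", "4K"];

// True when the model is known and the requested size does not exceed its maximum.
function supportsSize(model, size) {
  const max = MAX_SIZE[model];
  const wanted = SIZE_ORDER.indexOf(size);
  if (max === undefined || wanted === -1) return false;
  return wanted <= SIZE_ORDER.indexOf(max);
}

// Returns the requested size, or the model's maximum when the request is too large.
function clampSize(model, requested) {
  if (!(model in MAX_SIZE)) throw new Error(`Unknown model: ${model}`);
  if (!SIZE_ORDER.includes(requested)) throw new Error(`Unknown size: ${requested}`);
  return supportsSize(model, requested) ? requested : MAX_SIZE[model];
}

console.log(clampSize("imagen-4.0-generate-001", "4K"));
```

Clamping silently changes what the caller asked for, so a stricter caller might prefer to call `supportsSize` and report the mismatch instead.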
Aspect Ratios
| Ratio | Use Case | 1K Size | 2K Size |
|---|---|---|---|
| 1:1 | Instagram, avatars | 1024x1024 | 2048x2048 |
| 16:9 | YouTube, presentations | 1376x768 | 2752x1536 |
| 9:16 | Instagram Stories, TikTok | 768x1376 | 1536x2752 |
| 4:3 | Traditional displays | 1024x768 | 2048x1536 |
| 3:4 | Portrait orientation | 768x1024 | 1536x2048 |
| 21:9 | Ultrawide | - | 5504x2400 |

Note: 4K resolution is available with `gemini-3.1-flash-image-preview` and `gemini-3-pro-image-preview`.
Resolution Guide
| Size | Use Case | Best Model |
|---|---|---|
| 1K (1024px) | Web thumbnails, previews | Any model |
| 2K (2048px) | Standard web, social media | Any model |
| 4K (4096px) | Print, high-end assets | `gemini-3.1-flash-image-preview` or `gemini-3-pro-image-preview` |

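The pixel dimensions from the Aspect Ratios table can be kept in a small lookup so that downstream code (layout, file size estimates) knows what to expect. This is a sketch only: `dimensionsFor` is a hypothetical helper, and it covers just the five ratios for which the table lists both a 1K and a 2K size.

```javascript
// Dimensions copied from the Aspect Ratios table (1K and 2K columns).
const DIMENSIONS = {
  "1:1": { "1K": [1024, 1024], "2K": [2048, 2048] },
  "16:9": { "1K": [1376, 768], "2K": [2752, 1536] },
  "9:16": { "1K": [768, 1376], "2K": [1536, 2752] },
  "4:3": { "1K": [1024, 768], "2K": [2048, 1536] },
  "3:4": { "1K": [768, 1024], "2K": [1536, 2048] },
};

// Returns width, height, and megapixel count, or null for combinations
// that the table does not list.
function dimensionsFor(aspect, size) {
  const entry = DIMENSIONS[aspect] && DIMENSIONS[aspect][size];
  if (!entry) return null;
  const [width, height] = entry;
  return { width, height, megapixels: (width * height) / 1e6 };
}

console.log(dimensionsFor("16:9", "2K"));
```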
Person Generation Policy
| Policy | Description | Restrictions |
|---|---|---|
| | No people in images | None |
| | Adults only | Recommended default |
| | All ages | Restricted in EU, UK, CH, MENA |

Output Interpretation
File Naming
- Default format: `{name}_YYYYMMDD_HHMMSS.png` (auto timestamp)
- Single image example: `artwork_20260130_031643.png`
- Multiple images: `{name}_YYYYMMDD_HHMMSS_0.png`, `{name}_YYYYMMDD_HHMMSS_1.png`, etc.
- Without timestamp (`--no-timestamp`): `{name}.png`
- Script prints: "Saved: /path/to/file.png"
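The naming rules above can be mirrored in code, for example to predict which files a pipeline step should look for. This is a minimal sketch rather than the script's actual logic: `expectedFileNames` is hypothetical, the timestamp is assumed to use local time, and the `{name}_{index}.png` form for several images without a timestamp is an assumption the document does not confirm.

```javascript
// Zero-pads a number to two digits, e.g. 3 -> "03".
function pad2(n) {
  return String(n).padStart(2, "0");
}

// Formats a Date as YYYYMMDD_HHMMSS. Assumption: local time is used.
function formatTimestamp(date) {
  const day = `${date.getFullYear()}${pad2(date.getMonth() + 1)}${pad2(date.getDate())}`;
  const time = `${pad2(date.getHours())}${pad2(date.getMinutes())}${pad2(date.getSeconds())}`;
  return `${day}_${time}`;
}

// Predicts output file names following the documented patterns.
function expectedFileNames(name, count = 1, date = new Date(), useTimestamp = true) {
  const base = useTimestamp ? `${name}_${formatTimestamp(date)}` : name;
  if (count === 1) return [`${base}.png`];
  // Assumption: the index suffix also applies when the timestamp is disabled.
  return Array.from({ length: count }, (_, i) => `${base}_${i}.png`);
}

console.log(expectedFileNames("artwork", 1, new Date(2026, 0, 30, 3, 16, 43)));
```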
Image Quality
- All images include a SynthID watermark for authenticity
- PNG format for lossless quality
- Can be converted to JPEG/WEBP if needed
- 4K images have significantly larger file sizes
Error Messages
- "Model not available": Check model name spelling
- "Unsupported size": Verify size/model combination
- "Aspect ratio error": Use supported ratios for selected model
Common Issues
"google-genai or pillow not installed"
```bash
cd scripts && npm install
```
"Image generation failed"
- Check prompt length (too verbose can fail)
- Try simpler, more focused prompts
- Verify model availability in your region
- Check API quota limits
"Unsupported aspect ratio"
- Check if ratio is supported by selected model
- Imagen 4 has fewer ratio options than Gemini
- Use 16:9 or 1:1 for best compatibility
"4K not supported"
- 4K works best with `gemini-3.1-flash-image-preview` or `gemini-3-pro-image-preview`
- Use `--size 2K` for older models
- Try `--model gemini-3.1-flash-image-preview --size 4K`
"Imagen prompt language error"
- Imagen models support English prompts only
- Use `gemini-3.1-flash-image-preview` for other languages
- Translate the prompt to English for Imagen
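The language restriction above can be guarded against with a rough pre-check. The sketch below is only a heuristic and not something the script is known to do: `hasNonLatinLetters` and `pickModelForPrompt` are hypothetical, and the check only detects non-Latin scripts, so a French or Spanish prompt would still pass through to Imagen.

```javascript
const IMAGEN_MODEL = "imagen-4.0-generate-001";
const MULTILINGUAL_MODEL = "gemini-3.1-flash-image-preview";

// Heuristic: true when the prompt contains a letter from a non-Latin script
// (for example Chinese, Arabic, or Cyrillic).
function hasNonLatinLetters(prompt) {
  for (const ch of prompt) {
    if (/\p{L}/u.test(ch) && !/\p{Script=Latin}/u.test(ch)) return true;
  }
  return false;
}

// Swaps Imagen for the Gemini default when the prompt looks non-English;
// any other preferred model is returned unchanged.
function pickModelForPrompt(prompt, preferredModel) {
  if (preferredModel === IMAGEN_MODEL && hasNonLatinLetters(prompt)) {
    return MULTILINGUAL_MODEL;
  }
  return preferredModel;
}

console.log(pickModelForPrompt("Robot holding a red skateboard", IMAGEN_MODEL));
```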
File too large for storage
- Use `--size 1K` for smaller files
- Compress images after generation
- Convert PNG to JPEG for web use
Best Practices
Prompt Engineering
- Be specific and descriptive
- Include style descriptors (e.g., "photorealistic", "digital art")
- Mention lighting, mood, and composition
- Use analogies for complex concepts
- Avoid negative prompts (describe what you want, not what to avoid)
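One way to apply the prompt guidelines consistently is to build prompts from named parts. This is an illustrative sketch: `composePrompt` and its field names are hypothetical, and joining the parts with commas is only one reasonable convention.

```javascript
// Joins a subject with optional style, lighting, mood, and composition
// descriptors into a single positive prompt (no "avoid X" phrasing).
function composePrompt({ subject, style, lighting, mood, composition }) {
  if (!subject) throw new Error("subject is required");
  return [subject, style, lighting, mood, composition]
    .filter(Boolean) // drop descriptors that were not provided
    .join(", ");
}

console.log(
  composePrompt({
    subject: "Serene mountain lake at sunrise with reflections",
    style: "photorealistic",
    lighting: "soft golden light",
    composition: "wide shot",
  })
);
```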
Model Selection
- Use `gemini-3.1-flash-image-preview` for: Best default balance of quality and speed, with 4K support
- Use `gemini-3-pro-image-preview` for: Maximum quality, complex text rendering
- Use `gemini-2.5-flash-image` for: Speed, high volume
- Use `imagen-4.0-generate-001` for: Photorealism, product shots
Performance Optimization
- Generate multiple images at once with `--num`
- Use lower resolution for previews
- Batch requests for high-volume needs (gemini-batch skill)
- Cache results for repeated requests
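Caching results for repeated requests can be as simple as an in-memory map keyed by the prompt and options. This is a sketch under assumptions: `createImageCache` is hypothetical, the `generate` callback stands in for whatever actually runs the script, and nothing is persisted between processes.

```javascript
// In-memory cache keyed by prompt plus options.
function createImageCache() {
  const store = new Map();

  // Sort option keys so {a, b} and {b, a} map to the same entry.
  function keyFor(prompt, options = {}) {
    const sorted = Object.keys(options).sort().map((k) => [k, options[k]]);
    return JSON.stringify([prompt, sorted]);
  }

  return {
    // Returns the cached result, or calls generate() once and stores what it
    // returns. If generate() returns a promise, the promise itself is cached,
    // so concurrent identical requests share one generation.
    getOrGenerate(prompt, options, generate) {
      const key = keyFor(prompt, options);
      if (!store.has(key)) store.set(key, generate(prompt, options));
      return store.get(key);
    },
    size() {
      return store.size;
    },
  };
}

const cache = createImageCache();
const fakeGenerate = () => ["images/example.png"]; // placeholder for a real call
cache.getOrGenerate("Prompt", { size: "1K" }, fakeGenerate);
console.log(cache.size());
```

A rejected promise would stay cached in this version, so a production cache would likely evict entries on failure.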
Quality Tips
- Use 2K resolution for most web uses
- Use 4K only when maximum detail is needed
- Combine specific prompts with style guidance
- Test prompts with `--num 1` before generating batches
Cost Management
- Use flash models for cost efficiency
- 4K generation costs significantly more
- Batch multiple requests when possible
- Generate at 1K for testing, 2K/4K for final
Related Skills
- gemini-text: Generate text content alongside images
- gemini-tts: Create audio for image-based content
- gemini-batch: Process multiple image requests efficiently
- gemini-embeddings: Generate image embeddings for similarity search
Quick Reference
```bash
# Basic
node scripts/generate_image.js "Your prompt"

# Social media (1:1)
node scripts/generate_image.js "Prompt" --aspect 1:1 --size 2K --name social-post

# YouTube thumbnail (16:9)
node scripts/generate_image.js "Prompt" --aspect 16:9 --size 2K --name thumbnail

# 4K high quality
node scripts/generate_image.js "Prompt" --aspect 16:9 --size 4K --name high-res

# Multiple variations
node scripts/generate_image.js "Prompt" --num 4 --name variations

# Custom directory
node scripts/generate_image.js "Prompt" --output-dir ./my-images/ --name custom

# Photorealistic
node scripts/generate_image.js "Prompt" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --name photo

# No timestamp
node scripts/generate_image.js "Prompt" --name fixed-name --no-timestamp
```
Reference
- See `references/` for model documentation (if available)
- Get API key: https://aistudio.google.com/apikey
- Documentation: https://ai.google.dev/gemini-api/docs/image-generation
- SynthID: https://deepmind.google/technologies/synthid/