image-generation
Guide to image generation and editing in MassGen. Use when creating images, editing existing images, iterating on image designs, or choosing between image backends (OpenAI, Google Gemini/Imagen, Grok, OpenRouter).
Source: massgen/massgen
NPX Install
npx skill4agent add massgen/massgen image-generation
Image Generation
Generate images using `generate_media` with `mode="image"`. The system auto-selects the best backend based on available API keys.
Quick Start
```python
# Simple text-to-image (auto-selects backend)
generate_media(prompt="A cat in space", mode="image")

# Specify backend and quality
generate_media(prompt="A logo for a coffee shop", mode="image",
               backend_type="openai", quality="high")

# Batch generation (parallel)
generate_media(prompts=["sunset over ocean", "mountain landscape", "city at night"],
               mode="image", max_concurrent=3)
```
Backend Comparison
| Backend | Default Model | Strengths | API Key |
|---|---|---|---|
| Google (priority 1) | | Fast, flexible sizes, image editing, multi-turn | |
| OpenAI (priority 2) | | High quality, transparent backgrounds, continuation via response ID | |
| Grok (priority 3) | | 1k resolution, continuation via stored data URI | |
| OpenRouter (priority 4) | | Access to multiple models via single API | |
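The priority-based auto-selection in the table can be pictured roughly as follows. This is a minimal sketch, not MassGen's implementation: the `select_backend` helper and the environment variable names are assumptions made for illustration, since the table does not list them.

```python
import os

# Backends in the priority order given in the comparison table.
# The environment variable names are assumed for illustration only.
BACKEND_PRIORITY = [
    ("google", "GOOGLE_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("grok", "XAI_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),
]


def select_backend(env=None):
    """Return the highest-priority backend whose API key is set, else None."""
    env = os.environ if env is None else env
    for backend, key_name in BACKEND_PRIORITY:
        if env.get(key_name):
            return backend
    return None
```

Passing `backend_type` explicitly, as in the Quick Start, sidesteps this selection entirely.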
Key Parameters
| Parameter | Description | Example |
|---|---|---|
| `prompt` | Text description of the image | |
| `backend_type` | Force a specific backend | |
| `model` | Override default model | |
| `quality` | Image quality (OpenAI) | |
| | Image dimensions | See backends reference |
| | Aspect ratio | |
| `input_images` | Source images for image-to-image editing | |
| `continue_from` | Continuation ID for multi-turn editing | |
Image-to-Image Editing
Transform existing images by providing `input_images`:
```python
generate_media(
    prompt="Make it look like a watercolor painting",
    mode="image",
    input_images=["photo.jpg"]
)
```
Supported backends for image-to-image: Google (Gemini), OpenAI, Grok. If your current backend doesn't support image-to-image, the system auto-selects one that does.
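The fallback for image-to-image requests might look something like the sketch below. The `resolve_editing_backend` helper is hypothetical and the try-in-listed-order rule is an assumption; only the set of supporting backends comes from the text above.

```python
# Backends the text above lists as supporting image-to-image editing.
IMAGE_TO_IMAGE_BACKENDS = ("google", "openai", "grok")


def resolve_editing_backend(current, available):
    """Keep `current` if it can edit images; otherwise pick an available backend that can."""
    if current in IMAGE_TO_IMAGE_BACKENDS:
        return current
    # Assumed rule: try the supporting backends in the order listed above.
    for backend in IMAGE_TO_IMAGE_BACKENDS:
        if backend in available:
            return backend
    raise ValueError("no available backend supports image-to-image editing")
```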
Multi-Turn Editing (Continuation)
Iteratively refine images using `continue_from`:
```python
# First generation
result = generate_media(prompt="A logo for a coffee shop", mode="image")

# Refine using the continuation ID
result2 = generate_media(
    prompt="Make the text larger and add a cup icon",
    mode="image",
    continue_from=result["continuation_id"]
)
```
Each backend uses a different continuation mechanism:
- OpenAI: Passes `previous_response_id` (stateless)
- Google Gemini: In-memory chat store (LRU, 50 items)
- Grok: In-memory data URI store (LRU, 50 items)
Continuation only works for single image generation (not batch).
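To make the "LRU, 50 items" note concrete, here is a minimal sketch of a least-recently-used store capped at 50 entries. The `ContinuationStore` class is hypothetical and is not taken from MassGen; it only illustrates the eviction behavior such a store would have.

```python
from collections import OrderedDict


class ContinuationStore:
    """Tiny LRU cache keyed by continuation ID (hypothetical helper)."""

    def __init__(self, max_items=50):
        self.max_items = max_items
        self._items = OrderedDict()

    def put(self, continuation_id, state):
        # Inserting or re-inserting an ID marks it as most recently used.
        self._items[continuation_id] = state
        self._items.move_to_end(continuation_id)
        # Evict least recently used entries beyond the cap.
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)

    def get(self, continuation_id):
        # Returns None when the ID was never stored or has been evicted.
        if continuation_id not in self._items:
            return None
        self._items.move_to_end(continuation_id)
        return self._items[continuation_id]
```

If the Gemini and Grok stores behave this way, a continuation ID can stop resolving once enough newer generations have been stored, or after the process restarts, so a refinement step should be ready to fall back to a fresh generation.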
Google: Gemini vs Imagen
Google supports two API paths. Gemini (Nano Banana 2) is the default and recommended for most use cases. Imagen is only needed for advanced reference-image editing features.
- Gemini models (`gemini-*`) use `generate_content()`: text-to-image, image editing via `input_images`, multi-turn continuation
- Imagen models (`imagen-*`) use `generate_images()`/`edit_image()`: text-to-image with `negative_prompt`/`seed`/`guidance_scale`, plus style transfer, control editing, and subject consistency via reference images
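Since the two paths are distinguished by model-name prefix, routing could be sketched as below. The `google_model_family` helper is hypothetical; how MassGen actually dispatches between the two paths is not shown in this guide.

```python
def google_model_family(model):
    """Classify a Google image model name by the prefixes described above."""
    if model.startswith("gemini-"):
        return "gemini"   # generate_content() path
    if model.startswith("imagen-"):
        return "imagen"   # generate_images() / edit_image() path
    raise ValueError(f"unrecognized Google image model: {model!r}")
```

For example, `google_model_family("gemini-3-pro-image-preview")` falls on the Gemini path.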
For studio-quality precision and text rendering, use `model="gemini-3-pro-image-preview"` (Pro-tier).
Need More Control?
- Per-backend sizes, quality options, and quirks: See references/backends.md
- Complete `extra_params` reference: See references/extra_params.md
- Advanced editing (inpainting, style transfer, control, subject): See references/editing.md