Use this skill for any image-related AI generation or editing task. Triggers include:

- GENERATE: "generate image", "create image", "make picture", "draw", "visualize", "image of", "create art", "generate art"
- EDIT: "edit image", "modify image", "change image", "update image", "fix image", "enhance image"
- ADD/REMOVE: "add to image", "put in image", "remove from image", "delete from image", "add element"
- STYLE: "style transfer", "make it look like", "convert style", "apply style", "in the style of"
- PRODUCT: "product photo", "product placement", "place product", "mockup", "put product on"
- COMPOSITE: "combine images", "merge images", "blend images", "create composite"

Supports text-to-image generation, image editing with references, product placement, style transfer, and multi-image composition using Google Gemini (Nano Banana Pro) or OpenAI GPT Image models.
npx skill4agent add michaelboeding/skills image-generation

| User Says | Mode | What Happens |
|---|---|---|
| "Generate an image of a sunset" | Generate | Text-to-image, no reference needed |
| "Create a logo for my coffee shop" | Generate | Text-to-image with text rendering |
| "Edit this image: add a hat to the cat" | Edit | User provides image, AI modifies it |
| "Remove the background from this photo" | Edit | User provides image, AI edits it |
| "Put this product on a kitchen counter" | Product | User provides product + optional scene |
| "Make this photo look like Van Gogh painted it" | Style | User provides photo, AI applies style |
| "Combine these photos into a group shot" | Composite | User provides multiple images |
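
The mapping from user phrasing to mode in the table above can be sketched as a keyword lookup. This is a minimal illustration, not the skill's actual routing logic: `MODE_TRIGGERS` and `detect_mode` are hypothetical names, the phrase lists are abridged from the trigger phrases in the description, and plain substring matching is assumed to be good enough for a first pass.

```python
# Hypothetical trigger table, abridged from the skill description's trigger phrases.
# Order matters: more specific modes are checked before the generic "generate".
MODE_TRIGGERS = {
    "composite": ("combine images", "merge images", "blend images", "create composite"),
    "product": ("product photo", "product placement", "place product", "mockup", "put product on"),
    "style": ("style transfer", "make it look like", "convert style", "apply style", "in the style of"),
    "edit": ("edit image", "modify image", "change image", "fix image", "enhance image",
             "add to image", "remove from image"),
    "generate": ("generate image", "create image", "make picture", "draw", "visualize", "image of"),
}


def detect_mode(request: str, default: str = "generate") -> str:
    """Return the first mode whose trigger phrase appears in the request."""
    text = request.lower()
    for mode, phrases in MODE_TRIGGERS.items():
        if any(phrase in text for phrase in phrases):
            return mode
    return default


if __name__ == "__main__":
    print(detect_mode("Please merge images of my two dogs"))
    print(detect_mode("Redo this photo in the style of Van Gogh"))
```

Substring matching misses paraphrases such as "edit this image", so a real router would likely need fuzzier matching or the model's own judgment.
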
API keys: OPENAI_API_KEY, GOOGLE_API_KEY

OpenAI models: gpt-image-1.5 (auto), gpt-image-1 (auto), gpt-image-1-mini (auto)

⚠️ Note: DALL-E 2 and DALL-E 3 are deprecated and will stop being supported on 05/12/2026.

Google models: gemini-2.5-flash-image, gemini-3-pro-image-preview

Use AskUserQuestion to ask:

"Which image generation model would you like to use?
- Google Gemini (Nano Banana Pro) - Up to 4K, 14 reference images, style transfer, thinking mode (Recommended)
- OpenAI GPT Image 1.5 - State of the art, transparency, streaming, up to 16 input images
- OpenAI GPT Image 1 - Great quality, transparency, image editing
- OpenAI GPT Image 1 Mini - Fastest, most affordable"
"I'll generate that image for you! First — do you have any reference images?
- Product photos to include
- Style references
- Images to edit
- No, generate from scratch"
"What aspect ratio?
- 1:1 (square)
- 16:9 (landscape/widescreen)
- 9:16 (portrait/vertical)
- 4:3 / 3:4 (classic)
- Other (2:3, 3:2, 4:5, 5:4, 21:9)
- Or specify"
"What resolution?
- 1K (fast)
- 2K (balanced)
- 4K (highest quality)"
"Any style preferences?
- Photorealistic
- Artistic/painterly
- Cartoon/illustration
- 3D render
- Or describe your own"
| Question | Determines |
|---|---|
| Reference | Generation vs editing mode |
| Aspect Ratio | Image dimensions |
| Resolution | Quality level |
| Style | Prompt enhancement direction |
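
As a rough sketch of how the four answers in the table above could drive a request, the snippet below collects them in a small record and derives the mode and an enhanced prompt. `Answers` and `plan_request` are hypothetical names introduced for illustration, and the prompt-enhancement rule (appending the style) is an assumption rather than the skill's documented behavior.

```python
from dataclasses import dataclass, field


@dataclass
class Answers:
    """Hypothetical container for the user's replies to the four questions."""
    reference_images: list = field(default_factory=list)
    aspect_ratio: str = "1:1"
    resolution: str = "2K"
    style: str = ""


def plan_request(prompt: str, answers: Answers) -> dict:
    """Turn the answers into settings for a generation or editing call."""
    # Reference images decide between generation and editing mode.
    mode = "edit" if answers.reference_images else "generate"
    # Assumed enhancement rule: append the style preference to the prompt.
    enhanced = f"{prompt}, {answers.style} style" if answers.style else prompt
    return {
        "mode": mode,
        "prompt": enhanced,
        "aspect_ratio": answers.aspect_ratio,
        "resolution": answers.resolution,
        "images": list(answers.reference_images),
    }


if __name__ == "__main__":
    print(plan_request("A sunset over mountains", Answers(style="photorealistic")))
```
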
API keys: OPENAI_API_KEY, GOOGLE_API_KEY

- gemini.py: gemini-3-pro-image-preview
- openai_image.py: gpt-image-1.5
- openai_image.py: gpt-image-1
- openai_image.py: gpt-image-1-mini

Scripts directory: ${CLAUDE_PLUGIN_ROOT}/skills/image-generation/scripts/

python3 ${CLAUDE_PLUGIN_ROOT}/skills/image-generation/scripts/openai_image.py \
--prompt "your enhanced prompt" \
--model "gpt-image-1" \
--size "1024x1024" \
--quality "high" \
--output "/path/to/output.png"

python3 ${CLAUDE_PLUGIN_ROOT}/skills/image-generation/scripts/openai_image.py \
--prompt "A product icon with no background" \
--model "gpt-image-1" \
--background "transparent" \
--quality "high" \
--output "/path/to/output.png"

python3 ${CLAUDE_PLUGIN_ROOT}/skills/image-generation/scripts/openai_image.py \
--prompt "Add a wizard hat to this cat" \
--model "gpt-image-1" \
--image "/path/to/cat.jpg" \
--input-fidelity "high" \
--output "/path/to/output.png"

python3 ${CLAUDE_PLUGIN_ROOT}/skills/image-generation/scripts/openai_image.py \
--prompt "Create a gift basket containing these items" \
--model "gpt-image-1" \
--image "/path/to/item1.png" \
--image "/path/to/item2.png" \
--image "/path/to/item3.png" \
--output "/path/to/output.png"

python3 ${CLAUDE_PLUGIN_ROOT}/skills/image-generation/scripts/openai_image.py \
--prompt "Replace the pool with a garden" \
--model "gpt-image-1" \
--image "/path/to/scene.jpg" \
--mask "/path/to/mask.png" \
--output "/path/to/output.png"

python3 ${CLAUDE_PLUGIN_ROOT}/skills/image-generation/scripts/openai_image.py \
--prompt "A beautiful sunset over mountains" \
--model "gpt-image-1" \
--stream \
--partial-images 2 \
--output "/path/to/output.png"

python3 ${CLAUDE_PLUGIN_ROOT}/skills/image-generation/scripts/gemini.py \
--prompt "your enhanced prompt" \
--model "gemini-3-pro-image-preview" \
--aspect-ratio "1:1" \
--resolution "2K" \
--output "/path/to/output.png"

python3 ${CLAUDE_PLUGIN_ROOT}/skills/image-generation/scripts/gemini.py \
--prompt "Add a wizard hat to this cat" \
--image "/path/to/cat.jpg" \
--aspect-ratio "1:1" \
--resolution "2K"

python3 ${CLAUDE_PLUGIN_ROOT}/skills/image-generation/scripts/gemini.py \
--prompt "Place this product on the kitchen counter in this scene" \
--image "/path/to/product.png" \
--image "/path/to/kitchen.jpg" \
--aspect-ratio "16:9" \
--resolution "2K"

python3 ${CLAUDE_PLUGIN_ROOT}/skills/image-generation/scripts/gemini.py \
--prompt "your enhanced prompt" \
--model "gemini-2.5-flash-image" \
--aspect-ratio "1:1"

input_fidelity: high

| Feature | GPT Image 1.5 | GPT Image 1 | GPT Image 1 Mini | Nano Banana | Nano Banana Pro |
|---|---|---|---|---|---|
| Provider | OpenAI | OpenAI | OpenAI | Google | Google |
| Model ID | gpt-image-1.5 | gpt-image-1 | gpt-image-1-mini | gemini-2.5-flash-image | gemini-3-pro-image-preview |
| Best for | State of the art | Quality + value | Speed + cost | Fast generation | Professional assets |
| Sizes | 1024², 1536x1024, 1024x1536, auto | Same | Same | 1K only | Up to 4K |
| Quality options | low, medium, high, auto | Same | Same | N/A | N/A |
| Aspect ratios | 3 + auto | Same | Same | 10 options | 10 options |
| Reference images | Up to 16 | Up to 16 | Up to 16 | Up to 3 | Up to 14 |
| Image editing | Yes | Yes | Yes | Yes | Yes |
| Inpainting (mask) | Yes | Yes | Yes | Yes | Yes |
| Transparent background | Yes | Yes | Yes | No | No |
| Streaming | Yes | Yes | Yes | No | No |
| Input fidelity | high/low | high/low | low only | N/A | N/A |
| Output formats | png, jpeg, webp | Same | Same | png | png |
| Compression | 0-100% | Same | Same | No | No |
| Text rendering | Excellent | Excellent | Good | Good | Excellent |
| Thinking mode | No | No | No | No | Yes |
| Max prompt length | 32,000 chars | 32,000 chars | 32,000 chars | N/A | N/A |
| Speed | ~30-60s | ~20-40s | ~10-20s | ~10-20s | ~30-60s |
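
One way to use the comparison table is to filter models by the features a request needs. The sketch below copies a few rows (transparent background, streaming, thinking mode, reference image limit) into a dictionary. `MODELS` and `candidates` are hypothetical names, and the values are taken from the table above rather than queried from either provider.

```python
# Capabilities copied from the comparison table; not fetched from any API.
MODELS = {
    "gpt-image-1.5": {"transparent": True, "streaming": True, "thinking": False, "max_refs": 16},
    "gpt-image-1": {"transparent": True, "streaming": True, "thinking": False, "max_refs": 16},
    "gpt-image-1-mini": {"transparent": True, "streaming": True, "thinking": False, "max_refs": 16},
    "gemini-2.5-flash-image": {"transparent": False, "streaming": False, "thinking": False, "max_refs": 3},
    "gemini-3-pro-image-preview": {"transparent": False, "streaming": False, "thinking": True, "max_refs": 14},
}


def candidates(transparent=False, streaming=False, thinking=False, reference_images=0):
    """Return model IDs that satisfy every requested feature, in table order."""
    matches = []
    for model_id, caps in MODELS.items():
        if transparent and not caps["transparent"]:
            continue
        if streaming and not caps["streaming"]:
            continue
        if thinking and not caps["thinking"]:
            continue
        if reference_images > caps["max_refs"]:
            continue
        matches.append(model_id)
    return matches


if __name__ == "__main__":
    print(candidates(transparent=True))
    print(candidates(reference_images=10))
```
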
⚠️ DALL-E 2 and DALL-E 3 are deprecated and will stop being supported on 05/12/2026. Use GPT Image models instead.
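
For the GPT Image models, the openai_image.py invocations shown earlier can also be assembled programmatically. The following is a sketch under stated assumptions: `build_openai_command` is a hypothetical helper, it uses only flags that appear in this document's examples (`--prompt`, `--model`, `--image`, `--mask`, `--background`, `--quality`, `--size`, `--output`), and it falls back to the current directory when `CLAUDE_PLUGIN_ROOT` is unset. It only builds the argument list and does not run the script.

```python
import os
import shlex


def build_openai_command(prompt, output, model="gpt-image-1", images=(), mask=None,
                         background=None, quality=None, size=None, plugin_root=None):
    """Build an argv list for openai_image.py from the flags shown in this document."""
    # Assumption: fall back to "." when CLAUDE_PLUGIN_ROOT is not set.
    root = plugin_root or os.environ.get("CLAUDE_PLUGIN_ROOT", ".")
    script = f"{root}/skills/image-generation/scripts/openai_image.py"
    argv = ["python3", script, "--prompt", prompt, "--model", model]
    for path in images:  # --image may be repeated for multi-image input
        argv += ["--image", path]
    # Optional flags are emitted only when a value was supplied.
    for flag, value in (("--mask", mask), ("--background", background),
                        ("--quality", quality), ("--size", size)):
        if value is not None:
            argv += [flag, value]
    argv += ["--output", output]
    return argv


if __name__ == "__main__":
    cmd = build_openai_command("A product icon with no background", "/path/to/output.png",
                               background="transparent", quality="high")
    print(shlex.join(cmd))  # shell-quoted form, suitable for logging or copy-paste
```

Returning a list rather than a string keeps prompts with spaces or quotes intact if the command is later passed to `subprocess.run`.
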