image-generation

Image Generation

Generate and edit images using AI models. The script automatically picks a backend based on which API keys are configured — you don't need to specify a model unless the user explicitly names one.

Supported models (passed via `model` only when the user asks for a specific one):

- OpenAI: `gpt-image-2`, `gpt-image-1`
- Gemini Nano Banana: `nano-banana-2`, `nano-banana-pro`, `nano-banana`
- Seedream (Volcengine Ark): `seedream-5.0-lite`, `seedream-4.5`
- Qwen (DashScope): `qwen-image-2.0`, `qwen-image-2.0-pro`
- MiniMax: `image-01`

Usage

Run `scripts/generate.py` with a JSON argument. The path is relative to this skill's `base_dir`:

bash

python <base_dir>/scripts/generate.py '<json_args>'

Set bash timeout to at least 600 seconds, as image generation can take 30–200s per provider, and the script may try multiple providers sequentially.
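
If the script is driven from Python rather than directly from bash, the same call and timeout can be sketched as follows. This is a minimal illustration, not part of the skill: `build_command` and `run_generate` are hypothetical helper names, and the sketch assumes the script prints a single JSON document to stdout.

```python
import json
import subprocess
import sys
from pathlib import Path

# Minimum timeout recommended for the call, in seconds.
BASH_TIMEOUT_SECONDS = 600


def build_command(base_dir, args):
    """Return the argv list for one generate.py call (hypothetical helper)."""
    script = Path(base_dir) / "scripts" / "generate.py"
    # The script takes a single JSON string as its only argument.
    return [sys.executable, str(script), json.dumps(args, ensure_ascii=False)]


def run_generate(base_dir, args, timeout=BASH_TIMEOUT_SECONDS):
    """Run the script and return its parsed JSON output (hypothetical helper)."""
    completed = subprocess.run(
        build_command(base_dir, args),
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
    )
    # Assumption: stdout holds exactly one JSON document.
    return json.loads(completed.stdout)
```

Passing the arguments as a list avoids shell quoting problems when the prompt itself contains quotes.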

Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `prompt` | string | yes | n/a | Image description |
| `image_url` | string / list | no | null | Input image(s) for editing: local file path or URL. Multi-image fusion is supported (pass a list) |
| `quality` | string | no | auto | `low` / `medium` / `high` (only some backends honour this) |
| `size` | string | no | auto | `512` / `1K` / `2K` / `3K` / `4K`, or a pixel value (`1024x1024`) |
| `aspect_ratio` | string | no | null | `1:1` / `3:2` / `2:3` / `16:9` / `9:16` / `21:9` (some backends also support extreme ratios like `1:4` / `8:1`) |

Higher `quality` and larger `size` cost more and run slower. Default to omitting both (`auto`) so the model picks a balanced setting. Only raise them when the user explicitly asks for high quality, a poster, or print-ready output. For quick previews or chat scenarios, prefer `quality=low` and `size=1K`.
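
The defaulting rule above can be sketched as a small argument builder. The helper names `build_args` and `preview_args` are hypothetical. The sketch only checks `quality`, because the accepted `size` and `aspect_ratio` values vary by backend.

```python
# Values listed in the parameter table for `quality`.
ALLOWED_QUALITY = {"low", "medium", "high"}


def build_args(prompt, quality=None, size=None, aspect_ratio=None):
    """Build the JSON arguments, leaving out anything the caller did not set."""
    if not prompt:
        raise ValueError("prompt is required")
    args = {"prompt": prompt}
    if quality is not None:
        if quality not in ALLOWED_QUALITY:
            raise ValueError(f"unknown quality: {quality!r}")
        args["quality"] = quality
    # Omitted keys fall back to the script's `auto` behaviour.
    if size is not None:
        args["size"] = size
    if aspect_ratio is not None:
        args["aspect_ratio"] = aspect_ratio
    return args


def preview_args(prompt):
    """Cheap, fast settings for quick previews or chat scenarios."""
    return build_args(prompt, quality="low", size="1K")
```

Calling `build_args("A corgi astronaut floating in space")` yields a dictionary with only `prompt`, which matches the recommended default.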

Example — generate

bash

python <base_dir>/scripts/generate.py '{"prompt": "A corgi astronaut floating in space"}'

With aspect ratio:

bash

python <base_dir>/scripts/generate.py '{"prompt": "Isometric miniature city of Shanghai at sunset", "size": "2K", "aspect_ratio": "16:9"}'

Important: Editing vs Generating

When the user asks to edit, modify, or improve an existing image, pass the original image via `image_url`. Prefer local file paths directly; the script handles file reading internally. Without `image_url`, the script generates a brand-new image instead of editing.

Example — edit (image-to-image)

bash

python <base_dir>/scripts/generate.py '{"prompt": "Add a Santa hat to the dog", "image_url": "/path/to/dog.png"}'

Multi-image fusion — pass a list:

bash

python <base_dir>/scripts/generate.py '{"prompt": "Combine these characters into a group photo", "image_url": ["/path/a.png", "/path/b.png"]}'
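
The single-image and multi-image forms of `image_url` can be produced by one helper. This is a sketch, and `edit_args` is a hypothetical name. It refuses an empty input so that an edit request cannot silently turn into a fresh generation.

```python
def edit_args(prompt, images):
    """Build arguments for an edit: one path stays a string, several become a list."""
    if isinstance(images, str):
        images = [images]
    images = list(images)
    if not images:
        # Without image_url the script would generate a new image instead of editing.
        raise ValueError("editing needs at least one input image")
    image_url = images[0] if len(images) == 1 else images
    return {"prompt": prompt, "image_url": image_url}
```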

Output

Prints JSON to stdout:

json

{
  "model": "doubao-seedream-5-0-260128",
  "images": [
    {"url": "/path/to/output.png"}
  ]
}

After success, display the image to the user. You can either embed it in markdown (`![description](/path/to/output.png)`) or use the `send` tool.

On error:

json

{
  "error": "error message"
}
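
Both output shapes can be handled in one place. The sketch below assumes the two JSON layouts shown above are the only ones the script emits. `GenerationError`, `image_paths`, and `markdown_embed` are hypothetical names.

```python
import json


class GenerationError(RuntimeError):
    """Raised when the script reports an error instead of images."""


def image_paths(stdout_text):
    """Return the image paths from the script's stdout, or raise on error."""
    result = json.loads(stdout_text)
    if "error" in result:
        raise GenerationError(result["error"])
    return [image["url"] for image in result.get("images", [])]


def markdown_embed(path, description="generated image"):
    """Format a path as a markdown image for display to the user."""
    return f"![{description}]({path})"
```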

Setup

The script needs at least one of these API keys (set via `env_config` or `config.json`):

- `OPENAI_API_KEY`
- `GEMINI_API_KEY`
- `ARK_API_KEY`
- `DASHSCOPE_API_KEY`
- `MINIMAX_API_KEY`
- `LINKAI_API_KEY`

Each also has an optional `*_API_BASE` for custom endpoints. The script automatically picks the first configured backend and falls back to the next if it fails, so there is no need to specify a model.
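
A quick pre-flight check for configured keys might look like the sketch below. It rests on two assumptions that are not confirmed here: that the keys end up visible as environment variables, and that the listing order above is a reasonable display order (it is not necessarily the order in which the script tries backends). `configured_keys` is a hypothetical helper.

```python
import os

# Key names as listed above; the order is for display only (assumption).
API_KEY_NAMES = [
    "OPENAI_API_KEY",
    "GEMINI_API_KEY",
    "ARK_API_KEY",
    "DASHSCOPE_API_KEY",
    "MINIMAX_API_KEY",
    "LINKAI_API_KEY",
]


def configured_keys(env=None):
    """Return the names of API keys that are set to a non-empty value."""
    env = os.environ if env is None else env
    return [name for name in API_KEY_NAMES if env.get(name)]
```

An empty result means no backend can run, so the user should be asked to configure a key before any generation is attempted.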

Error Handling

If the script returns an error after trying all configured backends, do NOT retry with the same parameters: the failure is almost always a configuration issue (wrong API key, unsupported API base). Tell the user to fix it via `env_config`, then retry.

Notes

- HTTP timeout is 300s; high-resolution generation can take over 200s.
- Omit `quality` / `size` to let the model pick automatically (`auto`).
- Input images for editing are auto-compressed to ≤ 4MB / longest edge ≤ 4096px.
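
To make the longest-edge limit concrete, the arithmetic can be sketched as below. This only illustrates the 4096px constraint with a hypothetical `fit_longest_edge` helper. It is not the script's actual compression routine, and it ignores the 4MB file-size limit.

```python
MAX_EDGE_PX = 4096  # longest-edge limit stated in the notes above


def fit_longest_edge(width, height, max_edge=MAX_EDGE_PX):
    """Scale (width, height) down so the longest edge is at most max_edge."""
    longest = max(width, height)
    if longest <= max_edge:
        return width, height  # already within the limit, leave unchanged
    scale = max_edge / longest
    # Keep the aspect ratio; never let an edge collapse to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, an 8192x4096 input would be reduced to 4096x2048, while a 1024x768 input is left untouched.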
