baoyu-imagine

Image Generation (AI SDK)

Official API-based image generation. Supports OpenAI, Azure OpenAI, Google, OpenRouter, DashScope (阿里通义万象), MiniMax, Jimeng (即梦), Seedream (豆包) and Replicate providers.

Script Directory

Agent Execution:

1. `{baseDir}` = this SKILL.md file's directory
2. Script path = `{baseDir}/scripts/main.ts`
3. Resolve `${BUN_X}` runtime: if `bun` installed → `bun`; if `npx` available → `npx -y bun`; else suggest installing bun
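
The runtime fallback can be read as a small decision function. The sketch below is illustrative only: `resolveBunX` and the `RuntimeProbe` shape are hypothetical names, and how the caller detects whether `bun` or `npx` is on the PATH is deliberately left out.

```typescript
// Hypothetical helper; the flags would come from probing the PATH.
type RuntimeProbe = { bun: boolean; npx: boolean };

function resolveBunX(probe: RuntimeProbe): string | null {
  if (probe.bun) return "bun"; // bun is installed: call it directly
  if (probe.npx) return "npx -y bun"; // otherwise run bun through npx
  return null; // neither is available: suggest installing bun
}

// Example: only npx is available.
const bunX = resolveBunX({ bun: false, npx: true });
```

A `null` result corresponds to the "suggest installing bun" branch rather than to an error thrown by the helper.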

Step 0: Load Preferences ⛔ BLOCKING

CRITICAL: This step MUST complete BEFORE any image generation. Do NOT skip or defer.

Check EXTEND.md existence (priority: project → user):

```bash
# macOS, Linux, WSL, Git Bash
test -f .baoyu-skills/baoyu-imagine/EXTEND.md && echo "project"
test -f "${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-imagine/EXTEND.md" && echo "xdg"
test -f "$HOME/.baoyu-skills/baoyu-imagine/EXTEND.md" && echo "user"
```

```powershell
# PowerShell (Windows)
if (Test-Path .baoyu-skills/baoyu-imagine/EXTEND.md) { "project" }
$xdg = if ($env:XDG_CONFIG_HOME) { $env:XDG_CONFIG_HOME } else { "$HOME/.config" }
if (Test-Path "$xdg/baoyu-skills/baoyu-imagine/EXTEND.md") { "xdg" }
if (Test-Path "$HOME/.baoyu-skills/baoyu-imagine/EXTEND.md") { "user" }
```


| Result | Action |
|--------|--------|
| Found | Load, parse, apply settings. If `default_model.[provider]` is null → ask model only (Flow 2) |
| Not found | ⛔ Run first-time setup ([references/config/first-time-setup.md](references/config/first-time-setup.md)) → Save EXTEND.md → Then continue |

**CRITICAL**: If not found, complete the full setup (provider + model + quality + save location) using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.

| Path | Location |
|------|----------|
| `.baoyu-skills/baoyu-imagine/EXTEND.md` | Project directory |
| `$HOME/.baoyu-skills/baoyu-imagine/EXTEND.md` | User home |

Legacy compatibility: if `.baoyu-skills/baoyu-image-gen/EXTEND.md` exists and the new path does not, runtime renames it to `baoyu-imagine`. If both files exist, runtime leaves them unchanged and uses the new path.

**EXTEND.md Supports**: Default provider | Default quality | Default aspect ratio | Default image size | Default models | Batch worker cap | Provider-specific batch limits

Schema: `references/config/preferences-schema.md`
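
For readers who prefer code to shell one-liners, here is a minimal sketch of the same lookup. It assumes the first existing file wins in the order project, XDG, user; `findExtendMd` and `ExtendLocation` are hypothetical names, and the `exists` callback is injected so the example does not depend on any file-system API.

```typescript
// Hypothetical helper, not part of the skill's scripts.
type ExtendLocation = { source: "project" | "xdg" | "user"; path: string };

function findExtendMd(
  exists: (path: string) => boolean, // injected file-existence check
  home: string,
  xdgConfigHome?: string,
): ExtendLocation | null {
  // Mirrors ${XDG_CONFIG_HOME:-$HOME/.config}: unset or empty falls back to ~/.config.
  const xdgBase = xdgConfigHome || `${home}/.config`;
  const candidates: ExtendLocation[] = [
    { source: "project", path: ".baoyu-skills/baoyu-imagine/EXTEND.md" },
    { source: "xdg", path: `${xdgBase}/baoyu-skills/baoyu-imagine/EXTEND.md` },
    { source: "user", path: `${home}/.baoyu-skills/baoyu-imagine/EXTEND.md` },
  ];
  // Assumption: the first existing file wins, in the order listed above.
  for (const candidate of candidates) {
    if (exists(candidate.path)) return candidate;
  }
  return null; // not found: first-time setup must run before any generation
}
```

A `null` result maps to the "Not found" row: generation stays blocked until setup has saved an EXTEND.md.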


Usage

```bash
# Basic
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image cat.png

# With aspect ratio
${BUN_X} {baseDir}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9

# High quality
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k

# From prompt files
${BUN_X} {baseDir}/scripts/main.ts --promptfiles system.md content.md --image out.png

# With reference images (Google, OpenAI, Azure OpenAI, OpenRouter, Replicate, MiniMax, or Seedream 4.0/4.5/5.0)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png

# With reference images (explicit provider/model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png

# Azure OpenAI (model means deployment name)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider azure --model gpt-image-1.5

# OpenRouter (recommended default model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openrouter

# OpenRouter with reference images
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider openrouter --model google/gemini-3.1-flash-image-preview --ref source.png

# Specific provider
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openai

# DashScope (阿里通义万象)
${BUN_X} {baseDir}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope

# DashScope Qwen-Image 2.0 Pro (recommended for custom sizes and text rendering)
${BUN_X} {baseDir}/scripts/main.ts --prompt "为咖啡品牌设计一张 21:9 横幅海报，包含清晰中文标题" --image out.png --provider dashscope --model qwen-image-2.0-pro --size 2048x872

# DashScope legacy Qwen fixed-size model
${BUN_X} {baseDir}/scripts/main.ts --prompt "一张电影感海报" --image out.png --provider dashscope --model qwen-image-max --size 1664x928

# MiniMax
${BUN_X} {baseDir}/scripts/main.ts --prompt "A fashion editorial portrait by a bright studio window" --image out.jpg --provider minimax

# MiniMax with subject reference (best for character/portrait consistency)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A girl stands by the library window, cinematic lighting" --image out.jpg --provider minimax --model image-01 --ref portrait.png --ar 16:9

# MiniMax with custom size (documented for image-01)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cinematic poster" --image out.jpg --provider minimax --model image-01 --size 1536x1024

# Replicate (google/nano-banana-pro)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

# Replicate with specific model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana

# Batch mode with saved prompt files
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json

# Batch mode with explicit worker count
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json --jobs 4 --json
```

Batch File Format

```json
{
  "jobs": 4,
  "tasks": [
    {
      "id": "hero",
      "promptFiles": ["prompts/hero.md"],
      "image": "out/hero.png",
      "provider": "replicate",
      "model": "google/nano-banana-pro",
      "ar": "16:9",
      "quality": "2k"
    },
    {
      "id": "diagram",
      "promptFiles": ["prompts/diagram.md"],
      "image": "out/diagram.png",
      "ref": ["references/original.png"]
    }
  ]
}
```

Paths in `promptFiles`, `image`, and `ref` are resolved relative to the batch file's directory. `jobs` is optional (overridden by CLI `--jobs`). Top-level array format (without `jobs` wrapper) is also accepted.
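
The two accepted shapes and the path rule can be illustrated with a short sketch. This is not the script's actual loader: `normalizeBatch`, `resolveFrom`, and the types are hypothetical, and the path join is a POSIX-only simplification.

```typescript
// Hypothetical types for illustration; field names follow the JSON example.
type BatchTask = {
  id?: string;
  promptFiles?: string[];
  image: string;
  ref?: string[];
  [key: string]: unknown; // provider, model, ar, quality, ...
};
type BatchFile = { jobs?: number; tasks: BatchTask[] } | BatchTask[];

// Keep absolute paths; join relative ones onto the batch file's directory.
function resolveFrom(dir: string, p: string): string {
  return p.startsWith("/") ? p : `${dir.replace(/\/+$/, "")}/${p}`;
}

function normalizeBatch(raw: BatchFile, batchDir: string, cliJobs?: number) {
  const tasks = Array.isArray(raw) ? raw : raw.tasks; // top-level array or wrapper
  const fileJobs = Array.isArray(raw) ? undefined : raw.jobs;
  return {
    jobs: cliJobs ?? fileJobs, // CLI --jobs overrides the file's "jobs"
    tasks: tasks.map((t) => ({
      ...t,
      image: resolveFrom(batchDir, t.image),
      promptFiles: (t.promptFiles ?? []).map((p) => resolveFrom(batchDir, p)),
      ref: (t.ref ?? []).map((p) => resolveFrom(batchDir, p)),
    })),
  };
}
```

Under these assumptions, calling `normalizeBatch(parsedJson, "/work/batch", 2)` on the example file yields `jobs: 2` and `image: "/work/batch/out/hero.png"` for the first task.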

Options

| Option | Description |
|--------|-------------|
| `--prompt <text>`, `-p` | Prompt text |
| `--promptfiles <files...>` | Read prompt from files (concatenated) |
| `--image <path>` | Output image path (required in single-image mode) |
| `--batchfile <path>` | JSON batch file for multi-image generation |
| `--jobs <count>` | Worker count for batch mode (default: auto, max from config, built-in default 10) |
| `--provider google\|openai\|azure\|openrouter\|dashscope\|minimax\|jimeng\|seedream\|replicate` | Force provider (default: auto-detect) |
| `--model <id>`, `-m` | Model ID (Google: `gemini-3-pro-image-preview`; OpenAI: `gpt-image-1.5`; Azure: deployment name such as `gpt-image-1.5` or `image-prod`; OpenRouter: `google/gemini-3.1-flash-image-preview`; DashScope: `qwen-image-2.0-pro`; MiniMax: `image-01`) |
| `--ar <ratio>` | Aspect ratio (e.g., `16:9`, `1:1`, `4:3`) |
| `--size <WxH>` | Size (e.g., `1024x1024`) |
| `--quality normal\|2k` | Quality preset (default: `2k`) |
| `--imageSize 1K\|2K\|4K` | Image size for Google/OpenRouter (default: from quality) |
| `--ref <files...>` | Reference images. Supported by Google multimodal, OpenAI GPT Image edits, Azure OpenAI edits (PNG/JPG only), OpenRouter multimodal models, Replicate, MiniMax subject-reference, and Seedream 5.0/4.5/4.0. Not supported by Jimeng, Seedream 3.0, or removed SeedEdit 3.0 |
| `--n <count>` | Number of images |
| `--json` | JSON output |

Environment Variables

| Variable | Description |
|----------|-------------|
| `OPENAI_API_KEY` | OpenAI API key |
| `AZURE_OPENAI_API_KEY` | Azure OpenAI API key |
| `OPENROUTER_API_KEY` | OpenRouter API key |
| `GOOGLE_API_KEY` | Google API key |
| `DASHSCOPE_API_KEY` | DashScope API key (阿里云) |
| `MINIMAX_API_KEY` | MiniMax API key |
| `REPLICATE_API_TOKEN` | Replicate API token |
| `JIMENG_ACCESS_KEY_ID` | Jimeng (即梦) Volcengine access key |
| `JIMENG_SECRET_ACCESS_KEY` | Jimeng (即梦) Volcengine secret key |
| `ARK_API_KEY` | Seedream (豆包) Volcengine ARK API key |
| `OPENAI_IMAGE_MODEL` | OpenAI model override |
| `AZURE_OPENAI_DEPLOYMENT` | Azure default deployment name |
| `AZURE_OPENAI_IMAGE_MODEL` | Backward-compatible alias for Azure default deployment/model name |
| `OPENROUTER_IMAGE_MODEL` | OpenRouter model override (default: `google/gemini-3.1-flash-image-preview`) |
| `GOOGLE_IMAGE_MODEL` | Google model override |
| `DASHSCOPE_IMAGE_MODEL` | DashScope model override (default: `qwen-image-2.0-pro`) |
| `MINIMAX_IMAGE_MODEL` | MiniMax model override (default: `image-01`) |
| `REPLICATE_IMAGE_MODEL` | Replicate model override (default: `google/nano-banana-pro`) |
| `JIMENG_IMAGE_MODEL` | Jimeng model override (default: `jimeng_t2i_v40`) |
| `SEEDREAM_IMAGE_MODEL` | Seedream model override (default: `doubao-seedream-5-0-260128`) |
| `OPENAI_BASE_URL` | Custom OpenAI endpoint |
| `AZURE_OPENAI_BASE_URL` | Azure resource endpoint or deployment endpoint |
| `AZURE_API_VERSION` | Azure image API version (default: `2025-04-01-preview`) |
| `OPENROUTER_BASE_URL` | Custom OpenRouter endpoint (default: `https://openrouter.ai/api/v1`) |
| `OPENROUTER_HTTP_REFERER` | Optional app/site URL for OpenRouter attribution |
| `OPENROUTER_TITLE` | Optional app name for OpenRouter attribution |
| `GOOGLE_BASE_URL` | Custom Google endpoint |
| `DASHSCOPE_BASE_URL` | Custom DashScope endpoint |
| `MINIMAX_BASE_URL` | Custom MiniMax endpoint (default: `https://api.minimax.io`) |
| `REPLICATE_BASE_URL` | Custom Replicate endpoint |
| `JIMENG_BASE_URL` | Custom Jimeng endpoint (default: `https://visual.volcengineapi.com`) |
| `JIMENG_REGION` | Jimeng region (default: `cn-north-1`) |
| `SEEDREAM_BASE_URL` | Custom Seedream endpoint (default: `https://ark.cn-beijing.volces.com/api/v3`) |
| `BAOYU_IMAGE_GEN_MAX_WORKERS` | Override batch worker cap |
| `BAOYU_IMAGE_GEN_<PROVIDER>_CONCURRENCY` | Override provider concurrency, e.g. `BAOYU_IMAGE_GEN_REPLICATE_CONCURRENCY` |
| `BAOYU_IMAGE_GEN_<PROVIDER>_START_INTERVAL_MS` | Override provider start gap, e.g. `BAOYU_IMAGE_GEN_REPLICATE_START_INTERVAL_MS` |

Load Priority: CLI args > EXTEND.md > env vars > `<cwd>/.baoyu-skills/.env` > `~/.baoyu-skills/.env`

Model Resolution

Model priority (highest → lowest), applies to all providers:

1. CLI flag: `--model <id>`
2. EXTEND.md: `default_model.[provider]`
3. Env var: `<PROVIDER>_IMAGE_MODEL` (e.g., `GOOGLE_IMAGE_MODEL`)
4. Built-in default

For Azure, `--model` / `default_model.azure` should be the Azure deployment name. `AZURE_OPENAI_DEPLOYMENT` is the preferred env var, and `AZURE_OPENAI_IMAGE_MODEL` remains as a backward-compatible alias.

EXTEND.md overrides env vars. If both EXTEND.md `default_model.google: "gemini-3-pro-image-preview"` and env var `GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview` exist, EXTEND.md wins.

Agent MUST display model info before each generation:

- Show: `Using [provider] / [model]`
- Show switch hint: `Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL`
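
The priority order can be sketched as a single function. The names `resolveModel`, `ModelSources`, and `modelBanner` are hypothetical, and treating a `null` EXTEND.md entry as "fall through to the next source" is an assumption of this sketch (the agent flow may instead ask the user to pick a model).

```typescript
// Hypothetical shape: one optional value per source, plus the built-in default.
type ModelSources = {
  cliModel?: string; // --model <id>
  extendDefault?: string | null; // EXTEND.md default_model.[provider]
  envModel?: string; // <PROVIDER>_IMAGE_MODEL
  builtinDefault: string;
};

function resolveModel(s: ModelSources): { model: string; source: string } {
  if (s.cliModel) return { model: s.cliModel, source: "cli" };
  if (s.extendDefault) return { model: s.extendDefault, source: "EXTEND.md" };
  if (s.envModel) return { model: s.envModel, source: "env" };
  return { model: s.builtinDefault, source: "built-in" };
}

// Banner the agent shows before each generation.
function modelBanner(provider: string, model: string): string {
  return `Using ${provider} / ${model}`;
}

// The conflict described above: EXTEND.md beats the env var.
const picked = resolveModel({
  extendDefault: "gemini-3-pro-image-preview",
  envModel: "gemini-3.1-flash-image-preview",
  builtinDefault: "gemini-3-pro-image-preview",
});
const banner = modelBanner("google", picked.model);
```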

DashScope Models

Use `--model qwen-image-2.0-pro` or set `default_model.dashscope` / `DASHSCOPE_IMAGE_MODEL` when the user wants official Qwen-Image behavior.

Official DashScope model families:

- `qwen-image-2.0-pro`, `qwen-image-2.0-pro-2026-03-03`, `qwen-image-2.0`, `qwen-image-2.0-2026-03-03`
  - Free-form `size` in `width*height` (`宽*高`) format
  - Total pixels must stay between `512*512` and `2048*2048`
  - Default size is approximately `1024*1024`
  - Best choice for custom ratios such as `21:9` and text-heavy Chinese/English layouts
- `qwen-image-max`, `qwen-image-max-2025-12-30`, `qwen-image-plus`, `qwen-image-plus-2026-01-09`, `qwen-image`
  - Fixed sizes only: `1664*928`, `1472*1104`, `1328*1328`, `1104*1472`, `928*1664`
  - Default size is `1664*928`
  - `qwen-image` currently has the same capability as `qwen-image-plus`
- Legacy DashScope models such as `z-image-turbo`, `z-image-ultra`, `wanx-v1`
  - Keep using them only when the user explicitly asks for legacy behavior or compatibility

When translating CLI args into DashScope behavior:

- `--size` wins over `--ar`
- For `qwen-image-2.0*`, prefer explicit `--size`; otherwise infer from `--ar` and use the official recommended resolutions below
- For `qwen-image-max/plus/image`, only use the five official fixed sizes; if the requested ratio is not covered, switch to `qwen-image-2.0-pro`
- `--quality` is a baoyu-imagine compatibility preset, not a native DashScope API field. Mapping `normal` / `2k` onto the `qwen-image-2.0*` table below is an implementation inference, not an official API guarantee

Recommended `qwen-image-2.0*` sizes for common aspect ratios:

| Ratio | `normal` | `2k` |
|-------|----------|------|
| `1:1` | `1024*1024` | `1536*1536` |
| `2:3` | `768*1152` | `1024*1536` |
| `3:2` | `1152*768` | `1536*1024` |
| `3:4` | `960*1280` | `1080*1440` |
| `4:3` | `1280*960` | `1440*1080` |
| `9:16` | `720*1280` | `1080*1920` |
| `16:9` | `1280*720` | `1920*1080` |
| `21:9` | `1344*576` | `2048*872` |

DashScope official APIs also expose `negative_prompt`, `prompt_extend`, and `watermark`, but `baoyu-imagine` does not expose them as dedicated CLI flags today.
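
As an illustration of the size rules for the `qwen-image-2.0*` family, the sketch below looks up the recommended resolution from the ratio table. `qwen2Size` and `isFixedQwenSize` are hypothetical helpers; converting a CLI `WxH` value to DashScope's `W*H` form and defaulting to the `2k` column when no quality is given are assumptions of this sketch.

```typescript
type Quality = "normal" | "2k";

// Recommended sizes, copied from the ratio table in this section.
const QWEN2_SIZES: Record<string, Record<Quality, string>> = {
  "1:1": { normal: "1024*1024", "2k": "1536*1536" },
  "2:3": { normal: "768*1152", "2k": "1024*1536" },
  "3:2": { normal: "1152*768", "2k": "1536*1024" },
  "3:4": { normal: "960*1280", "2k": "1080*1440" },
  "4:3": { normal: "1280*960", "2k": "1440*1080" },
  "9:16": { normal: "720*1280", "2k": "1080*1920" },
  "16:9": { normal: "1280*720", "2k": "1920*1080" },
  "21:9": { normal: "1344*576", "2k": "2048*872" },
};

// The five fixed sizes accepted by qwen-image-max / plus / image.
const FIXED_QWEN_SIZES = ["1664*928", "1472*1104", "1328*1328", "1104*1472", "928*1664"];

function qwen2Size(opts: { size?: string; ar?: string; quality?: Quality }): string | undefined {
  // --size wins over --ar; assumed conversion from CLI "WxH" to "W*H".
  if (opts.size) return opts.size.toLowerCase().replace("x", "*");
  const row = opts.ar ? QWEN2_SIZES[opts.ar] : undefined;
  if (row) return row[opts.quality ?? "2k"];
  return undefined; // no hint: leave sizing to the API default
}

function isFixedQwenSize(size: string): boolean {
  return FIXED_QWEN_SIZES.includes(size);
}
```

For a fixed-size model, a caller would check `isFixedQwenSize` first and switch to `qwen-image-2.0-pro` when the requested size is not one of the five.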

MiniMax Models

Use `--model image-01` or set `default_model.minimax` / `MINIMAX_IMAGE_MODEL` when the user wants MiniMax image generation.

Official MiniMax image model options currently documented in the API reference:

- `image-01` (recommended default)
  - Supports text-to-image and subject-reference image generation
  - Supports official `aspect_ratio` values: `1:1`, `16:9`, `4:3`, `3:2`, `2:3`, `3:4`, `9:16`, `21:9`
  - Supports documented custom `width` / `height` output sizes when using `--size <WxH>`
  - `width` and `height` must both be between `512` and `2048`, and both must be divisible by `8`
- `image-01-live`
  - Lower-latency variant
  - Use `--ar` for sizing; MiniMax documents custom `width` / `height` as only effective for `image-01`

MiniMax subject reference notes:

- `--ref` files are sent as MiniMax `subject_reference`
- MiniMax docs currently describe `subject_reference[].type` as `character`
- Official docs say `image_file` supports public URLs or Base64 Data URLs; `baoyu-imagine` sends local refs as Data URLs
- Official docs recommend front-facing portrait references in JPG/JPEG/PNG under 10MB
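
The `image-01` size constraints are easy to check before sending a request. The following is a minimal sketch, assuming the CLI value arrives as `WxH`; `parseMinimaxSize` is a hypothetical helper and the error messages are illustrative.

```typescript
// Hypothetical validator for --size <WxH> when targeting image-01.
function parseMinimaxSize(size: string): { width: number; height: number } {
  const m = /^(\d+)x(\d+)$/i.exec(size.trim());
  if (!m) throw new Error(`Expected <W>x<H>, got "${size}"`);
  const width = Number(m[1]);
  const height = Number(m[2]);
  for (const [name, value] of [["width", width], ["height", height]] as const) {
    // Both dimensions must be within 512..2048 and divisible by 8.
    if (value < 512 || value > 2048) throw new Error(`${name} must be between 512 and 2048`);
    if (value % 8 !== 0) throw new Error(`${name} must be divisible by 8`);
  }
  return { width, height };
}

// The custom size used in the usage examples passes both checks.
const posterSize = parseMinimaxSize("1536x1024");
```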

OpenRouter Models

Use full OpenRouter model IDs, e.g.:

- `google/gemini-3.1-flash-image-preview` (recommended, supports image output and reference-image workflows)
- `google/gemini-2.5-flash-image-preview`
- `black-forest-labs/flux.2-pro`
- Other OpenRouter image-capable model IDs

Notes:

- OpenRouter image generation uses `/chat/completions`, not the OpenAI `/images` endpoints
- If `--ref` is used, choose a multimodal model that supports image input and image output
- `--imageSize` maps to OpenRouter `imageGenerationOptions.size`; `--size <WxH>` is converted to the nearest OpenRouter size and inferred aspect ratio when possible

Replicate Models

Supported model formats:

- `owner/name` (recommended for official models), e.g. `google/nano-banana-pro`
- `owner/name:version` (community models by version), e.g. `stability-ai/sdxl:<version>`

Examples:

```bash
# Use Replicate default model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

# Override model explicitly
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
```

Provider Selection

1. `--ref` provided + no `--provider` → auto-select Google first, then OpenAI, then Azure, then OpenRouter, then Replicate, then Seedream, then MiniMax (MiniMax subject reference is more specialized toward character/portrait consistency)
2. `--provider` specified → use it (if `--ref`, must be `google`, `openai`, `azure`, `openrouter`, `replicate`, `seedream`, or `minimax`)
3. Only one API key available → use that provider
4. Multiple available → default to Google
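
These selection rules can be expressed as one function. The sketch below is a reading of the rules, not the script's code: `selectProvider` is a hypothetical name, `available` stands for providers whose API key was found, and falling back to the first available provider when several keys exist but Google's does not is an assumption.

```typescript
type Provider =
  | "google" | "openai" | "azure" | "openrouter" | "dashscope"
  | "minimax" | "jimeng" | "seedream" | "replicate";

// Auto-selection order when --ref is given without --provider.
const REF_ORDER: Provider[] = [
  "google", "openai", "azure", "openrouter", "replicate", "seedream", "minimax",
];

function selectProvider(opts: {
  provider?: Provider;
  hasRef: boolean;
  available: Provider[];
}): Provider {
  if (opts.provider) {
    if (opts.hasRef && !REF_ORDER.includes(opts.provider)) {
      throw new Error(`--ref is not supported by provider "${opts.provider}"`);
    }
    return opts.provider;
  }
  if (opts.hasRef) {
    const pick = REF_ORDER.find((p) => opts.available.includes(p));
    if (!pick) throw new Error("No configured provider supports --ref");
    return pick;
  }
  const first = opts.available[0];
  if (first === undefined) throw new Error("Missing API key: no provider is configured");
  if (opts.available.length === 1) return first;
  // Several keys: default to Google; otherwise (assumption) take the first one.
  return opts.available.includes("google") ? "google" : first;
}
```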

Quality Presets

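| Preset | Google imageSize | OpenAI Size | OpenRouter size | Replicate resolution | Use Case |
|--------|------------------|-------------|-----------------|----------------------|----------|
| `normal` | 1K | 1024px | 1K | 1K | Quick previews |
| `2k` (default) | 2K | 2048px | 2K | 2K | Covers, illustrations, infographics |

Google/OpenRouter imageSize: Can be overridden with `--imageSize 1K|2K|4K`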

Aspect Ratios

Supported: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2.35:1`

- Google multimodal: uses `imageConfig.aspectRatio`
- OpenAI: maps to closest supported size
- OpenRouter: sends `imageGenerationOptions.aspect_ratio`; if only `--size <WxH>` is given, aspect ratio is inferred automatically
- Replicate: passes `aspect_ratio` to model; when `--ref` is provided without `--ar`, defaults to `match_input_image`
- MiniMax: sends official `aspect_ratio` values directly; if `--size <WxH>` is given without `--ar`, `width` / `height` are sent for `image-01`
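
How an aspect ratio might be inferred from `--size <WxH>` is not spelled out here. One plausible approach, shown purely as an assumption, is to pick the supported ratio closest to the requested width/height on a logarithmic scale; `nearestRatio` is a hypothetical helper and the script may infer ratios differently.

```typescript
// Ratios listed as supported in this section.
const SUPPORTED_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4", "2.35:1"];

function ratioValue(ratio: string): number {
  const parts = ratio.split(":");
  return Number(parts[0]) / Number(parts[1]);
}

// Assumed inference: choose the supported ratio with the smallest log-distance.
function nearestRatio(size: string): string {
  const m = /^(\d+)x(\d+)$/i.exec(size.trim());
  if (!m) throw new Error(`Expected <W>x<H>, got "${size}"`);
  const width = Number(m[1]);
  const height = Number(m[2]);
  if (width === 0 || height === 0) throw new Error("width and height must be positive");
  const target = width / height;
  let best = "1:1";
  let bestDistance = Infinity;
  for (const ratio of SUPPORTED_RATIOS) {
    const distance = Math.abs(Math.log(ratioValue(ratio) / target));
    if (distance < bestDistance) {
      best = ratio;
      bestDistance = distance;
    }
  }
  return best;
}
```

Comparing on a log scale treats "twice as wide" and "twice as tall" symmetrically, which a plain difference of ratios would not.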

Generation Mode

Default: Sequential generation.

Batch Parallel Generation: When `--batchfile` contains 2 or more pending tasks, the script automatically enables parallel generation.

| Mode | When to Use |
|------|-------------|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel batch | Batch mode with 2+ tasks |

Execution choice:

| Situation | Preferred approach | Why |
|-----------|--------------------|-----|
| One image, or 1-2 simple images | Sequential | Lower coordination overhead and easier debugging |
| Multiple images already have saved prompt files | Batch (`--batchfile`) | Reuses finalized prompts, applies shared throttling/retries, and gives predictable throughput |
| Each image still needs separate reasoning, prompt writing, or style exploration | Subagents | The work is still exploratory, so each image may need independent analysis before generation |
| Output comes from `baoyu-article-illustrator` with `outline.md` + `prompts/` | Batch (`build-batch.ts` -> `--batchfile`) | That workflow already produces prompt files, so direct batch execution is the intended path |

Rule of thumb:

- Prefer batch over subagents once prompt files are already saved and the task is "generate all of these"
- Use subagents only when generation is coupled with per-image thinking, rewriting, or divergent creative exploration

Parallel behavior:

- Default worker count is automatic, capped by config, built-in default 10
- Provider-specific throttling is applied only in batch mode, and the built-in defaults are tuned for faster throughput while still avoiding obvious RPM bursts
- You can override worker count with `--jobs <count>`
- Each image retries automatically up to 3 attempts
- Final output includes success count, failure count, and per-image failure reasons
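
The retry and reporting behavior can be pictured with a simplified, synchronous sketch. The real script presumably awaits network calls and runs workers in parallel; `runWithRetry`, `summarize`, and `TaskResult` are hypothetical names used only to show "up to 3 attempts" and the shape of a final summary.

```typescript
type TaskResult = { id: string; ok: boolean; attempts: number; error?: string };

// Try one image up to maxAttempts times (3 by default) and record the outcome.
function runWithRetry(id: string, generate: () => void, maxAttempts = 3): TaskResult {
  let lastError = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      generate();
      return { id, ok: true, attempts: attempt };
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  return { id, ok: false, attempts: maxAttempts, error: lastError };
}

// Success count, failure count, and per-image failure reasons.
function summarize(results: TaskResult[]) {
  const failures = results.filter((r) => !r.ok);
  return {
    succeeded: results.length - failures.length,
    failed: failures.length,
    reasons: failures.map((r) => ({ id: r.id, error: r.error ?? "unknown" })),
  };
}
```

Passing the collected `TaskResult` values to `summarize` yields the three pieces of information the final output is described as containing.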

Error Handling

- Missing API key → error with setup instructions
- Generation failure → auto-retry up to 3 attempts per image
- Invalid aspect ratio → warning, proceed with default
- Reference images with unsupported provider/model → error with fix hint

Extension Support

Custom configurations via EXTEND.md. See Preferences section for paths and supported options.
