baoyu-imagine


Image Generation (AI SDK)

Official API-based image generation. Supports OpenAI, Azure OpenAI, Google, OpenRouter, DashScope (阿里通义万象), MiniMax, Jimeng (即梦), Seedream (豆包) and Replicate providers.

Script Directory

Agent Execution:
  1. `{baseDir}` = this SKILL.md file's directory
  2. Script path = `{baseDir}/scripts/main.ts`
  3. Resolve the `${BUN_X}` runtime: if `bun` is installed → `bun`; if `npx` is available → `npx -y bun`; else suggest installing bun
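
The runtime resolution in step 3 can be sketched as a small pure function. This is only an illustration, not the skill's actual code: `resolveBunX` and the `hasCommand` callback are hypothetical names, and the callback stands in for whatever PATH lookup the agent performs.

```typescript
// Hypothetical helper: `hasCommand` is an assumed callback that reports
// whether an executable is available on PATH.
type HasCommand = (name: string) => boolean;

// Returns the command prefix to use for ${BUN_X}, or null when neither
// bun nor npx is available (the agent should then suggest installing bun).
function resolveBunX(hasCommand: HasCommand): string | null {
  if (hasCommand("bun")) return "bun";
  if (hasCommand("npx")) return "npx -y bun";
  return null;
}
```

A `null` result is the signal to stop and suggest installing bun rather than guessing another runtime.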

Step 0: Load Preferences ⛔ BLOCKING

**CRITICAL**: This step MUST complete BEFORE any image generation. Do NOT skip or defer.

Check EXTEND.md existence (priority: project → user):

macOS, Linux, WSL, Git Bash:

```bash
# Each check is a separate command; a match prints where the file was found
test -f .baoyu-skills/baoyu-imagine/EXTEND.md && echo "project"
test -f "${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-imagine/EXTEND.md" && echo "xdg"
test -f "$HOME/.baoyu-skills/baoyu-imagine/EXTEND.md" && echo "user"
```

PowerShell (Windows):

```powershell
# Each check is a separate statement; a match prints where the file was found
if (Test-Path .baoyu-skills/baoyu-imagine/EXTEND.md) { "project" }
$xdg = if ($env:XDG_CONFIG_HOME) { $env:XDG_CONFIG_HOME } else { "$HOME/.config" }
if (Test-Path "$xdg/baoyu-skills/baoyu-imagine/EXTEND.md") { "xdg" }
if (Test-Path "$HOME/.baoyu-skills/baoyu-imagine/EXTEND.md") { "user" }
```

| Result | Action |
|--------|--------|
| Found | Load, parse, apply settings. If `default_model.[provider]` is null → ask model only (Flow 2) |
| Not found | ⛔ Run first-time setup ([references/config/first-time-setup.md](references/config/first-time-setup.md)) → Save EXTEND.md → Then continue |

**CRITICAL**: If not found, complete the full setup (provider + model + quality + save location) using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.

| Path | Location |
|------|----------|
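| `.baoyu-skills/baoyu-imagine/EXTEND.md` | Project directory |
| `${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-imagine/EXTEND.md` | XDG config directory |
| `$HOME/.baoyu-skills/baoyu-imagine/EXTEND.md` | User home |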

Legacy compatibility: if `.baoyu-skills/baoyu-image-gen/EXTEND.md` exists and the new path does not, runtime renames it to `baoyu-imagine`. If both files exist, runtime leaves them unchanged and uses the new path.

**EXTEND.md Supports**: Default provider | Default quality | Default aspect ratio | Default image size | Default models | Batch worker cap | Provider-specific batch limits

Schema: `references/config/preferences-schema.md`
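
The lookup order checked by the shell commands can also be expressed as a function, which may help when reasoning about which file wins. This is a minimal sketch under stated assumptions: `findExtendConfig`, `LookupEnv`, and the `exists` callback are hypothetical, and placing the XDG location between project and user simply mirrors the order of the checks above.

```typescript
type ConfigSource = "project" | "xdg" | "user";

interface LookupEnv {
  home: string;           // value of $HOME
  xdgConfigHome?: string; // value of $XDG_CONFIG_HOME, if set
}

// Candidate EXTEND.md locations in the order the checks above run.
function candidatePaths(env: LookupEnv): Array<[ConfigSource, string]> {
  // Mirrors ${XDG_CONFIG_HOME:-$HOME/.config}: fall back when unset or empty.
  const xdg = env.xdgConfigHome ? env.xdgConfigHome : `${env.home}/.config`;
  return [
    ["project", ".baoyu-skills/baoyu-imagine/EXTEND.md"],
    ["xdg", `${xdg}/baoyu-skills/baoyu-imagine/EXTEND.md`],
    ["user", `${env.home}/.baoyu-skills/baoyu-imagine/EXTEND.md`],
  ];
}

// `exists` is an assumed callback (for example a file-system check).
// Returns the first match, or null when first-time setup must run.
function findExtendConfig(
  env: LookupEnv,
  exists: (path: string) => boolean
): { source: ConfigSource; path: string } | null {
  for (const [source, path] of candidatePaths(env)) {
    if (exists(path)) return { source, path };
  }
  return null;
}
```

A `null` result corresponds to the "Not found" row: generation stays blocked until setup has saved an EXTEND.md.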

Usage

Basic

${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image cat.png

With aspect ratio

${BUN_X} {baseDir}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9

High quality

${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k

From prompt files

${BUN_X} {baseDir}/scripts/main.ts --promptfiles system.md content.md --image out.png

With reference images (Google, OpenAI, Azure OpenAI, OpenRouter, Replicate, MiniMax, or Seedream 4.0/4.5/5.0)

${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png

With reference images (explicit provider/model)

${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png

Azure OpenAI (model means deployment name)

${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider azure --model gpt-image-1.5

OpenRouter (recommended default model)

${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openrouter

OpenRouter with reference images

${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider openrouter --model google/gemini-3.1-flash-image-preview --ref source.png

Specific provider

${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openai

DashScope (阿里通义万象)

${BUN_X} {baseDir}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope

DashScope Qwen-Image 2.0 Pro (recommended for custom sizes and text rendering)

${BUN_X} {baseDir}/scripts/main.ts --prompt "为咖啡品牌设计一张 21:9 横幅海报,包含清晰中文标题" --image out.png --provider dashscope --model qwen-image-2.0-pro --size 2048x872

DashScope legacy Qwen fixed-size model

${BUN_X} {baseDir}/scripts/main.ts --prompt "一张电影感海报" --image out.png --provider dashscope --model qwen-image-max --size 1664x928

MiniMax

${BUN_X} {baseDir}/scripts/main.ts --prompt "A fashion editorial portrait by a bright studio window" --image out.jpg --provider minimax

MiniMax with subject reference (best for character/portrait consistency)

${BUN_X} {baseDir}/scripts/main.ts --prompt "A girl stands by the library window, cinematic lighting" --image out.jpg --provider minimax --model image-01 --ref portrait.png --ar 16:9

MiniMax with custom size (documented for image-01)

${BUN_X} {baseDir}/scripts/main.ts --prompt "A cinematic poster" --image out.jpg --provider minimax --model image-01 --size 1536x1024

Replicate (google/nano-banana-pro)

${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

Replicate with specific model

${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana

Batch mode with saved prompt files

${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json

Batch mode with explicit worker count

${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json --jobs 4 --json

Batch File Format

{
  "jobs": 4,
  "tasks": [
    {
      "id": "hero",
      "promptFiles": ["prompts/hero.md"],
      "image": "out/hero.png",
      "provider": "replicate",
      "model": "google/nano-banana-pro",
      "ar": "16:9",
      "quality": "2k"
    },
    {
      "id": "diagram",
      "promptFiles": ["prompts/diagram.md"],
      "image": "out/diagram.png",
      "ref": ["references/original.png"]
    }
  ]
}

Paths in `promptFiles`, `image`, and `ref` are resolved relative to the batch file's directory. `jobs` is optional (overridden by CLI `--jobs`). Top-level array format (without the `jobs` wrapper) is also accepted.
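
The two accepted shapes and the path rule can be illustrated with a small normalizer. This is a sketch under stated assumptions rather than the script's real loader: `normalizeBatch`, `resolveFrom`, and the type names are hypothetical, and treating paths that start with `/` as absolute is a simplification.

```typescript
interface BatchTask {
  id?: string;
  promptFiles?: string[];
  image: string;
  provider?: string;
  model?: string;
  ar?: string;
  quality?: string;
  ref?: string[];
}

// Either a bare task array or the { jobs, tasks } wrapper.
type BatchFile = BatchTask[] | { jobs?: number; tasks: BatchTask[] };

interface NormalizedBatch {
  jobs: number | undefined;
  tasks: BatchTask[];
}

// Assumed rule: relative paths are joined onto the batch file's directory.
function resolveFrom(baseDir: string, p: string): string {
  return p.charAt(0) === "/" ? p : `${baseDir}/${p}`;
}

function normalizeBatch(file: BatchFile, batchDir: string, cliJobs?: number): NormalizedBatch {
  const tasks = Array.isArray(file) ? file : file.tasks;
  const fileJobs = Array.isArray(file) ? undefined : file.jobs;
  return {
    jobs: cliJobs ?? fileJobs, // CLI --jobs overrides the file's jobs
    tasks: tasks.map((t) => ({
      ...t,
      image: resolveFrom(batchDir, t.image),
      promptFiles: t.promptFiles?.map((p) => resolveFrom(batchDir, p)),
      ref: t.ref?.map((p) => resolveFrom(batchDir, p)),
    })),
  };
}
```

Normalizing both shapes up front means the rest of a batch runner only has to handle one structure.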

Options

| Option | Description |
|--------|-------------|
| `--prompt <text>`, `-p` | Prompt text |
| `--promptfiles <files...>` | Read prompt from files (concatenated) |
| `--image <path>` | Output image path (required in single-image mode) |
| `--batchfile <path>` | JSON batch file for multi-image generation |
| `--jobs <count>` | Worker count for batch mode (default: auto, max from config, built-in default 10) |
| `--provider google\|openai\|azure\|openrouter\|dashscope\|minimax\|jimeng\|seedream\|replicate` | Force provider (default: auto-detect) |
| `--model <id>`, `-m` | Model ID (Google: `gemini-3-pro-image-preview`; OpenAI: `gpt-image-1.5`; Azure: deployment name such as `gpt-image-1.5` or `image-prod`; OpenRouter: `google/gemini-3.1-flash-image-preview`; DashScope: `qwen-image-2.0-pro`; MiniMax: `image-01`) |
| `--ar <ratio>` | Aspect ratio (e.g., `16:9`, `1:1`, `4:3`) |
| `--size <WxH>` | Size (e.g., `1024x1024`) |
| `--quality normal\|2k` | Quality preset (default: `2k`) |
| `--imageSize 1K\|2K\|4K` | Image size for Google/OpenRouter (default: from quality) |
| `--ref <files...>` | Reference images. Supported by Google multimodal, OpenAI GPT Image edits, Azure OpenAI edits (PNG/JPG only), OpenRouter multimodal models, Replicate, MiniMax subject-reference, and Seedream 5.0/4.5/4.0. Not supported by Jimeng, Seedream 3.0, or removed SeedEdit 3.0 |
| `--n <count>` | Number of images |
| `--json` | JSON output |

Environment Variables

| Variable | Description |
|----------|-------------|
| `OPENAI_API_KEY` | OpenAI API key |
| `AZURE_OPENAI_API_KEY` | Azure OpenAI API key |
| `OPENROUTER_API_KEY` | OpenRouter API key |
| `GOOGLE_API_KEY` | Google API key |
| `DASHSCOPE_API_KEY` | DashScope API key (阿里云) |
| `MINIMAX_API_KEY` | MiniMax API key |
| `REPLICATE_API_TOKEN` | Replicate API token |
| `JIMENG_ACCESS_KEY_ID` | Jimeng (即梦) Volcengine access key |
| `JIMENG_SECRET_ACCESS_KEY` | Jimeng (即梦) Volcengine secret key |
| `ARK_API_KEY` | Seedream (豆包) Volcengine ARK API key |
| `OPENAI_IMAGE_MODEL` | OpenAI model override |
| `AZURE_OPENAI_DEPLOYMENT` | Azure default deployment name |
| `AZURE_OPENAI_IMAGE_MODEL` | Backward-compatible alias for Azure default deployment/model name |
| `OPENROUTER_IMAGE_MODEL` | OpenRouter model override (default: `google/gemini-3.1-flash-image-preview`) |
| `GOOGLE_IMAGE_MODEL` | Google model override |
| `DASHSCOPE_IMAGE_MODEL` | DashScope model override (default: `qwen-image-2.0-pro`) |
| `MINIMAX_IMAGE_MODEL` | MiniMax model override (default: `image-01`) |
| `REPLICATE_IMAGE_MODEL` | Replicate model override (default: `google/nano-banana-pro`) |
| `JIMENG_IMAGE_MODEL` | Jimeng model override (default: `jimeng_t2i_v40`) |
| `SEEDREAM_IMAGE_MODEL` | Seedream model override (default: `doubao-seedream-5-0-260128`) |
| `OPENAI_BASE_URL` | Custom OpenAI endpoint |
| `AZURE_OPENAI_BASE_URL` | Azure resource endpoint or deployment endpoint |
| `AZURE_API_VERSION` | Azure image API version (default: `2025-04-01-preview`) |
| `OPENROUTER_BASE_URL` | Custom OpenRouter endpoint (default: `https://openrouter.ai/api/v1`) |
| `OPENROUTER_HTTP_REFERER` | Optional app/site URL for OpenRouter attribution |
| `OPENROUTER_TITLE` | Optional app name for OpenRouter attribution |
| `GOOGLE_BASE_URL` | Custom Google endpoint |
| `DASHSCOPE_BASE_URL` | Custom DashScope endpoint |
| `MINIMAX_BASE_URL` | Custom MiniMax endpoint (default: `https://api.minimax.io`) |
| `REPLICATE_BASE_URL` | Custom Replicate endpoint |
| `JIMENG_BASE_URL` | Custom Jimeng endpoint (default: `https://visual.volcengineapi.com`) |
| `JIMENG_REGION` | Jimeng region (default: `cn-north-1`) |
| `SEEDREAM_BASE_URL` | Custom Seedream endpoint (default: `https://ark.cn-beijing.volces.com/api/v3`) |
| `BAOYU_IMAGE_GEN_MAX_WORKERS` | Override batch worker cap |
| `BAOYU_IMAGE_GEN_<PROVIDER>_CONCURRENCY` | Override provider concurrency, e.g. `BAOYU_IMAGE_GEN_REPLICATE_CONCURRENCY` |
| `BAOYU_IMAGE_GEN_<PROVIDER>_START_INTERVAL_MS` | Override provider start gap, e.g. `BAOYU_IMAGE_GEN_REPLICATE_START_INTERVAL_MS` |

**Load Priority**: CLI args > EXTEND.md > env vars > `<cwd>/.baoyu-skills/.env` > `~/.baoyu-skills/.env`

Model Resolution

Model priority (highest → lowest), applies to all providers:
  1. CLI flag: `--model <id>`
  2. EXTEND.md: `default_model.[provider]`
  3. Env var: `<PROVIDER>_IMAGE_MODEL` (e.g., `GOOGLE_IMAGE_MODEL`)
  4. Built-in default

For Azure, `--model` / `default_model.azure` should be the Azure deployment name. `AZURE_OPENAI_DEPLOYMENT` is the preferred env var, and `AZURE_OPENAI_IMAGE_MODEL` remains as a backward-compatible alias.

EXTEND.md overrides env vars. If both EXTEND.md `default_model.google: "gemini-3-pro-image-preview"` and env var `GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview` exist, EXTEND.md wins.

Agent MUST display model info before each generation:
  • Show: `Using [provider] / [model]`
  • Show switch hint: `Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL`
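
The four-level priority can be written as a short fallthrough. The names `resolveModel`, `ModelSources`, and `modelBanner` are hypothetical, chosen only for illustration; the sketch assumes the three optional inputs have already been read from the CLI, EXTEND.md, and the environment.

```typescript
interface ModelSources {
  cliModel?: string;      // --model <id>
  extendDefault?: string; // default_model.[provider] from EXTEND.md
  envModel?: string;      // <PROVIDER>_IMAGE_MODEL
  builtinDefault: string; // built-in default for the provider
}

type ModelSource = "cli" | "extend" | "env" | "builtin";

// First non-empty source wins, from highest to lowest priority.
function resolveModel(s: ModelSources): { model: string; source: ModelSource } {
  if (s.cliModel) return { model: s.cliModel, source: "cli" };
  if (s.extendDefault) return { model: s.extendDefault, source: "extend" };
  if (s.envModel) return { model: s.envModel, source: "env" };
  return { model: s.builtinDefault, source: "builtin" };
}

// The line the agent is required to show before each generation.
function modelBanner(provider: string, model: string): string {
  return `Using ${provider} / ${model}`;
}
```

With `extendDefault` set to `gemini-3-pro-image-preview` and `envModel` set to `gemini-3.1-flash-image-preview`, the function returns the EXTEND.md value, matching the example in the text.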

DashScope Models

Use `--model qwen-image-2.0-pro` or set `default_model.dashscope` / `DASHSCOPE_IMAGE_MODEL` when the user wants official Qwen-Image behavior.

Official DashScope model families:
  • `qwen-image-2.0-pro`, `qwen-image-2.0-pro-2026-03-03`, `qwen-image-2.0`, `qwen-image-2.0-2026-03-03`
    • Free-form `size` in `width*height` format
    • Total pixels must stay between `512*512` and `2048*2048`
    • Default size is approximately `1024*1024`
    • Best choice for custom ratios such as `21:9` and text-heavy Chinese/English layouts
  • `qwen-image-max`, `qwen-image-max-2025-12-30`, `qwen-image-plus`, `qwen-image-plus-2026-01-09`, `qwen-image`
    • Fixed sizes only: `1664*928`, `1472*1104`, `1328*1328`, `1104*1472`, `928*1664`
    • Default size is `1664*928`
    • `qwen-image` currently has the same capability as `qwen-image-plus`
  • Legacy DashScope models such as `z-image-turbo`, `z-image-ultra`, `wanx-v1`
    • Keep using them only when the user explicitly asks for legacy behavior or compatibility

When translating CLI args into DashScope behavior:
  • `--size` wins over `--ar`
  • For `qwen-image-2.0*`, prefer explicit `--size`; otherwise infer from `--ar` and use the official recommended resolutions below
  • For `qwen-image-max/plus/image`, only use the five official fixed sizes; if the requested ratio is not covered, switch to `qwen-image-2.0-pro`
  • `--quality` is a baoyu-imagine compatibility preset, not a native DashScope API field. Mapping `normal` / `2k` onto the `qwen-image-2.0*` table below is an implementation inference, not an official API guarantee

Recommended `qwen-image-2.0*` sizes for common aspect ratios:

| Ratio | `normal` | `2k` |
|-------|----------|------|
| `1:1` | `1024*1024` | `1536*1536` |
| `2:3` | `768*1152` | `1024*1536` |
| `3:2` | `1152*768` | `1536*1024` |
| `3:4` | `960*1280` | `1080*1440` |
| `4:3` | `1280*960` | `1440*1080` |
| `9:16` | `720*1280` | `1080*1920` |
| `16:9` | `1280*720` | `1920*1080` |
| `21:9` | `1344*576` | `2048*872` |

DashScope official APIs also expose `negative_prompt`, `prompt_extend`, and `watermark`, but `baoyu-imagine` does not expose them as dedicated CLI flags today.
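
The `qwen-image-2.0*` rules above (`--size` wins over `--ar`, the pixel bounds, and the recommended table) can be sketched as a lookup. This is an illustration only: `pickQwen2Size` is a hypothetical name, converting the CLI's `WxH` into `width*height` is an assumption based on the examples, and returning `null` when neither flag is given simply leaves sizing to the API default.

```typescript
type Quality = "normal" | "2k";

// Recommended sizes from the table above, keyed by aspect ratio.
const QWEN_2_SIZES: Record<string, Record<Quality, string>> = {
  "1:1": { normal: "1024*1024", "2k": "1536*1536" },
  "2:3": { normal: "768*1152", "2k": "1024*1536" },
  "3:2": { normal: "1152*768", "2k": "1536*1024" },
  "3:4": { normal: "960*1280", "2k": "1080*1440" },
  "4:3": { normal: "1280*960", "2k": "1440*1080" },
  "9:16": { normal: "720*1280", "2k": "1080*1920" },
  "16:9": { normal: "1280*720", "2k": "1920*1080" },
  "21:9": { normal: "1344*576", "2k": "2048*872" },
};

// Total pixels must stay between 512*512 and 2048*2048.
function isValidQwen2Pixels(width: number, height: number): boolean {
  const pixels = width * height;
  return pixels >= 512 * 512 && pixels <= 2048 * 2048;
}

// Returns a DashScope-style "width*height" string, or null when no size
// can be derived (invalid --size, unknown ratio, or nothing requested).
function pickQwen2Size(opts: { size?: string; ar?: string; quality?: Quality }): string | null {
  if (opts.size !== undefined) {
    // --size wins over --ar
    const m = /^(\d+)[x*](\d+)$/.exec(opts.size);
    if (m === null) return null;
    const width = Number(m[1]);
    const height = Number(m[2]);
    return isValidQwen2Pixels(width, height) ? `${width}*${height}` : null;
  }
  if (opts.ar === undefined) return null;
  const row = QWEN_2_SIZES[opts.ar];
  if (row === undefined) return null;
  return row[opts.quality ?? "2k"]; // 2k is the CLI's default quality preset
}
```

A `null` for an unknown ratio leaves the caller free to ask for an explicit `--size` instead of guessing.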
Official references:

MiniMax Models

Use `--model image-01` or set `default_model.minimax` / `MINIMAX_IMAGE_MODEL` when the user wants MiniMax image generation.

Official MiniMax image model options currently documented in the API reference:
  • `image-01` (recommended default)
    • Supports text-to-image and subject-reference image generation
    • Supports official `aspect_ratio` values: `1:1`, `16:9`, `4:3`, `3:2`, `2:3`, `3:4`, `9:16`, `21:9`
    • Supports documented custom `width` / `height` output sizes when using `--size <WxH>`
    • `width` and `height` must both be between `512` and `2048`, and both must be divisible by `8`
  • `image-01-live`
    • Lower-latency variant
    • Use `--ar` for sizing; MiniMax documents custom `width` / `height` as only effective for `image-01`

MiniMax subject reference notes:
  • `--ref` files are sent as MiniMax `subject_reference`
  • MiniMax docs currently describe `subject_reference[].type` as `character`
  • Official docs say `image_file` supports public URLs or Base64 Data URLs; `baoyu-imagine` sends local refs as Data URLs
  • Official docs recommend front-facing portrait references in JPG/JPEG/PNG under 10MB
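
The custom-size constraints for `image-01` are easy to check before sending a request. The sketch below uses hypothetical names (`validateMiniMaxSize`, `checkDimension`) and hypothetical error strings; it only encodes the documented limits listed above.

```typescript
// Returns an error message for one dimension, or null when it is valid.
function checkDimension(name: string, value: number): string | null {
  if (value < 512 || value > 2048) return `${name} must be between 512 and 2048`;
  if (value % 8 !== 0) return `${name} must be divisible by 8`;
  return null;
}

// Returns null when the size is acceptable, otherwise a reason string.
function validateMiniMaxSize(model: string, width: number, height: number): string | null {
  if (model !== "image-01") {
    return "custom width/height is only documented for image-01; use --ar instead";
  }
  return checkDimension("width", width) ?? checkDimension("height", height);
}
```

For example, `1536x1024` passes for `image-01`, while the same size with `image-01-live` returns the hint to use `--ar`.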
Official references:

OpenRouter Models

Use full OpenRouter model IDs, e.g.:
  • `google/gemini-3.1-flash-image-preview` (recommended, supports image output and reference-image workflows)
  • `google/gemini-2.5-flash-image-preview`
  • `black-forest-labs/flux.2-pro`
  • Other OpenRouter image-capable model IDs

Notes:
  • OpenRouter image generation uses `/chat/completions`, not the OpenAI `/images` endpoints
  • If `--ref` is used, choose a multimodal model that supports image input and image output
  • `--imageSize` maps to OpenRouter `imageGenerationOptions.size`; `--size <WxH>` is converted to the nearest OpenRouter size and inferred aspect ratio when possible

Replicate Models

Supported model formats:
  • `owner/name` (recommended for official models), e.g. `google/nano-banana-pro`
  • `owner/name:version` (community models by version), e.g. `stability-ai/sdxl:<version>`

Examples:

Use Replicate default model

${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

Override model explicitly

${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana

Provider Selection

  1. `--ref` provided + no `--provider` → auto-select Google first, then OpenAI, then Azure, then OpenRouter, then Replicate, then Seedream, then MiniMax (MiniMax subject reference is more specialized toward character/portrait consistency)
  2. `--provider` specified → use it (if `--ref`, must be `google`, `openai`, `azure`, `openrouter`, `replicate`, `seedream`, or `minimax`)
  3. Only one API key available → use that provider
  4. Multiple available → default to Google
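
The selection rules can be sketched as one function. This is a hedged illustration, not the script's implementation: `selectProvider` and its option names are hypothetical, `available` is assumed to list providers whose API keys are configured, and falling back to the first available provider when several keys exist but Google is not among them is an assumption the rules above do not cover.

```typescript
type Provider =
  | "google" | "openai" | "azure" | "openrouter" | "dashscope"
  | "minimax" | "jimeng" | "seedream" | "replicate";

// Auto-selection order when --ref is given without --provider.
const REF_PRIORITY: Provider[] = [
  "google", "openai", "azure", "openrouter", "replicate", "seedream", "minimax",
];

interface SelectOptions {
  provider?: Provider;   // --provider, if given
  hasRef: boolean;       // whether --ref was passed
  available: Provider[]; // providers with a configured API key (assumed input)
}

function selectProvider(opts: SelectOptions): Provider {
  if (opts.provider !== undefined) {
    // Rule 2: an explicit provider is used, but --ref limits the choice.
    if (opts.hasRef && REF_PRIORITY.indexOf(opts.provider) === -1) {
      throw new Error(`--ref is not supported by provider ${opts.provider}`);
    }
    return opts.provider;
  }
  if (opts.hasRef) {
    // Rule 1: walk the reference-capable providers in priority order.
    for (const candidate of REF_PRIORITY) {
      if (opts.available.indexOf(candidate) !== -1) return candidate;
    }
    throw new Error("no configured provider supports --ref");
  }
  const [first] = opts.available;
  if (first === undefined) throw new Error("no API key configured");
  if (opts.available.length === 1) return first; // Rule 3
  // Rule 4, plus the assumed fallback when Google has no key.
  return opts.available.indexOf("google") !== -1 ? "google" : first;
}
```

Checking the explicit provider first is equivalent to the numbered order, because rule 1 only applies when `--provider` is absent.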

Quality Presets

| Preset | Google imageSize | OpenAI Size | OpenRouter size | Replicate resolution | Use Case |
|--------|------------------|-------------|-----------------|----------------------|----------|
| `normal` | 1K | 1024px | 1K | 1K | Quick previews |
| `2k` (default) | 2K | 2048px | 2K | 2K | Covers, illustrations, infographics |

**Google/OpenRouter imageSize**: Can be overridden with `--imageSize 1K|2K|4K`

Aspect Ratios

Supported: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2.35:1`
  • Google multimodal: uses `imageConfig.aspectRatio`
  • OpenAI: maps to closest supported size
  • OpenRouter: sends `imageGenerationOptions.aspect_ratio`; if only `--size <WxH>` is given, aspect ratio is inferred automatically
  • Replicate: passes `aspect_ratio` to model; when `--ref` is provided without `--ar`, defaults to `match_input_image`
  • MiniMax: sends official `aspect_ratio` values directly; if `--size <WxH>` is given without `--ar`, `width` / `height` are sent for `image-01`
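
One plausible way to infer an aspect ratio from `--size <WxH>` is to reduce the two numbers by their greatest common divisor. The script's actual inference is not shown in this document, so `parseSize` and `inferAspectRatio` below are hypothetical helpers; a real implementation would probably also snap the result to the nearest supported ratio, which this sketch does not attempt.

```typescript
// Euclid's algorithm for positive integers.
function gcd(a: number, b: number): number {
  return b === 0 ? a : gcd(b, a % b);
}

// Parses "1536x1024" into numbers; returns null for other shapes.
function parseSize(size: string): { width: number; height: number } | null {
  const m = /^(\d+)x(\d+)$/.exec(size);
  return m === null ? null : { width: Number(m[1]), height: Number(m[2]) };
}

// Reduces width:height to lowest terms.
function inferAspectRatio(width: number, height: number): string {
  const d = gcd(width, height);
  return `${width / d}:${height / d}`;
}
```

Exact reduction does not always land on a named ratio: `2048x872` reduces to `256:109`, which is why a nearest-match step would be needed for sizes like the 21:9 banner example.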

Generation Mode

**Default**: Sequential generation.

**Batch Parallel Generation**: When `--batchfile` contains 2 or more pending tasks, the script automatically enables parallel generation.

| Mode | When to Use |
|------|-------------|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel batch | Batch mode with 2+ tasks |

Execution choice:

| Situation | Preferred approach | Why |
|-----------|--------------------|-----|
| One image, or 1-2 simple images | Sequential | Lower coordination overhead and easier debugging |
| Multiple images already have saved prompt files | Batch (`--batchfile`) | Reuses finalized prompts, applies shared throttling/retries, and gives predictable throughput |
| Each image still needs separate reasoning, prompt writing, or style exploration | Subagents | The work is still exploratory, so each image may need independent analysis before generation |
| Output comes from `baoyu-article-illustrator` with `outline.md` + `prompts/` | Batch (`build-batch.ts` -> `--batchfile`) | That workflow already produces prompt files, so direct batch execution is the intended path |

Rule of thumb:
  • Prefer batch over subagents once prompt files are already saved and the task is "generate all of these"
  • Use subagents only when generation is coupled with per-image thinking, rewriting, or divergent creative exploration

Parallel behavior:
  • Default worker count is automatic, capped by config, built-in default 10
  • Provider-specific throttling is applied only in batch mode, and the built-in defaults are tuned for faster throughput while still avoiding obvious RPM bursts
  • You can override worker count with `--jobs <count>`
  • Each image retries automatically up to 3 attempts
  • Final output includes success count, failure count, and per-image failure reasons
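
The retry-and-summarize behavior can be illustrated with a small helper. This is a simplified, synchronous sketch: real generation calls are asynchronous and throttled per provider, and `withRetries`, `summarize`, and the result shapes are hypothetical names rather than the script's API.

```typescript
interface AttemptResult<T> {
  ok: boolean;
  attempts: number;
  value?: T;
  error?: string; // reason from the last failed attempt
}

// Runs `task` until it succeeds or `maxAttempts` is reached.
function withRetries<T>(task: () => T, maxAttempts: number = 3): AttemptResult<T> {
  let lastError = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { ok: true, attempts: attempt, value: task() };
    } catch (e) {
      lastError = e instanceof Error ? e.message : String(e);
    }
  }
  return { ok: false, attempts: maxAttempts, error: lastError };
}

// Builds the final tally: success count, failure count, failure reasons.
function summarize(results: Array<AttemptResult<string>>): {
  succeeded: number;
  failed: number;
  reasons: string[];
} {
  const failures = results.filter((r) => !r.ok);
  return {
    succeeded: results.length - failures.length,
    failed: failures.length,
    reasons: failures.map((r) => r.error ?? "unknown error"),
  };
}
```

Keeping the last error message per image is what makes the per-image failure reasons available in the final summary.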

Error Handling

  • Missing API key → error with setup instructions
  • Generation failure → auto-retry up to 3 attempts per image
  • Invalid aspect ratio → warning, proceed with default
  • Reference images with unsupported provider/model → error with fix hint

Extension Support

Custom configurations via EXTEND.md. See Preferences section for paths and supported options.