videoagent-image-studio

🎨 VideoAgent Image Studio

Use when: User asks to generate, draw, create, or make any kind of image, photo, illustration, icon, logo, or artwork.
Generate images with 8 state-of-the-art AI models. This skill automatically picks the best model for the job and handles all the complexity — including Midjourney's async polling — so you can focus on the conversation.

Quick Reference

| User Intent | Model | Speed |
|---|---|---|
| Artistic, cinematic, painterly | `midjourney` | ~15s |
| Photorealistic, portrait, product | `flux-pro` | ~8s |
| General purpose, balanced | `flux-dev` | ~10s |
| Quick draft, fast iteration | `flux-schnell` | ~2s |
| Image with text, logo, poster | `ideogram` | ~10s |
| Vector art, icon, flat design | `recraft` | ~8s |
| Anime, stylized illustration | `sdxl` | ~5s |
| Gemini-powered, consistent style | `nano-banana` | ~12s |
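
The table above can be read as a simple lookup from intent keywords to a model ID. The sketch below is one possible way to encode it; the `pickModel` helper, the rule order, and the keyword lists are illustrative assumptions and not part of the skill itself.

```javascript
// Hypothetical helper: maps keywords in a user request to a model ID from the
// Quick Reference table. Keyword lists and rule order are assumptions.
const MODEL_RULES = [
  { model: "flux-schnell", keywords: ["quick", "draft", "fast"] },
  { model: "ideogram", keywords: ["text", "logo", "poster"] },
  { model: "recraft", keywords: ["vector", "icon", "flat"] },
  { model: "sdxl", keywords: ["anime", "stylized"] },
  { model: "midjourney", keywords: ["artistic", "cinematic", "painterly"] },
  { model: "flux-pro", keywords: ["photorealistic", "portrait", "product"] },
  { model: "nano-banana", keywords: ["gemini", "consistent"] },
];

function pickModel(request) {
  // Match whole words only, so "fast" does not match "breakfast".
  const words = new Set(request.toLowerCase().match(/[a-z]+/g) || []);
  for (const rule of MODEL_RULES) {
    if (rule.keywords.some((kw) => words.has(kw))) return rule.model;
  }
  return "flux-dev"; // general-purpose fallback, same as the --model default
}

console.log(pickModel("Draw a snow leopard with cinematic lighting")); // midjourney
console.log(pickModel("Show me a quick draft")); // flux-schnell
```

The first matching rule wins, so a request that hits several rows resolves to whichever rule is listed first; an explicit model name from the user should still take priority over any such heuristic.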

How to Generate an Image

Step 1 — Enhance the prompt

Before calling the script, expand the user's prompt with style, lighting, and quality descriptors appropriate for the chosen model.
  • Midjourney: Add `cinematic lighting`, `ultra detailed`, `--v 7`, `--style raw`
  • Flux: Add `masterpiece`, `highly detailed`, `sharp focus`, `professional photography`
  • Ideogram: Be explicit about text content, font style, and layout
  • Recraft: Specify `vector illustration`, `flat design`, `icon style`
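
The per-model descriptors above could be applied mechanically. The following is a minimal sketch, assuming descriptors are appended as comma-separated phrases and Midjourney flags go at the end of the prompt; `enhancePrompt` and `DESCRIPTORS` are hypothetical names, not functions provided by the skill.

```javascript
// Descriptors come from the Step 1 list; the helper itself is hypothetical.
const DESCRIPTORS = new Map([
  ["midjourney", { words: ["cinematic lighting", "ultra detailed"], flags: ["--v 7", "--style raw"] }],
  ["flux", { words: ["masterpiece", "highly detailed", "sharp focus", "professional photography"], flags: [] }],
  ["recraft", { words: ["vector illustration", "flat design", "icon style"], flags: [] }],
]);

function enhancePrompt(prompt, model) {
  // flux-pro, flux-dev and flux-schnell share the Flux descriptors.
  const family = model.startsWith("flux") ? "flux" : model;
  const extra = DESCRIPTORS.get(family);
  // Models without fixed descriptors (for example ideogram) are left as-is;
  // their text content and layout have to be described by hand.
  if (!extra) return prompt;
  // Skip descriptors the user already wrote.
  const lower = prompt.toLowerCase();
  const missing = extra.words.filter((w) => !lower.includes(w));
  const text = [prompt, ...missing].join(", ");
  // Flags follow the descriptive text, separated by spaces.
  return [text, ...extra.flags].join(" ");
}

console.log(enhancePrompt("a snow leopard", "midjourney"));
// a snow leopard, cinematic lighting, ultra detailed --v 7 --style raw
```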

Step 2 — Run the script

```bash
node {baseDir}/tools/generate.js \
  --model <model_id> \
  --prompt "<enhanced prompt>" \
  --aspect-ratio <ratio>
```

All parameters:

| Parameter | Default | Description |
|---|---|---|
| `--model` | `flux-dev` | Model ID from the table above |
| `--prompt` | (required) | The image generation prompt |
| `--aspect-ratio` | `1:1` | `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, `21:9` |
| `--num-images` | `1` | Number of images (1–4; Midjourney always returns 4) |
| `--negative-prompt` | (none) | Things to avoid (not supported by Midjourney) |
| `--seed` | (none) | Seed for reproducibility |
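
When the command is assembled programmatically, the constraints in the parameter table can be checked before the script is launched. This is a sketch under assumptions: `buildArgs` is a hypothetical helper, and whether `generate.js` performs the same validation itself is not stated in this document.

```javascript
// Allowed values copied from the parameter table above.
const ASPECT_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "21:9"];

// Hypothetical helper: builds the argument list for tools/generate.js.
function buildArgs({ model = "flux-dev", prompt, aspectRatio = "1:1", numImages = 1, negativePrompt, seed }) {
  if (!prompt) throw new Error("--prompt is required");
  if (!ASPECT_RATIOS.includes(aspectRatio)) {
    throw new Error(`unsupported aspect ratio: ${aspectRatio}`);
  }
  if (!Number.isInteger(numImages) || numImages < 1 || numImages > 4) {
    throw new Error("--num-images must be an integer from 1 to 4");
  }
  const args = [
    "--model", model,
    "--prompt", prompt,
    "--aspect-ratio", aspectRatio,
    "--num-images", String(numImages),
  ];
  // Midjourney does not support negative prompts, so the flag is dropped for it.
  if (negativePrompt && model !== "midjourney") {
    args.push("--negative-prompt", negativePrompt);
  }
  if (seed !== undefined) args.push("--seed", String(seed));
  return args;
}

console.log(buildArgs({ model: "flux-pro", prompt: "a perfume bottle", aspectRatio: "3:4" }));
```

Returning an array rather than a single command string means the result can be handed to Node's `child_process.execFile("node", [scriptPath, ...args])` without shell quoting of the prompt.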

Step 3 — Return the result

The script always waits and returns the final image URL(s). No polling required.

```json
{
  "success": true,
  "model": "flux-pro",
  "imageUrl": "https://...",
  "images": ["https://..."]
}
```

Send the `imageUrl` to the user.
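
A caller that captures the script's output could read the result as sketched below. The field names come from the JSON above; the `parseResult` helper is hypothetical, and the shape of a failed response is not documented here, so the sketch only assumes that `success` is something other than `true` on failure.

```javascript
// Hypothetical helper: extracts image URLs from the script's JSON output.
function parseResult(stdout) {
  const result = JSON.parse(stdout);
  // Assumption: any value other than `true` means the run failed.
  if (result.success !== true) {
    throw new Error("image generation did not succeed");
  }
  // Fall back to the single imageUrl if the images array is missing or empty.
  const images =
    Array.isArray(result.images) && result.images.length > 0
      ? result.images
      : [result.imageUrl];
  return { model: result.model, primary: result.imageUrl, images };
}

const sample = '{"success": true, "model": "flux-pro", "imageUrl": "https://...", "images": ["https://..."]}';
console.log(parseResult(sample).primary);
```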

Midjourney Actions

After generating a 4-image grid with Midjourney, offer the user these options:

```bash
# Upscale image #2 (subtle, preserves details)
node {baseDir}/tools/generate.js \
  --model midjourney \
  --action upscale \
  --index 2 \
  --job-id <job_id>

# Create a strong variation of image #3
node {baseDir}/tools/generate.js \
  --model midjourney \
  --action variation \
  --index 3 \
  --job-id <job_id> \
  --variation-type 1

# Regenerate with same prompt
node {baseDir}/tools/generate.js \
  --model midjourney \
  --action reroll \
  --job-id <job_id>
```

**Upscale types:** `0` = Subtle (default, best for photos), `1` = Creative (best for illustrations)

**Variation types:** `0` = Subtle (default), `1` = Strong (dramatic changes)
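
A user's follow-up choice could be translated into one of these commands roughly as follows. This is a sketch only: `parseChoice` and `buildActionArgs` are hypothetical helpers, the `U1`-`U4` / `V1`-`V4` reply format and the 1-4 index range are assumptions based on the 4-image grid, and only flags that appear in the commands above are used.

```javascript
// Hypothetical helper: turns a reply such as "U2" or "V3" into an action request.
function parseChoice(reply) {
  const match = /^([UV])([1-4])$/i.exec(reply.trim());
  if (!match) return null; // not a recognised upscale/variation choice
  const action = match[1].toUpperCase() === "U" ? "upscale" : "variation";
  return { action, index: Number(match[2]) };
}

// Hypothetical helper: builds the argument list for a Midjourney follow-up action.
function buildActionArgs({ action, jobId, index, variationType }) {
  if (!jobId) throw new Error("--job-id is required");
  const args = ["--model", "midjourney", "--action", action];
  if (action === "upscale" || action === "variation") {
    // Assumption: the grid positions are numbered 1 to 4.
    if (!Number.isInteger(index) || index < 1 || index > 4) {
      throw new Error("--index must be an integer from 1 to 4");
    }
    args.push("--index", String(index));
  } else if (action !== "reroll") {
    throw new Error(`unknown action: ${action}`);
  }
  args.push("--job-id", jobId);
  // 0 = Subtle (default), 1 = Strong; only sent for variations.
  if (action === "variation" && variationType !== undefined) {
    args.push("--variation-type", String(variationType));
  }
  return args;
}

const choice = parseChoice("V3");
console.log(buildActionArgs({ ...choice, jobId: "<job_id>", variationType: 1 }));
```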

---

Example Conversations

**User:** "Draw a snow leopard on a snowy mountain with cinematic lighting"

```bash
# Choose midjourney for artistic quality
node {baseDir}/tools/generate.js \
  --model midjourney \
  --prompt "a majestic snow leopard on a snowy mountain peak, cinematic lighting, dramatic atmosphere, ultra detailed --ar 16:9 --v 7" \
  --aspect-ratio 16:9
```

> 🎨 Done! Which one to upscale? (U1-U4) Or create a variant? (V1-V4)

---

**User:** "Use Flux to generate a perfume product poster, white background"

```bash
# Choose flux-pro for photorealistic product shots
node {baseDir}/tools/generate.js \
  --model flux-pro \
  --prompt "a luxury perfume bottle on a clean white background, professional product photography, soft shadows, 8k, highly detailed" \
  --aspect-ratio 3:4
```

---

**User:** "Show me a quick draft"

```bash
# flux-schnell for instant previews
node {baseDir}/tools/generate.js \
  --model flux-schnell \
  --prompt "..." \
  --aspect-ratio 1:1
```

---

**User:** "Make me an App icon, flat style, blue theme"

```bash
# recraft for vector/icon style
node {baseDir}/tools/generate.js \
  --model recraft \
  --prompt "a minimal flat design app icon, blue color scheme, simple geometric shapes, vector style, white background"
```

---

Setup

Zero API keys needed! All requests go through a hosted proxy that handles authentication server-side.
The skill works out of the box — just install and use.

Advanced: Custom proxy or token

If you want to use your own proxy or a persistent token, set these environment variables:

```json
{
  "skills": {
    "entries": {
      "videoagent-image-studio": {
        "enabled": true,
        "env": {
          "IMAGE_STUDIO_PROXY_URL": "https://your-proxy.vercel.app",
          "IMAGE_STUDIO_TOKEN": "your_token_here"
        }
      }
    }
  }
}
```

| Variable | Required | Description |
|---|---|---|
| `IMAGE_STUDIO_PROXY_URL` | No | Custom proxy base URL (default: `https://image-gen-proxy.vercel.app`) |
| `IMAGE_STUDIO_TOKEN` | No | Persistent token (auto-obtained if not set, 100 free uses per token) |

To deploy your own proxy, see the videoagent-audio-studio proxy as a reference implementation. You'll need `FAL_KEY` and `LEGNEXT_KEY` as Vercel environment variables.
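
The fallback behaviour described in the variable table can be pictured as below. This is an illustrative sketch, not the script's actual code: `resolveConfig` is a hypothetical name, and how `generate.js` reads these variables internally or obtains a token automatically is not shown in this document.

```javascript
// Default taken from the variable table above.
const DEFAULT_PROXY_URL = "https://image-gen-proxy.vercel.app";

// Hypothetical helper: resolves the proxy URL and token from an env object.
function resolveConfig(env) {
  // Use the custom proxy if set, and strip trailing slashes so paths join cleanly.
  const proxyUrl = (env.IMAGE_STUDIO_PROXY_URL || DEFAULT_PROXY_URL).replace(/\/+$/, "");
  // null signals that no persistent token was supplied and one must be auto-obtained.
  const token = env.IMAGE_STUDIO_TOKEN || null;
  return { proxyUrl, token };
}

console.log(resolveConfig({}).proxyUrl); // https://image-gen-proxy.vercel.app
```

In a real script the argument would typically be `process.env`; passing the environment in explicitly keeps the helper easy to test.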

Changelog

v2.0.0

  • Simplified async: The script now blocks until Midjourney completes. No more `--async` / `--poll` flags needed in SKILL.md instructions.
  • Unified output format: All models return the same `{ success, imageUrl, images }` shape.
  • Reference images for Nano Banana: Pass `--reference-images "url1,url2"` for character/style consistency across generations.

v1.3.0

  • Added non-blocking async mode for Midjourney (`--async` + `--poll`).

v1.2.0

  • Midjourney turbo mode enabled by default (~10-20s).

v1.1.0

  • Switched Midjourney provider from TTAPI to Legnext.ai for better stability.

v1.0.0

  • Initial release with Midjourney, Flux, SDXL, Nano Banana, Ideogram, Recraft.