videoagent-image-studio
🎨 VideoAgent Image Studio
Use when: User asks to generate, draw, create, or make any kind of image, photo, illustration, icon, logo, or artwork.
Generate images with 8 state-of-the-art AI models. This skill automatically picks the best model for the job and handles all the complexity — including Midjourney's async polling — so you can focus on the conversation.
Quick Reference
| User Intent | Model | Speed |
|---|---|---|
| Artistic, cinematic, painterly | `midjourney` | ~15s |
| Photorealistic, portrait, product | `flux-pro` | ~8s |
| General purpose, balanced | | ~10s |
| Quick draft, fast iteration | `flux-schnell` | ~2s |
| Image with text, logo, poster | | ~10s |
| Vector art, icon, flat design | `recraft` | ~8s |
| Anime, stylized illustration | | ~5s |
| Gemini-powered, consistent style | | ~12s |
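The routing in the table can be sketched as a small keyword matcher. This is only an illustration, not the skill's actual selection logic: `pickModel`, the keyword lists, and the `flux-pro` fallback are assumptions, and it uses only the four model IDs that appear in this document's example commands.

```javascript
// Hypothetical keyword router. The keyword lists are illustrative guesses;
// the model IDs are the ones used in this document's example commands.
const ROUTES = [
  { model: "flux-schnell", keywords: ["draft", "quick", "fast"] },
  { model: "recraft", keywords: ["icon", "vector", "flat"] },
  { model: "flux-pro", keywords: ["photo", "portrait", "product"] },
  { model: "midjourney", keywords: ["cinematic", "artistic", "painterly"] },
];

// Returns the first model whose keywords appear in the request text,
// or the fallback (an assumed default) when nothing matches.
function pickModel(request, fallback = "flux-pro") {
  const text = request.toLowerCase();
  for (const route of ROUTES) {
    if (route.keywords.some((kw) => text.includes(kw))) {
      return route.model;
    }
  }
  return fallback;
}
```

Routes are checked in order, so a request that mentions both "quick" and "product" resolves to the faster draft model first. Plain substring matching is crude and a real router would likely need something more careful.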
How to Generate an Image
Step 1 — Enhance the prompt
Before calling the script, expand the user's prompt with style, lighting, and quality descriptors appropriate for the chosen model.
- Midjourney: Add `cinematic lighting`, `ultra detailed`, `--v 7`, `--style raw`
- Flux: Add `masterpiece`, `highly detailed`, `sharp focus`, `professional photography`
- Ideogram: Be explicit about text content, font style, and layout
- Recraft: Specify `vector illustration`, `flat design`, `icon style`
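The per-model additions above could be applied mechanically with a helper like the one below. It is a sketch under stated assumptions: `enhancePrompt` and the `DESCRIPTORS` map are hypothetical names, and applying the Flux descriptors to the `flux-pro` model ID is a guess. Ideogram is left out because its guidance concerns what the prompt says, not fixed descriptors.

```javascript
// Hypothetical descriptor table built from the list above.
// "tags" are joined with commas; "flags" are Midjourney-style parameters
// appended after the prompt text.
const DESCRIPTORS = {
  midjourney: {
    tags: ["cinematic lighting", "ultra detailed"],
    flags: ["--v 7", "--style raw"],
  },
  "flux-pro": {
    tags: ["masterpiece", "highly detailed", "sharp focus", "professional photography"],
    flags: [],
  },
  recraft: {
    tags: ["vector illustration", "flat design", "icon style"],
    flags: [],
  },
};

// Appends only the descriptors and flags the prompt does not already contain.
// Unknown models get the trimmed prompt back unchanged.
function enhancePrompt(model, prompt) {
  const entry = DESCRIPTORS[model];
  const base = prompt.trim();
  if (!entry) return base;
  const lower = base.toLowerCase();
  const missingTags = entry.tags.filter((tag) => !lower.includes(tag));
  const missingFlags = entry.flags.filter((flag) => !lower.includes(flag));
  const text = [base, ...missingTags].join(", ");
  return [text, ...missingFlags].join(" ");
}
```

Skipping descriptors that are already present keeps a user prompt such as "snow leopard, cinematic lighting" from ending up with the phrase twice.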
Step 2 — Run the script
```bash
node {baseDir}/tools/generate.js \
  --model <model_id> \
  --prompt "<enhanced prompt>" \
  --aspect-ratio <ratio>
```

All parameters:
| Parameter | Default | Description |
|---|---|---|
| `--model` | | Model ID from the table above |
| `--prompt` | (required) | The image generation prompt |
| `--aspect-ratio` | | |
| | | Number of images (1–4; Midjourney always returns 4) |
| | — | Things to avoid (not supported by Midjourney) |
| | — | Seed for reproducibility |
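When the script is launched from Node instead of a shell, the command can be assembled as an argument array. The sketch below uses only the three flags shown in the command above. `buildGenerateArgs` is a hypothetical helper, and `baseDir` stands in for the `{baseDir}` placeholder.

```javascript
// Hypothetical helper: builds the argument list for `node generate.js`.
// Only --model, --prompt, and --aspect-ratio are used, since those are the
// flags that appear in the command above.
function buildGenerateArgs(baseDir, { model, prompt, aspectRatio } = {}) {
  if (!prompt) {
    throw new Error("prompt is required");
  }
  const args = [`${baseDir}/tools/generate.js`];
  if (model) args.push("--model", model);
  args.push("--prompt", prompt);
  if (aspectRatio) args.push("--aspect-ratio", aspectRatio);
  return args;
}
```

The array can be passed to `execFile("node", args, callback)` from `node:child_process`. Because `execFile` does not go through a shell, quotes inside the prompt need no escaping.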
Step 3 — Return the result
The script always waits and returns the final image URL(s). No polling required.
```json
{
  "success": true,
  "model": "flux-pro",
  "imageUrl": "https://...",
  "images": ["https://..."]
}
```

Send the `imageUrl` to the user.
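A caller might read the script's output along these lines. This is a sketch, assuming the JSON arrives on stdout and that a failed run reports `success: false`. Neither is stated in this document, and `extractImageUrls` is a hypothetical name.

```javascript
// Hypothetical parser for the result shape shown above.
// Assumption: a failed run sets "success" to false; the real error shape
// is not documented here.
function extractImageUrls(stdout) {
  const result = JSON.parse(stdout);
  if (!result.success) {
    throw new Error("image generation did not succeed");
  }
  // Prefer the full list; fall back to the single URL if the list is absent.
  const images =
    Array.isArray(result.images) && result.images.length > 0
      ? result.images
      : [result.imageUrl].filter(Boolean);
  const primary = result.imageUrl || images[0] || null;
  return { primary, images };
}
```

For Midjourney, which returns a 4-image grid, `images` would be the list to present when offering upscale or variation choices.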
Midjourney Actions
After generating a 4-image grid with Midjourney, offer the user these options:
```bash
# Upscale image #2 (subtle, preserves details)
node {baseDir}/tools/generate.js \
  --model midjourney \
  --action upscale \
  --index 2 \
  --job-id <job_id>

# Create a strong variation of image #3
node {baseDir}/tools/generate.js \
  --model midjourney \
  --action variation \
  --index 3 \
  --job-id <job_id> \
  --variation-type 1

# Regenerate with same prompt
node {baseDir}/tools/generate.js \
  --model midjourney \
  --action reroll \
  --job-id <job_id>
```

**Upscale types:** `0` = Subtle (default, best for photos), `1` = Creative (best for illustrations)
**Variation types:** `0` = Subtle (default), `1` = Strong (dramatic changes)
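The three follow-up actions share most of their arguments, so they can be built by one helper. The sketch below uses only flags shown in the commands above. `buildMidjourneyActionArgs` is a hypothetical name, and the 1 to 4 index range is an assumption based on the 4-image grid.

```javascript
// Hypothetical helper for the upscale / variation / reroll commands above.
// Assumption: --index refers to a position in the 4-image grid, so 1..4.
function buildMidjourneyActionArgs(baseDir, { action, jobId, index, variationType } = {}) {
  const actions = ["upscale", "variation", "reroll"];
  if (!actions.includes(action)) {
    throw new Error(`unknown action: ${action}`);
  }
  if (!jobId) {
    throw new Error("jobId is required");
  }
  const args = [`${baseDir}/tools/generate.js`, "--model", "midjourney", "--action", action];
  // Reroll regenerates the whole grid, so it takes no image index.
  if (action !== "reroll") {
    if (!Number.isInteger(index) || index < 1 || index > 4) {
      throw new Error("index must be an integer from 1 to 4");
    }
    args.push("--index", String(index));
  }
  args.push("--job-id", jobId);
  if (action === "variation" && variationType !== undefined) {
    if (variationType !== 0 && variationType !== 1) {
      throw new Error("variationType must be 0 (subtle) or 1 (strong)");
    }
    args.push("--variation-type", String(variationType));
  }
  return args;
}
```

Validating the index and variation type before spawning the script avoids spending a generation on a request that would be rejected anyway.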
---

Example Conversations
**User:** "Draw a snow leopard on a snowy mountain with cinematic lighting"

```bash
# Choose midjourney for artistic quality
node {baseDir}/tools/generate.js \
  --model midjourney \
  --prompt "a majestic snow leopard on a snowy mountain peak, cinematic lighting, dramatic atmosphere, ultra detailed --ar 16:9 --v 7" \
  --aspect-ratio 16:9
```

> 🎨 Done! Which one to upscale? (U1-U4) Or create a variant? (V1-V4)

---
**User:** "Use Flux to generate a perfume product poster, white background"

```bash
# Choose flux-pro for photorealistic product shots
node {baseDir}/tools/generate.js \
  --model flux-pro \
  --prompt "a luxury perfume bottle on a clean white background, professional product photography, soft shadows, 8k, highly detailed" \
  --aspect-ratio 3:4
```

---
**User:** "Show me a quick draft"

```bash
# flux-schnell for instant previews
node {baseDir}/tools/generate.js \
  --model flux-schnell \
  --prompt "..." \
  --aspect-ratio 1:1
```

---
**User:** "Make me an App icon, flat style, blue theme"

```bash
# recraft for vector/icon style
node {baseDir}/tools/generate.js \
  --model recraft \
  --prompt "a minimal flat design app icon, blue color scheme, simple geometric shapes, vector style, white background"
```

---

Setup
Zero API keys needed! All requests go through a hosted proxy that handles authentication server-side.
The skill works out of the box — just install and use.
Advanced: Custom proxy or token
If you want to use your own proxy or a persistent token, set these environment variables:
```json
{
  "skills": {
    "entries": {
      "videoagent-image-studio": {
        "enabled": true,
        "env": {
          "IMAGE_STUDIO_PROXY_URL": "https://your-proxy.vercel.app",
          "IMAGE_STUDIO_TOKEN": "your_token_here"
        }
      }
    }
  }
}
```

| Variable | Required | Description |
|---|---|---|
| `IMAGE_STUDIO_PROXY_URL` | No | Custom proxy base URL (default: the hosted proxy) |
| `IMAGE_STUDIO_TOKEN` | No | Persistent token (auto-obtained if not set, 100 free uses per token) |

To deploy your own proxy, see the videoagent-audio-studio proxy as a reference implementation. You'll need `FAL_KEY` and `LEGNEXT_KEY` as Vercel environment variables.
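How the script reads these variables is not shown in this document. A reader of the two settings might look like the sketch below, where `resolveProxyConfig` is a hypothetical name and `null` stands for "fall back to the hosted proxy" or "obtain a token automatically".

```javascript
// Hypothetical reader for the two optional environment variables above.
// null means "not set": the hosted proxy and an auto-obtained token apply.
function resolveProxyConfig(env = process.env) {
  const rawUrl = env.IMAGE_STUDIO_PROXY_URL || "";
  // Drop trailing slashes so paths can be appended consistently.
  const proxyUrl = rawUrl.replace(/\/+$/, "") || null;
  const token = env.IMAGE_STUDIO_TOKEN || null;
  return { proxyUrl, token };
}
```

Taking `env` as a parameter keeps the function easy to exercise without touching the real process environment.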
Changelog
v2.0.0
- Simplified async: The script now blocks until Midjourney completes. No more `--async` / `--poll` flags needed in SKILL.md instructions.
- Unified output format: All models return the same `{ success, imageUrl, images }` shape.
- Reference images for Nano Banana: Pass `--reference-images "url1,url2"` for character/style consistency across generations.
v1.3.0
- Added non-blocking async mode for Midjourney (`--async` + `--poll`).
v1.2.0
- Midjourney turbo mode enabled by default (~10-20s).
v1.1.0
- Switched Midjourney provider from TTAPI to Legnext.ai for better stability.
v1.0.0
- Initial release with Midjourney, Flux, SDXL, Nano Banana, Ideogram, Recraft.