image-generation
Image Generation
Generate images using Google's Gemini model (default) or fal.ai FLUX.2 klein 4B (cheap mode).
Quick Start
MANDATORY: Always use the `-d` (destination) flag to specify the output path. This avoids file management issues with `poster_0.jpg` in the scripts folder.
```bash
cd ~/.claude/skills/image-generation/scripts

# ALWAYS specify destination with -d flag
npx tsx generate_poster.ts -d /tmp/my-image.jpg "A futuristic city at sunset"

# Cheap mode: fal.ai FLUX.2 klein 4B (fast, lower cost)
npx tsx generate_poster.ts --cheap -d /tmp/city.jpg "A futuristic city at sunset"

# With aspect ratio
npx tsx generate_poster.ts -d /tmp/landscape.jpg --aspect 3:2 "A wide landscape poster"
npx tsx generate_poster.ts --cheap -d /tmp/story.jpg -a 9:16 "A vertical story format"

# With reference assets (image editing)
npx tsx generate_poster.ts -d /tmp/banner.jpg --assets "/path/to/avatar.jpg" "Create banner with character"
npx tsx generate_poster.ts --cheap -d /tmp/photo.jpg --assets "/path/to/image.jpg" "Turn this into a realistic photo"
```
**Why `-d` is mandatory:**
- Output goes to a predictable location (e.g., `/tmp/` for temp images)
- Avoids hunting for `poster_0.jpg` in the scripts folder
- Enables immediate use without file renaming
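When the script is called from other tooling, the flags shown above can be assembled programmatically. The following is a minimal sketch, not part of the skill itself: `PosterOptions` and `buildPosterArgs` are hypothetical names, and only flags that appear in this document (`-d`, `--cheap`, `--aspect`, `--assets`) are used.

```typescript
// Hypothetical helper: builds the argument list for generate_poster.ts.
// The option names are illustrative; only documented flags are emitted.
interface PosterOptions {
  destination: string; // required, maps to -d
  prompt: string;      // positional prompt, always last
  cheap?: boolean;     // maps to --cheap
  aspect?: string;     // maps to --aspect, e.g. "9:16"
  assets?: string[];   // maps to --assets (comma-separated)
}

function buildPosterArgs(opts: PosterOptions): string[] {
  if (!opts.destination) {
    throw new Error("destination is required (-d)");
  }
  const result: string[] = ["tsx", "generate_poster.ts"];
  if (opts.cheap) result.push("--cheap");
  result.push("-d", opts.destination);
  if (opts.aspect) result.push("--aspect", opts.aspect);
  if (opts.assets && opts.assets.length > 0) {
    result.push("--assets", opts.assets.join(","));
  }
  result.push(opts.prompt);
  return result;
}

const posterArgs = buildPosterArgs({
  destination: "/tmp/story.jpg",
  prompt: "A vertical story format",
  cheap: true,
  aspect: "9:16",
});
console.log(posterArgs);
```

Returning an array rather than a single string means the prompt can be handed to a process spawner as one argument, so no shell quoting is needed.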
Providers
| Provider | Flag | Use Case | Cost |
|---|---|---|---|
| Gemini | (default) | High quality, best results | Higher |
| fal.ai klein 4B | `--cheap` | Fast, budget-friendly | ~$0.003/image |
Cheap Mode (`--cheap`)
Uses fal.ai FLUX.2 klein 4B, a distilled FLUX model optimized for speed and cost.
Why use cheap mode?
- Cost: ~$0.003 per image (vs Gemini's higher cost)
- Speed: Fast 4-step inference (~2-4 seconds)
- Quality: Good enough for drafts, social media, iterations
- Batch friendly: Generate many variations quickly
Endpoints:
- Text-to-image: `fal-ai/flux-2/klein/4b`
- Image editing (with `--assets`): `fal-ai/flux-2/klein/4b/edit`
Best for:
- Quick iterations and previews
- Social media content
- Concept exploration
- Batch generation (10+ images)
Use Gemini instead for:
- Final production assets
- Complex compositions
- Text-heavy images (especially Hebrew)
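The provider guidance above can be expressed as a small decision rule plus a cost estimate. This is a sketch under stated assumptions: the `Job` shape and both function names are hypothetical, and the ~$0.003 figure is the approximate per-image cost quoted in this document, not a guaranteed price.

```typescript
// Hypothetical helpers; the names and the Job shape are not part of generate_poster.ts.
const CHEAP_COST_PER_IMAGE_USD = 0.003; // approximate figure quoted in this document

interface Job {
  imageCount: number;
  finalAsset: boolean;         // final production asset?
  complexComposition: boolean; // complex composition?
  textHeavy: boolean;          // lots of rendered text (especially Hebrew)?
}

// Any "use Gemini instead" condition rules out cheap mode.
function shouldUseCheapMode(job: Job): boolean {
  return !(job.finalAsset || job.complexComposition || job.textHeavy);
}

// Rough batch cost in USD for cheap mode, formatted to three decimals.
function estimateCheapCostUsd(imageCount: number): string {
  return (imageCount * CHEAP_COST_PER_IMAGE_USD).toFixed(3);
}

const draftBatch: Job = {
  imageCount: 20,
  finalAsset: false,
  complexComposition: false,
  textHeavy: false,
};
console.log(shouldUseCheapMode(draftBatch));            // true
console.log(estimateCheapCostUsd(draftBatch.imageCount)); // "0.060"
```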
Quality
Control image resolution with `--quality` or `-q`:
| Quality | Resolution | Use Case |
|---|---|---|
| `1K` | 1024px | Default - fast, good for web |
| `2K` | 2048px | High quality - print, detailed posters |
```bash
npx tsx generate_poster.ts -d /tmp/poster.jpg -q 2K "High quality poster"
```
Aspect Ratio
IMPORTANT: Always use the default 3:2 aspect ratio unless the user explicitly requests a different format (like "vertical", "story", "square", etc.). Do NOT change the aspect ratio on your own.
Control image dimensions with `--aspect` or `-a`:
| Ratio | Use Case |
|---|---|
| `3:2` | Horizontal (DEFAULT - use this unless user specifies otherwise) |
| `1:1` | Square - Instagram, profile pics |
| `2:3` | Vertical - Pinterest, posters |
| `16:9` | Wide - YouTube thumbnails, headers |
| `9:16` | Tall - Stories, reels, TikTok |
```bash
npx tsx generate_poster.ts -d /tmp/poster.jpg --aspect 3:2 "Your prompt"
npx tsx generate_poster.ts -d /tmp/poster.jpg -a 16:9 "Your prompt"
```
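To get a feel for what a given quality and aspect ratio combination means in pixels, the sketch below pins the longest edge to the quality value (1024px or 2048px, as stated in this document) and scales the other edge by the ratio. `approximateSize` is a hypothetical helper, and the rounding is an assumption: the providers may snap to their own supported sizes, so treat the numbers as estimates.

```typescript
// Hypothetical helper: estimates pixel dimensions from quality and aspect ratio.
// Assumption: the quality value is the longest edge; the other edge is rounded.
const LONGEST_EDGE_PX: Record<string, number> = { "1K": 1024, "2K": 2048 };

function approximateSize(
  aspect: string,
  quality: string = "1K",
): { width: number; height: number } {
  const edge = LONGEST_EDGE_PX[quality];
  if (edge === undefined) {
    throw new Error(`unknown quality: ${quality}`);
  }
  const match = /^(\d+):(\d+)$/.exec(aspect);
  if (!match) {
    throw new Error(`invalid aspect ratio: ${aspect}`);
  }
  const w = Number(match[1]);
  const h = Number(match[2]);
  if (w === 0 || h === 0) {
    throw new Error(`invalid aspect ratio: ${aspect}`);
  }
  // Landscape or square: width is the longest edge. Portrait: height is.
  return w >= h
    ? { width: edge, height: Math.round((edge * h) / w) }
    : { width: Math.round((edge * w) / h), height: edge };
}

console.log(approximateSize("3:2"));        // { width: 1024, height: 683 }
console.log(approximateSize("9:16", "2K")); // { width: 1152, height: 2048 }
```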
Adding Assets (Reference Images)
Use `--assets` with full paths to include reference images:
```bash
# Single asset
npx tsx generate_poster.ts -d /tmp/poster.jpg --assets "/full/path/to/image.jpg" "Your prompt"

# Multiple assets (comma-separated)
npx tsx generate_poster.ts -d /tmp/poster.jpg --assets "/path/a.jpg,/path/b.png" "Use both images"

# With cheap mode - uses fal.ai edit endpoint
npx tsx generate_poster.ts --cheap -d /tmp/painting.jpg --assets "/path/to/image.jpg" "Turn this into a painting"
```
**Supported formats:** `.jpg`, `.jpeg`, `.png`, `.webp`, `.gif`
**IMPORTANT:** Assets are NOT automatically included. You must explicitly pass them via `--assets`.
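Before invoking the script, the `--assets` value can be checked against the rules above (comma-separated, full paths, supported formats). The following is a sketch of such a pre-check, not the script's own parsing: `parseAssets` is a hypothetical function, and treating "full path" as "starts with `/`" assumes a POSIX-style filesystem.

```typescript
// Hypothetical pre-check for an --assets value; not the script's actual parser.
const SUPPORTED_EXTENSIONS = [".jpg", ".jpeg", ".png", ".webp", ".gif"];

function parseAssets(value: string): string[] {
  const paths = value
    .split(",")
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
  for (const p of paths) {
    // Assumption: a "full path" is an absolute POSIX path.
    if (!p.startsWith("/")) {
      throw new Error(`asset path must be a full path: ${p}`);
    }
    const dot = p.lastIndexOf(".");
    const ext = dot === -1 ? "" : p.slice(dot).toLowerCase();
    if (!SUPPORTED_EXTENSIONS.includes(ext)) {
      throw new Error(`unsupported asset format: ${p}`);
    }
  }
  return paths;
}

console.log(parseAssets("/path/a.jpg,/path/b.png")); // [ '/path/a.jpg', '/path/b.png' ]
```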
Destination (MANDATORY)
ALWAYS use `--destination` or `-d` to specify the output path:
```bash
# REQUIRED: Always specify destination
npx tsx generate_poster.ts -d /tmp/poster.jpg "My poster"

# If file exists, auto-adds suffix: poster_1.jpg, poster_2.jpg, etc.
```
**Features:**
- Auto-creates parent directories if needed
- Collision avoidance: if file exists, adds `_1`, `_2`, etc.
- Use `/tmp/` for temporary images that will be uploaded elsewhere
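The collision-avoidance behavior described above could work roughly as follows. This is an illustrative sketch, not the script's actual implementation: `resolveDestination` is a hypothetical name, and the existence check is passed in as a function so the logic can be tried without touching the filesystem.

```typescript
// Hypothetical sketch of suffix-based collision avoidance (_1, _2, ...).
// `exists` is injected; in real use it might wrap a filesystem check.
function resolveDestination(
  requested: string,
  exists: (path: string) => boolean,
): string {
  if (!exists(requested)) return requested;
  const slash = requested.lastIndexOf("/");
  const dot = requested.lastIndexOf(".");
  // Only treat the dot as an extension separator if it follows the file name's first character.
  const hasExt = dot > slash + 1;
  const stem = hasExt ? requested.slice(0, dot) : requested;
  const ext = hasExt ? requested.slice(dot) : "";
  let i = 1;
  while (exists(`${stem}_${i}${ext}`)) i++;
  return `${stem}_${i}${ext}`;
}

const taken = new Set(["/tmp/poster.jpg", "/tmp/poster_1.jpg"]);
console.log(resolveDestination("/tmp/poster.jpg", (p) => taken.has(p))); // /tmp/poster_2.jpg
console.log(resolveDestination("/tmp/new.jpg", (p) => taken.has(p)));    // /tmp/new.jpg
```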
API Configuration
Create `scripts/.env`:
```
GEMINI_API_KEY=your_gemini_api_key
FAL_KEY=your_fal_api_key
XAI_API_KEY=your_xai_api_key
```
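A quick pre-flight check can confirm that the key needed for a given mode is present in `scripts/.env`. The sketch below assumes each key belongs to the provider its name suggests (Gemini for default images, fal.ai for any `--cheap` run, xAI for default video); this document does not state that mapping explicitly, and `requiredKey`, `parseEnv`, and `hasUsableKey` are hypothetical names.

```typescript
// Hypothetical pre-flight check. Assumption: key-to-provider mapping follows the key names.
type Task = "image" | "video";

function requiredKey(task: Task, cheap: boolean): string {
  if (cheap) return "FAL_KEY"; // both cheap modes use fal.ai
  return task === "image" ? "GEMINI_API_KEY" : "XAI_API_KEY";
}

// Minimal KEY=value parser; ignores blank lines and # comments.
function parseEnv(text: string): Map<string, string> {
  const vars = new Map<string, string>();
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue;
    const eq = trimmed.indexOf("=");
    if (eq > 0) {
      vars.set(trimmed.slice(0, eq).trim(), trimmed.slice(eq + 1).trim());
    }
  }
  return vars;
}

// True when the needed key is set and is not the "your_..." template placeholder.
function hasUsableKey(envText: string, task: Task, cheap: boolean): boolean {
  const value = parseEnv(envText).get(requiredKey(task, cheap));
  return value !== undefined && value !== "" && !value.startsWith("your_");
}

const sampleEnv = "GEMINI_API_KEY=abc123\nFAL_KEY=your_fal_api_key\n";
console.log(hasUsableKey(sampleEnv, "image", false)); // true
console.log(hasUsableKey(sampleEnv, "image", true));  // false (placeholder value)
```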
Hebrew/RTL Content
IMPORTANT: When the image contains Hebrew text, you MUST add the following sentence to the prompt:
"CRITICAL: Layout is RTL (right-to-left). All text in Hebrew. Visual flow, reading order, and panel sequence go from RIGHT to LEFT."
Copy this exact sentence and paste it at the BEGINNING of your prompt. Without it, the image will render left-to-right, which is wrong for Hebrew content.
WOW Mode
When the user asks for "wow mode" or wants maximum visual impact, add these epic elements to the prompt:
"EPIC CINEMATIC WOW: Amazed expression, mind-blown reaction, intense VFX - shattered glass particles, explosive energy bursts, volumetric light rays, dramatic lens flares, particle explosions, motion blur streaks, holographic glitches, electric sparks, cinematic color grading, dramatic rim lighting, depth of field bokeh, anamorphic lens effects. Maximum visual spectacle."
Output
- Files saved to the path specified by the `-d` flag (MANDATORY)
- Aspect ratio: Configurable via `--aspect` (default: 3:2)
- Quality: `1K` by default (1024px on longest edge)
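The Hebrew/RTL and WOW Mode prompt rules described earlier can be applied mechanically when composing a prompt. The sketch below is one possible way to do it: `buildPrompt` and `PromptFlags` are hypothetical names, the two constants are copied verbatim from this document, and appending the WOW text at the end is an assumption, since the document only requires the RTL sentence to come first.

```typescript
// Hypothetical prompt composer; the two constants are quoted from this document.
const RTL_DIRECTIVE =
  "CRITICAL: Layout is RTL (right-to-left). All text in Hebrew. Visual flow, reading order, and panel sequence go from RIGHT to LEFT.";
const WOW_ELEMENTS =
  "EPIC CINEMATIC WOW: Amazed expression, mind-blown reaction, intense VFX - shattered glass particles, explosive energy bursts, volumetric light rays, dramatic lens flares, particle explosions, motion blur streaks, holographic glitches, electric sparks, cinematic color grading, dramatic rim lighting, depth of field bokeh, anamorphic lens effects. Maximum visual spectacle.";

interface PromptFlags {
  hebrewText?: boolean; // the image will contain Hebrew text
  wowMode?: boolean;    // the user asked for "wow mode"
}

function buildPrompt(base: string, flags: PromptFlags = {}): string {
  const parts: string[] = [];
  if (flags.hebrewText) parts.push(RTL_DIRECTIVE); // must be at the BEGINNING
  parts.push(base);
  if (flags.wowMode) parts.push(WOW_ELEMENTS);     // position is an assumption
  return parts.join(" ");
}

console.log(buildPrompt("Conference poster with a Hebrew headline", { hebrewText: true }));
```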
Video Generation
Generate videos using xAI Grok Imagine Video (default) or fal.ai LTX-2 (cheap mode).
Quick Start
```bash
cd ~/.claude/skills/image-generation/scripts

# Text-to-video (Grok Imagine - high quality, 720p)
npx tsx generate_video.ts -d /tmp/video.mp4 "A futuristic city at sunset, cinematic drone shot"

# Cheap mode: fal.ai LTX-2 (fast, low cost)
npx tsx generate_video.ts --cheap -d /tmp/video.mp4 "A futuristic city at sunset"

# Image-to-video (animate a still image)
npx tsx generate_video.ts -d /tmp/animated.mp4 --image /path/to/image.jpg "Slow zoom with particles floating"
npx tsx generate_video.ts --cheap -d /tmp/animated.mp4 --image /path/to/image.jpg "Gentle camera movement"

# Custom duration and aspect ratio
npx tsx generate_video.ts -d /tmp/long.mp4 -t 10 -a 9:16 "Vertical video for stories"
```
Video Providers
| Provider | Flag | Quality | Duration | Resolution | Cost |
|---|---|---|---|---|---|
| Grok Imagine | (default) | High, cinematic | 1-15s | 480p/720p | Per-second billing |
| fal.ai LTX-2 | `--cheap` | Good, fast | 1-10s | ~480p | Budget-friendly |
Video Options
| Option | Flag | Default | Description |
|---|---|---|---|
| Prompt | (positional) | required | Text description of the video |
| Destination | `-d` | required | Output file path (.mp4) |
| Duration | `-t` | 5 | Duration in seconds |
| Aspect ratio | `-a` | 16:9 | 16:9, 9:16, 1:1, 4:3, 3:4, 3:2, 2:3 |
| Resolution | | 720p | 480p or 720p (Grok only) |
| Image source | `--image` | none | Source image for image-to-video |
| Cheap mode | `--cheap` | false | Use LTX-2 instead of Grok |
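The limits in the provider and option tables above can be checked before a video run. The following is a sketch of such a pre-check, not the script's own validation: `VideoOptions` and `validateVideoOptions` are hypothetical names, and the limits (1-15s for Grok, 1-10s for LTX-2, resolution for Grok only) are taken from the tables in this document.

```typescript
// Hypothetical pre-check based on the limits listed in this document's tables.
const VIDEO_ASPECT_RATIOS = ["16:9", "9:16", "1:1", "4:3", "3:4", "3:2", "2:3"];

interface VideoOptions {
  duration?: number;             // seconds, default 5
  aspect?: string;               // default "16:9"
  resolution?: "480p" | "720p";  // Grok only
  cheap?: boolean;               // use LTX-2 instead of Grok
}

// Returns a list of problems; an empty list means the options look valid.
function validateVideoOptions(opts: VideoOptions): string[] {
  const problems: string[] = [];
  const duration = opts.duration ?? 5;
  const maxDuration = opts.cheap ? 10 : 15;
  if (duration < 1 || duration > maxDuration) {
    problems.push(`duration must be between 1 and ${maxDuration} seconds`);
  }
  const aspect = opts.aspect ?? "16:9";
  if (!VIDEO_ASPECT_RATIOS.includes(aspect)) {
    problems.push(`unsupported aspect ratio: ${aspect}`);
  }
  if (opts.cheap && opts.resolution !== undefined) {
    problems.push("resolution applies to Grok only");
  }
  return problems;
}

console.log(validateVideoOptions({ duration: 10, aspect: "9:16" })); // []
console.log(validateVideoOptions({ duration: 12, cheap: true }));
```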