image-gen
When to Use
- User wants to generate an AI image from a text description
- User says "generate image", "draw", "create picture", "配图"
- User says "生成图片", "画一张", "AI图"
- User needs a cover image, illustration, or concept art
When NOT to Use
- User wants to create audio content (use /speech, /podcast)
- User wants to create a video (use /explainer)
- User wants to edit an existing image (not supported)
- User wants to extract content from a URL (use /content-parser)
Purpose
Generate AI images using the Labnana API. Supports text prompts with optional reference images, multiple resolutions, and aspect ratios. Images are saved as local files.
Hard Constraints
- No shell scripts. Construct curl commands from the API reference files listed under API Reference
- Always read shared/authentication.md for API key and headers
- Follow shared/common-patterns.md for error handling
- Image generation uses a different base URL: https://api.labnana.com/openapi/v1
- Always read config following shared/config-pattern.md before any interaction
- Output saved to .listenhub/image-gen/YYYY-MM-DD-{jobId}/, never ~/Downloads/
Step -1: API Key Check
Follow shared/config-pattern.md § API Key Check. If the key is missing, stop immediately.
Step 0: Config Setup
Follow shared/config-pattern.md Step 0.
**If file doesn't exist**: ask location, then create immediately:
bash
mkdir -p ".listenhub/image-gen"
echo '{"outputDir":".listenhub","outputMode":"inline"}' > ".listenhub/image-gen/config.json"
CONFIG_PATH=".listenhub/image-gen/config.json"
# (or $HOME/.listenhub/image-gen/config.json for global)
Then run **Setup Flow** below.
**If file exists**: read config, display summary, and confirm:
Current config (image-gen):
Output mode: {inline / download / both}
Ask: "Use saved config?" → **Confirm and continue** / **Reconfigure**
Setup Flow (first run or reconfigure)
- outputMode: Follow shared/output-mode.md § Setup Flow Question.
Save immediately:
bash
# Follow shared/output-mode.md § Save to Config
NEW_CONFIG=$(echo "$CONFIG" | jq --arg m "$OUTPUT_MODE" '. + {"outputMode": $m}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"
CONFIG=$(cat "$CONFIG_PATH")
Interaction Flow
Step 1: Image Description
Free text input. Ask the user:
Describe the image you want to generate.
If the prompt is very short (< 10 words) and the user hasn't asked for verbatim generation, offer to help enrich the prompt. Otherwise, use as-is.
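The length check above could be sketched as a small helper. This is an illustration only: the function name `is_short_prompt` is hypothetical, and counting whitespace-separated words is an assumption that will not suit prompts written without spaces (Chinese, for example).

```bash
# Hypothetical helper (not part of the skill): succeeds when the prompt
# has fewer than 10 whitespace-separated words.
is_short_prompt() {
  # wc -w may pad its output with spaces on some systems, so strip them
  words=$(printf '%s' "$1" | wc -w | tr -d '[:space:]')
  [ "$words" -lt 10 ]
}

if is_short_prompt "cyberpunk city at night"; then
  echo "offer enrichment"
else
  echo "use as-is"
fi
```

The check only decides whether to offer help; the user's request for verbatim generation still takes priority.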
Step 2: Model
Ask:
Question: "Which model?"
Options:
- "pro (recommended)": gemini-3-pro-image-preview, higher quality
- "flash": gemini-3.1-flash-image-preview, faster and cheaper, unlocks extreme aspect ratios (1:4, 4:1, 1:8, 8:1)
Step 3: Resolution and Aspect Ratio
Ask both together (independent parameters):
Question: "What resolution?"
Options:
- "1K": Standard quality
- "2K (recommended)": High quality, good balance
- "4K": Ultra high quality, slower generation
Question: "What aspect ratio?"
Options (all models):
- "16:9": Landscape, widescreen
- "1:1": Square
- "9:16": Portrait, phone screen
- "Other": 2:3, 3:2, 3:4, 4:3, 21:9
If flash model was selected, also offer: 1:4 (narrow portrait), 4:1 (wide landscape), 1:8 (extreme portrait), 8:1 (panoramic)
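The model-dependent ratio rule above can be checked before the request is built. The sketch below assumes the short model names `pro` and `flash` from Step 2; the function name `ratio_allowed` is hypothetical.

```bash
# Hypothetical helper: $1 is the model choice (pro or flash), $2 the ratio.
# Succeeds when the ratio is offered for that model.
ratio_allowed() {
  case "$2" in
    # Ratios listed for all models
    16:9|1:1|9:16|2:3|3:2|3:4|4:3|21:9) return 0 ;;
    # Extreme ratios are only offered with the flash model
    1:4|4:1|1:8|8:1) [ "$1" = "flash" ] ;;
    *) return 1 ;;
  esac
}

if ratio_allowed "pro" "1:8"; then
  echo "ok"
else
  echo "1:8 needs the flash model"
fi
```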
Step 4: Reference Images (optional)
Question: "Any reference images for style guidance?"
Options:
- "Yes, I have URL(s)": Provide reference image URLs
- "No references": Generate from prompt only
If yes, collect URLs (comma-separated, max 14). For each URL, infer mimeType from suffix and build:
json
{ "fileData": { "fileUri": "<url>", "mimeType": "<inferred>" } }
Suffix mapping: .jpg / .jpeg → image/jpeg, .png → image/png, .webp → image/webp, .gif → image/gif
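The suffix mapping above might be implemented roughly as follows. The function name `mime_for_url` is hypothetical, and two behaviours are assumptions not stated in the source: any `?query` or `#fragment` is ignored, and the suffix is compared case-insensitively. The `printf` template also assumes the URL contains no characters that need JSON escaping.

```bash
# Hypothetical helper: prints the MIME type for a reference image URL,
# or fails when the suffix is not in the mapping.
mime_for_url() {
  # Assumption: drop everything from the first ? or # onward
  path="${1%%[?#]*}"
  lower=$(printf '%s' "$path" | tr '[:upper:]' '[:lower:]')
  case "$lower" in
    *.jpg|*.jpeg) echo "image/jpeg" ;;
    *.png)        echo "image/png" ;;
    *.webp)       echo "image/webp" ;;
    *.gif)        echo "image/gif" ;;
    *)            return 1 ;;   # unknown suffix: let the caller decide
  esac
}

url="https://example.com/style.PNG?size=large"
if mime=$(mime_for_url "$url"); then
  printf '{ "fileData": { "fileUri": "%s", "mimeType": "%s" } }\n' "$url" "$mime"
fi
```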
Step 5: Confirm & Generate
Summarize all choices:
Ready to generate image:
Prompt: {prompt text}
Model: {pro / flash}
Resolution: {1K / 2K / 4K}
Aspect ratio: {ratio}
References: {yes (N URLs) / no}
Proceed?
Wait for explicit confirmation before calling the API.
Workflow
- Build request: Construct JSON with provider, model, prompt, imageConfig, and optional referenceImages
- Submit: POST https://api.labnana.com/openapi/v1/images/generation with timeout of 600s
- Extract image: Parse base64 data from response
- Decode and present result
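For the build step, letting jq assemble the body avoids hand-escaping quotes in the user's prompt. This is a sketch under assumptions: the function name `build_request` and its positional arguments are hypothetical, and placing `referenceImages` as a top-level array of `fileData` objects is a guess; shared/api-image.md remains the authority on the schema.

```bash
# Hypothetical helper: $1 model id, $2 prompt, $3 imageSize, $4 aspectRatio,
# $5 optional JSON array of reference objects (defaults to an empty array).
build_request() {
  refs="${5:-[]}"
  jq -n \
    --arg model "$1" --arg prompt "$2" \
    --arg size "$3" --arg ratio "$4" \
    --argjson refs "$refs" \
    '{provider: "google", model: $model, prompt: $prompt,
      imageConfig: {imageSize: $size, aspectRatio: $ratio}}
     + (if ($refs | length) > 0 then {referenceImages: $refs} else {} end)'
}
```

A caller could write the result to a file, for instance `build_request "gemini-3-pro-image-preview" "$PROMPT" "2K" "16:9" > request.json`, and pass it to curl with `-d @request.json`.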
Read OUTPUT_MODE from config. Follow shared/output-mode.md for behavior.
**inline or both**: Decode the base64 data to a temporary file, then display it with the Read tool.
bash
JOB_ID=$(date +%s)
# --decode is accepted by both GNU and macOS base64
echo "$BASE64_DATA" | base64 --decode > "/tmp/image-gen-${JOB_ID}.jpg"
Then use the Read tool on /tmp/image-gen-{jobId}.jpg. The image displays inline in the conversation.
Present:
Image generated!
**download or both**: Save the image to the artifact directory.
bash
JOB_ID=$(date +%s)
DATE=$(date +%Y-%m-%d)
JOB_DIR=".listenhub/image-gen/${DATE}-${JOB_ID}"
mkdir -p "$JOB_DIR"
echo "$BASE64_DATA" | base64 --decode > "${JOB_DIR}/${JOB_ID}.jpg"
Present:
Image generated!
Saved to .listenhub/image-gen/{YYYY-MM-DD}-{jobId}/:
{jobId}.jpg
Base64 decoding (cross-platform):
bash
# Linux
echo "$BASE64_DATA" | base64 -d > output.jpg
# macOS
echo "$BASE64_DATA" | base64 -D > output.jpg
# or, on either platform
echo "$BASE64_DATA" | base64 --decode > output.jpg
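The platform differences above can be hidden behind one helper that tries each flag in turn. The name `b64decode` is hypothetical. The input is passed as an argument rather than on stdin so that a failed attempt does not consume it, and the `null` check is an assumption about how the data was extracted (`jq -r` prints `null` for a missing field).

```bash
# Hypothetical helper: $1 base64 text, $2 output path.
b64decode() {
  # Refuse empty input and the literal "null" that jq -r prints
  if [ -z "$1" ] || [ "$1" = "null" ]; then
    return 1
  fi
  # Try the long option first, then the GNU and macOS short flags
  for flag in --decode -d -D; do
    if printf '%s' "$1" | base64 "$flag" > "$2" 2>/dev/null; then
      return 0
    fi
  done
  return 1
}

out=$(mktemp)
if b64decode "aGVsbG8=" "$out"; then
  cat "$out"
fi
```

If every attempt fails, the output file may be left empty or partial, so a caller would want to remove it before reporting the error.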
**Retry logic**: On 429 (rate limit), wait 15 seconds and retry. Max 3 retries.
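One way to express this retry rule is a wrapper around a command that prints the HTTP status code. The names `with_retry`, `MAX_RETRIES`, and `RETRY_WAIT` are hypothetical. "Max 3 retries" is read here as one initial attempt plus up to three more, which is an interpretation rather than something the source spells out.

```bash
# Hypothetical wrapper: runs the given command, which must print an HTTP
# status code, and repeats it while that code is 429.
with_retry() {
  retries=0
  while :; do
    status=$("$@")
    if [ "$status" != "429" ]; then
      echo "$status"          # success or a non-rate-limit error
      return 0
    fi
    if [ "$retries" -ge "${MAX_RETRIES:-3}" ]; then
      echo "$status"          # still rate limited after the last retry
      return 1
    fi
    retries=$((retries + 1))
    sleep "${RETRY_WAIT:-15}"
  done
}
```

A caller might wrap the curl command in a function that uses `-o response.json -w '%{http_code}'`, so the body goes to a file and only the status code is printed. Status codes other than 429 are returned unchanged and left to the error handling in shared/common-patterns.md.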
Prompt Handling
Default: Pass the user's prompt directly without modification.
When to offer optimization:
- Prompt is very short (a few words) AND user hasn't requested verbatim
- Ask: "Would you like help enriching the prompt with style/lighting/composition details?"
When to never modify:
- Long, detailed, or structured prompts — treat the user as experienced
- User says "use this prompt exactly"
Optimization techniques (if user agrees):
- Style: "cyberpunk" → add "neon lights, futuristic, dystopian"
- Scene: time of day, lighting, weather
- Quality: "highly detailed", "8K quality", "cinematic composition"
- Always use English keywords (models trained on English)
- Show optimized prompt before submitting
API Reference
- Image generation: shared/api-image.md
- Error handling: shared/common-patterns.md § Error Handling
Composability
- Invokes: nothing (direct API call)
- Invoked by: platform skills for cover images (Phase 2)
Example
User: "Generate an image: cyberpunk city at night"
Agent workflow:
- Prompt is short → offer enrichment → user declines
- Ask model → "pro"
- Ask resolution → "2K"
- Ask ratio → "16:9"
- No references
bash
RESPONSE=$(curl -sS -X POST "https://api.labnana.com/openapi/v1/images/generation" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  --max-time 600 \
  -d '{
    "provider": "google",
    "model": "gemini-3-pro-image-preview",
    "prompt": "cyberpunk city at night",
    "imageConfig": {"imageSize": "2K", "aspectRatio": "16:9"}
  }')
# Extract the base64 payload, falling back to a top-level .data field
BASE64_DATA=$(echo "$RESPONSE" | jq -r '.candidates[0].content.parts[0].inlineData.data // .data')
JOB_ID=$(date +%s)
DATE=$(date +%Y-%m-%d)
JOB_DIR=".listenhub/image-gen/${DATE}-${JOB_ID}"
mkdir -p "$JOB_DIR"
# --decode is accepted by both GNU and macOS base64
echo "$BASE64_DATA" | base64 --decode > "${JOB_DIR}/${JOB_ID}.jpg"
Decode the base64 data per outputMode (see shared/output-mode.md).