slideshow-producer

Slideshow Producer

Follow shared public skill rules in:
  • postplus-shared public skill rules

Pipeline Position

script-generator (slideshow mode)
  → slideshow-producer [THIS SKILL]
    → image-batch-runner
  • script-generator owns: hook logic, script flows, angle strategy, persona checks
  • This skill owns: vibe-to-prompt translation, slide manifest management, localhost review, batch orchestration, text compositing
  • image-batch-runner owns: actual image generation calls, downloaded assets, local manifests

What This Skill Is For

  • turning a user's vibe description into a concrete slide-by-slide plan
  • writing concise, effective image prompts (core scene + vibe, not over-constrained)
  • producing a local slide manifest JSON for review and iteration
  • optionally launching a localhost GUI for drag-and-drop slide editing
  • orchestrating batch image generation through image-batch-runner
  • compositing TikTok-style overlay text (white with black stroke) onto final slides

What This Skill Is Not For

  • writing hook variants or script flows from scratch (that is script-generator's job)
  • calling image generation APIs directly (that is image-batch-runner's job)
  • creating persona packs or brand voice documents
  • replacing creative-qa for final human quality review

Model Defaults

| Param | Default | Notes |
| --- | --- | --- |
| text model | image-gpt-image-2-text | Use only when the slide has no reference image |
| edit model | image-gpt-image-2-edit | Required when the slide has any reference image |
| quality | medium | Faster and cheaper; sufficient for UGC slideshows |
| resolution | 1k | Good for 1080×1920 output; use 2k for fine detail |
| aspectRatio | 9:16 (TT) or 4:5 (IG) | Based on platform |
| outputFormat | png | GPT Image 2 always outputs PNG |
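
The defaults in the table can be captured in a small lookup. The sketch below is illustrative only: `request_defaults` and `PLATFORM_ASPECT` are hypothetical names introduced here, not part of the skill's scripts.

```python
# Hypothetical helper: resolve the Model Defaults table for one slide.
PLATFORM_ASPECT = {"tiktok": "9:16", "instagram": "4:5"}


def request_defaults(platform: str = "tiktok", has_reference: bool = False) -> dict:
    """Return the default request parameters for a slide on the given platform."""
    return {
        # The edit model is required whenever the slide has any reference image.
        "model": "image-gpt-image-2-edit" if has_reference else "image-gpt-image-2-text",
        "quality": "medium",
        "resolution": "1k",
        "aspectRatio": PLATFORM_ASPECT[platform],
        "outputFormat": "png",
    }
```

Calling `request_defaults()` with no arguments gives the TikTok text-to-image defaults; pass `has_reference=True` for a slide that carries reference images.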

Local Dependencies

  • python3 with Pillow (PIL) is required for scripts/composite-text.mjs.

Reference Image Routing Rule

The slide manifest is the routing source of truth.
  • imageSource: "local" means use localImagePath as the final slide image. Do not call image generation.
  • imageSource: "generate" with no referenceImagePaths or referenceImageUrls means generationMode: "text-to-image" and must call generate_image.mjs with image-gpt-image-2-text.
  • imageSource: "generate" with any referenceImagePaths or referenceImageUrls means generationMode: "edit" and must call edit_image.mjs with image-gpt-image-2-edit.
  • Local reference files must be uploaded through image-batch-runner's upload_media.mjs first. Pass only the returned URLs into edit_image.mjs as inputUrls.
Never send a slide with reference images through text-to-image. If the manifest has references but generationMode is missing or set to text-to-image, fix the manifest before generation.
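
The routing rule can be read as a small decision function. The following is a minimal sketch under stated assumptions: `resolve_route` and the shape of its return value are hypothetical, and only the field names, script names, and model names come from the rule itself.

```python
def resolve_route(slide: dict) -> dict:
    """Map one manifest slide to the action the routing rule prescribes."""
    if slide.get("imageSource") == "local":
        # Local slides are used as-is; no image generation call.
        return {"action": "copy", "path": slide["localImagePath"]}

    has_refs = bool(slide.get("referenceImagePaths") or slide.get("referenceImageUrls"))
    if has_refs:
        if slide.get("generationMode") != "edit":
            # References with a missing or text-to-image mode: fix the manifest first.
            raise ValueError("slide has reference images but generationMode is not 'edit'")
        return {
            "action": "generate",
            "script": "edit_image.mjs",
            "model": "image-gpt-image-2-edit",
            "mode": "edit",
        }

    return {
        "action": "generate",
        "script": "generate_image.mjs",
        "model": "image-gpt-image-2-text",
        "mode": "text-to-image",
    }
```

Raising on a mismatched manifest, rather than silently switching modes, mirrors the instruction to fix the manifest before generation.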

Prompt Writing Principles

GPT Image 2 needs clear scene and vibe description, not a wall of constraints.

Do: core scene + vibe

Describe what the image should show and how it should feel. 2-3 sentences max.
Good:
A person sitting at a cluttered home desk, frustrated expression, looking at a laptop screen.
Natural window light from the side, casual iphone photo feel, slightly warm color cast.

Don't: over-constrain

Do not pile on "no studio lighting, no professional photography, no cinematic quality, no soft glow, no perfect skin, no..." — GPT Image 2 doesn't need negative constraints. They clog the prompt and can confuse the model.
If the vibe is "casual iphone photo", just say that. The model understands.

When to add specifics

  • Product visible: describe it clearly (color, shape, placement)
  • Environment matters: room type, lighting source, time of day
  • Human expression: the specific emotion or action
  • Props: what else is in frame
When in doubt, be more specific about what IS there, not what ISN'T.
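
These principles could be checked mechanically before a prompt goes into the manifest. The sketch below is a rough heuristic, not part of the skill: `lint_prompt` is a hypothetical helper, and the regular expression only catches simple "no X" or "without X" phrasing.

```python
import re

# Heuristic only: matches "no <word>" and "without <word>" style negations.
NEGATION = re.compile(r"\b(?:no|without)\s+\w+", re.IGNORECASE)


def lint_prompt(prompt: str, max_sentences: int = 3) -> list[str]:
    """Return warnings for prompts that are too long or lean on negative constraints."""
    warnings = []
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+\s*", prompt.strip()) if s]
    if len(sentences) > max_sentences:
        warnings.append(f"{len(sentences)} sentences; aim for {max_sentences} or fewer")
    negations = NEGATION.findall(prompt)
    if negations:
        warnings.append(f"{len(negations)} negative constraint(s); describe what IS in frame instead")
    return warnings
```

A prompt such as "no studio lighting, no soft glow, no perfect skin" yields one warning about three negative constraints, while the "Good" example above passes cleanly.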

Platform Modes

TikTok Slideshow (default)

  • Aspect ratio: 9:16 (1080×1920)
  • Safe zones: avoid top ~60px and bottom ~80px for UI overlays
  • Overlay text: upper-center, bold white with black stroke
  • Default slide count: 5-7

Instagram Carousel

  • Aspect ratio: 4:5 (1080×1350) preferred; 1:1 (1080×1080) alternative
  • Most text in the caption field; minimal image overlay
  • Default slide count: 4-6
Default to TikTok when the user hasn't specified.
Read detailed specs in references/platform-specs.md.
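
The two platform modes can be sketched as a lookup with TikTok as the fallback. The names `PLATFORMS`, `platform_spec`, and `safe_area` are hypothetical. The Instagram entry has zero safe-zone margins only because this section lists none for it; that is an assumption of the sketch, not a documented spec.

```python
# Values taken from the Platform Modes section; Instagram margins are an assumption.
PLATFORMS = {
    "tiktok": {"aspectRatio": "9:16", "size": (1080, 1920),
               "safe_top": 60, "safe_bottom": 80, "slide_count": (5, 7)},
    "instagram": {"aspectRatio": "4:5", "size": (1080, 1350),
                  "safe_top": 0, "safe_bottom": 0, "slide_count": (4, 6)},
}


def platform_spec(name=None) -> dict:
    """Return the spec for a platform, defaulting to TikTok when unspecified."""
    return PLATFORMS[(name or "tiktok").lower()]


def safe_area(spec: dict) -> tuple:
    """Return (left, top, right, bottom) of the region clear of UI overlays."""
    width, height = spec["size"]
    return (0, spec["safe_top"], width, height - spec["safe_bottom"])
```

For the default TikTok canvas, `safe_area(platform_spec())` gives `(0, 60, 1080, 1840)`, which is the band overlay text should stay inside.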

Default Workflow

Phase 1: Requirements

Ask the user:
  1. How many slideshows? → determines sub-agent count
  2. TikTok or Instagram? → default TT
  3. Describe the vibe — what should this feel like? Keep it loose.
  4. Any local images to use for specific slides? → identify whether each path is a final slide image (imageSource: "local") or a reference for generation (generationMode: "edit").
Do not jump to scripts until you understand the feeling they want.

Phase 2: Script Creation

For each slideshow, produce a slide manifest JSON.
Each slide has:
  • position (1-indexed)
  • cognitiveJob (hook, problem, insight, proof, product, cta, etc.)
  • prompt (concise, following prompt writing principles above)
  • overlayText (3-8 words, lowercase, conversational; null if no text)
  • imageSource ("generate" or "local")
  • localImagePath (only if imageSource is "local")
  • generationMode ("text-to-image", "edit", or null)
  • referenceImagePaths (local reference files for generated edit-mode slides)
  • referenceImageUrls (uploaded reference URLs for generated edit-mode slides)
Save as a local manifest file.
Read the full schema in references/slide-manifest-schema.md.
After producing the manifest, ask: "Need a localhost preview to review and edit? (y/n)"
  • If yes → start the GUI server; the user reviews, drags to reorder, and edits prompts and overlay text.
  • If no → show the JSON summary; the user confirms by text.
Before asking for generation approval, run:
```bash
node ${CLAUDE_SKILL_DIR}/scripts/validate-manifest.mjs --manifest /path/to/slideshow-manifest.json
```
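
To make the per-slide fields concrete, here is an example slide and a sketch of the kind of field checks a validator might apply. This is not the logic of validate-manifest.mjs, which is not shown in this document; `validate_slide` and the example values are hypothetical, and the authoritative rules live in references/slide-manifest-schema.md.

```python
def validate_slide(slide: dict) -> list[str]:
    """Return a list of problems found in one slide entry (empty if none)."""
    errors = []
    position = slide.get("position")
    if not isinstance(position, int) or position < 1:
        errors.append("position must be a 1-indexed integer")
    if slide.get("imageSource") not in ("generate", "local"):
        errors.append('imageSource must be "generate" or "local"')
    if slide.get("imageSource") == "local" and not slide.get("localImagePath"):
        errors.append("localImagePath is required when imageSource is local")
    if slide.get("generationMode") not in ("text-to-image", "edit", None):
        errors.append('generationMode must be "text-to-image", "edit", or null')
    text = slide.get("overlayText")
    if text is not None:
        if not 3 <= len(text.split()) <= 8:
            errors.append("overlayText should be 3-8 words")
        if text != text.lower():
            errors.append("overlayText should be lowercase")
    return errors


# Example slide using only the fields listed above; values are illustrative.
example_slide = {
    "position": 1,
    "cognitiveJob": "hook",
    "prompt": "A person sitting at a cluttered home desk, frustrated expression.",
    "overlayText": "why is my desk always like this",
    "imageSource": "generate",
    "localImagePath": None,
    "generationMode": "text-to-image",
    "referenceImagePaths": [],
    "referenceImageUrls": [],
}
```

Returning a list of messages instead of raising on the first problem lets the agent report every issue in one pass before asking for approval.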

Phase 3: Generation Trigger

After user approves the manifest, explicitly ask:
"Ready to generate N slides? GPT Image 2, quality: medium, resolution: 1k. X text-to-image, Y reference edits."
Do not auto-generate. User must confirm.
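
The N, X, and Y in the confirmation question can be derived from the manifest. In this sketch, `confirmation_prompt` is a hypothetical helper, and it assumes N counts only slides with imageSource "generate" (local slides need no generation).

```python
def confirmation_prompt(slides: list, quality: str = "medium", resolution: str = "1k") -> str:
    """Build the explicit generation question from the approved manifest slides."""
    generated = [s for s in slides if s.get("imageSource") == "generate"]
    edits = sum(1 for s in generated if s.get("generationMode") == "edit")
    text_to_image = len(generated) - edits
    return (
        f"Ready to generate {len(generated)} slides? GPT Image 2, "
        f"quality: {quality}, resolution: {resolution}. "
        f"{text_to_image} text-to-image, {edits} reference edits."
    )
```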

Phase 4: Batch Generation

Local slides: copy the image into the output directory as-is.
Generated slides without references: call image-batch-runner's generate_image.mjs per slide.
Generated slides with references: upload each local reference path with upload_media.mjs, then call edit_image.mjs with the uploaded URLs. Existing referenceImageUrls can be used directly if they are already HTTP(S) URLs.
Save the normalized request JSON per slide so the run is reproducible.
Default text-to-image request shape per slide:
```json
{
  "assetId": "{slideshowId}-slide-{N}",
  "runId": "{slideshowId}-run-{timestamp}",
  "provider": "hosted-media",
  "model": "image-gpt-image-2-text",
  "mode": "text-to-image",
  "prompt": "{the slide prompt}",
  "aspectRatio": "9:16",
  "quality": "medium",
  "resolution": "1k",
  "outputFormat": "png",
  "localAssetDir": "{output directory}"
}
```
Default reference edit request shape per slide:
```json
{
  "assetId": "{slideshowId}-slide-{N}",
  "runId": "{slideshowId}-run-{timestamp}",
  "provider": "hosted-media",
  "model": "image-gpt-image-2-edit",
  "mode": "edit",
  "prompt": "{the slide prompt, explicitly preserving the reference image facts}",
  "inputUrls": ["{uploaded reference image URL}"],
  "aspectRatio": "9:16",
  "quality": "medium",
  "resolution": "1k",
  "outputFormat": "png",
  "localAssetDir": "{output directory}"
}
```
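
Both request shapes differ only in the model, the mode, and the presence of inputUrls, so one builder can produce either. This is a sketch under assumptions: `build_request` is a hypothetical name, and it assumes the `{N}` placeholder is the slide's position.

```python
def build_request(slideshow_id: str, slide: dict, timestamp: str, output_dir: str,
                  aspect_ratio: str = "9:16", input_urls=None) -> dict:
    """Build the normalized request for one generated slide."""
    request = {
        # Assumption: {N} in assetId is the slide's 1-indexed position.
        "assetId": f"{slideshow_id}-slide-{slide['position']}",
        "runId": f"{slideshow_id}-run-{timestamp}",
        "provider": "hosted-media",
        "prompt": slide["prompt"],
        "aspectRatio": aspect_ratio,
        "quality": "medium",
        "resolution": "1k",
        "outputFormat": "png",
        "localAssetDir": output_dir,
    }
    if input_urls:
        # Uploaded reference URLs force the edit model and mode.
        request.update(model="image-gpt-image-2-edit", mode="edit",
                       inputUrls=list(input_urls))
    else:
        request.update(model="image-gpt-image-2-text", mode="text-to-image")
    return request
```

Writing the returned dict to disk with `json.dump` next to the generated asset is one way to keep the run reproducible, as the text above asks.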

Phase 5: Text Overlay + Final Review

After all images are generated, run composite-text.mjs to add overlay text to each slide.
Overlay style (TikTok default):
  • White text (#FFFFFF), black stroke (#000000, 3px)
  • Font: bold sans-serif (Helvetica Bold or system equivalent)
  • Position: upper-center
  • Size: ~56px relative to 1080px canvas width
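
The overlay style maps directly onto Pillow's stroked-text support. The sketch below is not the implementation of composite-text.mjs; `overlay_style` and `composite_text` are hypothetical names, the 15% vertical offset for "upper-center" is an assumption, scaling the stroke with the canvas is an assumption, and the caller must supply a path to a bold sans-serif font file.

```python
def overlay_style(canvas_width: int, canvas_height: int) -> dict:
    """Scale the TikTok overlay style from its 1080px-wide reference canvas."""
    scale = canvas_width / 1080
    return {
        "font_size": round(56 * scale),
        "stroke_width": max(1, round(3 * scale)),  # assumption: stroke scales too
        "fill": "#FFFFFF",
        "stroke_fill": "#000000",
        # Assumption: "upper-center" means centered horizontally, 15% from the top.
        "anchor_xy": (canvas_width // 2, round(canvas_height * 0.15)),
    }


def composite_text(src_path: str, out_path: str, text: str, font_path: str) -> None:
    """Draw white text with a black stroke onto a slide and save the result."""
    from PIL import Image, ImageDraw, ImageFont  # requires Pillow

    image = Image.open(src_path).convert("RGB")
    style = overlay_style(*image.size)
    font = ImageFont.truetype(font_path, style["font_size"])
    ImageDraw.Draw(image).text(
        style["anchor_xy"], text, font=font, fill=style["fill"],
        stroke_width=style["stroke_width"], stroke_fill=style["stroke_fill"],
        anchor="mt",  # middle-top: center the text horizontally on anchor_xy
    )
    image.save(out_path)
```

Keeping the style calculation separate from the drawing call makes the sizing easy to check without opening any image files.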
Show the final results to the user.

Sub-Agent Orchestration

When N slideshows are requested:
  • N ≤ 2: the main agent handles them sequentially
  • N ≥ 3: spawn N sub-agents in parallel for Phase 2 script creation
Each sub-agent produces one slide manifest JSON. The main agent collects and presents them together.
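
The threshold can be sketched as a dispatch function. Threads stand in for sub-agents here purely for illustration; `create_manifests` and the `make_manifest` callback are hypothetical, and how sub-agents are actually spawned depends on the host agent runtime.

```python
from concurrent.futures import ThreadPoolExecutor


def create_manifests(briefs: list, make_manifest) -> list:
    """Produce one manifest per slideshow brief, in the original order."""
    if len(briefs) <= 2:
        # N ≤ 2: the main agent handles them sequentially.
        return [make_manifest(brief) for brief in briefs]
    # N ≥ 3: one worker per slideshow, run in parallel.
    with ThreadPoolExecutor(max_workers=len(briefs)) as pool:
        # pool.map preserves input order, so results line up with briefs.
        return list(pool.map(make_manifest, briefs))
```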

GUI (Optional)

A minimal localhost review tool at gui/index.html.
Start with node gui/server.mjs (default port 3099).
The GUI lets the user:
  • See all slides in order
  • Drag to reorder
  • Edit prompt and overlay text inline
  • Toggle image source (generate vs local)
  • Save changes back to the manifest JSON
The GUI is a preview tool, not a render trigger. Generation is separate (Phase 3).

Failure Modes

  • No vibe description → ask again with more specific questions
  • Single slide requested → explain that a slideshow needs at least 2 slides
  • Local image not found → flag before generation, not during
  • User wants to skip review → allow but warn
  • image-batch-runner unavailable → stop, report the gap
  • Python Pillow unavailable → stop before compositing and install Pillow in the active python3 environment
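
Several of these failure modes can be caught in one pre-generation pass. The sketch below is illustrative: `preflight` is a hypothetical helper, and it covers only the checks that can be made locally (slide count, local image paths, Pillow availability).

```python
import importlib.util
from pathlib import Path


def preflight(slides: list, pillow_available=None) -> list[str]:
    """Return blocking problems to report before generation starts."""
    problems = []
    if len(slides) < 2:
        problems.append("a slideshow needs at least 2 slides")
    for slide in slides:
        path = slide.get("localImagePath")
        # Flag missing local images now, not during generation.
        if slide.get("imageSource") == "local" and not (path and Path(path).is_file()):
            problems.append(f"slide {slide.get('position')}: local image not found: {path}")
    if pillow_available is None:
        # Detect Pillow in the active python3 environment without importing it.
        pillow_available = importlib.util.find_spec("PIL") is not None
    if not pillow_available:
        problems.append("Pillow is not installed in the active python3 environment")
    return problems
```

The `pillow_available` parameter exists only so the check can be overridden in tests; leave it as `None` in normal use.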

Shared Source Context

When campaign-level source files exist, use them as context:
  • brand.md for tone and forbidden phrases
  • persona.md for visual character constraints (only when slides feature the persona)
  • product.md for accurate product appearance
This skill may read these files. It does not create or manage them.

Handoff

After Phase 5, suggest next steps:
  • creative-qa for human quality review
  • social-media-publisher for publishing