slideshow-producer

Slideshow Producer

Follow the shared public skill rules in `postplus-shared`.

Pipeline Position

script-generator (slideshow mode)
  → slideshow-producer [THIS SKILL]
    → image-batch-runner

script-generator owns: hook logic, script flows, angle strategy, persona checks
This skill owns: vibe-to-prompt translation, slide manifest management, localhost review, batch orchestration, text compositing
image-batch-runner owns: actual image generation calls, downloaded assets, local manifests

What This Skill Is For

turning a user's vibe description into a concrete slide-by-slide plan
writing concise, effective image prompts (core scene + vibe, not over-constrained)
producing a local slide manifest JSON for review and iteration
optionally launching a localhost GUI for drag-and-drop slide editing
orchestrating batch image generation through image-batch-runner
compositing TikTok-style overlay text (white with black stroke) onto final slides

What This Skill Is Not For

writing hook variants or script flows from scratch (that is script-generator's job)
calling image generation APIs directly (that is image-batch-runner's job)
creating persona packs or brand voice documents
replacing creative-qa for final human quality review

Model Defaults

| Parameter | Default | Notes |
| --- | --- | --- |
| text model | `image-gpt-image-2-text` | Use only when the slide has no reference image |
| edit model | `image-gpt-image-2-edit` | Required when the slide has any reference image |
| quality | `medium` | Faster and cheaper; sufficient for UGC slideshows |
| resolution | `1k` | Good for 1080×1920 output; use `2k` for fine detail |
| aspectRatio | `9:16` (TikTok) or `4:5` (Instagram) | Based on platform |
| outputFormat | `png` | GPT Image 2 always outputs PNG |

Local Dependencies

`python3` with Pillow (`PIL`) is required for `scripts/composite-text.mjs`.

Reference Image Routing Rule

The slide manifest is the routing source of truth.

`imageSource: "local"` means use `localImagePath` as the final slide image. Do not call image generation.

`imageSource: "generate"` with no `referenceImagePaths` or `referenceImageUrls` means `generationMode: "text-to-image"` and must call `generate_image.mjs` with `image-gpt-image-2-text`.

`imageSource: "generate"` with any `referenceImagePaths` or `referenceImageUrls` means `generationMode: "edit"` and must call `edit_image.mjs` with `image-gpt-image-2-edit`.

Local reference files must be uploaded through image-batch-runner's `upload_media.mjs` first. Pass only the returned URLs into `edit_image.mjs` as `inputUrls`.

Never send a slide with reference images through text-to-image. If the manifest has references but `generationMode` is missing or set to `text-to-image`, fix the manifest before generation.
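The routing rule can be read as a small decision function. The sketch below is illustrative only: the field names come from the manifest description, while `routeSlide` and its `{ action, script, model }` return shape are hypothetical and not part of the skill's scripts.

```javascript
// Sketch of the routing rule. `routeSlide` and its return shape are
// hypothetical; only the manifest field names come from this document.
function routeSlide(slide) {
  const refs = [
    ...(slide.referenceImagePaths ?? []),
    ...(slide.referenceImageUrls ?? []),
  ];

  if (slide.imageSource === "local") {
    // Local slides are used as-is; image generation is never called.
    return { action: "copy", path: slide.localImagePath };
  }

  if (slide.imageSource !== "generate") {
    throw new Error(`Unknown imageSource: ${slide.imageSource}`);
  }

  if (refs.length > 0) {
    // References present: the manifest must already say "edit".
    if (slide.generationMode !== "edit") {
      throw new Error('Slide has reference images; set generationMode to "edit" first');
    }
    return { action: "generate", script: "edit_image.mjs", model: "image-gpt-image-2-edit" };
  }

  // No references: plain text-to-image.
  return { action: "generate", script: "generate_image.mjs", model: "image-gpt-image-2-text" };
}
```

Throwing on a mismatched `generationMode`, rather than silently switching to edit mode, mirrors the instruction to fix the manifest before generation.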

Prompt Writing Principles

GPT Image 2 needs clear scene and vibe description, not a wall of constraints.

Do: core scene + vibe

Describe what the image should show and how it should feel. 2-3 sentences max.

Good:

A person sitting at a cluttered home desk, frustrated expression, looking at a laptop screen.
Natural window light from the side, casual iPhone photo feel, slightly warm color cast.

Don't: over-constrain

Do not pile on "no studio lighting, no professional photography, no cinematic quality, no soft glow, no perfect skin, no...". GPT Image 2 doesn't need negative constraints; they clog the prompt and can confuse the model.

If the vibe is "casual iPhone photo", just say that. The model understands.

When to add specifics

Product visible: describe it clearly (color, shape, placement)
Environment matters: room type, lighting source, time of day
Human expression: the specific emotion or action
Props: what else is in frame

When in doubt, be more specific about what IS there, not what ISN'T.
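As a rough illustration of these principles, a prompt could be linted before it goes into the manifest. `lintPrompt` is a hypothetical helper, not one of the skill's scripts, and its heuristics (counting "no <word>" phrases, splitting sentences on terminal punctuation) are assumptions.

```javascript
// Hypothetical lint helper: flags prompts that lean on negative constraints
// or run longer than the 2-3 sentence guideline.
function lintPrompt(prompt) {
  const warnings = [];

  // Count "no <word>" phrases such as "no studio lighting".
  const negatives = prompt.match(/\bno\s+[a-z]+/gi) ?? [];
  if (negatives.length > 0) {
    warnings.push(`${negatives.length} negative constraint(s): describe what IS in frame instead`);
  }

  // Rough sentence count: split on ., ! or ? followed by whitespace or end of string.
  const sentences = prompt
    .split(/[.!?]+(?:\s+|$)/)
    .filter((s) => s.trim().length > 0);
  if (sentences.length > 3) {
    warnings.push(`${sentences.length} sentences: keep it to 2-3`);
  }

  return warnings;
}
```

The example prompt from the "Good" case above produces no warnings, while a prompt that stacks "no ..." clauses is flagged.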

Platform Modes

TikTok Slideshow (default)

Aspect ratio: 9:16 (1080×1920)
Safe zones: avoid top ~60px and bottom ~80px for UI overlays
Overlay text: upper-center, bold white with black stroke
Default slide count: 5-7

Instagram Carousel

Aspect ratio: 4:5 (1080×1350) preferred; 1:1 (1080×1080) alternative
Most text in the caption field; minimal image overlay
Default slide count: 4-6

Default to TikTok when the user hasn't specified.

Read detailed specs in `references/platform-specs.md`.
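The two platform modes can be captured as a lookup with TikTok as the fallback. This is a sketch: the numbers are copied from the notes above, but the `PLATFORMS` object layout and `resolvePlatform` are hypothetical, and `references/platform-specs.md` remains the authoritative source.

```javascript
// Values copied from the platform notes; the object layout is an assumption.
const PLATFORMS = {
  tiktok: {
    aspectRatio: "9:16",
    width: 1080,
    height: 1920,
    defaultSlides: [5, 7],            // default slide count range
    safeZone: { top: 60, bottom: 80 } // approximate px reserved for UI overlays
  },
  instagram: {
    aspectRatio: "4:5",               // 1:1 (1080x1080) is the documented alternative
    width: 1080,
    height: 1350,
    defaultSlides: [4, 6],
    safeZone: null                    // no safe zone is stated for Instagram
  },
};

function resolvePlatform(name) {
  // Default to TikTok when the user hasn't specified a platform.
  const key = (name ?? "tiktok").toLowerCase();
  const platform = PLATFORMS[key];
  if (!platform) throw new Error(`Unsupported platform: ${name}`);
  return { name: key, ...platform };
}
```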

Default Workflow

Phase 1: Requirements

Ask the user:

How many slideshows? → determines sub-agent count
TikTok or Instagram? → default TikTok
Describe the vibe: what should this feel like? Keep it loose.
Any local images to use for specific slides? → identify whether each path is a final slide image (`imageSource: "local"`) or a reference for generation (`generationMode: "edit"`).

Do not jump to scripts until you understand the feeling they want.

Phase 2: Script Creation

For each slideshow, produce a slide manifest JSON.

Each slide has:

`position` (1-indexed)
`cognitiveJob` (hook, problem, insight, proof, product, cta, etc.)
`prompt` (concise, following prompt writing principles above)
`overlayText` (3-8 words, lowercase, conversational; null if no text)
`imageSource` (`"generate"` or `"local"`)
`localImagePath` (only if `imageSource` is `"local"`)
`generationMode` (`"text-to-image"`, `"edit"`, or null)
`referenceImagePaths` (local reference files for generated edit-mode slides)
`referenceImageUrls` (uploaded reference URLs for generated edit-mode slides)

Save as a local manifest file.

Read the full schema in `references/slide-manifest-schema.md`.

After producing the manifest, ask: "Need a localhost preview to review and edit? (y/n)"

If yes → start the GUI server, user reviews, drags to reorder, edits prompts and overlay text. If no → show the JSON summary, user confirms by text.

Before asking for generation approval, run:

```bash
node ${CLAUDE_SKILL_DIR}/scripts/validate-manifest.mjs --manifest /path/to/slideshow-manifest.json
```
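For orientation, the kind of per-slide check such a validation pass might perform is sketched below. This is not the contents of `validate-manifest.mjs`; `checkSlide` is a hypothetical function that only mirrors the field rules listed in this phase and the reference image routing rule.

```javascript
// Not the real validate-manifest.mjs: a hypothetical per-slide check that
// mirrors the documented field rules. Returns a list of readable problems.
function checkSlide(slide) {
  const problems = [];
  const refs = [
    ...(slide.referenceImagePaths ?? []),
    ...(slide.referenceImageUrls ?? []),
  ];

  if (!Number.isInteger(slide.position) || slide.position < 1) {
    problems.push("position must be a 1-indexed integer");
  }
  if (slide.imageSource !== "generate" && slide.imageSource !== "local") {
    problems.push('imageSource must be "generate" or "local"');
  }
  if (slide.imageSource === "local" && !slide.localImagePath) {
    problems.push("local slides need localImagePath");
  }
  if (slide.imageSource === "generate") {
    // Any reference image forces edit mode; none means text-to-image.
    const expected = refs.length > 0 ? "edit" : "text-to-image";
    if (slide.generationMode !== expected) {
      problems.push(`generationMode should be "${expected}"`);
    }
  }
  if (slide.overlayText !== null && slide.overlayText !== undefined) {
    const words = slide.overlayText.trim().split(/\s+/).length;
    if (words < 3 || words > 8) problems.push("overlayText should be 3-8 words");
  }
  return problems;
}
```

An empty array means the slide passes these particular checks; the real script may enforce more than is shown here.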

Phase 3: Generation Trigger

After user approves the manifest, explicitly ask:

"Ready to generate N slides? GPT Image 2, quality: medium, resolution: 1k. X text-to-image, Y reference edits."

Do not auto-generate. User must confirm.
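The counts in the confirmation prompt can be derived from the manifest. In this sketch `buildConfirmation` is a hypothetical helper, and it assumes that N counts only slides with `imageSource: "generate"` (local slides need no generation).

```javascript
// Hypothetical helper that fills in N, X and Y for the confirmation prompt.
// Assumption: local slides are excluded from N because they are not generated.
function buildConfirmation(slides) {
  const generated = slides.filter((s) => s.imageSource === "generate");
  const edits = generated.filter((s) => s.generationMode === "edit").length;
  const textToImage = generated.length - edits;
  return (
    `Ready to generate ${generated.length} slides? ` +
    "GPT Image 2, quality: medium, resolution: 1k. " +
    `${textToImage} text-to-image, ${edits} reference edits.`
  );
}
```

The helper only builds the question; generation still waits for the user's explicit confirmation.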

Phase 4: Batch Generation

Local slides: copy the image into the output directory as-is.

Generated slides without references: call image-batch-runner's `generate_image.mjs` per slide.

Generated slides with references: upload each local reference path with `upload_media.mjs`, then call `edit_image.mjs` with the uploaded URLs. Existing `referenceImageUrls` can be used directly if they are already HTTP(S) URLs.

Save the normalized request JSON per slide so the run is reproducible.

Default text-to-image request shape per slide:

```json
{
  "assetId": "{slideshowId}-slide-{N}",
  "runId": "{slideshowId}-run-{timestamp}",
  "provider": "hosted-media",
  "model": "image-gpt-image-2-text",
  "mode": "text-to-image",
  "prompt": "{the slide prompt}",
  "aspectRatio": "9:16",
  "quality": "medium",
  "resolution": "1k",
  "outputFormat": "png",
  "localAssetDir": "{output directory}"
}
```

Default reference edit request shape per slide:

```json
{
  "assetId": "{slideshowId}-slide-{N}",
  "runId": "{slideshowId}-run-{timestamp}",
  "provider": "hosted-media",
  "model": "image-gpt-image-2-edit",
  "mode": "edit",
  "prompt": "{the slide prompt, explicitly preserving the reference image facts}",
  "inputUrls": ["{uploaded reference image URL}"],
  "aspectRatio": "9:16",
  "quality": "medium",
  "resolution": "1k",
  "outputFormat": "png",
  "localAssetDir": "{output directory}"
}
```
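Both request shapes can be produced from one slide entry. The sketch below is an illustration, not the skill's code: `buildRequest` and its options object are hypothetical, and it assumes `{N}` is the slide's `position` and that local references have already been uploaded so `referenceImageUrls` holds the returned URLs.

```javascript
// Hypothetical builder for the two request shapes shown above.
// Assumptions: {N} is slide.position; references are already uploaded URLs.
function buildRequest(slide, { slideshowId, timestamp, aspectRatio = "9:16", outputDir }) {
  const urls = slide.referenceImageUrls ?? [];
  const isEdit = slide.generationMode === "edit";

  if (isEdit && urls.length === 0) {
    throw new Error("Upload local references first; edit requests need inputUrls");
  }

  return {
    assetId: `${slideshowId}-slide-${slide.position}`,
    runId: `${slideshowId}-run-${timestamp}`,
    provider: "hosted-media",
    model: isEdit ? "image-gpt-image-2-edit" : "image-gpt-image-2-text",
    mode: isEdit ? "edit" : "text-to-image",
    prompt: slide.prompt,
    // inputUrls appears only on edit requests.
    ...(isEdit ? { inputUrls: urls } : {}),
    aspectRatio,
    quality: "medium",
    resolution: "1k",
    outputFormat: "png",
    localAssetDir: outputDir,
  };
}
```

Serializing the returned object with `JSON.stringify` gives the normalized request that is saved per slide for reproducibility.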

Phase 5: Text Overlay + Final Review

After all images are generated, run `composite-text.mjs` to add overlay text to each slide.

Overlay style (TikTok default):

White text (#FFFFFF), black stroke (#000000, 3px)
Font: bold sans-serif (Helvetica Bold or system equivalent)
Position: upper-center
Size: ~56px relative to 1080px canvas width

Show the final results to the user.
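To make the "relative to 1080px canvas width" sizing concrete, the overlay style can be expressed as a function of canvas size. This is a sketch under stated assumptions, not what `composite-text.mjs` does: `overlayLayout` is hypothetical, scaling the stroke along with the font is an assumption, and the vertical anchor at 18% of the height is an illustrative guess at "upper-center".

```javascript
// Hypothetical geometry helper. Colors and base sizes come from the style list;
// proportional scaling and the 18% vertical anchor are assumptions.
function overlayLayout(canvasWidth, canvasHeight) {
  const scale = canvasWidth / 1080;
  return {
    fill: "#FFFFFF",
    stroke: "#000000",
    strokeWidth: Math.round(3 * scale), // 3px at 1080px wide
    fontSize: Math.round(56 * scale),   // ~56px at 1080px wide
    x: Math.round(canvasWidth / 2),     // horizontal center
    y: Math.round(canvasHeight * 0.18), // assumed "upper-center" anchor
  };
}
```

For a 1080×1920 TikTok canvas the assumed anchor sits well below the ~60px top safe zone.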

Sub-Agent Orchestration

When N slideshows are requested:

N ≤ 2: the main agent handles them sequentially
N ≥ 3: spawn N sub-agents in parallel for Phase 2 script creation

Each sub-agent produces one slide manifest JSON. The main agent collects and presents them together.
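The threshold rule above amounts to a two-branch decision. `planOrchestration` below is a hypothetical helper that only encodes the rule; how sub-agents are actually spawned is outside this sketch.

```javascript
// Hypothetical planner encoding the N <= 2 / N >= 3 rule for Phase 2.
function planOrchestration(slideshowCount) {
  if (!Number.isInteger(slideshowCount) || slideshowCount < 1) {
    throw new Error("slideshowCount must be a positive integer");
  }
  if (slideshowCount <= 2) {
    // The main agent writes each manifest itself, one after another.
    return { mode: "sequential", subAgents: 0 };
  }
  // One sub-agent per slideshow, all running in parallel.
  return { mode: "parallel", subAgents: slideshowCount };
}
```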

GUI (Optional)

A minimal localhost review tool at `gui/index.html`. Start with `node gui/server.mjs` (default port 3099).

The GUI lets the user:

See all slides in order
Drag to reorder
Edit prompt and overlay text inline
Toggle image source (generate vs local)
Save changes back to the manifest JSON

The GUI is a preview tool, not a render trigger. Generation is separate (Phase 3).

Failure Modes

No vibe description → ask again with more specific prompts
Single slide → minimum is 2 for a slideshow
Local image not found → flag before generation, not during
User wants to skip review → allow but warn
image-batch-runner unavailable → stop, report the gap
Python Pillow unavailable → stop before compositing and install Pillow in the active `python3` environment
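Several of these failure modes can be caught in one pass before generation starts. The sketch below is illustrative: `preflight` and the `env` object (`fileExists`, `runnerAvailable`, `pillowAvailable`) are hypothetical, with the environment probes injected by the caller so the function itself touches neither the filesystem nor Python.

```javascript
// Hypothetical preflight check. `env` is supplied by the caller:
//   fileExists(path) -> boolean, runnerAvailable: boolean, pillowAvailable: boolean
function preflight(slides, env) {
  const blockers = [];

  if (slides.length < 2) {
    blockers.push("a slideshow needs at least 2 slides");
  }
  for (const slide of slides) {
    // Flag missing local images before generation, not during.
    if (slide.imageSource === "local" && !env.fileExists(slide.localImagePath)) {
      blockers.push(`slide ${slide.position}: local image not found: ${slide.localImagePath}`);
    }
  }
  if (!env.runnerAvailable) {
    blockers.push("image-batch-runner unavailable");
  }
  if (!env.pillowAvailable) {
    blockers.push("Pillow is missing from the active python3 environment");
  }
  return blockers;
}
```

An empty result means none of these blockers were detected; the softer cases (no vibe description, skipped review) still need a conversation with the user.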

Shared Source Context

When campaign-level source files exist, use them as context:

`brand.md` for tone and forbidden phrases
`persona.md` for visual character constraints (only when slides feature the persona)
`product.md` for accurate product appearance

This skill may read these files. It does not create or manage them.

Handoff

After Phase 5, suggest next steps:

`creative-qa` for human quality review
`social-media-publisher` for publishing
