slideshow-producer
Slideshow Producer
Follow shared public skill rules in:
- `postplus-shared` public skill rules
Pipeline Position
script-generator (slideshow mode)
→ slideshow-producer [THIS SKILL]
→ image-batch-runner
- script-generator owns: hook logic, script flows, angle strategy, persona checks
- This skill owns: vibe-to-prompt translation, slide manifest management, localhost review, batch orchestration, text compositing
- image-batch-runner owns: actual image generation calls, downloaded assets, local manifests
What This Skill Is For
- turning a user's vibe description into a concrete slide-by-slide plan
- writing concise, effective image prompts (core scene + vibe, not over-constrained)
- producing a local slide manifest JSON for review and iteration
- optionally launching a localhost GUI for drag-and-drop slide editing
- orchestrating batch image generation through image-batch-runner
- compositing TikTok-style overlay text (white with black stroke) onto final slides
What This Skill Is Not For
- writing hook variants or script flows from scratch (that is script-generator's job)
- calling image generation APIs directly (that is image-batch-runner's job)
- creating persona packs or brand voice documents
- replacing creative-qa for final human quality review
Model Defaults
| Param | Default | Notes |
|---|---|---|
| text model | `image-gpt-image-2-text` | Use only when the slide has no reference image |
| edit model | `image-gpt-image-2-edit` | Required when the slide has any reference image |
| quality | `medium` | Faster and cheaper; sufficient for UGC slideshows |
| resolution | `1k` | Good for 1080×1920 output; use `2k` for fine detail |
| aspectRatio | `9:16` | Depends on platform; `9:16` is the TikTok default |
| outputFormat | `png` | GPT Image 2 always outputs PNG |
Local Dependencies
- `python3` with Pillow (`PIL`) is required for `scripts/composite-text.mjs`.
Reference Image Routing Rule
The slide manifest is the routing source of truth.
- `imageSource: "local"` means use `localImagePath` as the final slide image. Do not call image generation.
- `imageSource: "generate"` with no `referenceImagePaths` or `referenceImageUrls` means `generationMode: "text-to-image"` and must call `generate_image.mjs` with `image-gpt-image-2-text`.
- `imageSource: "generate"` with any `referenceImagePaths` or `referenceImageUrls` means `generationMode: "edit"` and must call `edit_image.mjs` with `image-gpt-image-2-edit`.
- Local reference files must be uploaded through image-batch-runner's `upload_media.mjs` first. Pass only the returned URLs into `edit_image.mjs` as `inputUrls`.
Never send a slide with reference images through text-to-image. If the manifest has references but `generationMode` is missing or set to `text-to-image`, fix the manifest before generation.
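The routing rule can be read as a small pure function over one manifest slide. The following is a minimal sketch, not the skill's actual implementation: the function name `resolveRoute` and the shape of the returned object are assumptions, while the field names, script names, and model IDs are the ones named in the rule.

```javascript
// Hypothetical helper: decide how one manifest slide should be produced.
// Field names, script names, and model IDs come from the routing rule;
// the function name and return shape are assumptions.
function resolveRoute(slide) {
  if (slide.imageSource === "local") {
    // The local file is the final slide image; no generation call.
    return { action: "copy", path: slide.localImagePath };
  }
  if (slide.imageSource !== "generate") {
    throw new Error(`unknown imageSource: ${slide.imageSource}`);
  }
  const referenceCount =
    (slide.referenceImagePaths ?? []).length +
    (slide.referenceImageUrls ?? []).length;
  if (referenceCount === 0) {
    return {
      action: "generate",
      generationMode: "text-to-image",
      script: "generate_image.mjs",
      model: "image-gpt-image-2-text",
    };
  }
  // Any reference forces edit mode. Flag manifests that disagree so they
  // can be fixed before generation starts.
  return {
    action: "generate",
    generationMode: "edit",
    script: "edit_image.mjs",
    model: "image-gpt-image-2-edit",
    manifestNeedsFix: slide.generationMode !== "edit",
  };
}
```

A slide that carries references but still says `generationMode: "text-to-image"` comes back with `manifestNeedsFix: true`, which is the case the rule says to correct before generation.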
Prompt Writing Principles
GPT Image 2 needs a clear scene and vibe description, not a wall of constraints.
Do: core scene + vibe
Describe what the image should show and how it should feel. 2-3 sentences max.
Good:
A person sitting at a cluttered home desk, frustrated expression, looking at a laptop screen.
Natural window light from the side, casual iphone photo feel, slightly warm color cast.
Don't: over-constrain
Do not pile on "no studio lighting, no professional photography, no cinematic quality, no soft glow, no perfect skin, no..." — GPT Image 2 doesn't need negative constraints. They clog the prompt and can confuse the model.
If the vibe is "casual iphone photo", just say that. The model understands.
When to add specifics
- Product visible: describe it clearly (color, shape, placement)
- Environment matters: room type, lighting source, time of day
- Human expression: the specific emotion or action
- Props: what else is in frame
When in doubt, be more specific about what IS there, not what ISN'T.
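One way to apply these principles during manifest review is a quick lint pass over each prompt. The sketch below is illustrative only: both function names and the word list in the regular expression are assumptions, and a keyword match is a rough heuristic, not a judgment of prompt quality.

```javascript
// Hypothetical lint helpers for slide prompts.
// Flags "no X" style negative constraints such as "no studio lighting".
function findNegativeConstraints(prompt) {
  const pattern = /\b(?:no|not|without|never|avoid)\s+[a-z][a-z-]*/gi;
  return prompt.match(pattern) ?? [];
}

// Counts sentences so prompts can be held to the 2-3 sentence guideline.
function countSentences(prompt) {
  return prompt.split(/[.!?]+/).filter((part) => part.trim().length > 0).length;
}
```

For the "Good" example above, `findNegativeConstraints` returns an empty array and `countSentences` returns 2; a prompt that appends "No studio lighting, no perfect skin." would be flagged twice.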
Platform Modes
TikTok Slideshow (default)
- Aspect ratio: 9:16 (1080×1920)
- Safe zones: avoid top ~60px and bottom ~80px for UI overlays
- Overlay text: upper-center, bold white with black stroke
- Default slide count: 5-7
Instagram Carousel
- Aspect ratio: 4:5 (1080×1350) preferred; 1:1 (1080×1080) alternative
- Most text in the caption field; minimal image overlay
- Default slide count: 4-6
Default to TikTok when the user hasn't specified.
Read detailed specs in `references/platform-specs.md`.
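The two platform modes can be captured as a lookup table with TikTok as the fallback. This is a sketch under assumptions: the values are the ones listed above, but the object layout, the property names, and the `platformSpec` helper are hypothetical and are not taken from `references/platform-specs.md`.

```javascript
// Values are taken from the platform notes above; the object layout and
// the helper name are illustrative assumptions.
const PLATFORM_SPECS = {
  tiktok: {
    aspectRatio: "9:16",
    width: 1080,
    height: 1920,
    safeTopPx: 60, // approximate UI overlay zone at the top
    safeBottomPx: 80, // approximate UI overlay zone at the bottom
    slideCount: { min: 5, max: 7 },
  },
  instagram: {
    aspectRatio: "4:5", // 1:1 (1080×1080) is the alternative
    width: 1080,
    height: 1350,
    slideCount: { min: 4, max: 6 },
  },
};

// Defaults to TikTok when no platform is given.
function platformSpec(platform = "tiktok") {
  if (!Object.prototype.hasOwnProperty.call(PLATFORM_SPECS, platform)) {
    throw new Error(`unsupported platform: ${platform}`);
  }
  return PLATFORM_SPECS[platform];
}
```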
Default Workflow
Phase 1: Requirements
Ask the user:
- How many slideshows? → determines sub-agent count
- TikTok or Instagram? → default TT
- Describe the vibe — what should this feel like? Keep it loose.
- Any local images to use for specific slides? → identify whether each path is a final slide image (`imageSource: "local"`) or a reference for generation (`generationMode: "edit"`).
Do not jump to scripts until you understand the feeling they want.
Phase 2: Script Creation
For each slideshow, produce a slide manifest JSON.
Each slide has:
- `position` (1-indexed)
- `cognitiveJob` (hook, problem, insight, proof, product, cta, etc.)
- `prompt` (concise, following prompt writing principles above)
- `overlayText` (3-8 words, lowercase, conversational, or null if no text)
- `imageSource` (`"generate"` or `"local"`)
- `localImagePath` (only if imageSource is "local")
- `generationMode` (`"text-to-image"`, `"edit"`, or null)
- `referenceImagePaths` (local reference files for generated edit-mode slides)
- `referenceImageUrls` (uploaded reference URLs for generated edit-mode slides)
Save as a local manifest file.
Read the full schema in `references/slide-manifest-schema.md`.
After producing the manifest, ask: "Need a localhost preview to review and edit? (y/n)"
If yes → start the GUI server, user reviews, drags to reorder, edits prompts and overlay text.
If no → show the JSON summary, user confirms by text.
Before asking for generation approval, run:
```bash
node ${CLAUDE_SKILL_DIR}/scripts/validate-manifest.mjs --manifest /path/to/slideshow-manifest.json
```
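To make the slide fields concrete, here is one example slide entry together with a few of the checks implied by the field descriptions. It is a sketch only: the example values are made up, `checkSlides` is a hypothetical helper, and the real `validate-manifest.mjs` may check more or different things.

```javascript
// Example slide entry using the fields listed above (values are made up).
const exampleSlide = {
  position: 1,
  cognitiveJob: "hook",
  prompt:
    "A person sitting at a cluttered home desk, frustrated expression, " +
    "looking at a laptop screen. Natural window light, casual iphone photo feel.",
  overlayText: "my desk was ruining my focus",
  imageSource: "generate",
  localImagePath: null,
  generationMode: "text-to-image",
  referenceImagePaths: [],
  referenceImageUrls: [],
};

// Hypothetical checks derived from the field descriptions. Returns a list
// of human-readable problems; an empty list means nothing was flagged.
function checkSlides(slides) {
  const problems = [];
  slides.forEach((slide, index) => {
    const expected = index + 1; // positions are 1-indexed
    if (slide.position !== expected) {
      problems.push(`slide ${expected}: position should be ${expected}`);
    }
    if (slide.overlayText !== null) {
      const words = slide.overlayText.trim().split(/\s+/).length;
      if (words < 3 || words > 8) {
        problems.push(`slide ${expected}: overlayText should be 3-8 words`);
      }
      if (slide.overlayText !== slide.overlayText.toLowerCase()) {
        problems.push(`slide ${expected}: overlayText should be lowercase`);
      }
    }
    if (slide.imageSource === "local" && !slide.localImagePath) {
      problems.push(`slide ${expected}: local slide needs localImagePath`);
    }
  });
  return problems;
}
```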
Phase 3: Generation Trigger
After user approves the manifest, explicitly ask:
"Ready to generate N slides? GPT Image 2, quality: medium, resolution: 1k. X text-to-image, Y reference edits."
Do not auto-generate. User must confirm.
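The confirmation message can be filled in from the manifest. A minimal sketch, assuming N counts only slides with `imageSource: "generate"` (local slides are copied, not generated); the function name is hypothetical.

```javascript
// Hypothetical helper: build the Phase 3 confirmation question.
// Assumption: N counts generated slides only, since local slides are copied.
function generationPrompt(slides, quality = "medium", resolution = "1k") {
  const generated = slides.filter((slide) => slide.imageSource === "generate");
  const edits = generated.filter((slide) => slide.generationMode === "edit").length;
  const textToImage = generated.length - edits;
  return (
    `Ready to generate ${generated.length} slides? GPT Image 2, ` +
    `quality: ${quality}, resolution: ${resolution}. ` +
    `${textToImage} text-to-image, ${edits} reference edits.`
  );
}
```

The function only formats the question; generation still waits for the user's explicit confirmation.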
Phase 4: Batch Generation
Local slides: copy the image into the output directory as-is.
Generated slides without references: call image-batch-runner's `generate_image.mjs` per slide.
Generated slides with references: upload each local reference path with `upload_media.mjs`, then call `edit_image.mjs` with the uploaded URLs. Existing `referenceImageUrls` can be used directly if they are already HTTP(S) URLs.
Save the normalized request JSON per slide so the run is reproducible.
Default text-to-image request shape per slide:
```json
{
  "assetId": "{slideshowId}-slide-{N}",
  "runId": "{slideshowId}-run-{timestamp}",
  "provider": "hosted-media",
  "model": "image-gpt-image-2-text",
  "mode": "text-to-image",
  "prompt": "{the slide prompt}",
  "aspectRatio": "9:16",
  "quality": "medium",
  "resolution": "1k",
  "outputFormat": "png",
  "localAssetDir": "{output directory}"
}
```
Default reference edit request shape per slide:
```json
{
  "assetId": "{slideshowId}-slide-{N}",
  "runId": "{slideshowId}-run-{timestamp}",
  "provider": "hosted-media",
  "model": "image-gpt-image-2-edit",
  "mode": "edit",
  "prompt": "{the slide prompt, explicitly preserving the reference image facts}",
  "inputUrls": ["{uploaded reference image URL}"],
  "aspectRatio": "9:16",
  "quality": "medium",
  "resolution": "1k",
  "outputFormat": "png",
  "localAssetDir": "{output directory}"
}
```
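Both request shapes can be produced by one builder that switches on whether uploaded reference URLs are present. This is a sketch, not image-batch-runner's code: `buildRequest` and its options object are hypothetical, and it assumes `{N}` in `assetId` is the slide's `position`.

```javascript
// Hypothetical builder for the per-slide request JSON shown above.
// Assumption: {N} in assetId is the slide's 1-indexed position.
function buildRequest(slide, options) {
  const {
    slideshowId,
    timestamp,
    outputDir,
    inputUrls = [], // uploaded reference URLs; empty means text-to-image
    aspectRatio = "9:16",
  } = options;
  const isEdit = inputUrls.length > 0;
  const request = {
    assetId: `${slideshowId}-slide-${slide.position}`,
    runId: `${slideshowId}-run-${timestamp}`,
    provider: "hosted-media",
    model: isEdit ? "image-gpt-image-2-edit" : "image-gpt-image-2-text",
    mode: isEdit ? "edit" : "text-to-image",
    prompt: slide.prompt,
    aspectRatio,
    quality: "medium",
    resolution: "1k",
    outputFormat: "png",
    localAssetDir: outputDir,
  };
  if (isEdit) {
    request.inputUrls = inputUrls;
  }
  return request;
}
```

Serializing the returned object with `JSON.stringify(request, null, 2)` and saving it next to the asset is one way to keep the normalized request per slide.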
Phase 5: Text Overlay + Final Review
After all images are generated, run `composite-text.mjs` to add overlay text to each slide.
Overlay style (TikTok default):
- White text (#FFFFFF), black stroke (#000000, 3px)
- Font: bold sans-serif (Helvetica Bold or system equivalent)
- Position: upper-center
- Size: ~56px relative to 1080px canvas width
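The style values are given relative to a 1080px-wide canvas, so other widths need scaling. The sketch below assumes font size and stroke width scale linearly with canvas width; that interpretation, the function name, and the returned property names are assumptions, not the behavior of `composite-text.mjs`.

```javascript
// Hypothetical helper: scale the 1080px-reference overlay style to a canvas.
// Assumption: font size and stroke width scale linearly with canvas width.
function overlayStyle(canvasWidth) {
  const scale = canvasWidth / 1080;
  return {
    fill: "#FFFFFF",
    stroke: "#000000",
    strokeWidth: Math.max(1, Math.round(3 * scale)),
    fontSize: Math.round(56 * scale),
    centerX: Math.round(canvasWidth / 2), // text is horizontally centered
  };
}
```

At the default 1080px width this returns the listed values unchanged (56px text, 3px stroke).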
Show the final results to the user.
Sub-Agent Orchestration
When N slideshows are requested:
- N ≤ 2: the main agent handles them sequentially
- N ≥ 3: spawn N sub-agents in parallel for Phase 2 script creation
Each sub-agent produces one slide manifest JSON. The main agent collects and presents them together.
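The threshold above reduces to a one-line decision. A minimal sketch with a hypothetical function name and return shape:

```javascript
// Hypothetical helper: choose how Phase 2 script creation is run
// for n requested slideshows.
function planScriptCreation(n) {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError("slideshow count must be a positive integer");
  }
  return n <= 2
    ? { mode: "sequential", subAgents: 0 } // main agent handles them in turn
    : { mode: "parallel", subAgents: n }; // one sub-agent per slideshow
}
```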
GUI (Optional)
A minimal localhost review tool at `gui/index.html`.
Start with `node gui/server.mjs` (default port 3099).
The GUI lets the user:
- See all slides in order
- Drag to reorder
- Edit prompt and overlay text inline
- Toggle image source (generate vs local)
- Save changes back to the manifest JSON
The GUI is a preview tool, not a render trigger. Generation is separate (Phase 3).
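A drag-to-reorder action has to keep `position` consistent with the new order before the manifest is saved. The following is a sketch of that step only; `reorderSlides` is a hypothetical name and nothing here reflects how `gui/server.mjs` is actually written.

```javascript
// Hypothetical helper: move one slide and renumber positions from 1.
// Returns a new array; the input array and its slides are not mutated.
function reorderSlides(slides, fromIndex, toIndex) {
  const inRange = (i) => Number.isInteger(i) && i >= 0 && i < slides.length;
  if (!inRange(fromIndex) || !inRange(toIndex)) {
    throw new RangeError("index out of range");
  }
  const next = slides.slice();
  const [moved] = next.splice(fromIndex, 1);
  next.splice(toIndex, 0, moved);
  return next.map((slide, index) => ({ ...slide, position: index + 1 }));
}
```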
Failure Modes
- No vibe description → ask again with more specific prompts
- Single slide → minimum is 2 for a slideshow
- Local image not found → flag before generation, not during
- User wants to skip review → allow but warn
- image-batch-runner unavailable → stop, report the gap
- Python Pillow unavailable → stop before compositing and install Pillow in the active `python3` environment
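Several of these failure modes can be caught in one preflight pass before any generation call. The sketch below assumes the manifest exposes a top-level `slides` array and that the caller supplies an `env` object describing the environment; both shapes and the function name are hypothetical.

```javascript
// Hypothetical preflight: collect blockers before generation starts.
// Assumptions: manifest.slides is the slide array; env is supplied by the
// caller as { fileExists(path), runnerAvailable, pillowAvailable }.
function preflight(manifest, env) {
  const blockers = [];
  if (manifest.slides.length < 2) {
    blockers.push("a slideshow needs at least 2 slides");
  }
  for (const slide of manifest.slides) {
    if (slide.imageSource === "local" && !env.fileExists(slide.localImagePath)) {
      blockers.push(`slide ${slide.position}: local image not found: ${slide.localImagePath}`);
    }
  }
  if (!env.runnerAvailable) {
    blockers.push("image-batch-runner unavailable");
  }
  if (!env.pillowAvailable) {
    blockers.push("Pillow unavailable: install it before compositing");
  }
  return blockers;
}
```

Passing `fileExists` in as a function keeps the check testable; in a Node script it could be backed by `fs.existsSync`.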
Shared Source Context
When campaign-level source files exist, use them as context:
- `brand.md` for tone and forbidden phrases
- `persona.md` for visual character constraints (only when slides feature the persona)
- `product.md` for accurate product appearance
This skill may read these files. It does not create or manage them.
Handoff
After Phase 5, suggest next steps:
- `creative-qa` for human quality review
- `social-media-publisher` for publishing