app-sizzle

App Sizzle — GPT-Image-2 Enhanced iOS App Teaser


Generate a polished 15-second app teaser from real app screens. Each selected screen is passed through GPT-image-2 before Seedance so compressed captures become cleaner references without inventing UI.
Generation contract: use `resolution="1080p"`, `duration=15`, and `sound=True`. Skip `fast=true` because it caps Seedance at 720p. The skill owns duration and sound so the user only has to supply app identity, screens, logo, and aspect ratio.
The visual aesthetic is derived from the app's personality — not defaulted to liquid glass. The agent reads the app's soul from its icon, screenshots, and category, then chooses a treatment. The user provides the app identity and assets; the agent decides everything else (mode, prompt, camera, style).


Mode: Reference-to-Video


Primary: `mcp__pika__generate_reference_video(provider="seedance", resolution="1080p")` with 3–5 screenshots + the app icon/logo as the final reference.
Fallback to `provider="kling", quality_mode="pro"` (= 1080p) when:
  • Seedance returns non-audio `partner_validation_failed` (celebrity faces, screen-recording UI)
  • Seedance returns `insufficient_balance`
  • Seedance stays queued/running until it returns a timeout such as `seedance timed out after ...`
Do not treat generated-audio moderation as an immediate Kling fallback. See the Seedance generated-audio moderation recovery runbook in Generate Video first.
Kling prompt uses `<<<image_1>>>` … `<<<image_5>>>` tokens instead of `@Image1` … `@Image5`. Drop the `resolution` param (Kling uses `quality_mode` instead). See Gotchas.
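The fallback triggers above can be sketched as a small classifier. This is a minimal illustration, assuming the Seedance failure is available as a dict with optional `reason` and `msg` keys; the function name and that dict shape are hypothetical and not part of the Pika MCP API.

```python
def should_fall_back_to_kling(error):
    """Return True when a Seedance failure matches one of the listed fallback triggers.

    `error` is a hypothetical dict with optional "reason" and "msg" keys.
    """
    reason = error.get("reason", "")
    msg = error.get("msg", "")

    # Timeout after staying queued/running: switch provider.
    if msg.startswith("seedance timed out after"):
        return True
    # Out of balance on Seedance: switch provider.
    if reason == "insufficient_balance":
        return True
    # partner_validation_failed falls back only when it is NOT the generated-audio case,
    # which has its own recovery runbook.
    if reason == "partner_validation_failed":
        return "audio" not in msg.lower()
    return False
```

Matching on the word "audio" in the message is a simplification; the recovery runbook lists the exact fields that identify the generated-audio case.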


Stage 0 — Asset Sourcing


If invoked with empty args and no relevant prior context, print this menu verbatim and stop. Do not call tools until the user supplies the app identity and screen source.
To make your app promo, I need:

1. App name + one-line description of what it does
   (e.g. "Nova — an AI journaling app for iOS")

2. Where should I pull the app screens from?
   — iOS App Store: give me the App Store URL or app name → I'll use
     `mcp__pika__fetch_appstore_screens` to fetch screenshots, metadata, and icon
   — Web app / website: give me the URL → I'll capture it with Pika MCP
   — Local files / URLs: drop the paths and I'll upload them

3. Brand logo — path or URL (preferred) or skip to use the App Store icon
   The logo anchors the end card and prevents Seedance from hallucinating brand text.
   If you don't have a logo file, use the fetched App Store icon as the fallback.

4. Aspect ratio: 16:9 (landscape/YouTube) / 9:16 (Reels/TikTok) / 1:1
If the trigger message or prior context already supplies part of this, ask only for the missing required fields before touching any tool. These are the only questions the user needs to answer; the agent decides mode, prompt, camera, and style.
Once answered, the agent:
  1. Sources the screens (MCP App Store fetch / website capture / upload local files)
  2. Analyzes each screen (Stage 1) — reads every screenshot, maps UI → feature
  3. Designs the narrative arc (Stage 2) — builds a 15s story structure before touching the prompt
  4. Selects the 3–5 best screens for the promo (ordered by narrative role)
  5. Uploads logo + screens to get public URLs
  6. Writes the screen-specific prompt (Template A or B)
  7. Generates at 1080p


Stage 0.5 — Asset Gate


Before calling any generation tool, verify both assets are in hand:
| Asset | Required | If missing |
|---|---|---|
| Real app screenshots (≥1 actual sourced image) | Yes | Stop and ask for screenshots |
| Brand logo OR app icon | Yes | Use the `mcp__pika__fetch_appstore_screens` icon when App Store sourcing is used; otherwise stop and ask for a logo/icon |
If either is missing, tell the user exactly what's needed and wait. Real assets are what keep the teaser grounded; text-to-video placeholders make Seedance invent UI.
Avoid:
  • Generating with text-to-video as a substitute when screens were expected
  • Describing imaginary UI in the prompt ("a dark dashboard with…") without a real reference image
  • Proceeding with "I'll use a placeholder for now"
  • Making up what the app looks like from its name or description
The only acceptable path forward is real assets from the user. If MCP fetching or capturing failed (App Store returned nothing, website screenshot errored), report what happened and ask the user to provide the screens manually. Never invent them.
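The gate above can be expressed as a pre-flight check that runs before any generation tool is called. This is a sketch under stated assumptions: `check_asset_gate` and its argument names are hypothetical, and the inputs are whatever URLs the sourcing step produced.

```python
def check_asset_gate(screenshot_urls, logo_url=None, appstore_icon_url=None):
    """Return a list of human-readable problems; an empty list means the gate passes."""
    problems = []
    # At least one real, sourced screenshot is required.
    if not screenshot_urls:
        problems.append("Need at least one real app screenshot.")
    # A user-supplied logo is preferred; the fetched App Store icon is the fallback.
    if not (logo_url or appstore_icon_url):
        problems.append("Need a brand logo or app icon.")
    return problems
```

If the returned list is non-empty, the agent would relay those messages to the user and wait instead of generating.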


Screen Sourcing


iOS App Store


Use Pika MCP `mcp__pika__fetch_appstore_screens`; do not use a local scraper. It accepts a full App Store URL, numeric app ID, or app-name search term:
fetch_appstore_screens(
  query: <app_store_url | numeric_app_id | search_term>,
  country: "us",
  max_screens: 10,
  include_icon: true
)
Expected result shape:
{
  "app_url": "https://apps.apple.com/...",
  "metadata": { "name": "...", "subtitle": "...", "description": "...", "category": "...", "icon_url": "https://..." },
  "icon": { "url": "https://cdn.pika.art/...", "source_url": "https://is...mzstatic.com/...", "filename": "appstore-icon.png", "mime_type": "image/png", "width": 1024, "height": 1024 },
  "screenshots": [
    { "url": "https://cdn.pika.art/...", "source_url": "https://is...mzstatic.com/.../1290x2796bb.png", "filename": "appstore-screen-01.png", "mime_type": "image/png", "width": 1290, "height": 2796 }
  ],
  "count": 1
}
If `mcp__pika__fetch_appstore_screens` returns no screenshots, report the error and ask the user for 3-5 real screenshots plus a logo/icon. Do not fall back to Playwright/headless App Store capture and do not invent UI.
After App Store assets are fetched, pick the 3–5 screens that show the core UI. Skip:
  • Pure text/splash screens (no UI)
  • Blank or loading states
  • Screens with faces (may trigger content policy)
Screen selection principle — maximize visual contrast. Each selected screen should look as different as possible from the others: dark vs. light background, UI-dense vs. photo-heavy, micro close-up vs. wide grid, minimal vs. busy. If all your screens look similar, Seedance blends them into a visual mush. What made Dazz Cam work: 3D camera grid + Polaroid output + VHS panels + fisheye orb — four completely distinct visual worlds. What makes teasers fail: four screens of the same UI at slightly different scroll positions.

Web App / Website (auto-capture)


Use Pika MCP's capture tool:
```python
capture_website(url="https://example.com", mode="screenshot")
# Returns image_url; use it directly as a reference
```


Call once per distinct page/view you want to include.


Local Files


User provides paths → upload each via Pika MCP (see Asset Upload section below).


Stage 1 — App Analysis


After sourcing screens, read every screenshot using Claude's vision before writing a single word of prompt. This is the most important step — skip it and you get a generic glass blob with no story.
For each screenshot, record:
  • What UI is shown — e.g. "chat input with suggested prompts", "video timeline with AI edit chips", "agent result card showing a generated clip"
  • What feature it represents — e.g. "creation entry", "agent at work", "output/share"
  • Emotional register — is this the power moment, the ease moment, the aha moment?
Also pull the app metadata from the `mcp__pika__fetch_appstore_screens` result, or from the user-provided description:
  • App name, subtitle, one-line value prop
  • Category and target user
Output of Stage 1: A numbered feature map:
Screen 1 — [filename]: Shows [X UI]. Represents [Y feature]. Moment: [hook/build/reveal].
Screen 2 — [filename]: ...
...
After the map, score each screen for visual uniqueness: does it look completely different from the others you've mapped? Prefer screens with distinct color palettes, distinct layout density, and distinct subject matter. A great set has maximum visual spread — the hook should feel nothing like the build, which should feel nothing like the reveal.
Do NOT proceed to Stage 2 until this map is written out.
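One way to hold the feature map in a structured form is sketched below. The `ScreenEntry` type and the `missing_moments` check are hypothetical; requiring all three moments to be covered is an assumption drawn from the hook/build/reveal labels, not a stated rule of the skill.

```python
from dataclasses import dataclass

REQUIRED_MOMENTS = ("hook", "build", "reveal")


@dataclass
class ScreenEntry:
    filename: str
    ui: str        # what UI is shown
    feature: str   # what feature it represents
    moment: str    # "hook", "build", or "reveal"


def missing_moments(entries):
    """Return the moments that no mapped screen covers yet, in hook/build/reveal order."""
    present = {entry.moment for entry in entries}
    return [moment for moment in REQUIRED_MOMENTS if moment not in present]
```

An empty result suggests the map has enough spread to move on to the narrative arc.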


Stage 2 — Narrative Architecture


Every 15s promo needs a spine. Design the story arc before touching the prompt template.

The 4-beat structure


| Beat | Seconds | Job | Which screen(s) |
|---|---|---|---|
| Hook | 0–3s | Grab attention: show the most dramatic UI moment or the problem being solved | The most visually striking screen |
| Build | 3–10s | Feature walkthrough in logical user-journey order | 2–3 screens in sequence |
| Reveal | 10–13s | Pull-back or product overview: the "so that's what it does" moment | Wide shot or most complete screen |
| Logo | 13–15s | Brand lock: wordmark materializes, accent color pulse | Logo (@Image6 or last ref). `COMING SOON` is added later as a post-generation text overlay. |
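The beat timings can be written down as data and checked for gaps. This is only an illustrative sketch; `BEATS`, `beat_at`, and `is_contiguous` are hypothetical names, and the boundaries are copied from the 4-beat structure.

```python
# (name, start_second, end_second) for the 15s teaser
BEATS = [
    ("Hook", 0, 3),
    ("Build", 3, 10),
    ("Reveal", 10, 13),
    ("Logo", 13, 15),
]


def beat_at(second):
    """Return the beat name covering `second` (start inclusive, end exclusive)."""
    for name, start, end in BEATS:
        if start <= second < end:
            return name
    raise ValueError(f"{second}s is outside the 15s teaser")


def is_contiguous(beats, total=15):
    """True when beats start at 0, end at `total`, and leave no gaps or overlaps."""
    cursor = 0
    for _, start, end in beats:
        if start != cursor or end <= start:
            return False
        cursor = end
    return cursor == total
```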

Story arc types — pick one based on the app


| Arc | When to use | Structure |
|---|---|---|
| Problem → Solution | Productivity/tool apps | Hook = pain point UI → Build = app solves it → Reveal = result |
| Feature Parade | Feature-rich apps | Hook = most impressive feature → Build = 2 more features → Reveal = overview |
| Journey | Consumer/lifestyle apps | Hook = entry point → Build = the experience → Reveal = outcome |
| Transformation | Before/after type apps | Hook = the "before" → Build = the process → Reveal = the "after" |

Output of Stage 2


Write out the arc explicitly before generating:
Arc type: [Problem→Solution / Feature Parade / Journey / Transformation]
Hook (0-3s): Screen [N] — [what happens] — camera: [extreme close-up on X]
Build (3-10s): Screen [N] → [N] → [N] — [what each reveals] — camera: [whip pan / orbital / etc.]
Reveal (10-13s): Screen [N] — [what it shows] — camera: [pull-back to show full product]
Logo (13-15s): @Image[N] — wordmark materializes whole in a burst of [accent color] light and holds. Do not ask the video model to render the `COMING SOON` copy; it is added later as a post-generation text overlay.
Do NOT write the Seedance prompt until this arc is defined.


Stage 2.5 — GPT-Image-2 Enhancement


After the arc is defined and the 3–5 screens are selected, enhance each one with GPT-image-2 before uploading to Seedance. This lifts compressed website captures and App Store thumbnails to a cleaner, higher-fidelity reference.
For each selected screen (including the logo/end card reference):
```python
result = generate_image(
    provider="gpt-image-2",
    prompt="High quality version, preserve all content exactly",
    reference_images=["<original_cdn_url>"],
    aspect_ratio="16:9",   # match the capture; use 9:16 for portrait screens
    quality="medium",
)
# use result.image_url (or result.url) as the Seedance reference
```



**Rules:**
- Keep the prompt exactly as shown — short, non-descriptive. Describing the image content makes GPT-image-2 hallucinate new details.
- Match `aspect_ratio` to the original capture (desktop = 16:9, mobile = 9:16).
- Run all enhancements in parallel (one call per screen).
- Use the enhanced URLs as the `reference_images` array in the Seedance call — not the originals.
- Keep your Stage 1 feature map descriptions unchanged — they describe the original content, which the enhanced image preserves.
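
The rules above could be applied per screen with a small request builder. This is a sketch, assuming the width and height of each capture are known (for example from the App Store fetch result); `pick_aspect_ratio` and `enhancement_request` are hypothetical helpers, and mapping square images to `1:1` is an assumption the rules do not state.

```python
ENHANCE_PROMPT = "High quality version, preserve all content exactly"


def pick_aspect_ratio(width, height):
    """Landscape captures map to 16:9, portrait to 9:16; square 1:1 is an assumption."""
    if width > height:
        return "16:9"
    if height > width:
        return "9:16"
    return "1:1"


def enhancement_request(url, width, height):
    """Build keyword arguments for one generate_image call (one call per screen)."""
    return {
        "provider": "gpt-image-2",
        "prompt": ENHANCE_PROMPT,  # kept verbatim; never describe the image content
        "reference_images": [url],
        "aspect_ratio": pick_aspect_ratio(width, height),
        "quality": "medium",
    }
```

Building the requests first keeps the prompt identical across screens and makes it straightforward to dispatch the calls in parallel.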

---


Stage 3 — Prompt Writing


With the feature map (Stage 1) and arc (Stage 2) in hand, write the Seedance prompt. Every `@Image` description must reference the real UI content from the feature map; never write generic descriptions like "a mobile interface with controls."
Choose the template by reading the app's screenshots; don't default to liquid glass. Read exactly one of these based on the app's personality; the other never loads:
  • Template A: Cinematic Narrative (default; productivity, AI, creative, social, food, games). Read `references/template-a-cinematic.md`: the proven BEAT-structure template plus validated examples.
  • Template B: Liquid Glass (photography, camera, filter apps only, where a lens/filter metaphor is apt). Read `references/liquid-glass.md`: the Template B skeleton plus glass transformation vocabulary.
The accent color is always from the brand: read the icon and primary UI color, never invent one.

Rules for both templates


  • Every `@Image` description comes directly from the Stage 1 feature map
  • Camera directions come directly from the Stage 2 arc
  • Never write "the app interface" or "a mobile screen"; be specific
  • Keep the prompt under 200 words
  • Why specificity matters: Seedance uses the `@ImageN` description as its primary brief. "a dark chat interface" and "a VHS three-panel grid of city streets, a skate park, and a coastal sunset with retro timestamp overlays" produce completely different results. Copy the most visually specific details from your Stage 1 feature map verbatim.
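
Some of these rules are mechanical enough to lint before generation. This is a rough sketch: `lint_prompt` is a hypothetical helper, and it only covers the word limit, the two generic phrases named above, and `@ImageN` tokens that point past the supplied references.

```python
import re

GENERIC_PHRASES = ("the app interface", "a mobile screen")


def lint_prompt(prompt, reference_count, max_words=200):
    """Return a list of rule violations for a Seedance prompt draft."""
    issues = []
    if len(prompt.split()) >= max_words:
        issues.append(f"prompt is not under {max_words} words")
    lowered = prompt.lower()
    for phrase in GENERIC_PHRASES:
        if phrase in lowered:
            issues.append(f"generic phrase: {phrase!r}")
    # Every @ImageN token must point at a reference that was actually supplied.
    for n in re.findall(r"@Image(\d+)", prompt):
        if not 1 <= int(n) <= reference_count:
            issues.append(f"@Image{n} has no matching reference")
    return issues
```

A lint pass like this cannot judge whether a description is specific enough; it only catches the obvious slips.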


Generate Video


Primary: Seedance
```python
generate_reference_video(
    provider="seedance",
    reference_images=["<url1>", "<url2>", "<url3>", "<url4>", "<url5>"],  # 3–5 screens + icon
    prompt="<prompt using @Image1 … @Image5 tokens>",
    resolution="1080p",   # always
    duration=15,          # always
    sound=True,           # always
    aspect_ratio="16:9",  # or 9:16 / 1:1 per user request
    seed=<int>,           # set one; reuse it for content-policy recovery
)
```

Seedance generated-audio moderation recovery


If Seedance finishes generation and then returns a 422 whose body includes `type: "content_policy_violation"`, `reason: "partner_validation_failed"`, `loc: ["body", "generated_video"]`, and `msg: "Output audio has sensitive content."`, treat it as a recoverable generated-audio moderation false positive.
  1. Retry the exact same prompt and `reference_images` with `sound=False` and the same `seed`.
  2. If the silent probe succeeds, retry the exact same prompt/reference set with `sound=True` and the same seed.
  3. If the `sound=True` replay succeeds, route the recovered sound-on URL into Stage 4 as `generated_teaser_url`. Keep the silent probe URL only as debugging context.
  4. If the silent probe fails, treat the failure as video/reference moderation and use the Kling fallback.
  5. If the silent probe succeeds but the `sound=True` replay fails again, run the Kling fallback once. If Kling is unavailable, route the silent URL into Stage 4 as `generated_teaser_url` and explicitly note that generated-audio moderation remained flaky.
Do not change the prompt, references, aspect ratio, duration, or seed during this recovery path. Changing any of them turns the silent probe into a new generation instead of testing whether only generated audio triggered moderation.
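The five recovery steps can be read as a small decision flow. The sketch below assumes two hypothetical callables, `generate(sound=...)` and `kling_fallback()`, that each return a URL on success and `None` on failure; the caller is expected to bind the prompt, references, aspect ratio, duration, and seed once so they cannot drift between attempts. The returned route labels are illustrative.

```python
def is_audio_moderation_error(body):
    """Match the 422 body fields that mark a generated-audio moderation rejection."""
    return (
        body.get("type") == "content_policy_violation"
        and body.get("reason") == "partner_validation_failed"
        and body.get("loc") == ["body", "generated_video"]
        and body.get("msg") == "Output audio has sensitive content."
    )


def recover_audio_moderation(generate, kling_fallback):
    """Run the silent probe, then the same-seed sound replay; return (url, route)."""
    silent_url = generate(sound=False)        # step 1: silent probe
    if silent_url is None:
        return kling_fallback(), "kling"      # step 4: video/reference moderation
    sound_url = generate(sound=True)          # step 2: same-seed sound replay
    if sound_url is not None:
        return sound_url, "seedance-sound"    # step 3: recovered sound-on teaser
    kling_url = kling_fallback()              # step 5: one Kling attempt
    if kling_url is not None:
        return kling_url, "kling"
    return silent_url, "seedance-silent"      # step 5: note that audio moderation stayed flaky
```

Whichever URL comes back is what would be routed into Stage 4 as `generated_teaser_url`.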

Seedance timeout recovery


If task status remains queued or running until Seedance returns a timeout such as `seedance timed out after 900s` or `seedance timed out after 1200s`, treat it as provider queue saturation, not a prompt/content failure.
When this happens, run the Kling fallback with the same selected references, same beat structure, `duration=15`, `sound=True`, and `quality_mode="pro"`. Convert `@ImageN` prompt tokens to `<<<image_N>>>` before calling Kling.
Do not keep retrying Seedance after a timeout unless the user explicitly asks to wait for Seedance. The timeout path has already spent the launch-demo wall-clock budget; switching provider is the documented recovery.
Fallback: Kling (non-audio `partner_validation_failed`, `insufficient_balance`, or a Seedance timeout):
```python
generate_reference_video(
    provider="kling",
    reference_images=["<url1>", "<url2>", "<url3>", "<url4>", "<url5>"],
    prompt="<prompt using <<<image_1>>> … <<<image_5>>> tokens>",
    quality_mode="pro",   # = 1080p on Kling (NOT resolution=)
    duration=15,
    sound=True,
    aspect_ratio="16:9",
)
```
Seedance tokens: `@Image1` … `@Image5` | Kling tokens: `<<<image_1>>>` … `<<<image_5>>>`
Seedance constraints: skip `fast=True` because it caps at 720p; skip `negative_prompt` because Seedance rejects it; skip `auto_duration` because this path is fixed at 15s.
Kling constraint: use `quality_mode="pro"` for 1080p; Kling rejects `resolution=`.
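Switching providers means rewriting both the prompt tokens and the quality parameter. This is a minimal sketch, assuming the Seedance call arguments are held in a dict; `to_kling_request` is a hypothetical helper, and it only touches the parameters the constraints above mention.

```python
import re


def to_kling_request(seedance_kwargs):
    """Translate Seedance call kwargs into the Kling fallback shape without mutating the input."""
    kwargs = dict(seedance_kwargs)
    kwargs["provider"] = "kling"
    kwargs.pop("resolution", None)   # Kling rejects resolution=
    kwargs["quality_mode"] = "pro"   # = 1080p on Kling
    # Rewrite each @ImageN token as <<<image_N>>>, keeping the index.
    kwargs["prompt"] = re.sub(r"@Image(\d+)", r"<<<image_\1>>>", kwargs["prompt"])
    return kwargs
```

The references, duration, sound setting, and aspect ratio pass through unchanged, which matches the requirement to keep the same selected references and beat structure.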

Kling queued/handoff recovery


Kling fallback is async. If `generate_reference_video(provider="kling")` returns a `task_id`, follow the task until terminal.
If `task_status` returns status: `queued` with `statusMessage` containing `Worker handoff: task was requeued for retry on another worker.`, treat it as a worker restart handoff, not a failed render. Keep polling `mcp__pika__task_status(task_id)`; the next worker should reclaim the same task.
If `statusMessage` starts with `Kling is at capacity`, treat it as provider capacity wait. Keep polling the same task while `lastUpdatedAt` continues moving.
Do not submit a duplicate Kling request while the original task is still `queued` or `running`. Duplicates can burn provider quota and make artifact provenance unclear.
If `status` stays `queued` for more than 10 minutes with no `lastUpdatedAt` movement, capture the `task_id`, `status`, `statusMessage`, and `lastUpdatedAt`, then cancel the stalled original with `mcp__pika__task_cancel(task_id)` before retrying. Only after cancel returns `cancelled`, retry the exact same Kling request once with the same prompt, references, shots, aspect ratio, duration, and quality mode. If cancel fails because the task already completed or failed, inspect that terminal result instead of retrying. If the retry also stalls, stop and report both task IDs instead of changing the creative prompt.
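The polling rules above reduce to one decision per status check. The sketch below assumes `lastUpdatedAt` has already been converted to epoch seconds; `next_action` and its return labels are hypothetical.

```python
STALL_AFTER_S = 10 * 60  # queued for more than 10 minutes with no lastUpdatedAt movement


def next_action(status, last_updated_s, now_s):
    """Decide what to do with an in-flight Kling task.

    `last_updated_s` and `now_s` are epoch seconds.
    """
    if status in ("completed", "failed", "cancelled"):
        return "inspect_result"
    # Stalled in the queue: capture the task details, cancel, then retry once.
    if status == "queued" and now_s - last_updated_s > STALL_AFTER_S:
        return "cancel_then_retry_once"
    # Worker handoffs and capacity waits are not failures; keep polling the same task.
    return "keep_polling"
```

Because the stall test depends only on `lastUpdatedAt`, a worker handoff or capacity message that keeps updating the task never triggers a duplicate request.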


Asset Upload (local files → public URL)


If the user provides local file paths, convert them to public URLs before calling generate:
  1. Read the file size and MIME type.
  2. Call `mcp__pika__upload_asset(filename, mime_type, size_bytes)`.
  3. Upload the bytes to the returned `presigned_url` using the host client's file-upload capability.
  4. Use the returned `public_url` as the reference URL in generation calls.
Supported mime types: `image/png`, `image/jpeg`, `image/webp`, `video/mp4`, `audio/mpeg`, `audio/wav`

Stage 4 — Deterministic COMING SOON Overlay


Do not ask Seedance or Kling to render `COMING SOON`. Video models garble new typography, especially all-caps CTA text, so the final two seconds get a deterministic `COMING SOON` text overlay applied after generation.
After Seedance or Kling returns the 15s teaser URL, call:
```python
edit_text_overlay(
    video_url=<generated_teaser_url>,
    text="COMING SOON",
    position="bottom_center",
    font_size=56,
    font_color="white",
    start_s=13,
    end_s=15,
)
```
If `edit_text_overlay` returns `{ task_id }`, poll `mcp__pika__task_status` until it reaches `completed`, `failed`, or `cancelled`, then unwrap the returned URL. Save the returned URL as `final_url`. If the overlay call fails, surface that failure and the unoverlaid teaser URL as a diagnostic preview; do not deliver a teaser whose only `COMING SOON` text was generated by the video model.
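The polling step can be sketched as a loop that stops on any terminal state. Here `get_status` is a hypothetical wrapper around `mcp__pika__task_status` that returns a dict with a `status` key; the polling interval and the poll cap are assumptions, not documented values.

```python
import time

TERMINAL_STATES = {"completed", "failed", "cancelled"}


def poll_until_terminal(get_status, task_id, interval_s=5.0, max_polls=200, sleep=time.sleep):
    """Poll `get_status(task_id)` until the task reaches a terminal state, then return it."""
    for _ in range(max_polls):
        result = get_status(task_id)
        if result["status"] in TERMINAL_STATES:
            return result
        sleep(interval_s)  # wait before asking again
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

The `sleep` parameter is injectable so the loop can be exercised in tests without waiting.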


Result Delivery


Return the final Pika CDN URL as the primary deliverable. If the host client requires local media markers, create that local preview outside this skill flow after confirming the CDN URL is reachable.
If generation completes asynchronously: follow the MCP tool's returned status handle until the video reaches a terminal state, then deliver the final URL.


Prompting Guide


The prompt is the output of Stages 1 + 2, not a starting point. Never fill in the template from imagination — fill it from the feature map and arc you built. A prompt written without Stage 1 analysis will produce a generic glass blob.

Camera Vocabulary


Use specific camera language; Seedance responds to it:
| Term | Effect |
|---|---|
| `extreme macro close-up on [specific element]` | Tight detail shot: glass edge, button, icon |
| `crash zoom into [element]` | Fast push-in, creates energy |
| `whip pan to` | Hard lateral cut with motion blur |
| `orbital sweep around` | 360° arc around the floating panel |
| `push-in drift` | Slow, cinematic dolly |
| `pull-back to reveal` | Classic product reveal that shows the full form |
| `hard cut to black` | Clean beat before logo |

Alternate fast cuts with slower drifts; pure rapid cuts feel chaotic, pure slow drifts feel boring.
(Glass transformation vocabulary lives in `references/liquid-glass.md`; it is only relevant on the Template B path.)

Device Framing


For product shots, lock the device to a black void; never place it in an environment.

Floating desktop screens (SaaS / desktop apps)


Show the desktop screens floating in 3D space on a pure black background, tilted at slight angles like a MacBook product shot. The UI elements on screen become translucent glass with reflections and refractions. No text, no logos, no words.

iPad reveal


An iPad Pro floating in empty black space, tilted at a cinematic angle like an Apple product shot. The iPad is a real solid device with visible bezels — only the screen content has the glass effect. The device slowly rotates. No text, no logos.

MacBook


A MacBook Pro floating in empty black space, open at a cinematic angle. The screen displays [content]. Light catches the aluminium edges. No text, no logos.

Reference Count Guide


All runs are 15s, 1080p. Select 3–5 screens based on the narrative arc.
| Refs | Use case |
|---|---|
| 3 | Standard: one screen per beat (hook / build / reveal) + icon as @Image4 |
| 4 | Two build beats + hook + icon |
| 5 | Feature-rich: hook + 3 build beats + icon. Don't exceed 5. |

The golden rule: 1 reference per ~3 seconds of video.


Load-bearing phrases

These phrases are empirical prompt/flow anchors. Keep them when simplifying the skill:

| Phrase | Where | Why load-bearing |
| --- | --- | --- |
| `High quality version, preserve all content exactly` | GPT-image-2 enhancement pass | Keeps the enhancement pass from inventing UI while cleaning compression artifacts. |
| `Do NOT write the Seedance prompt until this arc is defined` | Stage 2.5 gate | Prevents generic motion prompts that are not grounded in the selected screens. |
| `The prompt is the output of Stages 1 + 2, not a starting point` | Prompting guide | Forces the agent to use the screen feature map and story arc instead of template-filling from imagination. |
| `pure black background` / `floating in empty black space` | Device framing prompts | Keeps product shots focused on the app UI rather than hallucinated environments. |
| `materializes whole` / `crystallizes as a single form` / `fades in as a complete element` | Logo reveal wording | Avoids per-letter logo construction, which causes garbled brand text. |
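
One way to guard these anchors during a simplification pass is a verbatim check over the skill text. The sketch below is an assumption about how such a check could look; `missing_phrases` is a hypothetical helper, and the skill itself does not ship one:

```python
LOAD_BEARING_PHRASES = [
    "High quality version, preserve all content exactly",
    "Do NOT write the Seedance prompt until this arc is defined",
    "The prompt is the output of Stages 1 + 2, not a starting point",
    "pure black background",
    "floating in empty black space",
    "materializes whole",
    "crystallizes as a single form",
    "fades in as a complete element",
]


def missing_phrases(skill_text: str) -> list[str]:
    """Return the load-bearing phrases that no longer appear verbatim."""
    return [p for p in LOAD_BEARING_PHRASES if p not in skill_text]
```

Running it on the edited skill file and expecting an empty list catches a phrase that was reworded or dropped by accident.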


Runtime Expectations

Typical run time is 4-8 minutes:

| Step | Wall clock | Notes |
| --- | --- | --- |
| Asset sourcing | 10-60s | App Store via `mcp__pika__fetch_appstore_screens`; website capture depends on page load |
| Screen analysis + arc | 2-5 min | User confirmation can add time |
| GPT-image-2 enhancement | 30-90s | Run selected screens in parallel |
| Seedance generation | 3-5 min | Generated-audio moderation recovery adds one silent probe plus one same-seed sound replay |
| Kling fallback | 5-15 min | Capacity wait or worker handoff may temporarily show `queued`; follow the Kling queued/handoff recovery runbook |
| Download verification | <30s | Local sanity check before delivery |

Engine Choice: Seedance Primary, Kling Fallback

Seedance is the default because it handles polished motion-graphics references and 1080p app teasers well. Kling is the fallback for moderation, balance, or Seedance timeout failures because it is more permissive on some screen content and uses `quality_mode="pro"` for 1080p.
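
The routing rule can be sketched as a small predicate. This is an illustration under assumptions: the error code and message are taken to arrive as plain strings, the function name is hypothetical, and the real tool response shape is not specified here. Generated-audio moderation is excluded because it has its own recovery runbook.

```python
def should_fall_back_to_kling(error_code: str, message: str = "") -> bool:
    """Decide whether a Seedance failure should route to Kling.

    Assumes error_code and message are strings surfaced by the tool.
    """
    text = message.lower()
    if "output audio has sensitive content" in text:
        # Generated-audio moderation: follow the recovery runbook instead.
        return False
    if error_code in {"partner_validation_failed", "insufficient_balance"}:
        return True
    # Timeouts are reported in the message, e.g. "seedance timed out after ...".
    return "timed out" in text


# Arguments that replace resolution="1080p" on the Kling path.
KLING_FALLBACK_ARGS = {"provider": "kling", "quality_mode": "pro"}
```

Any other failure returns `False` here, so unknown errors surface to the user instead of silently switching engines.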

Failure Modes

| Symptom | Cause | Fix |
| --- | --- | --- |
| `fast=True` with `resolution="1080p"` | Seedance caps fast mode at 720p | Remove `fast`; keep `resolution="1080p"` |
| `negative_prompt` rejected | Seedance does not accept this field | Use positive framing such as "smooth motion, stable camera" |
| Seedance generated-audio moderation: `content_policy_violation` / `partner_validation_failed`, `generated_video`, "Output audio has sensitive content." | Often a false positive on non-sensitive app-sizzle references | Follow the generated-audio recovery runbook: same-seed `sound=False` probe, then same-seed `sound=True` replay |
| Seedance timeout such as `seedance timed out after ...` | Provider queue saturation or tail latency exceeded the tool budget | Run the Kling fallback; do not keep retrying Seedance unless the user explicitly asks to wait |
| Seedance `partner_validation_failed` on video | Screen content includes recording UI, celebrity faces, or similar moderation triggers | Switch to `provider="kling"` and convert tokens to `<<<image_N>>>` |
| Faces in screenshots trigger content policy | Screenshot includes real people | Crop faces out before upload, or use Kling |
| 6+ reference images reduce quality | The model blends too many refs | Keep to 3-5 references, roughly one per 3 seconds |
| Prompt tail ignored | Prompt exceeds about 200 words | Trim to the beat structure and the concrete UI details |
| Text in output is garbled | Video model is asked to render new text | Keep text as existing reference-image content; overlay any new branding in post |
| Logo reveal hallucinates letterforms | "assemble/build/construct" language triggers per-glyph rendering | Use "materializes whole", "crystallizes as a single form", or "fades in as a complete element" |
| Task returns `{ task_id }` instead of inline | Long-running generation exceeded inline budget | Poll `mcp__pika__task_status(task_id)` until `completed`, `failed`, or `cancelled`; unwrap `result.structuredContent` when present |
| Kling task returns status: `queued` after previously running | Worker handoff or provider capacity wait | Follow the Kling queued/handoff recovery runbook; do not duplicate-submit unless queued for more than 10 minutes with no `lastUpdatedAt` movement |
| Kling rejects `resolution=` | Kling uses a different quality knob | Use `quality_mode="pro"` |
| App Store icon URL points to promo art | App Store metadata fallback found feature artwork | Prefer the `icon.url` returned by `mcp__pika__fetch_appstore_screens`; if missing, ask for a logo/icon file |
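
Several of these rows can be caught before submission. The following is a sketch only: `preflight` and `to_kling_tokens` are hypothetical helpers, the request is assumed to be a plain dict of keyword arguments, and the 200-word limit is the approximate figure from the table rather than a documented provider limit.

```python
import re


def preflight(args: dict, prompt: str, ref_count: int) -> list[str]:
    """Return human-readable warnings for a Seedance request."""
    warnings = []
    if args.get("fast") and args.get("resolution") == "1080p":
        warnings.append("remove fast: Seedance caps fast mode at 720p")
    if "negative_prompt" in args:
        warnings.append("drop negative_prompt: use positive framing instead")
    if not 3 <= ref_count <= 5:
        warnings.append("keep to 3-5 references")
    if len(prompt.split()) > 200:
        warnings.append("prompt exceeds about 200 words; the tail may be ignored")
    return warnings


def to_kling_tokens(prompt: str) -> str:
    """Rewrite Seedance @ImageN tokens as Kling <<<image_N>>> tokens."""
    return re.sub(r"@Image(\d+)", r"<<<image_\1>>>", prompt)
```

For example, `to_kling_tokens("Open on @Image1, end on @Image4")` rewrites both tokens to the Kling form while leaving the rest of the prompt untouched.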