app-sizzle

App Sizzle — GPT-Image-2 Enhanced iOS App Teaser

Generate a polished 15-second app teaser from real app screens. Each selected screen is passed through GPT-image-2 before Seedance so compressed captures become cleaner references without inventing UI.

Generation contract: use `resolution="1080p"`, `duration=15`, and `sound=True`. Skip `fast=true` because it caps Seedance at 720p. The skill owns duration and sound so the user only has to supply app identity, screens, logo, and aspect ratio.

The visual aesthetic is derived from the app's personality — not defaulted to liquid glass. The agent reads the app's soul from its icon, screenshots, and category, then chooses a treatment. The user provides the app identity and assets; the agent decides everything else (mode, prompt, camera, style).

Mode: Reference-to-Video

Primary: `mcp__pika__generate_reference_video(provider="seedance", resolution="1080p")` with 3–5 screenshots + the app icon/logo as the final reference.

Fallback to `provider="kling", quality_mode="pro"` (= 1080p) when:

- Seedance returns non-audio `partner_validation_failed` (celebrity faces, screen-recording UI)
- Seedance returns `insufficient_balance`
- Seedance stays queued/running until it returns a timeout such as `seedance timed out after ...`

Do not treat generated-audio moderation as an immediate Kling fallback. See the Seedance generated-audio moderation recovery runbook in Generate Video first.

Kling prompt uses `<<<image_1>>>` … `<<<image_5>>>` tokens instead of `@Image1` … `@Image5`. Drop the `resolution` param (Kling uses `quality_mode` instead). See Gotchas.
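The token swap between the two providers is mechanical, so it can be sketched as a small helper. This is a minimal illustration, not part of the skill: the function name `seedance_to_kling_tokens` is hypothetical, and it assumes tokens always appear exactly as `@ImageN` with a numeric suffix.

```python
import re


def seedance_to_kling_tokens(prompt: str) -> str:
    """Rewrite Seedance-style @ImageN tokens as Kling-style <<<image_N>>> tokens.

    Hypothetical helper: assumes every reference token is written as
    '@Image' followed directly by its number.
    """
    return re.sub(r"@Image(\d+)", r"<<<image_\1>>>", prompt)


if __name__ == "__main__":
    seedance_prompt = "Crash zoom into @Image1, whip pan to @Image2, logo from @Image4 materializes whole."
    print(seedance_to_kling_tokens(seedance_prompt))
```

Running the conversion on the finished Seedance prompt, rather than writing a second prompt by hand, keeps the beat structure identical across providers.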

Stage 0 — Asset Sourcing

If invoked with empty args and no relevant prior context, print this menu verbatim and stop. Do not call tools until the user supplies the app identity and screen source.

To make your app promo, I need:

1. App name + one-line description of what it does
   (e.g. "Nova — an AI journaling app for iOS")

2. Where should I pull the app screens from?
   — iOS App Store: give me the App Store URL or app name → I'll use
     `mcp__pika__fetch_appstore_screens` to fetch screenshots, metadata, and icon
   — Web app / website: give me the URL → I'll capture it with Pika MCP
   — Local files / URLs: drop the paths and I'll upload them

3. Brand logo — path or URL (preferred) or skip to use the App Store icon
   The logo anchors the end card and prevents Seedance from hallucinating brand text.
   If you don't have a logo file, use the fetched App Store icon as the fallback.

4. Aspect ratio: 16:9 (landscape/YouTube) / 9:16 (Reels/TikTok) / 1:1

If the trigger message or prior context already supplies part of this, ask only for the missing required fields before touching any tool. These are the only questions the user needs to answer; the agent decides mode, prompt, camera, and style.

Once answered, the agent:

1. Sources the screens (MCP App Store fetch / website capture / upload local files)
2. Analyzes each screen (Stage 1): reads every screenshot, maps UI → feature
3. Designs the narrative arc (Stage 2): builds a 15s story structure before touching the prompt
4. Selects the 3–5 best screens for the promo (ordered by narrative role)
5. Uploads logo + screens to get public URLs
6. Writes the screen-specific prompt (Template A or B)
7. Generates at 1080p

Stage 0.5 — Asset Gate

Before calling any generation tool, verify both assets are in hand:

| Asset | Required | If missing |
| --- | --- | --- |
| Real app screenshots (≥1 actual sourced image) | Yes | Stop and ask for screenshots |
| Brand logo OR app icon | Yes | Use the `mcp__pika__fetch_appstore_screens` icon when App Store sourcing is used; otherwise stop and ask for a logo/icon |

If either is missing, tell the user exactly what's needed and wait. Real assets are what keep the teaser grounded; text-to-video placeholders make Seedance invent UI.

Avoid:

- Generating with text-to-video as a substitute when screens were expected
- Describing imaginary UI in the prompt ("a dark dashboard with…") without a real reference image
- Proceeding with "I'll use a placeholder for now"
- Making up what the app looks like from its name or description

The only acceptable path forward is real assets from the user. If MCP fetching or capturing failed (App Store returned nothing, website screenshot errored), report what happened and ask the user to provide the screens manually. Never invent them.
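The gate above amounts to two checks, which can be sketched as follows. The `SourcedAssets` container and the `asset_gate` function are hypothetical names introduced only for illustration; the skill itself does not define them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SourcedAssets:
    """Hypothetical container for whatever Stage 0 managed to source."""
    screenshot_urls: List[str] = field(default_factory=list)
    logo_url: Optional[str] = None           # user-supplied logo, if any
    appstore_icon_url: Optional[str] = None  # icon from the App Store fetch, if any


def asset_gate(assets: SourcedAssets) -> List[str]:
    """Return what is still missing; an empty list means the gate passes."""
    missing = []
    if not assets.screenshot_urls:
        missing.append("At least one real app screenshot is required.")
    if not (assets.logo_url or assets.appstore_icon_url):
        missing.append("A brand logo or app icon is required.")
    return missing


if __name__ == "__main__":
    print(asset_gate(SourcedAssets(screenshot_urls=["https://example.com/screen-01.png"])))
```

A non-empty result is the message to show the user before stopping; no generation tool should be called until the list is empty.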

Screen Sourcing

iOS App Store

Use Pika MCP `mcp__pika__fetch_appstore_screens`; do not use a local scraper. It accepts a full App Store URL, numeric app ID, or app-name search term:

fetch_appstore_screens(
  query: <app_store_url | numeric_app_id | search_term>,
  country: "us",
  max_screens: 10,
  include_icon: true
)

Expected result shape:

{
  "app_url": "https://apps.apple.com/...",
  "metadata": { "name": "...", "subtitle": "...", "description": "...", "category": "...", "icon_url": "https://..." },
  "icon": { "url": "https://cdn.pika.art/...", "source_url": "https://is...mzstatic.com/...", "filename": "appstore-icon.png", "mime_type": "image/png", "width": 1024, "height": 1024 },
  "screenshots": [
    { "url": "https://cdn.pika.art/...", "source_url": "https://is...mzstatic.com/.../1290x2796bb.png", "filename": "appstore-screen-01.png", "mime_type": "image/png", "width": 1290, "height": 2796 }
  ],
  "count": 1
}

If `mcp__pika__fetch_appstore_screens` returns no screenshots, report the error and ask the user for 3–5 real screenshots plus a logo/icon. Do not fall back to Playwright/headless App Store capture and do not invent UI.

After App Store assets are fetched, pick the 3–5 screens that show the core UI. Skip:

- Pure text/splash screens (no UI)
- Blank or loading states
- Screens with faces (may trigger content policy)

Screen selection principle — maximize visual contrast. Each selected screen should look as different as possible from the others: dark vs. light background, UI-dense vs. photo-heavy, micro close-up vs. wide grid, minimal vs. busy. If all your screens look similar, Seedance blends them into a visual mush. What made Dazz Cam work: 3D camera grid + Polaroid output + VHS panels + fisheye orb — four completely distinct visual worlds. What makes teasers fail: four screens of the same UI at slightly different scroll positions.

Web App / Website (auto-capture)

Use Pika MCP's capture tool:

```python
capture_website(url="https://example.com", mode="screenshot")
# Returns image_url; use it directly as a reference
```

Call once per distinct page/view you want to include.

Local Files

User provides paths → upload each via Pika MCP (see Asset Upload section below).

Stage 1 — App Analysis

After sourcing screens, read every screenshot using Claude's vision before writing a single word of prompt. This is the most important step — skip it and you get a generic glass blob with no story.

For each screenshot, record:

- What UI is shown, e.g. "chat input with suggested prompts", "video timeline with AI edit chips", "agent result card showing a generated clip"
- What feature it represents, e.g. "creation entry", "agent at work", "output/share"
- Emotional register: is this the power moment, the ease moment, the aha moment?

Also pull the app metadata from the `mcp__pika__fetch_appstore_screens` result, or from the user-provided description:

- App name, subtitle, one-line value prop
- Category and target user

Output of Stage 1: A numbered feature map:

Screen 1 — [filename]: Shows [X UI]. Represents [Y feature]. Moment: [hook/build/reveal].
Screen 2 — [filename]: ...
...

After the map, score each screen for visual uniqueness: does it look completely different from the others you've mapped? Prefer screens with distinct color palettes, distinct layout density, and distinct subject matter. A great set has maximum visual spread — the hook should feel nothing like the build, which should feel nothing like the reveal.

Do NOT proceed to Stage 2 until this map is written out.

Stage 2 — Narrative Architecture

Every 15s promo needs a spine. Design the story arc before touching the prompt template.

The 4-beat structure

| Beat | Seconds | Job | Which screen(s) |
| --- | --- | --- | --- |
| Hook | 0–3s | Grab attention: show the most dramatic UI moment or the problem being solved | The most visually striking screen |
| Build | 3–10s | Feature walkthrough in logical user-journey order | 2–3 screens in sequence |
| Reveal | 10–13s | Pull-back or product overview: the "so that's what it does" moment | Wide shot or most complete screen |
| Logo | 13–15s | Brand lock: wordmark materializes, accent color pulse | Logo (@Image6 or last ref). `COMING SOON` is added later as a post-generation text overlay. |

Story arc types — pick one based on the app

| Arc | When to use | Structure |
| --- | --- | --- |
| Problem → Solution | Productivity/tool apps | Hook = pain point UI → Build = app solves it → Reveal = result |
| Feature Parade | Feature-rich apps | Hook = most impressive feature → Build = 2 more features → Reveal = overview |
| Journey | Consumer/lifestyle apps | Hook = entry point → Build = the experience → Reveal = outcome |
| Transformation | Before/after type apps | Hook = the "before" → Build = the process → Reveal = the "after" |

Output of Stage 2

Write out the arc explicitly before generating:

Arc type: [Problem→Solution / Feature Parade / Journey / Transformation]
Hook (0-3s): Screen [N] — [what happens] — camera: [extreme close-up on X]
Build (3-10s): Screen [N] → [N] → [N] — [what each reveals] — camera: [whip pan / orbital / etc.]
Reveal (10-13s): Screen [N] — [what it shows] — camera: [pull-back to show full product]
Logo (13-15s): @Image[N] — wordmark materializes whole in a burst of [accent color] light and holds. Do not ask the video model to render the `COMING SOON` copy; it is added later as a post-generation text overlay.

Do NOT write the Seedance prompt until this arc is defined.
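As a sanity check on the arc, the four beats should tile the full 15 seconds with no gaps or overlaps. The sketch below encodes the timings from the 4-beat table; the names `FOUR_BEATS` and `beat_durations` are illustrative assumptions, not part of the skill.

```python
# Beat timings (name, start second, end second) taken from the 4-beat structure.
FOUR_BEATS = [("hook", 0, 3), ("build", 3, 10), ("reveal", 10, 13), ("logo", 13, 15)]


def beat_durations(beats, total_seconds=15):
    """Check that beats are contiguous from 0 to total_seconds; return per-beat durations."""
    cursor = 0
    durations = {}
    for name, start, end in beats:
        if start != cursor:
            raise ValueError(f"beat {name!r} starts at {start}s, expected {cursor}s")
        if end <= start:
            raise ValueError(f"beat {name!r} has no duration")
        durations[name] = end - start
        cursor = end
    if cursor != total_seconds:
        raise ValueError(f"beats end at {cursor}s, expected {total_seconds}s")
    return durations


if __name__ == "__main__":
    print(beat_durations(FOUR_BEATS))
```

The build beat carries 7 of the 15 seconds, which is why it is the only beat that takes more than one screen.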

Stage 2.5 — GPT-Image-2 Enhancement

After the arc is defined and the 3–5 screens are selected, enhance each one with GPT-image-2 before uploading to Seedance. This lifts compressed website captures and App Store thumbnails to a cleaner, higher-fidelity reference.

For each selected screen (including the logo/end card reference):

```python
result = generate_image(
    provider="gpt-image-2",
    prompt="High quality version, preserve all content exactly",
    reference_images=["<original_cdn_url>"],
    aspect_ratio="16:9",   # match the capture; use 9:16 for portrait screens
    quality="medium",
)
# Use result.image_url (or result.url) as the Seedance reference
```

**Rules:**
- Keep the prompt exactly as shown — short, non-descriptive. Describing the image content makes GPT-image-2 hallucinate new details.
- Match `aspect_ratio` to the original capture (desktop = 16:9, mobile = 9:16).
- Run all enhancements in parallel (one call per screen).
- Use the enhanced URLs as the `reference_images` array in the Seedance call — not the originals.
- Keep your Stage 1 feature map descriptions unchanged — they describe the original content, which the enhanced image preserves.
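The aspect-ratio rule can be derived from the `width` and `height` fields that the App Store fetch result already carries. The sketch below is an assumption-laden illustration: `enhancement_aspect_ratio` and `enhancement_request` are hypothetical helpers, and only the two ratios named in the rules (16:9 and 9:16) are considered.

```python
ENHANCE_PROMPT = "High quality version, preserve all content exactly"


def enhancement_aspect_ratio(width: int, height: int) -> str:
    """Pick 16:9 for landscape captures and 9:16 for portrait captures."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return "16:9" if width >= height else "9:16"


def enhancement_request(url: str, width: int, height: int) -> dict:
    """Build keyword arguments mirroring the generate_image call shown above."""
    return {
        "provider": "gpt-image-2",
        "prompt": ENHANCE_PROMPT,  # kept verbatim, never extended with descriptions
        "reference_images": [url],
        "aspect_ratio": enhancement_aspect_ratio(width, height),
        "quality": "medium",
    }


if __name__ == "__main__":
    print(enhancement_request("https://example.com/appstore-screen-01.png", 1290, 2796))
```

Building one request per selected screen up front makes it straightforward to dispatch them in parallel.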

---


Stage 3 — Prompt Writing

With the feature map (Stage 1) and arc (Stage 2) in hand, write the Seedance prompt. Every `@Image` description must reference the real UI content from the feature map; never write generic descriptions like "a mobile interface with controls."

Choose the template by reading the app's screenshots; don't default to liquid glass. Read exactly one of these based on the app's personality; the other never loads:

- Template A: Cinematic Narrative (default; productivity, AI, creative, social, food, games): read `references/template-a-cinematic.md`. The proven BEAT-structure template plus validated examples.
- Template B: Liquid Glass (photography, camera, filter apps only, where a lens/filter metaphor is apt): read `references/liquid-glass.md`. Template B skeleton plus glass transformation vocabulary.

The accent color is always from the brand — read the icon and primary UI color, never invent one.

Rules for both templates

- Every `@Image` description comes directly from the Stage 1 feature map
- Camera directions come directly from the Stage 2 arc
- Never write "the app interface" or "a mobile screen"; be specific
- Keep under 200 words
- Why specificity matters: Seedance uses the `@ImageN` description as its primary brief. "a dark chat interface" vs "a VHS three-panel grid of city streets, a skate park, and a coastal sunset with retro timestamp overlays" produce completely different results. Copy the most visually specific details from your Stage 1 feature map verbatim.
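Some of these rules are mechanical enough to check before spending a generation. The following is a rough sketch only: `lint_prompt` is a hypothetical helper, the generic-phrase list contains just the two phrases named above, and the check that `@ImageN` indices stay within the reference count is an added assumption rather than a documented rule.

```python
import re

# Phrases the rules above explicitly forbid.
GENERIC_PHRASES = ("the app interface", "a mobile screen")


def lint_prompt(prompt: str, reference_count: int, max_words: int = 200) -> list:
    """Return a list of rule violations; an empty list means the prompt passes."""
    problems = []
    word_count = len(prompt.split())
    if word_count >= max_words:
        problems.append(f"prompt has {word_count} words; keep it under {max_words}")
    lowered = prompt.lower()
    for phrase in GENERIC_PHRASES:
        if phrase in lowered:
            problems.append(f"generic phrase found: {phrase!r}")
    # Assumption: every @ImageN token should point at a supplied reference.
    used = {int(n) for n in re.findall(r"@Image(\d+)", prompt)}
    out_of_range = sorted(n for n in used if n < 1 or n > reference_count)
    if out_of_range:
        problems.append(f"tokens without a matching reference: {out_of_range}")
    return problems


if __name__ == "__main__":
    print(lint_prompt("Crash zoom into @Image1, a mobile screen, then @Image3.", reference_count=2))
```

A lint pass like this cannot judge whether a description is specific enough; it only catches the failures that are cheap to detect.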

Generate Video

Primary — Seedance:

```python
generate_reference_video(
    provider="seedance",
    reference_images=["<url1>", "<url2>", "<url3>", "<url4>", "<url5>"],  # 3–5 screens + icon
    prompt="<prompt using @Image1 … @Image5 tokens>",
    resolution="1080p",   # always
    duration=15,          # always
    sound=True,           # always
    aspect_ratio="16:9",  # or 9:16 / 1:1 per user request
    seed=<int>,           # set one; reuse it for content-policy recovery
)
```

Seedance generated-audio moderation recovery

If Seedance finishes generation and then returns a 422 whose body includes `type: "content_policy_violation"`, `reason: "partner_validation_failed"`, `loc: ["body", "generated_video"]`, and `msg: "Output audio has sensitive content."`, treat it as a recoverable generated-audio moderation false positive.

1. Retry the exact same prompt and `reference_images` with `sound=False` and the same `seed`.
2. If the silent probe succeeds, retry the exact same prompt/reference set with `sound=True` and the same seed.
3. If the `sound=True` replay succeeds, route the recovered sound-on URL into Stage 4 as `generated_teaser_url`. Keep the silent probe URL only as debugging context.
4. If the silent probe fails, treat the failure as video/reference moderation and use the Kling fallback.
5. If the silent probe succeeds but the `sound=True` replay fails again, run the Kling fallback once. If Kling is unavailable, route the silent URL into Stage 4 as `generated_teaser_url` and explicitly note that generated-audio moderation remained flaky.

Do not change the prompt, references, aspect ratio, duration, or seed during this recovery path. Changing any of them turns the silent probe into a new generation instead of testing whether only generated audio triggered moderation.
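The recovery path reduces to one detection check and a small decision table. The sketch below assumes the 422 body is available as a flat dictionary with the four fields listed above; the real response may nest them differently. Both function names and the returned labels are hypothetical.

```python
from typing import Optional


def is_generated_audio_moderation(status_code: int, body: dict) -> bool:
    """True only for the recoverable generated-audio moderation response."""
    return (
        status_code == 422
        and body.get("type") == "content_policy_violation"
        and body.get("reason") == "partner_validation_failed"
        and body.get("loc") == ["body", "generated_video"]
        and body.get("msg") == "Output audio has sensitive content."
    )


def recovery_outcome(silent_ok: bool, replay_ok: Optional[bool], kling_available: bool = True) -> str:
    """Map the silent probe and sound-on replay results to the next action."""
    if not silent_ok:
        return "kling_fallback"            # video/reference moderation, not audio
    if replay_ok:
        return "deliver_sound_on"          # recovered URL goes to Stage 4
    if kling_available:
        return "kling_fallback"            # run the Kling fallback once
    return "deliver_silent_with_note"      # silent URL goes to Stage 4, with a note


if __name__ == "__main__":
    print(recovery_outcome(silent_ok=True, replay_ok=False, kling_available=False))
```

Because the prompt, references, aspect ratio, duration, and seed stay fixed, the only inputs to the decision are the two retry results and Kling availability.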

Seedance timeout recovery

If task status remains queued or running until Seedance returns a timeout such as `seedance timed out after 900s` or `seedance timed out after 1200s`, treat it as provider queue saturation, not a prompt/content failure.

When this happens, run the Kling fallback with the same selected references, same beat structure, `duration=15`, `sound=True`, and `quality_mode="pro"`. Convert `@ImageN` prompt tokens to `<<<image_N>>>` before calling Kling.

Do not keep retrying Seedance after a timeout unless the user explicitly asks to wait for Seedance. The timeout path has already spent the launch-demo wall-clock budget; switching provider is the documented recovery.

Fallback — Kling (non-audio partner_validation_failed or insufficient_balance):

```python
generate_reference_video(
    provider="kling",
    reference_images=["<url1>", "<url2>", "<url3>", "<url4>", "<url5>"],
    prompt="<prompt using <<<image_1>>> … <<<image_5>>> tokens>",
    quality_mode="pro",   # = 1080p on Kling (NOT resolution=)
    duration=15,
    sound=True,
    aspect_ratio="16:9",
)
```

Seedance tokens: `@Image1` … `@Image5` | Kling tokens: `<<<image_1>>>` … `<<<image_5>>>`

Seedance constraints: skip `fast=True` because it caps at 720p; skip `negative_prompt` because Seedance rejects it; skip `auto_duration` because this path is fixed at 15s.

Kling constraint: use `quality_mode="pro"` for 1080p; Kling rejects `resolution=`.

Kling queued/handoff recovery

Kling fallback is async. If `generate_reference_video(provider="kling")` returns a `task_id`, follow the task until terminal.

If `task_status` returns status `queued` with `statusMessage` containing `Worker handoff: task was requeued for retry on another worker.`, treat it as a worker restart handoff, not a failed render. Keep polling `mcp__pika__task_status(task_id)`; the next worker should reclaim the same task.

If `statusMessage` starts with `Kling is at capacity`, treat it as provider capacity wait. Keep polling the same task while `lastUpdatedAt` continues moving.

Do not submit a duplicate Kling request while the original task is still `queued` or `running`. Duplicates can burn provider quota and make artifact provenance unclear.

If `status` stays `queued` for more than 10 minutes with no `lastUpdatedAt` movement, capture the `task_id`, `status`, `statusMessage`, and `lastUpdatedAt`, then cancel the stalled original with `mcp__pika__task_cancel(task_id)` before retrying. Only after cancel returns `cancelled`, retry the exact same Kling request once with the same prompt, references, shots, aspect ratio, duration, and quality mode. If cancel fails because the task already completed or failed, inspect that terminal result instead of retrying. If the retry also stalls, stop and report both task IDs instead of changing the creative prompt.
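The polling rules above can be sketched as two small classifiers. This is an illustration under assumptions: `lastUpdatedAt` is taken to be already parsed into a timezone-aware `datetime`, "no movement for more than 10 minutes" is approximated as the gap between now and the last update, and the function names and returned labels are hypothetical.

```python
from datetime import datetime, timedelta, timezone

STALL_AFTER = timedelta(minutes=10)
HANDOFF_MESSAGE = "Worker handoff: task was requeued for retry on another worker."


def classify_status_message(status: str, status_message: str) -> str:
    """Label the two benign waiting states described above."""
    if status == "queued" and HANDOFF_MESSAGE in status_message:
        return "worker_handoff"   # keep polling; another worker reclaims the task
    if status_message.startswith("Kling is at capacity"):
        return "capacity_wait"    # keep polling while lastUpdatedAt moves
    return "other"


def is_stalled(status: str, last_updated_at: datetime, now: datetime) -> bool:
    """True when a queued task has not been updated for more than 10 minutes."""
    return status == "queued" and (now - last_updated_at) > STALL_AFTER


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(is_stalled("queued", now - timedelta(minutes=12), now))
```

Only a stalled result should lead to the cancel-then-retry step; both waiting states mean polling continues on the same task.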

Asset Upload (local files → public URL)

If the user provides local file paths, convert them to public URLs before calling generate:

1. Read the file size and MIME type.
2. Call `mcp__pika__upload_asset(filename, mime_type, size_bytes)`.
3. Upload the bytes to the returned `presigned_url` using the host client's file-upload capability.
4. Use the returned `public_url` as the reference URL in generation calls.

Supported mime types: `image/png`, `image/jpeg`, `image/webp`, `video/mp4`, `audio/mpeg`, `audio/wav`
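Step 1 and the MIME allow-list can be sketched as below. The extension-to-type mapping is an assumption made for illustration (the skill lists MIME types, not file extensions), and `mime_type_for` and `upload_args` are hypothetical helper names; the actual upload still goes through `mcp__pika__upload_asset`.

```python
import os

# Assumed mapping from common file extensions to the supported MIME types.
SUPPORTED_MIME_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
    ".mp4": "video/mp4",
    ".mp3": "audio/mpeg",
    ".wav": "audio/wav",
}


def mime_type_for(path: str) -> str:
    """Return the supported MIME type for a path, or raise if unsupported."""
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"unsupported file type: {extension or '(none)'}")
    return SUPPORTED_MIME_TYPES[extension]


def upload_args(path: str) -> dict:
    """Collect the three arguments the upload call expects for a local file."""
    return {
        "filename": os.path.basename(path),
        "mime_type": mime_type_for(path),
        "size_bytes": os.path.getsize(path),  # requires the file to exist
    }


if __name__ == "__main__":
    print(mime_type_for("screens/Home.PNG"))
```

Rejecting unsupported files before the upload call gives the user a clear message instead of a provider-side error.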

Stage 4 — Deterministic COMING SOON Overlay

Do not ask Seedance or Kling to render `COMING SOON`. Video models garble new typography, especially all-caps CTA text, so the final two seconds get a deterministic `COMING SOON` text overlay applied after generation.

After Seedance or Kling returns the 15s teaser URL, call:

```python
edit_text_overlay(
    video_url=<generated_teaser_url>,
    text="COMING SOON",
    position="bottom_center",
    font_size=56,
    font_color="white",
    start_s=13,
    end_s=15,
)
```

If `edit_text_overlay` returns `{ task_id }`, poll `mcp__pika__task_status` until it reaches `completed`, `failed`, or `cancelled`, then unwrap the returned URL. Save the returned URL as `final_url`. If the overlay call fails, surface that failure and the unoverlaid teaser URL as a diagnostic preview; do not deliver a teaser whose only `COMING SOON` text was generated by the video model.
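The polling step can be sketched with an injected status function so the loop is testable without the MCP server. Everything here is an assumption for illustration: `poll_until_terminal` is a hypothetical helper, `get_status` stands in for `mcp__pika__task_status`, and the status result is assumed to be a dictionary with a `status` key.

```python
import time

TERMINAL_STATES = {"completed", "failed", "cancelled"}


def poll_until_terminal(get_status, task_id, interval_s=5.0, max_polls=120, sleep=time.sleep):
    """Call get_status(task_id) until a terminal state is reported, then return that result."""
    for _ in range(max_polls):
        result = get_status(task_id)
        if result.get("status") in TERMINAL_STATES:
            return result
        sleep(interval_s)
    raise TimeoutError(f"task {task_id} did not reach a terminal state after {max_polls} polls")


if __name__ == "__main__":
    # Fake status source that completes on the third poll.
    responses = iter([{"status": "queued"}, {"status": "running"}, {"status": "completed"}])
    print(poll_until_terminal(lambda _id: next(responses), "demo-task", sleep=lambda s: None))
```

A `completed` result is the only one whose URL should be saved as `final_url`; `failed` and `cancelled` follow the diagnostic-preview path described above.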

Result Delivery

Return the final Pika CDN URL as the primary deliverable. If the host client requires local media markers, create that local preview outside this skill flow after confirming the CDN URL is reachable.

If generation completes asynchronously: follow the MCP tool's returned status handle until the video reaches a terminal state, then deliver the final URL.

Prompting Guide

The prompt is the output of Stages 1 + 2, not a starting point. Never fill in the template from imagination — fill it from the feature map and arc you built. A prompt written without Stage 1 analysis will produce a generic glass blob.

Camera Vocabulary

Use specific camera language — Seedance responds to it:

| Term | Effect |
| --- | --- |
| `extreme macro close-up on [specific element]` | Tight detail shot: glass edge, button, icon |
| `crash zoom into [element]` | Fast push-in, creates energy |
| `whip pan to` | Hard lateral cut with motion blur |
| `orbital sweep around` | 360° arc around the floating panel |
| `push-in drift` | Slow, cinematic dolly |
| `pull-back to reveal` | Classic product reveal that shows the full form |
| `hard cut to black` | Clean beat before logo |

Alternate fast cuts with slower drifts — pure rapid cuts feel chaotic, pure slow drifts feel boring.

(Glass transformation vocabulary lives in `references/liquid-glass.md`; it is only relevant on the Template B path.)

Device Framing

For product shots, lock the device to a black void — never place in environments:

Floating desktop screens (SaaS / desktop apps)

Show the desktop screens floating in 3D space on a pure black background, tilted at slight angles like a MacBook product shot. The UI elements on screen become translucent glass with reflections and refractions. No text, no logos, no words.

iPad reveal

An iPad Pro floating in empty black space, tilted at a cinematic angle like an Apple product shot. The iPad is a real solid device with visible bezels — only the screen content has the glass effect. The device slowly rotates. No text, no logos.

MacBook

A MacBook Pro floating in empty black space, open at a cinematic angle. The screen displays [content]. Light catches the aluminium edges. No text, no logos.

Reference Count Guide

All runs are 15s, 1080p. Select 3–5 screens based on the narrative arc.

| Refs | Use case |
| --- | --- |
| 3 | Standard: one screen per beat (hook / build / reveal) + icon as @Image4 |
| 4 | Two build beats + hook + icon |
| 5 | Feature-rich: hook + 3 build beats + icon. Don't exceed 5. |

The golden rule: 1 reference per ~3 seconds of video.
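The ordering and token rules above can be sketched in a few lines of Python. This is an illustration only: `reference_tokens` and `to_kling_prompt` are hypothetical helper names, and the sketch assumes the icon is appended after the 3–5 selected screens rather than counted among them.

```python
import re

MIN_SCREENS, MAX_SCREENS = 3, 5  # "Select 3-5 screens" from the guide


def reference_tokens(screens, icon, provider="seedance"):
    """Return (ordered references, prompt tokens) with the icon as the final reference.

    Assumption: the icon is appended after the selected screens and is not
    counted toward the 3-5 screen limit.
    """
    if not MIN_SCREENS <= len(screens) <= MAX_SCREENS:
        raise ValueError(f"expected 3-5 screens, got {len(screens)}")
    refs = list(screens) + [icon]
    if provider == "kling":
        tokens = [f"<<<image_{i}>>>" for i in range(1, len(refs) + 1)]
    else:
        tokens = [f"@Image{i}" for i in range(1, len(refs) + 1)]
    return refs, tokens


def to_kling_prompt(prompt):
    """Rewrite Seedance-style @ImageN tokens into Kling-style <<<image_N>>> tokens."""
    return re.sub(r"@Image(\d+)", r"<<<image_\1>>>", prompt)
```

With three screens, the icon lands on `@Image4`, which matches the first row of the table.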


Load-bearing phrases


These phrases are empirical prompt/flow anchors. Keep them when simplifying the skill:

Phrase	Where	Why load-bearing
`High quality version, preserve all content exactly`	GPT-image-2 enhancement pass	Keeps the enhancement pass from inventing UI while cleaning compression artifacts.
`Do NOT write the Seedance prompt until this arc is defined`	Stage 2.5 gate	Prevents generic motion prompts that are not grounded in the selected screens.
`The prompt is the output of Stages 1 + 2, not a starting point`	Prompting guide	Forces the agent to use the screen feature map and story arc instead of template-filling from imagination.
`pure black background` / `floating in empty black space`	Device framing prompts	Keeps product shots focused on the app UI rather than hallucinated environments.
`materializes whole` / `crystallizes as a single form` / `fades in as a complete element`	Logo reveal wording	Avoids per-letter logo construction, which causes garbled brand text.
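One way to keep these anchors from being dropped during simplification is a small pre-flight check on the final prompt. The sketch below is an assumption-laden illustration, not part of the skill: `lint_prompt` is a hypothetical helper, the 200-word limit is taken from the Failure Modes table, and the assemble/build/construct check is deliberately coarse (it also flags innocent uses of "build" elsewhere in the prompt).

```python
import re

# Phrases copied from the table above.
BACKGROUND_ANCHORS = ("pure black background", "floating in empty black space")
LOGO_SAFE = (
    "materializes whole",
    "crystallizes as a single form",
    "fades in as a complete element",
)
# Wording that tends to trigger per-letter logo construction.
LOGO_RISKY = re.compile(r"\b(assembl\w*|build\w*|construct\w*)\b", re.IGNORECASE)
MAX_WORDS = 200  # approximate limit before the prompt tail is ignored


def lint_prompt(prompt):
    """Return a list of warnings; an empty list means no issue was detected."""
    warnings = []
    lower = prompt.lower()
    if len(prompt.split()) > MAX_WORDS:
        warnings.append("prompt exceeds ~200 words; the tail may be ignored")
    if not any(phrase in lower for phrase in BACKGROUND_ANCHORS):
        warnings.append("no black-background anchor phrase")
    if not any(phrase in lower for phrase in LOGO_SAFE):
        warnings.append("no whole-logo reveal phrase")
    if LOGO_RISKY.search(prompt):
        warnings.append("assemble/build/construct wording may cause per-glyph rendering")
    return warnings
```

The background check only makes sense for device-framing prompts, so treat that warning as advisory on other templates.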


Runtime Expectations


Typical run time is 4-8 minutes:

Step	Wall clock	Notes
Asset sourcing	10-60s	App Store via `mcp__pika__fetch_appstore_screens`; website capture depends on page load
Screen analysis + arc	2-5 min	User confirmation can add time
GPT-image-2 enhancement	30-90s	Run selected screens in parallel
Seedance generation	3-5 min	Generated-audio moderation recovery adds one silent probe plus one same-seed sound replay
Kling fallback	5-15 min	Capacity wait or worker handoff may temporarily show `queued`; follow the Kling queued/handoff recovery runbook
Download verification	<30s	Local sanity check before delivery


Engine Choice: Seedance Primary, Kling Fallback


Seedance is the default because it handles polished motion-graphics references and 1080p app teasers well. Kling is the fallback for moderation, balance, or Seedance timeout failures because it is more permissive on some screen content and uses `quality_mode="pro"` for 1080p.
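The engine choice can be expressed as a small decision function. This is a minimal sketch under stated assumptions: `next_action` is a hypothetical helper, the error strings are matched as plain substrings of whatever text the tool returns, and the Kling parameter set lists only the fields this guide names.

```python
# Parameter sets taken from the generation contract; the dict layout is illustrative.
SEEDANCE_PARAMS = {"provider": "seedance", "resolution": "1080p", "duration": 15, "sound": True}
KLING_PARAMS = {"provider": "kling", "quality_mode": "pro"}  # "pro" corresponds to 1080p


def next_action(error_text):
    """Map a Seedance failure message to the next step.

    Assumption: the audio-moderation case can be recognised by the
    "Output audio has sensitive content" message, and must be checked first
    because it can also carry partner_validation_failed.
    """
    text = error_text.lower()
    if "output audio has sensitive content" in text:
        return "audio_recovery"  # same-seed silent probe, then sound replay
    if (
        "partner_validation_failed" in text
        or "insufficient_balance" in text
        or "timed out" in text
    ):
        return "kling_fallback"
    return "surface_error"
```

Checking the audio case before the generic `partner_validation_failed` match is what keeps generated-audio moderation from being treated as an immediate Kling fallback.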


Failure Modes


Symptom	Cause	Fix
`fast=True` with `resolution="1080p"`	Seedance caps fast mode at 720p	Remove `fast`; keep `resolution="1080p"`
`negative_prompt` rejected	Seedance does not accept this field	Use positive framing such as "smooth motion, stable camera"
Seedance generated-audio moderation: `content_policy_violation` / `partner_validation_failed`, `generated_video`, "Output audio has sensitive content."	Often a false positive on non-sensitive app-sizzle references	Follow the generated-audio recovery runbook: same-seed `sound=False` probe, then same-seed `sound=True` replay
Seedance timeout such as `seedance timed out after ...`	Provider queue saturation or tail latency exceeded the tool budget	Run the Kling fallback; do not keep retrying Seedance unless the user explicitly asks to wait
Seedance `partner_validation_failed` on video	Screen content includes recording UI, celebrity faces, or similar moderation triggers	Switch to `provider="kling"` and convert tokens to `<<<image_N>>>`
Faces in screenshots trigger content policy	Screenshot includes real people	Crop faces out before upload, or use Kling
6+ reference images reduce quality	The model blends too many refs	Keep to 3-5 references, roughly one per 3 seconds
Prompt tail ignored	Prompt exceeds about 200 words	Trim to the beat structure and the concrete UI details
Text in output is garbled	Video model is asked to render new text	Keep text as existing reference-image content; overlay any new branding in post
Logo reveal hallucinates letterforms	"assemble/build/construct" language triggers per-glyph rendering	Use "materializes whole", "crystallizes as a single form", or "fades in as a complete element"
Task returns `{ task_id }` instead of an inline result	Long-running generation exceeded the inline budget	Poll `mcp__pika__task_status(task_id)` until the status is `completed`, `failed`, or `cancelled`; unwrap `result.structuredContent` when present
Kling task returns status `queued` after previously running	Worker handoff or provider capacity wait	Follow the Kling queued/handoff recovery runbook; do not duplicate-submit unless the task has been queued for more than 10 minutes with no `lastUpdatedAt` movement
Kling rejects `resolution=`	Kling uses a different quality knob	Use `quality_mode="pro"`
App Store icon URL points to promo art	App Store metadata fallback found feature artwork	Prefer the `icon.url` returned by `mcp__pika__fetch_appstore_screens`; if missing, ask for a logo/icon file
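The `{ task_id }` row describes a polling loop, which can be sketched as follows. Everything here is illustrative: `wait_for_task` is a hypothetical helper, `get_status` stands in for a call to `mcp__pika__task_status`, the response is assumed to be a dict with `status` and `result` keys, and the interval and timeout defaults are arbitrary.

```python
import time

TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_task(task_id, get_status, interval_s=10.0, timeout_s=900.0, sleep=time.sleep):
    """Poll get_status(task_id) until a terminal state, then unwrap the result.

    get_status is a caller-supplied function (hypothetical wrapper around the
    task-status tool) that returns a dict such as {"status": ..., "result": ...}.
    """
    waited = 0.0
    while True:
        task = get_status(task_id)
        status = task.get("status")
        if status in TERMINAL_STATES:
            result = task.get("result") or {}
            # Unwrap result.structuredContent when present, else return result as-is.
            return status, result.get("structuredContent", result)
        if waited >= timeout_s:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout_s:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Injecting `sleep` keeps the loop testable without real delays; a `queued` status is treated as non-terminal, consistent with the Kling handoff row above.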
