muapi-ai-fight-scene
AI Fight Scene Generator
Generate a high-cut-density action / fight scene by first composing a 16-cell storyboard image, then driving Seedance 2.0 image-to-video off that storyboard.
Estimated credits: ~250 per run.
The core idea (from the AtlasCloud Seedance 2 + GPT Image 2 tutorial): action tension comes from cut density, not single-shot quality. Forcing the video model to follow a pre-drawn 4×4 storyboard grid gives you 16 distinct shots in a 15-second clip — landing punches, reverse angles, ECUs, whip-pans — that no t2v prompt could choreograph on its own.
Inputs
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| `character_description` | text | yes | — | Full physical description of the fighter(s). Asymmetric details (eye colour, scar side, holster on left hip) help the model preserve identity across panels. |
| `environment_description` | text | yes | — | The scene setting, e.g. "cyberpunk wet back-alley, neon kanji signage, Stray-game aesthetic, rain on chrome." |
| `action_script` | text | yes | — | The action beat, as prose or numbered beats. E.g. "Hero is cornered → blocks first punch → counter-elbow → throw opponent into trash cans → finisher." |
| `style_direction` | text | no | cinematic action film, anamorphic lens, high contrast, motion blur on hits | Aesthetic / look tags applied to every frame. |
| `duration` | int | no | 15 | Final video length in seconds. The storyboard's 16 cells map roughly 1 shot per second at default. |
| `aspect_ratio` | text | no | 16:9 | Output aspect — |
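The inputs and their defaults can be captured in a small container before any phase runs. The following is a minimal sketch, not part of the recipe itself: the class name `FightSceneInputs` is hypothetical, and the field names are assumed to mirror the `{{placeholder}}` names used in the phase prompts.

```python
from dataclasses import dataclass

DEFAULT_STYLE = "cinematic action film, anamorphic lens, high contrast, motion blur on hits"


@dataclass(frozen=True)
class FightSceneInputs:
    """Hypothetical container; field names mirror the {{placeholders}} in the prompts."""
    character_description: str
    environment_description: str
    action_script: str
    style_direction: str = DEFAULT_STYLE
    duration: int = 15           # seconds
    aspect_ratio: str = "16:9"

    def __post_init__(self) -> None:
        # The three required fields must carry real text, not just whitespace.
        for name in ("character_description", "environment_description", "action_script"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} is required")
        if self.duration <= 0:
            raise ValueError("duration must be a positive number of seconds")
```

Validating up front means a missing required input is caught while talking to the user, before any credits are spent on generation.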
Steps
Phase A — Character Sheet
Generate a clean turnaround-style character sheet using `muapi image generate` (model=`gpt-image-2-text-to-image`):
- Prompt:
  > Character reference sheet of {{character_description}}. Three views — front, 3/4, profile — on a neutral grey backdrop. Studio lighting, full body, no text overlays, photoreal. Asymmetric identifying details preserved on the correct side. {{style_direction}}.
- Aspect ratio: `3:2`
Present the character sheet and confirm identity details look right before proceeding. This image becomes reference #1 for later phases.
Phase B — Environment Concept
Use `muapi image generate` (model=`nano-banana-2`) to design the scene/world:
- Prompt:
  > Wide establishing shot of {{environment_description}}. No characters in frame — environment only. Strong perspective lines, depth, atmospheric haze. {{style_direction}}. Production-design concept art.
- Aspect ratio: `{{aspect_ratio}}`
Nano-Banana-2 is chosen here for its reasoning-driven composition — it's better than text-to-image-only models at producing locations with believable spatial logic (chokepoints, cover, sightlines) that an action scene can use. Present for approval. This becomes reference #2.
Phase C — 16-Cell Storyboard
Compose the action onto a single 4×4 storyboard image using `muapi image edit` (model=`gpt-image-2-image-to-image`):
- Reference Images: the character sheet from Phase A and the environment plate from Phase B.
- Prompt:
  > Compose a 4×4 storyboard grid (16 numbered cells) for the following action sequence: {{action_script}}
  >
  > CHARACTER (use reference image 1 identity throughout, asymmetric details preserved): {{character_description}}
  >
  > LOCATION (use reference image 2 spatial layout): {{environment_description}}
  >
  > Each cell labels: SHOT # (1–16) · SIZE (WIDE / MS / CU / ECU) · CAMERA-MOVE arrow (push, pull, whip, dolly, crash-zoom, handheld) · 1-word RHYTHM note (BEAT / IMPACT / RECOVERY / RESET). Vary shot size aggressively — never two WIDEs in a row. Land every IMPACT on a CU or ECU.
  >
  > Hand-drawn comic-book ink-and-wash style, monochrome with selective red accents on hits. Numbered cells, clear gutters between panels. Aesthetic: {{style_direction}}.
- Aspect ratio: `1:1` (square works best for a 4×4 grid)
Present the storyboard to the user. Confirm:
- The 16 shots read clearly
- Identity stays consistent cell-to-cell
- Cut density / shot-size variation looks aggressive enough
If a panel reads poorly, regenerate just the storyboard with that cell's note bolded ("CELL 7 must be an ECU on the right fist").
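The two hard rules in the storyboard prompt (never two WIDEs in a row, every IMPACT on a CU or ECU) are easy to check mechanically once the cell labels have been read off the image. The sketch below assumes the labels have already been transcribed by hand or by a vision model. The `Cell` type and `storyboard_problems` function are hypothetical helpers, not part of the `muapi` CLI.

```python
from typing import List, NamedTuple


class Cell(NamedTuple):
    shot: int     # 1-16, reading order top-left to bottom-right
    size: str     # WIDE / MS / CU / ECU
    rhythm: str   # BEAT / IMPACT / RECOVERY / RESET


def storyboard_problems(cells: List[Cell]) -> List[str]:
    """Return human-readable rule violations; an empty list means the grid passes."""
    problems = []
    if len(cells) != 16:
        problems.append(f"expected 16 cells, got {len(cells)}")
    # Rule 1: never two WIDE shots back to back.
    for prev, cur in zip(cells, cells[1:]):
        if prev.size == "WIDE" and cur.size == "WIDE":
            problems.append(f"cells {prev.shot} and {cur.shot} are both WIDE")
    # Rule 2: every IMPACT lands on a close-up or extreme close-up.
    for cell in cells:
        if cell.rhythm == "IMPACT" and cell.size not in ("CU", "ECU"):
            problems.append(f"cell {cell.shot} is an IMPACT on {cell.size}, expected CU or ECU")
    return problems
```

Each returned message names the offending cell, so it can be turned directly into the emphasised per-cell note used when regenerating the storyboard.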
Phase D — Storyboard → Video (Seedance 2.0)
Hand the storyboard to `muapi video from-image` (model=`seedance-v2.0-i2v`):
- Reference Image: the 16-cell storyboard from Phase C.
- Prompt:
  > Generate a {{duration}}-second action sequence that strictly follows the 16-cell storyboard reference image, cell-by-cell, top-left to bottom-right.
  > - Honour each cell's labelled SHOT SIZE and CAMERA-MOVE — match cuts to the storyboard's rhythm notes.
  > - Strong cinematic feel and shot language. Exaggerated dynamics. Hits land hard with motion blur and impact frames.
  > - Camera language: anamorphic, handheld where the storyboard calls for it, locked-off where it doesn't.
  > - Native audio: impact sfx on every IMPACT cell, footsteps, fabric/Foley, restrained low score under the action.
  >
  > Action being rendered: {{action_script}}. Aesthetic: {{style_direction}}.
- Duration: `{{duration}}` (default 15)
- Aspect ratio: `{{aspect_ratio}}`
After generation, present the final video. If the cut density feels too low or shots don't match the storyboard, regenerate Phase D first (cheaper than rebuilding the storyboard) with the prompt emphasising "strict cell-by-cell adherence" more aggressively.
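When judging whether the cut density is too low, it helps to know roughly where each cut should fall. The sketch below assumes the 16 cells are spread evenly across the clip. That is a simplification, since the video model is not guaranteed to space its cuts evenly, and `cell_windows` is a hypothetical helper.

```python
from typing import List, Tuple


def cell_windows(duration: float = 15.0, cells: int = 16) -> List[Tuple[int, float, float]]:
    """Evenly split the clip: cell n is expected to occupy [start, end) seconds."""
    per_cell = duration / cells
    return [
        (n, round((n - 1) * per_cell, 4), round(n * per_cell, 4))
        for n in range(1, cells + 1)
    ]


if __name__ == "__main__":
    for shot, start, end in cell_windows():
        print(f"cell {shot:2d}: {start:7.4f}s to {end:7.4f}s")
```

At the default 15 seconds each cell gets 15 / 16 = 0.9375 s, which is where the "roughly 1 shot per second" figure comes from. Scrubbing the output against these windows gives a quick sense of which cells were skipped or merged.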
Notes
- Why the storyboard image and not a text storyboard? Seedance 2.0 i2v anchors its motion plan to the visual reference. A grid of 16 drawn cells gives it 16 visual targets to hit — text descriptions of shots get averaged into mush.
- Asymmetric character details matter. Without something like "scar over the right eyebrow" or "leather glove on the left hand only", identity drift between cells is the #1 failure mode.
- Use `seedance-2.0-i2v-480p` to draft. Cheaper preview pass before committing to the full-res `seedance-v2.0-i2v` run.
- For longer fights, chain two runs: first run uses storyboard A (cells 1–16, beats 0–15s); second run uses storyboard B (cells 17–32, beats 15–30s) with the last cell of A as a continuity anchor in B's first cell.
- Language: Both English and Chinese prompts work in all four models, so the storyboard cell labels can be in either language.
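The chaining note above generalises to a simple numbering scheme. The following is a minimal sketch under the assumption that every run uses a full 16-cell storyboard and the same clip length. `chain_plan` and its dictionary keys are hypothetical names introduced for illustration.

```python
from typing import Dict, List


def chain_plan(runs: int = 2, cells_per_run: int = 16, seconds_per_run: int = 15) -> List[Dict]:
    """Lay out storyboard labels, cell ranges and time windows for chained runs."""
    plan = []
    for i in range(runs):
        first_cell = i * cells_per_run + 1
        plan.append({
            "storyboard": chr(ord("A") + i),
            "cells": (first_cell, first_cell + cells_per_run - 1),
            "seconds": (i * seconds_per_run, (i + 1) * seconds_per_run),
            # Every run after the first opens on the previous run's last cell.
            "continuity_anchor_cell": first_cell - 1 if i > 0 else None,
        })
    return plan
```

For the default two runs this yields storyboard A covering cells 1 to 16 over seconds 0 to 15, and storyboard B covering cells 17 to 32 over seconds 15 to 30 with cell 16 as its continuity anchor.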
Trigger Keywords
fight scene, action sequence, storyboard to video, cut density, cinematic action, combat choreography, seedance 2 storyboard
Pipeline at a Glance

    character_description ───► [GPT-Image-2 t2i] ─► character sheet ─────┐
                                                                         │
    environment_description ─► [Nano-Banana-2 t2i] ─► environment plate ─┼─► [GPT-Image-2 i2i] ─► 16-cell storyboard ─► [Seedance 2.0 i2v] ─► 15s action video
                                                                         │
    action_script + style_direction ────────────────────────────────────►┘

Notes for the Executing Agent
- This recipe is LLM-orchestrated: read each phase, gather any missing inputs from the user, then call `muapi` CLI commands. Use `muapi auth configure` first if `MUAPI_API_KEY` is unset.
- For model IDs without a CLI alias yet, fall back to the raw endpoint via `curl -X POST https://api.muapi.ai/api/v1/<endpoint> -H "x-api-key: $MUAPI_API_KEY" -H 'content-type: application/json' -d '{...}'` and poll with `muapi predict wait <request_id>`.
- Phase C uses TWO reference images (character sheet + environment plate). When calling `gpt-image-2-image-to-image`, pass them as a list under `images_list` (or the model's documented multi-ref field).
- Substitute `{{input_name}}` placeholders with the user's actual inputs before issuing each call.
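Placeholder substitution and the two-image request body for Phase C can be sketched as follows. This is an illustration under stated assumptions, not the documented API: `images_list` is the field named above, while the `prompt` and `aspect_ratio` keys and both helper functions are hypothetical and should be checked against the model's documentation.

```python
import json
import re

# Matches {{name}} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, inputs: dict) -> str:
    """Replace every {{name}} in the template; fail loudly on a missing input."""
    def fill(match: re.Match) -> str:
        name = match.group(1)
        if name not in inputs:
            raise KeyError(f"missing input: {name}")
        return str(inputs[name])
    return PLACEHOLDER.sub(fill, template)


def storyboard_body(prompt: str, character_sheet_url: str, environment_url: str) -> str:
    """Build a JSON body for the raw-endpoint fallback (key names partly assumed)."""
    return json.dumps({
        "prompt": prompt,                      # assumed key name
        # Order matters: the prompt refers to "reference image 1" and "reference image 2".
        "images_list": [character_sheet_url, environment_url],
        "aspect_ratio": "1:1",                 # assumed key name
    })
```

Raising on a missing input, rather than leaving a literal `{{name}}` in the prompt, keeps an unfilled placeholder from being sent to the model.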