muapi-ai-fight-scene

AI Fight Scene Generator

Generate a high-cut-density action / fight scene by first composing a 16-cell storyboard image, then driving Seedance 2.0 image-to-video off that storyboard.

Estimated credits: ~250 per run.

The core idea (from the AtlasCloud Seedance 2 + GPT Image 2 tutorial): action tension comes from cut density, not single-shot quality. Forcing the video model to follow a pre-drawn 4×4 storyboard grid gives you 16 distinct shots in a 15-second clip (landing punches, reverse angles, extreme close-ups, whip-pans) that no text-to-video prompt could choreograph on its own.

Inputs

Name	Type	Required	Default	Description
`character_description`	text	yes	—	Full physical description of the fighter(s). Asymmetric details (eye colour, scar side, holster on left hip) help the model preserve identity across panels.
`environment_description`	text	yes	—	The scene setting — e.g. "cyberpunk wet back-alley, neon kanji signage, Stray-game aesthetic, rain on chrome."
`action_script`	text	yes	—	The action beat — prose or numbered beats. E.g. "Hero is cornered → blocks first punch → counter-elbow → throw opponent into trash cans → finisher."
`style_direction`	text	no	cinematic action film, anamorphic lens, high contrast, motion blur on hits	Aesthetic / look tags applied to every frame.
`duration`	int	no	15	Final video length in seconds. The storyboard's 16 cells map roughly 1 shot per second at default.
`aspect_ratio`	text	no	16:9	Output aspect — `16:9` cinematic, `9:16` vertical, `1:1` square.
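
The table above can be captured as a small typed container so that missing required inputs are caught before any generation call is made. This is a minimal sketch: the field names and defaults come from the table, while the `FightSceneInputs` class, the `DEFAULT_STYLE` constant, and the decision to reject aspect ratios other than the three listed are assumptions made for illustration.

```python
from dataclasses import dataclass

# The three ratios listed in the Inputs table. Treating them as the only
# accepted values is an assumption of this sketch.
ALLOWED_ASPECT_RATIOS = ("16:9", "9:16", "1:1")
DEFAULT_STYLE = (
    "cinematic action film, anamorphic lens, high contrast, motion blur on hits"
)


@dataclass(frozen=True)
class FightSceneInputs:
    """Hypothetical container mirroring the Inputs table."""

    character_description: str
    environment_description: str
    action_script: str
    style_direction: str = DEFAULT_STYLE
    duration: int = 15
    aspect_ratio: str = "16:9"

    def __post_init__(self) -> None:
        # The three required text inputs must not be empty.
        for name in (
            "character_description",
            "environment_description",
            "action_script",
        ):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} is required")
        if self.duration <= 0:
            raise ValueError("duration must be a positive number of seconds")
        if self.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
            raise ValueError(
                f"aspect_ratio must be one of {ALLOWED_ASPECT_RATIOS}"
            )
```

Constructing the object with only the three required fields yields the documented defaults (15 seconds, `16:9`, and the default style string).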

Steps

Phase A — Character Sheet

Generate a clean turnaround-style character sheet using `muapi image generate` (model=`gpt-image-2-text-to-image`):

Prompt:

Character reference sheet of {{character_description}}. Three views — front, 3/4, profile — on a neutral grey backdrop. Studio lighting, full body, no text overlays, photoreal. Asymmetric identifying details preserved on the correct side. {{style_direction}}.

Aspect ratio: `3:2`

Present the character sheet and confirm identity details look right before proceeding. This image becomes reference #1 for later phases.

Phase B — Environment Concept

Use `muapi image generate` (model=`nano-banana-2`) to design the scene/world:

Prompt:

Wide establishing shot of {{environment_description}}. No characters in frame — environment only. Strong perspective lines, depth, atmospheric haze. {{style_direction}}. Production-design concept art.

Aspect ratio: `{{aspect_ratio}}`

Nano-Banana-2 is chosen here for its reasoning-driven composition — it's better than text-to-image-only models at producing locations with believable spatial logic (chokepoints, cover, sightlines) that an action scene can use. Present for approval. This becomes reference #2.

Phase C — 16-Cell Storyboard

Compose the action onto a single 4×4 storyboard image using `muapi image edit` (model=`gpt-image-2-image-to-image`):

Reference Images: the character sheet from Phase A and the environment plate from Phase B.

Prompt:

Compose a 4×4 storyboard grid (16 numbered cells) for the following action sequence:
{{action_script}}

CHARACTER (use reference image 1 identity throughout, asymmetric details preserved):
{{character_description}}

LOCATION (use reference image 2 spatial layout):
{{environment_description}}

Each cell labels: SHOT # (1–16) · SIZE (WIDE / MS / CU / ECU) · CAMERA-MOVE arrow (push, pull, whip, dolly, crash-zoom, handheld) · 1-word RHYTHM note (BEAT / IMPACT / RECOVERY / RESET).

Vary shot size aggressively — never two WIDEs in a row. Land every IMPACT on a CU or ECU.
Hand-drawn comic-book ink-and-wash style, monochrome with selective red accents on hits.
Numbered cells, clear gutters between panels.

Aesthetic: {{style_direction}}.

Aspect ratio: `1:1` (square works best for a 4×4 grid)

Present the storyboard to the user. Confirm:

- The 16 shots read clearly
- Identity stays consistent cell-to-cell
- Cut density / shot-size variation looks aggressive enough

If a panel reads poorly, regenerate just the storyboard with that cell's note bolded ("CELL 7 must be an ECU on the right fist").
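
The two shot-size rules in the prompt (never two WIDEs in a row, every IMPACT on a CU or ECU) are easy to check mechanically once a shot list exists. The sketch below assumes you have transcribed the storyboard into a list by hand or drafted it before prompting; it does not read the image. The `Cell` class and `check_storyboard` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Label vocabularies copied from the storyboard prompt.
SIZES = {"WIDE", "MS", "CU", "ECU"}
RHYTHMS = {"BEAT", "IMPACT", "RECOVERY", "RESET"}


@dataclass(frozen=True)
class Cell:
    """One storyboard cell as transcribed from its labels (hypothetical)."""

    shot: int
    size: str
    rhythm: str


def check_storyboard(cells: list[Cell]) -> list[str]:
    """Return human-readable rule violations; an empty list means it passes."""
    problems = []
    if [c.shot for c in cells] != list(range(1, 17)):
        problems.append("cells must be numbered 1-16 in order")
    for c in cells:
        if c.size not in SIZES:
            problems.append(f"cell {c.shot}: unknown size {c.size!r}")
        if c.rhythm not in RHYTHMS:
            problems.append(f"cell {c.shot}: unknown rhythm {c.rhythm!r}")
        if c.rhythm == "IMPACT" and c.size not in {"CU", "ECU"}:
            problems.append(f"cell {c.shot}: IMPACT must land on a CU or ECU")
    # Compare each cell with its successor to catch consecutive WIDEs.
    for prev, cur in zip(cells, cells[1:]):
        if prev.size == "WIDE" and cur.size == "WIDE":
            problems.append(
                f"cells {prev.shot} and {cur.shot}: two WIDEs in a row"
            )
    return problems
```

Each returned message names the offending cell, so it can be pasted into the regeneration prompt as that cell's emphasized note.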

Phase D — Storyboard → Video (Seedance 2.0)

Hand the storyboard to `muapi video from-image` (model=`seedance-v2.0-i2v`):

Reference Image: the 16-cell storyboard from Phase C.

Prompt:

Generate a {{duration}}-second action sequence that strictly follows the 16-cell storyboard reference image, cell-by-cell, top-left to bottom-right.

- Honour each cell's labelled SHOT SIZE and CAMERA-MOVE — match cuts to the storyboard's rhythm notes.
- Strong cinematic feel and shot language. Exaggerated dynamics. Hits land hard with motion blur and impact frames.
- Camera language: anamorphic, handheld where the storyboard calls for it, locked-off where it doesn't.
- Native audio: impact sfx on every IMPACT cell, footsteps, fabric/Foley, restrained low score under the action.

Action being rendered: {{action_script}}.
Aesthetic: {{style_direction}}.

Duration: `{{duration}}` (default 15)
Aspect ratio: `{{aspect_ratio}}`

After generation, present the final video. If the cut density feels too low or shots don't match the storyboard, regenerate Phase D first (cheaper than rebuilding the storyboard) with the prompt emphasising "strict cell-by-cell adherence" more aggressively.
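
When comparing the rendered video against the storyboard, it helps to know roughly where each cell should fall on the timeline. The sketch below assumes the clip is divided evenly across the 16 cells in reading order (top-left to bottom-right); the model is not guaranteed to cut that evenly, so treat the windows as rough targets. `cell_schedule` is a hypothetical helper name.

```python
def cell_schedule(duration: float = 15.0, rows: int = 4, cols: int = 4):
    """Map each grid cell, in reading order, to an even slice of the clip."""
    total = rows * cols
    step = duration / total  # 15 s / 16 cells = 0.9375 s per cell
    schedule = []
    for index in range(total):
        schedule.append(
            {
                "cell": index + 1,          # 1-based cell number
                "row": index // cols + 1,   # 1-based row, top to bottom
                "col": index % cols + 1,    # 1-based column, left to right
                "start": index * step,      # seconds
                "end": (index + 1) * step,  # seconds
            }
        )
    return schedule
```

With the defaults, cell 7 sits in row 2, column 3, and each cell gets just under one second, which matches the "roughly 1 shot per second" description of the `duration` input.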

Notes

Why the storyboard image and not a text storyboard? Seedance 2.0 i2v anchors its motion plan to the visual reference. A grid of 16 drawn cells gives it 16 visual targets to hit — text descriptions of shots get averaged into mush.
Asymmetric character details matter. Without something like "scar over the right eyebrow" or "leather glove on the left hand only", identity drift between cells is the #1 failure mode.
Use `seedance-2.0-i2v-480p` to draft. Cheaper preview pass before committing to the full-res `seedance-v2.0-i2v` run.
For longer fights, chain two runs: the first run uses storyboard A (cells 1–16, covering 0–15 s); the second uses storyboard B (cells 17–32, covering 15–30 s), with the last cell of A reused as a continuity anchor in B's first cell.
Language: Both English and Chinese prompts work in all four models, so the storyboard cell labels can be in either language.
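
The chaining note can be sketched as a small planning function that assigns each run its cell range, time window, and continuity anchor. It assumes 16 cells and 15 seconds per run and treats the first run as covering seconds 0 to 15; `chain_plan` and its dictionary keys are hypothetical names for illustration.

```python
def chain_plan(runs: int = 2, cells_per_board: int = 16, seconds_per_run: int = 15):
    """Lay out consecutive storyboard runs for a longer fight."""
    plan = []
    for i in range(runs):
        first_cell = i * cells_per_board + 1
        label = chr(ord("A") + i)  # A, B, C, ...
        # Every run after the first opens on the previous board's final cell.
        anchor = None
        if i > 0:
            previous = chr(ord("A") + i - 1)
            anchor = f"last cell of storyboard {previous}"
        plan.append(
            {
                "storyboard": label,
                "cells": (first_cell, first_cell + cells_per_board - 1),
                "seconds": (i * seconds_per_run, (i + 1) * seconds_per_run),
                "continuity_anchor": anchor,
            }
        )
    return plan
```

For two runs this produces storyboard A with cells 1 to 16 and storyboard B with cells 17 to 32, anchored on the last cell of A.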

Trigger Keywords

fight scene

action sequence

storyboard to video

cut density

cinematic action

combat choreography

seedance 2 storyboard

Pipeline at a Glance

character_description ──► [GPT-Image-2 t2i]   ─► character sheet ──┐
                                                                    │
environment_description ─► [Nano-Banana-2 t2i] ─► environment plate ┼─► [GPT-Image-2 i2i] ─► 16-cell storyboard ─► [Seedance 2.0 i2v] ─► 15s action video
                                                                    │
action_script + style_direction ───────────────────────────────────►┘
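
The diagram above is a small dependency graph: the character sheet and environment plate are independent, the storyboard needs both, and the video needs the storyboard. A minimal sketch of deriving a valid execution order from that graph, using the standard-library `graphlib` module (Python 3.9+); the `PIPELINE` and `MODELS` dictionaries are illustrative names, with model IDs copied from the phases.

```python
from graphlib import TopologicalSorter

# Each artifact maps to the set of artifacts it depends on.
PIPELINE = {
    "character sheet": set(),
    "environment plate": set(),
    "16-cell storyboard": {"character sheet", "environment plate"},
    "action video": {"16-cell storyboard"},
}

# Model used to produce each artifact, as named in Phases A-D.
MODELS = {
    "character sheet": "gpt-image-2-text-to-image",
    "environment plate": "nano-banana-2",
    "16-cell storyboard": "gpt-image-2-image-to-image",
    "action video": "seedance-v2.0-i2v",
}


def execution_order(graph=PIPELINE):
    """Return the artifacts in an order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for artifact in execution_order():
        print(f"{artifact}: {MODELS[artifact]}")
```

Because the first two artifacts have no dependency on each other, Phases A and B could be run in either order before Phase C.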

Notes for the Executing Agent

This recipe is LLM-orchestrated: read each phase, gather any missing inputs from the user, then call `muapi` CLI commands. Use `muapi auth configure` first if `MUAPI_API_KEY` is unset.

For model IDs without a CLI alias yet, fall back to the raw endpoint via

curl -X POST https://api.muapi.ai/api/v1/<endpoint> -H "x-api-key: $MUAPI_API_KEY" -H 'content-type: application/json' -d '{...}'

and poll with

muapi predict wait <request_id>
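
For agents that prefer Python over shelling out to `curl`, the same raw-endpoint request can be assembled with the standard library. This is a sketch under assumptions: the base URL and header names are taken from the `curl` line above, while the endpoint name and payload contents are left to the caller because they are not specified here. `build_request` is a hypothetical helper, and it only constructs the request without sending it.

```python
import json
import os
import urllib.request

# Base URL copied from the curl example; the endpoint is supplied by the caller.
API_BASE = "https://api.muapi.ai/api/v1"


def build_request(endpoint: str, payload: dict, api_key=None) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl call."""
    key = api_key or os.environ.get("MUAPI_API_KEY")
    if not key:
        raise RuntimeError("MUAPI_API_KEY is unset; run `muapi auth configure` first")
    return urllib.request.Request(
        url=f"{API_BASE}/{endpoint.lstrip('/')}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": key, "content-type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would submit it; the response format is not described in this recipe, so check the endpoint's documentation for where the request ID appears before polling.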

Phase C uses TWO reference images (character sheet + environment plate). When calling `gpt-image-2-image-to-image`, pass them as a list under `images_list` (or the model's documented multi-ref field).
Substitute `{{input_name}}` placeholders with the user's actual inputs before issuing each call.
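
Placeholder substitution can be sketched as a small function that fails loudly when an input is missing, so that a literal `{{...}}` never reaches a model. The `render` function and `PLACEHOLDER` pattern are hypothetical names, and the choice to raise `KeyError` on a missing input is an assumption of this sketch.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, inputs: dict) -> str:
    """Replace every {{name}} in the template with inputs[name]."""
    missing = sorted(
        {name for name in PLACEHOLDER.findall(template) if name not in inputs}
    )
    if missing:
        raise KeyError(f"missing inputs: {', '.join(missing)}")
    # A function replacement avoids backslash-escape surprises in user text.
    return PLACEHOLDER.sub(lambda match: str(inputs[match.group(1)]), template)
```

For example, rendering the Phase B prompt requires both `environment_description` and `style_direction`; omitting either raises an error naming the missing input.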
