video-generator


Video Generator

视频生成器

Generate professional short-form videos using Google VEO 3.1 (with native audio) or OpenAI Sora (high visual quality, up to 12s).
使用Google VEO 3.1(支持原生音频)或OpenAI Sora(高视觉质量,最长12秒)生成专业短视频。

Prerequisites & Setup

前置条件与配置

API Keys

API密钥

You need at least one. Having both gives you maximum flexibility.
VEO (Google):
  1. Go to Google AI Studio
  2. Sign in with your Google account
  3. Click "Create API Key"
  4. Copy the generated key
bash
export GEMINI_API_KEY=your_gemini_key_here
Sora (OpenAI):
  1. Go to OpenAI Platform
  2. Create a new API key
  3. Copy the generated key
bash
export OPENAI_API_KEY=your_openai_key_here
您至少需要其中一个密钥,同时拥有两者能获得最大灵活性。
VEO(Google):
  1. 访问 Google AI Studio
  2. 使用Google账号登录
  3. 点击「创建API密钥」
  4. 复制生成的密钥
bash
export GEMINI_API_KEY=your_gemini_key_here
Sora(OpenAI):
  1. 访问 OpenAI Platform
  2. 创建新的API密钥
  3. 复制生成的密钥
bash
export OPENAI_API_KEY=your_openai_key_here
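Before running anything, it can help to confirm which keys the current shell actually exports. The following is a minimal sketch, not part of the skill's script: the helper name `available_providers` is hypothetical, and only the two environment variable names come from the setup steps above.

```python
import os

# Environment variable names taken from the setup steps above.
PROVIDER_ENV_VARS = {"veo": "GEMINI_API_KEY", "sora": "OPENAI_API_KEY"}


def available_providers(env=None):
    """Return the providers whose API key is set to a non-empty value."""
    env = os.environ if env is None else env
    return [
        name
        for name, var in PROVIDER_ENV_VARS.items()
        if env.get(var, "").strip()
    ]


found = available_providers()
print("Configured providers:", ", ".join(found) if found else "none")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise without touching the real environment.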

Install Dependencies

安装依赖

bash
pip install google-genai requests
bash
pip install google-genai requests

Available Models

可用模型

| Provider | Model | CLI `--model` | Best For |
|----------|-------|---------------|----------|
| VEO | Veo 3.1 Standard | `standard` (default) | Quality, audio fidelity, final assets |
| VEO | Veo 3.1 Fast | `fast` | Drafts, iteration, quick previews |
| Sora | Sora 2 | `sora-2` (default) | Visual quality, creative motion |
| Sora | Sora 2 Pro | `sora-2-pro` | Highest Sora quality, slower |

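| 提供商 | 模型 | CLI `--model` | 适用场景 |
|--------|------|---------------|----------|
| VEO | Veo 3.1 Standard | `standard`(默认) | 画质、音频保真度、最终成品 |
| VEO | Veo 3.1 Fast | `fast` | 草稿、迭代、快速预览 |
| Sora | Sora 2 | `sora-2`(默认) | 视觉质量、创意动态效果 |
| Sora | Sora 2 Pro | `sora-2-pro` | 最高Sora画质,生成速度较慢 |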

When to Use Which

工具选择指南

| Need | Use |
|------|-----|
| Native synchronized audio (dialogue, SFX) | VEO - Sora has no audio |
| Longer clips (12 seconds) | Sora - VEO maxes at 8s |
| Higher visual fidelity / artistic styles | Sora - stronger on visual aesthetics |
| Fast iteration / drafts | VEO Fast - quickest turnaround |
| 4K resolution | VEO - Sora uses fixed sizes |
| Negative prompts (exclude elements) | VEO - Sora doesn't support them |

| 需求 | 推荐工具 |
|------|----------|
| 原生同步音频(对话、音效) | VEO - Sora不支持音频 |
| 更长片段(12秒) | Sora - VEO最长为8秒 |
| 更高视觉保真度/艺术风格 | Sora - 在视觉美学上表现更出色 |
| 快速迭代/草稿生成 | VEO Fast - 生成速度最快 |
| 4K分辨率 | VEO - Sora使用固定尺寸 |
| 负面提示词(排除元素) | VEO - Sora不支持该功能 |
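The decision table can be read as a small rule set. The sketch below is one possible encoding, assuming a hypothetical `choose_provider` helper. It treats audio, 4K, and negative prompts as VEO-only needs, treats clips longer than 8 seconds as a Sora-only need, and falls back to VEO because that is the CLI default.

```python
def choose_provider(needs_audio=False, duration_s=8, needs_4k=False,
                    needs_negative_prompt=False, prefer_visual_style=False):
    """Pick "veo" or "sora" from the decision table; raise if the needs conflict."""
    veo_only = needs_audio or needs_4k or needs_negative_prompt
    sora_only = duration_s > 8  # VEO maxes out at 8 seconds
    if veo_only and sora_only:
        raise ValueError(
            "clips longer than 8s need Sora, but audio, 4K, and negative prompts need VEO"
        )
    if sora_only or (prefer_visual_style and not veo_only):
        return "sora"
    return "veo"


print(choose_provider(needs_audio=True))         # veo
print(choose_provider(duration_s=12))            # sora
print(choose_provider(prefer_visual_style=True)) # sora
```

A conflicting request (for example, a 12-second clip with dialogue) raises rather than silently dropping one requirement, since no single provider in the table covers both.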

Video Parameters

视频参数

| Parameter | VEO Options | Sora Options | Default |
|-----------|-------------|--------------|---------|
| Duration | 4, 6, or 8 seconds | 4, 8, or 12 seconds | 8s |
| Resolution | 720p, 1080p, 4K | Fixed (from aspect ratio) | 720p |
| Aspect Ratio | 16:9, 9:16 | 16:9 -> 1280x720, 9:16 -> 720x1280 | 16:9 |
| Count | 1-4 variations | 1-4 variations | 1 |
| Negative Prompt | Supported | Not supported (ignored) | none |

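| 参数 | VEO可选值 | Sora可选值 | 默认值 |
|------|-----------|------------|--------|
| 时长 | 4、6或8秒 | 4、8或12秒 | 8秒 |
| 分辨率 | 720p、1080p、4K | 固定尺寸(由宽高比决定) | 720p |
| 宽高比 | 16:9、9:16 | 16:9 -> 1280x720,9:16 -> 720x1280 | 16:9 |
| 生成数量 | 1-4个变体 | 1-4个变体 | 1 |
| 负面提示词 | 支持 | 不支持(会被忽略) | 无 |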
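The parameter table lends itself to a pre-flight check before spending money on a request. This is a sketch under the assumption that the table is complete; `validate_request` is a hypothetical helper and is not taken from `generate_video.py`.

```python
# Allowed values copied from the parameter table above.
VEO_DURATIONS = (4, 6, 8)
SORA_DURATIONS = (4, 8, 12)
VEO_RESOLUTIONS = ("720p", "1080p", "4k")
SORA_SIZES = {"16:9": "1280x720", "9:16": "720x1280"}


def validate_request(provider, duration=8, aspect="16:9", resolution="720p", count=1):
    """Return a list of problems; an empty list means the request fits the table."""
    if provider not in ("veo", "sora"):
        return [f"unknown provider: {provider!r}"]
    problems = []
    allowed = VEO_DURATIONS if provider == "veo" else SORA_DURATIONS
    if duration not in allowed:
        problems.append(f"{provider} duration must be one of {allowed}, got {duration}")
    if aspect not in SORA_SIZES:
        problems.append(f"aspect ratio must be 16:9 or 9:16, got {aspect}")
    # Resolution only applies to VEO; Sora derives its size from the aspect ratio.
    if provider == "veo" and resolution.lower() not in VEO_RESOLUTIONS:
        problems.append(f"veo resolution must be one of {VEO_RESOLUTIONS}, got {resolution}")
    if not 1 <= count <= 4:
        problems.append(f"count must be between 1 and 4, got {count}")
    return problems


print(validate_request("veo", duration=12))   # one problem: 12s is Sora-only
print(validate_request("sora", duration=12))  # []
```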

Approximate Cost

大致成本

| Provider | Model | Cost |
|----------|-------|------|
| VEO | Standard | ~$0.025-0.05 per video |
| VEO | Fast | ~$0.01-0.025 per video |
| Sora | sora-2 | ~$0.10 per second of video |
| Sora | sora-2-pro | ~$0.20 per second of video |

| 提供商 | 模型 | 成本 |
|--------|------|------|
| VEO | Standard | 每段视频约0.025-0.05美元 |
| VEO | Fast | 每段视频约0.01-0.025美元 |
| Sora | sora-2 | 每秒钟视频约0.10美元 |
| Sora | sora-2-pro | 每秒钟视频约0.20美元 |
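Because VEO is priced per video and Sora per second, the totals diverge quickly with duration and count. The sketch below simply multiplies out the approximate figures from the table. The `estimate_cost` helper is hypothetical, and actual billing may differ.

```python
# Approximate prices copied from the cost table above (USD).
VEO_COST_PER_VIDEO = {"standard": (0.025, 0.05), "fast": (0.01, 0.025)}
SORA_COST_PER_SECOND = {"sora-2": 0.10, "sora-2-pro": 0.20}


def estimate_cost(provider, model, duration_s=8, count=1):
    """Return an approximate (low, high) cost in USD for a batch of clips."""
    if provider == "veo":
        low, high = VEO_COST_PER_VIDEO[model]
        return (round(low * count, 4), round(high * count, 4))
    if provider == "sora":
        total = round(SORA_COST_PER_SECOND[model] * duration_s * count, 4)
        return (total, total)
    raise ValueError(f"unknown provider: {provider!r}")


# Two 12-second sora-2 clips: 0.10 * 12 * 2
print(estimate_cost("sora", "sora-2", duration_s=12, count=2))  # (2.4, 2.4)
# Four VEO Fast drafts: 4 * (0.01 to 0.025)
print(estimate_cost("veo", "fast", count=4))                    # (0.04, 0.1)
```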

Latency

生成延迟

Video generation is async - expect 11 seconds to 6 minutes depending on server load and provider. The script polls automatically and saves when ready.

视频生成为异步操作——根据服务器负载和提供商不同,耗时约11秒到6分钟不等。脚本会自动轮询状态,生成完成后自动保存。
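The polling behavior described here follows a common pattern: check the job, sleep, repeat until a result or a deadline. The sketch below illustrates that pattern only and is not the script's actual implementation. `check_status` is a hypothetical callable that returns `None` while the job is pending, and the 10-second interval and 6-minute timeout are assumed values.

```python
import time


def poll_until_done(check_status, interval_s=10.0, timeout_s=360.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call check_status() until it returns a non-None result or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        result = check_status()
        if result is not None:
            return result
        if clock() >= deadline:
            raise TimeoutError(f"video not ready after {timeout_s} seconds")
        sleep(interval_s)


# Simulated job that finishes on the third check (no real API call, no real waiting).
responses = iter([None, None, "clip.mp4"])
print(poll_until_done(lambda: next(responses), sleep=lambda seconds: None))  # clip.mp4
```

Injecting `sleep` and `clock` keeps the loop testable without waiting minutes for a real generation job.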

Workflow Overview

工作流程概述

  1. Define the Concept - What story does the video tell in 4-12 seconds?
  2. Storyboard the Shot - Camera, motion, subject, environment
  3. Add Audio Direction - Dialogue, sound effects, ambient sound (VEO only)
  4. Generate Video - Run via API
  5. Iterate - Adjust prompt based on results
  1. 明确创意概念 - 这段4-12秒的视频要讲述什么故事?
  2. 分镜设计 - 镜头、运动、主体、环境
  3. 音频指导 - 对话、音效、环境音(仅VEO支持)
  4. 生成视频 - 通过API运行生成
  5. 迭代优化 - 根据生成结果调整提示词

Prompt Rules

提示词规则

  1. 150-300 characters is the sweet spot. Under 100 = generic. Over 400 = the model drops elements unpredictably.
  2. One shot = one action. Don't pack multiple scene changes or style shifts into one prompt. One camera move + one subject action.
  3. Describe what you want, not what you don't want. Use the `--negative` flag for exclusions (VEO only), not the main prompt.
  4. Treat audio as a separate layer. Write audio cues in their own sentences, not mixed into visual descriptions. (VEO only - Sora has no audio.)
  5. Use colon syntax for dialogue. `A man says: "Hello!"` prevents subtitle artifacts. (VEO only.)
  6. Keep dialogue under 7 words per line. Longer speech causes lip-sync drift. (VEO only.)
  7. Start simple, then layer. Begin with a basic prompt, evaluate, then add one variable at a time.
  8. Slow camera movements win. Fast pans and spins break output. Use tight framing for perceived speed.

  1. 最佳长度为150-300字符。少于100字符会导致内容通用化;超过400字符,模型会随机忽略部分元素。
  2. 一个镜头对应一个动作。不要在一个提示词中加入多个场景切换或风格转变。一个镜头运动+一个主体动作即可。
  3. 描述想要的内容,而非不想要的。使用 `--negative` 参数指定排除元素(仅VEO支持),不要在主提示词中体现。
  4. 将音频视为独立层级。单独写音频提示,不要与视觉描述混合。(仅VEO支持——Sora无音频功能。)
  5. 对话使用冒号语法。`A man says: "Hello!"` 可避免字幕瑕疵。(仅VEO支持。)
  6. 每行对话不超过7个单词。过长的台词会导致唇形同步偏差。(仅VEO支持。)
  7. 从简单开始,逐步叠加。先写基础提示词,评估结果后再每次添加一个变量。
  8. 缓慢镜头运动效果更佳。快速摇镜和旋转会破坏输出效果。使用紧凑构图来营造速度感。
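Rules 1 and 6 are mechanical enough to check automatically before a prompt is sent. The following is a rough sketch, assuming a hypothetical `lint_prompt` helper. It counts characters against the thresholds above and counts words in any dialogue written with the colon syntax. The other rules need human judgment and are not covered.

```python
import re


def lint_prompt(prompt):
    """Return warnings for the length rule and the 7-word dialogue rule."""
    warnings = []
    length = len(prompt)
    if length < 100:
        warnings.append("under 100 characters: output may be generic")
    elif length > 400:
        warnings.append("over 400 characters: the model may drop elements")
    elif not 150 <= length <= 300:
        warnings.append("outside the 150-300 character sweet spot")
    # Dialogue in colon syntax, for example: A man says: "Hello!"
    for line in re.findall(r':\s*"([^"]+)"', prompt):
        if len(line.split()) >= 7:
            warnings.append(f"dialogue has 7 or more words: {line!r}")
    return warnings


print(lint_prompt("A dog in a park"))
# ['under 100 characters: output may be generic']
```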

Storyboarding the Shot

分镜设计

Structure your prompt with cinematic language. Both VEO and Sora respond well to film terminology:
使用电影术语构建提示词。VEO和Sora都能很好地响应专业影视词汇:

Camera Language

镜头语言

| Term | Effect |
|------|--------|
| Wide shot | Shows full environment, establishes context |
| Close-up | Tight on a subject, emphasizes detail |
| Tracking shot | Camera follows subject movement |
| Dolly in/out | Camera moves toward or away from subject |
| Static shot | Locked camera, subject moves within frame |
| Slow pan | Camera rotates horizontally across scene |
| Overhead / bird's eye | Looking straight down |
| Low angle | Looking up at subject, adds drama |

| 术语 | 效果 |
|------|------|
| 全景(Wide shot) | 展示完整环境,交代背景 |
| 特写(Close-up) | 聚焦主体,强调细节 |
| 跟拍(Tracking shot) | 镜头跟随主体运动 |
| 推拉镜头(Dolly in/out) | 镜头靠近或远离主体 |
| 固定镜头(Static shot) | 镜头固定,主体在画面内运动 |
| 缓慢摇镜(Slow pan) | 镜头水平缓慢扫过场景 |
| 俯拍/鸟瞰(Overhead / bird's eye) | 垂直向下拍摄 |
| 仰拍(Low angle) | 从低处向上拍摄主体,增加戏剧性 |

Motion Description

运动描述

Be explicit about what moves and how:
| Don't | Do |
|-------|----|
| "A dog in a park" | "A golden retriever runs toward camera through tall grass, ears bouncing" |
| "City at night" | "Camera slowly dollies through a neon-lit Tokyo alley as rain puddles reflect signs" |
| "Ocean" | "A single wave forms, curls, and crashes onto wet sand in slow motion" |

明确描述运动对象和方式:
| 错误示例 | 正确示例 |
|----------|----------|
| "公园里的狗" | "一只金毛猎犬穿过高高的草丛朝镜头跑来,耳朵晃动" |
| "夜晚的城市" | "镜头缓慢推拉过霓虹闪烁的东京小巷,雨水坑倒映着招牌" |
| "海洋" | "一道海浪形成、卷曲,然后缓慢拍打到湿润的沙滩上" |

Lighting & Atmosphere

光线与氛围

| Term | Mood |
|------|------|
| Golden hour | Warm, nostalgic, cinematic |
| Overcast | Soft, even, contemplative |
| Neon / artificial | Urban, energetic, modern |
| Candlelight | Intimate, quiet |
| Hard shadows | Dramatic, high contrast |

| 术语 | 氛围 |
|------|------|
| 黄金时段(Golden hour) | 温暖、怀旧、电影感 |
| 阴天(Overcast) | 柔和、均匀、沉思 |
| 霓虹/人工光(Neon / artificial) | 都市、活力、现代 |
| 烛光(Candlelight) | 私密、安静 |
| 硬阴影(Hard shadows) | 戏剧性、高对比度 |

Audio Direction (VEO Only)

音频指导(仅VEO支持)

VEO 3.1 generates synchronized audio natively. This is a major differentiator.
VEO 3.1可原生生成同步音频,这是其核心优势之一。

Three Types of Audio Cues

三种音频提示类型

1. Dialogue - Use colon syntax before quotes:
A barista says: "Here you go!" as she slides a latte across the counter.
Keep lines under 7 words for clean lip-sync. One sentence max per 8-second clip.
2. Sound Effects - Describe specific sounds:
The sound of a match striking, then a candle flame flickering to life.
3. Ambient Sound - Set the sonic environment:
Birds chirping in the background, distant traffic hum, morning atmosphere.

1. 对话 - 在引号前使用冒号语法:
咖啡师说: "您的餐点好了!" 同时将拿铁推过柜台。
每行对话不超过7个单词,以保证唇形同步清晰。每段8秒的视频最多包含一句台词。
2. 音效 - 描述具体声音:
火柴划燃的声音,然后蜡烛火焰闪烁着点燃。
3. 环境音 - 设定声音环境:
背景中鸟儿鸣叫,远处车流声,清晨的氛围。

Prompt Structure (5-Element Priority)

提示词结构(5要素优先级)

Structure prompts in this order - you don't need all five every time:
  1. Shot Specification - camera work, framing, movement
  2. Setting & Atmosphere - location, time, weather, lighting
  3. Subject & Action - who/what, described in beats
  4. Audio Layer - dialogue, SFX, ambient (VEO only)
  5. Style/Grade - artistic treatment, lens, color
[Shot type + camera movement]. [Setting and lighting]. [Subject doing action].
[Audio: what you hear]. [Style/grade].
按以下顺序构建提示词——无需每次都包含所有5个要素:
  1. 镜头规格 - 镜头类型、构图、运动
  2. 场景与氛围 - 地点、时间、天气、光线
  3. 主体与动作 - 主体是谁/是什么,分步骤描述动作
  4. 音频层级 - 对话、音效、环境音(仅VEO支持)
  5. 风格/调色 - 艺术处理、镜头、色彩
[镜头类型 + 运动]. [场景与光线]. [主体动作].
[音频:听到的内容]. [风格/调色].
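The five-element template can be assembled programmatically, which keeps the ordering consistent across prompts. This is a minimal sketch, assuming a hypothetical `build_prompt` helper. It joins whichever elements are supplied in the priority order above and drops the audio layer for Sora, which has no audio.

```python
def build_prompt(shot=None, setting=None, subject=None, audio=None, style=None,
                 provider="veo"):
    """Join the supplied elements in priority order, skipping any that are missing."""
    if provider == "sora":
        audio = None  # Sora has no audio, so the audio layer is dropped
    sentences = []
    for part in (shot, setting, subject, audio, style):
        if part and part.strip():
            text = part.strip()
            # Terminate each element so the elements read as separate sentences.
            sentences.append(text if text[-1] in '.!?"' else text + ".")
    return " ".join(sentences)


print(build_prompt(
    shot="Slow dolly in",
    subject="A barista slides a latte across the counter",
    audio="The hiss of a steam wand",
    provider="sora",
))
# Slow dolly in. A barista slides a latte across the counter.
```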

Example Prompts

提示词示例

Podcast promo (VEO, 16:9):
A close-up tracking shot of a vintage microphone in a warmly lit podcast studio.
Steam rises slowly from a coffee mug beside it. Morning sunlight filters through
blinds, casting soft stripes across the desk. The sound of a quiet room - a clock
ticking, the faint hum of equipment. Cinematic, intimate, inviting.
Social teaser - vertical (Sora, 9:16):
A hand reaches into frame and opens a leather-bound journal on a wooden desk.
The pages flutter briefly before settling on a page covered in handwritten notes.
A pen is set down beside the book. Warm overhead lighting, shallow depth of field.
Shot on 16mm film, natural grain.
Product reveal (either provider, 16:9):
Camera slowly orbits a pair of wireless headphones placed on a dark marble surface.
Dramatic studio lighting with a single warm key light from the left. The headphones
cast a sharp shadow. Premium, minimal, modern.

播客宣传视频(VEO,16:9):
特写跟拍镜头,展示温暖灯光下播客工作室里的复古麦克风。
旁边的咖啡杯缓缓冒着蒸汽。清晨的阳光透过百叶窗,在桌面上投下柔和的条纹。
安静的房间声音——时钟滴答声,设备微弱的嗡嗡声。
电影感、私密、有吸引力。
社交平台预告视频 - 竖屏(Sora,9:16):
一只手进入画面,在木质桌面上打开一本皮质封面的日记。
书页短暂翻动后停在满是手写笔记的页面上。
一支钢笔放在书旁边。
温暖的顶光,浅景深。
16mm胶片拍摄,自然颗粒感。
产品展示视频(两款工具均可,16:9):
镜头缓慢环绕放置在深色大理石台面上的无线耳机。
戏剧性的工作室灯光,左侧有一束温暖的主光源。
耳机投下清晰的阴影。
高端、极简、现代。

Running the Script

运行脚本


VEO: Basic generation (8s, 720p, 16:9)

VEO:基础生成(8秒,720p,16:9)

python scripts/generate_video.py "Your prompt here"
python scripts/generate_video.py "你的提示词"

VEO: Fast draft for iteration

VEO:快速生成草稿(用于迭代)

python scripts/generate_video.py "Your prompt here" --model fast
python scripts/generate_video.py "你的提示词" --model fast

VEO: High quality vertical video for social

VEO:生成社交媒体用高质量竖屏视频

python scripts/generate_video.py "Your prompt" --aspect 9:16 --resolution 1080p
python scripts/generate_video.py "你的提示词" --aspect 9:16 --resolution 1080p

VEO: Multiple variations to choose from

VEO:生成多个变体供选择

python scripts/generate_video.py "Your prompt" --count 2 --output ./videos
python scripts/generate_video.py "你的提示词" --count 2 --output ./videos

VEO: Short clip with specific settings

VEO:生成特定参数的短视频

python scripts/generate_video.py "Your prompt" --duration 4 --resolution 4k --name "hero-clip"
python scripts/generate_video.py "你的提示词" --duration 4 --resolution 4k --name "hero-clip"

VEO: Exclude unwanted elements

VEO:排除不需要的元素

python scripts/generate_video.py "Your prompt" --negative "text overlays, watermarks, blurry"
python scripts/generate_video.py "你的提示词" --negative "text overlays, watermarks, blurry"

Sora: Basic generation

Sora:基础生成

python scripts/generate_video.py "A cat on a windowsill, warm light" --provider sora
python scripts/generate_video.py "窗台上的猫,温暖光线" --provider sora

Sora: 12-second clip (longer than VEO allows)

Sora:生成12秒长视频(VEO不支持)

python scripts/generate_video.py "A dog running through a meadow" --provider sora --duration 12
python scripts/generate_video.py "狗穿过草地奔跑" --provider sora --duration 12

Sora: Pro model, vertical

Sora:使用Pro模型生成竖屏视频

python scripts/generate_video.py "Latte art being poured" --provider sora --model sora-2-pro --aspect 9:16
python scripts/generate_video.py "拉花咖啡制作过程" --provider sora --model sora-2-pro --aspect 9:16

Sora: Multiple variations

Sora:生成多个变体

python scripts/generate_video.py "Ocean waves at sunset" --provider sora --count 2 --output ./videos

**Options:**

| Flag | Values | Default | Notes |
|------|--------|---------|-------|
| `--provider` | `veo`, `sora` | `veo` | VEO for audio, Sora for visual quality |
| `--model` | VEO: `standard`, `fast` / Sora: `sora-2`, `sora-2-pro` | `standard` / `sora-2` | Provider-specific models |
| `--aspect` | `16:9`, `9:16` | `16:9` | Vertical for Reels/TikTok/Shorts |
| `--resolution` | `720p`, `1080p`, `4k` | `720p` | VEO only (ignored by Sora) |
| `--duration` | VEO: `4`, `6`, `8` / Sora: `4`, `8`, `12` | `8` | Sora supports 12s |
| `--negative` | text | none | VEO only (ignored by Sora) |
| `--count` | `1`-`4` | `1` | Generate variations |
| `--output` | path | `.` | Save directory |
| `--name` | text | none | Filename prefix |

**Output:** MP4 files with timestamp-based filenames.

---
python scripts/generate_video.py "日落时的海浪" --provider sora --count 2 --output ./videos

**可选参数:**

| 参数 | 可选值 | 默认值 | 说明 |
|------|--------|---------|-------|
| `--provider` | `veo`, `sora` | `veo` | VEO支持音频,Sora视觉质量更高 |
| `--model` | VEO: `standard`, `fast` / Sora: `sora-2`, `sora-2-pro` | `standard` / `sora-2` | 各提供商专属模型 |
| `--aspect` | `16:9`, `9:16` | `16:9` | 9:16为竖屏,适用于Reels/TikTok/Shorts |
| `--resolution` | `720p`, `1080p`, `4k` | `720p` | 仅VEO支持(Sora会忽略该参数) |
| `--duration` | VEO: `4`, `6`, `8` / Sora: `4`, `8`, `12` | `8` | Sora支持12秒时长 |
| `--negative` | 文本 | 无 | 仅VEO支持(Sora会忽略该参数) |
| `--count` | `1`-`4` | `1` | 生成多个变体 |
| `--output` | 路径 | `.` | 保存目录 |
| `--name` | 文本 | 无 | 文件名前缀 |

**输出:** 带有时间戳文件名的MP4文件。

---
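When the script is driven from other Python code, building the argument list explicitly avoids shell-quoting problems with prompts that contain quotes. The sketch below is an assumption-laden illustration: `build_command` is a hypothetical helper, and it only knows the flags listed in the options table above.

```python
# Flag names copied from the options table above (without the leading dashes).
KNOWN_FLAGS = ("provider", "model", "aspect", "resolution", "duration",
               "negative", "count", "output", "name")


def build_command(prompt, **options):
    """Return an argv list for scripts/generate_video.py using documented flags only."""
    argv = ["python", "scripts/generate_video.py", prompt]
    for flag, value in options.items():
        if flag not in KNOWN_FLAGS:
            raise ValueError(f"unknown flag: --{flag}")
        argv += [f"--{flag}", str(value)]
    return argv


print(build_command("A cat on a windowsill, warm light", provider="sora", duration=12))
```

The resulting list can be handed to `subprocess.run(argv, check=True)`, which passes the prompt as a single argument without any shell involved.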

Iteration Strategy

迭代策略

  1. Start with `--model fast` (VEO) or `--duration 4` (Sora) for quick drafts
  2. Refine the prompt through 2-3 fast iterations
  3. Switch to full quality/duration for the final take
  4. Generate 2 variations of the final prompt and pick the best
After reviewing generated video:
  • Motion wrong? Be more explicit about direction, speed, and sequence
  • Audio off? (VEO) Add or refine audio cues
  • Too much happening? Simplify. One clear action per clip works best
  • Style drift? Add a negative prompt (VEO) or adjust style descriptors
  • Wrong mood? Adjust lighting and atmosphere descriptors

  1. 先使用 `--model fast`(VEO)或 `--duration 4`(Sora)快速生成草稿
  2. 通过2-3次快速迭代优化提示词
  3. 切换到全质量/完整时长生成最终版本
  4. 生成2个最终提示词的变体,选择最佳版本
查看生成的视频后:
  • 运动效果不符合预期? 更明确地描述方向、速度和动作顺序
  • 音频效果不佳?(VEO)添加或优化音频提示
  • 内容过于繁杂? 简化提示词。每个镜头一个清晰动作效果最佳
  • 风格偏差? 添加负面提示词(VEO)或调整风格描述
  • 氛围不对? 调整光线和氛围描述

Negative Prompt Guide (VEO Only)

负面提示词指南(仅VEO支持)

Use `--negative` to steer away from common problems:

| Problem | Negative Prompt |
|---------|-----------------|
| Text/watermarks appearing | "text, watermarks, logos, subtitles" |
| Uncanny faces | "distorted faces, morphing features" |
| Jittery motion | "jerky motion, flickering, stuttering" |
| Over-saturated look | "oversaturated, HDR, neon colors" |
| Stock footage feel | "generic, corporate, stock footage aesthetic" |

使用 `--negative` 参数避免常见问题:

| 问题 | 负面提示词 |
|------|------------|
| 出现文字/水印 | "text, watermarks, logos, subtitles" |
| 面部诡异 | "distorted faces, morphing features" |
| 运动抖动 | "jerky motion, flickering, stuttering" |
| 色彩过饱和 | "oversaturated, HDR, neon colors" |
| 素材感过重 | "generic, corporate, stock footage aesthetic" |
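When several of these problems show up in the same clip, the negative prompts can be merged into one `--negative` value. This is a small sketch, assuming a hypothetical `build_negative` helper. The preset strings come from the table above, while the short keys (`text`, `faces`, and so on) are labels invented for illustration.

```python
# Preset strings copied from the table above; the keys are illustrative labels.
NEGATIVE_PRESETS = {
    "text": "text, watermarks, logos, subtitles",
    "faces": "distorted faces, morphing features",
    "jitter": "jerky motion, flickering, stuttering",
    "saturation": "oversaturated, HDR, neon colors",
    "stock": "generic, corporate, stock footage aesthetic",
}


def build_negative(*problems):
    """Merge the chosen presets into one comma-separated string without repeats."""
    terms = []
    for problem in problems:
        for term in NEGATIVE_PRESETS[problem].split(", "):
            if term not in terms:
                terms.append(term)
    return ", ".join(terms)


print(build_negative("text", "jitter"))
# text, watermarks, logos, subtitles, jerky motion, flickering, stuttering
```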

Prompting Principles

提示词原则

Think in Shots, Not Scenes

以镜头为单位思考,而非场景

8 seconds is one shot. Don't try to cram a narrative arc - describe a single continuous moment.
| Don't | Do |
|-------|----|
| "A chef makes a meal from scratch and serves it" | "A chef's hands julienne carrots on a wooden cutting board, knife moving rhythmically" |
| "A day at the beach from sunrise to sunset" | "Waves gently lap at bare feet on sand, golden hour light, camera at ground level" |

8秒就是一个镜头。不要试图塞进完整叙事弧——描述一个连续的瞬间即可。
| 错误示例 | 正确示例 |
|----------|----------|
| "厨师从头制作餐点并端上" | "厨师的手在木质砧板上切胡萝卜丝,刀具节奏性地移动" |
| "从日出到日落的海滩一日" | "海浪轻轻拍打着沙滩上的赤脚,黄金时段光线,镜头贴近地面" |

Be Specific About Motion

明确描述运动细节

Vague motion descriptions produce vague results. Describe what moves, how fast, and in which direction.
模糊的运动描述会导致模糊的结果。请描述什么在动、速度如何、朝哪个方向。

Layer Your Audio (VEO)

叠加音频层次(VEO)

Don't just describe one sound - create a soundscape:
The crackling of a vinyl record playing soft jazz,
a distant car horn outside the window,
the quiet clink of an ice cube in a glass.
不要只描述一种声音——构建完整的音景:
黑胶唱片播放轻柔爵士乐的噼啪声,
窗外远处的汽车喇叭声,
玻璃杯里冰块碰撞的轻微叮当声。

Use Negative Prompts Proactively (VEO)

主动使用负面提示词(VEO)

Always include `--negative "text, watermarks"` at minimum. The model occasionally generates unwanted text overlays.

至少始终包含 `--negative "text, watermarks"`。模型偶尔会生成不必要的文字叠加层。

Use Cases

适用场景

Social Media (9:16, 4-8s)

社交媒体(9:16,4-8秒)

Short, punchy, loop-friendly. Favor close-ups and strong motion.
简短、有力、适合循环播放。优先选择特写镜头和强烈运动效果。

VEO with audio

VEO带音频版本

python scripts/generate_video.py "Close-up of coffee being poured into a ceramic mug, steam rising, warm morning light. The sound of liquid pouring and a soft sigh." \
  --aspect 9:16 --duration 4 --resolution 1080p
python scripts/generate_video.py "特写镜头展示咖啡倒入陶瓷杯,蒸汽升起,温暖的晨光。液体倾倒的声音和轻柔的叹息声。" \
  --aspect 9:16 --duration 4 --resolution 1080p

Sora for visual quality

Sora高画质版本

python scripts/generate_video.py "Close-up of coffee being poured into a ceramic mug, steam rising, warm morning light. Photorealistic, shallow depth of field." \
  --provider sora --aspect 9:16 --duration 4
python scripts/generate_video.py "特写镜头展示咖啡倒入陶瓷杯,蒸汽升起,温暖的晨光。照片级真实感,浅景深。" \
  --provider sora --aspect 9:16 --duration 4

Podcast/Newsletter Headers (16:9, 8s)

播客/通讯头图(16:9,8秒)

Ambient, atmospheric. Favor wide shots and subtle motion.
bash
python scripts/generate_video.py "A vintage radio on a wooden shelf, dial slowly turning. Warm tungsten light. Soft static transitioning into faint music." \
  --resolution 1080p --name "podcast-header"
环境感强、富有氛围。优先选择全景镜头和细微的运动。
bash
python scripts/generate_video.py "木质架子上的复古收音机,调谐旋钮缓慢转动。温暖的钨丝灯光。微弱的静电声逐渐过渡为轻柔的音乐。" \
  --resolution 1080p --name "podcast-header"

Product/Brand (16:9, 6-8s)

产品/品牌宣传(16:9,6-8秒)

Clean, controlled, premium feel. Studio lighting, slow orbits.
bash
python scripts/generate_video.py "Camera slowly orbits a leather notebook on a dark wood desk. Single warm key light. The sound of pages turning gently." \
  --resolution 4k --duration 6 --negative "text, watermarks, busy background"
简洁、可控、高端质感。工作室灯光,缓慢环绕镜头。
bash
python scripts/generate_video.py "镜头缓慢环绕深色木桌上的皮质笔记本。单束温暖的主光源。书页轻轻翻动的声音。" \
  --resolution 4k --duration 6 --negative "text, watermarks, busy background"

Extended Clips (Sora, 12s)

长片段视频(Sora,12秒)

For content that needs more breathing room.
bash
python scripts/generate_video.py "A woman walks through an autumn forest path, leaves falling around her. Golden hour light filters through the canopy. Cinematic, contemplative." \
  --provider sora --duration 12 --name "autumn-walk"

适合需要更多表达空间的内容。
bash
python scripts/generate_video.py "一名女子走过秋日林间小径,树叶在周围飘落。黄金时段的光线透过树冠洒下。电影感、沉思氛围。" \
  --provider sora --duration 12 --name "autumn-walk"

Multi-Clip Consistency

多镜头一致性

When generating multiple clips for a project:
为同一项目生成多个镜头时:

Lock Your Constants

锁定常量

Create a consistency block and repeat it verbatim across all prompts:
CHARACTER: A woman in her thirties with short silver hair and a black turtleneck
PALETTE: amber, cream, walnut brown, deep olive
LIGHTING: Soft key light from camera right, warm tungsten
STYLE: Cinematic, shallow depth of field, warm film grain
NEGATIVE: no subtitles, no on-screen text, no watermarks
创建一致性模块,在所有提示词中逐字重复:
CHARACTER: 三十多岁的女性,银灰色短发,黑色高领衫
PALETTE: 琥珀色、奶油色、胡桃棕、深橄榄绿
LIGHTING: 镜头右侧的柔和主光源,温暖钨丝灯
STYLE: 电影感、浅景深、温暖胶片颗粒
NEGATIVE: no subtitles, no on-screen text, no watermarks
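Repeating the block verbatim is easiest when it lives in one place and is attached to each shot description by code. The sketch below assumes the block is simply appended to the prompt text, which the source does not specify. `with_constants` is a hypothetical helper, and the values are the example constants from above.

```python
# Example constants copied from the consistency block above.
CONSTANTS = {
    "CHARACTER": "A woman in her thirties with short silver hair and a black turtleneck",
    "PALETTE": "amber, cream, walnut brown, deep olive",
    "LIGHTING": "Soft key light from camera right, warm tungsten",
    "STYLE": "Cinematic, shallow depth of field, warm film grain",
    "NEGATIVE": "no subtitles, no on-screen text, no watermarks",
}


def with_constants(shot_description, constants=CONSTANTS):
    """Append the same consistency block, word for word, to a shot description."""
    block = "\n".join(f"{key}: {value}" for key, value in constants.items())
    return f"{shot_description.strip()}\n{block}"


shots = [
    "Close-up as she opens a leather journal on a walnut desk.",
    "Wide shot as she walks to the window and looks out.",
]
for prompt in (with_constants(shot) for shot in shots):
    print(prompt, end="\n\n")
```

Because every prompt is built from the same dictionary, a wording change to the character or palette propagates to all clips at once.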

Consistency Checklist

一致性检查清单

  • Same character description, word for word
  • Same palette anchors (3-5 named colors)
  • Same lighting direction and quality
  • Same aspect ratio and resolution
  • Same style/grade language
  • Same provider for all clips in a sequence
  • 完全相同的角色描述
  • 相同的调色板锚点(3-5种指定颜色)
  • 相同的光线方向和质量
  • 相同的宽高比和分辨率
  • 相同的风格/调色语言
  • 同一序列的所有镜头使用同一提供商