veo3-prompter

Veo 3.1 Video Prompter

Transform ideas into professional Veo 3.1 prompts using cinematic structure, audio direction, and multi-shot choreography.

When to Use

Invoke when the user:

Says "create a video prompt" or "generate a Veo prompt"
Wants to "make a video of..." or "animate this..."
Asks for help with "video generation" or "AI video"
Needs "Veo 3" or "Veo 3.1" prompt assistance
Wants to create "multi-shot" or "cinematic" video sequences

Core Prompt Formula

[Cinematography] + [Subject] + [Action] + [Context] + [Style & Audio]

Every prompt should address these five elements for maximum control.
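
As a minimal sketch of the formula, the snippet below joins the five elements in order and refuses to build a prompt if any is empty. The `PromptParts` class and `compose_prompt` function are hypothetical names used only for illustration; they are not part of Veo or any SDK.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Hypothetical container for the five elements of the core formula."""
    cinematography: str
    subject: str
    action: str
    context: str
    style_audio: str


def compose_prompt(parts: PromptParts) -> str:
    """Join the five elements in formula order; reject any empty element."""
    ordered = [parts.cinematography, parts.subject, parts.action,
               parts.context, parts.style_audio]
    cleaned = [p.strip().rstrip(".") for p in ordered]
    if not all(cleaned):
        raise ValueError("all five elements must be non-empty")
    # The first four elements form one sentence; style and audio follow it.
    return ", ".join(cleaned[:4]) + ". " + cleaned[4] + "."


example = compose_prompt(PromptParts(
    cinematography="Tracking shot",
    subject="a parkour athlete",
    action="sprinting across rooftops at sunset",
    context="urban cityscape background",
    style_audio="Cinematic, warm orange light. SFX: footsteps on concrete",
))
print(example)
```

The joining style (commas, then a separate style and audio sentence) is one possible choice; the formula itself only fixes which elements to cover.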

Prompt Density: Finding the Sweet Spot

Prompts fail in two directions:

Too sparse: The model fills gaps unpredictably, and you lose creative control
Too dense: The model can't execute every instruction and produces confused output

The Priority Framework

Tier 1 - MUST INCLUDE (model needs these):

Shot size (wide/medium/close-up)
Subject identity (who/what is in frame)
Primary action (what happens)
One dominant mood/style word

Tier 2 - SHOULD INCLUDE (significant impact):

Camera movement OR angle (pick one, not both)
Lighting quality (natural/dramatic/soft)
One audio layer (dialogue OR SFX OR ambient)
Setting/environment

Tier 3 - NICE TO HAVE (diminishing returns):

Secondary audio layers
Specific lens type
Color palette details
Film stock/grain texture
Background action

Rule of thumb: Include all Tier 1, most of Tier 2, and 1-2 from Tier 3.
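
The rule of thumb can be sketched as a small checklist function. The element labels below are hypothetical shorthand for the tier items above, and reading "most of Tier 2" as "at least 3 of 4" is an assumption, not something the framework specifies.

```python
from __future__ import annotations

# Hypothetical labels standing in for the tier items listed above.
TIER_1 = {"shot_size", "subject", "action", "mood"}
TIER_2 = {"camera", "lighting", "audio", "setting"}
TIER_3 = {"secondary_audio", "lens", "palette", "grain", "background_action"}


def density_report(elements: set[str]) -> list[str]:
    """Return warnings for a prompt described by its element labels."""
    warnings = []
    missing = TIER_1 - elements
    if missing:
        warnings.append("missing Tier 1: " + ", ".join(sorted(missing)))
    # Assumption: "most of Tier 2" means at least 3 of the 4 items.
    if len(TIER_2 & elements) < 3:
        warnings.append("too sparse: fewer than 3 Tier 2 elements")
    # Rule of thumb allows 1-2 Tier 3 details; more suggests overload.
    if len(TIER_3 & elements) > 2:
        warnings.append("too dense: more than 2 Tier 3 elements")
    return warnings


balanced = density_report(
    {"shot_size", "subject", "action", "mood", "camera", "lighting", "audio", "lens"}
)
sparse = density_report({"subject"})
```

An empty list means the prompt matches the rule of thumb; the warnings are hints for revision rather than hard limits.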

Density Comparison

TOO SPARSE (model guesses too much):

"A professor talking about philosophy"

TOO DENSE (model overloaded):

"Medium close-up shot at eye level with a 50mm lens at f/1.8 creating shallow depth of field with bokeh highlights, of a 52-year-old female professor with silver-streaked auburn hair pulled back in a loose bun, wearing an olive tweed jacket with leather elbow patches over a cream silk blouse with a small pearl brooch, standing in a contemporary lecture hall with tiered mahogany seating and brass fixtures visible in the soft background, natural diffused daylight streaming through floor-to-ceiling windows on the left side creating soft Rembrandt lighting on her face with a gentle fill from reflected light on the right..."

OPTIMAL (directed but breathable):

"Medium close-up of a professor in her 50s, tweed jacket, standing in a university lecture hall. She gestures while speaking: 'Kant asked one question: could everyone do this?' Warm natural window light from left, soft academic atmosphere. SFX: marker on whiteboard."

Calibration Signals

Signs your prompt is too sparse:

Results vary wildly between generations
Key elements missing or wrong
Mood/tone inconsistent with intent

Signs your prompt is too dense:

Model ignores some instructions entirely
Unnatural or frozen-looking motion
Conflicting elements appear (e.g., both day and night)
Audio doesn't match visual action

Iteration Strategy

Start with Tier 1 only and generate a test
Add Tier 2 elements that matter most to your vision
Add ONE Tier 3 detail if something specific is missing
Remove any element the model consistently ignores

See references/prompt-calibration.md for detailed examples and troubleshooting.
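
The four iteration steps can be sketched as a function that returns one prompt per step. `iteration_stages` is a hypothetical helper; it only assembles text, and judging which elements the model "consistently ignores" still requires watching the generated videos.

```python
from typing import Iterable, List, Optional


def iteration_stages(tier1: List[str], tier2: List[str],
                     tier3_detail: Optional[str] = None,
                     ignored: Iterable[str] = ()) -> List[str]:
    """Return one prompt per iteration step, from sparse to refined."""
    stages = [list(tier1)]                     # step 1: Tier 1 only
    stages.append(stages[-1] + list(tier2))    # step 2: add chosen Tier 2 elements
    if tier3_detail:                           # step 3: at most ONE Tier 3 detail
        stages.append(stages[-1] + [tier3_detail])
    dropped = set(ignored)                     # step 4: drop what the model ignores
    stages.append([e for e in stages[-1] if e not in dropped])
    return [", ".join(stage) for stage in stages]


prompts = iteration_stages(
    tier1=["Close-up", "a hummingbird", "drinking from a flower", "serene"],
    tier2=["soft morning light"],
    tier3_detail="macro lens",
    ignored=["macro lens"],
)
for p in prompts:
    print(p)
```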

Cinematography Elements

Shot Composition

Wide shot, medium shot, close-up, extreme close-up
Single shot, two shot, over-the-shoulder shot
High angle, low angle, eye level, worm's eye, bird's eye

Camera Movement

Dolly (in/out), tracking shot, crane shot
Pan (left/right), tilt (up/down), zoom
Steadicam, handheld, aerial, POV

Lens & Focus

Shallow depth of field, deep focus
Wide-angle lens, telephoto, macro lens
Soft focus, rack focus, bokeh

Audio Direction

Veo 3.1 generates synchronized sound. Direct it explicitly:

Dialogue (use quotes):

"A man says, 'The storm is coming.'"

Sound Effects (label with SFX):

"SFX: Thunder rumbles in the distance, rain patters on glass"

Ambient Noise:

"Ambient noise: busy café chatter, clinking cups, soft jazz"

Music:

"A swelling orchestral score begins to play"
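
The labeling conventions above (quoted dialogue, an `SFX:` prefix, an `Ambient noise:` prefix) can be applied consistently with a small formatter. `audio_cues` is a hypothetical helper shown as a sketch; it produces prompt text only and does not call Veo.

```python
from typing import Optional, Sequence, Tuple


def audio_cues(dialogue: Optional[Tuple[str, str]] = None,
               sfx: Sequence[str] = (),
               ambient: Sequence[str] = (),
               music: Optional[str] = None) -> str:
    """Format audio direction using the conventions described above."""
    parts = []
    if dialogue:
        speaker, line = dialogue
        parts.append(f"{speaker} says, '{line}'")   # dialogue goes in quotes
    if sfx:
        parts.append("SFX: " + ", ".join(sfx) + ".")
    if ambient:
        parts.append("Ambient noise: " + ", ".join(ambient) + ".")
    if music:
        parts.append(music.rstrip(".") + ".")
    return " ".join(parts)


cues = audio_cues(
    dialogue=("A man", "The storm is coming."),
    sfx=["thunder rumbles in the distance"],
)
print(cues)
```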

Timestamp Prompting

For multi-shot sequences within one generation (max 8 seconds):

[00:00-00:02] Medium shot of a detective at his desk, lighting a cigarette.
SFX: Match strike, paper rustling.

[00:02-00:04] Close-up of his eyes narrowing as he reads a letter.
Ambient: Rain against the window.

[00:04-00:06] Reverse shot of a shadowy figure in the doorway.
A woman's voice: "You shouldn't have looked."

[00:06-00:08] Wide shot as the detective stands, reaching for his gun.
SFX: Chair scraping, thunder crack.
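
A sequence like the one above can be assembled from per-shot durations so the timestamps stay contiguous and never exceed the 8-second limit. `timestamp_prompt` is a hypothetical helper, sketched under the assumption that shots are whole seconds long.

```python
from typing import List, Tuple


def timestamp_prompt(shots: List[Tuple[int, str]], max_seconds: int = 8) -> str:
    """Build a [mm:ss-mm:ss] sequence from (duration_seconds, description) pairs."""
    if any(duration <= 0 for duration, _ in shots):
        raise ValueError("each shot needs a positive duration")
    total = sum(duration for duration, _ in shots)
    if total > max_seconds:
        raise ValueError(f"sequence is {total}s, limit is {max_seconds}s")
    blocks = []
    start = 0
    for duration, text in shots:
        end = start + duration
        # Each shot begins where the previous one ended.
        blocks.append(f"[00:{start:02d}-00:{end:02d}] {text}")
        start = end
    return "\n\n".join(blocks)


sequence = timestamp_prompt([
    (2, "Medium shot of a detective at his desk."),
    (2, "Close-up of his eyes narrowing as he reads a letter."),
])
print(sequence)
```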

Style Keywords

Visual Aesthetic:

Photorealistic, cinematic, documentary, animation
Retro (sepia, grainy film, 1980s vaporwave)
Noir, epic fantasy, sci-fi, romantic, horror

Mood & Lighting:

Warm golden hour, cool blue tones, moody shadows
Harsh fluorescent, soft morning light, dramatic chiaroscuro
Neon-lit, candlelit, overcast diffused

Film Grain Tip:

Add "slightly grainy, film-like" to avoid an overly clean AI look

Output Formats

Quick Prompt: Single sentence for simple shots
Structured Prompt: Multi-line with all five elements
Timestamp Sequence: Choreographed multi-shot within 8s
Storyboard Mode: Multiple prompts for full narrative

Example Prompts

Action Shot:

"Tracking shot following a parkour athlete sprinting across rooftops at sunset, warm orange light, urban cityscape background, cinematic, shallow depth of field. SFX: footsteps on concrete, wind rushing past."

Dialogue Scene:

"Medium two-shot in a dimly lit bar, a woman in red leans toward a man in a suit. She says quietly, 'I know what you did.' Ambient: jazz music, glasses clinking. Moody noir aesthetic, warm tungsten lighting."

Nature Documentary:

"Slow-motion close-up of a hummingbird drinking from a flower, macro lens with shallow focus, lush green garden background, soft morning light. SFX: gentle buzzing, birdsong."

Technical Specs

Duration: 4, 6, or 8 seconds
Resolution: 720p or 1080p
Aspect Ratio: 16:9 (landscape) or 9:16 (portrait)
Frame Rate: Configurable (default: 24 FPS)
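
Before sending a request, the listed values can be checked locally. This is a sketch with a hypothetical `validate_specs` function; the allowed sets simply mirror the specs above and would need updating if they change.

```python
from typing import List

# Allowed values copied from the specs listed above.
ALLOWED_DURATIONS = {4, 6, 8}
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16"}


def validate_specs(duration_seconds: int, resolution: str,
                   aspect_ratio: str) -> List[str]:
    """Return a list of problems; an empty list means the specs are valid."""
    errors = []
    if duration_seconds not in ALLOWED_DURATIONS:
        errors.append(f"duration must be 4, 6, or 8 seconds, got {duration_seconds}")
    if resolution not in ALLOWED_RESOLUTIONS:
        errors.append(f"resolution must be 720p or 1080p, got {resolution}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        errors.append(f"aspect ratio must be 16:9 or 9:16, got {aspect_ratio}")
    return errors


ok = validate_specs(8, "1080p", "16:9")
bad = validate_specs(5, "4k", "1:1")
```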

Advanced API Options

When using Veo through the API (not Flow), these additional parameters are available:

Parameter	Description	Default
`negativePrompt`	Elements to exclude from the video	-
`seed`	RNG seed for more reproducible results (same prompt + seed = nearly identical video)	Random
`enhancePrompt`	Let the model rewrite your prompt for better results	false
`generateAudio`	Generate synchronized audio	true
`personGeneration`	Control person generation: `dont_allow` or `allow_adult`	-
`referenceImages`	Up to 3 asset images OR 1 style image for consistency	-
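
As a sketch, the parameters in the table can be collected into a plain dictionary before being handed to whatever client is in use. `build_options` is a hypothetical helper, not an SDK function; the key names and defaults are taken from the table, and how the dictionary is passed to the API depends on the client library.

```python
from typing import Optional

# The two values listed for personGeneration in the table above.
PERSON_GENERATION_VALUES = {"dont_allow", "allow_adult"}


def build_options(negative_prompt: Optional[str] = None,
                  seed: Optional[int] = None,
                  enhance_prompt: bool = False,
                  generate_audio: bool = True,
                  person_generation: Optional[str] = None) -> dict:
    """Collect optional parameters, omitting any that were not set."""
    if (person_generation is not None
            and person_generation not in PERSON_GENERATION_VALUES):
        raise ValueError(f"unsupported personGeneration: {person_generation}")
    options = {"enhancePrompt": enhance_prompt, "generateAudio": generate_audio}
    if negative_prompt:
        options["negativePrompt"] = negative_prompt
    if seed is not None:
        options["seed"] = seed
    if person_generation is not None:
        options["personGeneration"] = person_generation
    return options


options = build_options(negative_prompt="people, animals, buildings", seed=12345)
print(options)
```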

Negative Prompts

Explicitly exclude unwanted elements:

"A forest at sunset" + negativePrompt: "people, animals, buildings"

Seed for Consistency

Use the same seed to reproduce similar results:

First generation: seed=12345 → video A
Same prompt + seed=12345 → nearly identical video

Useful for:

Iterating on a specific "look"
Creating variations with controlled changes
A/B testing different prompts
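
For A/B testing, one way to keep the comparison fair is to hold the seed fixed and vary only the prompt. The sketch below uses a hypothetical `ab_test_requests` helper that returns plain dictionaries; submitting them to Veo is left to the client in use.

```python
from typing import Dict, List


def ab_test_requests(prompt_a: str, prompt_b: str,
                     seed: int = 12345) -> List[Dict[str, object]]:
    """Pair two prompt variants with one shared seed so only the prompt differs."""
    return [
        {"variant": name, "prompt": prompt, "seed": seed}
        for name, prompt in (("A", prompt_a), ("B", prompt_b))
    ]


requests = ab_test_requests(
    "A forest at sunset, warm golden hour",
    "A forest at sunset, cool blue tones",
)
```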

Reference Images

Maintain visual consistency across shots using reference images:

Asset References (up to 3):

Character appearances
Locations/settings
Props or products

Style References (1):

Overall aesthetic
Color palette
Visual treatment
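
The limits above (up to 3 asset images or 1 style image, not both) can be checked before a request is built. `check_reference_images` is a hypothetical helper, and the file names in the example are placeholders.

```python
from typing import List


def check_reference_images(asset_images: List[str],
                           style_images: List[str]) -> None:
    """Raise ValueError if the reference images break the stated limits."""
    if asset_images and style_images:
        raise ValueError("use asset references OR a style reference, not both")
    if len(asset_images) > 3:
        raise ValueError("at most 3 asset reference images are allowed")
    if len(style_images) > 1:
        raise ValueError("at most 1 style reference image is allowed")


# Placeholder file names: one character and one location reference.
check_reference_images(["detective.png", "office.png"], [])
```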

References

references/prompt-calibration.md - Finding the right detail level
references/cinematography-glossary.md - Full camera terms
references/prompt-examples.md - 20+ categorized examples
references/advanced-workflows.md - Image-to-video, first/last frame
