veo3-prompter

Veo 3.1 Video Prompter

Transform ideas into professional Veo 3.1 prompts using cinematic structure, audio direction, and multi-shot choreography.

When to Use

Invoke when user:
  • Says "create a video prompt" or "generate a Veo prompt"
  • Wants to "make a video of..." or "animate this..."
  • Asks for help with "video generation" or "AI video"
  • Needs "Veo 3" or "Veo 3.1" prompt assistance
  • Wants to create "multi-shot" or "cinematic" video sequences

Core Prompt Formula

[Cinematography] + [Subject] + [Action] + [Context] + [Style & Audio]
Every prompt should address these five elements for maximum control.
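
The formula above can be sketched as a small helper that joins the five elements in order. This is only an illustration: the `PromptParts` class, its field names, and the joining style are assumptions made for this example, not part of Veo or of this skill.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """The five elements of the core formula (hypothetical field names)."""
    cinematography: str
    subject: str
    action: str
    context: str
    style_audio: str


def build_prompt(parts: PromptParts) -> str:
    """Join the five elements in formula order, refusing empty ones."""
    fields = [parts.cinematography, parts.subject, parts.action,
              parts.context, parts.style_audio]
    cleaned = [f.strip().rstrip(".") for f in fields]
    if not all(cleaned):
        raise ValueError("every one of the five elements needs a value")
    # Cinematography, subject, and action read naturally as one clause.
    lead = f"{cleaned[0]} of {cleaned[1]} {cleaned[2]}"
    return f"{lead}, {cleaned[3]}. {cleaned[4]}."


example = PromptParts(
    cinematography="Slow-motion close-up",
    subject="a hummingbird",
    action="drinking from a flower",
    context="lush green garden background, soft morning light",
    style_audio="SFX: gentle buzzing, birdsong",
)
print(build_prompt(example))
```

Rejecting empty fields is a design choice for the sketch: it makes a missing element fail loudly instead of letting the model fill the gap.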

Prompt Density: Finding the Sweet Spot

Prompts fail in two directions:
  • Too sparse: The model fills gaps unpredictably, and you lose creative control
  • Too dense: The model can't execute all instructions and produces confused output

The Priority Framework

Tier 1 - MUST INCLUDE (model needs these):
  • Shot size (wide/medium/close-up)
  • Subject identity (who/what is in frame)
  • Primary action (what happens)
  • One dominant mood/style word
Tier 2 - SHOULD INCLUDE (significant impact):
  • Camera movement OR angle (pick one, not both)
  • Lighting quality (natural/dramatic/soft)
  • One audio layer (dialogue OR SFX OR ambient)
  • Setting/environment
Tier 3 - NICE TO HAVE (diminishing returns):
  • Secondary audio layers
  • Specific lens type
  • Color palette details
  • Film stock/grain texture
  • Background action
Rule of thumb: Include all Tier 1, most of Tier 2, and 1-2 from Tier 3.
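
The rule of thumb can be expressed as a rough density check. The sketch below is illustrative: the element labels are hypothetical, "most of Tier 2" is assumed to mean at least 3 of 4, and only an excess of Tier 3 items is flagged.

```python
from typing import List, Set

# Hypothetical labels for the elements listed in each tier above.
TIER_1 = {"shot_size", "subject", "action", "mood"}
TIER_2 = {"camera", "lighting", "audio", "setting"}
TIER_3 = {"extra_audio", "lens", "palette", "grain", "background_action"}


def check_density(elements: Set[str]) -> List[str]:
    """Return warnings for a set of element labels; an empty list means OK."""
    warnings = []
    missing = TIER_1 - elements
    if missing:
        warnings.append("missing Tier 1: " + ", ".join(sorted(missing)))
    if len(TIER_2 & elements) < 3:  # assumption: "most" means 3 of 4
        warnings.append("too sparse: include most of Tier 2")
    if len(TIER_3 & elements) > 2:
        warnings.append("too dense: keep Tier 3 to 1-2 items")
    return warnings


balanced = {"shot_size", "subject", "action", "mood",
            "camera", "lighting", "audio", "lens"}
print(check_density(balanced))
print(check_density({"subject"}))
```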

Density Comparison

TOO SPARSE (model guesses too much):
"A professor talking about philosophy"
TOO DENSE (model overloaded):
"Medium close-up shot at eye level with a 50mm lens at f/1.8 creating shallow depth of field with bokeh highlights, of a 52-year-old female professor with silver-streaked auburn hair pulled back in a loose bun, wearing an olive tweed jacket with leather elbow patches over a cream silk blouse with a small pearl brooch, standing in a contemporary lecture hall with tiered mahogany seating and brass fixtures visible in the soft background, natural diffused daylight streaming through floor-to-ceiling windows on the left side creating soft rembrandt lighting on her face with a gentle fill from reflected light on the right..."
OPTIMAL (directed but breathable):
"Medium close-up of a professor in her 50s, tweed jacket, standing in a university lecture hall. She gestures while speaking: 'Kant asked one question: could everyone do this?' Warm natural window light from left, soft academic atmosphere. SFX: marker on whiteboard."

Calibration Signals

Signs your prompt is too sparse:
  • Results vary wildly between generations
  • Key elements missing or wrong
  • Mood/tone inconsistent with intent
Signs your prompt is too dense:
  • Model ignores some instructions entirely
  • Unnatural or frozen-looking motion
  • Conflicting elements appear (e.g., both day and night)
  • Audio doesn't match visual action

Iteration Strategy

  1. Start with Tier 1 only and generate a test clip
  2. Add Tier 2 elements that matter most to your vision
  3. Add ONE Tier 3 detail if something specific is missing
  4. Remove any element the model consistently ignores
See references/prompt-calibration.md for detailed examples and troubleshooting.
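
One way to picture the four steps is to produce a series of drafts, each adding a single element, so that every generation tests one change. The helper names below are hypothetical, and joining elements with periods is just one possible convention.

```python
from typing import List, Optional


def iteration_drafts(tier1: List[str], tier2: List[str],
                     tier3_pick: Optional[str] = None) -> List[str]:
    """Return one draft per step: Tier 1 only, then one added element at a time."""
    current = list(tier1)
    drafts = [". ".join(current) + "."]
    for element in tier2:  # assumed to be ordered by importance to your vision
        current.append(element)
        drafts.append(". ".join(current) + ".")
    if tier3_pick is not None:  # step 3: at most ONE Tier 3 detail
        current.append(tier3_pick)
        drafts.append(". ".join(current) + ".")
    return drafts


def drop_ignored(elements: List[str], ignored: List[str]) -> List[str]:
    """Step 4: remove elements the model consistently ignores."""
    return [e for e in elements if e not in ignored]


drafts = iteration_drafts(
    ["Medium close-up of a professor in her 50s",
     "She gestures while speaking",
     "Soft academic atmosphere"],
    ["Warm natural window light from left"],
    "Slightly grainy, film-like",
)
for draft in drafts:
    print(draft)
```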

Cinematography Elements

Shot Composition

  • Wide shot, medium shot, close-up, extreme close-up
  • Single shot, two shot, over-the-shoulder shot
  • High angle, low angle, eye level, worm's eye, bird's eye

Camera Movement

  • Dolly (in/out), tracking shot, crane shot
  • Pan (left/right), tilt (up/down), zoom
  • Steadicam, handheld, aerial, POV

Lens & Focus

  • Shallow depth of field, deep focus
  • Wide-angle lens, telephoto, macro lens
  • Soft focus, rack focus, bokeh

Audio Direction

Veo 3.1 generates synchronized sound. Direct it explicitly:
Dialogue (use quotes):
"A man says, 'The storm is coming.'"
Sound Effects (label with SFX):
"SFX: Thunder rumbles in the distance, rain patters on glass"
Ambient Noise:
"Ambient noise: busy café chatter, clinking cups, soft jazz"
Music:
"A swelling orchestral score begins to play"

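The labeling conventions for audio direction can be kept consistent with a few small formatters. This is a minimal sketch; the function names are hypothetical and only reproduce the text patterns shown in the Audio Direction examples.

```python
def dialogue(speaker: str, line: str) -> str:
    """Spoken lines go in quotes after the speaker."""
    return f"{speaker} says, '{line}'"


def sfx(*sounds: str) -> str:
    """Sound effects carry an explicit SFX label."""
    return "SFX: " + ", ".join(sounds)


def ambient(*sounds: str) -> str:
    """Background sound is labeled as ambient noise."""
    return "Ambient noise: " + ", ".join(sounds)


print(dialogue("A man", "The storm is coming."))
print(sfx("Thunder rumbles in the distance", "rain patters on glass"))
print(ambient("busy café chatter", "clinking cups", "soft jazz"))
```
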
Timestamp Prompting

For multi-shot sequences within one generation (max 8 seconds):
[00:00-00:02] Medium shot of a detective at his desk, lighting a cigarette.
SFX: Match strike, paper rustling.

[00:02-00:04] Close-up of his eyes narrowing as he reads a letter.
Ambient: Rain against the window.

[00:04-00:06] Reverse shot of a shadowy figure in the doorway.
A woman's voice: "You shouldn't have looked."

[00:06-00:08] Wide shot as the detective stands, reaching for his gun.
SFX: Chair scraping, thunder crack.
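
A timestamp sequence like the one above can be assembled and checked programmatically. The sketch below assumes whole-second shot boundaries and a hypothetical `Shot` structure; it only verifies that shots are contiguous and stay within the 8-second limit.

```python
from dataclasses import dataclass
from typing import List

MAX_SECONDS = 8  # single-generation limit stated above


@dataclass
class Shot:
    start: int   # seconds from the beginning of the clip
    end: int
    visual: str  # shot description
    audio: str   # dialogue, SFX, or ambient line


def stamp(seconds: int) -> str:
    """Format whole seconds as MM:SS."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


def build_sequence(shots: List[Shot]) -> str:
    """Render shots as timestamped blocks, checking order and total length."""
    expected_start = 0
    blocks = []
    for shot in shots:
        if shot.start != expected_start or shot.end <= shot.start:
            raise ValueError(f"shots must be contiguous and non-empty: {shot}")
        if shot.end > MAX_SECONDS:
            raise ValueError(f"sequence exceeds {MAX_SECONDS} seconds")
        blocks.append(
            f"[{stamp(shot.start)}-{stamp(shot.end)}] {shot.visual}\n{shot.audio}"
        )
        expected_start = shot.end
    return "\n\n".join(blocks)


print(build_sequence([
    Shot(0, 2, "Medium shot of a detective at his desk, lighting a cigarette.",
         "SFX: Match strike, paper rustling."),
    Shot(2, 4, "Close-up of his eyes narrowing as he reads a letter.",
         "Ambient: Rain against the window."),
]))
```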

Style Keywords

Visual Aesthetic:
  • Photorealistic, cinematic, documentary, animation
  • Retro (sepia, grainy film, 1980s vaporwave)
  • Noir, epic fantasy, sci-fi, romantic, horror
Mood & Lighting:
  • Warm golden hour, cool blue tones, moody shadows
  • Harsh fluorescent, soft morning light, dramatic chiaroscuro
  • Neon-lit, candlelit, overcast diffused
Film Grain Tip:
Add "slightly grainy, film-like" to avoid an overly clean AI look.

Output Formats

  • Quick Prompt: Single sentence for simple shots
  • Structured Prompt: Multi-line with all five elements
  • Timestamp Sequence: Choreographed multi-shot within 8s
  • Storyboard Mode: Multiple prompts for full narrative

Example Prompts

Action Shot:
"Tracking shot following a parkour athlete sprinting across rooftops at sunset, warm orange light, urban cityscape background, cinematic, shallow depth of field. SFX: footsteps on concrete, wind rushing past."
Dialogue Scene:
"Medium two-shot in a dimly lit bar, a woman in red leans toward a man in a suit. She says quietly, 'I know what you did.' Ambient: jazz music, glasses clinking. Moody noir aesthetic, warm tungsten lighting."
Nature Documentary:
"Slow-motion close-up of a hummingbird drinking from a flower, macro lens with shallow focus, lush green garden background, soft morning light. SFX: gentle buzzing, birdsong."

Technical Specs

  • Duration: 4, 6, or 8 seconds
  • Resolution: 720p or 1080p
  • Aspect Ratio: 16:9 (landscape) or 9:16 (portrait)
  • Frame Rate: Configurable (default: 24 FPS)
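
These limits are easy to check before submitting a generation. The sketch below is a standalone validator with hypothetical names; it encodes only the values listed above and does not call any Veo API.

```python
ALLOWED_DURATIONS = {4, 6, 8}            # seconds
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16"}


def validate_specs(duration: int, resolution: str, aspect_ratio: str) -> None:
    """Raise ValueError if a setting falls outside the specs listed above."""
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be 4, 6, or 8 seconds, got {duration}")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be 720p or 1080p, got {resolution}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect ratio must be 16:9 or 9:16, got {aspect_ratio}")


validate_specs(8, "1080p", "16:9")  # passes silently
```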

Advanced API Options

When using Veo through the API (not Flow), these additional parameters are available:

| Parameter | Description | Default |
|---|---|---|
| negativePrompt | Elements to exclude from the video | - |
| seed | RNG seed for reproducible results (same prompt + seed = nearly identical video) | Random |
| enhancePrompt | Let the model rewrite your prompt for better results | false |
| generateAudio | Generate synchronized audio | true |
| personGeneration | Control person generation: dont_allow or allow_adult | - |
| referenceImages | Up to 3 asset images OR 1 style image for consistency | - |
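
As an illustration, the parameters above can be collected into an options dictionary before being handed to whatever client you use. The sketch builds a plain dict only; `build_options` is a hypothetical helper, and how the options are actually sent depends on your SDK, which is not shown here.

```python
# Defaults and allowed values are taken from the parameter list above.
DEFAULTS = {"enhancePrompt": False, "generateAudio": True}
OPTIONAL = {"negativePrompt", "seed", "personGeneration", "referenceImages"}
PERSON_VALUES = {"dont_allow", "allow_adult"}


def build_options(**overrides) -> dict:
    """Merge overrides onto the documented defaults, rejecting unknown keys."""
    unknown = set(overrides) - OPTIONAL - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    person = overrides.get("personGeneration")
    if person is not None and person not in PERSON_VALUES:
        raise ValueError("personGeneration must be dont_allow or allow_adult")
    return {**DEFAULTS, **overrides}


options = build_options(negativePrompt="people, animals, buildings", seed=12345)
print(options)
```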

Negative Prompts

Explicitly exclude unwanted elements:
"A forest at sunset" + negativePrompt: "people, animals, buildings"

Seed for Consistency

Use the same seed to reproduce similar results:
First generation: seed=12345 → video A
Same prompt + seed=12345 → nearly identical video
Useful for:
  • Iterating on a specific "look"
  • Creating variations with controlled changes
  • A/B testing different prompts
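
For A/B testing, the idea is to hold the seed fixed and change only the prompt, so differences between results are more likely to come from the wording. A minimal sketch, in which the `prompt` key and the `ab_requests` helper are hypothetical:

```python
from typing import Dict, List


def ab_requests(prompts: List[str], seed: int) -> List[Dict[str, object]]:
    """Pair each prompt variant with the same seed."""
    return [{"prompt": p, "seed": seed} for p in prompts]


variants = ab_requests(
    ["A forest at sunset, warm golden hour",
     "A forest at sunset, cool blue tones"],
    seed=12345,
)
print(variants)
```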

Reference Images

Maintain visual consistency across shots using reference images:
Asset References (up to 3):
  • Character appearances
  • Locations/settings
  • Props or products
Style References (1):
  • Overall aesthetic
  • Color palette
  • Visual treatment
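
The limits on reference images (up to 3 asset images OR 1 style image) can be checked up front. The function below is a hypothetical validator working on file paths; it does not load images or call any API.

```python
from typing import List


def validate_references(asset_images: List[str], style_images: List[str]) -> None:
    """Enforce: up to 3 asset images OR 1 style image, never both."""
    if asset_images and style_images:
        raise ValueError("use asset images OR a style image, not both")
    if len(asset_images) > 3:
        raise ValueError("at most 3 asset images are allowed")
    if len(style_images) > 1:
        raise ValueError("at most 1 style image is allowed")


# Hypothetical file names, for illustration only.
validate_references(["character.png", "location.png", "prop.png"], [])
validate_references([], ["style.png"])
```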

References

  • references/prompt-calibration.md
    - Finding the right detail level
  • references/cinematography-glossary.md
    - Full camera terms
  • references/prompt-examples.md
    - 20+ categorized examples
  • references/advanced-workflows.md
    - Image-to-video, first/last frame