veo3-prompter
Veo 3.1 Video Prompter
Transform ideas into professional Veo 3.1 prompts using cinematic structure, audio direction, and multi-shot choreography.
When to Use
Invoke when the user:
- Says "create a video prompt" or "generate a Veo prompt"
- Wants to "make a video of..." or "animate this..."
- Asks for help with "video generation" or "AI video"
- Needs "Veo 3" or "Veo 3.1" prompt assistance
- Wants to create "multi-shot" or "cinematic" video sequences
Core Prompt Formula
[Cinematography] + [Subject] + [Action] + [Context] + [Style & Audio]
Every prompt should address these five elements for maximum control.
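As a rough illustration of the formula, the five elements can be held in a small structure and joined in order. This is only a sketch: `PromptParts` and `build_prompt` are hypothetical names, not part of Veo or any SDK, and the sentence template is one of many reasonable ways to combine the elements.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptParts:
    """Hypothetical container for the five elements of the formula."""
    cinematography: str
    subject: str
    action: str
    context: str
    style_audio: str


def build_prompt(parts: PromptParts) -> str:
    """Assemble one prompt, refusing to continue if any element is blank."""
    for f in fields(parts):
        if not getattr(parts, f.name).strip():
            raise ValueError(f"missing element: {f.name}")
    # Cinematography, subject, action, and context form the scene sentence;
    # style and audio follow as their own sentences.
    scene = f"{parts.cinematography} of {parts.subject} {parts.action} {parts.context}."
    return f"{scene} {parts.style_audio}"


prompt = build_prompt(PromptParts(
    cinematography="Medium close-up",
    subject="a professor in her 50s",
    action="gesturing as she speaks",
    context="in a university lecture hall",
    style_audio="Warm natural window light, soft academic atmosphere. SFX: marker on whiteboard.",
))
print(prompt)
```

Keeping the elements separate makes it easy to see which of the five is missing before a generation is spent on it.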
Prompt Density: Finding the Sweet Spot
Prompts fail in two directions:
- Too sparse: Model fills gaps unpredictably, you lose creative control
- Too dense: Model can't execute all instructions, produces confused output
The Priority Framework
Tier 1 - MUST INCLUDE (model needs these):
- Shot size (wide/medium/close-up)
- Subject identity (who/what is in frame)
- Primary action (what happens)
- One dominant mood/style word
Tier 2 - SHOULD INCLUDE (significant impact):
- Camera movement OR angle (pick one, not both)
- Lighting quality (natural/dramatic/soft)
- One audio layer (dialogue OR SFX OR ambient)
- Setting/environment
Tier 3 - NICE TO HAVE (diminishing returns):
- Secondary audio layers
- Specific lens type
- Color palette details
- Film stock/grain texture
- Background action
Rule of thumb: Include all Tier 1, most of Tier 2, and 1-2 from Tier 3.
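The rule of thumb can be turned into a quick self-check. The sketch below is an assumption-laden illustration: the element labels and the `check_density` helper are invented for this example, and "most of Tier 2" is interpreted as at least three of the four items.

```python
# Hypothetical labels for the elements listed in each tier above.
TIER_1 = {"shot_size", "subject", "action", "mood"}
TIER_2 = {"camera", "lighting", "audio", "setting"}
TIER_3 = {"extra_audio", "lens", "palette", "grain", "background_action"}


def check_density(elements: set) -> list:
    """Return warnings where a prompt's elements depart from the rule of thumb."""
    warnings = []
    missing = TIER_1 - elements
    if missing:
        warnings.append(f"too sparse: missing Tier 1 {sorted(missing)}")
    # Assumption: "most of Tier 2" means at least 3 of the 4 items.
    if len(TIER_2 & elements) < 3:
        warnings.append("too sparse: fewer than 3 Tier 2 elements")
    # The rule allows 1-2 Tier 3 details; more than that risks overload.
    if len(TIER_3 & elements) > 2:
        warnings.append("too dense: more than 2 Tier 3 elements")
    return warnings


print(check_density(TIER_1 | {"camera", "lighting", "setting", "lens"}))
```

An empty list means the prompt sits inside the suggested range; it says nothing about whether the individual descriptions are any good.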
Density Comparison
TOO SPARSE (model guesses too much):
"A professor talking about philosophy"
TOO DENSE (model overloaded):
"Medium close-up shot at eye level with a 50mm lens at f/1.8 creating shallow depth of field with bokeh highlights, of a 52-year-old female professor with silver-streaked auburn hair pulled back in a loose bun, wearing an olive tweed jacket with leather elbow patches over a cream silk blouse with a small pearl brooch, standing in a contemporary lecture hall with tiered mahogany seating and brass fixtures visible in the soft background, natural diffused daylight streaming through floor-to-ceiling windows on the left side creating soft rembrandt lighting on her face with a gentle fill from reflected light on the right..."
OPTIMAL (directed but breathable):
"Medium close-up of a professor in her 50s, tweed jacket, standing in a university lecture hall. She gestures while speaking: 'Kant asked one question: could everyone do this?' Warm natural window light from left, soft academic atmosphere. SFX: marker on whiteboard."
Calibration Signals
Signs your prompt is too sparse:
- Results vary wildly between generations
- Key elements missing or wrong
- Mood/tone inconsistent with intent
Signs your prompt is too dense:
- Model ignores some instructions entirely
- Unnatural or frozen-looking motion
- Conflicting elements appear (e.g., both day and night)
- Audio doesn't match visual action
Iteration Strategy
- Start with Tier 1 only, then generate a test
- Add Tier 2 elements that matter most to your vision
- Add ONE Tier 3 detail if something specific is missing
- Remove any element the model consistently ignores
See references/prompt-calibration.md for detailed examples and troubleshooting.
Cinematography Elements
Shot Composition
- Wide shot, medium shot, close-up, extreme close-up
- Single shot, two shot, over-the-shoulder shot
- High angle, low angle, eye level, worm's eye, bird's eye
Camera Movement
- Dolly (in/out), tracking shot, crane shot
- Pan (left/right), tilt (up/down), zoom
- Steadicam, handheld, aerial, POV
Lens & Focus
- Shallow depth of field, deep focus
- Wide-angle lens, telephoto, macro lens
- Soft focus, rack focus, bokeh
Audio Direction
Veo 3.1 generates synchronized sound. Direct it explicitly:
Dialogue (use quotes):
"A man says, 'The storm is coming.'"
Sound Effects (label with SFX):
"SFX: Thunder rumbles in the distance, rain patters on glass"
Ambient Noise:
"Ambient noise: busy café chatter, clinking cups, soft jazz"
Music:
"A swelling orchestral score begins to play"
Timestamp Prompting
For multi-shot sequences within one generation (max 8 seconds):
[00:00-00:02] Medium shot of a detective at his desk, lighting a cigarette.
SFX: Match strike, paper rustling.
[00:02-00:04] Close-up of his eyes narrowing as he reads a letter.
Ambient: Rain against the window.
[00:04-00:06] Reverse shot of a shadowy figure in the doorway.
A woman's voice: "You shouldn't have looked."
[00:06-00:08] Wide shot as the detective stands, reaching for his gun.
SFX: Chair scraping, thunder crack.
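A timestamped sequence like the one above can be generated from a list of shots, which also makes it easy to confirm the total stays within 8 seconds. This is a minimal sketch; `timestamp_prompt` and its tuple layout are hypothetical conveniences, not a Veo feature.

```python
MAX_SECONDS = 8  # per-generation limit stated above


def _stamp(seconds: int) -> str:
    """Format a second count as MM:SS."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


def timestamp_prompt(shots):
    """Build a timestamped prompt from (duration_s, description, audio_cue) tuples."""
    total = sum(duration for duration, _, _ in shots)
    if total > MAX_SECONDS:
        raise ValueError(f"sequence runs {total}s, limit is {MAX_SECONDS}s")
    lines, start = [], 0
    for duration, description, audio_cue in shots:
        if duration <= 0:
            raise ValueError("each shot needs a positive duration")
        end = start + duration
        lines.append(f"[{_stamp(start)}-{_stamp(end)}] {description}")
        if audio_cue:  # the audio line is optional per shot
            lines.append(audio_cue)
        start = end
    return "\n".join(lines)


sequence = timestamp_prompt([
    (2, "Medium shot of a detective at his desk, lighting a cigarette.",
     "SFX: Match strike, paper rustling."),
    (2, "Close-up of his eyes narrowing as he reads a letter.",
     "Ambient: Rain against the window."),
])
print(sequence)
```

Working from durations rather than hand-typed timestamps avoids gaps or overlaps between shots when one of them is lengthened.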
Style Keywords
Visual Aesthetic:
- Photorealistic, cinematic, documentary, animation
- Retro (sepia, grainy film, 1980s vaporwave)
- Noir, epic fantasy, sci-fi, romantic, horror
Mood & Lighting:
- Warm golden hour, cool blue tones, moody shadows
- Harsh fluorescent, soft morning light, dramatic chiaroscuro
- Neon-lit, candlelit, overcast diffused
Film Grain Tip:
Add "slightly grainy, film-like" to avoid an overly clean AI look.
Output Formats
- Quick Prompt: Single sentence for simple shots
- Structured Prompt: Multi-line with all five elements
- Timestamp Sequence: Choreographed multi-shot within 8s
- Storyboard Mode: Multiple prompts for full narrative
Example Prompts
Action Shot:
"Tracking shot following a parkour athlete sprinting across rooftops at sunset, warm orange light, urban cityscape background, cinematic, shallow depth of field. SFX: footsteps on concrete, wind rushing past."
Dialogue Scene:
"Medium two-shot in a dimly lit bar, a woman in red leans toward a man in a suit. She says quietly, 'I know what you did.' Ambient: jazz music, glasses clinking. Moody noir aesthetic, warm tungsten lighting."
Nature Documentary:
"Slow-motion close-up of a hummingbird drinking from a flower, macro lens with shallow focus, lush green garden background, soft morning light. SFX: gentle buzzing, birdsong."
Technical Specs
- Duration: 4, 6, or 8 seconds
- Resolution: 720p or 1080p
- Aspect Ratio: 16:9 (landscape) or 9:16 (portrait)
- Frame Rate: Configurable (default: 24 FPS)
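These limits are easy to check before submitting a job. The sketch below assumes a hypothetical `VideoSpec` class; the allowed values come from the list above, while the defaults other than 24 FPS are arbitrary choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoSpec:
    """Hypothetical settings object that validates itself on creation."""
    duration_s: int = 8          # default chosen for this example only
    resolution: str = "720p"     # default chosen for this example only
    aspect_ratio: str = "16:9"   # default chosen for this example only
    fps: int = 24                # default frame rate listed above

    def __post_init__(self):
        if self.duration_s not in (4, 6, 8):
            raise ValueError("duration must be 4, 6, or 8 seconds")
        if self.resolution not in ("720p", "1080p"):
            raise ValueError("resolution must be 720p or 1080p")
        if self.aspect_ratio not in ("16:9", "9:16"):
            raise ValueError("aspect ratio must be 16:9 or 9:16")
        if self.fps <= 0:
            raise ValueError("frame rate must be positive")


portrait = VideoSpec(duration_s=6, resolution="1080p", aspect_ratio="9:16")
print(portrait)
```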
Advanced API Options
When using Veo through the API (not Flow), these additional parameters are available:
| Parameter | Description | Default |
|---|---|---|
| `negativePrompt` | Elements to exclude from the video | - |
| `seed` | RNG seed for reproducible results (same prompt + seed = same video) | Random |
| | Let the model rewrite your prompt for better results | false |
| | Generate synchronized audio | true |
| | Control person generation: | - |
| | Up to 3 asset images OR 1 style image for consistency | - |
Negative Prompts
Explicitly exclude unwanted elements:
"A forest at sunset" + negativePrompt: "people, animals, buildings"
Seed for Consistency
Use the same seed to reproduce similar results:
First generation: seed=12345 → video A
Same prompt + seed=12345 → nearly identical video
Useful for:
- Iterating on a specific "look"
- Creating variations with controlled changes
- A/B testing different prompts
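For A/B testing, one way to keep the comparison fair is to hold the seed and negative prompt fixed and vary only the prompt text. The sketch below only builds plain dictionaries and makes no API call; `generation_params` is a hypothetical helper, and the `prompt` key is an assumption, while `negativePrompt` and `seed` are the names used in this document.

```python
def generation_params(prompt, negative_prompt=None, seed=None):
    """Collect request parameters in a dict (hypothetical helper, no network call)."""
    params = {"prompt": prompt}  # "prompt" as a key name is an assumption
    if negative_prompt:
        params["negativePrompt"] = negative_prompt
    if seed is not None:
        params["seed"] = seed
    return params


SEED = 12345
EXCLUDE = "people, animals, buildings"

# Only the prompt text differs between the two variants.
variant_a = generation_params("A forest at sunset", EXCLUDE, SEED)
variant_b = generation_params("A forest at sunset, slightly grainy, film-like", EXCLUDE, SEED)

changed = {key for key in variant_a if variant_a[key] != variant_b[key]}
print(changed)
```

If `changed` contains anything other than `prompt`, the two runs differ in more than the wording being tested.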
Reference Images
Maintain visual consistency across shots using reference images:
Asset References (up to 3):
- Character appearances
- Locations/settings
- Props or products
Style References (1):
- Overall aesthetic
- Color palette
- Visual treatment
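The reference limits (up to 3 asset images or 1 style image) can be checked before a request is built. This is a sketch under the assumption that the two kinds are mutually exclusive, as the "OR" in the parameter table suggests; `validate_references` and the file names are hypothetical.

```python
MAX_ASSET_IMAGES = 3
MAX_STYLE_IMAGES = 1


def validate_references(asset_images, style_images):
    """Raise ValueError if the reference images break the stated limits."""
    # Assumption: asset and style references cannot be combined in one request.
    if asset_images and style_images:
        raise ValueError("use asset references OR a style reference, not both")
    if len(asset_images) > MAX_ASSET_IMAGES:
        raise ValueError(f"at most {MAX_ASSET_IMAGES} asset images")
    if len(style_images) > MAX_STYLE_IMAGES:
        raise ValueError(f"at most {MAX_STYLE_IMAGES} style image")


# Hypothetical file names for a character, a location, and a prop.
validate_references(["detective.png", "office.png", "letter.png"], [])
print("references OK")
```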
References
- references/prompt-calibration.md - Finding the right detail level
- references/cinematography-glossary.md - Full camera terms
- references/prompt-examples.md - 20+ categorized examples
- references/advanced-workflows.md - Image-to-video, first/last frame