veo-3.1-prompting
Veo 3.1 Prompting
Core Capabilities
Veo 3.1 generates video with native audio synthesis and advanced physical realism. Key specifications:
- Resolutions: 720p, 1080p, or 4K
- Aspect Ratios: 16:9 (landscape) or 9:16 (portrait)
- Duration: 4, 6, or 8 seconds per generation (billed as 8s minimum)
- Framerate: 24 FPS (cinematic standard)
- Native Audio: Dialogue, ambient sound, and music synchronized to video
- Reference Images: Up to 3 images for character/style consistency ("Ingredients to Video")
- Advanced Controls: First/Last Frame interpolation, Timestamp Prompting for multi-shot narratives
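As a rough illustration of how these limits might be checked before a request is sent, the sketch below validates the three settings and applies the 8-second billing floor. `ClipSettings` and its field names are hypothetical and not part of any Veo SDK; the allowed values are copied from the list above.

```python
from dataclasses import dataclass

# Allowed values copied from the specification list above.
RESOLUTIONS = {"720p", "1080p", "4K"}
ASPECT_RATIOS = {"16:9", "9:16"}
DURATIONS_S = {4, 6, 8}
MIN_BILLED_S = 8  # "billed as 8s minimum"


@dataclass(frozen=True)
class ClipSettings:
    """Hypothetical settings container (an assumption, not an official type)."""

    resolution: str = "1080p"
    aspect_ratio: str = "16:9"
    duration_s: int = 8

    def __post_init__(self) -> None:
        # Reject any value that is not in the documented lists.
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"unsupported resolution: {self.resolution}")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect ratio: {self.aspect_ratio}")
        if self.duration_s not in DURATIONS_S:
            raise ValueError(f"unsupported duration: {self.duration_s}s")

    @property
    def billed_seconds(self) -> int:
        # Shorter clips are still billed at the 8-second minimum.
        return max(self.duration_s, MIN_BILLED_S)


draft = ClipSettings(resolution="720p", aspect_ratio="9:16", duration_s=4)
print(draft.duration_s, draft.billed_seconds)
```

Under these assumptions a 4-second draft costs the same as an 8-second clip, so shorter durations mainly save generation time rather than budget.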
The Five-Part Formula
Structure every prompt in this order:
[Cinematography] + [Subject] + [Action] + [Context] + [Style & Audio]
Example:
Tracking shot following a weathered fisherman mending nets on a wooden dock. Golden hour sun flares through rigging. Salt spray visible in air. Cinematic 35mm film grain. Sound: seagulls crying, rope creaking, gentle waves.
Critical: Lead with camera movement. Veo 3.1 weights early tokens heavily.
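One way to keep the ordering consistent is to assemble prompts from named parts. The helper below is a minimal sketch; `build_prompt` is a hypothetical name, and the punctuation it uses to join the parts is an assumption rather than anything Veo requires.

```python
def build_prompt(cinematography: str, subject: str, action: str,
                 context: str, style_audio: str) -> str:
    """Join the five parts in formula order, camera direction first."""
    parts = [cinematography, subject, action, context, style_audio]
    # Trim whitespace and trailing periods so the joins stay clean.
    cleaned = [p.strip().rstrip(".") for p in parts]
    if not all(cleaned):
        raise ValueError("all five parts are required")
    camera, subj, act, ctx, style = cleaned
    return f"{camera}, {subj}, {act}. {ctx}. {style}."


example = build_prompt(
    cinematography="Tracking shot",
    subject="weathered fisherman",
    action="mending nets on a wooden dock",
    context="Golden hour sun flares through rigging, salt spray visible in air",
    style_audio="Cinematic 35mm film grain. Sound: seagulls crying, rope creaking, gentle waves",
)
print(example)
```

Because the camera direction is a required first argument, it cannot accidentally drift to the end of the prompt.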
1. Cinematography Specifications
Always specify shot type, camera movement, and lens characteristics first.
Shot Framing
- `Extreme close-up`, `Close-up`, `Medium shot`, `Wide establishing shot`
- `Over-the-shoulder`, `POV`, `Bird's eye view`, `Worm's eye view`
Camera Movement (See camera-techniques.md)
- Linear: `Dolly in/out`, `Tracking shot`, `Pan`, `Tilt`, `Truck`
- Dynamic: `Handheld`, `Gimbal smooth`, `Whip pan`, `Rack focus`
- Aerial: `Drone descending`, `Orbit clockwise`, `Crane up`
Lens & Focus
- `Shallow depth of field with creamy bokeh`
- `Deep focus everything sharp`
- `Anamorphic lens with horizontal flares`
- `Macro lens for extreme detail`
2. Subject Specification
Describe subjects with exhaustive visual detail for consistency:
Character Anchors (Essential for multi-clip consistency)
- Physical: "Elderly man, silver hair cropped short, weathered face with scar above left eyebrow"
- Clothing: "Faded navy pea coat, brass buttons tarnished, wool scarf with fringe"
- Distinctive markers: "Red-rimmed glasses", "tattoo of anchor on forearm", "gold pocket watch chain"
Object Details
- Material and condition: "Brushed titanium, fingerprint-smudged", "distressed leather with patina"
- Interaction points: "Steam rising from ceramic rim", "condensation beads on cold glass"
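Since the anchors only help if they are repeated word for word, it can be useful to store them once and reuse them in every clip prompt. The sketch below assumes a hypothetical `CharacterAnchor` class and `clip_prompt` helper; neither is part of Veo.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CharacterAnchor:
    """Fixed description reused verbatim across clips (hypothetical helper)."""

    physical: str
    clothing: str
    markers: Tuple[str, ...] = ()

    def describe(self) -> str:
        # Same wording every time, so each clip receives identical anchors.
        return ", ".join([self.physical, self.clothing, *self.markers])


def clip_prompt(camera: str, anchor: CharacterAnchor, action: str) -> str:
    """Camera first, then the unchanged anchor text, then the new action."""
    return f"{camera}. {anchor.describe()}, {action}."


sailor = CharacterAnchor(
    physical="Elderly man, silver hair cropped short, weathered face with scar above left eyebrow",
    clothing="faded navy pea coat with tarnished brass buttons",
    markers=("red-rimmed glasses", "gold pocket watch chain"),
)
clip_a = clip_prompt("Medium shot", sailor, "checking his watch on a foggy pier")
clip_b = clip_prompt("Close-up", sailor, "looking up as rain begins")
print(clip_a)
print(clip_b)
```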
3. Action & Physics
Use dynamic verbs demonstrating physical interaction:
- Motion quality: `strides confidently`, `shuffles nervously`, `explodes outward`, `cascades smoothly`
- Physics indicators: `realistic water displacement`, `accurate gravity`, `cloth simulation`, `hair physics`
- Temporal markers: `gradually accelerating`, `suddenly freezing`, `continuously swirling`
Physics Strengths: Fluid dynamics, fire/smoke, fabric movement, particle systems (dust, snow, sparks).
4. Context & Atmosphere
Establish environment and mood:
Lighting
- Time: `dawn blue hour`, `golden hour`, `harsh midday`, `neon-lit night`
- Quality: `volumetric god rays`, `chiaroscuro shadows`, `soft diffused overcast`, `practical lamp light`
- Effects: `lens flare`, `bloom`, `atmospheric haze`
Environment
- Spatial: `cramped interior`, `vast open landscape`, `claustrophobic corridor`
- Weather: `rain-streaked windows`, `fog-choked streets`, `snow-dusted`
5. Style & Audio Synthesis
Visual Style (See style-reference.md)
- Genre: `cinematic noir`, `BBC Earth documentary`, `Pixar 3D animation`, `vintage 16mm`
- Color: `teal and orange blockbuster grade`, `bleach bypass desaturated`, `vibrant anime saturation`
- Texture: `film grain 35mm`, `digital sharpness`, `VHS tracking lines`
Native Audio (See audio-synthesis.md)
- Dialogue: "Smooth baritone voice: 'The package has arrived.'"
- Ambient: "Coffee shop murmur, ceramic cup placed on saucer"
- Effects: "Satisfying mechanical keyboard clack", "metallic slide and lock"
- Music: "Tense strings building to crescendo", "lo-fi hip hop beat 80bpm"
Audio Sync Tips
- Time dialogue: "At 2-second mark, character speaks"
- Match motion: "Music swells as camera pushes in"
- Layer sounds: "Base ambiance, then specific Foley, then dialogue"
Advanced Workflows
Ingredients to Video (Character Consistency)
Use when maintaining characters across multiple clips:
- Upload 1-3 reference images (character face, outfit, environment)
- Prompt structure: "Same woman from reference image, now walking through rainy Tokyo street. Maintaining red leather jacket and bob haircut."
- Keep anchors consistent: "Same scar on cheek, same gold hoop earrings"
First & Last Frame Interpolation
Create specific transitions:
- Describe starting state: "Sealed envelope on mahogany desk"
- Describe ending state: "Open envelope, letter partially pulled out, handwritten text visible"
- Veo generates a smooth 8-second transition between the two states
- Use for: reveals, transformations, camera moves through portals
Timestamp Prompting (Multi-Shot Narrative)
Break 8 seconds into precise segments for narrative control:
- 0-2s: Wide shot of empty desert highway, heat shimmer rising
- 2-4s: Close-up of motorcycle speedometer needle climbing
- 4-6s: Tracking shot of rider leaning into turn, dust cloud trailing
- 6-8s: Wide shot of motorcycle disappearing into sunset, engine roar fading
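Segments like these are easy to get wrong by leaving a gap or overrunning the clip. The sketch below formats a shot list in the same `start-end s: description` style and checks that the segments cover the clip with no gaps or overlaps. `timestamp_prompt` is a hypothetical helper, and the strict contiguity check is a choice made here, not a documented Veo requirement.

```python
from typing import List, Tuple

Shot = Tuple[int, int, str]  # (start_s, end_s, description)


def timestamp_prompt(shots: List[Shot], total_s: int = 8) -> str:
    """Format shots as 'start-end s: text' lines covering 0..total_s exactly."""
    expected_start = 0
    lines = []
    for start, end, text in shots:
        # Each shot must begin where the previous one ended and have length > 0.
        if start != expected_start or end <= start:
            raise ValueError(f"segment {start}-{end}s leaves a gap or overlaps")
        lines.append(f"{start}-{end}s: {text}")
        expected_start = end
    if expected_start != total_s:
        raise ValueError(f"segments end at {expected_start}s, expected {total_s}s")
    return "\n".join(lines)


ride = timestamp_prompt([
    (0, 2, "Wide shot of empty desert highway, heat shimmer rising"),
    (2, 4, "Close-up of motorcycle speedometer needle climbing"),
    (4, 6, "Tracking shot of rider leaning into turn, dust cloud trailing"),
    (6, 8, "Wide shot of motorcycle disappearing into sunset, engine roar fading"),
])
print(ride)
```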
Model Selection
| Model | Speed | Use Case |
|---|---|---|
| | Standard | High-quality commercial content, final delivery |
| | Fast | Rapid iteration, social media drafts, testing compositions |
Structured Prompting (JSON)
For complex commercial workflows, use a JSON structure:
```json
{
  "project_meta": {
    "aspect_ratio": "16:9",
    "resolution": "1080p",
    "model": "veo-3.1-generate-001"
  },
  "scene": {
    "cinematography": "Macro lens, slow orbit 360 degrees",
    "subject": "Artisan coffee cup, steam rising in spiral patterns",
    "action": "Hand enters frame, gentle grip, lifting slowly",
    "context": "Minimalist white studio, softbox lighting from above",
    "style": "Premium commercial aesthetic, shallow depth of field",
    "audio": "Ceramic on ceramic sound, gentle sip, ambient cafe hum"
  },
  "timeline": [
    {"time": "0-3s", "focus": "Product detail, steam physics"},
    {"time": "3-6s", "focus": "Hand interaction, human element"},
    {"time": "6-8s", "focus": "Logo reveal, satisfying audio punctuation"}
  ]
}
```
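If a tool in the pipeline only accepts a plain-text prompt, a structured spec of this shape can be flattened back into the five-part order. The following is a sketch under that assumption: `flatten_spec` is a hypothetical helper, and the key names are taken from the JSON example rather than from any published schema.

```python
import json

# Scene keys in formula order; "style" and "audio" are separate keys in the example.
SCENE_KEYS = ("cinematography", "subject", "action", "context", "style", "audio")


def flatten_spec(spec_json: str) -> str:
    """Turn a structured spec into one text prompt, camera direction first."""
    spec = json.loads(spec_json)
    scene = spec["scene"]
    missing = [key for key in SCENE_KEYS if key not in scene]
    if missing:
        raise ValueError(f"scene is missing keys: {missing}")
    # The five visual parts become sentences; audio gets a "Sound:" label.
    visual = ". ".join(scene[key] for key in SCENE_KEYS[:5])
    text = f"{visual}. Sound: {scene['audio']}."
    # Optional timeline beats are appended as 'time: focus' sentences.
    beats = [f"{b['time']}: {b['focus']}" for b in spec.get("timeline", [])]
    if beats:
        text += " " + " ".join(f"{beat}." for beat in beats)
    return text


spec_text = json.dumps({
    "scene": {
        "cinematography": "Macro lens, slow orbit 360 degrees",
        "subject": "Artisan coffee cup, steam rising in spiral patterns",
        "action": "Hand enters frame, gentle grip, lifting slowly",
        "context": "Minimalist white studio, softbox lighting from above",
        "style": "Premium commercial aesthetic, shallow depth of field",
        "audio": "Ceramic on ceramic sound, gentle sip, ambient cafe hum",
    },
    "timeline": [{"time": "0-3s", "focus": "Product detail, steam physics"}],
})
print(flatten_spec(spec_text))
```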
Technical Constraints
- Prompt limit: 2,000 characters
- Image input: Maximum 20MB (JPEG/PNG) for Ingredients to Video
- Audio: Generate natively or add it in post-production; Veo audio works best with clear, brief dialogue (1-2 sentences max)
- Text rendering: Veo struggles with legible text; avoid critical text overlays or specify "stylized unreadable text"
- Concurrent requests: 50 per minute per region (free tier)
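The limits on prompt length and image input are simple to check locally before submitting. The sketch below is one possible pre-flight check; `find_problems` is a hypothetical helper, "20MB" is read as decimal megabytes (an assumption), and the cap of three images comes from the reference-image limit stated under Core Capabilities.

```python
from pathlib import PurePath
from typing import List, Tuple

MAX_PROMPT_CHARS = 2_000
MAX_REFERENCE_IMAGES = 3
MAX_IMAGE_BYTES = 20 * 1000 * 1000  # assumption: "20MB" means decimal megabytes
ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_problems(prompt: str, images: List[Tuple[str, int]]) -> List[str]:
    """Return human-readable problems; images are (filename, size_in_bytes)."""
    problems = []
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt is {len(prompt)} characters (limit {MAX_PROMPT_CHARS})")
    if len(images) > MAX_REFERENCE_IMAGES:
        problems.append(f"{len(images)} reference images (limit {MAX_REFERENCE_IMAGES})")
    for name, size in images:
        # File type is judged by extension only, which is a simplification.
        if PurePath(name).suffix.lower() not in ALLOWED_SUFFIXES:
            problems.append(f"{name}: not JPEG/PNG")
        if size > MAX_IMAGE_BYTES:
            problems.append(f"{name}: larger than 20MB")
    return problems


print(find_problems("Tracking shot of a fisherman.", [("face.png", 1_500_000)]))
```

Returning a list instead of raising on the first failure lets a caller report every issue at once.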
Troubleshooting
| Issue | Solution |
|---|---|
| Blurry motion | Add "sharp motion capture, 1/1000s shutter freeze" or "slow-motion 240fps" |
| Character inconsistency | Use Ingredients to Video with reference images; anchor with unique clothing/jewelry |
| Unwanted morphing | Specify rigidity: "maintains solid form", "no deformation"; use negative prompting |
| Physics look artificial | Explicitly state "realistic physics", "accurate weight and momentum" |
| Audio out of sync | Use timestamp markers: "dialogue starts at 2s", "sound matches impact at 4s" |
| Flat lighting | Specify contrast: "dramatic side lighting", "deep shadows", "rim light separation" |
| Boring composition | Lead with dynamic camera: "whip pan", "aggressive tracking shot", "rapid dolly in" |
When to Load References
| Task | Resource |
|---|---|
| Camera movement vocabulary | camera-techniques.md |
| Visual styles and color grading | style-reference.md |
| Genre-specific templates | |
| Audio generation techniques | audio-synthesis.md |
| Copy-paste templates | |
Camera Techniques for Veo 3.1
Veo 3.1 excels at film grammar and camera movement. Precise terminology yields better motion control.
Movement Types
Linear Motion
| Term | Description | Best For |
|---|---|---|
| `Dolly in/out` | Physical camera movement toward/away from subject | Emotional reveals, focus shifts |
| `Tracking shot` | Camera moves parallel to subject | Following subjects, dynamic action |
| `Truck` | Horizontal lateral movement | Revealing environments |
| | Vertical movement without angle change | Elevating reveals |
| `Crane` | Sweeping vertical arc | Epic scale reveals |
| | Zoom while maintaining perspective | Intensifying moments |
Rotational Motion
| Term | Description | Best For |
|---|---|---|
| `Pan` | Horizontal rotation, fixed position | Scanning landscapes, following movement |
| `Tilt` | Vertical rotation, fixed position | Revealing height, vertical subjects |
| `Orbit` | 360° circular path around subject | Product showcases, dramatic encirclement |
| | Tilted horizon | Disorientation, tension, stylized action |
Dynamic Techniques
| Term | Description | Best For |
|---|---|---|
| `Handheld` | Slight organic jitter | Documentary realism, urgency |
| `Gimbal smooth` | Stabilized fluid motion | Cinematic tracking, luxury aesthetic |
| `Whip pan` | Fast blur transition between subjects | Energy, scene transitions |
| `Rack focus` | Shifts focal plane between foreground/background | Narrative focus shifts, depth emphasis |
| `Slow push-in` | Gradual camera advance | Contemplative, intimate moments |
Shot Framing Hierarchy
Prefix camera moves with framing:
- `Extreme close-up`: Detail only (eyes, texture)
- `Close-up`: Head and shoulders
- `Medium shot`: Waist up
- `Wide shot`: Full subject with environment
- `Extreme wide/Establishing`: Environment dominant
- `Over-the-shoulder`: Dialogue scenes
- `POV (Point of view)`: Immersive perspective
- `Bird's eye view`: Top-down, detachment
- `Worm's eye view`: Looking up, empowerment/intimidation
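To make the framing-first habit mechanical, a small helper can check the framing term against this list and place it ahead of the movement. This is a sketch only: `camera_clause` is a hypothetical function, and the term list is transcribed from the hierarchy above, with "Extreme wide" and "Establishing" split into separate entries.

```python
from typing import Optional

# Framing terms transcribed from the hierarchy above.
FRAMINGS = (
    "Extreme close-up", "Close-up", "Medium shot", "Wide shot",
    "Extreme wide", "Establishing", "Over-the-shoulder", "POV",
    "Bird's eye view", "Worm's eye view",
)


def camera_clause(framing: str, movement: str, lens: Optional[str] = None) -> str:
    """Return 'framing, movement[, lens]' with the framing term first."""
    known = {term.lower(): term for term in FRAMINGS}
    canonical = known.get(framing.strip().lower())
    if canonical is None:
        raise ValueError(f"unknown framing term: {framing}")
    clause = f"{canonical}, {movement.strip()}"
    if lens:
        clause += f", {lens.strip()}"
    return clause


print(camera_clause("extreme close-up", "slow push-in", "anamorphic flares"))
print(camera_clause("Medium shot", "handheld, following subject through crowded market"))
```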
Speed and Time Modifiers
- `Slow-motion 240fps`: Smooth, detailed action
- `Real-time 24fps`: Natural motion
- `Time-lapse`: Compressed time, clouds, construction
- `Hyperlapse`: Moving time-lapse
- `Staccato/jerky motion`: Uneven, mechanical, horror
Lens Characteristics
Include to control optical personality:
- `Anamorphic lens`: Horizontal flares, oval bokeh, cinematic width
- `Wide angle 16mm`: Distortion, expansive space
- `Telephoto 85mm`: Compressed space, portrait perspective
- `Macro lens`: Extreme close focus, shallow depth
- `Fisheye`: Spherical distortion, extreme wide
- `Vintage lens`: Softness, chromatic aberration, character
Example Combinations
Extreme close-up of eye, slow push-in, rack focus to reflection in iris, anamorphic flares
Wide establishing shot, crane down through clouds to reveal mountain valley, golden hour lighting
Handheld medium shot following subject through crowded market, whip pan to street sign