kling-3-prompting
Overview
Kling 3.0 is a unified multimodal video model. It understands cinematic direction, not keyword lists. Write prompts like a director — describe what the audience sees, hears, and feels over time.
Core shift: Description → Direction. Think "direct a scene" not "describe an image."
Interactive Builder Workflow
When invoked, guide the user through these steps using `AskUserQuestion`:

```dot
digraph builder {
    "1. Generation mode?" [shape=diamond];
    "Text-to-Video" [shape=box];
    "Image-to-Video" [shape=box];
    "Multi-Shot Sequence" [shape=box];
    "Keyframe Transition" [shape=box];
    "2. Gather scene details" [shape=box];
    "3. Assemble prompt" [shape=box];
    "4. Present & refine" [shape=box];
    "1. Generation mode?" -> "Text-to-Video";
    "1. Generation mode?" -> "Image-to-Video";
    "1. Generation mode?" -> "Multi-Shot Sequence";
    "1. Generation mode?" -> "Keyframe Transition";
    "Text-to-Video" -> "2. Gather scene details";
    "Image-to-Video" -> "2. Gather scene details";
    "Multi-Shot Sequence" -> "2. Gather scene details";
    "Keyframe Transition" -> "2. Gather scene details";
    "2. Gather scene details" -> "3. Assemble prompt";
    "3. Assemble prompt" -> "4. Present & refine";
}
```
Step 1: Determine Generation Mode
Ask the user which mode:
- Text-to-Video — prompt from scratch
- Image-to-Video — animate a reference image
- Multi-Shot Sequence — 2-6 shot storyboard (up to 15s)
- Keyframe Transition — start frame → end frame with interpolated motion
Step 2: Gather Scene Details
Ask about each element (adapt questions to mode):
| Element | Question | Why it matters |
|---|---|---|
| Subject | Who/what is the focus? Specific appearance details? | Anchors consistency — define distinguishing traits early |
| Action | What happens? Describe the timeline (first → then → finally) | Kling 3.0 excels at sequential action over 15s arcs |
| Environment | Where? Be specific (not "a street" but "narrow Tokyo alley, steam from grates") | Grounds the scene physically |
| Camera | Shot type and movement? (See camera reference below) | Cinematic language produces far better results |
| Lighting | What light sources? Name them specifically | "Flickering neon" beats "dramatic lighting" |
| Mood/Emotion | What should the audience feel? | Drives color grade, pacing, music |
| Audio | Dialogue? Ambient sound? Music? | Kling 3.0 generates native audio + lip-sync |
| Duration | How long? (3-15s) | Longer = describe progression over time |
| Aspect Ratio | 16:9 / 9:16 / 1:1 / 21:9? | 16:9 cinematic, 9:16 social, 21:9 ultra-wide |
Image-to-Video: Focus on how the scene evolves from the image — movement, camera motion, environmental change. The model preserves identity/layout from the source.
Keyframes: Ask for start and end frame descriptions. Frames should match in color, style, and lighting. Prompt sparingly — Kling infers motion well.
Multi-Shot: Define each shot separately with its own framing, subject, action, and duration. Label shots explicitly.
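The answers gathered in this step can be held in a small record and checked before assembly. The sketch below is illustrative only: `SceneDetails`, its field names, and the choice of which fields are required are assumptions made for this example, not part of any Kling API. The duration range and aspect ratios come from the table above.

```python
from dataclasses import dataclass

# Hypothetical helper for the Step 2 answers; not a Kling API.
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1", "21:9"}
MIN_DURATION_S, MAX_DURATION_S = 3, 15


@dataclass
class SceneDetails:
    subject: str
    action: str
    environment: str
    camera: str
    lighting: str
    mood: str = ""
    audio: str = ""
    duration_s: int = 5
    aspect_ratio: str = "16:9"

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means the details are usable."""
        issues = []
        # Assumption: these five elements are treated as required.
        for name in ("subject", "action", "environment", "camera", "lighting"):
            if not getattr(self, name).strip():
                issues.append(f"missing {name}")
        if not MIN_DURATION_S <= self.duration_s <= MAX_DURATION_S:
            issues.append(f"duration must be {MIN_DURATION_S}-{MAX_DURATION_S}s")
        if self.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
            issues.append(f"unsupported aspect ratio {self.aspect_ratio}")
        return issues
```

Any issue returned by `problems()` maps directly to a follow-up question for the user.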
Step 3: Assemble the Prompt
Use the Master Formula:

```
[Scene/Environment] + [Subject & Appearance] + [Action Timeline] + [Camera Movement] + [Audio & Atmosphere] + [Technical Specs]
```

Writing rules:
- Use cinematic motion verbs: dolly push, whip-pan, crash zoom, rack focus, tracking shot — NOT "moves" or "goes"
- Name real light sources: neon signs, candlelight, golden hour, LED panels — NOT "dramatic lighting"
- Include texture for credibility: grain, lens flares, condensation, fabric sheen, smoke, sweat
- Describe temporal flow: beginning → middle → end
- Keep to 1-3 rich sentences per shot (specificity > length)
- For dialogue: use character labels, assign voice tone/emotion, use transitional words ("Immediately," "Pause")
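As a rough illustration of the Master Formula, the slots can be joined in order into one prompt string. This is a minimal sketch, assuming each slot arrives as a short, already-written fragment; the function name `assemble_prompt` and its parameters are hypothetical and only mirror the formula's slot names.

```python
def assemble_prompt(scene, subject, action_timeline, camera,
                    audio_atmosphere="", technical_specs=""):
    """Join the Master Formula slots in order, skipping empty ones."""
    parts = [scene, subject, action_timeline, camera,
             audio_atmosphere, technical_specs]
    # Drop blank slots and trailing periods so fragments join cleanly.
    cleaned = [p.strip().rstrip(".") for p in parts if p and p.strip()]
    return ". ".join(cleaned) + "."


prompt = assemble_prompt(
    scene="Narrow Tokyo alley, steam from grates",
    subject="Woman in red dress",
    action_timeline="She turns slowly, then disappears around the corner.",
    camera="Handheld shoulder-cam drifts behind her",
    technical_specs="Shot on 35mm film, 16:9",
)
```

The helper only fixes slot order. The joined fragments may still need condensing by hand to stay within the 1-3 sentences per shot recommended above.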
Step 4: Present & Refine
Present the assembled prompt. Ask if they want to:
- Adjust any element
- Add a negative prompt
- Generate variations (different duration, different camera, different mood)
Quick Reference
Camera Movements
| Movement | Effect | Example phrase |
|---|---|---|
| Dolly push-in | Builds intimacy/tension | "slow dolly push-in toward her face" |
| Dolly zoom | Vertigo/dramatic reveal | "dolly zoom creating disorienting depth shift" |
| Tracking shot | Follows subject laterally | "camera tracks alongside as she walks" |
| Whip-pan | Energy/surprise | "whip-pan to reveal the door" |
| Crash zoom | Shock/emphasis | "sudden crash zoom on the object" |
| Rack focus | Shift attention | "rack focus from foreground hand to background figure" |
| Handheld/shoulder-cam | Raw/documentary feel | "handheld shoulder-cam with subtle sway" |
| Static tripod | Composed/observational | "locked-off static tripod, wide shot" |
| FPV drone | High-energy immersion | "dynamic FPV drone shot chasing through corridor" |
| Low-angle tracking | Heroic/imposing | "low-angle tracking shot, subject towers above" |
| Truck left/right | Lateral reveal | "camera trucks right revealing the cityscape" |
| Tilt up/down | Vertical reveal | "slow tilt up from boots to face" |
Lens & Film Stock
| Phrase | Effect |
|---|---|
| "Shot on 35mm film" | Warm grain, organic texture |
| "Macro 85mm lens" | Tight detail, shallow depth of field |
| "Wide-angle steadicam" | Smooth, immersive, spatial |
| "Handheld camcorder" | Raw VHS energy, nostalgic |
| "Anamorphic lens flare" | Cinematic horizontal streaks |
Lighting
Use specific sources, not adjectives:
- "Golden hour sun cutting through dusty warehouse windows"
- "Flickering neon casting magenta/cyan across wet pavement"
- "Single bare bulb swinging, casting moving shadows"
- "Cool blue LED panels reflecting off glass surfaces"
- "Candlelight warming skin tones, deep shadows beyond"
Color & Grade
- "Desaturated teal grade, crushed blacks"
- "Amber nightclub strobe cutting through smoke"
- "Cool blue haze filling the corridor"
- "Magenta neon reflecting off wet asphalt"
- "Overexposed highlights, blown-out whites"
Multi-Character Dialogue
| Rule | Do | Don't |
|---|---|---|
| Name characters | | |
| Anchor to action | Agent slams table. [Agent, angrily]: "Where is it?" | Just dialogue without visual action |
| Assign voice tone | | Generic "says" |
| Control timing | "Immediately," "Pause," "After a beat" | Back-to-back dialogue without transitions |
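The table's pattern of an anchoring action followed by a labeled, toned line can be sketched as a small formatter. The bracket layout follows the `[Agent, angrily]` example above; the function name `dialogue_beat` and its parameters are assumptions for illustration.

```python
def dialogue_beat(action, speaker, tone, line, transition=""):
    """Format one beat: optional timing cue, visual action, then labeled speech."""
    beat = f'{action.strip()} [{speaker}, {tone}]: "{line}"'
    cue = transition.strip()
    return f"{cue} {beat}" if cue else beat


beats = [
    dialogue_beat("Agent slams table.", "Agent", "angrily", "Where is it?"),
    dialogue_beat("The suspect leans back.", "Suspect", "calmly",
                  "I don't know.", transition="Pause."),
]
scene_dialogue = " ".join(beats)
```

Passing the timing cue separately makes it harder to write back-to-back dialogue without a transition.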
Multi-Shot Structure
```
Shot 1 (0-5s): [Wide establishing shot description]
Shot 2 (5-10s): [Medium/close-up with action progression]
Shot 3 (10-15s): [Resolution/reaction with camera payoff]
Atmosphere: [Overall mood, color grade]
Audio: [Sound design, music, dialogue]
```

Label every shot. Assign durations. Describe framing + subject + motion per shot.
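The shot labels and running timestamps in this structure are easy to get wrong by hand, so they can be generated from per-shot durations. The sketch below is a hypothetical helper (`build_multishot` is not a Kling function). It enforces the 2-6 shot and 15-second limits stated in Step 1.

```python
def build_multishot(shots, atmosphere, audio):
    """Build labeled multi-shot text from (duration_s, description) pairs."""
    if not 2 <= len(shots) <= 6:
        raise ValueError("a multi-shot sequence uses 2-6 shots")
    if sum(duration for duration, _ in shots) > 15:
        raise ValueError("total duration must not exceed 15s")
    lines, start = [], 0
    for index, (duration, description) in enumerate(shots, start=1):
        end = start + duration
        lines.append(f"Shot {index} ({start}-{end}s): {description}")
        start = end  # the next shot begins where this one ends
    lines.append(f"Atmosphere: {atmosphere}")
    lines.append(f"Audio: {audio}")
    return "\n".join(lines)


storyboard = build_multishot(
    [
        (5, "Wide establishing shot of a narrow Tokyo alley"),
        (5, "Medium shot, she turns slowly as neon flickers"),
        (5, "Close-up, rack focus from her hand to the door"),
    ],
    atmosphere="Desaturated teal grade, crushed blacks",
    audio="Rain on pavement, distant traffic",
)
```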
Start & End Frame Tips
- Frames should match in color palette, style, and lighting
- Identical start/end frames = seamless loop
- Prompt sparingly — Kling infers motion between frames well
- Simple camera directions: zoom in/out, pan left/right, tilt up/down
- 5s for dynamic transitions, 10s for complex transformations
- Start frame aspect ratio drives the whole clip
Negative Prompts
Use to prevent common AI defaults:

```
smiling, laughing, cartoonish, bright saturated colors, low resolution,
morphing, blurry text, disfigured hands, extra fingers, static pose,
frozen expression, stock photo aesthetic
```

Customize based on scene: remove items that conflict with your intent.
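One way to customize the list per scene is to start from these defaults and drop whatever the scene actually calls for. This is a minimal sketch; `customize_negatives` and its parameters are hypothetical names, and the default terms are copied from the list above.

```python
DEFAULT_NEGATIVES = [
    "smiling", "laughing", "cartoonish", "bright saturated colors",
    "low resolution", "morphing", "blurry text", "disfigured hands",
    "extra fingers", "static pose", "frozen expression",
    "stock photo aesthetic",
]


def customize_negatives(conflicts=(), extras=()):
    """Drop defaults that conflict with the scene, then append extra terms."""
    drop = {term.lower() for term in conflicts}
    kept = [term for term in DEFAULT_NEGATIVES if term not in drop]
    for term in extras:
        if term not in kept:  # avoid duplicate entries
            kept.append(term)
    return ", ".join(kept)


# A comedy scene wants smiles and laughter, so those defaults are removed.
comedy_negatives = customize_negatives(conflicts=["smiling", "laughing"])
```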
Weak → Strong
| Element | Weak | Strong |
|---|---|---|
| Camera | "Camera follows person" | "Handheld shoulder-cam drifts behind subject with subtle sway" |
| Subject | "A woman walking" | "Woman in red dress, heels clicking wet cobblestone" |
| Environment | "In a city" | "Narrow Tokyo alley, steam from grates, glowing vending machines" |
| Lighting | "Dramatic lighting" | "Flickering neon casting magenta/cyan across wet pavement" |
| Texture | "It looks realistic" | "Rain beading on leather jacket, condensation on glass, visible breath" |
| Motion | "She walks away" | "She turns slowly, hair catches light, disappears around corner" |
Common Mistakes
| Mistake | Fix |
|---|---|
| Keyword lists instead of scene direction | Write like directing a shot: subject + action + camera + environment |
| Vague motion ("moves," "goes") | Use cinematic verbs: dolly, track, whip-pan, crash zoom |
| Generic lighting ("dramatic") | Name the source: neon, candle, golden hour, LED panel |
| Overlong prompts | 1-3 rich sentences per shot; specificity > length |
| No temporal progression | Describe beginning → middle → end of the shot |
| Mismatched keyframes | Match color, lighting, and style between start/end frames |
| Unattributed dialogue | Label every speaker with name, tone, and emotion |
| Cramming multi-shot into one paragraph | Separate and label each shot with duration |
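Several of these mistakes can be caught mechanically before a prompt is presented. The sketch below is a rough, hypothetical linter: `lint_prompt` and its word lists are assumptions built from the examples in the table, and it checks only three rules (vague motion verbs, generic lighting, more than three sentences in one shot).

```python
import re

# Terms taken from the table above; extend as needed.
VAGUE_MOTION = {"moves", "goes"}
GENERIC_LIGHTING = "dramatic lighting"


def lint_prompt(prompt):
    """Return warnings for a single-shot prompt; an empty list means no hits."""
    warnings = []
    lowered = prompt.lower()
    words = set(re.findall(r"[a-z]+", lowered))
    vague = sorted(words & VAGUE_MOTION)
    if vague:
        warnings.append("vague motion: " + ", ".join(vague))
    if GENERIC_LIGHTING in lowered:
        warnings.append("generic lighting: name the light source")
    # Naive sentence split on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", prompt) if s.strip()]
    if len(sentences) > 3:
        warnings.append("more than 3 sentences in one shot")
    return warnings
```

A clean result does not mean the prompt is good. It only means none of these three patterns matched.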