video-producer-agent
Video Producer Skill
Create complete videos with voiceover, music, and visuals.
This is an orchestrator skill that combines:
- Script/storyboard generation (Claude)
- Voiceover synthesis (Gemini TTS)
- Background music (Lyria)
- Video clip generation (Veo 3.1) or image animation
- Final assembly (FFmpeg via media-utils)
Workflow
Step 1: Gather Requirements (REQUIRED)
⚠️ DO NOT skip this step. DO NOT run init_project.py until you have ALL answers.
Use interactive questioning — ask ONE question at a time, wait for the response, then ask the next. This creates a collaborative spec-driven process.
Question Flow
⚠️ Use the AskUserQuestion tool for each question below. Do not just print questions in your response; use the tool to create interactive prompts with the options shown.
Q1: Subject
"I'll create that video! First, what's it about? (e.g., product launch, brand story, tutorial, explainer, or describe your own)"
Wait for response.
Q2: Duration
"How long should the video be?
- 15 seconds (quick hook)
- 30 seconds (standard ad)
- 60 seconds (explainer)
- 2+ minutes (detailed)
- Or specify your own duration"
Wait for response.
Q3: Style
"What visual style?
- Premium/luxury
- Fun/playful
- Corporate/professional
- Dramatic/cinematic
- Minimal/clean
- Or describe your own style"
Wait for response.
Q4: Assets
"Do you have existing images or video clips to use?
- No, generate everything
- Yes, I have images (provide paths)
- Yes, I have video clips (provide paths)"
Wait for response.
Q5: Audio Strategy
"How should we handle audio?
- Custom — I generate voiceover + background music
- Veo native — Use Veo's built-in dialogue/SFX/ambient
- Silent — No audio, add later"
Wait for response.
Q6: Voice (if custom audio)
"What voice tone for the voiceover?
- Professional
- Friendly/warm
- Energetic
- Calm/soothing
- Dramatic
- Or describe your own tone"
Wait for response.
Q7: Music (if custom audio)
"What music vibe?
- Modern electronic
- Cinematic/epic
- Upbeat pop
- Ambient/chill
- Corporate
- Or describe your own style"
Wait for response.
Q8: Format
"What aspect ratio?
- 16:9 (YouTube, web)
- 9:16 (TikTok, Reels, Shorts)
- 1:1 (Instagram feed)"
Wait for response.
Q9: Resolution
"What resolution?
- 720p (faster generation)
- 1080p (standard HD)"
Wait for response.
Q10: Model
"Which Veo model?
- veo-3.1: Latest, highest quality (default)
- veo-3.1-fast: Faster generation, slightly lower quality
- veo-3: Previous generation
- veo-3-fast: Previous gen, faster"
Wait for response.
Quick Reference
| Question | Determines |
|---|---|
| Subject | Scene content and prompts |
| Duration | Scene count (Veo clips must be 4, 6, or 8 seconds) |
| Style | Visual prompts and music selection |
| Assets | Generate vs use existing |
| Audio | custom, veo_audio, or silent |
| Voice | TTS voice selection |
| Music | Lyria prompt |
| Format | Aspect ratio for Veo |
| Resolution | 720p or 1080p output quality |
| Model | veo-3.1, veo-3.1-fast, veo-3, veo-3-fast |
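The Duration row implies a small planning step: a target length has to be split into clips of 4, 6, or 8 seconds. The sketch below shows one way to do that. The function name `plan_clip_durations` and the "prefer 8-second clips, round odd targets up to the next even total" policy are assumptions for illustration, not something the skill's scripts are known to do.

```python
def plan_clip_durations(target_seconds: int) -> list[int]:
    """Split a target duration into clips of 4, 6, or 8 seconds.

    Hypothetical helper: the total is the smallest even number of seconds
    that is >= the target (and >= 4), built from as many 8 s clips as possible.
    """
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")

    # Sums of 4, 6, and 8 are always even, so round odd targets up.
    total = max(4, target_seconds + (target_seconds % 2))

    eights, remainder = divmod(total, 8)
    clips = [8] * eights
    if remainder == 2:
        # 8 + 2 cannot be a clip pair, so trade one 8 s clip for 6 s + 4 s.
        clips.pop()
        clips += [6, 4]
    elif remainder:
        clips.append(remainder)  # remainder is 4 or 6 here
    return clips


if __name__ == "__main__":
    for target in (15, 30, 60):
        plan = plan_clip_durations(target)
        print(target, plan, sum(plan))
```

The length of the returned list is the scene count to pass to `--scenes`; a 15-second target lands on 16 seconds because no combination of allowed clips sums to an odd number.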
Step 2: Initialize Project
Once you have the user's answers, initialize the project with their preferences:
```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-producer-agent/scripts/init_project.py \
  --name "Product Launch Video" \
  --duration 30 \
  --aspect-ratio 16:9 \
  --audio-strategy custom \
  --scenes 5
```
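If the answers from Step 1 are held in variables, the command can be assembled programmatically. This is a minimal sketch that uses only the flags documented for `init_project.py`; the function name `build_init_command` and the up-front validation are assumptions, and the script path is shortened to `init_project.py` for readability.

```python
import shlex

# Values listed in the init_project.py options table.
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3"}
AUDIO_STRATEGIES = {"custom", "veo_audio", "silent"}


def build_init_command(name, duration, aspect_ratio="16:9",
                       audio_strategy="custom", scenes=3):
    """Return the init_project.py invocation as an argument list (hypothetical helper)."""
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio}")
    if audio_strategy not in AUDIO_STRATEGIES:
        raise ValueError(f"unsupported audio strategy: {audio_strategy}")
    return [
        "python3", "init_project.py",
        "--name", name,
        "--duration", str(duration),
        "--aspect-ratio", aspect_ratio,
        "--audio-strategy", audio_strategy,
        "--scenes", str(scenes),
    ]


if __name__ == "__main__":
    cmd = build_init_command("Product Launch Video", 30, scenes=5)
    # shlex.join quotes the project name so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Returning a list rather than a string keeps the project name intact when it contains spaces and lets the caller pass the result straight to `subprocess.run`.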
Step 3: Configure project.json
Edit `project.json` with scene prompts, voiceover text, and music style based on the user's answers.
Step 4: Assemble the Video
```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-producer-agent/scripts/assemble.py \
  --project ~/my_video_project/
```

Project Structure
When you initialize a project, this folder structure is created:
```
my_project/
├── project.json              # Configuration: scenes, voiceover, music, settings
├── storyboard.md             # Planning document for the video
├── scenes/                   # Generated video clips from Veo
│   ├── scene1_intro.mp4
│   ├── scene2_features.mp4
│   └── scene3_cta.mp4
├── audio/                    # Audio assets
│   ├── voiceover.wav         # Generated voiceover
│   ├── background_music.wav  # Generated music
│   └── final_mix.mp3         # Mixed audio track
├── work/                     # Intermediate files (auto-generated)
│   ├── silent_scene1.mp4
│   ├── video_concatenated.mp4
│   └── ...
└── output/                   # Final deliverables
    └── product_launch_video_final.mp4
```
Scripts
init_project.py
Initialize a new video project with folder structure and templates.
```bash
# Basic project
python3 init_project.py --name "My Video" --duration 30

# Create in specific directory
python3 init_project.py --name "Demo Video" --output ~/Videos/

# Vertical video for social
python3 init_project.py --name "Instagram Reel" --aspect-ratio 9:16 --duration 15

# Use Veo's native audio (no custom voiceover/music)
python3 init_project.py --name "Cinematic Scene" --audio-strategy veo_audio

# More scenes
python3 init_project.py --name "Long Video" --duration 60 --scenes 5
```
**Options:**
| Option | Default | Description |
|--------|---------|-------------|
| `--name` | required | Project name |
| `--output` | current dir | Parent directory |
| `--duration` | 30 | Target duration in seconds |
| `--aspect-ratio` | 16:9 | 16:9, 9:16, 1:1, 4:3 |
| `--audio-strategy` | custom | custom, veo_audio, silent |
| `--scenes` | 3 | Number of scene placeholders |

assemble.py
Orchestrate the full video assembly pipeline.
```bash
# Full pipeline (generate everything + assemble)
python3 assemble.py --project ~/my_project/

# Skip generation (use existing scene/audio files)
python3 assemble.py --project ~/my_project/ --skip-generation

# Dry run (show what would be done)
python3 assemble.py --project ~/my_project/ --dry-run
```
**Pipeline steps:**
1. Generate video scenes (Veo 3.1)
2. Strip audio from scenes (if custom audio)
3. Generate voiceover (Gemini TTS)
4. Generate background music (Lyria)
5. Mix voiceover + music
6. Concatenate video clips
7. Merge audio with video
8. Output final video
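
The eight steps above describe the `custom` audio path. As a rough illustration of how the step list might shrink for the other two strategies, the sketch below derives the steps from the strategy name. The step identifiers and the `veo_audio` and `silent` branches are assumptions inferred from the strategy descriptions in this document, not taken from `assemble.py` itself.

```python
def pipeline_steps(audio_strategy: str) -> list[str]:
    """List the assembly steps for a given audio strategy (illustrative only)."""
    if audio_strategy not in ("custom", "veo_audio", "silent"):
        raise ValueError(f"unknown audio strategy: {audio_strategy}")

    steps = ["generate_scenes"]

    # Veo clips ship with generated audio; remove it unless it is being kept.
    if audio_strategy in ("custom", "silent"):
        steps.append("strip_audio")

    # Only the custom strategy produces its own voiceover and music.
    if audio_strategy == "custom":
        steps += ["generate_voiceover", "generate_music", "mix_audio"]

    steps.append("concatenate_clips")

    if audio_strategy == "custom":
        steps.append("merge_audio_video")

    steps.append("write_output")
    return steps


if __name__ == "__main__":
    for strategy in ("custom", "veo_audio", "silent"):
        print(strategy, "->", ", ".join(pipeline_steps(strategy)))
```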

---

project.json Configuration
```json
{
  "name": "Product Launch Video",
  "duration_target": 30,
  "aspect_ratio": "16:9",
  "resolution": "720p",
  "audio_strategy": "custom",
  "scenes": [
    {
      "id": 1,
      "name": "scene1_hero",
      "prompt": "Cinematic slow zoom on premium product, dramatic lighting, high-end commercial style",
      "duration": 6,
      "notes": "Music only, no voiceover"
    },
    {
      "id": 2,
      "name": "scene2_features",
      "prompt": "Product features demonstration, sleek animations, modern tech aesthetic",
      "duration": 8,
      "notes": "Voiceover starts here"
    },
    {
      "id": 3,
      "name": "scene3_cta",
      "prompt": "Product with logo on clean background, call to action moment",
      "duration": 6,
      "notes": "Music swells, voiceover ends"
    }
  ],
  "voiceover": {
    "enabled": true,
    "text": "Introducing the future of audio. Crystal clear sound. All-day comfort. Experience the difference.",
    "voice": "Charon",
    "style": "Professional, confident, premium brand voice"
  },
  "music": {
    "enabled": true,
    "prompt": "modern electronic, premium, sleek, product showcase, subtle bass",
    "duration": 35,
    "bpm": 100,
    "brightness": 0.6
  },
  "assembly": {
    "transition": "fade",
    "transition_duration": 0.5,
    "music_volume": 0.3,
    "fade_in": 1.0,
    "fade_out": 2.0
  }
}
```
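
Before running the assembly, it can help to sanity-check the configuration against the constraints stated elsewhere in this document (clip lengths of 4, 6, or 8 seconds; one of three audio strategies). The sketch below is a hypothetical pre-flight check; `validate_project` is not part of the skill, and whether `assemble.py` enforces any of these rules itself is not stated here.

```python
import json

ALLOWED_CLIP_SECONDS = {4, 6, 8}
AUDIO_STRATEGIES = {"custom", "veo_audio", "silent"}


def validate_project(project: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    scenes = project.get("scenes", [])
    if not scenes:
        problems.append("no scenes defined")

    for scene in scenes:
        duration = scene.get("duration")
        if duration not in ALLOWED_CLIP_SECONDS:
            problems.append(
                f"{scene.get('name', '?')}: duration {duration} is not 4, 6, or 8 seconds"
            )

    # Compare the sum of the scene durations with the requested target.
    total = sum(scene.get("duration", 0) for scene in scenes)
    target = project.get("duration_target")
    if target is not None and total < target:
        problems.append(f"scenes total {total}s, below duration_target of {target}s")

    if project.get("audio_strategy") not in AUDIO_STRATEGIES:
        problems.append(f"unknown audio_strategy: {project.get('audio_strategy')!r}")
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python3 validate_project.py path/to/project.json
    with open(sys.argv[1], encoding="utf-8") as handle:
        for problem in validate_project(json.load(handle)):
            print("WARNING:", problem)
```

Run against the sample configuration above, this check would flag that the three scenes add up to 20 seconds while `duration_target` is 30, which is a cue to add scenes or lower the target.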
Audio Strategies
| Strategy | Description | Use When |
|---|---|---|
| `custom` | Strip Veo audio, add custom voiceover + music | Most videos |
| `veo_audio` | Keep Veo's generated audio (dialogue, SFX) | Cinematic scenes, dialogues |
| `silent` | Strip audio, output silent video | Adding audio later |
Workflow: Creating a Video
Step 1: Initialize Project
```bash
python3 init_project.py --name "Wireless Earbuds Promo" --duration 30
```

Step 2: Plan the Storyboard
Edit `storyboard.md` to plan your video structure:
```markdown
## Scene 1: Hero Reveal (0-5s)
- Visual: Earbuds emerging from shadow, premium lighting
- Audio: Music only (dramatic intro)

## Scene 2: Sound Quality (5-12s)
- Visual: Sound waves, person enjoying music
- Audio: Voiceover: "Crystal clear sound. Immersive bass."

## Scene 3: Comfort (12-20s)
- Visual: Close-up of fit, person running
- Audio: Voiceover: "All-day comfort. Secure fit."

## Scene 4: CTA (20-30s)
- Visual: Product + logo
- Audio: Voiceover: "Experience the difference." + music swell
```

Step 3: Configure project.json
Fill in the scene prompts, voiceover text, and music style based on your storyboard.
Step 4: Assemble
```bash
python3 assemble.py --project ~/wireless_earbuds_promo/
```

Step 5: Review and Iterate
Check the output in `output/`. If adjustments are needed:
```bash
# Re-run with existing scenes (just re-mix audio)
python3 assemble.py --project ~/wireless_earbuds_promo/ --skip-generation
```

---

What You Can Create
| Type | Example |
|---|---|
| Product video | 30s hero video showcasing a product |
| Explainer video | How-to or feature explanation |
| Promo/ad video | Marketing advertisement |
| Demo video | Product demonstration |
| Training video | Internal training content |
| Testimonial | Customer quote style video |
| Brand video | Company/brand story |
Prerequisites
- `GOOGLE_API_KEY` - For Veo (video), Gemini TTS (voice), Lyria (music)
- FFmpeg installed: `brew install ffmpeg`
Video Styles & Music Pairings
| Style | Music Prompt | Voice |
|---|---|---|
| Premium/Luxury | "elegant, minimal, ambient, sophisticated" | Charon (informative) |
| Tech/Modern | "electronic, futuristic, clean, innovative" | Kore (firm) |
| Fun/Playful | "upbeat, cheerful, acoustic, positive" | Puck (upbeat) |
| Corporate | "professional, inspiring, orchestral lite" | Orus (firm) |
| Lifestyle | "chill, aspirational, indie, warm" | Aoede (breezy) |
| Dramatic/Cinematic | "epic, orchestral, emotional, building" | Gacrux (mature) |
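
The pairings above can be kept as a lookup table so that a chosen style pre-fills the `music.prompt` and `voiceover.voice` fields of `project.json`. This is a sketch only: the dictionary mirrors the table, while the function name `audio_defaults` and the lowercase-key matching are assumptions.

```python
# Style -> (music prompt, voice name), copied from the pairing table.
STYLE_PAIRINGS = {
    "premium/luxury": ("elegant, minimal, ambient, sophisticated", "Charon"),
    "tech/modern": ("electronic, futuristic, clean, innovative", "Kore"),
    "fun/playful": ("upbeat, cheerful, acoustic, positive", "Puck"),
    "corporate": ("professional, inspiring, orchestral lite", "Orus"),
    "lifestyle": ("chill, aspirational, indie, warm", "Aoede"),
    "dramatic/cinematic": ("epic, orchestral, emotional, building", "Gacrux"),
}


def audio_defaults(style: str):
    """Return project.json fragments for a style, or None if the style is not listed."""
    pairing = STYLE_PAIRINGS.get(style.strip().lower())
    if pairing is None:
        return None  # custom styles need a hand-written prompt and voice
    music_prompt, voice = pairing
    return {"music": {"prompt": music_prompt}, "voiceover": {"voice": voice}}


if __name__ == "__main__":
    print(audio_defaults("Fun/Playful"))
```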
Common Video Structures
Product Video (30s)
```
0-5s:   Hero shot (music only)
5-15s:  Features (voiceover + music)
15-25s: Lifestyle/use case (voiceover + music)
25-30s: Logo + CTA (music fade)
```

Explainer Video (60s)
```
0-5s:   Hook/problem statement
5-20s:  Solution introduction
20-45s: How it works (3 steps)
45-55s: Benefits summary
55-60s: CTA
```

Testimonial Video (45s)
```
0-5s:   Intro/name card
5-35s:  Testimonial quote (multiple scenes)
35-45s: Product shot + logo
```

Manual Workflow (Without Scripts)
If you prefer to run each step manually:

Generate scenes:
```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-generation/scripts/veo.py \
  --batch scenes.json
```
Generate voiceover:
```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/voice-generation/scripts/gemini_tts.py \
  --text "Your voiceover text..." \
  --voice Charon \
  --style "Professional, warm"
```
Generate music:
```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/music-generation/scripts/lyria.py \
  --prompt "modern electronic, premium" \
  --duration 35
```
Strip audio from clips:
```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/video_strip_audio.py \
  -i scene*.mp4
```
Concatenate videos:
```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/video_concat.py \
  -i silent_*.mp4 --transition fade -o video.mp4
```
Mix audio:
```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/audio_mix.py \
  --voice voiceover.wav --music music.wav -o audio.mp3
```
Merge audio + video:
```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/video_audio_merge.py \
  --video video.mp4 --audio audio.mp3 -o final.mp4
```
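
The seven manual steps can also be expressed as an ordered list of commands, which makes the dependency order explicit and is easy to hand to `subprocess.run`. The sketch below only builds the argument lists from the commands shown in this section; the function name `manual_commands` is hypothetical, and the file names the TTS and music scripts write by default are not documented here, so `voiceover.wav` and `music.wav` are carried over from the mix step as assumptions.

```python
import os
from pathlib import Path


def manual_commands(root, scene_files, silent_files, voiceover_text):
    """Return the manual workflow as argument lists, in execution order (illustrative)."""
    skills = Path(root) / "skills"

    def script(skill, name):
        return ["python3", str(skills / skill / "scripts" / name)]

    return [
        script("video-generation", "veo.py") + ["--batch", "scenes.json"],
        script("voice-generation", "gemini_tts.py")
        + ["--text", voiceover_text, "--voice", "Charon", "--style", "Professional, warm"],
        script("music-generation", "lyria.py")
        + ["--prompt", "modern electronic, premium", "--duration", "35"],
        # The shell examples pass a glob; here the expanded file lists are supplied.
        script("media-utils", "video_strip_audio.py") + ["-i", *scene_files],
        script("media-utils", "video_concat.py")
        + ["-i", *silent_files, "--transition", "fade", "-o", "video.mp4"],
        script("media-utils", "audio_mix.py")
        + ["--voice", "voiceover.wav", "--music", "music.wav", "-o", "audio.mp3"],
        script("media-utils", "video_audio_merge.py")
        + ["--video", "video.mp4", "--audio", "audio.mp3", "-o", "final.mp4"],
    ]


if __name__ == "__main__":
    root = os.environ.get("CLAUDE_PLUGIN_ROOT", ".")
    commands = manual_commands(
        root,
        scene_files=["scene1.mp4", "scene2.mp4"],
        silent_files=["silent_scene1.mp4", "silent_scene2.mp4"],
        voiceover_text="Your voiceover text...",
    )
    for command in commands:
        print(" ".join(command))  # replace with subprocess.run(command, check=True) to execute
```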
Input Files You Can Provide
| File Type | How It's Used |
|---|---|
| Product images | Animate with Veo as first frame |
| Logo (PNG) | Overlay on final scene |
| Existing voiceover | Place in `audio/` |
| Brand music | Place in `audio/` |
| Video clips | Place in `scenes/` |
| Script/copy | Use for voiceover text |
Limitations
- Veo video duration: Max 8 seconds per clip (concatenate for longer)
- Veo 3.1 includes audio: All clips have generated audio (strip if using custom)
- Processing time: Video generation takes 1-3 minutes per clip
- Resolution: Currently 720p or 1080p (1080p for 8s only)
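
Two of these limits are easy to check before spending generation time: the resolution rule and the expected wait. The sketch below encodes them as small helpers. Both function names are hypothetical, and the time estimate assumes clips are generated one after another, which this document does not confirm.

```python
def resolution_allowed(resolution: str, clip_seconds: int) -> bool:
    """Apply the stated rule: 720p for 4, 6, or 8 s clips; 1080p for 8 s clips only."""
    if resolution == "720p":
        return clip_seconds in (4, 6, 8)
    if resolution == "1080p":
        return clip_seconds == 8
    return False


def estimate_generation_minutes(clip_count: int) -> tuple[int, int]:
    """Return (low, high) minutes at 1-3 minutes per clip, assuming sequential generation."""
    return clip_count * 1, clip_count * 3


if __name__ == "__main__":
    clips = [8, 8, 8, 6]
    low, high = estimate_generation_minutes(len(clips))
    print(f"{len(clips)} clips: roughly {low}-{high} minutes of generation")
    for seconds in clips:
        if not resolution_allowed("1080p", seconds):
            print(f"a {seconds}s clip cannot be rendered at 1080p")
```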
Error Handling
| Error | Solution |
|---|---|
| "GOOGLE_API_KEY not set" | Set up API key per README |
| "FFmpeg not found" | Install: `brew install ffmpeg` |
| "project.json not found" | Run init_project.py first |
| Video generation timeout | Retry, or use shorter duration |
| Audio/video sync issues | Adjust scene durations |
Example Prompts
Simple:
"Create a 30-second product video for my new coffee maker"
Detailed:
"Create a 45-second product video for our new wireless earbuds. Premium, luxury feel. I have product photos attached. Professional male voiceover. Modern electronic music. 16:9 for YouTube."
With project:
"Initialize a video project called 'App Demo' with 5 scenes, 60 seconds total, vertical format for TikTok"