# moviepy for Video Production

moviepy is the toolkit's go-to library for putting deterministic text on top of AI-generated video and for building short, single-file Python video projects without a Remotion toolchain.
The deeper principle is trustworthy text: any genre where text has to be readable, accurate, and consistent (legally, editorially, or commercially) is a genre where AI-rendered in-frame text is unacceptable and a moviepy overlay step is the natural fix. Names must be spelled right. Prices must be exact. Source attributions must be pixel-perfect. AI generation models cannot guarantee any of that.

## When to use moviepy vs. Remotion

| Use moviepy when… | Use Remotion when… |
|---|---|
| Overlaying text/labels on an LTX-2 or SadTalker output | Building long-form sprint reviews or product demos |
| Building sub-30s ad-style spots in a single `build.py` | Multi-template, multi-brand, design-heavy work |
| Compositing data-driven visuals (matplotlib `FuncAnimation` → mp4) | Anything needing React components or design system reuse |
| One-off transformations on existing video files | Anything where the project lifecycle (planning → render) matters |
| You want zero Node.js / no React mental overhead | You want hot-reload preview in Remotion Studio |
Two runnable references for everything in this skill live in `examples/`:

  • `examples/quick-spot/build.py` — 15-second ad-style spot. Audio-anchored timeline, text overlay, optional VO + ducked music. Renders silent out of the box with zero external assets.
  • `examples/data-viz-chart/build.py` — animated time-series chart with deterministic title and source attribution. Demonstrates the matplotlib (data) + moviepy (trustworthy text) split.

Both run with `python3 build.py` and produce a real `out.mp4` immediately. Read them alongside this skill — every pattern below is shown working there.

**Dependencies.** `moviepy`, `Pillow`, and `matplotlib` are declared in `tools/requirements.txt` and installed with the toolkit's one-line Python setup: `python3 -m pip install -r tools/requirements.txt`. If you hit `Missing dependency` when running an example, run that command from the repo root — the examples' `build.py` files will tell you the same thing in their error message and exit cleanly rather than printing a bare traceback.
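The clean-exit behaviour works like a small import guard. This is a sketch of the idea, not the exact code in the example `build.py` files; the `require` helper name is hypothetical:

```python
import importlib.util
import sys

def require(*modules):
    """Exit with a friendly message instead of a traceback when deps are missing."""
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    if missing:
        sys.exit("Missing dependency: " + ", ".join(missing)
                 + ". Run: python3 -m pip install -r tools/requirements.txt")

# In a real build.py this would be: require("moviepy", "PIL", "matplotlib")
require("json", "pathlib")  # stdlib stand-ins so the sketch runs anywhere
```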

## The main use case: text on AI-generated video

Both LTX-2 and SadTalker output bare visuals:

  • LTX-2 cannot reliably render readable text (the model hallucinates letterforms — see the ltx2 skill's "Bad Prompts").
  • SadTalker outputs a talking head with no captions, labels, lower thirds, or context.

The fix is to generate the visual cleanly, then composite text over it deterministically with moviepy. This is the canonical pattern in this toolkit:

```python
from moviepy import VideoFileClip, ImageClip, CompositeVideoClip

# 1. AI-generated visual (LTX-2 or SadTalker output)
bg = VideoFileClip("lugh_ltx.mp4").without_audio()

# 2. Text rendered via PIL → ImageClip (see "Text rendering" below)
title = (
    ImageClip("text_cache/intro_title.png")
    .with_duration(2.0)
    .with_start(0.5)
    .with_position(("center", 880))
)

# 3. Composite
final = CompositeVideoClip([bg, title], size=(1920, 1080))
final.write_videofile("lugh_with_caption.mp4", fps=30, codec="libx264")
```

Common shapes this takes:

| Shape | LTX-2 use | SadTalker use |
|-------|-----------|---------------|
| Title card over hero footage | "INTRODUCING LONGARM" over a cinematic LTX-2 b-roll | n/a |
| Lower third / name plate | n/a | "Lugh — Ancient Warrior God" under a talking head |
| Quote caption | "I am going home." over an LTX-2 character cameo | Same, over a SadTalker talking head |
| Brand attribution | Logo + URL fade-in over the last second | Same |
| Tinted overlay for contrast | Dark navy semi-transparent layer behind text | Same |

## Genres where this shines

The "AI-visual + deterministic text overlay" pattern is the natural production pipeline for several styles of video. If the request matches one of these, reach for moviepy by default:

| Genre | What you overlay | Why moviepy is the right call |
|---|---|---|
| News / talking-head journalism | Speaker name plates, location bars, breaking-news banners, source attribution, pull quotes | Names must be spelled right (editorial / legal). The biggest category by volume. |
| Documentary segments | Interviewee lower thirds, chapter titles, archival source credits, location stamps | Same trust requirement as news. |
| Trailers / promo spots | Title cards, credit overlays ("FROM THE DIRECTOR OF…"), date stings, quote cards, CTAs | Tightly timed, text-heavy, every frame matters. The `q2-townhall-longarm-ad` example is exactly this. |
| Social short-form (Reels, TikTok, Shorts) | Word-accurate captions for sound-off viewing, hashtag overlays | Most social viewing is muted; captions are non-negotiable. |
| Product demos with annotations | Pricing callouts, feature labels, "click here" pointers over screen recordings, before/after labels | Prices and product names must be exact. |
| Tutorials / explainers | Step number overlays, terminal-command captions, keyboard-shortcut callouts | Step numbers must be sequential, commands must be copy-pasteable. |

Lesser-but-real fits: music videos (lyric overlays), reaction videos (source attribution), sports recaps (score overlays), real-estate tours (price / sqft), conference talks (speaker + session plate).

For full SRT-driven subtitling (long-form, time-coded, multilingual) moviepy is workable but not ideal — reach for `ffmpeg` with the `subtitles` filter or a dedicated subtitle tool. moviepy is best for hand-placed overlays, not bulk caption tracks.

## Text rendering — use PIL, not `TextClip`

**Critical gotcha:** moviepy 2.x's `TextClip(method='label')` has a tight-bbox bug that clips letter ascenders and descenders (the tops of capitals, the tails of g/p/y). On Apple Silicon you'll see characters with sliced edges and not realise what's wrong for hours.

**The workaround:** render text to a transparent PNG via PIL, then load it as an `ImageClip`. Cache the result by content hash so re-builds are free.
```python
import hashlib
from pathlib import Path
from PIL import Image, ImageDraw, ImageFont

ARIAL_BOLD = "/System/Library/Fonts/Supplemental/Arial Bold.ttf"

def render_text_png(txt, size, hex_color, cache_dir="./text_cache"):
    cache = Path(cache_dir); cache.mkdir(parents=True, exist_ok=True)
    key = hashlib.sha1(f"{txt}|{size}|{hex_color}".encode()).hexdigest()[:16]
    path = cache / f"{key}.png"
    if path.exists():
        return str(path)

    font = ImageFont.truetype(ARIAL_BOLD, size)
    bbox = ImageDraw.Draw(Image.new("RGBA", (1, 1))).textbbox((0, 0), txt, font=font)
    tw, th = bbox[2] - bbox[0], bbox[3] - bbox[1]
    pad = max(20, size // 4)

    img = Image.new("RGBA", (tw + pad * 2, th + pad * 2), (0, 0, 0, 0))
    rgb = tuple(int(hex_color.lstrip("#")[i:i+2], 16) for i in (0, 2, 4))
    ImageDraw.Draw(img).text((pad - bbox[0], pad - bbox[1]), txt, font=font, fill=(*rgb, 255))
    img.save(path)
    return str(path)
```

The full helper (with kwargs for bold, position, fades, and cleaner ergonomics) is in `examples/quick-spot/build.py` — copy it rather than re-implementing.
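The caching step is worth internalizing: the file name is a content hash of everything that affects the rendered pixels, so identical calls hit the cache and any change to text, size, or color produces a new file. A minimal check of that property, using the same key derivation as `render_text_png`:

```python
import hashlib

def cache_key(txt, size, hex_color):
    # Same key derivation as render_text_png above
    return hashlib.sha1(f"{txt}|{size}|{hex_color}".encode()).hexdigest()[:16]

assert cache_key("HELLO", 72, "#FFFFFF") == cache_key("HELLO", 72, "#FFFFFF")  # stable, so cache hits
assert cache_key("HELLO", 72, "#FFFFFF") != cache_key("HELLO", 73, "#FFFFFF")  # any change, new file
```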

## Audio-anchored timeline pattern

For ad-style edits where every frame matters, generate per-scene VO first and anchor every visual to known absolute timestamps. This eliminates timing drift entirely. See CLAUDE.md → Video Timing → Audio-Anchored Timelines for the full pattern. The short version:

```python
# Audio-anchored timeline (25s):
#   Scene 1  tired    0.3 → 3.74  (audio 3.44s)
#   Scene 2  worries  4.0 → 8.88  (audio 4.88s)
text_clip("TIRED OF", start=0.5, duration=1.2)
text_clip("THIRD-PARTY", start=1.0, duration=1.8)
vo_clip("01_tired.mp3", start=0.3)
vo_clip("02_worries.mp3", start=4.0)
```
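The absolute starts aren't hand-tuned: each scene begins where the previous VO ends plus a fixed breathing gap. A sketch of that derivation (the 0.26s gap is inferred from the timestamps in the timeline, not a documented constant):

```python
def scene_starts(vo_durations, first_start=0.3, gap=0.26):
    """Derive absolute scene start times from measured VO durations."""
    starts, t = [], first_start
    for d in vo_durations:
        starts.append(round(t, 2))
        t += d + gap  # next scene starts after this VO plus a breathing gap
    return starts

print(scene_starts([3.44, 4.88]))  # → [0.3, 4.0], matching the timeline above
```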

## Common recipes

### Text on a single AI-generated clip

```python
from moviepy import VideoFileClip, ImageClip, CompositeVideoClip

bg = VideoFileClip("ltx_hero.mp4").without_audio()
caption = (
    ImageClip(render_text_png("THE FUTURE OF AGENTS", 140, "#FFFFFF"))
    .with_duration(bg.duration)
    .with_position(("center", 880))
)
CompositeVideoClip([bg, caption], size=bg.size).write_videofile("captioned.mp4", fps=30)
```

### Lower third over a SadTalker talking head

```python
from moviepy import VideoFileClip, ImageClip, ColorClip, CompositeVideoClip

talking = VideoFileClip("narrator_sadtalker.mp4")
W, H = talking.size

# Semi-transparent bar across the bottom for contrast
bar = (
    ColorClip((W, 140), color=(20, 24, 38))
    .with_duration(talking.duration)
    .with_opacity(0.75)
    .with_position(("center", H - 160))
)
name = (
    ImageClip(render_text_png("LUGH", 72, "#F06859"))
    .with_duration(talking.duration)
    .with_position((80, H - 150))
)
title = (
    ImageClip(render_text_png("Ancient Warrior God", 36, "#FFFFFF"))
    .with_duration(talking.duration)
    .with_position((80, H - 80))
)
CompositeVideoClip([talking, bar, name, title]).write_videofile("with_lower_third.mp4", fps=30)
```

### Tinted overlay for text contrast over busy footage

LTX-2 b-roll is often too visually busy for legible text. Drop a semi-transparent navy layer between the video and the text:

```python
from moviepy import ColorClip

tint = (
    ColorClip((W, H), color=(20, 24, 38))
    .with_duration(duration)
    .with_opacity(0.55)
)

# Composite order: bg → tint → text
CompositeVideoClip([bg, tint, text_clip])
```

### Side-by-side composite

```python
from moviepy import VideoFileClip, CompositeVideoClip, ColorClip

left  = VideoFileClip("demo_a.mp4").resized(width=960).with_position((  0, "center"))
right = VideoFileClip("demo_b.mp4").resized(width=960).with_position((960, "center"))
bg    = ColorClip((1920, 1080), color=(0, 0, 0)).with_duration(max(left.duration, right.duration))
CompositeVideoClip([bg, left, right]).write_videofile("split.mp4", fps=30)
```

### Mix per-scene VO with ducked music

```python
from moviepy import AudioFileClip, CompositeAudioClip
from moviepy.audio.fx.MultiplyVolume import MultiplyVolume
from moviepy.audio.fx.AudioFadeIn import AudioFadeIn
from moviepy.audio.fx.AudioFadeOut import AudioFadeOut

music = AudioFileClip("music.mp3").with_effects([
    MultiplyVolume(0.22),  # duck under VO
    AudioFadeIn(0.5),
    AudioFadeOut(1.5),
])
vo = [
    AudioFileClip(f"scenes/0{i}.mp3").with_effects([MultiplyVolume(1.15)]).with_start(start)
    for i, start in [(1, 0.3), (2, 4.0), (3, 9.1)]
]
final_audio = CompositeAudioClip([music] + vo)
```

## Gotchas

  • moviepy 2.x renamed methods. Use `subclipped` (not `subclip`), `with_duration` / `with_start` / `with_position` (not `set_duration` etc.), and `with_effects([...])` instead of `.fadein()` / `.fadeout()`. Many tutorials online still show 1.x syntax — be skeptical.
  • `TextClip(method='label')` clips ascenders/descenders. Always use the PIL workaround above.
  • `OffthreadVideo` is Remotion-only. moviepy uses `VideoFileClip`. Don't mix the two.
  • Resizing requires Pillow ≥ 10.0 for the LANCZOS resample. If you see `ANTIALIAS` errors, upgrade Pillow.
  • `ColorClip` takes RGB tuples, not hex strings. Use `(20, 24, 38)`, not `"#141826"`.
  • Audio in `VideoFileClip` is loaded by default. Call `.without_audio()` if you only want the visual — composing with audio you don't want will cause silent VO drops in `CompositeAudioClip`.
  • Always set `size=(W, H)` on `CompositeVideoClip`. Without it, output dimensions follow the first clip, which can be smaller than your target.
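Since `ColorClip` wants RGB tuples while the text helper wants hex strings, a tiny converter lets one palette definition serve both. A convenience sketch, not part of the library:

```python
def hex_to_rgb(hex_color):
    """Convert "#141826"-style hex to the (r, g, b) tuple ColorClip expects."""
    s = hex_color.lstrip("#")
    return tuple(int(s[i:i + 2], 16) for i in (0, 2, 4))

print(hex_to_rgb("#141826"))  # → (20, 24, 38), the navy used throughout this skill
```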

## When to reach for what

| Task | Tool |
|---|---|
| Animate a still image | `tools/ltx2.py --input` |
| Talking head from photoreal portrait | `tools/sadtalker.py` |
| Talking head from stylized character | `tools/ltx2.py --input` (see ltx2 skill) |
| Add a label/caption/lower third to either of the above | moviepy + PIL (this skill) |
| Convert / compress / resize an existing file | `ffmpeg` (see ffmpeg skill) |
| Long-form, design-system-driven video | Remotion (see remotion skill) |

## References

  • Runnable example — short ad-style spot: `examples/quick-spot/build.py`
  • Runnable example — data-viz with text overlay: `examples/data-viz-chart/build.py`
  • Audio-anchored timelines: CLAUDE.md → Video Timing → Audio-Anchored Timelines
  • Related skills: `ltx2`, `ffmpeg`, `remotion`