Hero Demo Composition
An ~51s animated composition built with Editframe's React/timeline components that plays on the landing page. 8 scenes in a `Timegroup mode="sequence"` with crossfade overlaps, a single continuous voiceover, and word-level synced captions.

File Map
| File | Purpose |
|---|---|
| `HeroDemo.tsx` | Composition: DUR constants, CAPTIONS array, 8 scene functions, HeroDemoContent layout |
| `landing.css` | All `hero-` keyframes |
| `scripts/generate-hero-voiceover-local.py` | TTS generation + whisper splitting script |
| `public/audio/hero/` | Generated WAV/MP3 files, timing.json |
Architecture

```
FitScale
  Timegroup mode="contain" (960x540, relative)
    Audio src={VOICEOVER_SRC}          ← single continuous track
    Timegroup mode="sequence" overlapMs=495
      SceneTitle    (3333ms)
      SceneAuthor   (6333ms)
      SceneLayers   (7067ms)
      SceneTimeline (9767ms)
      SceneEditor   (8200ms)
      SceneTemplate (5500ms)
      SceneStream   (5700ms)
      SceneRender   (8833ms)
    SceneCaptions groups={CAPTIONS}    ← absolute overlay, global timestamps
```

- Overlap is 495ms (15 frames at 30fps). Each scene fades in/out over this window.
- Total duration with overlaps: ~51.3s.
- DUR values = voice duration + buffer. The buffer gives the scene visual breathing room beyond the VO.
- All animations are CSS-only, driven by `animationDelay` relative to each scene's local timeline. Set fill-mode on non-ef-text elements as needed (see css-animations skill); ef-text handles it automatically.
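The ~51.3s total can be checked from the scene durations and the overlap, since each of the 7 scene-to-scene transitions reclaims one overlap window (arithmetic sketch only):

```python
# Scene durations (ms) from the sequence Timegroup above.
DUR = [3333, 6333, 7067, 9767, 8200, 5500, 5700, 8833]
OVERLAP_MS = 495  # crossfade overlap between adjacent scenes

# 8 scenes have 7 transitions; each transition overlaps by OVERLAP_MS.
total_ms = sum(DUR) - (len(DUR) - 1) * OVERLAP_MS
print(total_ms / 1000)  # ≈ 51.3 s
```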
Voiceover Script
| # | Scene | Text |
|---|---|---|
| 1 | title | Video is a web page that moves. |
| 2 | author | It starts with HTML and CSS. When you need more, it's just React. |
| 3 | layers | Stack layers the way you stack divs. Video, text, shapes, 3D, mix everything. |
| 4 | timeline | Need an editor? Snap together GUI primitives. Timeline, waveforms, captions, into any editing experience you want. |
| 5 | editor | A full NLE. A simple trim tool in a form. It's your UI. These are just the building blocks. |
| 6 | template | Feed in data, and one template becomes ten thousand unique videos. |
| 7 | stream | Preview is instant. Change the code, see the frame. |
| 8 | render | When it's ready, render to the cloud, the browser, or the command line. Same composition, every target. |
Messaging: Editframe is a declarative layer. HTML/CSS is the foundation; React and JS are power tools on top. Not "just React" and not "just HTML."
Text rules: No em-dashes (they cause TTS pauses). Keep sentences short and punchy.
Voiceover Regeneration Workflow
Follow these steps exactly when the script text changes.
Step 1: Update the generation script
Edit `SEGMENTS` in `telecine/services/web/scripts/generate-hero-voiceover-local.py`. Each segment has:
- `key`: filename prefix (e.g. `01-title`)
- `text`: the voiceover line
- `last_word`: final word stem for whisper-based splitting
- `discard`: true for the preamble segment only

The first segment MUST be a preamble (`"discard": True`) with throwaway text like `"Here is the introduction."` This absorbs TTS model warmup noise. DO NOT remove it.

Segments are joined with `" ... "` pause separators into one TTS pass.
Step 2: Generate audio locally
```bash
cd telecine/services/web
python scripts/generate-hero-voiceover-local.py
```

This requires:
- `qwen_tts` package with the VoiceDesign model (`Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign`)
- `whisper`, `torch`, `soundfile`, `numpy`

DO NOT use the CustomVoice model (`generate_custom_voice`). It produces garbled output. Only use `generate_voice_design` with `language="English"`.

Output lands in `telecine/services/web/public/audio/hero/`:
- `voiceover.wav` / `.mp3`: full audio including preamble
- `01-title.wav` through `08-render.wav` / `.mp3`: per-scene splits (reference only)
- `timing.json`: per-scene durations in seconds
Step 3: Trim the preamble
The preamble must be removed. Find the trim point from the script's console output (it prints the preamble end time), then:

```bash
ffmpeg -y -i public/audio/hero/voiceover.wav -ss <PREAMBLE_END_SEC> public/audio/hero/voiceover-trimmed.wav
ffmpeg -y -i public/audio/hero/voiceover-trimmed.wav -codec:a libmp3lame -b:a 128k public/audio/hero/voiceover-trimmed.mp3
```
Step 4: Get global word timestamps
Run whisper on the trimmed file to get global timestamps (all relative to t=0 of the trimmed audio):

```python
import whisper, json

model = whisper.load_model("small")
result = model.transcribe("public/audio/hero/voiceover-trimmed.wav", language="en", word_timestamps=True)
words = []
for seg in result["segments"]:
    for w in seg.get("words", []):
        words.append({"word": w["word"].strip(), "start": round(w["start"] * 1000), "end": round(w["end"] * 1000)})
with open("/tmp/hero-whisper/global.json", "w") as f:
    json.dump(words, f, indent=2)
```
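A quick sanity check on the resulting `global.json` can catch an untrimmed preamble or out-of-order words before you rebuild captions. This helper is hypothetical (not part of the repo scripts); the 1500ms threshold is an assumed heuristic:

```python
import json

def check_global_timestamps(path):
    """Assert whisper word timestamps look sane: non-empty, early start, monotonic."""
    with open(path) as f:
        words = json.load(f)
    assert words, "no words transcribed"
    # If the first word starts late, the preamble was probably not trimmed off.
    assert words[0]["start"] < 1500, "audio may still contain the preamble"
    for a, b in zip(words, words[1:]):
        assert a["start"] <= b["start"], f"non-monotonic timestamp at {b['word']!r}"
    return len(words)
```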
Step 5: Upload to GCS
Files go to `gs://editframe-assets-7ac794b/hero/` (NOT `editframe-assets`). CDN serves at `https://assets.editframe.com/hero/`.

Use a content hash in the filename; the CDN cache won't let you overwrite the same URL:

```bash
HASH=$(md5 -q public/audio/hero/voiceover-trimmed.mp3 | head -c 8)
gsutil -h "Content-Type:audio/mpeg" -h "Cache-Control:public, max-age=31536000" \
  cp public/audio/hero/voiceover-trimmed.mp3 "gs://editframe-assets-7ac794b/hero/voiceover-${HASH}.mp3"
```
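The `md5 -q` invocation is macOS-specific; an equivalent 8-character content hash can be computed portably in Python (sketch, helper name is ours):

```python
import hashlib

def content_hash(path: str, length: int = 8) -> str:
    """First `length` hex chars of the file's MD5, for cache-busting filenames."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large WAV/MP3 files don't load fully into memory.
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()[:length]
```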
Step 6: Update HeroDemo.tsx
Three things to update:
- `VOICEOVER_SRC`: set to the new CDN URL with the content hash.
- `CAPTIONS` array: rebuild from the global whisper timestamps. Group words into caption groups that match natural sentence/phrase boundaries. Each group needs:
  - `showMs`: timestamp of the first word in the group (or slightly before)
  - `hideMs`: timestamp of the last word's end + ~400ms buffer
  - `words`: array of `{ w, s, e }` from whisper output

  Caption groups should break at sentence boundaries and match the scene they belong to. Multiple groups per scene is normal. The timestamps are GLOBAL (from the single voiceover track), not scene-relative.
- `DUR` constants: if voice durations changed significantly, adjust scene durations. Formula: `DUR.scene = voiceDurationMs + buffer`, where buffer is typically 800-1200ms. Round to 30fps alignment (multiples of 33.33ms). Check `timing.json` for per-scene voice durations.
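The caption-group rebuild and the 30fps rounding can be sketched as follows. This is a sketch, not the actual HeroDemo.tsx code: the sentence-punctuation heuristic and both helper names are ours, and real groups may also be split at phrase boundaries by hand:

```python
def round_to_30fps(ms: float) -> float:
    """Round a duration to the nearest multiple of one 30fps frame (1000/30 ms)."""
    frame = 1000 / 30
    return round(ms / frame) * frame

def group_captions(words, hide_buffer_ms=400):
    """Split global whisper words into caption groups at sentence-ending punctuation."""
    groups, current = [], []
    for w in words:
        current.append({"w": w["word"], "s": w["start"], "e": w["end"]})
        if w["word"].rstrip('"').endswith((".", "!", "?")):
            groups.append({
                "showMs": current[0]["s"],                    # first word in the group
                "hideMs": current[-1]["e"] + hide_buffer_ms,  # last word end + buffer
                "words": current,
            })
            current = []
    if current:  # trailing words with no terminal punctuation
        groups.append({"showMs": current[0]["s"], "hideMs": current[-1]["e"] + hide_buffer_ms, "words": current})
    return groups
```

Note the timestamps stay global throughout; nothing here rebases them to a scene's local timeline.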
Step 7: Verify
Play the composition in the browser. Check:
- Caption words highlight in sync with audio
- Scene transitions don't cut off voiceover mid-sentence
- No dead air gaps between scenes (overlaps should feel smooth)
Scene Editing
Each scene is a function (`SceneTitle`, `SceneAuthor`, etc.) returning a `Timegroup mode="fixed"` with an explicit `duration={DUR.xxx}` in ms.

Common patterns:
- `sceneStyle(d)` applies crossfade in/out via CSS animation
- Visual animations use `animationDelay` for stagger/sequencing; ef-text auto-defaults fill-mode, plain elements need explicit fill-mode
- `CompositionCanvas` wraps R3F Three.js content (particles in SceneStream)
- `<Text split="char">` for per-character stagger animations

When adding/editing animations, define new keyframes in `landing.css` with the `hero-` prefix.
Changelog MDX Authoring
Changelog entries live in `telecine/services/web/app/content/changelogs/{version}.mdx`. Each entry uses the same component set as HeroDemo but through MDX props.

Component API
`ChangelogIntroCard`: props `version`, `codename`, `title` (passed through `title.trim()`), `illustration`, `durationMs`.

```mdx
<ChangelogIntroCard
  version="0.47.0"
  codename="Mocha Relay"
  title="Short title for the card display"
  durationMs={4000}
/>
```

`ReleaseVideo`: the outer wrapper; it provides the `<Timegroup mode="contain">` scaling, with a `<Timegroup mode="sequence">` of scenes nested inside:

```mdx
<ReleaseVideo aspect="16/9">
  <Timegroup mode="sequence" overlapMs={600} style={{ width: 1920, height: 1080, position: "relative" }}>
    <ChangelogIntroCard ... />
    <TextMoment ... />
    <ChangelogOutroCard ... />
  </Timegroup>
</ReleaseVideo>
```

`ChangelogOutroCard`: props `version`, `tagline`, `durationMs`.

`TextMoment`: props `headline`, `body`, `motif`, `accentColor`, `durationMs`.

Pitfall: blank placeholder on page load
If `ReleaseVideo` renders as a blank gray box and never hydrates into a player, the cause is almost always a runtime error in one of the scene components. `ReleaseVideo` uses `isClient` gating, so SSR always shows the placeholder; errors in scene components silently kill client hydration. Check the browser console for the `TypeError` on the component throwing.

Pitfalls
- TTS CustomVoice model is broken: Only use VoiceDesign (`generate_voice_design`). CustomVoice produces unintelligible audio.
- Preamble is required: First 1-2s of TTS generation can be noisy. The throwaway preamble absorbs this. Always trim it off before uploading.
- GCS filename reuse: CDN caches aggressively. Always use content-hashed filenames when uploading new audio.
- Caption timestamps are global: They come from whisper on the single trimmed voiceover, not per-scene. Do not make them scene-relative.
- CompositionCanvas rendering: Uses `flushSync` + `useLayoutEffect` to work in the synchronous export pipeline. Do not add manual `gl.render()`/`gl.finish()` calls; that doubles GPU work.
- Em-dashes in VO text: Cause unwanted pauses in TTS output. Use commas or periods instead.