Hero Demo Composition

An ~51s animated composition built with Editframe's React/timeline components that plays on the landing page. 8 scenes in a `sequence` Timegroup with crossfade overlaps, a single continuous voiceover, and word-level synced captions.

File Map

| File | Purpose |
| --- | --- |
| `telecine/services/web/app/components/landing-v5/HeroDemo.tsx` | Composition: DUR constants, CAPTIONS array, 8 scene functions, HeroDemoContent layout |
| `telecine/services/web/app/styles/landing.css` | All `hero-*` keyframe animations |
| `telecine/services/web/scripts/generate-hero-voiceover-local.py` | TTS generation + whisper splitting script |
| `telecine/services/web/public/audio/hero/` | Generated WAV/MP3 files, `timing.json` |

Architecture

```
FitScale
  Timegroup mode="contain" (960x540, relative)
    Audio src={VOICEOVER_SRC}           ← single continuous track
    Timegroup mode="sequence" overlapMs=495
      SceneTitle      (3333ms)
      SceneAuthor     (6333ms)
      SceneLayers     (7067ms)
      SceneTimeline   (9767ms)
      SceneEditor     (8200ms)
      SceneTemplate   (5500ms)
      SceneStream     (5700ms)
      SceneRender     (8833ms)
    SceneCaptions groups={CAPTIONS}     ← absolute overlay, global timestamps
```
  • Overlap is 495ms (15 frames at 30fps). Each scene fades in/out over this window.
  • Total duration with overlaps: ~51.3s.
  • DUR values = voice duration + buffer. Buffer gives the scene visual breathing room beyond the VO.
  • All animations are CSS-only, driven by `animationDelay` relative to each scene's local timeline. Set fill-mode on non-ef-text elements as needed (see css-animations skill); ef-text handles it automatically.
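The duration arithmetic above can be sanity-checked in a few lines; the scene durations and overlap value are copied from this doc.

```python
# Eight scene durations (ms) and the crossfade overlap, from the tree above.
DUR_MS = [3333, 6333, 7067, 9767, 8200, 5500, 5700, 8833]
OVERLAP_MS = 495

# Seven overlaps are shared between eight consecutive scenes.
total_ms = sum(DUR_MS) - OVERLAP_MS * (len(DUR_MS) - 1)
print(total_ms / 1000)  # ~51.3 seconds, matching the total stated above
```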

Voiceover Script

| # | Scene | Text |
| --- | --- | --- |
| 1 | title | Video is a web page that moves. |
| 2 | author | It starts with HTML and CSS. When you need more, it's just React. |
| 3 | layers | Stack layers the way you stack divs. Video, text, shapes, 3D, mix everything. |
| 4 | timeline | Need an editor? Snap together GUI primitives. Timeline, waveforms, captions, into any editing experience you want. |
| 5 | editor | A full NLE. A simple trim tool in a form. It's your UI. These are just the building blocks. |
| 6 | template | Feed in data, and one template becomes ten thousand unique videos. |
| 7 | stream | Preview is instant. Change the code, see the frame. |
| 8 | render | When it's ready, render to the cloud, the browser, or the command line. Same composition, every target. |
Messaging: Editframe is a declarative layer. HTML/CSS is the foundation; React and JS are power tools on top. Not "just React" and not "just HTML."
Text rules: No em-dashes (they cause TTS pauses). Keep sentences short and punchy.
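The no-em-dash rule above is easy to enforce mechanically. A hypothetical sketch; the function name is ours, not part of the generation script:

```python
# Hypothetical lint for the VO text rules above: em-dashes cause unwanted
# TTS pauses, so flag any line containing one.
def check_vo_line(line: str) -> list[str]:
    problems = []
    if "\u2014" in line:  # em-dash character
        problems.append("em-dash causes a TTS pause; use a comma or period")
    return problems
```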

Voiceover Regeneration Workflow

Follow these steps exactly when the script text changes.

Step 1: Update the generation script

Edit `SEGMENTS` in `telecine/services/web/scripts/generate-hero-voiceover-local.py`. Each segment has:
  • `key`: filename prefix (e.g. `01-title`)
  • `text`: the voiceover line
  • `last_word`: final word stem for whisper-based splitting
  • `discard`: true for the preamble segment only

The first segment MUST be a preamble (`"discard": True`) with throwaway text like `"Here is the introduction."`; this absorbs TTS model warmup noise. DO NOT remove it.
Segments are joined with `" ... "` pause separators into one TTS pass.
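A hypothetical sketch of the `SEGMENTS` shape described above; the real entries live in `generate-hero-voiceover-local.py`, and the field values here are only illustrative.

```python
# Sketch of SEGMENTS entries, following the fields listed above.
SEGMENTS = [
    # Required preamble: absorbs TTS model warmup noise, discarded after splitting.
    {"key": "00-preamble", "text": "Here is the introduction.",
     "last_word": "introduction", "discard": True},
    {"key": "01-title", "text": "Video is a web page that moves.",
     "last_word": "moves", "discard": False},
]

# Segments join with " ... " pause separators into a single TTS pass.
tts_input = " ... ".join(seg["text"] for seg in SEGMENTS)
```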

Step 2: Generate audio locally

```bash
cd telecine/services/web
python scripts/generate-hero-voiceover-local.py
```

This requires:
  • `qwen_tts` package with the VoiceDesign model (`Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign`)
  • `whisper`, `soundfile`, `numpy`, `torch`

DO NOT use the CustomVoice model (`generate_custom_voice`). It produces garbled output. Only use `generate_voice_design` with `language="English"`.

Output lands in `telecine/services/web/public/audio/hero/`:
  • `voiceover.wav` / `.mp3`: full audio including preamble
  • `01-title.wav` through `08-render.wav` / `.mp3`: per-scene splits (reference only)
  • `timing.json`: per-scene durations in seconds

Step 3: Trim the preamble

The preamble must be removed. Find the trim point from the script's console output (it prints the preamble end time), then:
```bash
ffmpeg -y -i public/audio/hero/voiceover.wav -ss <PREAMBLE_END_SEC> public/audio/hero/voiceover-trimmed.wav
ffmpeg -y -i public/audio/hero/voiceover-trimmed.wav -codec:a libmp3lame -b:a 128k public/audio/hero/voiceover-trimmed.mp3
```

Step 4: Get global word timestamps

Run whisper on the trimmed file to get global timestamps (all relative to t=0 of the trimmed audio):
```python
import whisper, json
model = whisper.load_model("small")
result = model.transcribe("public/audio/hero/voiceover-trimmed.wav", language="en", word_timestamps=True)
words = []
for seg in result["segments"]:
    for w in seg.get("words", []):
        words.append({"word": w["word"].strip(), "start": round(w["start"] * 1000), "end": round(w["end"] * 1000)})
with open("/tmp/hero-whisper/global.json", "w") as f:
    json.dump(words, f, indent=2)
```

Step 5: Upload to GCS

Files go to `gs://editframe-assets-7ac794b/hero/` (NOT `editframe-assets`). The CDN serves at `https://assets.editframe.com/hero/`.

Use a content hash in the filename; the CDN cache won't let you overwrite the same URL:

```bash
HASH=$(md5 -q public/audio/hero/voiceover-trimmed.mp3 | head -c 8)
gsutil -h "Content-Type:audio/mpeg" -h "Cache-Control:public, max-age=31536000" \
  cp public/audio/hero/voiceover-trimmed.mp3 "gs://editframe-assets-7ac794b/hero/voiceover-${HASH}.mp3"
```

Step 6: Update HeroDemo.tsx

Three things to update:
  1. `VOICEOVER_SRC`: set to the new CDN URL with the content hash.
  2. `CAPTIONS` array: rebuild from the global whisper timestamps. Group words into caption groups that match natural sentence/phrase boundaries. Each group needs:
     • `showMs`: timestamp of the first word in the group (or slightly before)
     • `hideMs`: timestamp of the last word's end + ~400ms buffer
     • `words`: array of `{ w, s, e }` from whisper output
     Caption groups should break at sentence boundaries and match the scene they belong to. Multiple groups per scene is normal. The timestamps are GLOBAL (from the single voiceover track), not scene-relative.
  3. `DUR` constants: if voice durations changed significantly, adjust scene durations. Formula: `DUR.scene = voiceDurationMs + buffer`, where buffer is typically 800-1200ms. Round to 30fps alignment (multiples of 33.33ms). Check `timing.json` for per-scene voice durations.
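The CAPTIONS rebuild and 30fps rounding described above can be sketched as small helpers. The group field names (`showMs`/`hideMs`/`words` with `w`/`s`/`e`) follow this doc; the sentence-boundary heuristic and both function names are assumptions, not existing code.

```python
FRAME_MS = 1000 / 30  # one frame at 30fps, ~33.33ms

def align_to_30fps(ms: float) -> int:
    """Round a duration to the nearest whole frame at 30fps."""
    return round(round(ms / FRAME_MS) * FRAME_MS)

def build_caption_groups(words, buffer_ms=400):
    """Group global whisper words into caption groups at sentence boundaries."""
    groups, current = [], []
    for w in words:
        current.append({"w": w["word"], "s": w["start"], "e": w["end"]})
        if w["word"].endswith((".", "?", "!")):  # crude sentence boundary
            groups.append({
                "showMs": current[0]["s"],
                "hideMs": current[-1]["e"] + buffer_ms,  # ~400ms tail buffer
                "words": current,
            })
            current = []
    if current:  # trailing words with no closing punctuation
        groups.append({"showMs": current[0]["s"],
                       "hideMs": current[-1]["e"] + buffer_ms,
                       "words": current})
    return groups
```

Timestamps stay global throughout, matching the single voiceover track.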

Step 7: Verify

Play the composition in the browser. Check:
  • Caption words highlight in sync with audio
  • Scene transitions don't cut off voiceover mid-sentence
  • No dead air gaps between scenes (overlaps should feel smooth)

Scene Editing

Each scene is a function (`SceneTitle`, `SceneAuthor`, etc.) returning a `Timegroup mode="fixed"` with an explicit `duration={DUR.xxx}ms`.

Common patterns:
  • `sceneStyle(d)` applies crossfade in/out via CSS animation
  • Visual animations use `animationDelay` for stagger/sequencing; ef-text auto-defaults fill-mode, plain elements need explicit fill-mode
  • `CompositionCanvas` wraps R3F Three.js content (particles in SceneStream)
  • `<Text split="char">` for per-character stagger animations

When adding or editing animations, define new keyframes in `landing.css` with the `hero-` prefix.

Changelog MDX Authoring

Changelog entries live in `telecine/services/web/app/content/changelogs/{version}.mdx`. Each entry uses the same component set as HeroDemo but through MDX props.

Component API

`ChangelogIntroCard`: required props are `version`, `codename`, `title`. The `title` prop is NOT optional even though TypeScript won't enforce it at MDX authoring time. A missing `title` causes `title.trim()` to throw at runtime, breaking hydration and leaving the page as a blank placeholder. The `illustration` prop is a no-op stub kept for MDX compat.

```mdx
<ChangelogIntroCard
  version="0.47.0"
  codename="Mocha Relay"
  title="Short title for the card display"
  durationMs={4000}
/>
```

`ReleaseVideo`: wraps children in its own `<Timegroup mode="contain">`. Pass scene elements as direct children via an inner `<Timegroup mode="sequence">`:

```mdx
<ReleaseVideo aspect="16/9">
  <Timegroup mode="sequence" overlapMs={600} style={{ width: 1920, height: 1080, position: "relative" }}>
    <ChangelogIntroCard ... />
    <TextMoment ... />
    <ChangelogOutroCard ... />
  </Timegroup>
</ReleaseVideo>
```

`ChangelogOutroCard`: `version` and `tagline` required. `durationMs` defaults exist.

`TextMoment`: `headline` and `body` required; `motif`, `accentColor`, `durationMs` optional.

Pitfall: blank placeholder on page load

If `ReleaseVideo` renders as a blank gray box and never hydrates into a player, the cause is almost always a runtime error in one of the scene components. `ReleaseVideo` uses `isClient` gating, so SSR always shows the placeholder; errors in scene components silently kill client hydration. Check the browser console for a `TypeError` pointing at the component that throws.

Pitfalls

  • TTS CustomVoice model is broken: only use VoiceDesign (`generate_voice_design`). CustomVoice produces unintelligible audio.
  • Preamble is required: the first 1-2s of TTS generation can be noisy. The throwaway preamble absorbs this. Always trim it off before uploading.
  • GCS filename reuse: the CDN caches aggressively. Always use content-hashed filenames when uploading new audio.
  • Caption timestamps are global: they come from whisper on the single trimmed voiceover, not per-scene. Do not make them scene-relative.
  • CompositionCanvas rendering: uses `flushSync` + `useLayoutEffect` to work in the synchronous export pipeline. Do not add manual `gl.render()` / `gl.finish()` calls; that doubles GPU work.
  • Em-dashes in VO text: cause unwanted pauses in TTS output. Use commas or periods instead.