Hero Demo Composition

An ~51s animated composition built with Editframe's React/timeline components that plays on the landing page. 8 scenes in a `sequence` Timegroup with crossfade overlaps, a single continuous voiceover, and word-level synced captions.

File Map

| File | Purpose |
| --- | --- |
| `telecine/services/web/app/components/landing-v5/HeroDemo.tsx` | Composition: DUR constants, CAPTIONS array, 8 scene functions, HeroDemoContent layout |
| `telecine/services/web/app/styles/landing.css` | All `hero-*` keyframe animations |
| `telecine/services/web/scripts/generate-hero-voiceover-local.py` | TTS generation + whisper splitting script |
| `telecine/services/web/public/audio/hero/` | Generated WAV/MP3 files, `timing.json` |

Architecture

```
FitScale
  Timegroup mode="contain" (960x540, relative)
    Audio src={VOICEOVER_SRC}           ← single continuous track
    Timegroup mode="sequence" overlapMs=495
      SceneTitle      (3333ms)
      SceneAuthor     (6333ms)
      SceneLayers     (7067ms)
      SceneTimeline   (9767ms)
      SceneEditor     (8200ms)
      SceneTemplate   (5500ms)
      SceneStream     (5700ms)
      SceneRender     (8833ms)
    SceneCaptions groups={CAPTIONS}     ← absolute overlay, global timestamps
```
  • Overlap is 495ms (15 frames at 30fps). Each scene fades in/out over this window.
  • Total duration with overlaps: ~51.3s.
  • DUR values = voice duration + buffer. Buffer gives the scene visual breathing room beyond the VO.
  • All animations are CSS-only, driven by `animationDelay` relative to each scene's local timeline. Set fill-mode on non-ef-text elements as needed (see css-animations skill); ef-text handles it automatically.
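The duration arithmetic above can be sanity-checked in a few lines; the scene durations and overlap value are copied from this doc.

```python
# Eight scene durations (ms) and the crossfade overlap, from the tree above.
DUR_MS = [3333, 6333, 7067, 9767, 8200, 5500, 5700, 8833]
OVERLAP_MS = 495

# Seven overlaps are shared between eight consecutive scenes.
total_ms = sum(DUR_MS) - OVERLAP_MS * (len(DUR_MS) - 1)
print(total_ms / 1000)  # ~51.3 seconds, matching the total stated above
```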

Voiceover Script

| # | Scene | Text |
| --- | --- | --- |
| 1 | title | Video is a web page that moves. |
| 2 | author | It starts with HTML and CSS. When you need more, it's just React. |
| 3 | layers | Stack layers the way you stack divs. Video, text, shapes, 3D, mix everything. |
| 4 | timeline | Need an editor? Snap together GUI primitives. Timeline, waveforms, captions, into any editing experience you want. |
| 5 | editor | A full NLE. A simple trim tool in a form. It's your UI. These are just the building blocks. |
| 6 | template | Feed in data, and one template becomes ten thousand unique videos. |
| 7 | stream | Preview is instant. Change the code, see the frame. |
| 8 | render | When it's ready, render to the cloud, the browser, or the command line. Same composition, every target. |
Messaging: Editframe is a declarative layer. HTML/CSS is the foundation; React and JS are power tools on top. Not "just React" and not "just HTML."
Text rules: No em-dashes (they cause TTS pauses). Keep sentences short and punchy.
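The no-em-dash rule above is easy to enforce mechanically. A hypothetical sketch; the function name is ours, not part of the generation script:

```python
# Hypothetical lint for the VO text rules above: em-dashes cause unwanted
# TTS pauses, so flag any line containing one.
def check_vo_line(line: str) -> list[str]:
    problems = []
    if "\u2014" in line:  # em-dash character
        problems.append("em-dash causes a TTS pause; use a comma or period")
    return problems
```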

Voiceover Regeneration Workflow

Follow these steps exactly when the script text changes.

Step 1: Update the generation script

Edit `SEGMENTS` in `telecine/services/web/scripts/generate-hero-voiceover-local.py`. Each segment has:
  • `key`: filename prefix (e.g. `01-title`)
  • `text`: the voiceover line
  • `last_word`: final word stem for whisper-based splitting
  • `discard`: true for the preamble segment only

The first segment MUST be a preamble (`"discard": True`) with throwaway text like `"Here is the introduction."`; this absorbs TTS model warmup noise. DO NOT remove it.
Segments are joined with `" ... "` pause separators into one TTS pass.
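A hypothetical sketch of the `SEGMENTS` shape described above; the real entries live in `generate-hero-voiceover-local.py`, and the field values here are only illustrative.

```python
# Sketch of SEGMENTS entries, following the fields listed above.
SEGMENTS = [
    # Required preamble: absorbs TTS model warmup noise, discarded after splitting.
    {"key": "00-preamble", "text": "Here is the introduction.",
     "last_word": "introduction", "discard": True},
    {"key": "01-title", "text": "Video is a web page that moves.",
     "last_word": "moves", "discard": False},
]

# Segments join with " ... " pause separators into a single TTS pass.
tts_input = " ... ".join(seg["text"] for seg in SEGMENTS)
```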

Step 2: Generate audio locally

```bash
cd telecine/services/web
python scripts/generate-hero-voiceover-local.py
```

This requires:
  • `qwen_tts` package with the VoiceDesign model (`Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign`)
  • `whisper`, `soundfile`, `numpy`, `torch`

DO NOT use the CustomVoice model (`generate_custom_voice`). It produces garbled output. Only use `generate_voice_design` with `language="English"`.

Output lands in `telecine/services/web/public/audio/hero/`:
  • `voiceover.wav` / `.mp3`: full audio including preamble
  • `01-title.wav` through `08-render.wav` / `.mp3`: per-scene splits (reference only)
  • `timing.json`: per-scene durations in seconds

Step 3: Trim the preamble

The preamble must be removed. Find the trim point from the script's console output (it prints the preamble end time), then:
```bash
ffmpeg -y -i public/audio/hero/voiceover.wav -ss <PREAMBLE_END_SEC> public/audio/hero/voiceover-trimmed.wav
ffmpeg -y -i public/audio/hero/voiceover-trimmed.wav -codec:a libmp3lame -b:a 128k public/audio/hero/voiceover-trimmed.mp3
```

Step 4: Get global word timestamps

Run whisper on the trimmed file to get global timestamps (all relative to t=0 of the trimmed audio):
```python
import whisper, json
model = whisper.load_model("small")
result = model.transcribe("public/audio/hero/voiceover-trimmed.wav", language="en", word_timestamps=True)
words = []
for seg in result["segments"]:
    for w in seg.get("words", []):
        words.append({"word": w["word"].strip(), "start": round(w["start"] * 1000), "end": round(w["end"] * 1000)})
with open("/tmp/hero-whisper/global.json", "w") as f:
    json.dump(words, f, indent=2)
```

Step 5: Upload to GCS

Files go to `gs://editframe-assets-7ac794b/hero/` (NOT `editframe-assets`). The CDN serves at `https://assets.editframe.com/hero/`.

Use a content hash in the filename; the CDN cache won't let you overwrite the same URL:

```bash
HASH=$(md5 -q public/audio/hero/voiceover-trimmed.mp3 | head -c 8)
gsutil -h "Content-Type:audio/mpeg" -h "Cache-Control:public, max-age=31536000" \
  cp public/audio/hero/voiceover-trimmed.mp3 "gs://editframe-assets-7ac794b/hero/voiceover-${HASH}.mp3"
```

Step 6: Update HeroDemo.tsx

Three things to update:
  1. `VOICEOVER_SRC`: set to the new CDN URL with the content hash.
  2. `CAPTIONS` array: rebuild from the global whisper timestamps. Group words into caption groups that match natural sentence/phrase boundaries. Each group needs:
     • `showMs`: timestamp of the first word in the group (or slightly before)
     • `hideMs`: timestamp of the last word's end + ~400ms buffer
     • `words`: array of `{ w, s, e }` from whisper output
     Caption groups should break at sentence boundaries and match the scene they belong to. Multiple groups per scene is normal. The timestamps are GLOBAL (from the single voiceover track), not scene-relative.
  3. `DUR` constants: if voice durations changed significantly, adjust scene durations. Formula: `DUR.scene = voiceDurationMs + buffer`, where buffer is typically 800-1200ms. Round to 30fps alignment (multiples of 33.33ms). Check `timing.json` for per-scene voice durations.
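The CAPTIONS rebuild and 30fps rounding described above can be sketched as small helpers. The group field names (`showMs`/`hideMs`/`words` with `w`/`s`/`e`) follow this doc; the sentence-boundary heuristic and both function names are assumptions, not existing code.

```python
FRAME_MS = 1000 / 30  # one frame at 30fps, ~33.33ms

def align_to_30fps(ms: float) -> int:
    """Round a duration to the nearest whole frame at 30fps."""
    return round(round(ms / FRAME_MS) * FRAME_MS)

def build_caption_groups(words, buffer_ms=400):
    """Group global whisper words into caption groups at sentence boundaries."""
    groups, current = [], []
    for w in words:
        current.append({"w": w["word"], "s": w["start"], "e": w["end"]})
        if w["word"].endswith((".", "?", "!")):  # crude sentence boundary
            groups.append({
                "showMs": current[0]["s"],
                "hideMs": current[-1]["e"] + buffer_ms,  # ~400ms tail buffer
                "words": current,
            })
            current = []
    if current:  # trailing words with no closing punctuation
        groups.append({"showMs": current[0]["s"],
                       "hideMs": current[-1]["e"] + buffer_ms,
                       "words": current})
    return groups
```

Timestamps stay global throughout, matching the single voiceover track.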

Step 7: Verify

Play the composition in the browser. Check:
  • Caption words highlight in sync with audio
  • Scene transitions don't cut off voiceover mid-sentence
  • No dead air gaps between scenes (overlaps should feel smooth)

Scene Editing

Each scene is a function (`SceneTitle`, `SceneAuthor`, etc.) returning a `Timegroup mode="fixed"` with an explicit `duration={DUR.xxx}ms`.

Common patterns:
  • `sceneStyle(d)` applies crossfade in/out via CSS animation
  • Visual animations use `animationDelay` for stagger/sequencing; ef-text auto-defaults fill-mode, plain elements need explicit fill-mode
  • `CompositionCanvas` wraps R3F Three.js content (particles in SceneStream)
  • `<Text split="char">` for per-character stagger animations

When adding or editing animations, define new keyframes in `landing.css` with the `hero-` prefix.

Changelog MDX Authoring

Changelog entries live in `telecine/services/web/app/content/changelogs/{version}.mdx`. Each entry uses the same component set as HeroDemo but through MDX props.

Component API

`ChangelogIntroCard`: required props are `version`, `codename`, `title`. The `title` prop is NOT optional even though TypeScript won't enforce it at MDX authoring time. A missing `title` causes `title.trim()` to throw at runtime, breaking hydration and leaving the page as a blank placeholder. The `illustration` prop is a no-op stub kept for MDX compat.

```mdx
<ChangelogIntroCard
  version="0.47.0"
  codename="Mocha Relay"
  title="Short title for the card display"
  durationMs={4000}
/>
```

`ReleaseVideo`: wraps children in its own `<Timegroup mode="contain">`. Pass scene elements as direct children via an inner `<Timegroup mode="sequence">`:

```mdx
<ReleaseVideo aspect="16/9">
  <Timegroup mode="sequence" overlapMs={600} style={{ width: 1920, height: 1080, position: "relative" }}>
    <ChangelogIntroCard ... />
    <TextMoment ... />
    <ChangelogOutroCard ... />
  </Timegroup>
</ReleaseVideo>
```

`ChangelogOutroCard`: `version` and `tagline` required. `durationMs` defaults exist.

`TextMoment`: `headline` and `body` required; `motif`, `accentColor`, `durationMs` optional.

Pitfall: blank placeholder on page load

If `ReleaseVideo` renders as a blank gray box and never hydrates into a player, the cause is almost always a runtime error in one of the scene components. `ReleaseVideo` uses `isClient` gating, so SSR always shows the placeholder; errors in scene components silently kill client hydration. Check the browser console for a `TypeError` pointing at the component that throws.

Pitfalls

  • TTS CustomVoice model is broken: only use VoiceDesign (`generate_voice_design`). CustomVoice produces unintelligible audio.
  • Preamble is required: the first 1-2s of TTS generation can be noisy. The throwaway preamble absorbs this. Always trim it off before uploading.
  • GCS filename reuse: the CDN caches aggressively. Always use content-hashed filenames when uploading new audio.
  • Caption timestamps are global: they come from whisper on the single trimmed voiceover, not per-scene. Do not make them scene-relative.
  • CompositionCanvas rendering: uses `flushSync` + `useLayoutEffect` to work in the synchronous export pipeline. Do not add manual `gl.render()` / `gl.finish()` calls; that doubles GPU work.
  • Em-dashes in VO text: cause unwanted pauses in TTS output. Use commas or periods instead.