faceless-explainer

faceless-explainer - dispatch entry

Input is arbitrary text (article / notes / topic / brief). Output is a faceless explainer video: no captured website, no product screenshots. Every visual is invented by the LLM (typography / abstract graphics / diagram / data-viz), chosen per scene by content. The style preset is auto-selected per input by the scriptwriting agent (Step 2) from the 5 shipped presets (`block-frame` / `capsule` / `claude` / `pin-and-paper` / `scatterbrain`; default `pin-and-paper` when nothing clearly fits).
Confirm the route before Step 0. This skill explains a topic / concept with no product and no site to capture. If the text actually markets a product / names its site → `/product-launch-video`; there's a URL to turn into a video → `/website-to-video`; a GitHub PR → `/pr-to-video`; existing footage to caption / package → `/embedded-captions` · `/graphic-overlays`. Out of scope: timing visuals to a user-supplied / pre-recorded voiceover (faceless generates its own TTS; use `/general-video` instead), or live / at-render-time data. Unsure product-vs-topic, or routed here on a vague request? Read `/hyperframes-read-first` first.
All artifacts go to `PROJECT_DIR = videos/<project-name>/` (created in Step 0); all paths below are relative to it. Dispatch is harness-portable: before the first subagent dispatch, read `<SKILL_DIR>/../hyperframes-core/references/subagent-dispatch.md` once. It maps the dispatch verbs (parallel fan-out / background / wait) to your harness's primitives; a concurrency cap below N means waves of the cap size, never fewer workers. This file is a binding runbook, not background reading: execute the steps in order and produce every phase artifact with its designated script or agent role. Do not substitute a freestyle pipeline, and do not skip a pause step because the request seems clear. A step you cannot perform → stop and report.
| Phase | Execution | Primary artifact | Detailed flow |
| --- | --- | --- | --- |
| init | Bash | `hyperframes.json` | Step 0 |
| scaffold | Bash (no agent) | `capture/extracted/tokens.json` + `visible-text.txt` | Step 1 |
| scriptwriting | subagent | `narrator_scripts.json` (incl. chosen `stylePreset` + `orientation`) | Step 2 / `agents/scriptwriting.md` |
| design-system | Bash (no agent, deterministic; style = `narrator_scripts.stylePreset`) | `design-system/design.html` + `chunks/` | Step 2b |
| audio | `audio.mjs` in Bash | `audio_meta.json` | `phases/audio/guide.md` |
| visual-design | subagent | `section_plan.md` | `agents/visual-design.md` |
| prep | `prep.mjs` in Bash | `group_spec.json` | `scripts/prep.mjs` |
| captions (deterministic) | `captions.mjs group` -> `captions.mjs html` in Bash (no subagent) | `caption_groups.json` + `compositions/captions.html` | `scripts/captions.mjs` |
| scenes | N x subagent (parallel) | `compositions/scene_*.html` or `compositions/group_w*.html` | `agents/hyperframes-scene.md` |
| finalize (Phase 4c) | Bash prelude (wait-bgm + assemble + inject/verify-transitions + hoist-videos + sfx-verify + preflight) -> finalize subagent (fix brief findings in place + one lean contact-sheet look + render) | `renders/video.mp4` | Step 7 / `agents/hyperframes-finalize.md` |

Prerequisites

macOS Apple Silicon or Linux x64. System tools: `brew install python@3.11 node ffmpeg` (use Homebrew Python, not `/usr/bin/python3`, or `pip install` is blocked by PEP 668); then `npx hyperframes doctor` once (downloads Chrome). The rendered overlap gate (`scripts/check-overlap.mjs`, run in worker self-checks and preflight) reuses that same cached Chrome and never downloads a browser; its only dep is the `puppeteer-core` npm module, ensured once before scene fan-out (Step 5.5, `--ensure-deps`, ~5s, no full `puppeteer` install). Optional cloud keys (else local fallbacks), injected in Step 0.5:

| Key | Used for | Default / fallback |
| --- | --- | --- |
| `HEYGEN_API_KEY` (or `hyperframes auth login`) | TTS (cloud, word-level timestamps) | voice: auto (first English starfish voice; override `--voice`) |
| `ELEVENLABS_API_KEY` | TTS (cloud; needs `pip install elevenlabs`) | voice `21m00Tcm4TlvDq8ikWAM` (Rachel) |
| neither, and not logged in | TTS | local Kokoro, voice `am_michael` (non-English: pass `--voice`) |
| `GEMINI_API_KEY` / `GOOGLE_API_KEY` (aliases) | Lyria BGM | unset -> local MusicGen (first run downloads ~300 MB) |

Flow

Step 0.0 - Confirm the brief (ALWAYS ask one round, then build)

Before Step 0, always pause and ask the brief in one message, then wait for the user. Never skip this, even for a request that looks complete. Lead with a recommended default for each field and pre-fill anything the user already gave (confirm it rather than re-asking blindly): the topic / angle (the one idea), length (default ~60-90s), and, if `/hyperframes-read-first` did not already set them, aspect (default 16:9; 9:16 for vertical) and language. Style is not asked here: the scriptwriting agent auto-picks the preset from the input in Step 2. Proceed to Step 0 only after the user replies; a "go" / "use the defaults" is a valid reply that accepts every default.

Step 0 - Initialize the video project

cwd is the agent workspace root (e.g. `/tmp/explainer-video-...`). Write all video artifacts under `PROJECT_DIR = videos/<project-name>/`.
`<project-name>`: use the directory the user gave (e.g. `Use ./videos/refactoring-explainer`), else a short kebab-case name derived from the input topic (`<topic>-explainer` / `<topic>-howto`). Not the workspace basename or a timestamp.
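The naming rule above can be sketched as a small shell helper. This is only an illustration: `project_slug` is a hypothetical name, not something the skill ships, and it does not shorten long topics, so trimming to a "short" name is still up to the caller.

```shell
# Hypothetical helper (not part of the skill): derive a kebab-case project name.
# $1 = topic text, $2 = suffix ("explainer" or "howto"; defaults to "explainer")
project_slug() (
  suffix="${2:-explainer}"
  # Lowercase, collapse every run of non-alphanumerics into one dash, trim edge dashes.
  slug=$(printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//')
  printf '%s-%s\n' "$slug" "$suffix"
)

project_slug "How DNS Works"                  # how-dns-works-explainer
project_slug "Brew Pour-Over Coffee" howto    # brew-pour-over-coffee-howto
```

The result would then be used as `videos/<project-name>`, never the workspace basename or a timestamp.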
Only when `$PROJECT_DIR/hyperframes.json` is absent:
bash
PROJECT_DIR="${LAUNCH_VIDEO_DIR:-videos/<project-name>}"
mkdir -p "$(dirname "$PROJECT_DIR")"
npx hyperframes init "$PROJECT_DIR" --non-interactive --skip-skills --example=blank
`hyperframes init` drops a generic `AGENTS.md` / `CLAUDE.md` into `$PROJECT_DIR`; leave them in place, since they are agent scaffolding for whoever opens the finished project later. This skill (not those files) is the source of truth for the workflow, so do not treat their generic guidance as run-time constraints.
Constraints: never run `hyperframes init` / generate `AGENTS.md` / `CLAUDE.md` in the workspace root; never nest another `hyperframes/` inside `PROJECT_DIR`; every Bash command (master + subagents) is a `(cd "$PROJECT_DIR" && ...)` subshell, never a bare `cd`.
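To illustrate why the subshell form matters, here is a minimal sketch. `in_project` is a hypothetical wrapper introduced only for this example, and the demo uses a throwaway directory in place of a real `videos/<project-name>/`.

```shell
# Hypothetical wrapper: run one command inside PROJECT_DIR.
# The ( ... ) body is a subshell, so the cd never leaks into the caller.
in_project() (
  cd "$PROJECT_DIR" && "$@"
)

# Demo against a throwaway directory standing in for videos/<project-name>/.
PROJECT_DIR=$(mktemp -d)
before=$(pwd)
in_project mkdir -p capture/extracted     # created inside PROJECT_DIR
in_project ls                             # lists: capture
[ "$(pwd)" = "$before" ] && echo "caller cwd unchanged"
```

A bare `cd` would silently change the directory for every later command, which is how paths meant to be relative to `PROJECT_DIR` end up resolving somewhere else.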

Step 0.5 - API key guidance

Skip if `$PROJECT_DIR/.env` exists or `context.log` is non-empty (= not the first run). Otherwise first detect what's available (HeyGen TTS on if `$HEYGEN_API_KEY` / `$HYPERFRAMES_API_KEY` set or `~/.heygen/credentials` exists from `hyperframes auth login`; ElevenLabs / Gemini only if their env keys set), then always pause and offer the menu. Wait for the user; do not proceed on your own even when a workable config is detected (the user may want to add a key like Gemini). State what's detected, then: paste keys (→ Write `$PROJECT_DIR/.env`, one `KEY=value` per line, overwrite same-name) / "go" (proceed with what's configured: env, `.env`, or `hyperframes auth login`) / "skip" (proceed with local fallbacks for anything unconfigured). Then proceed to Step 1.
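The "one `KEY=value` per line, overwrite same-name" rule could be implemented along these lines. `upsert_env` is a hypothetical helper, not a script from this skill, and the key values below are placeholders.

```shell
# Hypothetical helper: set KEY=value in an env file, replacing any earlier
# line for the same key. $1 = env file, $2 = key, $3 = value
upsert_env() (
  file=$1 key=$2 value=$3
  tmp="$file.tmp.$$"
  {
    # Keep every existing line that does not define this key...
    if [ -f "$file" ]; then grep -v "^${key}=" "$file" || true; fi
    # ...then append the new definition.
    printf '%s=%s\n' "$key" "$value"
  } > "$tmp"
  mv "$tmp" "$file"
)

demo_dir=$(mktemp -d)
upsert_env "$demo_dir/.env" GEMINI_API_KEY placeholder-1
upsert_env "$demo_dir/.env" GEMINI_API_KEY placeholder-2   # overwrites the earlier line
cat "$demo_dir/.env"                                       # GEMINI_API_KEY=placeholder-2
```

Writing to a temp file and renaming keeps a half-written `.env` from being left behind if the write is interrupted.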

Step 1 - Scaffold (Bash, NO agent, NO capture)

There is no website capture. Synthesize the minimal on-disk package the copied backend (`build-design --capture`, `prep --capture`) expects, directly from the user's text. `capture/` holds synthetic tokens + the input text (NOT a scrape); `capture/assets/` stays empty (faceless). With `colors:[]`, build-design uses the pin-and-paper native palette; if the user supplied brand colors, fill `colors[]` (`colors[0]` becomes the brand primary).
bash
(cd "$PROJECT_DIR" && mkdir -p capture/extracted capture/assets)
(cd "$PROJECT_DIR" && cat > capture/extracted/tokens.json <<'JSON'
{ "title": "<title>", "description": "<one-line>", "colors": [], "fonts": [], "headings": [], "sections": [], "ctas": [], "svgs": [], "cssVariables": {} }
JSON
)
(cd "$PROJECT_DIR" && printf '%s\n' "<full input text / article / notes / brief>" > capture/extracted/visible-text.txt)
Validation:
bash
[ -s "$PROJECT_DIR/capture/extracted/tokens.json" ] && \
[ -s "$PROJECT_DIR/capture/extracted/visible-text.txt" ] && \
[ -d "$PROJECT_DIR/capture/assets" ] && echo ok || echo missing
If any is missing, report and stop.
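Since the step says to report what is missing, a slightly more talkative variant of the validation may help. The sketch below is an assumption about how one might write it; `check_scaffold` is a hypothetical name, and it checks the same three paths as the one-liner above.

```shell
# Hypothetical helper: name each Step 1 artifact that is missing or empty.
# $1 = project dir. Prints one line per problem; non-zero status if any is found.
check_scaffold() (
  dir=$1 status=0
  for f in capture/extracted/tokens.json capture/extracted/visible-text.txt; do
    [ -s "$dir/$f" ] || { echo "missing or empty: $f"; status=1; }
  done
  [ -d "$dir/capture/assets" ] || { echo "missing directory: capture/assets"; status=1; }
  exit "$status"
)

# Usage: check_scaffold "$PROJECT_DIR" || echo "scaffold incomplete: report and stop"
```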

Step 2 - Scriptwriting (subagent — also picks the style preset)

Dispatch one subagent. prompt = full contents of `agents/scriptwriting.md` + the `## Dispatch context` below, passed through verbatim:
SKILL_DIR: <absolute path>
PROJECT_DIR: <video project root>
Schema validator: <SKILL_DIR>/scripts/validate-narrator.mjs
Input text: ./capture/extracted/visible-text.txt   # The source article / notes / brief — the agent reads this first
Style preset: pick one from the menu in the guide and emit it as the top-level `stylePreset` (default `pin-and-paper` when unsure); match the narration register to the chosen preset
Orientation: <landscape | portrait | square>   # From the Step 0.0 aspect (16:9→landscape, 9:16→portrait, 1:1→square; default landscape). Emit it VERBATIM as the top-level `orientation` field — this is dictated, not a creative choice; it sets the canvas (portrait→1080×1920) for the whole pipeline.
Script style: Keep each scene's script concise — 1-2 sentences, no more than 20 words
Fill the `Orientation:` line from the aspect confirmed in Step 0.0 (default `landscape`). prep reads `narrator_scripts.orientation` → stamps `group_spec.width/height`; without it the video stays 16:9.
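The aspect-to-orientation mapping and the resulting canvas sizes can be written down as two lookups. The function names are hypothetical; the values come from this runbook (16:9 → landscape 1920×1080, 9:16 → portrait 1080×1920, 1:1 → square 1080×1080).

```shell
# Aspect -> orientation, per the Step 0.0 mapping (anything else falls back to landscape).
orientation_for_aspect() {
  case "$1" in
    9:16) echo portrait ;;
    1:1)  echo square ;;
    *)    echo landscape ;;
  esac
}

# Orientation -> canvas size as WIDTHxHEIGHT.
canvas_for_orientation() {
  case "$1" in
    portrait) echo 1080x1920 ;;
    square)   echo 1080x1080 ;;
    *)        echo 1920x1080 ;;
  esac
}

canvas_for_orientation "$(orientation_for_aspect 9:16)"   # 1080x1920
```

In the real pipeline the canvas is stamped by prep from `narrator_scripts.orientation`; this sketch only shows the mapping the dispatcher has to get right when filling the `Orientation:` line.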
The agent picks an explainer structure for `narrativeArchetype` (`concept-explainer` / `how-to-process` / `listicle` / `story-explainer`, or `"<outer> with <inner>"`), picks a top-level `stylePreset` from the 5 shipped presets (consumed by Step 2b), echoes the dispatched `orientation` as a top-level field (consumed by Step 5 prep → canvas size), and emits `narrator_scripts.json` (it runs the validator before returning).
`continuity` drives worker grouping: `continue` = same worker as the previous scene (a run of up to 3 scenes, cap=3); `break` = new worker; scene 1 is always `break`. `intent` / `sharedMotif` are soft hints. `assetCandidates` is `[]` on essentially every scene (faceless).
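As a rough illustration of how `continuity` values turn into workers, the sketch below prints the number of scenes each worker would receive. It is not the logic in `prep.mjs`; in particular, it assumes that a `continue` which would push a run past the cap of 3 simply starts a new worker, which the text above does not spell out.

```shell
# Hypothetical sketch: scene count per worker for a sequence of continuity values.
# Assumption: a "continue" that would exceed cap=3 starts a new worker.
group_sizes() (
  cap=3 size=0 out=""
  for c in "$@"; do
    if [ "$size" -eq 0 ] || [ "$c" = "break" ] || [ "$size" -ge "$cap" ]; then
      # Close the current worker (if any) and open a new one with this scene.
      if [ "$size" -gt 0 ]; then out="$out $size"; fi
      size=1
    else
      size=$((size + 1))
    fi
  done
  if [ "$size" -gt 0 ]; then out="$out $size"; fi
  echo $out
)

group_sizes break continue continue continue break   # 3 1 1
```

Here five scenes end up on three workers: scenes 1-3 share one, scene 4 spills into its own because of the cap, and scene 5 breaks into a third.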

Step 2b - Design system (Bash, NO agent, deterministic — style chosen by Step 2)

Read the agent's `stylePreset` from `narrator_scripts.json` (default `pin-and-paper` if absent), then run three deterministic commands to produce a fully-styled `design.html` + chunks against the synthetic input:
bash
STYLE=$(cd "$PROJECT_DIR" && node -e 'try{const p=require("./narrator_scripts.json").stylePreset;process.stdout.write((p&&String(p).trim())||"pin-and-paper")}catch{process.stdout.write("pin-and-paper")}')
(cd "$PROJECT_DIR" && node <SKILL_DIR>/phases/design-system/scripts/build-design.mjs ./design-system --no-emit --style "$STYLE")
(cd "$PROJECT_DIR" && node <SKILL_DIR>/phases/design-system/scripts/build-design.mjs ./design-system --style "$STYLE")
(cd "$PROJECT_DIR" && node <SKILL_DIR>/phases/design-system/scripts/emit-chunks.mjs ./design-system)
`stylePreset` must be one of the 5 shipped presets (`block-frame` / `capsule` / `claude` / `pin-and-paper` / `scatterbrain`); an unknown name makes `build-design.mjs` exit 1. In that case, fall back to `pin-and-paper` and rerun. This step depends only on `narrator_scripts.json`, so it may run in parallel with Step 3 audio; both must finish before Step 4 visual-design.
Validation:
bash
[ -s "$PROJECT_DIR/design-system/inference.json" ] && \
[ -s "$PROJECT_DIR/design-system/design.html" ] && \
[ -s "$PROJECT_DIR/design-system/chunks/index.json" ] && echo ok || echo missing
If any is missing, read the build-design / emit-chunks stderr, fix the invocation, and rerun (deterministic, finishes in seconds).
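One way to avoid the exit-1-then-rerun round trip is to normalize the preset name before calling `build-design.mjs`. The helper below is a hypothetical pre-check, not part of the skill; the preset list is the one given in this step.

```shell
# Hypothetical pre-check: echo the preset if it is one of the 5 shipped presets,
# otherwise echo the documented fallback, pin-and-paper.
resolve_preset() {
  case "$1" in
    block-frame|capsule|claude|pin-and-paper|scatterbrain) echo "$1" ;;
    *) echo pin-and-paper ;;
  esac
}

resolve_preset capsule      # capsule
resolve_preset neon-glow    # pin-and-paper ("neon-glow" is a made-up, unknown name)
```

With this in place, `STYLE=$(resolve_preset "$STYLE")` before the three commands would keep an unknown name from ever reaching the build.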

Step 3 - Audio

After `narrator_scripts.json` exists:
bash
(cd "$PROJECT_DIR" && node <SKILL_DIR>/scripts/audio.mjs \
  --narrator-scripts ./narrator_scripts.json \
  --hyperframes . \
  --out ./audio_meta.json \
  --lyria-recipe <SKILL_DIR>/phases/audio/lyria-recipe.py)
BGM generation runs detached in the background when keys/deps allow, otherwise is silently skipped. Flags + BGM mechanics: top of `audio.mjs`.
  • exit 0 -> voice + transcribe complete (BGM may still be rendering; `audio_meta.json` records `bgm_log` / `bgm_pid`), continue.
  • exit 1 -> zero scenes produced voice; report and stop.
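Because BGM may still be rendering after exit 0, it can be handy to check whether the detached process is alive. The sketch below assumes `bgm_pid` is an ordinary process id owned by the current user; `bgm_running` is a hypothetical helper, and reading the pid out of `audio_meta.json` is left to the caller.

```shell
# Hypothetical helper: succeed while the detached BGM process is still alive.
# $1 = the bgm_pid value recorded in audio_meta.json (assumed to be a plain PID).
bgm_running() {
  # kill -0 sends no signal; it only tests that the process exists and is signalable.
  [ -n "$1" ] && kill -0 "$1" 2>/dev/null
}

# Demo with this shell's own PID, which is certainly alive.
if bgm_running "$$"; then echo "still rendering"; else echo "finished or never started"; fi
```

This is only a liveness probe; whether the track rendered successfully still has to be read from `bgm_log`.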

Step 4 - Visual design

After `design-system/chunks/index.json`, `narrator_scripts.json`, and `audio_meta.json` exist, concatenate all inputs into one dispatch packet (contracts first, static references middle, work items last):
bash
# Dispatch packets live in $PROJECT_DIR/.dispatch/ (transient; safe to delete after the run).
# NEVER use a fixed /tmp path: it persists across runs/projects, so a failed write silently
# reuses another project's stale packet and contaminates every worker.
mkdir -p "$PROJECT_DIR/.dispatch"
DP="$PROJECT_DIR/.dispatch/vd-dispatch.txt"
{
  echo "## Design chunks"
  (cd "$PROJECT_DIR" && cat design-system/chunks/index.json \
     design-system/chunks/composition-hints.md design-system/chunks/voice.md \
     design-system/chunks/tokens.css design-system/chunks/easings.js 2>/dev/null)
  echo "## Effects catalog"; cat <SKILL_DIR>/phases/visual-design/effects-catalog.md
  echo "## Design rules"; cat <SKILL_DIR>/phases/visual-design/rules/{typography,color-system,composition,motion-language}.md
  echo "## SFX library"; cat <SKILL_DIR>/assets/sfx/manifest.json
  echo "## Narrator scripts"; (cd "$PROJECT_DIR" && cat narrator_scripts.json)
  echo "## Audio meta"; (cd "$PROJECT_DIR" && cat audio_meta.json 2>/dev/null)   # Optional; overrides Duration if drift >10%
} > "$DP"
# Guard: a partially-failed build must fail LOUDLY here, not downstream in the subagent
grep -q '^## Narrator scripts' "$DP" || { echo "FATAL: vd-dispatch.txt incomplete; rebuild before dispatching"; }
# Captions planning hint (put it in the Captions: line of the dispatch below)
(cd "$PROJECT_DIR" && node -e 'try{const m=require("./audio_meta.json");process.stdout.write(Object.values(m.scenes||{}).some(s=>s.wordsPath)?"enabled":"disabled")}catch{process.stdout.write("enabled")}')
Then dispatch the visual-design subagent. prompt = full contents of `agents/visual-design.md` + the `## Dispatch context` below, verbatim:
SKILL_DIR: <absolute path>
PROJECT_DIR: <video project root>
Schema validator: <SKILL_DIR>/scripts/validate-section.mjs
Canvas: <width>×<height>   # default 1920×1080 (16:9 landscape); 1080×1920 (9:16 portrait) or 1080×1080 (1:1 square) if requested upstream (narrator_scripts.orientation/dimensions). Plan layouts for THIS aspect ratio; see composition.md "Portrait & square".
Captions: <enabled | disabled>   # Planning hint from the node -e above: enabled => leave the bottom ~17% of canvas height as caption territory in prose
Dispatch packet: <PROJECT_DIR>/.dispatch/vd-dispatch.txt   # Step 0 reads it once for all inputs
Visuals: faceless: every scene is typography / abstract graphics / diagram / data-viz invented from the script. assetCandidates is [] for most or all scenes; plan visuals from text, not from captured assets.

Output is `section_plan.md`. `type-roles.md` and component HTML bodies are not in the packet (worker responsibilities). The `Captions:` line is an optimistic hint; the authoritative gate is `group_spec.captions_enabled` from Step 5.
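The packet guard in this step looks for a single header. A stricter variant, sketched here under the assumption that the packet uses exactly the six `## ` section headers written by the builder in this step, would name every section that is absent. `check_packet` is a hypothetical helper.

```shell
# Hypothetical stricter guard: require every section header the packet builder writes.
# $1 = packet file. Prints each missing header; non-zero status if any is absent.
check_packet() (
  status=0
  for h in "Design chunks" "Effects catalog" "Design rules" \
           "SFX library" "Narrator scripts" "Audio meta"; do
    grep -q "^## $h\$" "$1" 2>/dev/null \
      || { echo "FATAL: missing section: ## $h"; status=1; }
  done
  exit "$status"
)

# Usage: check_packet "$PROJECT_DIR/.dispatch/vd-dispatch.txt" || echo "rebuild before dispatching"
```

Like the original guard, this only proves the packet was freshly and fully written; it does not check that each section has usable content.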

Step 5 - prep (deterministic script, NO subagent)

After `section_plan.md` exists:
bash
(cd "$PROJECT_DIR" && node <SKILL_DIR>/scripts/prep.mjs \
  --section-plan ./section_plan.md \
  --narrator-scripts ./narrator_scripts.json \
  $( [ -f audio_meta.json ] && echo "--audio-meta ./audio_meta.json" ) \
  --rules-dir <SKILL_DIR>/../hyperframes-animation/rules \
  --capture ./capture \
  --design-system ./design-system \
  --hyperframes . \
  --sfx-lib <SKILL_DIR>/assets/sfx \
  --out ./group_spec.json)
Merges all upstream artifacts into `group_spec.json` (parse `section_plan` anchors, validate effect/component ids, group by `Continuity` with cap=3, build `visual_clips[]` where a multi-scene continue worker becomes one `group_wN.html`, compute Tier-B `transitions[]` between different visual clips, copy assets/fonts/SFX). `capture/assets/` is empty, so asset-copy is a no-op (faceless). Internal logic: header of `prep.mjs`.
  • exit 0 -> read stdout (scenes / groups / total duration / per-group) and append to `context.log`.
  • exit 1 -> stderr names the failing scene + anchor (usually a malformed anchor or unknown effect/transition id); return to Step 4 and re-dispatch visual-design.

Step 5.5 + Step 6 - Captions (deterministic) + scene worker fan-out

Captions: two deterministic scripts (no subagent), after prep exits 0 and before fan-out:
bash
(cd "$PROJECT_DIR" && node <SKILL_DIR>/scripts/captions.mjs group \
  --group-spec ./group_spec.json --hyperframes . \
  --tokens design-system/chunks/tokens.css --out ./caption_groups.json)

(cd "$PROJECT_DIR" && node <SKILL_DIR>/scripts/captions.mjs html \
  --hyperframes . --groups ./caption_groups.json \
  --tokens design-system/chunks/tokens.css \
  --inference design-system/inference.json \
  --out compositions/captions.html)
exit 0 = normal. If either prints `captions: skipped (<reason>)`, skip the whole chain: no `captions.html`, assemble won't mount track 12. Skin selection / self-check: top of `captions.mjs html`; for offline, pass `--skin-file`. Do not run `npx hyperframes lint` on `captions.html`.
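Detecting the skip message could look like the sketch below, which assumes the script's output has been captured into a variable. `captions_skipped` is a hypothetical helper, and the reason text in the demo is made up.

```shell
# Hypothetical helper: succeed when captured captions.mjs output reports a skip.
# $1 = captured output of `captions.mjs group` or `captions.mjs html`
captions_skipped() {
  printf '%s\n' "$1" | grep -q 'captions: skipped ('
}

out="captions: skipped (no word timings)"   # made-up reason, for illustration only
if captions_skipped "$out"; then
  echo "skip the caption chain"
fi
```

When the check succeeds, the remaining caption command is not run and no `compositions/captions.html` is expected downstream.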
Then ensure the overlap-gate dep once, from the workspace root (NOT inside `PROJECT_DIR`: the module must land in the workspace `node_modules/` where every worker and preflight can resolve it):
bash
node <SKILL_DIR>/scripts/check-overlap.mjs --ensure-deps
# Installs puppeteer-core (module only, no browser download) if not already resolvable; Chrome is
# reused from the hyperframes browser cache. Workers must NOT install it themselves (parallel npm race).

Then read `group_spec.json.groups[]` for worker count N. Each worker's self-check runs two scoped machine gates before returning — `captions.mjs keepout --scene` (when captions enabled) and `check-overlap.mjs --scene` (always) — so layout violations are fixed at the source instead of surfacing at preflight. Build the shared header once, then per-worker packets (`film direction` / `tokens` / `easings` / `voice` are identical for every worker):

```bash

然后读取`group_spec.json.groups[]`获取工作进程数量N。每个工作进程在返回前会运行两个范围化机器检查——`captions.mjs keepout --scene`(启用字幕时)和`check-overlap.mjs --scene`(始终运行)——因此布局违规会在源头修复,而非在预检查时暴露。先构建共享头部,然后为每个工作进程生成包(`film direction`/`tokens`/`easings`/`voice`对所有工作进程相同):

```bash

Same rule as Step 4: packets go in $PROJECT_DIR/.dispatch/, never a fixed /tmp path

与步骤4相同规则:包存储在$PROJECT_DIR/.dispatch/,切勿使用固定/tmp路径

(a stale /tmp file from a previous project survives a failed write and silently

(之前项目的陈旧/tmp文件会在写入失败时静默复用,导致所有工作进程使用错误的设计系统)。

poisons every worker with the wrong design system).

## Film direction
= group_spec.film_direction中的影片级不变量

## Film direction
= the film-level invariants from group_spec.film_direction

(调色板系统/运动默认值+预算/环境系统/禁用列表);

(palette system / motion defaults + budget / ambient system / negative list);

每个场景的creative_brief仅包含基于此的场景特定增量。

each scene's creative_brief carries only scene-specific deltas on top of it.

mkdir -p "$PROJECT_DIR/.dispatch/scene-dispatch" { echo "## Film direction" (cd "$PROJECT_DIR" && node -p 'JSON.parse(require("fs").readFileSync("group_spec.json","utf8")).film_direction || ""') echo "## Tokens/easings/voice" (cd "$PROJECT_DIR" && cat design-system/chunks/tokens.css design-system/chunks/easings.js design-system/chunks/voice.md 2>/dev/null) } > "$PROJECT_DIR/.dispatch/scene-shared.txt"
mkdir -p "$PROJECT_DIR/.dispatch/scene-dispatch" { echo "## Film direction" (cd "$PROJECT_DIR" && node -p 'JSON.parse(require("fs").readFileSync("group_spec.json","utf8")).film_direction || ""') echo "## Tokens/easings/voice" (cd "$PROJECT_DIR" && cat design-system/chunks/tokens.css design-system/chunks/easings.js design-system/chunks/voice.md 2>/dev/null) } > "$PROJECT_DIR/.dispatch/scene-shared.txt"

Guard BEFORE fan-out: the project's own brand token must be present; a contaminated

扇出前防护:项目自身的品牌令牌必须存在;此处的污染包会导致所有受影响工作进程需重新一轮创作。

packet here costs a full re-author round across every affected worker.

grep -q -- '--brand-primary' "$PROJECT_DIR/.dispatch/scene-shared.txt" ||
{ echo "FATAL: scene-shared.txt incomplete/stale — rebuild before dispatching workers"; }
grep -q -- '--brand-primary' "$PROJECT_DIR/.dispatch/scene-shared.txt" || \ { echo "FATAL: scene-shared.txt不完整/陈旧——调度前请重新构建"; }

Then per worker: shared header + that worker's Scenes YAML -> $PROJECT_DIR/.dispatch/scene-dispatch/w<N>.txt

Start **N scene workers in parallel** (concurrent background dispatches; a harness concurrency cap below N means waves of the cap size until every worker has run — never fewer workers). prompt = full contents of `agents/hyperframes-scene.md` + `## Dispatch context`, verbatim. Top-level fields: `SKILL_DIR` / `PROJECT_DIR` / `Worker ID` / `Composition width` + `Composition height` (= `group_spec.width` / `group_spec.height`) / `Captions: <enabled|disabled>` (= `group_spec.captions_enabled`) / `Dispatch packet: <PROJECT_DIR>/.dispatch/scene-dispatch/w<N>.txt`, plus the shared header body (`## Film direction` + `## Tokens/easings/voice`) + a `Scenes:` list.
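The wave rule above (a cap below N means waves of the cap size, never fewer workers) can be sketched as a simple chunking step. This is a minimal illustration, not part of the skill's scripts: `planWaves` and its arguments are hypothetical names, and the actual dispatch primitive depends on your harness.

```javascript
// Sketch only: planWaves is a hypothetical helper, not one of the skill's scripts.
// It splits N worker IDs into sequential waves of at most `cap` concurrent dispatches,
// so every worker still runs even when the harness cap is below N.
function planWaves(workerIds, cap) {
  if (!Number.isInteger(cap) || cap < 1) {
    throw new RangeError("cap must be a positive integer");
  }
  const waves = [];
  for (let i = 0; i < workerIds.length; i += cap) {
    waves.push(workerIds.slice(i, i + cap));
  }
  return waves;
}

// Five workers under a cap of 2: three waves, all five workers dispatched.
const exampleWaves = planWaves(["w1", "w2", "w3", "w4", "w5"], 2);
console.log(exampleWaves);
```

Each wave would be dispatched in parallel and awaited before the next one starts; the worker count itself is never reduced to fit the cap.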

For the worker top-level context, copy from `group_spec.json.groups[i]`: `worker_id`, `composition_id`, `composition_file`, `duration_s`, `scene_ids`; and from the top of `group_spec.json`: `width`, `height` (the worker authors + self-checks the root at these dims — landscape 1920×1080 unless portrait/square was requested upstream). **When `Captions: enabled`, also pass `Caption band top y` = `height − round(height × 0.1667)` and `Foreground max y` = `Caption band top y − 20`** (landscape → 900 / 880; portrait → 1600 / 1580) — constraint #13 keep-out is computed from these, not hardcoded. Copy every field in the **`Scenes:` list verbatim from `group_spec.json.groups[i].scenes[<sid>]`** (only that worker's 1-3 logical scenes): `scene_id` / `local_start_s` / `effects` / `rule_paths` / `assetCandidates` / `estimatedDuration_s` / `voicePath` / `design_chunks` (absolute paths to the whole component library — the worker chooses by visual judgment) / `creative_brief`. A 2-3 scene worker writes one `group_wN.html` with true shared DOM across the segments.
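As a worked example of the caption keep-out arithmetic above, the two values can be derived from the composition height alone. The function name `captionKeepOut` is hypothetical and used only for illustration; the formulas are the ones stated in the paragraph.

```javascript
// Sketch only: captionKeepOut is a hypothetical name, not a function from the skill.
// Caption band top y = height - round(height * 0.1667)
// Foreground max y   = Caption band top y - 20
function captionKeepOut(height) {
  const captionBandTopY = height - Math.round(height * 0.1667);
  const foregroundMaxY = captionBandTopY - 20;
  return { captionBandTopY, foregroundMaxY };
}

const landscape = captionKeepOut(1080); // 1920x1080 composition
const portrait = captionKeepOut(1920);  // 1080x1920 composition
console.log(landscape, portrait);
```

For a height of 1080 this gives 900 / 880, and for a height of 1920 it gives 1600 / 1580, matching the landscape and portrait values quoted above.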

`assetCandidates` is `[]` for most or all scenes — the worker invents the visual from `creative_brief` + design chunks; there are no captured assets to place. `design_chunks: null` (chunks missing) → worker falls back to reading `./design-system/design.html` fully; should not happen in the normal path.

After all workers + captions return, run preflight (scans `group_spec.visual_clips[]`; does NOT check `captions.html`):

```bash
(cd "$PROJECT_DIR" && node <SKILL_DIR>/scripts/check-compositions.mjs \
  --hyperframes . \
  --group-spec ./group_spec.json)
```

- exit 0 -> all compositions pass; continue to Step 7.
- exit 1 -> stderr names the violating scene + rule category; return to Step 6 and re-dispatch the affected worker (do not Edit in the master; fix upstream).

Step 7 - Assembly prelude + preflight gate + finalize

After Step 6 exits 0: a deterministic Bash prelude (wait-bgm + assemble + inject/verify-transitions + hoist-videos + sfx-verify + preflight), then one finalize subagent that fixes the brief's findings in place, takes ONE lean contact-sheet look, and renders. Principle: deterministic prelude is all Bash; findings go to finalize (not back to workers); worker re-dispatch is reserved for recomposition.
`compositions/scene_N.html` / `group_wN.html` are worker source files; editing them edits the source.
(1) BGM wait + assembly (Bash):
bash
(cd "$PROJECT_DIR" && node <SKILL_DIR>/scripts/wait-bgm.mjs \
  --audio-meta ./audio_meta.json \
  --hyperframes . \
  --timeout-ms 120000 \
  --interval-ms 2000)
(cd "$PROJECT_DIR" && node <SKILL_DIR>/scripts/assemble-index.mjs --group-spec ./group_spec.json --hyperframes .)
(cd "$PROJECT_DIR" && node <SKILL_DIR>/scripts/transitions.mjs inject --group-spec ./group_spec.json --hyperframes .)
(cd "$PROJECT_DIR" && node <SKILL_DIR>/scripts/transitions.mjs verify --group-spec ./group_spec.json --index ./index.html)
(cd "$PROJECT_DIR" && node <SKILL_DIR>/scripts/hoist-videos.mjs --group-spec ./group_spec.json --hyperframes .)
(cd "$PROJECT_DIR" && node <SKILL_DIR>/scripts/verify-output.mjs sfx --group-spec ./group_spec.json --index ./index.html)
`inject` only changes the `index.html` shell `data-start` / `data-duration` / `data-track-index`, never visual roots. `hoist-videos` reads each composition's poster `data-video-src` declarations, measures the poster's rendered rect headless, and mounts the real `<video class="clip">` at the index.html host root with global timing clamped clear of transitions. This is the ONLY legal way footage plays, since the runtime never decodes a `<video>` nested in a scene. Internal logic is documented in the header of each script.
  • assemble exit 1 -> names a visual composition (root `data-duration` != group_spec, or file missing) = worker contract break → return to Step 6, re-dispatch that worker, rerun this step.
  • inject/verify-transitions exit 1 -> injector bug (prep already validated `transitions[]`) → report, don't roll back workers.
  • hoist-videos exit 1 -> a `data-video-src` declaration is invalid (missing file / bad numbers / window too small after transition clamping / poster not measurable); stderr names the scene + declaration. `Edit` the visual source file (or re-dispatch its worker for a real relayout), then rerun this step. exit 2 -> browser unavailable; run `node <SKILL_DIR>/scripts/check-overlap.mjs --ensure-deps` from the workspace root, then rerun. exit 0 prints one line per hoisted video (src, global window, track, rect).
  • sfx-verify exit 1 -> assembler bug → report.
(2) Preflight gate (Bash):
bash
(cd "$PROJECT_DIR" && node <SKILL_DIR>/scripts/preflight-finalize.mjs --group-spec ./group_spec.json --hyperframes .)
preflight does everything the agent does not need to judge and writes it all into `finalize_brief.json`:
  • warms a pinned `npx hyperframes@<version>` cache, runs lint/validate/inspect with that version, and captures tails + summary counts. inspect runs STRICT: no `--tolerance` flag, CLI default. By-design transient overflow from 3D morph / tilt / zoom peaks is declared per-element with `data-layout-allow-overflow`, never absorbed numerically; any re-run of inspect elsewhere must also be plain or verdicts disagree.
  • computes the snapshot timeline.
  • runs `check-overlap.mjs`, the single-rule rendered overlap gate: every scene is loaded headless, the timeline is seeked to 0.4/0.7/0.92 of duration, and all non-background paint atoms are flattened onto one plane with z-index ignored and pairwise-intersected. Persistent overlap = a finding finalize must fix; `status: unavailable` blocks at exit 2 (the gate never soft-skips).
  • when `captions_enabled`, runs the `captions.mjs keepout` static check for "foreground lower edge y <= 900" (the bbox math folds in CSS transforms AND `margin-top` / `margin-bottom`).
Keep-out violations include ready-to-apply Edit strings (`edit_old` / `edit_new`) and overlap violations carry both selectors + both rects + the overlap rect; finalize consumes both directly and fixes them in place. Brief fields (`preflight_clean` / `gates_clean` / `gates.*` / `bgm.*` / `overlap.*` / `caption_keepout.*` / `anomalies[]` / `snapshot_times_s[]` / `npx_prefix` / `scenes[]` / `internal_seams[]`) and algorithm details are documented at the top of `preflight-finalize.mjs`. Only contrast and cramped-container remain eye-owned (finalize's one contact-sheet scan); collision / panel-bleed are machine-owned by the overlap gate.
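The geometry behind the overlap gate described above can be illustrated with a small sketch. It is not the code in `check-overlap.mjs`: the function names, the `{ selector, rect }` atom shape, and in particular the reading of "persistent" as "overlapping at every sampled time" are assumptions made for this example only.

```javascript
// Sketch only: names and data shapes are hypothetical, not taken from check-overlap.mjs.

// Sample points named in the text: 0.4 / 0.7 / 0.92 of the scene duration.
function sampleTimes(durationS) {
  return [0.4, 0.7, 0.92].map((fraction) => fraction * durationS);
}

// Intersection of two axis-aligned rects { x, y, width, height }, or null if they do not overlap.
function intersectRect(a, b) {
  const x = Math.max(a.x, b.x);
  const y = Math.max(a.y, b.y);
  const right = Math.min(a.x + a.width, b.x + b.width);
  const bottom = Math.min(a.y + a.height, b.y + b.height);
  if (right <= x || bottom <= y) return null;
  return { x, y, width: right - x, height: bottom - y };
}

// All overlapping pairs in one flattened frame (z-index ignored), keyed by both selectors.
function overlappingPairs(atoms) {
  const pairs = new Set();
  for (let i = 0; i < atoms.length; i++) {
    for (let j = i + 1; j < atoms.length; j++) {
      if (intersectRect(atoms[i].rect, atoms[j].rect)) {
        pairs.add([atoms[i].selector, atoms[j].selector].sort().join(" | "));
      }
    }
  }
  return pairs;
}

// ASSUMPTION: "persistent" is read here as "the pair overlaps in every sampled frame".
function persistentOverlaps(frames) {
  if (frames.length === 0) return [];
  const perFrame = frames.map(overlappingPairs);
  return [...perFrame[0]].filter((key) => perFrame.every((set) => set.has(key)));
}
```

Under this reading, an element that only sweeps across another during a transition is not reported, while two elements that sit on top of each other at all three sample points are.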
Exit codes (orchestrator must read them):
  • exit 0 -> dispatch finalize, clean or not. Findings (gate errors / `overlap.violations[]` / `caption_keepout.violations[]`) ride in the brief and finalize fixes them in place as its first work step. Do NOT diagnose them yourself, do NOT hand-Edit visual source files, do NOT re-dispatch workers for them.
  • exit 2 -> ONLY when the overlap gate could not run (`overlap.status: "unavailable"`; puppeteer-core / Chrome missing). Environment problem with a deterministic remedy: run `node <SKILL_DIR>/scripts/check-overlap.mjs --ensure-deps` from the workspace root (and `npx hyperframes doctor` if it names Chrome), then rerun preflight. Do not proceed unmeasured.
  • exit 1 -> preflight itself crashed (bad invocation / missing group_spec) → fix the invocation.
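The exit-code contract above amounts to a three-way branch for the orchestrator. The sketch below encodes it as a lookup; the function name and the returned action labels are hypothetical, and the fallback for any other exit code is an assumption based on the runbook's general "stop and report" rule rather than something the preflight script documents.

```javascript
// Sketch only: nextStepAfterPreflight and its action labels are hypothetical.
function nextStepAfterPreflight(exitCode) {
  switch (exitCode) {
    case 0:
      // Clean or not: findings ride in finalize_brief.json and finalize fixes them.
      return "dispatch-finalize";
    case 2:
      // Overlap gate could not run: install deps, then rerun preflight.
      return "ensure-deps-then-rerun-preflight";
    case 1:
      // Preflight itself crashed: bad invocation or missing group_spec.
      return "fix-invocation-then-rerun-preflight";
    default:
      // ASSUMPTION: unknown codes are treated as "stop and report".
      return "stop-and-report";
  }
}

console.log(nextStepAfterPreflight(0));
```

Note that no branch re-dispatches workers: worker re-dispatch is reached only through a finalize STOP, never directly from a preflight exit code.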
Worker re-dispatch (Repair Mode) is the EXCEPTION path now, not a preflight branch: it triggers only when finalize STOPs because a scene needs recomposition (content fundamentally wrong / real relayout / animation broken beyond a couple of edits). Then: re-dispatch that scene's owning worker (a group worker owns every logical scene in its `group_wN.html`, so dispatch it once with all its findings) with the full `agents/hyperframes-scene.md` + normal dispatch context + a `## Repair context` block carrying finalize's verbatim findings, `npx_prefix` from the brief, `Inspect at: <t1,t2,t3>` (that scene's `midpoint_s` + extras from `brief.scenes[]`), and `Captions: enabled|disabled`. The worker Edits in place and self-verifies (scoped plain `inspect --at` + `check-overlap.mjs --scene` + keepout) per the contract's Repair Mode section. After it returns, rerun (1)+(2) and re-dispatch finalize. If the same finding survives two rounds, STOP and surface it to the user.
Scan `anomalies[]` even on exit 0 (loud non-blocking warnings; currently rare). Read each entry's `message` and decide whether it changes the dispatch.
(3) Dispatch finalize subagent (fix brief findings in place -> ONE lean contact-sheet look -> render). prompt = full contents of `agents/hyperframes-finalize.md` + `## Dispatch context`:
SKILL_DIR: <absolute path>
PROJECT_DIR: <video project root>
Render quality: high     # Or draft / standard
Finalize brief: <PROJECT_DIR>/finalize_brief.json   # Preflight has already written it; agent reads once to get findings + npx_prefix + scene timings
Film direction: |        # = group_spec.film_direction (film-level invariants the briefs assume)
  <verbatim>
Visual clips:            # One line per group_spec.visual_clips[] entry
  - { id, file, kind, worker_id, scene_ids, start_s, duration_s }
Scenes:                  # One line per logical scene, copied verbatim from group_spec.json
  - { scene_id, start_s, estimatedDuration_s, effects: [...], creative_brief: |
      <Phase 3 prose for this scene> }
`index.html` is already assembled (transitions injected, videos hoisted); all gates have already run. Finalize's flow: fix every brief finding in place first (gate `output_tail` -> Edit + rerun only that gate; `overlap.violations[]` -> Edit per the given selectors/rects + scoped `check-overlap --scene` verify; `caption_keepout.violations[]` -> apply `edit_old` / `edit_new` mechanically), then ONE snapshot call at scene midpoints + group-internal seam mids, one read of the contact sheet (looking only for blank/black panels, cut or unreadable text, crushed interiors, broken internal seams; escalate single frames only on suspicion), then render + verify-render. No per-frame QA walkthrough. Finalize must never change a visual root `data-duration` (= `visual_clips[].duration_s`, fixed upstream; changing it makes assemble fatal, so timing is only fixable by returning to Step 6).
  • finalize reports the mp4 (verify-render passed) + gate status + findings fixed + lean-pass summary + files repaired in place -> complete.
  • finalize STOP (only when a scene needs full recomposition) -> return to Step 6, re-dispatch that worker, rerun (1)+(2), re-dispatch finalize. This is an exception path, not the default.

Completion report

Summarize per phase: input title / topic, preset (auto-picked by scriptwriting from the 5 shipped presets), explainer structure, scene count / total duration, worker grouping, transitions, gate status (lint / validate / inspect strict / overlap), hoisted videos (count + tracks), findings fixed in place, lean pass (tiles scanned, escalations), visual files repaired in place, final mp4 path + bytes + duration.
Offer a live preview; never auto-open one. The deliverable is the mp4 above. A browser preview is optional and must not be started until the user asks for it. Do NOT run `hyperframes preview` / `play` during any earlier phase: a preview opened mid-run shows half-edited compositions and dies when that phase's own snapshot/render server is torn down, which confuses more than it helps. End the report with a single offer line, e.g.:
> Optional: I can open a live preview so you can scrub frame-by-frame, change playback speed, or get a shareable link. Say the word and I'll start it.

Only after the user asks, start a long-lived dev server (it serves the final on-disk files and stays up until stopped), then report the actual URL with the real port + project name:
bash
(cd "$PROJECT_DIR" && npx hyperframes preview)   # Studio UI, e.g. http://localhost:3002/#project/<project-name>

or a lightweight shareable player link instead:

(cd "$PROJECT_DIR" && npx hyperframes play) # plain http://localhost:<port>

Flags (custom port, external browser) live in the `hyperframes-cli` skill (`references/preview-render.md`).

---

Resume table

Read `$PROJECT_DIR/context.log` and resume from:

| State | Continue from |
|---|---|
| log missing or empty | Full pipeline |
| `capture/extracted/tokens.json` or `visible-text.txt` missing | Step 1 (scaffold) |
| scaffold done, `narrator_scripts.json` missing | Step 2 (scriptwriting). If the user supplied a final `narrator_scripts.json`, place it in `$PROJECT_DIR/` to skip this state (add a top-level `stylePreset`, or Step 2b defaults to `pin-and-paper`) |
| `narrator_scripts.json` exists, `design-system/chunks/index.json` missing | Step 2b (design-system; `--style` = `narrator_scripts.stylePreset`, default `pin-and-paper`) |
| `narrator_scripts.json` exists, `audio_meta.json` missing | Step 3 (audio) |
| `audio_meta.json` exists, `section_plan.md` missing | Step 4 (visual-design) |
| `section_plan.md` exists, `group_spec.json` missing | Step 5 (prep) |
| `group_spec.json` exists, any `visual_clips[].file` missing or `caption_groups.json` missing | Step 5.5+6 (run `captions.mjs group` -> `html`, then dispatch workers for missing clips). Captions-ran criterion = `caption_groups.json` exists (NOT `captions.html`, since a legal skip produces none) |
| all `visual_clips[].file` exist + captions decided, `renders/video.mp4` missing | Step 7 (rerun assemble + sfx-verify + preflight, overwriting `finalize_brief.json` / `index.html`, then dispatch finalize) |
| `renders/video.mp4` exists | Report completed and stop |
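The resume table reads naturally as a first-match cascade over which artifacts exist. The sketch below encodes that reading under several stated assumptions: `resumeFrom`, `exists`, `logIsEmpty`, and `clipFiles` are hypothetical names; rows are evaluated top to bottom; `caption_groups.json` is assumed to sit at the project root; and the `visual_clips[].file` values are assumed to be paths relative to `PROJECT_DIR`.

```javascript
// Sketch only: resumeFrom and its inputs are hypothetical, not part of the skill's scripts.
// `exists(relPath)` answers whether a file exists under PROJECT_DIR.
// `clipFiles` holds the group_spec.visual_clips[].file values (empty until group_spec.json exists).
function resumeFrom({ exists, logIsEmpty = false, clipFiles = [] }) {
  if (!exists("context.log") || logIsEmpty) return "Full pipeline";
  if (!exists("capture/extracted/tokens.json") || !exists("capture/extracted/visible-text.txt")) {
    return "Step 1";
  }
  if (!exists("narrator_scripts.json")) return "Step 2";
  if (!exists("design-system/chunks/index.json")) return "Step 2b";
  if (!exists("audio_meta.json")) return "Step 3";
  if (!exists("section_plan.md")) return "Step 4";
  if (!exists("group_spec.json")) return "Step 5";
  // ASSUMPTION: caption_groups.json lives at the project root.
  if (!exists("caption_groups.json") || clipFiles.some((file) => !exists(file))) {
    return "Step 5.5+6";
  }
  if (!exists("renders/video.mp4")) return "Step 7";
  return "Report completed and stop";
}

// Example: everything up to audio exists, but section_plan.md has not been written yet.
const present = new Set([
  "context.log",
  "capture/extracted/tokens.json",
  "capture/extracted/visible-text.txt",
  "narrator_scripts.json",
  "design-system/chunks/index.json",
  "audio_meta.json",
]);
console.log(resumeFrom({ exists: (p) => present.has(p) }));
```

Passing `exists` as a predicate keeps the decision logic separate from the filesystem, so the same cascade can be checked against a plain set of paths.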

Directory shape

text
./                            # workspace root
├── .claude/skills/
├── node_modules/  package.json
└── videos/<project-name>/    # PROJECT_DIR - HyperFrames project root
    ├── hyperframes.json  context.log
    ├── capture/              # synthetic package (NOT a scrape) — kept for backend layout compatibility
    │   ├── extracted/        # tokens.json (synthetic) + visible-text.txt (the input text)
    │   └── assets/           # empty (faceless)
    ├── design-system/        # build-design outputs: inference.json / design.html / chunks/ / fonts/
    ├── narrator_scripts.json  audio_meta.json  section_plan.md  group_spec.json
    ├── public/  assets/  compositions/  snapshots/
    └── renders/video.mp4