explainer

When to Use

  • User wants to create an explainer or tutorial video
  • User asks to "explain" something in video form
  • User wants narrated content with AI-generated visuals
  • User says "explainer video", "解说视频", "tutorial video"

When NOT to Use

  • User wants audio-only content without visuals (use `/speech` or `/podcast`)
  • User wants a podcast-style discussion (use `/podcast`)
  • User wants to generate a standalone image (use `/image-gen`)
  • User wants to read text aloud without video (use `/speech`)

Purpose

Generate explainer videos that combine a single narrator's voiceover with AI-generated visuals. Ideal for product introductions, concept explanations, and tutorials. Supports text-only script generation or full text + video output.

Hard Constraints

  • No shell scripts. Construct curl commands from the files listed under API Reference
  • Always read `shared/authentication.md` for API key and headers
  • Follow `shared/common-patterns.md` for polling, errors, and interaction patterns
  • Always read config following `shared/config-pattern.md` before any interaction
  • Never hardcode speaker IDs; always fetch them from the speakers API
  • Never save files to `~/Downloads/`; use `.listenhub/explainer/` from config
  • Explainer uses exactly 1 speaker
  • Mode must be `info` (for Info style) or `story` (for Story style), never `slides` (use the `/slides` skill instead)
<HARD-GATE> Use the AskUserQuestion tool for every multiple-choice step — do NOT print options as plain text. Ask one question at a time. Wait for the user's answer before proceeding to the next step. After all parameters are collected, summarize the choices and ask the user to confirm. Do NOT call any generation API until the user has explicitly confirmed. </HARD-GATE>

Step -1: API Key Check

Follow `shared/config-pattern.md` § API Key Check. If the key is missing, stop immediately.

Step 0: Config Setup

Follow `shared/config-pattern.md` Step 0.
If file doesn't exist — ask location, then create immediately:
bash
mkdir -p ".listenhub/explainer"
echo '{"outputDir":".listenhub","outputMode":"inline","language":null,"defaultStyle":null,"defaultSpeakers":{}}' > ".listenhub/explainer/config.json"
CONFIG_PATH=".listenhub/explainer/config.json"
# (or $HOME/.listenhub/explainer/config.json for global)

Then run **Setup Flow** below.

**If file exists** — read config, display summary, and confirm:
当前配置 (explainer):
  输出方式:{inline / download / both}
  语言偏好:{zh / en / 未设置}
  默认风格:{info / story / 未设置}
  默认主播:{speakerName / 未设置}
Ask: "使用已保存的配置?" → **确认,直接继续** / **重新配置**
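
The choice between a project-level and a global config file can be sketched as a small lookup. This is only an illustration and is not taken from `shared/config-pattern.md`: the function name `resolve_config_path` is hypothetical, and the rule that a project-level file takes precedence over the global one is an assumption.

```shell
# Hypothetical helper: print the path of an existing explainer config.
# Assumption: a project-level config takes precedence over the global one.
# $1: project root to search (defaults to the current directory)
resolve_config_path() {
  project_root="${1:-.}"
  local_cfg="$project_root/.listenhub/explainer/config.json"
  global_cfg="$HOME/.listenhub/explainer/config.json"
  if [ -f "$local_cfg" ]; then
    printf '%s\n' "$local_cfg"
  elif [ -f "$global_cfg" ]; then
    printf '%s\n' "$global_cfg"
  else
    # No config yet: the caller should ask for a location and run the Setup Flow.
    return 1
  fi
}
```

A caller might use it as `CONFIG_PATH=$(resolve_config_path) || echo "no config found"`, falling through to the first-run branch when nothing is printed.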

Setup Flow (first run or reconfigure)

Ask these questions in order, then save all answers to config at once:
  1. outputMode: Follow `shared/output-mode.md` § Setup Flow Question.
  2. Language (optional): "默认语言?"
    • "中文 (zh)"
    • "English (en)"
    • "每次手动选择" → keep `null`
  3. Style (optional): "默认风格?"
    • "Info — 信息展示型"
    • "Story — 故事叙述型"
    • "每次手动选择" → keep `null`
After collecting answers, save immediately:
bash
# Follow shared/output-mode.md § Save to Config
NEW_CONFIG=$(echo "$CONFIG" | jq --arg m "$OUTPUT_MODE" '. + {"outputMode": $m}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"
CONFIG=$(cat "$CONFIG_PATH")

Note: `defaultSpeakers` are saved after generation (see After Successful Generation section).

Interaction Flow

Step 1: Topic / Content

Free text input. Ask the user:
What would you like to explain or introduce?
Accept: topic description, text content, or concept to explain.

Step 2: Language

If `config.language` is set, pre-fill it, show it in the summary, and skip this question. Otherwise ask:
Question: "What language?"
Options:
  - "Chinese (zh)" — Content in Mandarin Chinese
  - "English (en)" — Content in English

Step 3: Style

If `config.defaultStyle` is set, pre-fill it, show it in the summary, and skip this question. Otherwise ask:
Question: "What style of explainer?"
Options:
  - "Info" — Informational, factual presentation style
  - "Story" — Narrative, storytelling approach
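
A minimal sketch of how the Step 3 answer might be mapped onto the API `mode` value while respecting the constraint that `slides` is never used by this skill. The function name `style_to_mode` and its error message are hypothetical; the skill text only specifies the two allowed values.

```shell
# Hypothetical helper: turn the user's style answer into the API "mode" value.
# Prints "info" or "story"; fails for anything else.
style_to_mode() {
  case "$1" in
    Info|info)   echo "info" ;;
    Story|story) echo "story" ;;
    *)
      # "slides" (and any other value) is rejected; slides belongs to the /slides skill.
      echo "unsupported explainer style: $1" >&2
      return 1
      ;;
  esac
}
```

For example, `MODE=$(style_to_mode "Info")` leaves `info` in `MODE`, while `style_to_mode slides` returns a non-zero status so the flow can stop before any request is built.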

Step 4: Speaker Selection

Follow `shared/speaker-selection.md` for the full selection flow, including:
  • Default from `config.defaultSpeakers.{language}` (skip step if set)
  • Text table + free-text input
  • Input matching and re-prompt on no match
Only 1 speaker is supported for explainer videos.

Step 5: Output Type

Question: "What output do you want?"
Options:
  - "Text script only" — Generate narration script, no video
  - "Text + Video" — Generate full explainer video with AI visuals

Step 6: Confirm & Generate

Summarize all choices:
Ready to generate explainer:

  Topic: {topic}
  Language: {language}
  Style: {info/story}
  Speaker: {speaker name}
  Output: {text only / text + video}

  Proceed?
Wait for explicit confirmation before calling any API.
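
The summary template above could be filled in with a small helper like the one below. This is a sketch under assumptions: the function name `print_summary` and its argument order are hypothetical, and the resulting text would still be presented through the confirmation step required by the hard gate rather than treated as confirmation itself.

```shell
# Hypothetical helper: render the confirmation summary from the collected answers.
# Arguments: topic, language, style, speaker name, output type.
print_summary() {
  printf 'Ready to generate explainer:\n\n'
  printf '  Topic: %s\n' "$1"
  printf '  Language: %s\n' "$2"
  printf '  Style: %s\n' "$3"
  printf '  Speaker: %s\n' "$4"
  printf '  Output: %s\n\n' "$5"
  printf '  Proceed?\n'
}
```

For instance, `print_summary "Claude Code introduction" en info "cozy-man-english" "text + video"` produces the block shown above with each placeholder filled in.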

Workflow

  1. Submit (foreground): `POST /storybook/episodes` with content, speaker, language, mode → extract `episodeId`
  2. Tell the user the task is submitted
  3. Poll (background): Run the following exact bash command with `run_in_background: true` and `timeout: 600000`. Do NOT use python3, awk, or any other JSON parser; use `jq` as shown:
    bash
    EPISODE_ID="<id-from-step-1>"
    for i in $(seq 1 30); do
      RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/storybook/episodes/$EPISODE_ID" \
        -H "Authorization: Bearer $LISTENHUB_API_KEY" 2>/dev/null)
      STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.processStatus // "pending"')
      case "$STATUS" in
        success|completed) echo "$RESULT"; exit 0 ;;
        failed|error) echo "FAILED: $RESULT" >&2; exit 1 ;;
        *) sleep 10 ;;
      esac
    done
    echo "TIMEOUT" >&2; exit 2
  4. When notified, download and present script:
    Read `OUTPUT_MODE` from config. Follow `shared/output-mode.md` for behavior.
    `inline` or `both`: Present the script inline.
    Present:
    解说脚本已生成!
    
    「{title}」
    
    在线查看:https://listenhub.ai/app/explainer/{episodeId}
    `download` or `both`: Also save the script file.
    • Create `.listenhub/explainer/YYYY-MM-DD-{episodeId}/`
    • Write `{episodeId}.md` from the generated script content
    • Present the download path in addition to the above summary.
  5. If video requested: `POST /storybook/episodes/{episodeId}/video` (foreground) → poll again (background) using the exact bash command below with `run_in_background: true` and `timeout: 600000`. Poll for `videoStatus`, not `processStatus`:
    bash
    EPISODE_ID="<id-from-step-1>"
    for i in $(seq 1 30); do
      RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/storybook/episodes/$EPISODE_ID" \
        -H "Authorization: Bearer $LISTENHUB_API_KEY" 2>/dev/null)
      STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.videoStatus // "pending"')
      case "$STATUS" in
        success|completed) echo "$RESULT"; exit 0 ;;
        failed|error) echo "FAILED: $RESULT" >&2; exit 1 ;;
        *) sleep 10 ;;
      esac
    done
    echo "TIMEOUT" >&2; exit 2
  6. When notified, download and present result:
Present result
Read `OUTPUT_MODE` from config. Follow `shared/output-mode.md` for behavior.
`inline` or `both`: Display video URL and audio URL as clickable links.
Present:
解说视频已生成!

视频链接:{videoUrl}
音频链接:{audioUrl}
时长:{duration}s
消耗积分:{credits}
`download` or `both`: Also download the audio file.
bash
DATE=$(date +%Y-%m-%d)
JOB_DIR=".listenhub/explainer/${DATE}-{episodeId}"
mkdir -p "$JOB_DIR"
curl -sS -o "${JOB_DIR}/{episodeId}.mp3" "{audioUrl}"
Present the download path in addition to the above summary.
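
The polling loops in steps 3 and 5 differ only in the status field they read. The sketch below shows that shared structure; it is meant to illustrate the logic, not to replace the exact commands the workflow requires. The names `poll_status` and `fetch_status` are hypothetical, and `fetch_status` assumes `EPISODE_ID` and `LISTENHUB_API_KEY` are already set and that `jq` is installed.

```shell
# Hypothetical sketch of the polling structure shared by steps 3 and 5.
# $1: maximum number of attempts
# $2: seconds to sleep between attempts
# $3...: command that prints the current status string on stdout
# Returns 0 on success, 1 on failure, 2 if still pending after every attempt.
poll_status() {
  max_attempts="$1"; interval="$2"; shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # A failed fetch is treated like "pending", mirroring the jq fallback in the loops above.
    current=$("$@") || current="pending"
    case "$current" in
      success|completed) return 0 ;;
      failed|error)      return 1 ;;
      *)                 sleep "$interval" ;;
    esac
    attempt=$((attempt + 1))
  done
  return 2
}

# Hypothetical fetcher built from the curl and jq calls used in steps 3 and 5.
# $1: status field to read, e.g. processStatus or videoStatus
fetch_status() {
  curl -sS "https://api.marswave.ai/openapi/v1/storybook/episodes/$EPISODE_ID" \
    -H "Authorization: Bearer $LISTENHUB_API_KEY" \
    | tr -d '\000-\037\177' \
    | jq -r --arg f "$1" '.data[$f] // "pending"'
}
```

With these definitions, `poll_status 30 10 fetch_status processStatus` would correspond to the script loop and `poll_status 30 10 fetch_status videoStatus` to the video loop; passing the fetcher as an argument also makes the loop easy to exercise with a stub.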

After Successful Generation

Update config with the choices made this session:
bash
NEW_CONFIG=$(echo "$CONFIG" | jq \
  --arg lang "{language}" \
  --arg style "{info/story}" \
  --arg speakerId "{speakerId}" \
  '. + {"language": $lang, "defaultStyle": $style, "defaultSpeakers": (.defaultSpeakers + {($lang): [$speakerId]})}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"
Estimated times:
  • Text script only: 2-3 minutes
  • Text + Video: 3-5 minutes

API Reference

  • Speaker list: `shared/api-speakers.md`
  • Speaker selection guide: `shared/speaker-selection.md`
  • Episode creation: `shared/api-storybook.md`
  • Polling: `shared/common-patterns.md` § Async Polling
  • Config pattern: `shared/config-pattern.md`

Composability

  • Invokes: speakers API (for speaker selection); may invoke `/speech` for voiceover
  • Invoked by: content-planner (Phase 3)

Example

User: "Create an explainer video introducing Claude Code"
Agent workflow:
  1. Topic: "Claude Code introduction"
  2. Ask language → "English"
  3. Ask style → "Info"
  4. Fetch speakers, user picks "cozy-man-english"
  5. Ask output → "Text + Video"
bash
curl -sS -X POST "https://api.marswave.ai/openapi/v1/storybook/episodes" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sources": [{"type": "text", "content": "Introduce Claude Code: what it is, key features, and how to get started"}],
    "speakers": [{"speakerId": "cozy-man-english"}],
    "language": "en",
    "mode": "info"
  }'
Poll until text is ready, then generate video if requested.
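
Before polling, the `episodeId` has to be taken from the creation response. A minimal sketch, assuming the response wraps its payload in a `data` object the same way the status responses do; that response shape and the helper name `extract_episode_id` are assumptions, so check `shared/api-storybook.md` for the actual field. It requires `jq`.

```shell
# Hypothetical helper: pull the episode ID out of the creation response.
# Assumption: the ID lives at .data.episodeId in the JSON body.
# $1: raw JSON response from POST /storybook/episodes
extract_episode_id() {
  # -e makes jq exit non-zero when the field is missing or null.
  printf '%s' "$1" | jq -e -r '.data.episodeId'
}
```

A caller could capture the response of the `curl` command in `RESPONSE` and then run `EPISODE_ID=$(extract_episode_id "$RESPONSE") || echo "no episodeId in response" >&2`, stopping instead of polling with an empty ID.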