markdown-video

Markdown Video Skill

Convert markdown slides to presentation video with AI-generated visuals and TTS audio narration.

When to Use This Skill

Activate this skill when the user:
  • Asks to create video from markdown slides
  • Requests to convert presentation to MP4 format
  • Wants to generate narrated video from slides
  • Needs automated slide-to-video conversion

Key Features

  • Gemini AI-generated visuals: High-quality slide images with full emoji and Korean support
  • OpenAI TTS narration: Natural voice from speaker notes
  • Delta updates: Only regenerates changed slides (saves time and API costs)
  • Multiple visual styles: technical-diagram, professional, vibrant-cartoon, watercolor

Input Requirements

  • Markdown file with speaker notes marked with a `^` prefix
  • `GEMINI_API_KEY` environment variable for image generation
  • `OPENAI_API_KEY` environment variable for TTS audio
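
To make the expected input concrete, the sketch below splits a deck into slides and collects the `^`-prefixed lines as narration. It is an illustration only: the source does not state how slides are separated, so the `---` separator is an assumption, and `parse_slides` is a hypothetical helper rather than the skill's actual parser.

```python
from typing import Dict, List


def parse_slides(markdown_text: str) -> List[Dict[str, str]]:
    """Split a deck into slides and separate body text from speaker notes.

    Assumption: slides are delimited by a line containing only '---'.
    Lines starting with '^' are treated as speaker notes.
    """
    slides: List[Dict[str, str]] = []
    body: List[str] = []
    notes: List[str] = []

    def flush() -> None:
        # Store the slide collected so far and reset the buffers.
        slides.append({
            "content": "\n".join(body).strip(),
            "notes": " ".join(notes).strip(),
        })
        body.clear()
        notes.clear()

    for line in markdown_text.splitlines():
        stripped = line.strip()
        if stripped == "---":
            flush()
        elif stripped.startswith("^"):
            notes.append(stripped[1:].strip())
        else:
            body.append(line)
    flush()
    return slides


deck = "# Title\n^ Welcome to the talk.\n---\n# Agenda\n- Item\n^ Here is the agenda."
for index, slide in enumerate(parse_slides(deck)):
    print(index, repr(slide["notes"]))
```

A slide whose `notes` value is empty would have nothing to narrate, which is the situation described under "No speaker notes found".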

Output Specifications

  • MP4 video: 1920x1080 (Full HD)
  • Duration: Each slide is displayed for the duration of its audio narration
  • File naming: `{input_filename}.mp4`

Workflow

Step 1: Generate Audio Files

```bash
cd "{slides_directory}"
python /Users/lifidea/.claude/skills/markdown-video/generate_audio.py "{slides_filename}" --output-dir "audio"
```
Delta update: Only regenerates audio for slides with changed speaker notes.
  • Use `--force` to regenerate all audio files
Output:
  • `audio/slide_0.mp3`, `slide_1.mp3`, ... (0-indexed)
  • Cache file: `audio/.audio_cache.json`

Step 2: Generate Slide Images with Gemini

```bash
cd "{slides_directory}"
python /Users/lifidea/.claude/skills/markdown-video/create_slides_gemini.py "{slides_filename}" \
  --output-dir "slides-gemini" \
  --style "technical-diagram" \
  --auto-approve
```
Delta update: Only regenerates images for slides with changed content.
  • Use `--force` to regenerate all slide images
Style Options:

| Style | Description | Best For |
|---|---|---|
| `technical-diagram` | Clean lines, infographic icons, muted blue/gray | Technical, education |
| `professional` | Minimalist, geometric shapes | Corporate, formal |
| `vibrant-cartoon` | Bright gradients, flat design | Marketing, startups |
| `watercolor` | Soft pastels, organic shapes | Creative, personal |

Other Parameters:
  • `--model`: Gemini model (default: gemini-3-pro-image-preview)
  • `--aspect-ratio`: 16:9 (default), 1:1, 9:16, 4:3, 3:4
  • `--start-from N`: Resume from slide N
  • `--dry-run`: Preview prompts without generating
Output:
  • `slides-gemini/1.jpeg`, `2.jpeg`, ... (1-indexed)
  • Cache file: `slides-gemini/.slides_cache.json`

Step 3: Create Final Video

```bash
cd "{slides_directory}"
python /Users/lifidea/.claude/skills/markdown-video/slides_to_video.py \
  --slides-dir "slides-gemini" \
  --audio-dir "audio" \
  --output "{output_filename}.mp4"
```
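
How `slides_to_video.py` assembles the video is not shown here. The sketch below illustrates one common approach under stated assumptions: pair each 1-indexed image with its 0-indexed audio file, then build an ffmpeg command that holds a still image for the length of its narration. `pair_assets` and `segment_command` are hypothetical names, and the encoder settings are illustrative choices, not the script's confirmed options.

```python
from pathlib import Path
from typing import List, Tuple


def pair_assets(slides_dir: str, audio_dir: str, count: int) -> List[Tuple[Path, Path]]:
    """Pair image N+1 with audio N (images are 1-indexed, audio is 0-indexed)."""
    return [
        (Path(slides_dir) / f"{i + 1}.jpeg", Path(audio_dir) / f"slide_{i}.mp3")
        for i in range(count)
    ]


def segment_command(image: Path, audio: Path, output: Path) -> List[str]:
    """Build an ffmpeg command for one slide: a looped still image with audio."""
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-i", str(image),   # repeat the still image as video frames
        "-i", str(audio),
        "-vf", "scale=1920:1080",         # Full HD output
        "-c:v", "libx264", "-tune", "stillimage", "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-shortest",                      # stop when the audio ends
        str(output),
    ]


for image, audio in pair_assets("slides-gemini", "audio", 2):
    print(" ".join(segment_command(image, audio, Path(f"segment_{image.stem}.mp4"))))
```

In this approach the per-slide segments would still need to be concatenated into a single MP4 afterwards.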

Delta Updates

Both audio and image generation support delta updates: only slides whose content changed are regenerated.

How It Works

  1. Content hashing: Each slide's content is hashed (MD5)
  2. Cache storage: Hashes are stored in `.audio_cache.json` / `.slides_cache.json`
  3. Change detection: On subsequent runs, only changed slides are regenerated
  4. File verification: Also checks whether the output file exists
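
The four steps above can be sketched as follows. This is a minimal illustration, not the scripts' actual code: the cache layout (a flat JSON object mapping a slide key to its hash) and all function names are assumptions.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict


def content_hash(text: str) -> str:
    """MD5 digest of a slide's content, used only for change detection."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def needs_regeneration(key: str, text: str, output: Path, cache: Dict[str, str]) -> bool:
    """A slide is regenerated if its hash changed or its output file is missing."""
    return cache.get(key) != content_hash(text) or not output.exists()


def load_cache(path: Path) -> Dict[str, str]:
    """Read the cache file, or start with an empty cache on the first run."""
    return json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}


def save_cache(path: Path, cache: Dict[str, str]) -> None:
    """Persist the hashes so the next run can skip unchanged slides."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(cache, indent=2), encoding="utf-8")
```

Checking the output file as well as the hash means a deleted `slide_3.mp3` is rebuilt even when its speaker notes are unchanged.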

Example Output

✅ Found 20 slides
   20 slides with speaker notes

✨ Delta update: 17 slides unchanged, 3 to regenerate

🎵 Generating 3 audio files...
Progress |████████████████████████████████████████| 3/3 (100.0%)

✅ Audio generation complete!
   Generated: 3/3 files
   Unchanged: 17 files (skipped)

Force Regeneration

To ignore cache and regenerate everything:
```bash
# Force regenerate all audio
python generate_audio.py "slides.md" --output-dir "audio" --force

# Force regenerate all images
python create_slides_gemini.py "slides.md" --output-dir "slides-gemini" --force
```

---

Quick Reference

Full Workflow (First Run)

```bash
cd "{slides_directory}"

# Step 1: Generate audio
python /Users/lifidea/.claude/skills/markdown-video/generate_audio.py "slides.md" --output-dir "audio"

# Step 2: Generate slide images
python /Users/lifidea/.claude/skills/markdown-video/create_slides_gemini.py "slides.md" \
  --output-dir "slides-gemini" \
  --style "technical-diagram" \
  --auto-approve

# Step 3: Create video
python /Users/lifidea/.claude/skills/markdown-video/slides_to_video.py \
  --slides-dir "slides-gemini" \
  --audio-dir "audio" \
  --output "presentation.mp4"
```

Update Workflow (After Changes)

Same commands; delta updates are automatic:
```bash
# Only regenerates changed slides
python generate_audio.py "slides.md" --output-dir "audio"
python create_slides_gemini.py "slides.md" --output-dir "slides-gemini" --auto-approve
python slides_to_video.py --slides-dir "slides-gemini" --audio-dir "audio" --output "presentation.mp4"
```

---

Requirements

System Dependencies

  • Python 3.7+
  • ffmpeg: `brew install ffmpeg`

Python Packages

```bash
pip install Pillow requests google-genai
```

Environment Variables

```bash
export OPENAI_API_KEY="sk-..."
export GEMINI_API_KEY="..."
```
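
Before running the workflow it can help to confirm that both variables are visible to Python. The following is a small sketch of such a check; `missing_keys` is a hypothetical helper and is not part of the skill's scripts.

```python
import os
from typing import List, Mapping

REQUIRED_KEYS = ("OPENAI_API_KEY", "GEMINI_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED_KEYS if not env.get(name)]


missing = missing_keys()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("Both API keys are set.")
```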

Cost Estimation

| Component | Cost | Example (20 slides) |
|---|---|---|
| Gemini images | ~$0.04/slide | ~$0.80 |
| OpenAI TTS | ~$0.015/1K chars | ~$0.50 |
| Total | | ~$1.30 |

With delta updates, subsequent runs incur costs only for changed slides.
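
The arithmetic behind these figures can be reproduced with a short estimate. The rates are the approximate values quoted in this section; `estimate_cost` is a hypothetical helper, and actual provider pricing may differ. At the quoted TTS rate, the ~$0.50 example corresponds to roughly 33,000 characters of narration.

```python
IMAGE_COST_PER_SLIDE = 0.04      # approximate rate quoted in this section
TTS_COST_PER_1K_CHARS = 0.015    # approximate rate quoted in this section


def estimate_cost(changed_slides: int, narration_chars: int) -> float:
    """Estimate the API cost in USD for the slides that will be regenerated."""
    images = changed_slides * IMAGE_COST_PER_SLIDE
    tts = narration_chars / 1000 * TTS_COST_PER_1K_CHARS
    return round(images + tts, 2)


# First run: 20 slides and about 33,000 characters of narration.
print(estimate_cost(20, 33_000))
# Delta update: 3 changed slides and about 5,000 characters of narration.
print(estimate_cost(3, 5_000))
```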

Error Handling

No speaker notes found

  • Slides need `^`-prefixed speaker notes for narration
  • Example: `^ This is the speaker note for this slide.`

Pronunciation problems

  • Replace technical terms with phonetic spellings in the speaker notes
  • Test with `--dry-run` first

API errors

  • Check API key environment variables
  • Gemini rate limits: script includes 1-second delay between generations

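
The script's own delay logic is not shown here. As a sketch of the general pattern, the helper below pauses between consecutive generation calls; `generate_with_delay` and its injectable `sleep` parameter are assumptions made for illustration.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def generate_with_delay(
    items: Iterable[T],
    generate: Callable[[T], R],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Call `generate` for each item, pausing between consecutive calls."""
    results: List[R] = []
    for index, item in enumerate(items):
        if index > 0:
            sleep(delay_seconds)  # pause between requests, not before the first
        results.append(generate(item))
    return results


# str.upper stands in for a real image-generation call.
print(generate_with_delay(["slide 1", "slide 2"], str.upper, delay_seconds=0.1))
```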

Quality Checklist

Before marking complete:
  • OpenAI and Gemini API keys configured
  • Markdown file has speaker notes with `^` prefix
  • Audio files generated successfully
  • Slide images generated successfully
  • Video plays correctly with synced audio
  • Resolution is 1920x1080
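
The resolution item can be checked from the command line instead of by eye. The sketch below is one possible way to do it, assuming `ffprobe` (installed alongside ffmpeg) is on the PATH; the helper names are hypothetical.

```python
import subprocess
from typing import List, Tuple


def ffprobe_command(video_path: str) -> List[str]:
    """Build an ffprobe call that prints the first video stream as WIDTHxHEIGHT."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "stream=width,height",
        "-of", "csv=s=x:p=0",
        video_path,
    ]


def parse_resolution(ffprobe_output: str) -> Tuple[int, int]:
    """Parse output such as '1920x1080' into (width, height)."""
    width, height = ffprobe_output.strip().split("x")[:2]
    return int(width), int(height)


def is_full_hd(video_path: str) -> bool:
    """Return True if the video's first stream is 1920x1080 (requires ffprobe)."""
    result = subprocess.run(
        ffprobe_command(video_path), capture_output=True, text=True, check=True
    )
    return parse_resolution(result.stdout) == (1920, 1080)
```

Calling `is_full_hd("presentation.mp4")` after the final step would cover the last checklist item; audio sync still needs a manual playback check.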

Image Generation Mode

Two approaches for generating visuals:

| Mode | Script | Best For |
|---|---|---|
| Slide-by-Slide | `create_slides_gemini.py` | Standard presentations, precise control |
| Section-based | `generate_section_images.py` | Long presentations, infographic style |

When to Use Section-Based

  • The presentation has 20+ slides
  • Content groups naturally into logical sections
  • An infographic overview per section is preferred over individual slide images
  • You want to reduce API costs (fewer images)

Section-Based Workflow (Alternative)

For presentations with many slides, generate one infographic image per section instead of per slide.

Comparison

| Aspect | Slide-by-Slide | Section-Based |
|---|---|---|
| Images | 1 per slide | 1 per section |
| Audio | Per slide | Per slide → merged by section |
| Review | Direct in markdown | Video script document |
| Best for | Short presentations | Long presentations (20+ slides) |

Step 1: Generate Audio Files

Same as standard workflow:
```bash
cd "{slides_directory}"
python /Users/lifidea/.claude/skills/markdown-video/generate_audio.py "{slides_filename}" --output-dir "audio"
```

Step 2: Generate Section Infographic Images

```bash
cd "{slides_directory}"
python /Users/lifidea/.claude/skills/markdown-video/generate_section_images.py "{slides_filename}" \
  --output-dir "slides-section" \
  --style "infographic"
```
Style Options:

| Style | Description |
|---|---|
| `infographic` | Clean professional with icons (default) |
| `professional` | Minimalist corporate design |
| `vibrant` | Bright gradients for marketing |
| `technical` | Flowcharts and technical diagrams |

Other Parameters:
  • `--start-from N`: Resume from section N
  • `--force`: Regenerate all images
  • `--dry-run`: Preview sections without generating
  • `--delay N`: Seconds between API calls (default: 2.0)

Step 3: Create Video Script (Optional)

Generate a markdown document for reviewing narration:
```bash
cd "{slides_directory}"
python /Users/lifidea/.claude/skills/markdown-video/create_video_script.py "{slides_filename}" \
  --output "video_script.md" \
  --image-dir "slides-section"
```
The script document shows:
  • Section images embedded
  • Speaker notes for each slide
  • Easy editing format

Step 4: Review & Edit Narration

  1. Open `video_script.md`
  2. Review narration in blockquotes
  3. Edit directly in the document
  4. Update the original markdown file with the changes
  5. Regenerate audio for changed slides:
     ```bash
     python generate_audio.py "slides.md" --output-dir "audio"
     ```

Step 5: Create Section-Based Video

```bash
cd "{slides_directory}"
python /Users/lifidea/.claude/skills/markdown-video/create_section_video.py \
  --slides "{slides_filename}" \
  --audio-dir "audio" \
  --image-dir "slides-section" \
  --output "presentation.mp4"
```
With config file (for custom section mappings):
```bash
python create_section_video.py \
  --config "sections.json" \
  --audio-dir "audio" \
  --image-dir "slides-section" \
  --output "presentation.mp4"
```
Config file format:
```json
{
  "sections": [
    {"id": 0, "name": "title", "audio_slides": [0]},
    {"id": 1, "name": "introduction", "audio_slides": [1, 2, 3]},
    {"id": 2, "name": "main_content", "audio_slides": [4, 5, 6, 7]}
  ]
}
```
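
To show how such a config might be read, the sketch below maps each section to the audio files it would merge. It assumes that `audio_slides` holds 0-indexed slide numbers matching the `audio/slide_N.mp3` naming from Step 1; `section_audio_files` is a hypothetical helper, not part of `create_section_video.py`.

```python
import json
from typing import Dict, List

CONFIG_TEXT = """
{
  "sections": [
    {"id": 0, "name": "title", "audio_slides": [0]},
    {"id": 1, "name": "introduction", "audio_slides": [1, 2, 3]},
    {"id": 2, "name": "main_content", "audio_slides": [4, 5, 6, 7]}
  ]
}
"""


def section_audio_files(config: dict, audio_dir: str = "audio") -> Dict[str, List[str]]:
    """Map each section name to the audio files listed in its audio_slides."""
    return {
        section["name"]: [f"{audio_dir}/slide_{i}.mp3" for i in section["audio_slides"]]
        for section in config["sections"]
    }


config = json.loads(CONFIG_TEXT)
for name, files in section_audio_files(config).items():
    print(name, files)
```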

Section-Based Quick Reference

```bash
cd "{slides_directory}"

# Step 1: Generate audio
python /Users/lifidea/.claude/skills/markdown-video/generate_audio.py "slides.md" --output-dir "audio"

# Step 2: Generate section images
python /Users/lifidea/.claude/skills/markdown-video/generate_section_images.py "slides.md" \
  --output-dir "slides-section" \
  --style "infographic"

# Step 3 (optional): Create review document
python /Users/lifidea/.claude/skills/markdown-video/create_video_script.py "slides.md" \
  --output "video_script.md" \
  --image-dir "slides-section"

# Step 4: Create final video
python /Users/lifidea/.claude/skills/markdown-video/create_section_video.py \
  --slides "slides.md" \
  --audio-dir "audio" \
  --image-dir "slides-section" \
  --output "presentation.mp4"
```