markdown-video
Markdown Video Skill
Convert markdown slides to presentation video with AI-generated visuals and TTS audio narration.
When to Use This Skill
Activate this skill when the user:
- Asks to create video from markdown slides
- Requests to convert presentation to MP4 format
- Wants to generate narrated video from slides
- Needs automated slide-to-video conversion
Key Features
- Gemini AI-generated visuals: High-quality slide images with full emoji and Korean support
- OpenAI TTS narration: Natural voice from speaker notes
- Delta updates: Only regenerates changed slides (saves time and API costs)
- Multiple visual styles: technical-diagram, professional, vibrant-cartoon, watercolor
Input Requirements
- Markdown file with speaker notes marked with the `^` prefix
- `GEMINI_API_KEY` environment variable for image generation
- `OPENAI_API_KEY` environment variable for TTS audio
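As a rough illustration of these input requirements, the sketch below pulls `^`-prefixed speaker notes out of a markdown string and reports which API key variables are missing. It is not taken from the skill's scripts: the function names are hypothetical, and the assumption that slides are separated by a line containing only `---` is ours.

```python
import os
import re


def extract_speaker_notes(markdown_text):
    """Return one narration string per slide ('' when a slide has no notes)."""
    # ASSUMPTION: slides are separated by a line containing only '---'.
    slides = re.split(r"^---\s*$", markdown_text, flags=re.MULTILINE)
    notes = []
    for slide in slides:
        # Lines whose first non-space character is '^' are speaker notes.
        lines = [
            line.lstrip()[1:].strip()
            for line in slide.splitlines()
            if line.lstrip().startswith("^")
        ]
        notes.append(" ".join(lines))
    return notes


def missing_api_keys(env=os.environ):
    """List which of the two required API key variables are unset or empty."""
    return [key for key in ("GEMINI_API_KEY", "OPENAI_API_KEY") if not env.get(key)]
```

A slide with an empty note string would have no narration, which is the situation described under "No speaker notes found".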
Output Specifications
- MP4 video: 1920x1080 (Full HD)
- Duration: Each slide displays for the duration of its audio narration
- File naming: `{input_filename}.mp4`
Workflow
Step 1: Generate Audio Files
```bash
cd "{slides_directory}"
python /Users/lifidea/.claude/skills/markdown-video/generate_audio.py "{slides_filename}" --output-dir "audio"
```
Delta update: Only regenerates audio for slides with changed speaker notes.
- Use `--force` to regenerate all audio files

Output:
- `audio/slide_0.mp3`, `slide_1.mp3`, ... (0-indexed)
- Cache file: `audio/.audio_cache.json`
Step 2: Generate Slide Images with Gemini
```bash
cd "{slides_directory}"
python /Users/lifidea/.claude/skills/markdown-video/create_slides_gemini.py "{slides_filename}" \
  --output-dir "slides-gemini" \
  --style "technical-diagram" \
  --auto-approve
```
Delta update: Only regenerates images for slides with changed content.
- Use `--force` to regenerate all slide images

Style Options:
| Style | Description | Best For |
|---|---|---|
| `technical-diagram` | Clean lines, infographic icons, muted blue/gray | Technical, education |
| `professional` | Minimalist, geometric shapes | Corporate, formal |
| `vibrant-cartoon` | Bright gradients, flat design | Marketing, startups |
| `watercolor` | Soft pastels, organic shapes | Creative, personal |

Other Parameters:
- `--model`: Gemini model (default: gemini-3-pro-image-preview)
- `--aspect-ratio`: 16:9 (default), 1:1, 9:16, 4:3, 3:4
- `--start-from N`: Resume from slide N
- `--dry-run`: Preview prompts without generating

Output:
- `slides-gemini/1.jpeg`, `2.jpeg`, ... (1-indexed)
- Cache file: `slides-gemini/.slides_cache.json`
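Note that the two outputs use different numbering: audio files are 0-indexed and slide images are 1-indexed. The sketch below shows one way the files could be paired in presentation order. The helper name is hypothetical, and whether `slides_to_video.py` pairs files exactly this way is an assumption.

```python
from pathlib import Path


def pair_slides_with_audio(slides_dir, audio_dir, slide_count):
    """Return (image_path, audio_path) pairs in presentation order."""
    pairs = []
    for i in range(slide_count):
        image = Path(slides_dir) / f"{i + 1}.jpeg"  # images are 1-indexed
        audio = Path(audio_dir) / f"slide_{i}.mp3"  # audio is 0-indexed
        pairs.append((image, audio))
    return pairs
```

With this convention, the first slide pairs `slides-gemini/1.jpeg` with `audio/slide_0.mp3`.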
Step 3: Create Final Video
```bash
cd "{slides_directory}"
python /Users/lifidea/.claude/skills/markdown-video/slides_to_video.py \
  --slides-dir "slides-gemini" \
  --audio-dir "audio" \
  --output "{output_filename}.mp4"
```
Delta Updates
Both audio and image generation support delta updates - only regenerating what changed.
How It Works
- Content hashing: Each slide's content is hashed (MD5)
- Cache storage: Hashes stored in `.audio_cache.json` / `.slides_cache.json`
- Change detection: On subsequent runs, only changed slides are regenerated
- File verification: Also checks whether the output file exists
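The steps above can be sketched roughly as follows. This is a minimal illustration, not the scripts' actual code: the function names, the cache layout (slide index mapped to MD5 hex digest), and the choice to save the cache only after generation succeeds are all assumptions.

```python
import hashlib
import json
from pathlib import Path


def find_changed_slides(slide_texts, cache_path, output_for):
    """Return (indices needing regeneration, fresh hash table).

    output_for(i) gives the expected output path for slide i.
    ASSUMPTION: the cache is a JSON object of {slide index: MD5 hex digest}.
    """
    cache_file = Path(cache_path)
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    changed, hashes = [], {}
    for i, text in enumerate(slide_texts):
        digest = hashlib.md5(text.encode("utf-8")).hexdigest()
        hashes[str(i)] = digest
        # Regenerate when the content hash differs or the output is missing.
        if cache.get(str(i)) != digest or not Path(output_for(i)).exists():
            changed.append(i)
    return changed, hashes


def save_cache(cache_path, hashes):
    """Persist the hash table; call this after generation has succeeded."""
    Path(cache_path).write_text(json.dumps(hashes, indent=2))
```

Writing the cache only after a successful run means a failed generation is retried on the next invocation.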
Example Output
```
✅ Found 20 slides
20 slides with speaker notes
✨ Delta update: 17 slides unchanged, 3 to regenerate
🎵 Generating 3 audio files...
Progress |████████████████████████████████████████| 3/3 (100.0%)
✅ Audio generation complete!
Generated: 3/3 files
Unchanged: 17 files (skipped)
```
Force Regeneration
To ignore cache and regenerate everything:
```bash
# Force regenerate all audio
python generate_audio.py "slides.md" --output-dir "audio" --force

# Force regenerate all images
python create_slides_gemini.py "slides.md" --output-dir "slides-gemini" --force
```
---
Quick Reference
Full Workflow (First Run)
```bash
cd "{slides_directory}"

# Step 1: Generate audio
python /Users/lifidea/.claude/skills/markdown-video/generate_audio.py "slides.md" --output-dir "audio"

# Step 2: Generate slide images
python /Users/lifidea/.claude/skills/markdown-video/create_slides_gemini.py "slides.md" \
  --output-dir "slides-gemini" \
  --style "technical-diagram" \
  --auto-approve

# Step 3: Create video
python /Users/lifidea/.claude/skills/markdown-video/slides_to_video.py \
  --slides-dir "slides-gemini" \
  --audio-dir "audio" \
  --output "presentation.mp4"
```
Update Workflow (After Changes)
Same commands; delta updates are automatic:
```bash
# Only regenerates changed slides
python generate_audio.py "slides.md" --output-dir "audio"
python create_slides_gemini.py "slides.md" --output-dir "slides-gemini" --auto-approve
python slides_to_video.py --slides-dir "slides-gemini" --audio-dir "audio" --output "presentation.mp4"
```
---
Requirements
System Dependencies
- Python 3.7+
- ffmpeg: `brew install ffmpeg` (macOS with Homebrew)
Python Packages
```bash
pip install Pillow requests google-genai
```
Environment Variables
```bash
export OPENAI_API_KEY="sk-..."
export GEMINI_API_KEY="..."
```
Cost Estimation
| Component | Cost | Example (20 slides) |
|---|---|---|
| Gemini images | ~$0.04/slide | ~$0.80 |
| OpenAI TTS | ~$0.015/1K chars | ~$0.50 |
| Total | | ~$1.30 |
With delta updates, subsequent runs only cost for changed slides.
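The arithmetic behind the table can be written out as a small estimator. The rates are the approximate figures from the table. The function name is hypothetical, and the roughly 33,000 characters of narration in the usage note are inferred from the table's ~$0.50 TTS figure, not stated in the source.

```python
GEMINI_COST_PER_SLIDE = 0.04   # approximate rate from the cost table
TTS_COST_PER_1K_CHARS = 0.015  # approximate rate from the cost table


def estimate_cost(slides_to_generate, narration_chars):
    """Estimate API cost in USD for a run that regenerates the given amount."""
    image_cost = slides_to_generate * GEMINI_COST_PER_SLIDE
    tts_cost = narration_chars / 1000 * TTS_COST_PER_1K_CHARS
    return {"images": image_cost, "tts": tts_cost, "total": image_cost + tts_cost}
```

For a first run of 20 slides with about 33,000 characters of narration, this gives close to the table's ~$1.30. A delta update that touches 3 slides and 5,000 characters comes to roughly $0.20.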
Error Handling
No speaker notes found
- Slides need `^`-prefixed speaker notes for narration
- Example: `^ This is the speaker note for this slide.`
Pronunciation problems
- Replace technical terms with phonetic equivalents in speaker notes
- Test with `--dry-run` first
API errors
- Check API key environment variables
- Gemini rate limits: script includes 1-second delay between generations
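A fixed delay between generation calls, as mentioned for Gemini rate limits, could look like the sketch below. The helper and its parameters are hypothetical. The real script's behavior beyond "1-second delay between generations" is not documented here.

```python
import time


def generate_with_delay(items, generate, delay_seconds=1.0, sleep=time.sleep):
    """Call generate(item) for each item, pausing between consecutive calls."""
    results = []
    for index, item in enumerate(items):
        if index > 0:
            sleep(delay_seconds)  # pause between calls, not before the first
        results.append(generate(item))
    return results
```

The `sleep` parameter is injectable only so the pacing can be checked without actually waiting.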
Quality Checklist
Before marking complete:
- OpenAI and Gemini API keys configured
- Markdown file has speaker notes with `^` prefix
- Audio files generated successfully
- Slide images generated successfully
- Video plays correctly with synced audio
- Resolution is 1920x1080
Image Generation Mode
Two approaches for generating visuals:
| Mode | Script | Best For |
|---|---|---|
| Slide-by-Slide | `create_slides_gemini.py` | Standard presentations, precise control |
| Section-based | `generate_section_images.py` | Long presentations, infographic style |
When to Use Section-Based
- Presentations with 20+ slides
- Content naturally groups into logical sections
- Prefer infographic overview per section vs. individual slides
- Want to reduce API costs (fewer images)
Section-Based Workflow (Alternative)
For presentations with many slides, generate one infographic image per section instead of per slide.
Comparison
| Aspect | Slide-by-Slide | Section-Based |
|---|---|---|
| Images | 1 per slide | 1 per section |
| Audio | Per slide | Per slide → merged by section |
| Review | Direct in markdown | Video script document |
| Best for | Short presentations | Long presentations (20+ slides) |
Step 1: Generate Audio Files
Same as standard workflow:
```bash
cd "{slides_directory}"
python /Users/lifidea/.claude/skills/markdown-video/generate_audio.py "{slides_filename}" --output-dir "audio"
```
Step 2: Generate Section Infographic Images
```bash
cd "{slides_directory}"
python /Users/lifidea/.claude/skills/markdown-video/generate_section_images.py "{slides_filename}" \
  --output-dir "slides-section" \
  --style "infographic"
```
Style Options:
| Style | Description |
|---|---|
| | Clean professional with icons (default) |
| | Minimalist corporate design |
| | Bright gradients for marketing |
| | Flowcharts and technical diagrams |

Other Parameters:
- `--start-from N`: Resume from section N
- `--force`: Regenerate all images
- `--dry-run`: Preview sections without generating
- `--delay N`: Seconds between API calls (default: 2.0)
Step 3: Create Video Script (Optional)
Generate a markdown document for reviewing narration:
```bash
cd "{slides_directory}"
python /Users/lifidea/.claude/skills/markdown-video/create_video_script.py "{slides_filename}" \
  --output "video_script.md" \
  --image-dir "slides-section"
```
The script document shows:
- Section images embedded
- Speaker notes for each slide
- Easy editing format
Step 4: Review & Edit Narration
- Open `video_script.md`
- Review narration in blockquotes
- Edit directly in the document
- Update original markdown file with changes
- Regenerate audio for changed slides:
```bash
python generate_audio.py "slides.md" --output-dir "audio"
```
Step 5: Create Section-Based Video
```bash
cd "{slides_directory}"
python /Users/lifidea/.claude/skills/markdown-video/create_section_video.py \
  --slides "{slides_filename}" \
  --audio-dir "audio" \
  --image-dir "slides-section" \
  --output "presentation.mp4"
```
With config file (for custom section mappings):
```bash
python create_section_video.py \
  --config "sections.json" \
  --audio-dir "audio" \
  --image-dir "slides-section" \
  --output "presentation.mp4"
```
Config file format:
```json
{
  "sections": [
    {"id": 0, "name": "title", "audio_slides": [0]},
    {"id": 1, "name": "introduction", "audio_slides": [1, 2, 3]},
    {"id": 2, "name": "main_content", "audio_slides": [4, 5, 6, 7]}
  ]
}
```
Section-Based Quick Reference
```bash
cd "{slides_directory}"

# Step 1: Generate audio
python /Users/lifidea/.claude/skills/markdown-video/generate_audio.py "slides.md" --output-dir "audio"

# Step 2: Generate section images
python /Users/lifidea/.claude/skills/markdown-video/generate_section_images.py "slides.md" \
  --output-dir "slides-section" \
  --style "infographic"

# Step 3 (optional): Create review document
python /Users/lifidea/.claude/skills/markdown-video/create_video_script.py "slides.md" \
  --output "video_script.md" \
  --image-dir "slides-section"

# Step 4: Create final video
python /Users/lifidea/.claude/skills/markdown-video/create_section_video.py \
  --slides "slides.md" \
  --audio-dir "audio" \
  --image-dir "slides-section" \
  --output "presentation.mp4"
```
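To make the section config format from Step 5 more concrete, the sketch below reads a config and lists the audio files each section would cover. The function name is hypothetical, and it assumes `audio_slides` entries are the same 0-based indices used in the `slide_N.mp3` file names. How `create_section_video.py` actually merges the audio is not shown in this document.

```python
import json
from pathlib import Path


def audio_files_by_section(config_text, audio_dir="audio"):
    """Map each section name to the ordered audio files it covers."""
    config = json.loads(config_text)
    mapping = {}
    for section in config["sections"]:
        # ASSUMPTION: audio_slides holds 0-based indices into slide_N.mp3.
        mapping[section["name"]] = [
            Path(audio_dir) / f"slide_{i}.mp3" for i in section["audio_slides"]
        ]
    return mapping
```

With the example config, the `introduction` section would cover `audio/slide_1.mp3` through `audio/slide_3.mp3`.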