Present: Narrated Interactive Presentations
Generate a self-contained HTML presentation with dual article/slides mode, ElevenLabs narration, optional GPT Image 2 illustrations, and scroll-reveal animations.
What This Skill Produces
A single `index.html` file (plus audio and optional image assets) that can be:
- Opened locally in a browser
- Deployed to Vercel, Netlify, or any static host
- Shared as a folder
The output has two modes the viewer can toggle between:
- Article mode — long-form scrollable report with Tufte-inspired typography
- Slides mode — navigable presentation with keyboard/click navigation and narrated audio playback
Quick Start
```
/present "AI adoption research for Arseny" --slides 12 --voice daniel --images risograph
```
Or with a file:
```
/present path/to/research.md --detail detailed --voice alice
```
Parameters
| Parameter | Values | Default | Description |
|---|---|---|---|
| `--slides` | 5-20 | 12 | Number of slides |
| `--detail` | `executive`, `standard`, `detailed` | `standard` | Content depth |
| `--voice` | ElevenLabs voice name | `daniel` | Narrator voice |
| `--images` | style name or | | Image generation style |
| `--image-prompt` | custom string | auto | Override image prompt prefix |
| | path | | Output directory |
| `--deploy` | vercel project or | | Auto-deploy target |
| | string | auto | Presentation title |
| | flag | false | Skip audio generation |
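The flags above could be parsed roughly as follows. This is a minimal sketch rather than the skill's actual parser: it covers only the flags whose names appear in this document, `build_parser` and the `source` positional are hypothetical names, and the `None` defaults for `--images`, `--image-prompt`, and `--deploy` are placeholders because the real defaults are not stated here.

```python
import argparse

DETAIL_LEVELS = ("executive", "standard", "detailed")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only flags named in this document."""
    parser = argparse.ArgumentParser(prog="present")
    # Topic description or path to a source file (positional name is an assumption).
    parser.add_argument("source")
    parser.add_argument("--slides", type=int, default=12, choices=range(5, 21),
                        metavar="5-20", help="Number of slides")
    parser.add_argument("--detail", choices=DETAIL_LEVELS, default="standard",
                        help="Content depth")
    parser.add_argument("--voice", default="daniel", help="Narrator voice")
    # Placeholder defaults below: the document does not state the real ones.
    parser.add_argument("--images", default=None, help="Image generation style")
    parser.add_argument("--image-prompt", default=None,
                        help="Override image prompt prefix")
    parser.add_argument("--deploy", default=None, help="Auto-deploy target")
    return parser


# Parse the first Quick Start invocation.
args = build_parser().parse_args(
    ["AI adoption research for Arseny", "--slides", "12",
     "--voice", "daniel", "--images", "risograph"]
)
print(args.slides, args.detail, args.voice, args.images)
```

Using `choices=range(5, 21)` rejects slide counts outside the documented 5-20 range at parse time instead of failing later in the pipeline.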
Detail Levels
- `executive` (5-7 slides): Key findings only. One stat slide, one recommendation slide, sources. Best for busy stakeholders who need the bottom line.
- `standard` (10-14 slides): Full narrative arc. Problem, evidence, analysis, recommendations, sources. The default for most presentations.
- `detailed` (15-20 slides): Deep dive. Includes methodology, multiple evidence sections, case studies, detailed recommendations with implementation steps.
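The slide ranges for each level can be captured in a small lookup. The sketch below is illustrative only: `DETAIL_RANGES`, `fits_detail`, and `level_for` are hypothetical names, and how the skill reconciles a `--slides` value that falls outside the chosen level's range is not specified in this document.

```python
from typing import Optional

# Slide ranges as listed above: (minimum, maximum), both inclusive.
DETAIL_RANGES = {
    "executive": (5, 7),
    "standard": (10, 14),
    "detailed": (15, 20),
}


def fits_detail(detail: str, slides: int) -> bool:
    """Return True when the slide count falls inside the level's range."""
    low, high = DETAIL_RANGES[detail]
    return low <= slides <= high


def level_for(slides: int) -> Optional[str]:
    """Return the detail level whose range contains `slides`, if any."""
    for name, (low, high) in DETAIL_RANGES.items():
        if low <= slides <= high:
            return name
    # Counts such as 8 or 9 sit between the documented ranges.
    return None


print(fits_detail("standard", 12), level_for(6), level_for(8))
```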
Voice Options
Uses the ElevenLabs API. The key must be available in `~/claude-skills/elevenlabs-tts/.env` as `ELEVENLABS_API_KEY`.
Recommended voices for presentations:
- daniel — Steady Broadcaster, British, formal (default)
- alice — Clear Educator, British, professional
- matilda — Knowledgeable, American, upbeat
- brian — Deep Resonant, American, comforting
- george — Warm Storyteller, British, mature
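Before generating audio it can help to confirm that the key is actually present. The following is a minimal sketch that assumes the `.env` file uses plain `KEY=VALUE` lines; `read_env_value` is a hypothetical helper, not part of the skill, and the path is the one quoted above.

```python
from pathlib import Path
from typing import Optional


def read_env_value(env_path: Path, name: str) -> Optional[str]:
    """Return the value of `name` from a simple KEY=VALUE .env file, or None."""
    if not env_path.exists():
        return None
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Drop optional surrounding quotes.
            return value.strip().strip('"').strip("'")
    return None


env_file = Path.home() / "claude-skills" / "elevenlabs-tts" / ".env"
api_key = read_env_value(env_file, "ELEVENLABS_API_KEY")
print("key found" if api_key else "key missing")
```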
Image Styles
When `--images` is set, the skill generates illustrations for key slides using GPT Image 2 (`~/.claude/skills/gpt-image-2/scripts/gpt_image_2.py`). Available styles:
- `risograph`: Gerd Arntz isotype style, muted colors, sand texture
- `editorial`: Magazine photography style, dramatic lighting
- `blueprint`: Technical drawing aesthetic, white on blue
- `ink`: Black ink illustration, hand-drawn feel
- `constellation`: Data visualization aesthetic, dots and lines
- Custom: pass `--image-prompt "your style description"` to override

Images are generated in `--draft` mode first (~$0.006/image). The skill decides which slides benefit from illustration (typically 3-5 out of 12).
Workflow
Step 1: Content Analysis
Read the input content (a topic description, a markdown file, vault notes, meeting transcript, or research). Identify:
- The core argument or narrative
- Key data points and statistics
- Natural section breaks
- Quotable findings with sources
Step 2: Slide Planning
Based on `--detail` and `--slides`, create a slide plan. Each slide needs:
```
Slide N: [Type] - [Title]
Content: [what appears on screen]
Narration: [what the voice says, always more than what's on screen]
Read time: [seconds for an average reader to absorb the visual content]
Image: [yes/no, with prompt if yes]
```
Slide types: `title`, `summary`, `stat`, `evidence`, `comparison`, `quote`, `framework`, `recommendation`, `case-study`, `sources`
The narration script should be conversational and add context beyond what's displayed. It should NOT just read the slide text aloud; it should explain, connect, and elaborate. Target 15-30 seconds of narration per slide.
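One way to hold a slide plan in memory is a small record type that validates the slide type and estimates narration length. This is a sketch under stated assumptions: `SlidePlan` and its field names are hypothetical, and the 150 words-per-minute speaking rate is an assumption used only to compare a script against the 15-30 second target.

```python
from dataclasses import dataclass
from typing import Optional

# Slide types as listed above.
SLIDE_TYPES = {
    "title", "summary", "stat", "evidence", "comparison",
    "quote", "framework", "recommendation", "case-study", "sources",
}

# Assumed speaking rate; the document only gives the 15-30 second target.
WORDS_PER_MINUTE = 150


@dataclass
class SlidePlan:
    number: int
    slide_type: str
    title: str
    content: str
    narration: str
    read_time: int  # seconds for a reader to absorb the visual content
    image_prompt: Optional[str] = None  # None means the slide has no image

    def __post_init__(self) -> None:
        if self.slide_type not in SLIDE_TYPES:
            raise ValueError(f"unknown slide type: {self.slide_type}")

    def narration_seconds(self) -> float:
        """Estimate spoken length from the narration word count."""
        return len(self.narration.split()) / WORDS_PER_MINUTE * 60

    def narration_in_target(self) -> bool:
        """Check the estimate against the 15-30 second target."""
        return 15 <= self.narration_seconds() <= 30
```

The estimate is only a planning aid; the real duration comes from the generated audio file.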
Step 3: Generate Audio
For each slide, generate narration using ElevenLabs:
```bash
python3 ~/.claude/skills/elevenlabs-tts/scripts/elevenlabs_tts.py \
  --voice <voice_name> \
  --text "<narration>" \
  --output <output_dir>/audio/slide-<N>.mp3
```
Or use the direct API via the script at `scripts/generate_audio.py` in this skill.
Also generate a transition sound (Rhodes chord) for slide-to-slide transitions.
After generation, get durations with `ffprobe` to calculate slide timing.
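Reading durations with `ffprobe` might look like the sketch below. It assumes `ffprobe` is on the `PATH` and that audio files follow the `slide-<N>.mp3` naming shown above; `parse_duration`, `probe_duration`, and `collect_durations` are hypothetical helper names, not functions from the skill's scripts.

```python
import subprocess
from pathlib import Path
from typing import Dict


def parse_duration(ffprobe_output: str) -> float:
    """Convert ffprobe's bare duration output to seconds."""
    return float(ffprobe_output.strip())


def probe_duration(audio_path: Path) -> float:
    """Ask ffprobe for the container duration of one audio file, in seconds."""
    result = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-show_entries", "format=duration",
            "-of", "default=noprint_wrappers=1:nokey=1",
            str(audio_path),
        ],
        capture_output=True, text=True, check=True,
    )
    return parse_duration(result.stdout)


def collect_durations(audio_dir: Path) -> Dict[str, float]:
    """Map each slide-*.mp3 file name to its duration in seconds."""
    return {p.name: probe_duration(p) for p in sorted(audio_dir.glob("slide-*.mp3"))}
```

Calling `collect_durations(Path("<output_dir>/audio"))` would return a mapping that can be fed into the slide timing calculation.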
Step 4: Generate Images (if enabled)
For slides that benefit from illustration, generate images using GPT Image 2:
```bash
python3 ~/.claude/skills/gpt-image-2/scripts/gpt_image_2.py --draft --size 1536x1024 \
  "<style prefix> <slide-specific prompt>" \
  <output_dir>/images/<name>.png
```
Typically generate 3-5 images for a 12-slide deck. Choose slides where a visual metaphor strengthens the point: stat slides, concept slides, and the title slide are good candidates. Don't illustrate every slide.
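Assembling that command for each illustrated slide could be sketched as follows. The prefix strings are paraphrased from the style descriptions in this document and are assumptions; the prefixes the skill really uses may differ. `STYLE_PREFIXES` and `image_command` are hypothetical names.

```python
from pathlib import Path
from typing import List, Optional

# Assumed prefixes, paraphrased from the style list in this document.
STYLE_PREFIXES = {
    "risograph": "Gerd Arntz isotype style, muted colors, sand texture.",
    "editorial": "Magazine photography style, dramatic lighting.",
    "blueprint": "Technical drawing aesthetic, white on blue.",
    "ink": "Black ink illustration, hand-drawn feel.",
    "constellation": "Data visualization aesthetic, dots and lines.",
}

SCRIPT = "~/.claude/skills/gpt-image-2/scripts/gpt_image_2.py"


def image_command(style: str, slide_prompt: str, output_dir: str, name: str,
                  custom_prefix: Optional[str] = None) -> List[str]:
    """Build the argument list for one draft image, mirroring the command above."""
    # A custom prefix (the --image-prompt value) overrides the named style.
    prefix = custom_prefix if custom_prefix else STYLE_PREFIXES[style]
    target = Path(output_dir) / "images" / f"{name}.png"
    return [
        "python3", str(Path(SCRIPT).expanduser()),
        "--draft", "--size", "1536x1024",
        f"{prefix} {slide_prompt}",
        str(target),
    ]


cmd = image_command("ink", "a rising line chart", "out", "growth")
print(cmd)
```

Returning a list rather than a shell string means the prompt can contain quotes or spaces without extra escaping when the list is passed to `subprocess.run`.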
Step 5: Build HTML
Use the template at `assets/template.html` as the base. The template includes:
- Typography: EB Garamond (body) + DM Sans (labels/numbers)
- Color palette: Configurable via CSS variables in `:root`
- Article mode: Tufte-inspired layout with executive summary box, stat cards, two-column sections, data tables
- Slides mode: Full-viewport slides with fade transitions, keyboard navigation (arrows, space), dot indicators
- Audio engine: Single reusable `<audio>` element, slide-synced playback with progress bar, transition sounds between slides
- Auto-hide controls: Top bar (mode switcher + audio) appears when cursor enters top 20% of viewport. Bottom nav appears in bottom 20%. Shift+. toggles always-show/always-hide/zone mode.
- Scroll-reveal animations: Intersection Observer-based fade-up for sections, staggered stat cards, animated counters, h2 rule-draw effect
- `prefers-reduced-motion`: All animations disabled when user prefers reduced motion

Populate the template by replacing placeholder sections with the actual slide and article content.
Step 6: Test
Open in browser using `/real-browser` or `open <path>`. Verify:
- Article mode renders correctly, images load
- Slides mode: all slides navigable, text fits within viewport
- Audio plays when play button is clicked
- Audio syncs to slide advancement (each slide waits for narration + read time)
- Transition sounds play between slides
- Auto-hide works for top and bottom bars
- Keyboard navigation (arrows, space) works in slide mode
Step 7: Deploy (if requested)
If `--deploy` is set, copy output to the target project's `public/` folder and deploy:
```bash
# Create the target folder first so cp does not fail on a new slug
mkdir -p <project_path>/public/<slug>/
cp -r <output_dir>/* <project_path>/public/<slug>/
cd <project_path> && vercel deploy --prod --yes
```
HTML Architecture
Audio Sync Model
Each slide has three timing properties:
- `data-audio="slide-name"`: maps to audio file
- `data-read-time="N"`: seconds for reading the visual content

The audio engine calculates: `slide_duration = max(audio_duration, read_time) + 2s`. After narration ends, it waits for any remaining read time plus a 2-second buffer, plays a transition sound (1.8s), then advances to the next slide.
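The timing rule can be worked through in a few lines. This sketch restates the formula in Python for illustration; the template's engine runs in the browser, and the function names here are hypothetical. It also assumes the transition sound plays in full before the slide advances.

```python
BUFFER_SECONDS = 2.0      # buffer added after narration and read time
TRANSITION_SECONDS = 1.8  # length of the transition sound


def slide_duration(audio_duration: float, read_time: float) -> float:
    """slide_duration = max(audio_duration, read_time) + 2s."""
    return max(audio_duration, read_time) + BUFFER_SECONDS


def wait_after_narration(audio_duration: float, read_time: float) -> float:
    """Pause between the end of narration and the start of the transition sound."""
    return max(0.0, read_time - audio_duration) + BUFFER_SECONDS


def time_until_advance(audio_duration: float, read_time: float) -> float:
    """Total time on a slide, assuming the transition sound plays in full."""
    return slide_duration(audio_duration, read_time) + TRANSITION_SECONDS


# Narration longer than read time: only the 2-second buffer remains.
print(slide_duration(20.0, 12.0), wait_after_narration(20.0, 12.0))
# Read time longer than narration: the viewer gets the leftover 5 seconds plus the buffer.
print(slide_duration(10.0, 15.0), wait_after_narration(10.0, 15.0))
```

In both cases the wait after narration equals `slide_duration` minus the audio length, so the two functions stay consistent with each other.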
Avoiding AI-Looking Formatting
The following patterns read as AI-generated and should be avoided:
- Colored left-bar + bold heading + description blocks (finding cards)
- Large italic pull quotes with colored left border
- Uniform card grids with icon + heading + description
- Gradient text on metrics
Instead use:
- Natural prose paragraphs with inline emphasis
- Definition lists (`<dl>`) for structured points
- Tables for comparisons
- Direct statements woven into flowing text
Image Paths
Use absolute paths from the deployment root: `/slug/images/name.png`, not relative paths. Relative paths break when URLs load without trailing slashes.
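The trailing-slash problem follows from standard URL resolution, which Python's `urllib.parse.urljoin` reproduces. In this illustration `example.com` and the `deck` slug are placeholder values.

```python
from urllib.parse import urljoin

page_without_slash = "https://example.com/deck"
page_with_slash = "https://example.com/deck/"

relative = "images/name.png"
absolute = "/deck/images/name.png"

# Without a trailing slash, "deck" is treated as a file and dropped.
broken = urljoin(page_without_slash, relative)
# With a trailing slash, the relative path resolves inside the folder.
works = urljoin(page_with_slash, relative)
# A root-absolute path resolves the same way from either URL.
stable_a = urljoin(page_without_slash, absolute)
stable_b = urljoin(page_with_slash, absolute)

print(broken)    # https://example.com/images/name.png
print(works)     # https://example.com/deck/images/name.png
print(stable_a)  # https://example.com/deck/images/name.png
print(stable_b)  # https://example.com/deck/images/name.png
```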
Files
- `SKILL.md`: This file
- `scripts/generate_audio.py`: ElevenLabs TTS batch generator
- `assets/template.html`: Base HTML template with all CSS/JS
- `references/slide-types.md`: Detailed slide type specifications and examples