podcast-producer-agent
Podcast Producer
Create complete podcast episodes, interviews, and conversation-style audio content.
This is an orchestrator skill that combines:
- Script/dialogue generation (Claude)
- Multi-speaker voice synthesis (Gemini TTS)
- Intro/outro music (Lyria)
- Audio assembly (FFmpeg via media-utils)
What You Can Create
| Type | Example |
|---|---|
| Podcast episode | Two hosts discussing a topic |
| Interview | Q&A format with host and guest |
| Dialogue | Scripted conversation between characters |
| Audio drama | Story with multiple characters |
| Radio show | Formatted audio program with segments |
Prerequisites
- `GOOGLE_API_KEY` - For Gemini TTS (voices) and Lyria (music)
- FFmpeg installed: `brew install ffmpeg` (macOS) or `apt install ffmpeg` (Linux)
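Both prerequisites can be checked before any generation starts. The following is a minimal sketch, assuming the key is read from the environment and FFmpeg is discoverable on the PATH; the function name `check_prerequisites` and its injectable parameters are hypothetical and not part of the skill.

```python
import os
import shutil


def check_prerequisites(env=None, which=shutil.which):
    """Return a list of problems; an empty list means ready to generate.

    `env` and `which` are injectable (illustrative only) so the check can
    be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    problems = []
    if not env.get("GOOGLE_API_KEY"):
        problems.append("GOOGLE_API_KEY not set")
    if which("ffmpeg") is None:
        problems.append("FFmpeg not found")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

The returned strings deliberately match the messages listed under Error Handling, so a failed check points straight at the documented fix.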
Workflow
Step 1: Gather Requirements (REQUIRED)
⚠️ DO NOT skip this step. Use interactive questioning: ask ONE question at a time.
Question Flow
⚠️ Use the AskUserQuestion tool for each question below. Do not just print questions in your response; use the tool to create interactive prompts with the options shown.
Q1: Topic
"I'll create that podcast episode! First, what's the topic? (What should the hosts discuss?)"
Wait for response.
Q2: Hosts
"Who are the hosts/speakers? (Names and brief personality, e.g., 'Sarah, enthusiastic tech expert'; max 2 for TTS)"
Wait for response.
Q3: Duration
"How long should the episode be?
- 5 minutes
- 10 minutes
- 15 minutes
- Or specify your own"
Wait for response.
Q4: Tone
"What tone?
- Professional
- Casual/conversational
- Funny/entertaining
- Serious/educational
- Or describe your own"
Wait for response.
Q5: Music
"What music style for intro/outro?
- Upbeat pop
- Chill lo-fi
- Corporate/professional
- Electronic
- Or describe your own"
Wait for response.
Quick Reference
| Question | Determines |
|---|---|
| Topic | Script content and discussion points |
| Hosts | Voice selection and script style |
| Duration | Script length |
| Tone | Writing style and energy |
| Music | Lyria prompt for intro/outro |
Step 2: Generate the Script
Use Claude to create the dialogue script with speaker labels:
[INTRO MUSIC: 10 seconds, upbeat tech podcast vibe]
Sarah: Welcome back to Tech Talk! I'm Sarah, and today we have something exciting.
Mike: Hey everyone! I'm Mike, and yes - we're diving into AI in healthcare.
Sarah: So Mike, what's the biggest change you've seen this year?
Mike: Great question! The use of AI for diagnostic imaging has exploded...
[Continue dialogue...]
[OUTRO MUSIC: Fade under last lines, 5 seconds after]
Sarah: Thanks for listening! Follow us for more episodes.
Mike: See you next time!
Script Guidelines:
- Use speaker names exactly as they'll be configured in TTS
- Include music cues in brackets: [INTRO MUSIC: description]
- Natural conversation flow with back-and-forth
- Aim for ~150 words per minute of audio
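The 150 words-per-minute guideline can be turned into a quick length check on a draft script. This is a rough sketch, assuming the script uses the `Name: text` and bracketed-cue layout shown above; the helper names are hypothetical, and any non-cue line containing a colon is treated as a speaker turn.

```python
import re

WORDS_PER_MINUTE = 150  # pacing figure from the script guidelines

CUE = re.compile(r"^\[.*\]$")                 # e.g. [INTRO MUSIC: ...]
TURN = re.compile(r"^([^:\[\]]+):\s*(.*)$")   # e.g. Sarah: Welcome back


def spoken_words(script):
    """Count words that will be spoken, skipping cues and speaker labels."""
    total = 0
    for raw in script.splitlines():
        line = raw.strip()
        if not line or CUE.match(line):
            continue
        turn = TURN.match(line)
        text = turn.group(2) if turn else line
        total += len(text.split())
    return total


def estimated_minutes(script):
    """Estimate audio length from the spoken word count."""
    return spoken_words(script) / WORDS_PER_MINUTE


def target_words(minutes):
    """Word budget for a requested episode length."""
    return round(minutes * WORDS_PER_MINUTE)
```

For a 5-minute episode, `target_words(5)` gives a budget of 750 words; comparing that with `spoken_words(draft)` shows whether the draft needs trimming or extending before any audio is generated.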
Step 3: Plan Asset Generation
Create a manifest of what needs to be generated:
json
{
  "project": "tech_talk_ai_healthcare",
  "duration_target": "5 minutes",
  "speakers": [
    {"name": "Sarah", "voice": "Kore", "style": "Enthusiastic, upbeat"},
    {"name": "Mike", "voice": "Puck", "style": "Friendly, knowledgeable"}
  ],
  "assets": [
    {
      "type": "music",
      "name": "intro_music",
      "prompt": "upbeat tech podcast, electronic, modern",
      "duration": 15,
      "script": "lyria"
    },
    {
      "type": "dialogue",
      "name": "main_content",
      "speakers": ["Sarah", "Mike"],
      "text": "[the dialogue script]",
      "script": "gemini_tts"
    },
    {
      "type": "music",
      "name": "outro_music",
      "prompt": "same as intro, fade out",
      "duration": 10,
      "script": "lyria"
    }
  ],
  "assembly": [
    {"action": "mix", "voice": "intro_with_music", "music": "intro_music", "music_volume": 0.8},
    {"action": "concat", "files": ["intro_with_music", "main_content"]},
    {"action": "mix", "voice": "main_content_end", "music": "outro_music", "fade_out": 5}
  ]
}
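A manifest like the one above can be sanity-checked before spending API calls. The sketch below is one possible check, not part of the skill: `validate_manifest` is a hypothetical helper, and the rules it enforces (unique asset names, positive music duration, a configured voice for every speaker, at most two speakers per dialogue asset) are assumptions drawn from the manifest fields and the TTS limit stated in this document.

```python
def validate_manifest(manifest, max_speakers=2):
    """Return a list of problems found in a manifest dict (empty if none)."""
    problems = []
    voices = {s["name"]: s.get("voice") for s in manifest.get("speakers", [])}
    seen = set()
    for asset in manifest.get("assets", []):
        name = asset.get("name", "<unnamed>")
        if name in seen:
            problems.append(f"{name}: duplicate asset name")
        seen.add(name)
        if asset.get("type") == "music" and asset.get("duration", 0) <= 0:
            problems.append(f"{name}: music needs a positive duration")
        if asset.get("type") == "dialogue":
            cast = asset.get("speakers", [])
            if len(cast) > max_speakers:
                problems.append(f"{name}: more than {max_speakers} speakers")
            for speaker in cast:
                if not voices.get(speaker):
                    problems.append(f"{name}: no voice configured for {speaker}")
    return problems


manifest = {
    "speakers": [
        {"name": "Sarah", "voice": "Kore"},
        {"name": "Mike", "voice": "Puck"},
    ],
    "assets": [
        {"type": "music", "name": "intro_music", "duration": 15},
        {"type": "dialogue", "name": "main_content", "speakers": ["Sarah", "Mike"]},
    ],
}
print(validate_manifest(manifest))
```

An empty list means the manifest passed these particular checks; it says nothing about whether the voices or prompts will produce good audio.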
Step 4: Generate Assets
Execute each generation step:
Generate intro music (Lyria):
bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/music-generation/scripts/lyria.py \
  --prompt "upbeat tech podcast, electronic, modern, positive energy" \
  --duration 15 \
  --bpm 120
Generate dialogue (Gemini TTS multi-speaker):
bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/voice-generation/scripts/gemini_tts.py \
  --multi \
  --speaker "Sarah:Kore" \
  --speaker "Mike:Puck" \
  --text "[dialogue script from Step 2]" \
  --style "Make Sarah sound enthusiastic and upbeat. Mike sounds friendly and knowledgeable."
Generate outro music (same as intro or variation):
bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/music-generation/scripts/lyria.py \
  --prompt "upbeat tech podcast, electronic, fade out feel" \
  --duration 10 \
  --bpm 120
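The commands above follow directly from the manifest entries, so they can be derived rather than typed by hand. Here is a sketch under stated assumptions: `build_command` is a hypothetical helper, it only builds an argument list and runs nothing, and it uses only the script paths and flags that appear in the commands above (`--bpm` and `--style` are left out because the manifest asset carries no matching field).

```python
import os
import shlex

ROOT = os.environ.get("CLAUDE_PLUGIN_ROOT", ".")
SCRIPTS = {
    "lyria": "skills/music-generation/scripts/lyria.py",
    "gemini_tts": "skills/voice-generation/scripts/gemini_tts.py",
}


def build_command(asset, speakers, root=ROOT):
    """Turn one manifest asset into an argv list (nothing is executed)."""
    argv = ["python3", f"{root}/{SCRIPTS[asset['script']]}"]
    if asset["script"] == "lyria":
        argv += ["--prompt", asset["prompt"], "--duration", str(asset["duration"])]
    else:
        # Multi-speaker TTS: map each speaker name to its configured voice.
        voices = {s["name"]: s["voice"] for s in speakers}
        argv.append("--multi")
        for name in asset["speakers"]:
            argv += ["--speaker", f"{name}:{voices[name]}"]
        argv += ["--text", asset["text"]]
    return argv


speakers = [{"name": "Sarah", "voice": "Kore"}, {"name": "Mike", "voice": "Puck"}]
intro = {"type": "music", "name": "intro_music", "prompt": "upbeat tech podcast",
         "duration": 15, "script": "lyria"}
print(shlex.join(build_command(intro, speakers)))
```

Passing the resulting list to `subprocess.run(argv, check=True)` would execute it; keeping the arguments as a list avoids shell-quoting problems with long dialogue text.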
Step 5: Assemble the Podcast
Use media-utils to stitch everything together:
Mix intro music with beginning of dialogue:
bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/audio_mix.py \
  --voice dialogue.wav \
  --music intro_music.wav \
  --music-volume 0.4 \
  --fade-out 3 \
  -o intro_mixed.wav
Concatenate all segments:
bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/audio_concat.py \
  -i intro_mixed.wav main_dialogue.wav outro_mixed.wav \
  --crossfade 1.0 \
  -o final_podcast.mp3
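Before running the assembly, it can help to confirm that every input file is either generated in Step 4 or produced by an earlier assembly step. The sketch below assumes Step 4 wrote `dialogue.wav`, `intro_music.wav`, and `outro_music.wav` (hypothetical output names; the generation commands do not state them), and `missing_inputs` is an illustrative helper rather than part of media-utils.

```python
def missing_inputs(steps, available):
    """Return (output, missing_input) pairs for inputs nothing has produced yet."""
    have = set(available)
    missing = []
    for step in steps:
        for name in step["inputs"]:
            if name not in have:
                missing.append((step["output"], name))
        have.add(step["output"])  # later steps may consume this output
    return missing


# The two assembly commands above, expressed as inputs and outputs.
steps = [
    {"output": "intro_mixed.wav", "inputs": ["dialogue.wav", "intro_music.wav"]},
    {"output": "final_podcast.mp3",
     "inputs": ["intro_mixed.wav", "main_dialogue.wav", "outro_mixed.wav"]},
]
generated = ["dialogue.wav", "intro_music.wav", "outro_music.wav"]
print(missing_inputs(steps, generated))
```

With these assumed names, the check reports `main_dialogue.wav` and `outro_mixed.wav` as inputs of the concatenation that no earlier step produced, so those files would need to exist first, for example from a second `audio_mix.py` call for the outro.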
Step 6: Deliver the Result
Provide:
- The final audio file
- Summary of what was created
- Offer adjustments
Example delivery:
"✅ Your podcast episode is ready!
File: tech_talk_ai_healthcare.mp3 (5:23)
What I created:
- Dialogue between Sarah (Kore voice) and Mike (Puck voice)
- 15s upbeat electronic intro music
- 10s outro music fade
Want me to:
- Adjust the voices or tone?
- Change the music style?
- Extend or shorten any section?
- Add more topics to the discussion?"
Voice Pairing Suggestions
| Host Type | Suggested Voice | Description |
|---|---|---|
| Main host (energetic) | Kore, Puck, Laomedeia | Firm, upbeat |
| Co-host (calm) | Charon, Algieba | Informative, smooth |
| Guest (expert) | Rasalgethi, Gacrux | Knowledgeable, mature |
| Interviewer | Achird | Friendly |
| Narrator | Charon, Orus | Informative, clear |
Music Style Suggestions
| Podcast Type | Music Prompt |
|---|---|
| Tech/Business | "upbeat electronic, modern, corporate, positive" |
| Casual/Comedy | "fun, playful, acoustic guitar, lighthearted" |
| News/Serious | "subtle, professional, ambient, understated" |
| Storytelling | "cinematic, emotional, orchestral, atmospheric" |
| Health/Wellness | "calm, peaceful, ambient, gentle piano" |
| True Crime | "dark, suspenseful, minimal, tension" |
Limitations
- Max 2 speakers per Gemini TTS call (for more, generate separate files and concatenate)
- Lyria is instrumental only - no vocals in music
- Duration estimates may vary - dialogue length depends on speaking pace
- Music loops if shorter than dialogue
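For scripts with more than two voices, the workaround in the first limitation needs the dialogue cut into runs that each involve at most two speakers. A minimal greedy sketch, assuming turns are already parsed into `(speaker, text)` pairs; `split_by_speaker_limit` is a hypothetical helper, and a greedy cut is only one of several reasonable ways to place the boundaries.

```python
def split_by_speaker_limit(turns, limit=2):
    """Group consecutive turns so each segment has at most `limit` speakers."""
    segments, current, active = [], [], set()
    for speaker, text in turns:
        # A new voice would exceed the limit: close the segment first.
        if speaker not in active and len(active) == limit:
            segments.append(current)
            current, active = [], set()
        current.append((speaker, text))
        active.add(speaker)
    if current:
        segments.append(current)
    return segments


turns = [("Sarah", "Hi"), ("Mike", "Hello"), ("Sarah", "Meet Ana"),
         ("Ana", "Thanks"), ("Sarah", "Welcome"), ("Mike", "Great")]
for segment in split_by_speaker_limit(turns):
    print(sorted({speaker for speaker, _ in segment}), len(segment))
```

Each segment can then go through its own TTS call, and the resulting files are concatenated in order.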
Error Handling
| Error | Solution |
|---|---|
| "GOOGLE_API_KEY not set" | Set up API key per README |
| "FFmpeg not found" | Install: `brew install ffmpeg` (macOS) or `apt install ffmpeg` (Linux) |
| "google-genai not installed" | Run: `pip install google-genai` |
| TTS too long | Split script into segments, generate separately, concat |
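The "TTS too long" fix amounts to cutting the script at line boundaries so that no segment exceeds some size. The sketch below assumes a word-count budget; `split_by_words` and the `max_words` value are hypothetical, since the actual input limit of the TTS call is not stated here.

```python
def split_by_words(lines, max_words):
    """Pack whole script lines into chunks of at most `max_words` words.

    A single line longer than the budget still gets its own chunk.
    """
    chunks, current, count = [], [], 0
    for line in lines:
        words = len(line.split())
        if current and count + words > max_words:
            chunks.append(current)
            current, count = [], 0
        current.append(line)
        count += words
    if current:
        chunks.append(current)
    return chunks


script_lines = [
    "Sarah: one two three",
    "Mike: four five",
    "Sarah: six seven eight nine",
]
for chunk in split_by_words(script_lines, max_words=7):
    print(chunk)
```

Splitting only between lines keeps every speaker turn intact, so each chunk can be synthesized on its own and the outputs concatenated in order.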
Example Prompts
Simple:
"Create a 3-minute podcast episode about remote work tips with two hosts"
Detailed:
"Create a 5-minute tech podcast episode. Hosts: Alex (enthusiastic, tech-savvy) and Jordan (skeptical, asks good questions). Topic: The future of AI assistants. Include upbeat electronic intro/outro music. Casual but informative tone."
With provided script:
"Turn this dialogue into a podcast episode with intro/outro music: Alex: Hey everyone, welcome back! Jordan: Today we're talking about..."