audio-cog

Audio Cog - AI Audio Generation Powered by CellCog

Create professional audio with AI - from voiceovers and narration to background music and sound design.

Prerequisites

This skill requires the CellCog mothership skill for SDK setup and API calls.

    clawhub install cellcog

Read the cellcog skill first for SDK setup. This skill shows you what's possible.

Quick pattern (v1.0+):

    # Fire-and-forget - returns immediately
    result = client.create_chat(
        prompt="[your audio request]",
        notify_session_key="agent:main:main",
        task_label="audio-task",
        chat_mode="agent"  # Agent mode is optimal for all audio tasks
    )
    # Daemon notifies you when complete - do NOT poll

---

What Audio You Can Create

Text-to-Speech / Voiceover

Convert text to natural-sounding speech:
  • Narration: "Generate a professional male voiceover for this product video script"
  • Audiobook Style: "Create an engaging narration of this short story with emotional delivery"
  • Podcast Intros: "Generate a warm, friendly podcast intro: 'Welcome to The Daily Tech...'"
  • E-Learning: "Create clear, instructional voiceover for this training module"
  • IVR/Phone Systems: "Generate professional phone menu prompts"

Available Voices

CellCog provides 8 high-quality voices with distinct characteristics:

| Voice | Gender | Best For | Characteristics |
| --- | --- | --- | --- |
| cedar | Male | Product videos, announcements | Warm, resonant, authoritative, trustworthy |
| marin | Female | Professional content, tutorials | Bright, articulate, emotionally agile |
| ballad | Male | Storytelling, flowing narratives | Smooth, melodic, musical quality |
| coral | Female | Energetic content, ads | Vibrant, lively, dynamic, spirited |
| echo | Male | Thoughtful content, documentaries | Calm, measured, deliberate |
| sage | Female | Educational, knowledge content | Wise, contemplative, reflective |
| shimmer | Female | Gentle content, wellness | Soft, gentle, soothing, approachable |
| verse | Male | Creative, artistic content | Poetic, rhythmic, expressive |

Voice Recommendations by Use Case

For product videos and announcements:
Use cedar (male) or marin (female) - both project confidence and professionalism.
For storytelling and audiobooks:
Use ballad (male) or sage (female) - designed for engaging, flowing narratives.
For high-energy content:
Use coral (female) - vibrant and dynamic, perfect for ads and exciting announcements.
For calm, educational content:
Use echo (male) or shimmer (female) - measured pacing ideal for learning.
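
The recommendations above can be captured as a small lookup table. This is only an illustrative sketch: `VOICE_RECOMMENDATIONS` and `recommend_voice` are hypothetical names, not part of the CellCog SDK, and the use-case keys are shorthand invented for this example.

```python
from typing import Optional

# Hypothetical lookup built from the recommendations above; not a CellCog API.
VOICE_RECOMMENDATIONS = {
    "product": {"male": "cedar", "female": "marin"},
    "storytelling": {"male": "ballad", "female": "sage"},
    "high_energy": {"female": "coral"},
    "educational": {"male": "echo", "female": "shimmer"},
}


def recommend_voice(use_case: str, gender: Optional[str] = None) -> str:
    """Return a recommended voice name, optionally filtered by gender."""
    options = VOICE_RECOMMENDATIONS.get(use_case)
    if options is None:
        raise ValueError(f"unknown use case: {use_case!r}")
    if gender is None:
        # No preference given: return the first voice listed for this use case.
        return next(iter(options.values()))
    if gender not in options:
        raise ValueError(f"no {gender} voice recommended for {use_case!r}")
    return options[gender]
```

The returned name is meant to be written into the prompt text, for example "using the marin voice".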

Voice Style Customization

Beyond selecting a voice, you can fine-tune delivery with style instructions:
  • Accent & dialect: American, British, Australian, Indian, etc.
  • Emotional range: Excited, serious, warm, mysterious, dramatic
  • Pacing: Slow and deliberate, conversational, fast and energetic
  • Special effects: Whispering, character impressions
Example with style instructions:
"Generate voiceover using cedar voice with a warm, conversational tone. Speak at medium pace with slight enthusiasm when mentioning features. American accent."
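
If you assemble such prompts programmatically, a small helper keeps the voice and style instructions consistent. The sketch below is one possible approach; `build_voiceover_prompt` and its parameters are hypothetical and simply produce prompt text in the shape shown above.

```python
from typing import Optional


def build_voiceover_prompt(
    script: str,
    voice: str,
    tone: Optional[str] = None,
    pace: Optional[str] = None,
    accent: Optional[str] = None,
) -> str:
    """Compose a voiceover request from a script, a voice name, and optional style hints."""
    lines = [
        f"Generate a voiceover using the {voice} voice for this script:",
        f"'{script}'",
    ]
    style = []
    if tone:
        style.append(f"{tone} tone")
    if pace:
        style.append(f"{pace} pace")
    if accent:
        style.append(f"{accent} accent")
    if style:
        # Only add a Style line when at least one hint was supplied.
        lines.append("Style: " + ", ".join(style) + ".")
    return "\n".join(lines)
```

The resulting string can be passed as the `prompt` argument of the quick pattern shown in Prerequisites.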

Music Generation

Create original background music and soundtracks:
  • Background Music: "Create calm lo-fi background music for a study video, 2 minutes"
  • Podcast Music: "Generate an upbeat intro jingle for a tech podcast, 15 seconds"
  • Video Soundtracks: "Create cinematic orchestral music for a product launch video"
  • Ambient/Mood: "Generate peaceful ambient sounds for a meditation app"
  • Genre-Specific: "Create energetic electronic music for a fitness video"

Music Specifications

| Parameter | Options |
| --- | --- |
| Duration | 15 seconds to 5+ minutes |
| Genre | Electronic, rock, classical, jazz, ambient, lo-fi, cinematic, pop, hip-hop |
| Tempo | 60 BPM (slow) to 180+ BPM (fast) |
| Mood | Upbeat, calm, dramatic, mysterious, inspiring, melancholic |
| Instruments | Piano, guitar, synth, strings, drums, brass, etc. |
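
These parameters are supplied as plain prompt text, so one way to keep them organized is a small data class that renders a prompt. This is a sketch under assumptions: `MusicSpec` and `to_prompt` are hypothetical names, and nothing here is part of the CellCog SDK.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MusicSpec:
    """Hypothetical container for the music parameters listed above."""

    duration_seconds: int
    genre: str
    mood: str
    tempo_bpm: Optional[int] = None  # None leaves the tempo unspecified
    instruments: List[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render the spec as a single-line music request."""
        if self.duration_seconds <= 0:
            raise ValueError("duration_seconds must be positive")
        parts = [
            f"Generate {self.duration_seconds} seconds of {self.mood} {self.genre} music."
        ]
        if self.tempo_bpm is not None:
            parts.append(f"{self.tempo_bpm} BPM.")
        if self.instruments:
            parts.append("Include " + ", ".join(self.instruments) + ".")
        return " ".join(parts)
```

For example, `MusicSpec(120, "lo-fi", "calm", 75, ["soft piano", "mellow beats"]).to_prompt()` yields a request similar to the background music example later in this document.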

Music Licensing

All AI-generated music from CellCog is royalty-free and fully yours to use commercially.
You have complete rights to use the generated music for:
  • YouTube videos (including monetized content)
  • Commercial projects and advertisements
  • Podcasts and streaming
  • Apps and games
  • Any other commercial or personal use
No attribution required. No licensing fees. The music is generated uniquely for you.

Audio Output Formats

| Format | Best For |
| --- | --- |
| MP3 | Standard audio delivery, voiceovers, music |
| Combined with video | Background music for video-cog outputs |

Chat Mode for Audio

Use chat_mode="agent" for all audio generation tasks.
Audio generation, whether voiceovers, music, or sound design, executes efficiently in agent mode. CellCog's audio capabilities don't require multi-angle deliberation; they require precise execution, which agent mode excels at.
There's no scenario where agent team mode provides meaningfully better audio output. Save agent team for research and complex creative work that benefits from multiple reasoning passes.
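
To avoid passing a different mode by accident, the arguments for an audio request can be built in one place. The sketch below is an assumption-laden illustration: `audio_chat_kwargs` is a hypothetical helper, and the keyword names and default values are copied from the quick pattern in Prerequisites rather than from any SDK reference.

```python
from typing import Any, Dict


def audio_chat_kwargs(
    prompt: str,
    task_label: str = "audio-task",
    notify_session_key: str = "agent:main:main",
) -> Dict[str, Any]:
    """Build keyword arguments for create_chat with chat_mode fixed to "agent"."""
    return {
        "prompt": prompt,
        "notify_session_key": notify_session_key,
        "task_label": task_label,
        "chat_mode": "agent",  # audio tasks always use agent mode
    }


# Usage, assuming `client` was set up as described in the cellcog skill:
# result = client.create_chat(**audio_chat_kwargs("Generate a 15-second upbeat podcast jingle"))
```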

Example Audio Prompts

Professional voiceover with specific voice:
"Generate a professional voiceover using the marin voice for this script:
'Introducing TaskFlow - the project management tool that actually works. With intelligent automation, seamless collaboration, and powerful analytics, TaskFlow helps teams do their best work.'
Style: Confident and friendly, medium pace. Suitable for a product launch video."

Podcast intro with voice selection:
"Create a podcast intro voiceover using cedar voice:
'Welcome to Future Forward, the podcast where we explore the technologies shaping tomorrow. I'm your host, and today we're diving into...'
Style: Warm and engaging, conversational tone. Also generate a 10-second upbeat intro music bed to go underneath."

Background music:
"Generate 2 minutes of calm, lo-fi hip-hop style background music. Should be chill and unobtrusive, good for studying or working. Include soft piano, mellow beats, and gentle vinyl crackle. 75 BPM."

Audiobook narration:
"Create an audiobook-style narration using ballad voice for this passage:
[passage text]
Style: Warm storytelling quality, measured pace with appropriate pauses for drama."

Cinematic music:
"Generate 90 seconds of cinematic orchestral music for a tech company's 'About Us' video. Start soft and inspiring, build to a confident crescendo, then resolve to a hopeful ending."

Multi-Language Support

CellCog can generate speech in 50+ languages:
  • English (multiple accents)
  • Spanish, French, German, Italian, Portuguese
  • Chinese (Mandarin, Cantonese)
  • Japanese, Korean
  • Hindi, Arabic
  • Russian, Polish, Dutch
  • And many more
Specify the language in your prompt:
"Generate this text in Japanese with a native female speaker using shimmer voice: 'いらっしゃいませ...'"

Tips for Better Audio

  1. Choose the right voice: Match the voice to your content type. Use cedar/marin for professional content, ballad/sage for storytelling, and coral for energy.
  2. Provide the complete script: Don't say "something about our product" - write out exactly what should be said.
  3. Include style instructions: Phrases such as "confident but warm", "slow and deliberate", or "with slight excitement" help shape delivery.
  4. For music: Specify duration, tempo (BPM if you know it), mood, and genre.
  5. Pronunciation guidance: For names or technical terms, add hints: "CellCog (pronounced SELL-kog)"
  6. Emotional beats: For longer voiceovers, indicate tone shifts: "[excited] And now for the big reveal... [serious] But there's a catch."
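
Tips 5 and 6 amount to simple text transformations on the script before it goes into the prompt. The helpers below are a minimal sketch; `add_pronunciation_hints` and `tag_emotional_beats` are hypothetical names, and the bracket and parenthesis formats just mirror the examples in the tips.

```python
from typing import Dict, List, Tuple


def add_pronunciation_hints(script: str, hints: Dict[str, str]) -> str:
    """Append a pronunciation hint after the first occurrence of each term."""
    for term, spoken in hints.items():
        script = script.replace(term, f"{term} (pronounced {spoken})", 1)
    return script


def tag_emotional_beats(segments: List[Tuple[str, str]]) -> str:
    """Join (tone, text) pairs into one script with bracketed tone markers."""
    return " ".join(f"[{tone}] {text}" for tone, text in segments)
```

Only the first occurrence of a term is annotated, so the hint is not repeated every time the name appears in a long script.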