Generate audio using `generate_media` with `mode="audio"`. Supports speech (TTS), music, and sound effects. ElevenLabs is preferred when available, with OpenAI as fallback.
```python
# Text-to-speech (auto-selects ElevenLabs if key available)
generate_media(prompt="Hello, welcome to our presentation!", mode="audio")

# With specific voice
generate_media(prompt="Hello!", mode="audio", voice="Rachel")

# Music generation (ElevenLabs only)
generate_media(prompt="Upbeat jazz piano with soft drums", mode="audio",
               audio_type="music", duration=30)

# Sound effects (ElevenLabs only)
generate_media(prompt="Thunder rolling across a mountain valley", mode="audio",
               audio_type="sound_effect", duration=5)
```
If ElevenLabs TTS fails, the system automatically falls back to OpenAI TTS.
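The fallback described above can be pictured as an ordered list of backends tried in turn. The sketch below is only an illustration of that pattern, not the system's actual implementation: `BackendError`, `synthesize_with_fallback`, and the two `fake_*` functions are hypothetical names standing in for real backend calls.

```python
from __future__ import annotations

from typing import Callable, Sequence


class BackendError(RuntimeError):
    """Hypothetical error raised when a TTS backend cannot produce audio."""


def synthesize_with_fallback(
    text: str,
    backends: Sequence[tuple[str, Callable[[str], bytes]]],
) -> tuple[str, bytes]:
    """Try each (name, synthesize) pair in order and return the first success."""
    errors: list[str] = []
    for name, synthesize in backends:
        try:
            return name, synthesize(text)
        except BackendError as exc:
            # Remember why this backend failed, then move on to the next one.
            errors.append(f"{name}: {exc}")
    raise BackendError("all backends failed: " + "; ".join(errors))


# Stand-ins for real backends (assumptions, for illustration only).
def fake_elevenlabs(text: str) -> bytes:
    raise BackendError("quota exceeded")


def fake_openai(text: str) -> bytes:
    return b"openai-audio:" + text.encode("utf-8")


used, audio = synthesize_with_fallback(
    "Hello!",
    [("elevenlabs", fake_elevenlabs), ("openai", fake_openai)],
)
print(used)
```

Ordering the list is what expresses the preference: the first backend is always tried first, and later ones are reached only when an earlier one raises.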
```python
# CORRECT: prompt = text to speak, instructions = how to speak it
generate_media(
    prompt="Welcome to the annual report presentation.",
    mode="audio",
    voice="alloy",
    instructions="warm, reflective tone with measured pacing",
    backend_type="openai"
)

# WRONG: Don't put style instructions in prompt
generate_media(prompt="Say this warmly: Welcome...", mode="audio")  # Bad!
```
`instructions` only works with OpenAI. ElevenLabs uses voice selection for tone.
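One way to keep style text out of `prompt` and avoid sending `instructions` to a backend that ignores it is to build the keyword arguments in a small helper. This is a minimal sketch under stated assumptions: `build_tts_kwargs` is a hypothetical helper, and the `"elevenlabs"` value for `backend_type` is assumed here (only `"openai"` appears in the examples above).

```python
from __future__ import annotations


def build_tts_kwargs(
    text: str,
    voice: str,
    backend_type: str,
    instructions: str | None = None,
) -> dict[str, str]:
    """Assemble keyword arguments for a generate_media TTS call."""
    kwargs = {
        "prompt": text,  # only the words to be spoken
        "mode": "audio",
        "voice": voice,
        "backend_type": backend_type,
    }
    # Style guidance is passed only to OpenAI; for other backends the
    # tone is determined by the chosen voice instead.
    if instructions and backend_type == "openai":
        kwargs["instructions"] = instructions
    return kwargs


openai_kwargs = build_tts_kwargs(
    "Welcome to the annual report presentation.",
    voice="alloy",
    backend_type="openai",
    instructions="warm, reflective tone with measured pacing",
)
# "elevenlabs" is an assumed backend_type value, used for illustration.
eleven_kwargs = build_tts_kwargs(
    "Welcome to the annual report presentation.",
    voice="Rachel",
    backend_type="elevenlabs",
    instructions="warm, reflective tone with measured pacing",
)
```

The resulting dictionaries could then be unpacked into the call, for example `generate_media(**openai_kwargs)`.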
Use
(not
) to analyze existing audio: