Generate speech from text using Google Gemini TTS models via scripts/. Use for text-to-speech, audio generation, voice synthesis, multi-speaker conversations, and creating audio content. Supports multiple voices and streaming. Triggers on "text to speech", "TTS", "generate audio", "voice synthesis", "speak this text".
npx skill4agent add akrindev/google-studio-skills gemini-tts

| Parameter | Description | Example |
|---|---|---|
| (positional) | Text to convert (required) | |
| `--voice` | Voice name | |
| `--output` | Base name for output file | |
| `--output-dir` | Output directory for audio | |
| `--no-timestamp` | Disable auto timestamp | Flag |
| | TTS model | |
| `--stream` | Enable streaming | Flag |
| `--speakers` | Multi-speaker mapping | |
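
Judging from the example paths in this document, the output-related parameters appear to combine into a path of the form `<output-dir>/<output>_YYYYMMDD_HHMMSS.wav`, with the timestamp dropped when `--no-timestamp` is set. The sketch below reproduces that naming scheme for illustration. `build_output_path` is a hypothetical helper, and the `audio` and `tts_output` defaults are inferred from the example paths rather than taken from the script's source.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def build_output_path(
    output: str = "tts_output",      # assumed default base name
    output_dir: str = "audio",       # assumed default directory
    timestamp: bool = True,          # False mirrors --no-timestamp
    now: Optional[datetime] = None,  # injectable clock, useful for testing
) -> Path:
    """Hypothetical helper: join directory, base name, and optional timestamp."""
    name = output
    if timestamp:
        stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
        name = f"{name}_{stamp}"
    return Path(output_dir) / f"{name}.wav"


if __name__ == "__main__":
    fixed = datetime(2025, 1, 2, 3, 4, 5)
    print(build_output_path("welcome", now=fixed))
    print(build_output_path("my-audio", timestamp=False))
```

Passing a fixed `now` makes the timestamp deterministic, which keeps the helper easy to test.
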

    python scripts/tts.py "Hello, world! Have a wonderful day."

Default voice: Kore. Output: `audio/tts_output_YYYYMMDD_HHMMSS.wav`

    python scripts/tts.py "Welcome to our podcast about technology trends" --voice Puck --output welcome

Output: `audio/welcome_YYYYMMDD_HHMMSS.wav`

    python scripts/tts.py "TTS the following conversation:
    Joe: How's it going today?
    Jane: Not too bad, how about you?
    Joe: I'm working on a new project.
    Jane: Sounds exciting, tell me more!" --speakers "Joe:Kore,Jane:Puck" --output conversation

Output: `audio/conversation_YYYYMMDD_HHMMSS.wav`

    python scripts/tts.py "This is a very long text that would benefit from streaming..." --stream --output long-form

Output: `audio/long-form_YYYYMMDD_HHMMSS.wav`

    python scripts/tts.py "Welcome to our quarterly earnings presentation. Today we'll discuss our growth metrics and future plans." --voice Charon --output voiceover

Voice: Charon

    python scripts/tts.py "Save to specific folder." --output-dir ./my-projects/podcasts/ --output episode1

Output: `./my-projects/podcasts/episode1_YYYYMMDD_HHMMSS.wav`

    # 1. Generate script (gemini-text skill)
    python skills/gemini-text/scripts/generate.py "Write a 2-minute podcast intro about sustainable energy"
    # 2. Generate audio (this skill)
    python scripts/tts.py "[Paste generated script]" --voice Fenrir --output podcast-intro
    # 3. Use in video or podcast

    python scripts/tts.py "Welcome to our accessible website. This audio describes our main navigation options." --voice Aoede --output accessibility

Voice: Aoede

    python scripts/tts.py "Chapter 1: Introduction to Quantum Computing. Let's explore the fundamental principles..." --voice Zephyr --output chapter1

Voice: Zephyr

    python scripts/tts.py "Fixed filename." --output my-audio --no-timestamp

Output: `audio/my-audio.wav`

| Model | Quality | Speed | Best For |
|---|---|---|---|
| | Good | Fast | General use, high volume |
| | Higher | Slower | Premium content, voiceovers |

| Voice | Characteristics | Best For |
|---|---|---|
| Kore | Clear, professional | Announcements, general purpose (default) |
| Puck | Friendly, conversational | Casual content, interviews |
| Charon | Deep, authoritative | Corporate, serious content |
| Fenrir | Warm, expressive | Storytelling, narratives |
| Aoede | Melodic, pleasant | Educational, accessibility |
| Zephyr | Light, airy | Gentle content, tutorials |
| Sulafat | Neutral, balanced | Documentaries, factual content |

| Specification | Value |
|---|---|
| Format | WAV (PCM) |
| Sample rate | 24000 Hz |
| Channels | 1 (mono) |
| Bit depth | 16-bit |
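
The four values in the table above map directly onto the parameters of a WAV header. As an illustration only (not the skill's own code), the sketch below uses Python's standard `wave` module to wrap raw 16-bit mono PCM at 24000 Hz in a WAV container and to compute a clip's duration from its byte length. The names `pcm_to_wav_bytes` and `pcm_duration_seconds` are hypothetical.

```python
import io
import wave

SAMPLE_RATE = 24000   # Hz, from the specification table
CHANNELS = 1          # mono
SAMPLE_WIDTH = 2      # bytes per sample (16-bit)


def pcm_to_wav_bytes(pcm: bytes) -> bytes:
    """Wrap raw PCM samples in a WAV container using the documented format."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH)
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


def pcm_duration_seconds(pcm: bytes) -> float:
    """Duration = bytes / (sample rate * channels * bytes per sample)."""
    return len(pcm) / (SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH)


if __name__ == "__main__":
    one_second_of_silence = b"\x00\x00" * SAMPLE_RATE
    wav_bytes = pcm_to_wav_bytes(one_second_of_silence)
    print(len(wav_bytes), pcm_duration_seconds(one_second_of_silence))
```

At these settings one second of audio occupies 48,000 bytes of PCM data, which gives a quick way to estimate file sizes.
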

| Limit | Type | Description |
|---|---|---|
| 8,192 | Input | Maximum input text tokens |
| 16,384 | Output | Maximum output audio tokens |
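
Text longer than the input limit has to be split before synthesis. Token counts depend on the model's tokenizer, so the sketch below uses a rough character budget instead. The ratio of 4 characters per token and the `split_text` helper are assumptions made for illustration and are not part of this skill.

```python
import re

MAX_INPUT_TOKENS = 8192   # input limit from the table above
CHARS_PER_TOKEN = 4       # assumption: rough average, not a real tokenizer


def split_text(text: str, max_tokens: int = MAX_INPUT_TOKENS) -> list[str]:
    """Greedily pack whole sentences into chunks under a character budget."""
    budget = max_tokens * CHARS_PER_TOKEN
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= budget or not current:
            # Fits, or a single oversized sentence that is kept whole.
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_text("One. Two! Three? Four.", max_tokens=3):
        print(chunk)
```

Each chunk could then be passed to a separate `scripts/tts.py` invocation. The output limit applies to each generated clip independently, so a conservative budget is the safer choice.
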
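
Multi-speaker mappings are passed with `--speakers` in the form `SpeakerName:VoiceName,Speaker2:Voice2`, for example `"Joe:Kore,Jane:Puck,Host:Charon"`.

The scripts depend on the `google-genai` package: `pip install google-genai`.

`--output` sets the base name of the output file.

| Voice | Ideal Use Cases |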
|---|---|
| Kore | Announcements, navigation, general info |
| Puck | Podcasts, interviews, casual content |
| Charon | Corporate, news, formal presentations |
| Fenrir | Audiobooks, stories, emotional content |
| Aoede | Accessibility, educational, gentle content |
| Zephyr | Tutorials, explanations, guides |
| Sulafat | Documentaries, factual presentations |
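
Any of the voices listed above can be assigned to a speaker label through `--speakers`, whose value takes the form `SpeakerName:VoiceName,Speaker2:Voice2`. A minimal sketch of parsing that string into a dictionary follows. `parse_speakers` is a hypothetical helper and may differ from how `scripts/tts.py` actually handles the argument.

```python
def parse_speakers(mapping: str) -> dict[str, str]:
    """Parse 'Name:Voice,Name2:Voice2' into {'Name': 'Voice', ...}."""
    speakers: dict[str, str] = {}
    for pair in mapping.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate trailing commas
        name, sep, voice = pair.partition(":")
        name, voice = name.strip(), voice.strip()
        if not sep or not name or not voice:
            raise ValueError(f"Expected 'Speaker:Voice', got {pair!r}")
        if name in speakers:
            raise ValueError(f"Duplicate speaker: {name!r}")
        speakers[name] = voice
    return speakers


if __name__ == "__main__":
    print(parse_speakers("Joe:Kore,Jane:Puck,Host:Charon"))
```

Rejecting malformed pairs early gives a clearer error than letting a bad mapping reach the API call.
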

    # Basic
    python scripts/tts.py "Your text here"
    # Custom voice
    python scripts/tts.py "Your text" --voice Puck --output audio.wav
    # Multi-speaker
    python scripts/tts.py "Joe: Hi. Jane: Hello!" --speakers "Joe:Kore,Jane:Puck"
    # Streaming
    python scripts/tts.py "Long text..." --stream --output long.wav
    # Professional
    python scripts/tts.py "Corporate announcement" --voice Charon

Reference: `references/voices.md`