Text-to-speech and speech-to-text using fal.ai audio models. Use when the user requests "Convert text to speech", "Transcribe audio", "Generate voice", "Speech to text", "TTS", "STT", or similar audio tasks.
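
npx skill4agent add fal-ai-community/skills fal-audio

Text-to-speech models:

| Model | Notes |
|---|---|
| `fal-ai/minimax/speech-2.6-hd` | Best quality |
| `fal-ai/minimax/speech-2.6-turbo` | Fast, good quality |
| `fal-ai/elevenlabs/eleven-v3` | Natural voices |
| `fal-ai/chatterbox/multilingual` | Multi-language, fast |
| | For video sync |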

Music generation models:

| Model | Notes |
|---|---|
| | Best quality |
| | Fast |
| `fal-ai/lyria2` | Google's model |
| | Song generation |
| | Instrumental |
| | Short clips |
| | Background music |

Speech-to-text models:

| Model | Features | Speed |
|---|---|---|
| `fal-ai/whisper` | Multi-language, timestamps | Fast |
| `fal-ai/elevenlabs/scribe` | Speaker diarization | Medium |
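
Choosing a text-to-speech model from the tables above can be sketched as a small shell lookup. This is only an illustration: the function name `pick_tts_model` and its keywords (`quality`, `natural`, `multilingual`, `fast`) are hypothetical and not part of the skill. The model IDs are the ones that appear in this document's examples.

```bash
# Hypothetical helper (not part of the fal-audio skill): map a rough
# requirement keyword to one of the TTS model IDs used in this document.
pick_tts_model() {
  case "$1" in
    quality)      echo "fal-ai/minimax/speech-2.6-hd" ;;
    natural)      echo "fal-ai/elevenlabs/eleven-v3" ;;
    multilingual) echo "fal-ai/chatterbox/multilingual" ;;
    fast|"")      echo "fal-ai/minimax/speech-2.6-turbo" ;;  # same model the basic example reports
    *)            echo "unknown requirement: $1" >&2; return 1 ;;
  esac
}
```

The result could then be passed to the script, for example `--model "$(pick_tts_model quality)"`.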

bash /mnt/skills/user/fal-audio/scripts/text-to-speech.sh [options]

Options:
- `--text`
- `--model` (default: `fal-ai/minimax/speech-2.6-turbo`)
- `--voice`

# Basic TTS (fast, good quality)
bash /mnt/skills/user/fal-audio/scripts/text-to-speech.sh \
--text "Hello, welcome to the future of AI."
# High quality with MiniMax HD
bash /mnt/skills/user/fal-audio/scripts/text-to-speech.sh \
--text "This is premium quality speech." \
--model "fal-ai/minimax/speech-2.6-hd"
# Natural voices with ElevenLabs
bash /mnt/skills/user/fal-audio/scripts/text-to-speech.sh \
--text "Natural sounding voice generation" \
--model "fal-ai/elevenlabs/eleven-v3"
# Multi-language TTS
bash /mnt/skills/user/fal-audio/scripts/text-to-speech.sh \
--text "Bonjour, bienvenue dans le futur." \
--model "fal-ai/chatterbox/multilingual"

bash /mnt/skills/user/fal-audio/scripts/speech-to-text.sh [options]

Options:
- `--audio-url`
- `--model` (default: `fal-ai/whisper`)
- `--language`

# Transcribe with Whisper
bash /mnt/skills/user/fal-audio/scripts/speech-to-text.sh \
--audio-url "https://example.com/audio.mp3"
# Transcribe with speaker diarization
bash /mnt/skills/user/fal-audio/scripts/speech-to-text.sh \
--audio-url "https://example.com/meeting.mp3" \
--model "fal-ai/elevenlabs/scribe"
# Transcribe specific language
bash /mnt/skills/user/fal-audio/scripts/speech-to-text.sh \
--audio-url "https://example.com/spanish.mp3" \
--language "es"

mcp__fal-ai__generate({
  modelId: "fal-ai/minimax/speech-2.6-turbo",
  input: {
    text: "Hello, welcome to the future of AI."
  }
})

mcp__fal-ai__generate({
  modelId: "fal-ai/whisper",
  input: {
    audio_url: "https://example.com/audio.mp3"
  }
})

Generating speech...
Model: fal-ai/minimax/speech-2.6-turbo
Speech generated!
Audio URL: https://v3.fal.media/files/abc123/speech.mp3
Duration: 5.2s

Transcribing audio...
Model: fal-ai/whisper
Transcription complete!
Text: "Hello, this is the transcribed text from the audio file."
Duration: 12.5s
Language: en

Here's the generated speech:
[Download audio](https://v3.fal.media/files/.../speech.mp3)
• Duration: 5.2s | Model: Maya TTS

Here's the transcription:
"Hello, this is the transcribed text from the audio file."
• Duration: 12.5s | Language: English

- `fal-ai/minimax/speech-2.6-hd`
- `fal-ai/minimax/speech-2.6-turbo`
- `fal-ai/elevenlabs/eleven-v3`
- `fal-ai/chatterbox/multilingual`
- `fal-ai/minimax-music/v2`
- `fal-ai/lyria2`
- `fal-ai/whisper`
- `fal-ai/elevenlabs/scribe`

Error: Generated audio is empty
Check that your text is not empty and contains valid content.

Error: Audio format not supported
Supported formats: MP3, WAV, M4A, FLAC, OGG
Convert your audio to a supported format.

Warning: Could not detect language, defaulting to English
Specify the language explicitly with the `--language` option.
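
Two of the errors above (empty text and unsupported audio format) could be caught before calling the scripts. The following is a minimal sketch of such pre-flight checks. The function names `is_supported_audio` and `has_text` are hypothetical and not part of the skill, and the check looks only at the file extension, not at the actual audio encoding.

```bash
# Hypothetical pre-flight checks (not part of the fal-audio scripts).

# Succeeds if the path or URL ends in one of the supported extensions
# listed above (MP3, WAV, M4A, FLAC, OGG), ignoring case.
is_supported_audio() {
  local path ext
  path="${1%%[?]*}"   # drop any URL query string
  ext=$(printf '%s' "${path##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    mp3|wav|m4a|flac|ogg) return 0 ;;
    *)                    return 1 ;;
  esac
}

# Succeeds if the text contains at least one non-whitespace character.
has_text() {
  case "$1" in
    *[![:space:]]*) return 0 ;;
    *)              return 1 ;;
  esac
}
```

For example, `is_supported_audio "https://example.com/audio.mp3" || echo "Convert your audio to a supported format."` would print nothing, because `.mp3` is in the supported list.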