Found 18 Skills
Build LiveKit Agent backends in TypeScript or JavaScript. Use this skill when creating voice AI agents, voice assistants, or any realtime AI application using LiveKit's Node.js Agents SDK (@livekit/agents-js). Covers AgentSession, Agent class, function tools with zod, STT/LLM/TTS models, turn detection, and realtime models.
Build LiveKit Agent backends in Python. Use this skill when creating voice AI agents, voice assistants, or any realtime AI application using LiveKit's Python Agents SDK (livekit-agents). Covers AgentSession, Agent class, function tools, STT/LLM/TTS models, turn detection, and multi-agent workflows.
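For the two LiveKit Agents skills above, a minimal sketch of the AgentSession pattern they describe, written in Python per the livekit-agents 1.x docs; the Deepgram/OpenAI/Cartesia plugin choices and model names are illustrative assumptions, not part of either skill:

```python
# Minimal voice agent, per the livekit-agents 1.x Python docs.
# Assumes the deepgram/openai/cartesia plugin packages are installed
# and provider API keys are set in the environment.
from livekit import agents
from livekit.agents import Agent, AgentSession
from livekit.plugins import cartesia, deepgram, openai


async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()  # join the room LiveKit assigned to this job

    session = AgentSession(
        stt=deepgram.STT(model="nova-3"),     # speech-to-text
        llm=openai.LLM(model="gpt-4o-mini"),  # dialogue model
        tts=cartesia.TTS(),                   # speech synthesis
    )
    await session.start(
        room=ctx.room,
        agent=Agent(instructions="You are a concise voice assistant."),
    )


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
```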
Remote voice via VoiceMode Connect. Use when users want to add voice to Claude Code from their phone or a web app, without local STT/TTS setup.
Production voice AI agents with sub-500ms latency. Groq LLM, Deepgram STT, Cartesia TTS, Twilio integration. No OpenAI. Use when: voice agent, phone bot, STT, TTS, Deepgram, Cartesia, Twilio, voice AI, speech to text, IVR, call center, voice latency.
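A sketch of the low-latency LLM leg such a pipeline relies on, using the Groq Python SDK with streaming so TTS can begin before the full reply arrives; the model name is illustrative, and the Deepgram/Cartesia/Twilio wiring is omitted:

```python
# Streaming chat completion via the Groq Python SDK (pip install groq,
# GROQ_API_KEY in the environment). Streaming keeps time-to-first-token
# low so TTS can start speaking before the reply finishes.
from groq import Groq

client = Groq()

stream = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # illustrative model choice
    messages=[
        {"role": "system", "content": "Answer in one short sentence."},
        {"role": "user", "content": "What are your support hours?"},
    ],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)  # hand each piece to TTS here
```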
ElevenLabs Speech-to-Text transcription workflows with Scribe v1 supporting 99 languages, speaker diarization, and Vercel AI SDK integration. Use when implementing audio transcription, building STT features, integrating speech-to-text, setting up Vercel AI SDK with ElevenLabs, or when user mentions transcription, STT, Scribe v1, audio-to-text, speaker diarization, or multi-language transcription.
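A minimal Scribe v1 sketch using the official elevenlabs Python SDK (the Vercel AI SDK route the skill also covers is TypeScript-side); the input filename and diarization flag are illustrative:

```python
# Scribe v1 transcription via the official elevenlabs Python SDK
# (pip install elevenlabs, ELEVENLABS_API_KEY in the environment).
from elevenlabs.client import ElevenLabs

client = ElevenLabs()

with open("meeting.mp3", "rb") as audio:  # illustrative input file
    transcript = client.speech_to_text.convert(
        file=audio,
        model_id="scribe_v1",  # Scribe v1 STT model
        diarize=True,          # label words by speaker
    )

print(transcript.text)
```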
Char (formerly Hyprnote) platform help: an open-source, bot-free, local-first AI meeting notepad with system audio capture, markdown output, a plugin SDK, and optional cloud STT/LLM (GPL-3.0). Use when setting up Char on macOS for the first time, troubleshooting speaker identification in group meetings, configuring local-only transcription with Cactus or Ollama for full offline use, choosing between Char's cloud STT providers (Deepgram, AssemblyAI, Soniox, OpenAI, etc.), diagnosing an app that won't launch or bounces in the dock without opening, weighing telemetry concerns with PostHog or Sentry in a local-first app, building a Char plugin or using the automation hooks system, comparing Char to Granola or Meetily or Fathom for privacy, or configuring the CLI for template management. Do NOT use for picking between note-takers generally (use /sales-note-taker) or reviewing a single call for coaching (use /sales-call-review).
Text-to-speech (TTS) and speech-to-text (STT) via Together AI. TTS models include Orpheus, Kokoro, Cartesia Sonic, Rime, and MiniMax, with REST, streaming, and WebSocket support. STT models include Whisper and Voxtral. Use when users need voice synthesis, audio generation, speech recognition, transcription, TTS, STT, or real-time voice applications.
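A TTS sketch against the Together Python SDK as I read its docs; the client.audio.speech.create call, the cartesia/sonic model id, and the voice name are assumptions to verify against current Together documentation:

```python
# TTS via the Together Python SDK (pip install together, TOGETHER_API_KEY
# in the environment). Model id and voice name are assumptions taken
# from memory of Together's docs; check the current model list.
from together import Together

client = Together()

speech = client.audio.speech.create(
    model="cartesia/sonic",
    input="Hello from Together AI.",
    voice="laidback woman",
)
speech.stream_to_file("speech.mp3")  # write the synthesized audio
```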
Transcribe audio to text with Whisper models via inference.sh CLI. Models: Fast Whisper Large V3, Whisper V3 Large. Capabilities: transcription, translation, multi-language, timestamps. Use for: meeting transcription, subtitles, podcast transcripts, voice notes. Triggers: speech to text, transcription, whisper, audio to text, transcribe audio, voice to text, stt, automatic transcription, subtitles generation, transcribe meeting, audio transcription, whisper ai
Use when implementing speech-to-text, audio transcription, real-time streaming STT, audio intelligence features, or voice AI using AssemblyAI APIs or SDKs. Use when user mentions AssemblyAI, voice agents, transcription, speaker diarization, PII redaction of audio, LLM Gateway for audio understanding, or applying LLMs to transcripts. Also use when building voice agents with LiveKit or Pipecat that need speech-to-text, or when the user is working with any audio/video processing pipeline that could benefit from transcription, even if they don't mention AssemblyAI by name.
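A minimal diarized-transcription sketch with the AssemblyAI Python SDK; the local filename is illustrative, and the SDK accepts URLs as well:

```python
# Transcription with speaker diarization via the AssemblyAI Python SDK
# (pip install assemblyai). The SDK uploads local files automatically.
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # or set ASSEMBLYAI_API_KEY

config = aai.TranscriptionConfig(speaker_labels=True)
transcript = aai.Transcriber().transcribe("meeting.mp3", config=config)

for utterance in transcript.utterances:
    print(f"Speaker {utterance.speaker}: {utterance.text}")
```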
Fast LLM inference with Groq API - chat, vision, audio STT/TTS, tool use. Use when: groq, fast inference, low latency, whisper, PlayAI TTS, Llama, vision API, tool calling, voice agents, real-time AI.
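For the audio STT piece, a sketch using Groq's OpenAI-compatible transcription endpoint via the groq Python package; the filename and model choice are illustrative:

```python
# Whisper transcription through Groq's OpenAI-compatible audio endpoint
# (pip install groq, GROQ_API_KEY in the environment).
from groq import Groq

client = Groq()

with open("call.wav", "rb") as audio:  # illustrative input file
    result = client.audio.transcriptions.create(
        file=("call.wav", audio.read()),
        model="whisper-large-v3",  # Groq-hosted Whisper
    )
print(result.text)
```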
ElevenLabs speech-to-text with Scribe models and forced alignment via inference.sh CLI. Models: Scribe v1/v2 (98%+ accuracy, 90+ languages). Capabilities: transcription, speaker diarization, audio event tagging, word-level timestamps, forced alignment, subtitle generation. Use for: meeting transcription, subtitles, podcast transcripts, lip-sync timing, karaoke. Triggers: elevenlabs stt, elevenlabs transcription, scribe, elevenlabs speech to text, forced alignment, word alignment, subtitle timing, diarization, speaker identification, audio event detection, eleven labs transcribe
Configure Home Assistant Assist voice control with pipelines, intents, wake words, and speech processing. Use when setting up voice control, creating custom intents, configuring TTS/STT, or building voice satellites. Activates on keywords: Assist, voice control, wake word, intent, sentence, TTS, STT, Piper, Whisper, Wyoming.
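Custom sentences and intent_script responses are typically YAML; as a sketch of the code side, here is a Python intent handler registered from a custom integration, following the pattern in Home Assistant's developer docs (the intent name and spoken response are invented for illustration):

```python
# Python side of a custom Assist intent, registered from a custom
# integration, per Home Assistant's developer docs. The matching
# sentences that trigger it would live in YAML.
from homeassistant.helpers import intent


class GetTemperatureIntent(intent.IntentHandler):
    intent_type = "GetTemperature"  # must match the sentence trigger

    async def async_handle(self, intent_obj):
        response = intent_obj.create_response()
        response.async_set_speech("It is 21 degrees inside.")
        return response


async def async_setup(hass, config):
    intent.async_register(hass, GetTemperatureIntent())
    return True
```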