Search Results: speech-to-text

Found 87 Skills

open-gstack-browser

Launch GStack Browser — AI-controlled Chromium with the sidebar extension baked in. Opens a visible browser window where you can watch every action in real time. The sidebar shows a live activity feed and chat. Anti-bot stealth built in. Use when asked to "open gstack browser", "launch browser", "connect chrome", "open chrome", "real browser", "launch chrome", "side panel", or "control my browser". Voice triggers (speech-to-text aliases): "show me the browser".

🇺🇸|EnglishTranslated

Tools & Utilitiesgarrytan/gstack

make-pdf

Turn any markdown file into a publication-quality PDF. Proper 1in margins, intelligent page breaks, page numbers, cover pages, running headers, curly quotes and em dashes, clickable TOC, diagonal DRAFT watermark. Not a draft artifact — a finished artifact. Use when asked to "make a PDF", "export to PDF", "turn this markdown into a PDF", or "generate a document". (gstack) Voice triggers (speech-to-text aliases): "make this a pdf", "make it a pdf", "export to pdf", "turn this into a pdf", "turn this markdown into a pdf", "generate a pdf", "make a pdf from", "pdf this markdown".

🇺🇸|EnglishTranslated

14 scripts/Attention

AI & Machine Learningframersai/agentos-skills

vosk

Fully offline speech-to-text via the Vosk library — streaming recognition, 16 kHz PCM, no network required after model download.

🇺🇸|EnglishTranslated

Backend Developmentteam-telnyx/skills

telnyx-voice-streaming-javascript

Stream call audio in real-time, fork media to external destinations, and transcribe speech live. Use for real-time analytics and AI integrations. This skill provides JavaScript SDK examples.

🇺🇸|EnglishTranslated

Tools & Utilitiesyzfly/douyin-mcp-server

douyin-video

Watermark-free Douyin video download and transcript extraction tool. Retrieve watermark-free video download links from Douyin share links, download videos, extract voice transcripts from videos and automatically save them to files. Applicable scenarios include obtaining Douyin video information, downloading watermark-free videos, and batch extracting video transcripts. Triggered when users need to process Douyin video links or extract video content.

🇨🇳|ChineseTranslated

1 scripts/Attention

AI & Machine Learningagentiveau/myagentive

deepgram-transcription

Transcribe audio and video files using the Deepgram API. This skill should be used when the user requests transcription of audio files (mp3, wav, m4a, aac) or video files (mp4, mov, avi, etc.). Handles large video files by extracting audio first to reduce upload size and processing time.

🇺🇸|EnglishTranslated

1 scripts/Checked

AI & Machine Learningcnemri/google-genai-skill...

speech-use

Generate (TTS), Transcribe (STT), and Clone voices using Google's GenAI and Cloud Speech SDKs. Supports Gemini-TTS, Chirp 3, and Instant Custom Voice.

🇺🇸|EnglishTranslated

3 scripts/Checked

AI & Machine Learningteam-telnyx/skills

telnyx-ai-inference-python

Access Telnyx LLM inference APIs, embeddings, and AI analytics for call insights and summaries. This skill provides Python SDK examples.

🇺🇸|EnglishTranslated

AI & Machine Learningdeepgram/skills

examples

Find working Deepgram integration examples with third-party platforms and frameworks. Use whenever someone wants to integrate Deepgram with Twilio, LiveKit, LangChain, Vercel AI SDK, Discord, Vonage, Pipecat, Expo, FastAPI, Cloudflare Workers, Slack, Telegram, LlamaIndex, Zoom, Next.js, Nuxt, Django, SvelteKit, NestJS, Spring Boot, CrewAI, Riverside, SignalWire, and more. Examples are full runnable integration demos, not minimal feature snippets.

🇺🇸|EnglishTranslated

AI & Machine Learningbytedance/agentkit-sample...

byted-voice-to-text

Automatic Speech Recognition (ASR). Uses Volcano Engine BigModel ASR for speech recognition, with two available modes: Express Edition (≤2h/100MB, synchronous fast response) and Standard Edition (≤5h, asynchronous recognition). It supports Feishu voice messages, local audio files and audio URLs. Use this skill when you receive voice messages or audio attachments (.ogg/.mp3/.wav).

🇨🇳|ChineseTranslated

5 scripts/Attention

AI & Machine Learningmembranedev/application-s...

assemblyai

AssemblyAI integration. Manage Transcripts, Speakers, Jobs. Use when the user wants to interact with AssemblyAI data.

🇺🇸|EnglishTranslated

AI & Machine Learningteam-telnyx/skills

telnyx-ai-inference-curl

Access Telnyx LLM inference APIs, embeddings, and AI analytics for call insights and summaries. This skill provides REST API (curl) examples.

🇺🇸|EnglishTranslated