Search Results: audio-transcription

Found 34 Skills

AI & Machine Learning | agntswrm/agent-media

audio-transcribe

Transcribes audio to text with timestamps and optional speaker identification. Use when you need to convert speech to text, create subtitles, transcribe meetings, or process voice recordings.

🇺🇸 | English | Translated

AI & Machine Learning | trpc-group/trpc-agent-go

whisper

Transcribe audio files to text using OpenAI Whisper

🇺🇸 | English | Translated

1 script / Checked

Tools & Utilities | sickn33/antigravity-aweso...

audio-transcriber

Transform audio recordings into professional Markdown documentation with intelligent summaries using LLM integration

🇺🇸 | English | Translated

3 scripts / Attention

AI & Machine Learning | cinience/alicloud-skills

aliyun-qwen-asr

Use when transcribing non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`). Use when converting recorded audio files to text, generating transcripts with timestamps, or documenting DashScope/OpenAI-compatible ASR request and response fields.

🇺🇸 | English | Translated

1 script / Checked

AI & Machine Learning | 958877748/skills

groq-stt

Transcribe audio files using Groq API (Whisper models). Use when user needs to transcribe audio to text.

🇺🇸 | English | Translated

1 script / Attention

AI & Machine Learning | marswaveai/skills

asr

Transcribe audio files to text using local speech recognition. Triggers on: "转录" ("transcribe"), "transcribe", "语音转文字" ("speech to text"), "ASR", "识别音频" ("recognize audio"), "把这段音频转成文字" ("convert this audio to text").

🇺🇸 | English | Translated

Frontend Development | syncfusion/blazor-ui-comp...

syncfusion-blazor-speech-to-text

Implement speech-to-text voice input in Blazor applications using the Syncfusion SpeechToText component. ALWAYS use this when users need voice input, speech recognition, audio transcription, or help implementing the SpeechToText component in Blazor. Trigger for Syncfusion.Blazor.Inputs, microphone input, voice-to-text conversion, language support, transcript binding, listening states, error handling, the browser speech API, or any speech recognition requirements.

🇺🇸 | English | Translated

AI & Machine Learning | maxgent-ai/maxgent-plugin

audio-transcribe

Speech-to-text transcription using Whisper with word-level timestamps. Use when users ask to transcribe audio or video to text, generate subtitles, or recognize speech.

🇺🇸 | English | Translated

1 script / Checked

Tools & Utilities | malue-ai/dazee-small

screenpipe

AI screen memory — search everything you've seen or heard on your computer. Integrates with Screenpipe's local MCP server for OCR text, audio transcripts, and app usage history.

🇨🇳 | Chinese | Translated

AI & Machine Learning | cinience/alicloud-skills

alicloud-ai-audio-asr

Transcribe non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`). Use when converting recorded audio files to text, generating transcripts with timestamps, or documenting DashScope/OpenAI-compatible ASR request and response fields.

🇺🇸 | English | Translated

1 script / Checked

AI & Machine Learning | samhvw8/dot-claude

ai-multimodal

Multimodal AI processing via Google Gemini API (2M tokens context). Capabilities: audio (transcription, 9.5hr max, summarization, music analysis), images (captioning, OCR, object detection, segmentation, visual Q&A), video (scene detection, 6hr max, YouTube URLs, temporal analysis), documents (PDF extraction, tables, forms, charts), image generation (text-to-image, editing). Actions: transcribe, analyze, extract, caption, detect, segment, generate from media. Keywords: Gemini API, audio transcription, image captioning, OCR, object detection, video analysis, PDF extraction, text-to-image, multimodal, speech recognition, visual Q&A, scene detection, YouTube transcription, table extraction, form processing, image generation, Imagen. Use when: transcribing audio/video, analyzing images/screenshots, extracting data from PDFs, processing YouTube videos, generating images from text, implementing multimodal AI features.

🇺🇸 | English | Translated

6 scripts / Attention

AI & Machine Learning | aahl/skills

qwen-asr

Transcribe audio files using Qwen ASR. Use when the user sends voice messages and wants them converted to text.

🇺🇸 | English | Translated

1 script / Checked