Search Results: audio-transcription

Found 42 Skills

chough

Fast ASR CLI tool for transcribing audio/video files. Use when user wants to transcribe audio/video, generate subtitles (VTT), convert speech to text with timestamps (JSON), or optimize transcription for low memory.

AI & Machine Learning | trpc-group/trpc-agent-go

whisper

Transcribe audio files to text using OpenAI Whisper

1 script / Checked

AI & Machine Learning | noizai/skills

speech-to-text

Use this skill whenever the user wants to transcribe audio to text, convert speech to text, or get a transcript from an audio or video file. Triggers include: any mention of 'transcribe', 'transcription', 'speech to text', 'STT', 'convert audio to text', 'what does this audio say', 'get transcript', 'subtitle generation', or requests to extract spoken words from a file. Also use when the user wants speaker identification from audio, timestamps for captions, or multilingual transcription.

1 script / Checked

AI & Machine Learning | aia-11-hn-mib/mib-mockint...

gemini-video-understanding

Analyze videos using Google's Gemini API - describe content, answer questions, transcribe audio with visual descriptions, reference timestamps, clip videos, and process YouTube URLs. Supports 9 video formats, multiple models (Gemini 2.5/2.0), and context windows up to 2M tokens (6 hours of video).

AI & Machine Learning | alphaonedev/openclaw-grap...

openai-whisper-api

OpenAI Whisper API: audio transcription, translation, structured output, large file handling

AI & Machine Learning | anthropics/knowledge-work...

scribe

Reference skill for Zoom AI Services Scribe. Use after routing to a transcription workflow when handling uploaded or stored media, Build-platform JWT auth, fast mode transcription, batch jobs, or transcript pipeline design.

AI & Machine Learning | jianshuo/claude-skills

wjs-transcribing-audio

Use when the user has audio or video and wants a timestamped transcript (SRT) in the source language. Routes by source language — Chinese defaults to Volcano (豆包) ASR; other languages (Spanish, English, Portuguese, French, Italian, Japanese, Korean, etc.) use OpenAI Whisper API with word-level timestamps and self-assembled cues. Outputs SRT with punctuation-bounded cues capped for on-screen reading. Triggers — "转写", "转成字幕", "做 SRT", "transcribe", "make subtitles", "speech to text", "出字幕".

AI & Machine Learning | badlogic/pi-skills

transcribe

Speech-to-text transcription using Groq Whisper API. Supports m4a, mp3, wav, ogg, flac, webm.

1 script / Checked

AI & Machine Learning | 958877748/skills

groq-stt

Transcribe audio files using Groq API (Whisper models). Use when user needs to transcribe audio to text.

1 script / Attention

AI & Machine Learning | marswaveai/skills

asr

Transcribe audio files to text using local speech recognition. Triggers on: "转录", "transcribe", "语音转文字", "ASR", "识别音频", "把这段音频转成文字".

AI & Machine Learning | maxgent-ai/maxgent-plugin

audio-transcribe

Speech-to-text transcription using Whisper with word-level timestamps. Use when users ask to transcribe audio or video to text, generate subtitles, or recognize speech.

1 script / Checked

Tools & Utilities | nateherkai/hyperframes-st...

hyperframes-cli

HyperFrames CLI tool — hyperframes init, lint, preview, render, transcribe, tts, doctor, browser, info, upgrade, compositions, docs, benchmark. Use when scaffolding a project, linting or validating compositions, previewing in the studio, rendering to video, transcribing audio, generating TTS, or troubleshooting the HyperFrames environment.
