Loading...
Loading...
Found 19 Skills
Transcribe video files directly into timed transcripts and subtitle-ready artifacts using hosted Whisper video-to-text. Use this when the input is a video and the goal is speech extraction, caption generation, or edit-prep timing.
Package an existing talking-head / interview / podcast video by layering timed, designed GRAPHIC OVERLAY cards onto the playing video — titles, lower-thirds, data callouts, quotes, side panels, picture-in-picture — synced to the transcript. The source video plays in full; the agent designs and writes each card's HTML in conversation, then renders to MP4 via hyperframes. Use when the user asks for graphic overlays, on-screen graphics / lower-thirds / data callouts / kinetic titles on a video, "package / dress up my video", "add overlay cards / graphic cards", or AI-composed graphic packaging of an existing video. NOT for plain subtitles (→ embedded-captions) or building a video from scratch (→ the creation workflows); when unsure overlays-vs-captions, see /hyperframes-read-first.
Transcription of oral broadcast videos and recognition of speech errors. Generate review drafts and deletion task lists. Trigger words: edit oral broadcast, process video, recognize speech errors
Extract, transcribe, and translate YouTube video transcripts using the YouTubeTranscript.dev V2 API. Supports captions, ASR audio transcription, batch processing (up to 100 videos), translation to 100+ languages, and multiple output formats. Use when working with YouTube videos, subtitles, captions, or video-to-text conversion.
Transcribe audio and video files using the Deepgram API. This skill should be used when the user requests transcription of audio files (mp3, wav, m4a, aac) or video files (mp4, mov, avi, etc.). Handles large video files by extracting audio first to reduce upload size and processing time.
Video processing toolkit. Use when user wants to: - Download videos from YouTube or other sites - Remove silence from videos - Trim, cut, or extract segments from videos - Extract audio from video files - Enhance or denoise audio - Replace audio track in a video - Change video playback speed - Concatenate multiple videos - Generate transcripts/captions (VTT) - Generate video descriptions, timestamps, or context cards - Upload videos to YouTube or Bunny.net CDN - Post social updates to X (Twitter) or LinkedIn - Get video metadata (duration, resolution, codec)
Generate structured video transcripts from local files or video URLs using Gemini Files API. Use when a GitHub or Linear tracker item, comment, or attachment includes a screen recording, .mov, .mp4, or tracker-hosted video and you need a <video-transcripts> block instead of hand-written notes.
Adds visual descriptions to transcripts by extracting and analyzing video frames with ffmpeg. Creates visual transcript with periodic visual descriptions of the video clip. Use when all files have audio transcripts present (transcript) but don't yet have visual transcripts created (visual_transcript).
Spoken video transcription and slip-of-the-tongue recognition. Generate review drafts and deletion task checklists. Trigger phrases: edit spoken video, process video, recognize slip-of-the-tongue
Extract full transcripts from video content for analysis, summarization, note-taking, or research. Use when the user wants a written version of video content, asks to "transcribe this", "get the text from this video", "convert video to text", or shares a video URL for content extraction.
Video generation and transcription workflows via the Venice.ai API.
Use this skill when the user says phrases like "get transcript", "transcribe video", "extract script", "help me extract it", "what does this video say", "what did this blogger say", or directly provides a video link requesting content extraction. Even if the user only sends a video link without stating their request, proactively trigger this skill if the context involves benchmark analysis or content extraction. Call video2text.py to obtain the raw transcript, use AI to correct common speech recognition errors, identify the author, and archive it to the benchmark blogger directory. Do NOT trigger for: analyzing viral content patterns (use li-analyzer), recording own topic ideas (use li-recorder), writing own scripts (use li-writer). Use when the user wants to "get transcript", "transcribe video", "extract script", or gives a video link for content extraction. Runs speech-to-text, AI proofreads, and archives to benchmark blogger directory.