Loading...
Loading...
Found 1,612 Skills
Reference — AVFoundation audio APIs, AVAudioSession categories/modes, AVAudioEngine pipelines, bit-perfect DAC output, iOS 26+ spatial audio capture, ASAF/APAC, Audio Mix with Cinematic framework
This skill should be used when users need to download audio or music from online platforms like YouTube, SoundCloud, Spotify, or other streaming services. It provides yt-dlp and spotdl command templates for high-quality audio extraction, playlist downloads, metadata embedding, and multi-platform support.
Best practices for building integrations with NetBox REST and GraphQL APIs. Use when building NetBox API integrations, reviewing integration code, troubleshooting NetBox performance issues, planning automation architecture, writing scripts that interact with NetBox, using pynetbox, configuring Diode for data ingestion, or implementing NetBox webhooks.
FFmpeg CLI reference for video and audio processing, format conversion, filtering, and media automation. Use when converting video formats, resizing or cropping video, trimming by time, replacing or extracting audio, mixing audio tracks, overlaying text or images, burning subtitles, creating GIFs, generating thumbnails, building slideshows, changing playback speed, encoding with H264/H265/VP9, setting CRF/bitrate, using GPU acceleration, creating storyboards, or running ffprobe. Covers filter_complex, stream selectors, -map, -c copy, seeking, scale, pad, crop, concat, drawtext, zoompan, xfade.
Analyze videos using Google's Gemini API - describe content, answer questions, transcribe audio with visual descriptions, reference timestamps, clip videos, and process YouTube URLs. Supports 9 video formats, multiple models (Gemini 2.5/2.0), and context windows up to 2M tokens (6 hours of video).
Generate high-quality voiceover audio with ElevenLabs. Includes word-level timestamps for video sync. Use when: creating demo narration, video voiceover, podcast intros, or any TTS need. Keywords: voiceover, TTS, text to speech, ElevenLabs, narration, audio, timestamps.
Use this skill when the user requests to generate, create, or produce podcasts from text content. Converts written content into a two-host conversational podcast audio format with natural dialogue.
Enter the Visual Studio Developer environment in the current PowerShell session via the VsDevShell module (MSBuild/CL toolchain env vars).
Edit images with Alibaba Cloud Model Studio Qwen Image Edit Max (qwen-image-edit-max). Use when modifying existing images (inpaint, replace, style transfer, local edits), preserving subject consistency, or documenting image edit request/response mappings.
Minimal image editing smoke test for Model Studio Qwen Image Edit.
Minimal video generation smoke test for Model Studio Wan Video.
Read, watch, and listen to video/audio files. Use Gemini for native video understanding, or extract key frames + Whisper transcription as fallback. Use when a user sends a video/audio and asks about its content, what's in it, what someone said, etc.