Search Results: media-processing

Found 25 Skills

Product & Design | mrgoonie/claudekit-skills

aesthetic

Create aesthetically beautiful interfaces following proven design principles. Use when building UI/UX, analyzing designs from inspiration sites, generating design images with ai-multimodal, implementing visual hierarchy and color theory, adding micro-interactions, or creating design documentation. Includes workflows for capturing and analyzing inspiration screenshots with chrome-devtools and ai-multimodal, iterative design image generation until aesthetic standards are met, and comprehensive design system guidance covering BEAUTIFUL (aesthetic principles), RIGHT (functionality/accessibility), SATISFYING (micro-interactions), and PEAK (storytelling) stages. Integrates with chrome-devtools, ai-multimodal, media-processing, ui-styling, and web-frameworks skills.

Frontend Development | samhvw8/dot-claude

aesthetic

Visual design intelligence and UI aesthetics. Integrates: chrome-devtools, ai-multimodal, media-processing. Capabilities: design analysis, visual hierarchy, color theory, typography, micro-interactions, animation, design systems, accessibility. Actions: analyze, design, create, capture, evaluate, implement UI aesthetics. Keywords: Dribbble, Behance, Mobbin, design inspiration, visual hierarchy, color palette, typography, spacing, animation, micro-interaction, design system, style guide, accessibility, WCAG, contrast ratio, golden ratio, whitespace, visual rhythm. Use when: building beautiful UIs, analyzing design inspiration, implementing visual hierarchy, adding animations/micro-interactions, creating design systems, evaluating aesthetic quality, capturing design screenshots.

Tools & Utilities | bobmatnyc/claude-mpm-skil...

media-transcoding

FFmpeg-based media transcoding workflows with preset-driven conversions, batch processing, and safe backups for web/mobile/archive outputs.

Tools & Utilities | agntswrm/agent-media

image-extend

Extends an image canvas by adding padding on all sides with a solid background color. Use when you need to add borders, margins, or expand the canvas area around an image.

Data Processing | barefootford/buttercut

analyze-video

Adds visual descriptions to transcripts by extracting and analyzing video frames with ffmpeg. Creates a visual transcript with periodic visual descriptions of the video clip. Use when all files already have audio transcripts (transcript) but do not yet have visual transcripts (visual_transcript).

1 script / Checked

Tools & Utilities | dkyazzentwatwa/chatgpt-sk...

video-to-gif

Convert video clips to optimized GIFs with speed control, cropping, text overlays, and file size optimization. Create perfect GIFs for social media, documentation, and presentations.

1 script / Checked

Tools & Utilities | factory-ai/factory-plugin...

compose

Background knowledge for droid-control workflows; not invoked directly. Video assembly via Remotion: title cards, layout, transitions, effects, and showcase polish.

AI & Machine Learning | mrgoonie/claudekit-skills

ai-multimodal

Process and generate multimedia content using the Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis, up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), and generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.

6 scripts / Attention

Tools & Utilities | sundial-org/awesome-openc...

ffmpeg-video-editor

Generate FFmpeg commands from natural-language video editing requests: cut, trim, convert, compress, change aspect ratio, extract audio, and more.

Tools & Utilities | dkyazzentwatwa/chatgpt-sk...

video-thumbnail-extractor

Extract frames from videos at specific timestamps or intervals, find best frames, and generate thumbnail grids for previews.

1 script / Checked

Frontend Development | buainoai/remotion-skills

remotion-best-practices

Remotion best practices: create videos with React.

AI & Machine Learning | binhmuc/autobot-review

ai-multimodal

Process and generate multimedia content using the Google Gemini API for better vision capabilities. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis, up to 9.5 hours), understanding images (better image analysis than Claude models, captioning, reasoning, object detection, design extraction, OCR, visual Q&A, segmentation, handling multiple images), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), generating images (text-to-image with Imagen 4, editing, composition, refinement), and generating videos (text-to-video with Veo 3, 8-second clips with native audio). Use when working with audio/video files, analyzing images or screenshots (in place of Claude's default vision capabilities, falling back to them only if needed), processing PDF documents, extracting structured data from media, creating images/videos from text prompts, or implementing multimodal AI features. Supports Gemini 3/2.5, Imagen 4, and Veo 3 models with context windows up to 2M tokens.

7 scripts / Attention