Loading...
Loading...
Found 118 Skills
Process media files (video, audio, images, documents) using Transloadit. Use when asked to encode video to HLS/MP4, generate thumbnails, resize or watermark images, extract audio, concatenate clips, add subtitles, OCR documents, or run any media processing pipeline. Covers 86+ processing robots for file transformation at scale.
PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen). Use when you need to generate music from text descriptions, create sound effects, or perform melody-conditioned music generation.
dontbesilent Execution Diagnosis. Diagnose the real reason behind your 'know what to do but fail to act' using the Adlerian psychology framework. Triggers: /dbs-unblock, /self-check, 'I know what to do but can't do it', 'why do I always procrastinate' Execution block diagnosis using Adlerian psychology framework. Trigger: /dbs-unblock, "I know what to do but can't do it", "why do I procrastinate"
Azure AI Vision Image Analysis SDK for captions, tags, objects, OCR, people detection, and smart cropping. Use for computer vision and image understanding tasks. Triggers: "image analysis", "computer vision", "OCR", "object detection", "ImageAnalysisClient", "image caption".
Socratic questioning protocol + user communication. MANDATORY for complex requests, new features, or unclear requirements. Includes progress reporting and error handling.
Process, convert, OCR, extract, redact, sign, and fill documents using the Nutrient DWS API. Works with PDFs, DOCX, XLSX, PPTX, HTML, and images.
MCP server with 39 tools for Word, Excel, PowerPoint, PDF, OCR operations
Guides structured ideation through Socratic questioning to explore problems, opportunities, and solutions. Use when brainstorming features, exploring use cases, or thinking through new ideas.
Extract tables from PDFs and images to CSV or Excel. Support for scanned documents with OCR, multi-page PDFs, and complex table structures.
Convert various file formats (PDF, Office documents, images, audio, web content, structured data) to Markdown optimized for LLM processing. Use when converting documents to markdown, extracting text from PDFs/Office files, transcribing audio, performing OCR on images, extracting YouTube transcripts, or processing batches of files. Supports 20+ formats including DOCX, XLSX, PPTX, PDF, HTML, EPUB, CSV, JSON, images with OCR, and audio with transcription.
Socratic mentoring for junior developers and AI newcomers. Guides through questions, never answers. Triggers: "help me understand", "explain this code", "I'm stuck", "Im stuck", "I'm confused", "Im confused", "I don't understand", "I dont understand", "can you teach me", "teach me", "mentor me", "guide me", "what does this error mean", "why doesn't this work", "why does not this work", "I'm a beginner", "Im a beginner", "I'm learning", "Im learning", "I'm new to this", "Im new to this", "walk me through", "how does this work", "what's wrong with my code", "what's wrong", "can you break this down", "ELI5", "step by step", "where do I start", "what am I missing", "newbie here", "junior dev", "first time using", "how do I", "what is", "is this right", "not sure", "need help", "struggling", "show me", "help me debug", "best practice", "too complex", "overwhelmed", "lost", "debug this", "/socratic", "/hint", "/concept", "/pseudocode". Progressive clue systems, teaching techniques, and success metrics.
Implement computer vision features including text recognition (OCR), face detection, barcode scanning, image segmentation, object tracking, and document scanning in iOS apps. Covers both the modern Swift-native Vision API (iOS 16+) and legacy VNRequest patterns, VisionKit DataScannerViewController for live camera scanning, and VNCoreMLRequest for custom model inference. Use when adding OCR, barcode scanning, face detection, or custom Core ML model inference with Vision.