Found 167 Skills
Extract text from PDFs for LLM consumption. Use when processing PDFs for RAG, document analysis, or text extraction. Supports API services (Mistral OCR) and local tools (PyMuPDF, pdfplumber). Handles text-based PDFs, tables, and scanned documents with OCR.
Process documents with the Nutrient DWS API. Use this skill when the user wants to convert documents (PDF, DOCX, XLSX, PPTX, HTML, images), extract text or tables from PDFs, OCR scanned documents, redact sensitive information (PII, SSN, emails, credit cards), add watermarks, digitally sign PDFs, fill PDF forms, or check API credit usage. Activates on keywords: PDF, document, convert, extract, OCR, redact, watermark, sign, merge, compress, form fill, document processing.
Comprehensive PDF processing API for conversion, merge, split, compress, OCR, and more
Interactive Swift/iOS tutor using Socratic method. Use when studying Swift concepts, learning iOS development, or wanting guided explanations with questions.
Guidance for processing financial documents (invoices, receipts, statements) with OCR and text extraction. This skill should be used when tasks involve extracting data from financial PDFs or images, generating summaries (CSV/JSON), or moving/organizing processed documents. Emphasizes data safety practices to prevent catastrophic data loss.
Socratic coach for breaking down problems to fundamental truths. Use when users want to think through a problem deeply, challenge assumptions, or find innovative solutions. Triggers on requests like "help me think through this", "let's break this down", "what are my blind spots", "I'm stuck on a problem", "challenge my assumptions", or explicit requests for first-principles thinking.
Extract vendor, date, items, amounts, and total from receipt images using OCR and pattern matching with structured JSON output.
Parse PDF into Markdown/JSON/DOCX using MinerU API. Extract text, tables, formulas with OCR support. Use when converting PDF documents, extracting content from scanned papers, or batch processing PDF files.
Implement computer vision features including text recognition (OCR), face detection, barcode scanning, image segmentation, object tracking, and document scanning in iOS apps. Covers both the modern Swift-native Vision API (iOS 16+) and legacy VNRequest patterns, VisionKit DataScannerViewController for live camera scanning, and VNCoreMLRequest for custom model inference. Use when adding OCR, barcode scanning, face detection, or custom Core ML model inference with Vision.
Extract text from images using OCR. Use when the user shares a screenshot and you need to read the text content, copy UI labels, or extract copy from a design mockup.
Analyze, describe, and extract information from images using the MiniMax vision MCP tool. Use when: user shares an image file path or URL (any message containing .jpg, .jpeg, .png, .gif, .webp, .bmp, or .svg file extension) or uses any of these words/phrases near an image: "analyze", "analyse", "describe", "explain", "understand", "look at", "review", "extract text", "OCR", "what is in", "what's in", "read this image", "see this image", "tell me about", "explain this", "interpret this", in connection with an image, screenshot, diagram, chart, mockup, wireframe, or photo. Also triggers for: UI mockup review, wireframe analysis, design critique, data extraction from charts, object detection, person/animal/activity identification. Triggers: any message with an image file extension (jpg, jpeg, png, gif, webp, bmp, svg), or any request to analyze/describe/understand/review/extract text from an image, screenshot, diagram, chart, photo, mockup, or wireframe.
Extract contact information from business card images using OCR - name, company, email, phone, address.