Search Results: ocr

Found 167 Skills

extracting-pdf-text

Extract text from PDFs for LLM consumption. Use when processing PDFs for RAG, document analysis, or text extraction. Supports API services (Mistral OCR) and local tools (PyMuPDF, pdfplumber). Handles text-based PDFs, tables, and scanned documents with OCR.

4 scripts | Checked

Document Processing | pspdfkit-labs/nutrient-ag...

nutrient-document-processing

Process documents with the Nutrient DWS API. Use this skill when the user wants to convert documents (PDF, DOCX, XLSX, PPTX, HTML, images), extract text or tables from PDFs, OCR scanned documents, redact sensitive information (PII, SSN, emails, credit cards), add watermarks, digitally sign PDFs, fill PDF forms, or check API credit usage. Activates on keywords: PDF, document, convert, extract, OCR, redact, watermark, sign, merge, compress, form fill, document processing.

Document Processing | vm0-ai/vm0-skills

pdf4me

Comprehensive PDF processing API for conversion, merging, splitting, compression, OCR, and more.

Mobile Development | itlearning/study-ios

swift-study

Interactive Swift/iOS tutor using Socratic method. Use when studying Swift concepts, learning iOS development, or wanting guided explanations with questions.

Document Processing | letta-ai/skills

financial-document-processor

Guidance for processing financial documents (invoices, receipts, statements) with OCR and text extraction. This skill should be used when tasks involve extracting data from financial PDFs or images, generating summaries (CSV/JSON), or moving/organizing processed documents. Emphasizes data safety practices to prevent catastrophic data loss.

Tools & Utilities | akshat10/skills

first-principles-thinking

Socratic coach for breaking down problems to fundamental truths. Use when users want to think through a problem deeply, challenge assumptions, or find innovative solutions. Triggers on requests like "help me think through this", "let's break this down", "what are my blind spots", "I'm stuck on a problem", "challenge my assumptions", or explicit requests for first-principles thinking.

Data Processing | dkyazzentwatwa/chatgpt-sk...

receipt-scanner

Extract vendor, date, items, amounts, and total from receipt images using OCR and pattern matching with structured JSON output.

1 script | Checked

Document Processing | nebutra/mineru-skill

mineru

Parse PDF into Markdown/JSON/DOCX using MinerU API. Extract text, tables, formulas with OCR support. Use when converting PDF documents, extracting content from scanned papers, or batch processing PDF files.

Mobile Development | dpearson2699/swift-ios-sk...

vision-framework

Implement computer vision features including text recognition (OCR), face detection, barcode scanning, image segmentation, object tracking, and document scanning in iOS apps. Covers both the modern Swift-native Vision API (iOS 16+) and legacy VNRequest patterns, VisionKit DataScannerViewController for live camera scanning, and VNCoreMLRequest for custom model inference. Use when adding OCR, barcode scanning, face detection, or custom Core ML model inference with Vision.

Tools & Utilities | pascalorg/skills

image-to-text

Extract text from images using OCR. Use when the user shares a screenshot and you need to read the text content, copy UI labels, or extract copy from a design mockup.

2 scripts | Checked

AI & Machine Learning | minimax-ai/skills

vision-analysis

Analyze, describe, and extract information from images using the MiniMax vision MCP tool. Use when the user shares an image file path or URL (any message containing a .jpg, .jpeg, .png, .gif, .webp, .bmp, or .svg file extension), or uses any of these words or phrases in connection with an image, screenshot, diagram, chart, mockup, wireframe, or photo: "analyze", "analyse", "describe", "explain", "understand", "look at", "review", "extract text", "OCR", "what is in", "what's in", "read this image", "see this image", "tell me about", "explain this", "interpret this". Also triggers for UI mockup review, wireframe analysis, design critique, data extraction from charts, object detection, and person/animal/activity identification. Triggers: any message with an image file extension (jpg, jpeg, png, gif, webp, bmp, svg), or any request to analyze, describe, understand, review, or extract text from an image, screenshot, diagram, chart, photo, mockup, or wireframe.

AI & Machine Learning | dkyazzentwatwa/chatgpt-sk...

business-card-scanner

Extract contact information (name, company, email, phone, address) from business card images using OCR.

1 script | Checked