Search Results: ocr

Found 112 Skills

pdf

Use this skill whenever the user wants to do anything with PDF files. This includes reading or extracting text/tables from PDFs, combining or merging multiple PDFs into one, splitting PDFs apart, rotating pages, adding watermarks, creating new PDFs, filling PDF forms, encrypting/decrypting PDFs, extracting images, and OCR on scanned PDFs to make them searchable. If the user mentions a .pdf file or asks to produce one, use this skill.

141.0k

8 scripts/Checked

AI & Machine Learning | microsoft/agent-skills

azure-ai-vision-imageanalysis-py

Azure AI Vision Image Analysis SDK for captions, tags, objects, OCR, people detection, and smart cropping. Use for computer vision and image understanding tasks. Triggers: "image analysis", "computer vision", "OCR", "object detection", "ImageAnalysisClient", "image caption".

Data Processing | dkyazzentwatwa/chatgpt-sk...

ocr-document-processor

Extract text from images and scanned PDFs using OCR. Supports 100+ languages, table detection, structured output (markdown/JSON), and batch processing.

1 script/Checked

Document Processing | claude-office-skills/skil...

office-mcp

MCP server with 39 tools for Word, Excel, PowerPoint, PDF, and OCR operations.

Document Processing | affaan-m/everything-claud...

nutrient-document-processing

Process, convert, OCR, extract, redact, sign, and fill documents using the Nutrient DWS API. Works with PDFs, DOCX, XLSX, PPTX, HTML, and images.

AI & Machine Learning | aktsmm/agent-skills

ocr-super-surya

GPU-optimized OCR using Surya. Use when: (1) Extracting text from images/screenshots, (2) Processing PDFs with embedded images, (3) Multi-language document OCR, (4) Layout analysis and table detection. Supports 90+ languages with 2x accuracy over Tesseract.

1 script/Checked

Data Processing | dkyazzentwatwa/chatgpt-sk...

table-extractor

Extract tables from PDFs and images to CSV or Excel. Support for scanned documents with OCR, multi-page PDFs, and complex table structures.

1 script/Checked

Tools & Utilities | xyuanbuilds/my_skills

ocr

Extract text from images using OCR. Use when the user needs to read text from screenshots, photos, or image files.

5 scripts/Attention

Document Processing | pspdfkit-labs/nutrient-ag...

nutrient-document-processing

Process documents with the Nutrient DWS API. Use this skill when the user wants to convert documents (PDF, DOCX, XLSX, PPTX, HTML, images), extract text or tables from PDFs, OCR scanned documents, redact sensitive information (PII, SSN, emails, credit cards), add watermarks, digitally sign PDFs, fill PDF forms, or check API credit usage. Activates on keywords: PDF, document, convert, extract, OCR, redact, watermark, sign, merge, compress, form fill, document processing.

Document Processing | letta-ai/skills

financial-document-processor

Guidance for processing financial documents (invoices, receipts, statements) with OCR and text extraction. This skill should be used when tasks involve extracting data from financial PDFs or images, generating summaries (CSV/JSON), or moving/organizing processed documents. Emphasizes data safety practices to prevent catastrophic data loss.

Mobile Development | dpearson2699/swift-ios-sk...

vision-framework

Implement computer vision features including text recognition (OCR), face detection, barcode scanning, image segmentation, object tracking, and document scanning in iOS apps. Covers both the modern Swift-native Vision API (iOS 16+) and legacy VNRequest patterns, VisionKit DataScannerViewController for live camera scanning, and VNCoreMLRequest for custom model inference. Use when adding OCR, barcode scanning, face detection, or custom Core ML model inference with Vision.

AI & Machine Learning | vamseeachanta/workspace-h...

document-rag-pipeline

Build complete document knowledge bases with PDF text extraction, OCR for scanned documents, vector embeddings, and semantic search. Use this for creating searchable document libraries from folders of PDFs, technical standards, or any document collection.
