Search Results: ocr

Found 112 Skills

Document Processing | existential-birds/beagle

docling

Docling document parser for PDF, DOCX, PPTX, HTML, images, and 15+ formats. Use when parsing documents, extracting text, converting to Markdown/HTML/JSON, chunking for RAG pipelines, or batch processing files. Triggers on DocumentConverter, convert, convert_all, export_to_markdown, HierarchicalChunker, HybridChunker, ConversionResult.


Document Processing | affaan-m/everything-claud...

visa-doc-translate

Translate visa application documents (images) to English and create a bilingual PDF containing the original and the translation.


AI & Machine Learning | charleswiltgen/axiom

axiom-vision-ref

Vision framework API, VNDetectHumanHandPoseRequest, VNDetectHumanBodyPoseRequest, person segmentation, face detection, VNImageRequestHandler, recognized points, joint landmarks, VNRecognizeTextRequest, VNDetectBarcodesRequest, DataScannerViewController, VNDocumentCameraViewController, RecognizeDocumentsRequest


Document Processing | lawvable/awesome-legal-sk...

pdf-processing-anthropic

Toolkit for comprehensive PDF manipulation, including extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. Use to fill in a PDF form or programmatically process, generate, or analyze PDF documents at scale.


8 scripts | Checked

Document Processing | claude-office-skills/skil...

layout-analyzer


Document Processing | krishagel/geoffrey

pdf-to-markdown

Convert a PDF to clean Markdown, with image content described as text. Use when the user wants to convert a PDF to Markdown, extract content from a PDF, or prepare PDF content for AI tools.


1 script | Checked

Code Quality | alibaba/open-code-review

open-code-review

Performs AI-powered code review on Git changes using the `ocr` CLI from alibaba/open-code-review. Use when the user asks to review code, a pull request, staged or unstaged changes, or a commit, or to compare branches for code quality issues. Produces line-level review comments and can automatically apply fixes on request. With appropriate review rules, it can detect bugs, security vulnerabilities, performance problems, and other code quality concerns.


AI & Machine Learning | imsus/pi-extension-minima...

minimax-image-understanding

Analyze images using AI with the understand_image tool


Tools & Utilities | blessonism/openclaw-searc...

mineru-extract

Use the official MinerU (mineru.net) parsing API to convert a URL (an HTML page such as a WeChat article, or a direct PDF/Office/image link) into clean Markdown plus structured outputs. Use when web_fetch or the browser can't access the page or extracts messy content, and you want higher-fidelity parsing (layout, tables, formulas, OCR).


2 scripts | Checked

Document Processing | rightnow-ai/openfang

pdf-reader

PDF content extraction and analysis specialist


Document Processing | run-llama/llamaparse-agen...

liteparse

Use this skill when the user asks to parse an unstructured file (PDF, DOCX, PPTX, XLSX, images, etc.), convert it between document formats, or extract its text with spatial layout, all locally without cloud dependencies.


AI & Machine Learning | minimax-ai/skills

vision-analysis

Analyze, describe, and extract information from images using the MiniMax vision MCP tool. Use when the user shares an image file path or URL (any message containing a .jpg, .jpeg, .png, .gif, .webp, .bmp, or .svg file extension) or uses any of these words or phrases in connection with an image, screenshot, diagram, chart, mockup, wireframe, or photo: "analyze", "analyse", "describe", "explain", "understand", "look at", "review", "extract text", "OCR", "what is in", "what's in", "read this image", "see this image", "tell me about", "explain this", "interpret this". Also triggers for: UI mockup review, wireframe analysis, design critique, data extraction from charts, object detection, person/animal/activity identification. Triggers: any message with an image file extension (jpg, jpeg, png, gif, webp, bmp, svg), or any request to analyze, describe, understand, review, or extract text from an image, screenshot, diagram, chart, photo, mockup, or wireframe.
