Search Results: ocr

Found 203 Skills

image-to-text

Extract text from images using OCR. Use when the user shares a screenshot and you need to read the text content, copy UI labels, or extract copy from a design mockup.

2 scripts/Checked

AI & Machine Learning | minimax-ai/skills

vision-analysis

Analyze, describe, and extract information from images using the MiniMax vision MCP tool. Use when: user shares an image file path or URL (any message containing .jpg, .jpeg, .png, .gif, .webp, .bmp, or .svg file extension) or uses any of these words/phrases near an image: "analyze", "analyse", "describe", "explain", "understand", "look at", "review", "extract text", "OCR", "what is in", "what's in", "read this image", "see this image", "tell me about", "explain this", "interpret this", in connection with an image, screenshot, diagram, chart, mockup, wireframe, or photo. Also triggers for: UI mockup review, wireframe analysis, design critique, data extraction from charts, object detection, person/animal/activity identification. Triggers: any message with an image file extension (jpg, jpeg, png, gif, webp, bmp, svg), or any request to analyze/describe/understand/review/extract text from an image, screenshot, diagram, chart, photo, mockup, or wireframe.

AI & Machine Learning | letta-ai/skills

extracting-pdf-text

Extract text from PDFs for LLM consumption. Use when processing PDFs for RAG, document analysis, or text extraction. Supports API services (Mistral OCR) and local tools (PyMuPDF, pdfplumber). Handles text-based PDFs, tables, and scanned documents with OCR.

4 scripts/Checked

AI & Machine Learning | sanyuan0704/code-review-e...

sigma

Personalized 1-on-1 AI tutor using Bloom's 2-Sigma mastery learning. Guides users through any topic with Socratic questioning, adaptive pacing, and rich visual output (HTML dashboards, Excalidraw concept maps, generated images). Use when user wants to learn something, study a topic, understand a concept, requests tutoring, says 'teach me', 'I want to learn', 'explain X to me step by step', 'help me understand', or invokes /sigma. Triggers on: learn, study, teach, tutor, understand, master, explain step by step.

AI & Machine Learning | yeachan-heo/oh-my-claudec...

deep-interview

Socratic deep interview with mathematical ambiguity gating before autonomous execution

Document Processing | coroboros/agent-skills

markitdown

Convert any document to Markdown with Microsoft's `markitdown` CLI — PDF, Word, Excel, PowerPoint, HTML, CSV, JSON, XML, ZIP, EPub, images (OCR/EXIF), audio (transcription), and YouTube URLs. Use whenever the user wants to extract text from a binary document, transcribe audio, OCR an image, scrape a YouTube transcript, or pre-process a file for an LLM context window — even when they just say "convert this pdf", "what's in this docx", "transcribe this mp3", or "get the text out of this".

1 script/Attention

Tools & Utilities | letta-ai/skills

code-from-image

Guide for extracting code or pseudocode from images using OCR and implementing it correctly. This skill should be used when tasks involve reading code, pseudocode, or algorithms from images (PNG, JPG, screenshots) and executing or implementing the extracted logic.

Document Processing | microck/ordinary-claude-s...

markitdown

Convert various file formats (PDF, Office documents, images, audio, web content, structured data) to Markdown optimized for LLM processing. Use when converting documents to markdown, extracting text from PDFs/Office files, transcribing audio, performing OCR on images, extracting YouTube transcripts, or processing batches of files. Supports 20+ formats including DOCX, XLSX, PPTX, PDF, HTML, EPUB, CSV, JSON, images with OCR, and audio with transcription.

AI & Machine Learning | dkyazzentwatwa/chatgpt-sk...

business-card-scanner

Extract contact information (name, company, email, phone, address) from business card images using OCR.

1 script/Checked

AI & Machine Learning | qwencloud/qwencloud-ai

qwencloud-vision

[QwenCloud] Understand images and videos with Qwen vision models. TRIGGER when: user wants to analyze, describe, or extract information from images or videos, OCR text extraction, chart/table reading, visual reasoning, multi-image comparison, screenshot understanding, video comprehension, or explicitly invokes this skill by name (e.g. use qwencloud-vision). DO NOT TRIGGER when: user wants to generate/create images (use qwencloud-image-generation), generate videos (use qwencloud-video-generation), text-only tasks without visual input, or non-Qwen vision tasks.

6 scripts/Checked

Mobile Development | itlearning/study-ios

swift-study

Interactive Swift/iOS tutor using Socratic method. Use when studying Swift concepts, learning iOS development, or wanting guided explanations with questions.

Data Processing | dkyazzentwatwa/chatgpt-sk...

receipt-scanner

Extract vendor, date, items, amounts, and total from receipt images using OCR and pattern matching with structured JSON output.

1 script/Checked