Loading...
Loading...
Found 112 Skills
Use when the user needs PDF generation, manipulation, form filling, table extraction, OCR, merging, splitting, watermarking, or metadata handling. Trigger conditions: generate PDF reports, extract text or tables from PDFs, fill PDF forms programmatically, merge or split PDF files, add watermarks, OCR scanned documents, read or write PDF metadata, convert HTML to PDF.
Extract text and metadata from PDF files using pdf-parse. Use when: user uploads a PDF or asks to read/analyze PDF content. NOT for: creating PDFs, editing PDFs, or OCR on scanned documents.
Score and compare images using vision LLMs as judges. YAML-defined criteria presets for 11 use cases (text-to-image, photorealism, document OCR, charts, UI, portrait, product, scientific, invoice, alt-text, artistic style). Supports OpenAI, Anthropic, Gemini, Mistral, and OpenRouter as judge providers. Keys auto-decrypted via SOPS + age.
Return public original model architecture diagrams for user-specified LLM, VLM, MoE, diffusion, OCR, and SGLang/sgl-cookbook model families. Use when the user asks for a model structure chart, architecture diagram, or rendered image link for a specific model such as DeepSeek, GLM, Qwen, Kimi, MiniMax, Step, Hunyuan, or Qwen3-VL.
Straightforward text extraction from document files (text-based PDF only for now, no OCR or docx). Use when you just need to read/extract text from binary documents.
AI screen memory — search everything you've seen or heard on your computer. Integrates with Screenpipe's local MCP server for OCR text, audio transcripts, and app usage history.
MCP server for computer use & browser automation - screenshot, OCR, click, type, find_text, Chrome/Electron CDP, template matching on macOS, Windows & Android
Generate professional Hebrew documents including PDF, DOCX, and PPTX with full RTL support and proper Hebrew typography. Use when user asks to create Hebrew PDF, generate Israeli business documents, "lehafik heshbonit", "litstor hozeh", build Hebrew Word document, create Hebrew PowerPoint, or produce Israeli templates such as Heshbonit Mas (tax invoice), Hozeh (contract), Hatza'at Mechir (proposal), or Protokol (meeting minutes). Covers reportlab, WeasyPrint, python-docx, and pptxgenjs with bidi paragraph support. Do NOT use for OCR or reading existing documents (use hebrew-ocr-forms instead).
Advanced document parsing with PaddleOCR. Returns complete document structure including text, tables, formulas, charts, and layout information. Claude extracts relevant content based on user needs.
[Document Processing] Convert PDF files to Markdown with support for native text PDFs and scanned documents (OCR). Cross-platform.
Use this skill when users need to extract text from images, PDFs, or documents. Supports URLs and local files. Returns structured JSON containing recognized text.
>