Loading...
Loading...
Found 112 Skills
Handles fetching, reading, and summarizing official Indian corporate filings (BSE/NSE). Specialized in OCR for scanned PDFs.
OCR Web Service integration. Manage Documents. Use when the user wants to interact with OCR Web Service data.
Extract frames from video files using ffmpeg for AI/LLM analysis. Use when (1) the user asks to analyze, describe, or summarize a video file, (2) the user wants to extract frames or screenshots from a video, (3) the user provides a video file (.mp4, .mov, .avi, .mkv, .webm, etc.) and asks questions about its visual content, (4) the user wants to identify scenes, objects, or events in a video, (5) the user wants timestamps overlaid on extracted frames for temporal reference. Converts video into JPEG frames that can be attached to LLM prompts as images. Requires ffmpeg on PATH. Supports scene-change detection, model-aware optimization (Claude/OpenAI/Gemini), quality presets (efficient/balanced/detailed/ocr), grayscale and high-contrast OCR mode, and automatic FPS calculation via --max-frames.
Extract text from images using OCR and vision AI with the performOCR() high-level API or the full VisionPipeline.
Process scanned documents and images containing Chuukese text using OCR with specialized post-processing for accent characters and traditional formatting. Use when working with scanned books, documents, or images that contain Chuukese text that needs to be digitized.
图片分析与识别,可分析本地图片、网络图片、视频、文件。适用于 OCR、物体识别、场景理解等。当用户发送图片或要求分析图片时必须使用此技能。
Use when OCR-specialized extraction is needed with Alibaba Cloud Model Studio Qwen OCR models (`qwen-vl-ocr`, `qwen-vl-ocr-latest`, and snapshots), including document parsing, table parsing, multilingual OCR, formula recognition, and key information extraction.
Extracts text, tables, and images from PDFs (including scanned PDFs) using the Mistral OCR API. Use when user asks to OCR a PDF/image, extract text from a PDF, parse a scanned document, convert a PDF to Markdown, or extract structured fields from a document.
Analyze images — segment objects, detect, run OCR, describe, and answer visual questions via fal.ai vision models.
Extract text from images using Tesseract OCR
This skill should be used when the user needs to convert documents between formats (Office to PDF, PDF to images, image to PDF), perform PDF operations (merge, split, rotate, encrypt, decrypt), or run OCR on scanned documents. Uses local free tools — LibreOffice, ghostscript, pdftk, tesseract, and imagemagick — with no API key required. Trigger when the user says "convert this document", "export to PDF", "merge PDFs", "split PDF", "rotate PDF", "OCR this scan", "convert PPTX to PDF", "convert DOCX to PDF", or any document format conversion request.
PaddleOCR text recognition skill using PP-OCRv5 lightweight model. Supports natural scene and complex document text detection and recognition. Use when user needs OCR text extraction from images.