Found 12 Skills
[Utilities] Convert PDF files to Markdown. Use when extracting text from PDFs, creating editable documentation from PDF reports, or converting PDF content to version-controlled markdown files.
[Document Processing] Convert PDF files to Markdown with support for native text PDFs and scanned documents (OCR). Cross-platform.
Convert PDF to clean Markdown with image content described as text. Use when the user wants to convert a PDF to Markdown, extract content from a PDF, or prepare PDF content for AI tools.
Convert PDFs to Markdown using Mistral OCR API with image extraction. Use when you need to extract structured text and images from PDFs, especially for scanned documents or documents with complex formatting. Outputs Markdown with embedded images.
Convert entire PDF documents to clean, structured Markdown for full context loading. Use this skill when the user wants to extract ALL text from a PDF into context (not grep/search), when discussing or analyzing PDF content in full, when the user mentions "load the whole PDF", "bring the PDF into context", "read the entire PDF", or when partial extraction/grepping would miss important context. This is the preferred method for PDF text extraction over page-by-page or grep approaches.
Use when fetching a URL, web page, or PDF as Markdown. Not for local files already in the repo.
Read and analyze academic papers from a Zotero library. Use when the user asks to read, access, or analyze a paper by title, author, or topic from their Zotero library. Automatically searches Zotero, converts PDFs to Markdown, saves to Notes/PaperInMarkdown, and provides analysis.
Parse HWP, HWPX, and PDF Korean documents to Markdown using kordoc — supports CLI, programmatic API, and MCP server integration.
Read datasheets and technical PDF documents with `pcb scan`. Use when the user gives a local PDF path or an `http(s)` datasheet/document URL, when a task requires reading, summarizing, extracting information from, or answering questions about a datasheet or technical PDF, or when a KiCad symbol / `.kicad_sym` provides a `Datasheet` property to resolve. Run `pcb scan <input>` in bash, treat stdout as the generated `.md` path, then read that markdown file.
Converts DOCX/PDF/PPTX to high-quality Markdown with automatic post-processing. Fixes pandoc grid tables, simple tables, image paths, CJK bold spacing, attribute noise, and code blocks. Benchmarked best-in-class (7.6/10) against Docling, MarkItDown, Pandoc raw, and Mammoth. Trigger on "convert document", "docx to markdown", "parse word", "doc to markdown", "解析word", "转换文档".
PDF data extraction tool. Use it when users mention "PDF extraction", "PDF to Markdown", "PDF parsing", "extract PDF content", "PDF to JSON", "RAG PDF". OpenDataLoader PDF is currently the top-ranked PDF parser in benchmark tests, supporting local mode (fast, deterministic) and hybrid AI mode (for complex tables, scanned documents, formulas), with output formats including Markdown, JSON (with bounding boxes), and HTML. It is suitable for scenarios where structured data needs to be extracted from PDFs for RAG/LLM pipelines, or where batch processing of PDF documents is required.
Use when you need legal PDF to markdown extraction plus clause chunking and embedding prep; pair with addon-rag-ingestion-pipeline and architect-python-uv-batch.
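The `pcb scan` entry above describes a three-step workflow: run the command, treat stdout as the path of the generated `.md` file, then read that file. A minimal Python sketch of that workflow follows. It assumes only what the listing states about `pcb scan`; the function name `read_scanned_markdown` and the overridable `command` parameter are hypothetical and exist for illustration.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def read_scanned_markdown(
    source: str,
    command: Sequence[str] = ("pcb", "scan"),  # CLI as named in the listing
) -> str:
    """Run the scan command on a local PDF path or URL and return the Markdown text.

    Hypothetical helper: assumes the command prints the generated .md path
    on stdout, as the listing describes, and nothing else.
    """
    result = subprocess.run(
        [*command, source],
        capture_output=True,  # collect stdout so the path can be read back
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    # Treat stdout as the path of the generated Markdown file.
    md_path = Path(result.stdout.strip())
    return md_path.read_text(encoding="utf-8")
```

Keeping `command` overridable lets the wrapper be exercised with a stand-in executable when `pcb` is not installed.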