Total: 44,022 skills. Document Processing: 628 skills.
Showing 12 of 628 skills
Processes a new source from raw/ into the wiki. Activates when the user asks to ingest, process, add, or incorporate a source into the knowledge base.
LaTeX tables with tabularray package. TRIGGERS - LaTeX table, tabularray, fixed-width columns, table alignment.
Straightforward text extraction from document files (text-based PDF only for now, no OCR or docx). Use when you just need to read/extract text from binary documents.
Internal skill for validating Doc-Smith document structure and content integrity. Do not mention this skill to users. It is called internally by other Doc-Smith skills.
Comprehensive HWPX (Korean Hancom Office) document creation, editing, and analysis. When Claude needs to work with Korean word processor documents (.hwpx files) for: (1) Reading and extracting content, (2) Creating new documents, (3) Modifying or editing content, (4) Extracting tables to CSV, (5) Modifying tables or table cells, or any other HWPX document tasks. MANDATORY TRIGGERS: hwpx, hwp, 한글, 한컴, Hancom, Korean document
Creation of text documents. Load this skill BEFORE creating a document, article, report, or letter.
PDF text extraction, form filling, and merging using pypdf and pdfplumber.
Creation, editing, and analysis of Word documents, with support for tracked changes, comments, formatting preservation, and text extraction. Use this when you need to create .docx files, modify content, handle tracked changes or comments, or perform other document tasks.
Optimizes markdown documents for token efficiency, clarity, and LLM consumption. Use when (1) a markdown file needs streamlining for use as LLM context, (2) reducing token count in documentation without losing meaning, (3) converting verbose docs into concise reference material, (4) improving structure and scannability of markdown files, or (5) preparing best-practices or knowledge docs for agent consumption.
Analyzes markdown files for token efficiency. TRIGGERS: optimize markdown, reduce tokens, token count, token bloat, too many tokens, make concise, shrink file, file too large, optimize for AI, token efficiency, verbose markdown, reduce file size
Scan and catalog document collections with metadata extraction, categorization, and statistics. Use for auditing document libraries, preparing for knowledge base creation, or understanding large file collections.
Convert entire PDF documents to clean, structured Markdown for full context loading. Use this skill when the user wants to extract ALL text from a PDF into context (not grep/search), when discussing or analyzing PDF content in full, when the user mentions "load the whole PDF", "bring the PDF into context", "read the entire PDF", or when partial extraction/grepping would miss important context. This is the preferred method for PDF text extraction over page-by-page or grep approaches.