Total 50,472 skills, Document Processing has 738 skills
Use this skill when the user asks to parse an unstructured file (PDF, DOCX, PPTX, XLSX, images, etc.), convert it between document formats, or extract its text with spatial layout information, all locally without cloud dependencies.
Crawl entire websites using the Cloudflare Browser Rendering /crawl API. Initiates async crawl jobs, polls for completion, and saves results as markdown files. Useful for ingesting documentation sites, knowledge bases, or any web content into your project context. Requires the CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_API_TOKEN environment variables.
Refine speech transcripts (interviews, speeches, podcasts, meetings) into more readable article paragraphs. Trigger this skill when users mention terms like "subtitle refinement", "transcript polish", "subtitle polishing", "organize video subtitles into articles", or "interview text organization", when they ask to process interview records, optimize transcribed text, or organize speech-to-text output, or when they need long dialogue or speech text turned into a readable article. It suits transcripts of solo speeches or multi-person conversations where the original sentences and wording must be retained rather than summarized at a high level. This skill should also be triggered when users only say "help me organize this text" and attach obviously colloquial text.
Create, edit, and convert Word documents (.docx) using Syncfusion DocIO. Can generate Java code for the user's project. Use when the user mentions docx, Word processing, document generation, Syncfusion DocIO, or syncfusion java word.
Provides memory file management, including auditing, quality assessment, and targeted improvements, for files such as CLAUDE.md. Use when the user asks to check, audit, update, improve, fix, maintain, or validate project memory files. Also triggers for "project memory optimization", "CLAUDE.md quality check", "documentation review", or when a project memory file needs to be created from scratch. This skill scans memory files, evaluates quality against standardized criteria, outputs detailed quality reports with scores and recommendations, then makes targeted updates with user approval.
Convert legal texts (legal provisions or legal cases) into standardized Markdown format and remove promotional and redundant content. Use this skill when users need to process legal provisions (such as the Civil Code, Criminal Law, etc.), organize legal cases (such as typical cases of the Supreme People's Court, judgment documents, etc.), or format legal documents from pasted text. Note: This skill only handles formatting and content cleaning; it cannot crawl or fetch content. Content acquisition must be handled by other skills (such as wechat-article-fetch), and the AI automatically determines the order in which the skills run.
PDF data extraction tool. Use it when users mention "PDF extraction", "PDF to Markdown", "PDF parsing", "extract PDF content", "PDF to JSON", "RAG PDF". OpenDataLoader PDF is currently the top-ranked PDF parser in benchmark tests, supporting local mode (fast, deterministic) and hybrid AI mode (for complex tables, scanned documents, formulas), with output formats including Markdown, JSON (with bounding boxes), and HTML. It is suitable for scenarios where structured data needs to be extracted from PDFs for RAG/LLM pipelines, or where batch processing of PDF documents is required.
Verify documentation coverage and generate missing docs interactively
Extract text/tables from PDFs, create formatted PDFs, merge/split/rotate, handle forms and metadata. Supports pdf-lib/pdfkit (Node.js) and pypdf/pdfplumber/ReportLab (Python).
Comprehensive PDF processing and manipulation. Creates, extracts, merges, splits, and transforms PDF documents with full format support.
Capture and organize knowledge in workspace platforms. Structures information into wikis, databases, and connected knowledge graphs.
Detect and flag AI-generated content markers in documentation and prose. Use when reviewing documentation for AI markers, cleaning up LLM-generated content, or auditing prose quality. Do not use when generating new content (use doc-generator) or learning writing styles (use style-learner).