Convert local documents to Markdown using Microsoft's markitdown CLI. Best for: PDF, Word, Excel, PowerPoint, images (OCR), audio. Can fetch URLs but Jina is faster for web. Triggers on: convert to markdown, read PDF, parse document, extract text from, docx, xlsx, pptx, OCR image, local file.
```shell
npx skill4agent add 0xdarkmatter/claude-mods markitdown
```

| Use Case | Recommendation |
|---|---|
| Local files (PDF, Word, Excel) | ✅ Use markitdown - unique capability |
| Web pages | ❌ Use Jina ( |
| Blocked/anti-bot sites | ❌ Use Firecrawl |
| OCR on images | ✅ Use markitdown |
| Audio transcription | ✅ Use markitdown |

```shell
# Local files (primary use case)
markitdown document.pdf
markitdown report.docx
markitdown data.xlsx
markitdown slides.pptx
markitdown screenshot.png  # OCR

# URLs (works, but Jina is faster)
markitdown https://example.com

# Save output
markitdown document.pdf > document.md
```

| Format | Extensions | Notes |
|---|---|---|
| PDF | | Text extraction, tables |
| Word | | Formatting preserved |
| Excel | | Tables to markdown |
| PowerPoint | | Slides as sections |
| Images | | OCR text extraction |
| HTML | | Clean conversion |
| Audio | | Speech-to-text |
| Text | | Pass-through/structure |
| URLs | | Works but slower than Jina |

| Tool | Avg Speed | Success Rate |
|---|---|---|
| Jina | 0.5s | 10/10 |
| markitdown | 2.5s | 9/10 |
| Firecrawl | 4.5s | 10/10 |
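
The recommendations and timings above suggest a simple routing rule: send URLs to Jina and everything else to markitdown. The sketch below is one way to express that in shell. `pick_converter` is a hypothetical helper name, not part of markitdown, and it only prints the name of the suggested tool. Blocked or anti-bot sites cannot be detected from the URL alone, so the Firecrawl case is left to the caller.

```shell
# Hypothetical helper: print which tool the tables above recommend for an input.
# Only the tool name is printed; how each tool is invoked is left out on purpose.
pick_converter() {
  case "$1" in
    http://*|https://*) echo "jina" ;;        # web pages: Jina is faster
    *)                  echo "markitdown" ;;  # local files: PDF, Office, images, audio
  esac
}

pick_converter "https://example.com"   # prints: jina
pick_converter "report.pdf"            # prints: markitdown
```
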

```shell
# PDF to markdown (primary use case)
markitdown report.pdf > report.md

# Excel spreadsheet
markitdown financials.xlsx

# Image with text (OCR)
markitdown screenshot.png

# PowerPoint deck
markitdown presentation.pptx > slides.md

# Audio transcription
markitdown meeting.mp3 > transcript.md
```

| Task | markitdown | Alternative |
|---|---|---|
| PDF text | | PyMuPDF, pdfplumber |
| Word docs | | python-docx |
| Excel | | pandas, openpyxl |
| OCR | | Tesseract |
| Web pages | Use Jina instead | |
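
The single-file commands shown earlier extend naturally to a whole folder. As a minimal sketch, assuming `markitdown` is on `PATH` and using only the `markitdown FILE > OUT` form that appears in this document, the hypothetical `convert_pdfs` function below writes one `.md` file next to each PDF in a directory.

```shell
# Hypothetical helper: convert every PDF in a directory to Markdown.
# Assumes markitdown is installed; uses only the redirect form shown in this document.
convert_pdfs() {
  dir="${1:-.}"                        # default to the current directory
  for src in "$dir"/*.pdf; do
    [ -e "$src" ] || continue          # skip when the glob matches nothing
    out="${src%.pdf}.md"               # report.pdf -> report.md
    if markitdown "$src" > "$out"; then
      echo "converted: $src -> $out"
    else
      echo "failed: $src" >&2
      rm -f "$out"                     # do not leave an empty or partial file behind
    fi
  done
}
```

Calling `convert_pdfs ./reports` would process every `*.pdf` in `./reports`. Changing the glob and the suffix in `${src%.pdf}` adapts the same loop to `.docx`, `.xlsx`, or `.pptx` files.
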