doc-mindmap
Doc Mindmap - Document Intelligent Organizer 📚🧠
Batch-convert scattered office documents (PDF, PPT, Word, Excel, etc.) to Markdown, generate summaries and three-dimension classifications with a local Ollama model, and present multiple classification schemes side by side through symlinks, with zero additional disk usage.
When to Use
Use this skill when users:
- Want to organize and archive a large number of documents
- Need to generate summaries for a batch of documents
- Want to generate mindmaps for documents
- Want to convert PDF, PPT, Word to Markdown
- Need document classification suggestions or a proposed directory structure
- Want to quickly understand what documents are in a folder
- Need to detect duplicate files
Trigger Keywords: document organization, document classification, mindmap, document summary, PDF to Markdown, batch conversion, document archiving
Features
- 🔄 Batch Conversion - One-click conversion of PDF, PPT, Word, Excel, etc. to Markdown
- 📋 CSV Index - Automatically generate document index with MD5 and duplicate detection
- 🔍 Duplicate Detection - Identify duplicate files via MD5 comparison and suggest deletion to free up space
- 📝 Local Summary - Summaries are generated by a local Ollama model and consume no Claude context
- 🗂️ Three-Dimensional Classification - Classify by three schemes at once: topic, usage, and client
- 🔗 Symlink Directory - Multiple classification schemes coexist through symlinks, with zero additional disk usage
- ✏️ Intelligent Rename - AI suggests clearer filenames based on content; symlinks can optionally use the optimized names
- 🛡️ Safety Mechanism - Conversion is read-only and never modifies the original files
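The duplicate detection described above can be illustrated with a minimal sketch. It assumes duplicates are defined purely by identical MD5 digests of file contents; the function names `md5_of` and `find_duplicates` are hypothetical and are not taken from `doc_converter.py`.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths: list[Path]) -> dict[str, list[Path]]:
    """Group files by digest and keep only groups with more than one file."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for p in sorted(paths):
        groups[md5_of(p)].append(p)
    return {h: ps for h, ps in groups.items() if len(ps) > 1}
```

Each returned group lists files with identical content, so all but one file per group are candidates for the deletion suggestion.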
Supported Formats
| Format | Extension | Description |
|---|---|---|
| PDF | .pdf | PDF document |
| 📊 PPT | .pptx | PowerPoint presentation |
| 📝 Word | .docx | Word document |
| 📈 Excel | .xlsx, .xls | Spreadsheet |
| 📈 CSV | .csv | Comma-separated values |
| 🌐 HTML | .html, .htm | Web page |
| 📚 EPUB | .epub | E-book |
| 📋 JSON | .json | JSON data |
| 📋 XML | .xml | XML data |
Usage
Preview Document List (Including Duplicate Detection)
```bash
python scripts/doc_converter.py ~/Documents/reports --preview
```
Execute Conversion
```bash
python scripts/doc_converter.py ~/Documents/reports --convert --confirm
```
Convert Specific Files
```bash
python scripts/doc_converter.py file1.pdf file2.pptx --convert --confirm
```
Generate Summary (Requires Conversion First)
```bash
python scripts/doc_converter.py ~/Documents/reports --summarize
python scripts/doc_converter.py ~/Documents/reports --summarize --model qwen3:8b
```
Three-Dimensional Classification + Symlink (Requires Summary First)
```bash
python scripts/doc_converter.py ~/Documents/reports --organize
```
Classification + Use AI-Suggested Filenames
```bash
python scripts/doc_converter.py ~/Documents/reports --organize --rename
```
Complete Full Process in One Step
```bash
python scripts/doc_converter.py ~/Documents/reports --convert --confirm --summarize --organize

# With optimized filenames
python scripts/doc_converter.py ~/Documents/reports --convert --confirm --summarize --organize --rename
```
JSON Format Output
```bash
python scripts/doc_converter.py ~/Documents --preview --json
```
Arguments
| Argument | Description |
|---|---|
| `<path>` | File or directory path (multiple paths supported) |
| `--preview` | Preview mode: list documents and detect duplicates |
| `--convert` | Run batch conversion (duplicate files are skipped automatically) |
| `--summarize` | Generate summaries with a local Ollama model (requires `--convert` first) |
| `--organize` | Classify in three dimensions and generate symlink directories (requires `--summarize` first) |
| `--rename` | Use AI-suggested optimized filenames for symlinks (used together with `--organize`) |
| `--model` | Ollama model name (default: qwen2.5:3b) |
| `--confirm` | Confirm execution (safety mechanism) |
| `--json` | Output in JSON format |
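As an illustration of how the flags above fit together, here is a minimal `argparse` sketch. It is not the actual parser in `doc_converter.py`: the positional name `paths`, the `validate` helper, and the rule that `--convert` needs `--confirm` are assumptions inferred from the table and the usage examples.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented flags (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="doc_converter.py")
    parser.add_argument("paths", nargs="+", help="File or directory paths")
    parser.add_argument("--preview", action="store_true", help="List documents and duplicates")
    parser.add_argument("--convert", action="store_true", help="Run batch conversion")
    parser.add_argument("--summarize", action="store_true", help="Summarize with Ollama")
    parser.add_argument("--organize", action="store_true", help="Classify and build symlinks")
    parser.add_argument("--rename", action="store_true", help="Use AI-suggested symlink names")
    parser.add_argument("--model", default="qwen2.5:3b", help="Ollama model name")
    parser.add_argument("--confirm", action="store_true", help="Confirm execution")
    parser.add_argument("--json", action="store_true", help="Emit JSON output")
    return parser


def validate(args: argparse.Namespace) -> list[str]:
    """Return human-readable problems; an empty list means the flags are consistent."""
    problems = []
    if args.convert and not args.confirm:  # assumed safety rule
        problems.append("--convert needs --confirm")
    if args.rename and not args.organize:
        problems.append("--rename only applies together with --organize")
    return problems
```

Keeping validation separate from parsing makes it easy to report every problem at once instead of stopping at the first one.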
Output Structure
Conversion output is stored in the hidden .summaries/ directory under the source folder:
```
{source}/
└── .summaries/
    ├── converted/                  # .md files converted by markitdown
    │   ├── report.pdf.md
    │   ├── slides.pptx.md
    │   └── data.xlsx.md
    ├── briefs/                     # Summaries generated by Ollama
    │   ├── report.pdf.brief.md
    │   ├── slides.pptx.brief.md
    │   └── data.xlsx.brief.md
    ├── schemes/                    # Symlink classification directories
    │   ├── by-topic/               # Classified by topic
    │   │   ├── AI Technology/
    │   │   │   └── AI-Driven Product Management Guide.pptx -> ../../../../slides.pptx  # --rename
    │   │   └── Data Governance/
    │   │       └── C-End Data Governance Plan.pdf -> ../../../../report.pdf  # --rename
    │   ├── by-usage/               # Classified by usage
    │   │   ├── Training Materials/
    │   │   └── Client Delivery Solutions/
    │   └── by-client/              # Classified by client
    │       ├── Volvo/
    │       └── General Solutions/
    ├── mindmap.md                  # Mindmap classification generated by Claude
    └── index.csv                   # Conversion index (includes MD5, duplicate markers)
```
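The symlink layout can be sketched as follows. This is a hedged illustration, not the script's real code: `link_into_schemes` and its parameters are hypothetical, and it assumes a POSIX filesystem and relative link targets like the `../../../../slides.pptx` entries shown in the tree.

```python
import os
from pathlib import Path
from typing import Optional


def link_into_schemes(source: Path, root: Path, labels: dict[str, str],
                      link_name: Optional[str] = None) -> list[Path]:
    """Create one relative symlink per scheme, e.g. by-topic/<label>/<name>."""
    created = []
    for scheme, label in labels.items():
        folder = root / ".summaries" / "schemes" / scheme / label
        folder.mkdir(parents=True, exist_ok=True)
        # link_name stands in for the optional AI-suggested name; default is the original name.
        link = folder / (link_name or source.name)
        # The target is expressed relative to the folder that holds the link.
        target = os.path.relpath(source, start=folder)
        if link.is_symlink():
            link.unlink()  # replace a stale link from an earlier run
        link.symlink_to(target)
        created.append(link)
    return created
```

Because every scheme holds only links, one document can appear under a topic, a usage, and a client folder while its content exists once on disk.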
Dependencies
- Python 3.10+
- markitdown: `pip install 'markitdown[all]'`
- Ollama (for summaries and classification): `brew install ollama` + `ollama pull qwen2.5:3b`
- requests: `pip install requests`
Claude Workflow
When Claude uses this skill, it follows these steps:
Step 1: Preview Documents
Run the preview command to show the user the document list and duplicate detection results:
```bash
python doc-mindmap/scripts/doc_converter.py <path> --preview
```
Tell the user how many documents were found, the format distribution, the total size, and any duplicate files, then wait for confirmation.
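The figures reported in this step (document count, format distribution, total size) could be gathered roughly as below. This is a sketch under assumptions: `preview_stats` is a hypothetical name, the extension set is copied from the Supported Formats table, and recursive scanning that skips `.summaries` is a guess about the real script's behavior.

```python
from collections import Counter
from pathlib import Path

# Extensions listed under Supported Formats.
SUPPORTED = {".pdf", ".pptx", ".docx", ".xlsx", ".xls", ".csv",
             ".html", ".htm", ".epub", ".json", ".xml"}


def preview_stats(root: Path) -> dict:
    """Count supported documents under root, skipping the hidden .summaries tree."""
    files = [
        p for p in root.rglob("*")
        if p.is_file()
        and p.suffix.lower() in SUPPORTED
        and ".summaries" not in p.relative_to(root).parts
    ]
    return {
        "count": len(files),
        "by_format": dict(Counter(p.suffix.lower() for p in files)),
        "total_bytes": sum(p.stat().st_size for p in files),
    }
```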
Step 2: Execute Conversion
After the user confirms, run the conversion (duplicate files are skipped automatically):
```bash
python doc-mindmap/scripts/doc_converter.py <path> --convert --confirm
```
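For orientation, converting a single file with markitdown might look like the sketch below. The helper names `converted_path` and `convert_one` are hypothetical; the output naming follows the `converted/report.pdf.md` pattern from Output Structure, and the markitdown import is deferred so the path helper can be used without the package installed.

```python
from pathlib import Path


def converted_path(source: Path, root: Path) -> Path:
    """Map report.pdf to <root>/.summaries/converted/report.pdf.md."""
    return root / ".summaries" / "converted" / f"{source.name}.md"


def convert_one(source: Path, root: Path) -> Path:
    """Convert one document to Markdown; the source file is only read."""
    # Imported lazily; requires `pip install 'markitdown[all]'`.
    from markitdown import MarkItDown

    result = MarkItDown().convert(str(source))
    target = converted_path(source, root)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(result.text_content, encoding="utf-8")
    return target
```

Writing only under `.summaries/` is what keeps the conversion read-only with respect to the original files.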
Step 3: Generate Summary (Ollama Local Model)
Generate a summary for each document with the local Ollama model; this does not consume Claude's context window:
```bash
python doc-mindmap/scripts/doc_converter.py <path> --summarize
```
This can also run together with the conversion:
```bash
python doc-mindmap/scripts/doc_converter.py <path> --convert --confirm --summarize
```
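A summary request to a local Ollama server could be sketched as follows. The prompt wording, the 8000-character truncation, and the function names are assumptions for illustration. The sketch uses the standard library's `urllib` to stay self-contained, whereas the skill itself lists `requests` as a dependency.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(markdown: str, model: str = "qwen2.5:3b",
                  max_chars: int = 8000) -> urllib.request.Request:
    """Build a non-streaming generate request for a truncated document."""
    # Hypothetical prompt; the real script's prompt is not shown in this document.
    prompt = "Summarize the following document in 3-5 sentences:\n\n" + markdown[:max_chars]
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(markdown: str, model: str = "qwen2.5:3b") -> str:
    """Send the request to a running Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(markdown, model), timeout=120) as resp:
        return json.loads(resp.read())["response"].strip()
```

Truncating the input keeps small local models within their context limits; the cutoff here is arbitrary and would need tuning per model.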
Step 4: Three-Dimensional Classification + Symlink
Use Ollama to classify each document along three dimensions (topic/usage/client) and suggest a clearer filename for each document:
```bash
# First run without --rename to show classification results and suggested filenames
python doc-mindmap/scripts/doc_converter.py <path> --organize
```
Show the classification results and AI-suggested filenames to the user, and ask whether to use the optimized filenames. If the user agrees:
```bash
python doc-mindmap/scripts/doc_converter.py <path> --organize --rename
```
The three classification schemes coexist under .summaries/schemes/ via symlinks, with zero additional disk usage. --rename only affects symlink names; the original files are not modified.
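One way to turn a model reply into the three labels plus a suggested filename is sketched below. The JSON keys (`topic`, `usage`, `client`, `suggested_name`), the `Classification` dataclass, and the "Uncategorized" fallback are all hypothetical; the actual reply format used by the script is not documented here.

```python
import json
from dataclasses import dataclass


@dataclass
class Classification:
    topic: str
    usage: str
    client: str
    suggested_name: str


def parse_classification(reply: str, fallback_name: str) -> Classification:
    """Parse the JSON object embedded in a model reply; fall back to defaults."""
    start, end = reply.find("{"), reply.rfind("}")
    try:
        data = json.loads(reply[start:end + 1]) if start != -1 and end > start else {}
    except json.JSONDecodeError:
        data = {}  # small models sometimes emit malformed JSON
    return Classification(
        topic=str(data.get("topic") or "Uncategorized"),
        usage=str(data.get("usage") or "Uncategorized"),
        client=str(data.get("client") or "Uncategorized"),
        suggested_name=str(data.get("suggested_name") or fallback_name),
    )
```

Falling back to the original filename means a failed suggestion never blocks link creation.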
Step 5: Preview Classification Results
Ask the user whether they want to preview the classification directory in Finder. If the user agrees:
- Copy the schemes directory to the desktop (preserving symlinks):
```bash
cp -a <.summaries/schemes> ~/Desktop/Document-Classification-$(date +%Y%m%d)
```
- Open it in Finder:
```bash
open ~/Desktop/Document-Classification-$(date +%Y%m%d)
```
The user can browse the three classification schemes in Finder and double-click a symlink to open the original file.
Step 6: Generate Mindmap
Read the summary files under .summaries/briefs/ and generate the .summaries/mindmap.md mindmap classification file.
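The format of `mindmap.md` is not specified here, so the following is only one plausible shape: a nested Markdown outline built from a category-to-documents mapping. `render_mindmap` is a hypothetical helper; in the workflow above, Claude produces this file from the briefs.

```python
def render_mindmap(title: str, groups: dict[str, list[str]]) -> str:
    """Render {category: [documents]} as a nested Markdown outline."""
    lines = [f"# {title}"]
    for category in sorted(groups):
        lines.append(f"- {category}")
        # Documents become second-level bullets under their category.
        lines.extend(f"  - {doc}" for doc in sorted(groups[category]))
    return "\n".join(lines) + "\n"
```

Sorting categories and documents keeps the output stable between runs, which makes changes to the file easier to review.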
Step 7: Present Results
Show the user:
- Conversion statistics (success/failure/skipped duplicates)
- List of duplicate files and deletion suggestions
- Overview of three-dimensional classification
- Preview of mindmap classification
Credits
- markitdown - Document to Markdown tool developed by Microsoft
- Ollama - Local LLM runtime framework
- Works well with file-organizer: classify first, then organize