pdf

Source: ComposioHQ/awesome-claude-skills | Adapted for: LiYe OS Three-tier Architecture

A comprehensive PDF toolkit: extract text, tables, and metadata; merge and split documents; add annotations; and handle forms.

When to Use This Skill

Use this skill when Claude needs to work with PDF files:

Extract text, tables, or metadata from PDFs
Create new PDF documents
Merge or split PDF files
Add annotations or comments
Handle PDF forms

Core Capabilities

1. Text Extraction

Extract full-text content
Preserve document structure
Identify headings and paragraph hierarchies
OCR support (for scanned documents)
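
As a rough illustration of the text-extraction path, the sketch below uses pdfplumber (one of the Python dependencies listed for this skill) to read each page's text and then flags pages with no text layer as OCR candidates. The function names `extract_page_texts` and `pages_needing_ocr` are hypothetical, and treating an empty page as a scanned page is a simplifying assumption rather than the skill's documented behavior.

```python
from typing import List


def extract_page_texts(pdf_path: str) -> List[str]:
    """Return the text of each page, using "" where no text layer is found."""
    # Third-party dependency, imported lazily so the helper below works without it.
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() may return None or "" for pages without extractable text.
        return [page.extract_text() or "" for page in pdf.pages]


def pages_needing_ocr(page_texts: List[str]) -> List[int]:
    """Return 1-based page numbers whose extracted text is empty or whitespace."""
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if not text.strip()
    ]
```

Pages returned by `pages_needing_ocr` could then be routed to whatever OCR step the deployment provides; that step is outside this sketch.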

2. Table Extraction

Recognize table structures
Export as structured data
Support complex nested tables
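
One way to picture "export as structured data" is the sketch below. It assumes pdfplumber's `extract_tables()`, which yields each table as a list of rows of cell strings (or `None` for empty cells), and it assumes the first row of every table is a header. The helper names are hypothetical, and nested tables are not handled here.

```python
from typing import Dict, List, Optional

Row = List[Optional[str]]


def table_to_records(table: List[Row]) -> List[Dict[str, str]]:
    """Convert a table (header row first) into one dict per data row."""
    if not table:
        return []
    header = [(cell or "").strip() for cell in table[0]]
    records = []
    for row in table[1:]:
        # Empty cells arrive as None; normalize them to empty strings.
        cells = [(cell or "").strip() for cell in row]
        records.append(dict(zip(header, cells)))
    return records


def extract_tables_as_records(pdf_path: str) -> List[List[Dict[str, str]]]:
    """Extract every table in the PDF as a list of row dicts."""
    # Third-party dependency, imported lazily so table_to_records works without it.
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        return [
            table_to_records(table)
            for page in pdf.pages
            for table in page.extract_tables()
        ]
```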

3. Metadata Processing

Read document properties
Extract fields such as author and creation date
Modify document metadata
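
Reading document properties might look like the following sketch. It assumes PyPDF2 2.x or later (where `PdfReader.metadata` is available) and the standard `/Author`, `/Title`, and `/CreationDate` keys of the PDF information dictionary. `parse_pdf_date` and `read_basic_metadata` are hypothetical helpers; the date parser only reads the leading `D:YYYYMMDDHHmmSS` digits and ignores any timezone suffix.

```python
import re
from datetime import datetime
from typing import Dict, Optional


def parse_pdf_date(raw: str) -> Optional[datetime]:
    """Parse the leading digits of a PDF date string; timezone is ignored."""
    match = re.match(r"(?:D:)?(\d{4,14})", raw.strip())
    if match is None:
        return None
    digits = match.group(1)
    # Missing trailing fields default to January 1st, 00:00:00.
    template = "00010101000000"
    padded = digits + template[len(digits):]
    try:
        return datetime.strptime(padded, "%Y%m%d%H%M%S")
    except ValueError:
        return None


def read_basic_metadata(pdf_path: str) -> Dict[str, object]:
    """Return author, title, and creation date where the PDF provides them."""
    # Third-party dependency, imported lazily so parse_pdf_date works without it.
    from PyPDF2 import PdfReader

    info = PdfReader(pdf_path).metadata or {}
    created = info.get("/CreationDate")
    return {
        "author": info.get("/Author"),
        "title": info.get("/Title"),
        "created": parse_pdf_date(str(created)) if created else None,
    }
```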

4. Document Operations

Merge multiple PDFs
Split into multiple files
Reorder pages
Extract specific pages
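
The operations above can be sketched with PyPDF2's `PdfReader` and `PdfWriter`, assuming version 2.x or later (where `add_page` is available). The page-spec syntax ("1-3,5", 1-based, order preserved) and the function names are hypothetical choices for this example. Because the spec keeps its order, the same helper covers both extracting and reordering pages.

```python
from typing import List


def parse_page_ranges(spec: str, page_count: int) -> List[int]:
    """Turn a 1-based spec such as "1-3,5" into 0-based page indices."""
    indices: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start_text, _, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if end_text else start
        if not 1 <= start <= end <= page_count:
            raise ValueError(f"page range {part!r} is outside 1..{page_count}")
        indices.extend(range(start - 1, end))
    return indices


def extract_pages(src_path: str, spec: str, out_path: str) -> None:
    """Write the pages selected by spec, in the order given, to out_path."""
    # Third-party dependency, imported lazily so parse_page_ranges works without it.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    writer = PdfWriter()
    for index in parse_page_ranges(spec, len(reader.pages)):
        writer.add_page(reader.pages[index])
    with open(out_path, "wb") as handle:
        writer.write(handle)


def merge_pdfs(paths: List[str], out_path: str) -> None:
    """Concatenate the given PDFs, in order, into a single file."""
    from PyPDF2 import PdfReader, PdfWriter

    writer = PdfWriter()
    for path in paths:
        for page in PdfReader(path).pages:
            writer.add_page(page)
    with open(out_path, "wb") as handle:
        writer.write(handle)
```

Splitting a document into several files amounts to calling `extract_pages` once per output file with a different spec.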

5. Annotations & Forms

Add highlights and comments
Fill form fields
Create fillable forms

Usage Examples

Example 1: Extract Research Paper Content

User: Extract the abstract and conclusion of this medical paper
Claude: [Use the pdf skill to extract content from the specified sections]

Example 2: Merge Multiple Reports

User: Merge these quarterly reports into an annual summary
Claude: [Use the pdf skill to merge files and add a table of contents page]

Example 3: Extract Table Data

User: Extract the data table from this PDF report
Claude: [Use the pdf skill to identify and extract the table, and convert it to a structured format]

Dependencies

Python: PyPDF2, pdfplumber, reportlab, PyMuPDF
Or Node.js: pdf-lib, pdf-parse

LiYe OS Integration

Business Domain References

This skill is referenced by the following business domains:

05_Medical_Intelligence: Medical literature, clinical guideline parsing
01_Research_Intelligence: Academic paper analysis
04_Business_Operations: Business document processing

Three-tier Architecture Positioning

Physical Layer (This file): Skills/00_Core_Utilities/document-processing/pdf/
Logical Layer Index: Skills/{domain}/index.yaml
L3 Instruction Layer: .claude/skills/{domain}/pdf/

Created: 2025-12-28 | Adapted for LiYe OS