nutrient-document-processing
Nutrient Document Processing
Note: This skill integrates with the Nutrient commercial API. Review their terms before use.
Process documents with the Nutrient DWS Processor API. Convert formats, extract text and tables, OCR scanned documents, redact PII, add watermarks, digitally sign, and fill PDF forms.
Setup
Get a free API key at nutrient.io.
```bash
export NUTRIENT_API_KEY="pdf_live_..."
```
All requests go to `https://api.nutrient.io/build` as a multipart POST with an `instructions` JSON field.
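Since every operation shares the same endpoint and header, a small wrapper can cut repetition and catch failed requests. The sketch below is a convenience, not part of the Nutrient API: the function names `nutrient_require_key` and `nutrient_build` are hypothetical, and it assumes that a successful request returns a 2xx status.

```shell
# Hypothetical helper: fail fast when the API key is missing.
nutrient_require_key() {
  if [ -z "${NUTRIENT_API_KEY:-}" ]; then
    echo "NUTRIENT_API_KEY is not set" >&2
    return 1
  fi
}

# Hypothetical wrapper around the build endpoint.
# Usage: nutrient_build <input-file> <instructions-json> <output-file>
# The instructions must refer to the input by its base name.
nutrient_build() {
  nutrient_require_key || return 1
  local input="$1" instructions="$2" output="$3"
  local name status
  name=$(basename "$input")
  # -w prints the HTTP status so an error response saved to the
  # output file is not mistaken for a processed document.
  status=$(curl -sS -X POST https://api.nutrient.io/build \
    -H "Authorization: Bearer $NUTRIENT_API_KEY" \
    -F "$name=@$input" \
    -F "instructions=$instructions" \
    -o "$output" -w '%{http_code}') || return 1
  case "$status" in
    2*) return 0 ;;  # assumption: any 2xx status means success
    *)
      echo "request failed with HTTP $status; see $output for the response body" >&2
      return 1
      ;;
  esac
}
```

For example, `nutrient_build document.docx '{"parts":[{"file":"document.docx"}]}' output.pdf` would issue the same request as the DOCX to PDF example in the Operations section.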
Operations
Convert Documents
```bash
# DOCX to PDF
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.docx=@document.docx" \
  -F 'instructions={"parts":[{"file":"document.docx"}]}' \
  -o output.pdf

# PDF to DOCX
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"docx"}}' \
  -o output.docx

# HTML to PDF
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "index.html=@index.html" \
  -F 'instructions={"parts":[{"html":"index.html"}]}' \
  -o output.pdf
```
Supported inputs: PDF, DOCX, XLSX, PPTX, DOC, XLS, PPT, PPS, PPSX, ODT, RTF, HTML, JPG, PNG, TIFF, HEIC, GIF, WebP, SVG, TGA, EPS.
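The conversion requests differ only in the file named under `parts` and in the optional `output` type, so the instructions can be generated rather than hand-written. The following is a minimal sketch under that assumption. `convert_instructions` and `convert_pdfs_to_docx` are hypothetical names, and the helper does no JSON escaping, so it only suits file names without double quotes or backslashes.

```shell
# Hypothetical helper: print the instructions JSON for one input file.
# With no second argument the "output" key is omitted, as in the
# DOCX to PDF request.
convert_instructions() {
  local file="$1" type="${2:-}"
  if [ -n "$type" ]; then
    printf '{"parts":[{"file":"%s"}],"output":{"type":"%s"}}' "$file" "$type"
  else
    printf '{"parts":[{"file":"%s"}]}' "$file"
  fi
}

# Hypothetical batch job: convert every PDF in the current directory to DOCX.
convert_pdfs_to_docx() {
  local f
  for f in *.pdf; do
    [ -e "$f" ] || continue  # skip the literal pattern when nothing matches
    curl -X POST https://api.nutrient.io/build \
      -H "Authorization: Bearer $NUTRIENT_API_KEY" \
      -F "$f=@$f" \
      -F "instructions=$(convert_instructions "$f" docx)" \
      -o "${f%.pdf}.docx"
  done
}
```

Defining the loop as a function keeps it from running until `convert_pdfs_to_docx` is called explicitly.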
Extract Text and Data
```bash
# Extract plain text
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"text"}}' \
  -o output.txt

# Extract tables as Excel
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"xlsx"}}' \
  -o tables.xlsx
```
OCR Scanned Documents
```bash
# OCR to searchable PDF (supports 100+ languages)
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "scanned.pdf=@scanned.pdf" \
  -F 'instructions={"parts":[{"file":"scanned.pdf"}],"actions":[{"type":"ocr","language":"english"}]}' \
  -o searchable.pdf
```
Languages: Supports 100+ languages via ISO 639-2 codes (e.g., `eng`, `deu`, `fra`, `spa`, `jpn`, `kor`, `chi_sim`, `chi_tra`, `ara`, `hin`, `rus`). Full language names like `english` or `german` also work. See the [complete OCR language table](https://www.nutrient.io/guides/document-engine/ocr/language-support/) for all supported codes.
Redact Sensitive Information
```bash
# Pattern-based (SSN, email)
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"redaction","strategy":"preset","strategyOptions":{"preset":"social-security-number"}},{"type":"redaction","strategy":"preset","strategyOptions":{"preset":"email-address"}}]}' \
  -o redacted.pdf

# Regex-based (backslashes are doubled because the pattern sits inside a JSON string)
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"redaction","strategy":"regex","strategyOptions":{"regex":"\\b[A-Z]{2}\\d{6}\\b"}}]}' \
  -o redacted.pdf
```
Presets: `social-security-number`, `email-address`, `credit-card-number`, `international-phone-number`, `north-american-phone-number`, `date`, `time`, `url`, `ipv4`, `ipv6`, `mac-address`, `us-zip-code`, `vin`.
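A regex passed through the `instructions` field is first parsed as a JSON string, where a backslash starts an escape sequence: `\b` would decode to a backspace character and `\d` is not valid JSON at all. Each backslash in the pattern therefore has to be doubled. The sketch below shows one way to do that mechanically. `json_escape` and `regex_redaction_instructions` are hypothetical helper names, and only backslashes and double quotes are escaped, which is enough for typical single-line patterns.

```shell
# Escape backslashes and double quotes so a raw regex can sit inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: build a single regex redaction action from a raw pattern.
# The file name is inserted as-is, so it must not contain quotes or backslashes.
regex_redaction_instructions() {
  local file="$1" pattern="$2"
  printf '{"parts":[{"file":"%s"}],"actions":[{"type":"redaction","strategy":"regex","strategyOptions":{"regex":"%s"}}]}' \
    "$file" "$(json_escape "$pattern")"
}
```

The result can be passed to curl as `-F "instructions=$(regex_redaction_instructions document.pdf '\b[A-Z]{2}\d{6}\b')"`, which lets the pattern be written once in its natural form.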
Add Watermarks
```bash
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"watermark","text":"CONFIDENTIAL","fontSize":72,"opacity":0.3,"rotation":-45}]}' \
  -o watermarked.pdf
```
Digital Signatures
```bash
# Self-signed CMS signature
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"sign","signatureType":"cms"}]}' \
  -o signed.pdf
```
Fill PDF Forms
```bash
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "form.pdf=@form.pdf" \
  -F 'instructions={"parts":[{"file":"form.pdf"}],"actions":[{"type":"fillForm","formFields":{"name":"Jane Smith","email":"jane@example.com","date":"2026-02-06"}}]}' \
  -o filled.pdf
```
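As the number of form fields grows, a one-line `instructions` string becomes hard to read and to quote correctly. One option, sketched here, is to keep the JSON in a file and let curl read it with the `<` prefix, which sends the file's contents as a plain text field instead of uploading it as a file. The function names and the `instructions.json` file name are hypothetical, and the field values are the ones from the form-filling request above.

```shell
# Hypothetical helper: write the fillForm instructions to the given path.
write_fill_instructions() {
  cat > "$1" <<'EOF'
{
  "parts": [{"file": "form.pdf"}],
  "actions": [
    {
      "type": "fillForm",
      "formFields": {
        "name": "Jane Smith",
        "email": "jane@example.com",
        "date": "2026-02-06"
      }
    }
  ]
}
EOF
}

# Send the request, reading the instructions field from the file.
fill_form() {
  write_fill_instructions instructions.json
  curl -X POST https://api.nutrient.io/build \
    -H "Authorization: Bearer $NUTRIENT_API_KEY" \
    -F "form.pdf=@form.pdf" \
    -F "instructions=<instructions.json" \
    -o filled.pdf
}
```

Keeping the instructions in a file also makes them easy to review or version alongside the form itself.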
MCP Server (Alternative)
For native tool integration, use the MCP server instead of curl:
```json
{
  "mcpServers": {
    "nutrient-dws": {
      "command": "npx",
      "args": ["-y", "@nutrient-sdk/dws-mcp-server"],
      "env": {
        "NUTRIENT_DWS_API_KEY": "YOUR_API_KEY",
        "SANDBOX_PATH": "/path/to/working/directory"
      }
    }
  }
}
```
When to Use
- Converting documents between formats (PDF, DOCX, XLSX, PPTX, HTML, images)
- Extracting text, tables, or key-value pairs from PDFs
- OCR on scanned documents or images
- Redacting PII before sharing documents
- Adding watermarks to drafts or confidential documents
- Digitally signing contracts or agreements
- Filling PDF forms programmatically