pdfco

PDF.co

All-in-one PDF processing API. Convert, extract, merge, split, compress PDFs and more. Supports OCR for scanned documents.
Official docs: https://docs.pdf.co/

When to Use

Use this skill when you need to:
  • Extract text from PDF files (with OCR support)
  • Convert PDF to CSV, JSON, or other formats
  • Merge multiple PDFs into one
  • Split PDF into multiple files
  • Compress PDF to reduce file size
  • Convert HTML/URL to PDF
  • Parse invoices and documents with AI

Prerequisites

  1. Create an account at https://pdf.co/
  2. Get your API key from https://app.pdf.co/
Set environment variable:
bash
export PDFCO_API_KEY="your-email@example.com_your-api-key"

Important: When using $VAR in a command that pipes to another command, wrap the command containing $VAR in bash -c '...'. Due to a Claude Code bug, environment variables are silently cleared when pipes are used directly.
bash
bash -c 'curl -s "https://api.example.com" -H "Authorization: Bearer $API_KEY"'
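Before sending any request, it can help to confirm that the key is actually set in the current shell. The following is a minimal sketch; the function name pdfco_require_key is hypothetical and not part of PDF.co.

```bash
# Hypothetical helper: fail early when PDFCO_API_KEY is missing or empty.
pdfco_require_key() {
  if [ -z "${PDFCO_API_KEY:-}" ]; then
    echo "PDFCO_API_KEY is not set; export it first" >&2
    return 1
  fi
}
```

Calling pdfco_require_key at the top of a script stops it with a clear message instead of sending a request with an empty x-api-key header.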

How to Use

1. PDF to Text

Extract text from PDF with OCR support:
Write to /tmp/request.json:
json
{
  "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-text/sample.pdf",
  "inline": true
}
bash
bash -c 'curl --location --request POST "https://api.pdf.co/v1/pdf/convert/to/text" --header "x-api-key: ${PDFCO_API_KEY}" --header "Content-Type: application/json" -d @/tmp/request.json'
With specific pages (1-indexed):
Write to /tmp/request.json:
json
{
  "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-text/sample.pdf",
  "pages": "1-3",
  "inline": true
}
bash
bash -c 'curl --location --request POST "https://api.pdf.co/v1/pdf/convert/to/text" --header "x-api-key: ${PDFCO_API_KEY}" --header "Content-Type: application/json" -d @/tmp/request.json'
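The two request bodies above differ only in the optional pages field, so the file can be generated rather than edited by hand. This is a sketch under assumptions: the function name pdfco_text_request is hypothetical, and it relies on jq (already used elsewhere in this guide) to produce correctly quoted JSON.

```bash
# Hypothetical helper: print a text-extraction request body.
# Usage: pdfco_text_request <pdf-url> [pages]
pdfco_text_request() {
  local url="$1" pages="${2:-}"
  if [ -n "$pages" ]; then
    # Include "pages" only when a range such as "1-3" was given.
    jq -n --arg url "$url" --arg pages "$pages" \
      '{url: $url, pages: $pages, inline: true}'
  else
    jq -n --arg url "$url" '{url: $url, inline: true}'
  fi
}
```

For example, pdfco_text_request "https://example.com/file.pdf" "1-3" > /tmp/request.json writes a body that the curl command above can send unchanged.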

2. PDF to CSV

Convert PDF tables to CSV:
Write to /tmp/request.json:
json
{
  "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-csv/sample.pdf",
  "inline": true
}
bash
bash -c 'curl --location --request POST "https://api.pdf.co/v1/pdf/convert/to/csv" --header "x-api-key: ${PDFCO_API_KEY}" --header "Content-Type: application/json" -d @/tmp/request.json'

3. Merge PDFs

Combine multiple PDFs into one:
Write to /tmp/request.json:
json
{
  "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-merge/sample1.pdf,https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-merge/sample2.pdf",
  "name": "merged.pdf"
}
bash
bash -c 'curl --location --request POST "https://api.pdf.co/v1/pdf/merge" --header "x-api-key: ${PDFCO_API_KEY}" --header "Content-Type: application/json" -d @/tmp/request.json'
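The merge request above passes all source files as one comma-separated string in the url field. A small sketch of building that string from separate arguments follows; the function name pdfco_join_urls is hypothetical, and it assumes none of the URLs contains a comma.

```bash
# Hypothetical helper: join source URLs with commas for the merge "url" field.
pdfco_join_urls() {
  # "$*" joins the arguments using the first character of IFS.
  local IFS=,
  printf '%s\n' "$*"
}
```

The result of pdfco_join_urls "https://example.com/a.pdf" "https://example.com/b.pdf" can be pasted into the url field of the request body.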

4. Split PDF

Split PDF by page ranges:
Write to /tmp/request.json:
json
{
  "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-split/sample.pdf",
  "pages": "1-2,3-"
}
bash
bash -c 'curl --location --request POST "https://api.pdf.co/v1/pdf/split" --header "x-api-key: ${PDFCO_API_KEY}" --header "Content-Type: application/json" -d @/tmp/request.json'

5. Compress PDF

Reduce PDF file size:
Write to /tmp/request.json:
json
{
  "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-optimize/sample.pdf",
  "name": "compressed.pdf"
}
bash
bash -c 'curl --location --request POST "https://api.pdf.co/v1/pdf/optimize" --header "x-api-key: ${PDFCO_API_KEY}" --header "Content-Type: application/json" -d @/tmp/request.json'

6. HTML to PDF

Convert HTML or URL to PDF:
Write to /tmp/request.json:
json
{
  "html": "<h1>Hello World</h1><p>This is a test.</p>",
  "name": "output.pdf"
}
bash
bash -c 'curl --location --request POST "https://api.pdf.co/v1/pdf/convert/from/html" --header "x-api-key: ${PDFCO_API_KEY}" --header "Content-Type: application/json" -d @/tmp/request.json'
From URL:
Write to /tmp/request.json:
json
{
  "url": "https://example.com",
  "name": "webpage.pdf"
}
bash
bash -c 'curl --location --request POST "https://api.pdf.co/v1/pdf/convert/from/url" --header "x-api-key: ${PDFCO_API_KEY}" --header "Content-Type: application/json" -d @/tmp/request.json'

7. AI Invoice Parser

Extract structured data from invoices:
Write to /tmp/request.json:
json
{
  "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/ai-invoice-parser/sample-invoice.pdf",
  "inline": true
}
bash
bash -c 'curl --location --request POST "https://api.pdf.co/v1/ai-invoice-parser" --header "x-api-key: ${PDFCO_API_KEY}" --header "Content-Type: application/json" -d @/tmp/request.json'

8. Upload Local File

Upload a local file first, then use the returned URL:
Step 1: Get presigned upload URL
bash
bash -c 'curl -s "https://api.pdf.co/v1/file/upload/get-presigned-url?name=myfile.pdf&contenttype=application/pdf" --header "x-api-key: ${PDFCO_API_KEY}"' | jq -r '.presignedUrl, .url'
Copy the presigned URL and file URL from the response.
Step 2: Upload file
Replace <presigned-url> with the URL from Step 1:
bash
curl -X PUT "<presigned-url>" --header "Content-Type: application/pdf" --data-binary @/path/to/your/file.pdf
Step 3: Use file URL in subsequent API calls
Replace <file-url> with the file URL from Step 1:
Write to /tmp/request.json:
json
{
  "url": "<file-url>",
  "inline": true
}
bash
bash -c 'curl --location --request POST "https://api.pdf.co/v1/pdf/convert/to/text" --header "x-api-key: ${PDFCO_API_KEY}" --header "Content-Type: application/json" -d @/tmp/request.json'
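The three steps above can be chained so that no URL has to be copied by hand. The sketch below is one possible arrangement, not an official client: the function name pdfco_upload is hypothetical, it needs curl and jq, and it is meant to run from a script file where PDFCO_API_KEY is already exported.

```bash
# Hypothetical helper: upload a local PDF and print the file URL to use
# as "url" in later requests.
pdfco_upload() {
  local file="$1" name resp presigned file_url
  name=$(basename "$file")

  # Step 1: request a presigned upload URL and the final file URL.
  resp=$(curl -s -G "https://api.pdf.co/v1/file/upload/get-presigned-url" \
    --data-urlencode "name=${name}" \
    --data-urlencode "contenttype=application/pdf" \
    --header "x-api-key: ${PDFCO_API_KEY}") || return 1
  presigned=$(printf '%s' "$resp" | jq -r '.presignedUrl // empty')
  file_url=$(printf '%s' "$resp" | jq -r '.url // empty')
  if [ -z "$presigned" ] || [ -z "$file_url" ]; then
    echo "upload URL request failed: $resp" >&2
    return 1
  fi

  # Step 2: PUT the file bytes to the presigned URL.
  curl -s -f -X PUT "$presigned" \
    --header "Content-Type: application/pdf" \
    --data-binary "@${file}" >/dev/null || return 1

  # Step 3: hand back the file URL for subsequent API calls.
  printf '%s\n' "$file_url"
}
```

The printed URL takes the place of <file-url> in the request body for any of the endpoints in this guide.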

9. Async Mode (Large Files)

For large files, use async mode to avoid timeouts:
Step 1: Start async job
Write to /tmp/request.json:
json
{
  "url": "https://example.com/large-file.pdf",
  "async": true
}
bash
bash -c 'curl -s --location --request POST "https://api.pdf.co/v1/pdf/convert/to/text" --header "x-api-key: ${PDFCO_API_KEY}" --header "Content-Type: application/json" -d @/tmp/request.json' | jq -r '.jobId'
Copy the job ID from the response.
Step 2: Check job status
Replace <job-id> with the job ID from Step 1:
Write to /tmp/request.json:
json
{
  "jobid": "<job-id>"
}
bash
bash -c 'curl --location --request POST "https://api.pdf.co/v1/job/check" --header "x-api-key: ${PDFCO_API_KEY}" --header "Content-Type: application/json" -d @/tmp/request.json'
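Step 2 usually has to be repeated until the job finishes. The loop below is a sketch of that polling, with several assumptions: the function name pdfco_wait_job and its interval and retry defaults are hypothetical, and it assumes the job check response carries a status field whose value is "working" while the job runs and "success" when it is done. Verify those values against the official docs before relying on them.

```bash
# Hypothetical helper: poll /job/check until the job leaves the "working" state.
# Usage: pdfco_wait_job <job-id> [interval-seconds] [max-checks]
pdfco_wait_job() {
  local job_id="$1" interval="${2:-3}" max_tries="${3:-40}"
  local resp status try=0
  while [ "$try" -lt "$max_tries" ]; do
    resp=$(curl -s --location --request POST "https://api.pdf.co/v1/job/check" \
      --header "x-api-key: ${PDFCO_API_KEY}" \
      --header "Content-Type: application/json" \
      -d "{\"jobid\": \"${job_id}\"}") || return 1
    # Assumed field: "status" is "working" until the job completes.
    status=$(printf '%s' "$resp" | jq -r '.status // empty')
    if [ "$status" != "working" ]; then
      printf '%s\n' "$resp"
      # Succeed only when the assumed final state is "success".
      [ "$status" = "success" ]
      return
    fi
    try=$((try + 1))
    sleep "$interval"
  done
  echo "job ${job_id} still working after ${max_tries} checks" >&2
  return 1
}
```

The function prints the last response, so the caller can read the output URL from it once the job has finished.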

Common Parameters

| Parameter | Type | Description |
|---|---|---|
| url | string | URL to source file (required) |
| inline | boolean | Return result in response body |
| async | boolean | Run as background job |
| pages | string | Page range, 1-indexed (e.g., "1-3", "1,3,5", "2-") |
| name | string | Output filename |
| password | string | PDF password if protected |
| expiration | integer | Output link expiration in minutes (default: 60) |

Response Format

json
{
  "url": "https://pdf-temp-files.s3.amazonaws.com/.../result.pdf",
  "pageCount": 5,
  "error": false,
  "status": 200,
  "name": "result.pdf",
  "credits": 10,
  "remainingCredits": 9990
}
With inline: true, the response includes a body field with the extracted content.
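A response like the one above can be checked before its result is used. The following sketch reads a response on stdin, fails when error is true, and otherwise prints the inline body if present or the output URL if not. The function name pdfco_result is hypothetical, and the message field read on failure is an assumption about the error response that is not shown in this guide.

```bash
# Hypothetical helper: read a JSON response on stdin and print its result.
pdfco_result() {
  local resp
  resp=$(cat)
  if [ "$(printf '%s' "$resp" | jq -r '.error')" = "true" ]; then
    # Assumed field: "message" describes the failure, when present.
    printf '%s' "$resp" | jq -r '.message // "request failed"' >&2
    return 1
  fi
  # Prefer the inline body; fall back to the output file URL.
  printf '%s' "$resp" | jq -r '.body // .url'
}
```

It is intended to sit at the end of a request, reading the JSON that curl prints.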

API Endpoints

| Endpoint | Description |
|---|---|
| /pdf/convert/to/text | PDF to text (OCR supported) |
| /pdf/convert/to/csv | PDF to CSV |
| /pdf/convert/to/json | PDF to JSON |
| /pdf/merge | Merge multiple PDFs |
| /pdf/split | Split PDF by pages |
| /pdf/optimize | Compress PDF |
| /pdf/convert/from/html | HTML to PDF |
| /pdf/convert/from/url | URL to PDF |
| /ai-invoice-parser | AI-powered invoice parsing |
| /document-parser | Template-based document parsing |
| /file/upload/get-presigned-url | Get upload URL |
| /job/check | Check async job status |

Guidelines

  1. File Sources: Use direct URLs or upload files first via presigned URL
  2. Large Files: Use async: true for files over 40 pages or 10MB
  3. OCR: Automatically enabled for scanned PDFs (set lang for non-English)
  4. Rate Limits: Check your plan at https://pdf.co/pricing
  5. Output Expiration: Download results within expiration time (default 60 min)
  6. Credits: Each operation costs credits; check remainingCredits in response
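The size half of the large-file guideline can be applied automatically to a local file before it is uploaded. This is a sketch under assumptions: the function name pdfco_async_flag is hypothetical, 10MB is taken as 10 * 1024 * 1024 bytes, and the 40-page limit is not checked because counting pages needs a PDF tool.

```bash
# Hypothetical helper: print "true" when a local file is larger than the
# limit in bytes (default 10485760), otherwise "false".
# Usage: pdfco_async_flag <file> [limit-bytes]
pdfco_async_flag() {
  local file="$1" limit="${2:-10485760}" size
  # Arithmetic expansion strips any padding that wc adds on some systems.
  size=$(($(wc -c < "$file")))
  if [ "$size" -gt "$limit" ]; then
    echo true
  else
    echo false
  fi
}
```

The printed value can be used as the async field of the request body.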