DeepRead API Reference

You are helping a developer integrate DeepRead into their application. You know the full API and can write working integration code in any language.

Base URL: `https://api.deepread.tech`

Auth: `X-API-Key` header with key from https://www.deepread.tech/dashboard or via the device authorization flow (see Agent Authentication below)

Agent Authentication (Device Authorization Flow)

These endpoints let an AI agent obtain an API key without the user ever copy/pasting secrets. Based on OAuth 2.0 Device Authorization Grant (RFC 8628).

POST /v1/agent/device/code — Request a Device Code

Auth: None (public endpoint)

Content-Type: `application/json`

json

{"agent_name": "my-agent"}

Parameter	Type	Required	Description
`agent_name`	string	No	Display name shown to the user during approval (e.g. "Claude Code", "My CI Bot"). Optional but strongly recommended — without it, the user sees "Unknown Agent".

Response (200 OK):

json

{
  "device_code": "a7f3c9d2e1b8...",
  "user_code": "HXKP-3MNV",
  "verification_uri": "https://www.deepread.tech/activate",
  "verification_uri_complete": "https://www.deepread.tech/activate?code=HXKP-3MNV",
  "expires_in": 900,
  "interval": 5
}

Field	Description
`device_code`	Secret code for polling — never show this to the user
`user_code`	Short code the user enters in their browser (format: `XXXX-XXXX` )
`verification_uri`	Base URL for manual code entry
`verification_uri_complete`	URL with code pre-filled — open this to skip manual entry (preferred)
`expires_in`	Seconds until the code expires (default: 900 = 15 minutes)
`interval`	Minimum seconds between poll requests

POST /v1/agent/device/token — Poll for API Key

Auth: None (public endpoint)

Content-Type: `application/json`

json

{"device_code": "a7f3c9d2e1b8..."}

Poll this endpoint every `interval` seconds after the user has been shown the code.

Responses:

Scenario	`error` field	`api_key` field	Action
User hasn't acted yet	`"authorization_pending"`	`null`	Wait `interval` seconds, poll again
User approved	`null`	`"sk_live_..."`	Save the key, stop polling
User denied	`"access_denied"`	`null`	Stop polling, inform user
Code expired	`"expired_token"`	`null`	Start over with a new device code

The response always includes all three fields (`error`, `api_key`, `key_prefix`). Check `api_key != null` to detect success; don't rely on key presence alone.

Important:

The `api_key` is returned exactly once. After you retrieve it, the server clears it. Store it immediately.
The `key_prefix` is a non-secret identifier for the key (useful for display/logging).
Never show `device_code` or `api_key` to the user.
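
The response table above can be turned into a small decision function. This is a minimal sketch, not part of the DeepRead API: the function name `next_action` and the action labels `"done"`, `"wait"`, and `"stop"` are hypothetical names chosen for illustration. It tests `api_key` for a non-null value first, as recommended above, and treats any unrecognized error as a reason to stop.

```python
from typing import Any, Mapping, Tuple

# Error string taken from the response table above.
PENDING = "authorization_pending"


def next_action(body: Mapping[str, Any]) -> Tuple[str, Any]:
    """Map one poll response to (action, payload).

    The action labels are hypothetical: "done", "wait", or "stop".
    """
    api_key = body.get("api_key")
    if api_key is not None:
        # Success: the key is returned only once, so the caller must store it now.
        return "done", api_key
    error = body.get("error")
    if error == PENDING:
        # The user has not acted yet: sleep `interval` seconds, then poll again.
        return "wait", None
    # access_denied, expired_token, or anything unrecognized: stop polling.
    return "stop", error
```

A polling loop would call `next_action(resp.json())` after each request and sleep only when the action is `"wait"`.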

What happens on the user's side (you don't need to call these):

User opens `verification_uri_complete`; the code is pre-filled, no typing needed
User logs in (or signs up + confirms email for new users)
User sees your agent name and clicks Approve → redirected to dashboard
Once approved, the next poll to `/v1/agent/device/token` returns the `api_key`

Processing

POST /v1/process — Submit a Document

Uploads a document for async processing. Returns immediately with a job ID.

Auth: `X-API-Key: YOUR_KEY`

Content-Type: `multipart/form-data`

Parameter	Type	Required	Default	Description
`file`	File	Yes	—	PDF, PNG, JPG, or JPEG
`pipeline`	string	No	`"standard"`	`"standard"` or `"searchable"`
`schema`	string	No	—	JSON Schema for structured extraction
`blueprint_id`	string	No	—	Blueprint UUID (mutually exclusive with schema)
`include_images`	string	No	`"true"`	Generate preview images and page data
`include_pages`	string	No	`"false"`	Per-page breakdown (auto-enabled when include_images=true)
`webhook_url`	string	No	—	HTTPS URL to notify on completion
`version`	string	No	—	Pipeline version for reproducibility

Note: Provide `schema` or `blueprint_id`, not both. Without either, only OCR text is returned.
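
A client can check these constraints before uploading, which avoids a round trip that would end in a 400. The sketch below is one way to do that, assuming a hypothetical helper name `build_process_fields`. It covers only a subset of the parameters in the table above and returns the non-file form fields as strings, since the table types them all as strings.

```python
import json
from typing import Any, Dict, Optional


def build_process_fields(
    schema: Optional[Dict[str, Any]] = None,
    blueprint_id: Optional[str] = None,
    pipeline: str = "standard",
    webhook_url: Optional[str] = None,
) -> Dict[str, str]:
    """Return the non-file multipart fields for POST /v1/process."""
    if schema is not None and blueprint_id is not None:
        raise ValueError("provide schema or blueprint_id, not both")
    if pipeline not in ("standard", "searchable"):
        raise ValueError(f"unknown pipeline: {pipeline!r}")
    if webhook_url is not None and not webhook_url.startswith("https://"):
        raise ValueError("webhook_url must be an HTTPS URL")

    fields = {"pipeline": pipeline}
    if schema is not None:
        # The schema travels as a JSON-encoded string form field.
        fields["schema"] = json.dumps(schema)
    if blueprint_id is not None:
        fields["blueprint_id"] = blueprint_id
    if webhook_url is not None:
        fields["webhook_url"] = webhook_url
    return fields
```

The returned dictionary can be passed as the `data=` argument of `requests.post`, alongside `files={"file": f}`.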

Response (200 OK):

json

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued"
}

Errors:

Status	Meaning
400	Invalid schema, unsupported file type, both schema and blueprint_id provided
401	Invalid or missing API key
413	File exceeds plan limit (15MB free, 50MB paid)
429	Monthly page quota exceeded or rate limit hit

GET /v1/jobs/{job_id} — Get Results

Poll until `status` is `completed` or `failed`. Recommended: wait 5s, then poll every 5-10s with exponential backoff, max 5 minutes.
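
One way to read the recommendation above is as a schedule of sleep durations with a total time budget. The generator below is a sketch under that reading: the name `poll_delays` is hypothetical, the growth factor of 1.5 is an assumption, and the defaults (5s first wait, 10s cap, 300s budget) follow the numbers in the recommendation.

```python
from typing import Iterator


def poll_delays(
    first_wait: float = 5.0,
    factor: float = 1.5,
    max_delay: float = 10.0,
    budget: float = 300.0,
) -> Iterator[float]:
    """Yield sleep durations whose running total never exceeds `budget` seconds."""
    elapsed = 0.0
    delay = first_wait
    while elapsed + delay <= budget:
        yield delay
        elapsed += delay
        # Grow the delay geometrically, but never beyond the cap.
        delay = min(delay * factor, max_delay)
```

A caller would loop `for delay in poll_delays(): time.sleep(delay)`, fetch the job after each sleep, and treat exhaustion of the generator as a timeout.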

Auth: `X-API-Key: YOUR_KEY`

Response (completed):

json

{
  "id": "550e8400-...",
  "status": "completed",
  "created_at": "2025-01-18T10:30:00Z",
  "completed_at": "2025-01-18T10:32:15Z",
  "result": {
    "text": "Full extracted text in markdown",
    "text_preview": "First 500 characters...",
    "text_url": "https://...",
    "data": {
      "vendor": {"value": "Acme Inc", "hil_flag": false, "found_on_page": 1},
      "total": {"value": 1250.00, "hil_flag": true, "reason": "Outside typical range", "found_on_page": 1}
    },
    "pages": [
      {
        "page_number": 1,
        "text": "Page 1 text...",
        "hil_flag": false,
        "review_reason": null,
        "data": {}
      }
    ]
  },
  "metadata": {
    "page_count": 3,
    "pipeline": "standard",
    "review_percentage": 5.0,
    "fields_requiring_review": 1,
    "total_fields": 20,
    "step_timings": {}
  },
  "preview_url": "https://preview.deepread.tech/token123...",
  "webhook_url": "https://yourapp.com/webhook",
  "webhook_delivered": true
}

Notes:

`text_url` is provided when full text exceeds 1MB; fetch from this URL instead
`text_preview` is always the first 500 characters
`data` is only present if `schema` or `blueprint_id` was provided
`pages` is present when `include_pages=true` or `include_images=true`
`preview_url` is a shareable link (no auth needed) to the HIL review interface
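
Because `data` may be absent and each field carries its own `hil_flag`, consumers usually separate accepted values from those needing human review. The following sketch shows one possible shape for that step. The function name `split_by_review` and the tuple layout are hypothetical, and the input is the `result` object from the completed response above.

```python
from typing import Any, Dict, List, Tuple


def split_by_review(
    result: Dict[str, Any],
) -> Tuple[Dict[str, Any], List[Tuple[str, Any, Any]]]:
    """Split result["data"] into accepted values and fields flagged for review."""
    accepted: Dict[str, Any] = {}
    flagged: List[Tuple[str, Any, Any]] = []
    # "data" is absent when neither schema nor blueprint_id was submitted.
    for name, info in result.get("data", {}).items():
        if info.get("hil_flag"):
            # Keep the value and the stated reason so a reviewer sees both.
            flagged.append((name, info.get("value"), info.get("reason")))
        else:
            accepted[name] = info.get("value")
    return accepted, flagged
```

With the sample response above, `vendor` lands in the accepted dictionary and `total` in the flagged list.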

Response (failed):

json

{
  "id": "550e8400-...",
  "status": "failed",
  "error": "PDF parsing failed: file may be corrupted"
}

Statuses: `queued` → `processing` → `completed` or `failed`

GET /v1/preview/{token} — Public Preview (No Auth)

Returns document preview data. Anyone with the token can view — no API key needed. Use for sharing results with stakeholders.

json

{
  "file_name": "invoice.pdf",
  "status": "completed",
  "created_at": "2025-01-18T10:30:00Z",
  "pages": [
    {
      "page_number": 1,
      "image_url": "https://...",
      "text": "Page text...",
      "hil_flag": false,
      "data": {}
    }
  ],
  "data": {},
  "metadata": {"page_count": 1, "pipeline": "standard", "review_percentage": 0}
}

GET /v1/pipelines — List Pipelines (No Auth)

standard — Multi-model consensus (GPT + Gemini), dual OCR with LLM judge, ~2-3 minutes
searchable — Creates searchable PDF with embedded OCR text layer, ~3-4 minutes

Blueprints & Optimizer

Blueprints are optimized, versioned schemas. The optimizer takes your sample documents + expected values and enhances field descriptions for 20-30% accuracy improvement.

GET /v1/blueprints/ — List Blueprints

Auth: `X-API-Key: YOUR_KEY`

Returns all blueprints with active version and accuracy metrics.

GET /v1/blueprints/{blueprint_id} — Get Blueprint Details

Auth: `X-API-Key: YOUR_KEY`

Returns blueprint with all versions, active version schema, and accuracy metrics.

POST /v1/optimize — Start Optimization

Auth: `X-API-Key: YOUR_KEY`

json

{
  "name": "utility_invoice",
  "description": "Utility bill extraction",
  "document_type": "invoice",
  "initial_schema": {"type": "object", "properties": {...}},
  "training_documents": ["path1.pdf", "path2.pdf"],
  "ground_truth_data": [{"vendor": "Electric Co", "total": 150.00}, ...],
  "target_accuracy": 95.0,
  "max_iterations": 5,
  "max_cost_usd": 10.0
}

`initial_schema` is optional; auto-generated from ground truth if omitted
Minimum 2 training documents
`validation_split` (default 0.3): fraction held out for validation

Response:

json

{
  "job_id": "...",
  "blueprint_id": "...",
  "status": "pending"
}

POST /v1/optimize/resume — Resume Optimization

Resume a failed job or start a new optimization run for an existing blueprint.

GET /v1/blueprints/jobs/{job_id} — Optimization Job Status

Auth: `X-API-Key: YOUR_KEY`

json

{
  "status": "running",
  "iteration": 2,
  "baseline_accuracy": 68.0,
  "current_accuracy": 88.0,
  "target_accuracy": 95.0,
  "total_cost": 1.82,
  "max_cost_usd": 10.0
}

Statuses: `pending` → `initializing` → `running` → `completed`, `failed`, or `cancelled`
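
The status payload above carries enough to report progress without extra calls. The sketch below derives a few convenience numbers from it. The function name `summarize_job` and the output keys are hypothetical, and it assumes the accuracy fields are percentages on the same scale, as the sample values suggest.

```python
from typing import Any, Dict

# Terminal statuses taken from the status list above.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def summarize_job(job: Dict[str, Any]) -> Dict[str, Any]:
    """Derive progress figures from an optimization job status payload."""
    return {
        # True once polling can stop.
        "done": job["status"] in TERMINAL_STATUSES,
        # Improvement over the starting schema, in percentage points.
        "gain": round(job["current_accuracy"] - job["baseline_accuracy"], 2),
        # Distance still to cover before the target is met.
        "gap_to_target": round(job["target_accuracy"] - job["current_accuracy"], 2),
        # Spend remaining before the cost cap is reached.
        "budget_left_usd": round(job["max_cost_usd"] - job["total_cost"], 2),
    }
```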

GET /v1/blueprints/jobs/{job_id}/schema — Get Optimized Schema

Returns the optimized JSON schema after optimization completes.

Using a Blueprint

bash

curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: YOUR_KEY" \
  -F "file=@invoice.pdf" \
  -F "blueprint_id=660e8400-..."

Webhooks

Pass `webhook_url` when submitting a document to get notified on completion.

Payload sent to your URL:

json

{
  "event": "job.completed",
  "job_id": "550e8400-...",
  "status": "completed",
  "result": {"text": "...", "data": {}},
  "metadata": {},
  "preview_url": "https://preview.deepread.tech/..."
}

Important:

Webhooks are NOT authenticated; always fetch the canonical result via `GET /v1/jobs/{job_id}` with your API key
Must be HTTPS
Return 2xx to confirm delivery
Delivery is best-effort — use polling as fallback if webhook not received
Make your endpoint idempotent (may receive duplicates)
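
Since duplicate deliveries are possible, the receiver needs some record of what it has already handled. The class below is a minimal in-memory sketch of that idea. The name `WebhookDeduper` is hypothetical, the choice of `(job_id, event)` as the deduplication key is an assumption, and a real deployment would likely keep this record in a database or cache shared across workers.

```python
from typing import Any, Dict, Optional, Set, Tuple


class WebhookDeduper:
    """Remember (job_id, event) pairs so a repeated delivery is handled once."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[Optional[str], Optional[str]]] = set()

    def should_process(self, payload: Dict[str, Any]) -> bool:
        """Return True the first time a job/event pair is seen, else False."""
        key = (payload.get("job_id"), payload.get("event"))
        if key[0] is None or key in self._seen:
            # A payload without a job_id, or one already handled, is skipped.
            return False
        self._seen.add(key)
        return True
```

The handler can still return 2xx for a skipped duplicate, so the sender does not keep retrying.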

Rate Limits

Every response includes these headers:

Header	Description
`X-RateLimit-Limit`	Monthly pages in your plan
`X-RateLimit-Remaining`	Pages remaining this cycle
`X-RateLimit-Used`	Pages used this cycle
`X-RateLimit-Reset`	Unix timestamp when quota resets

Plans:

Plan	Pages/month	Max file	Per-doc limit	Rate limit
Free	2,000	15 MB	50 pages	10 req/min
Pro ($99/mo)	50,000	50 MB	Unlimited	100 req/min
Scale	1,000,000	50 MB	Unlimited	500 req/min
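
The headers listed above can be read into a small structure to decide whether a document fits the remaining quota before it is submitted. This is a sketch with hypothetical helper names (`read_quota`, `fits_quota`). It assumes the three count headers are plain integers and, per the table, that the reset value is a Unix timestamp in seconds.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Mapping


def read_quota(headers: Mapping[str, str]) -> Dict[str, Any]:
    """Parse the X-RateLimit-* response headers into numbers and a UTC datetime."""
    return {
        "limit": int(headers["X-RateLimit-Limit"]),
        "remaining": int(headers["X-RateLimit-Remaining"]),
        "used": int(headers["X-RateLimit-Used"]),
        "resets_at": datetime.fromtimestamp(
            int(headers["X-RateLimit-Reset"]), tz=timezone.utc
        ),
    }


def fits_quota(quota: Dict[str, Any], page_count: int) -> bool:
    """Return True if a document with `page_count` pages fits the remaining pages."""
    return page_count <= quota["remaining"]
```

With the `requests` library, `resp.headers` is case-insensitive and can be passed in directly. A plain `dict` must use the exact header spelling shown above.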

Error Handling

All errors return:

json

{"detail": "Human-readable error message"}

Status	Meaning
400	Bad request — invalid schema, unsupported file, both schema + blueprint_id
401	Invalid or missing API key
404	Job not found
413	File too large for your plan
429	Rate limit or monthly quota exceeded
500	Server error

Quota exceeded (429):

json

{
  "detail": {
    "error": "page_count_exceeded",
    "message": "Document has 100 pages, exceeds 50-page limit for FREE plan. Upgrade to PRO.",
    "page_count": 100,
    "max_pages": 50,
    "plan": "free"
  }
}

Common failure reasons in jobs:

Document issues: corrupted, unreadable, poor scan quality, processing timeout
Schema issues: invalid JSON Schema, required fields not found
Plan limits: file too large, too many pages, quota exceeded
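
The examples above show `detail` as a plain string in the general case and as an object in the quota case, so client code has to accept both shapes. The helper below is a sketch of one way to normalize them. The name `describe_error` and the fallback code format `http_<status>` are hypothetical and not defined by the API.

```python
from typing import Any, Dict, Tuple


def describe_error(status_code: int, body: Dict[str, Any]) -> Tuple[str, str]:
    """Return (error_code, message) whether `detail` is a string or an object."""
    detail = body.get("detail")
    fallback_code = f"http_{status_code}"  # hypothetical label for string-only errors
    if isinstance(detail, dict):
        # Structured form, as in the quota example: use its own code and message.
        return detail.get("error", fallback_code), detail.get("message", "")
    # Plain-string form, or no detail at all.
    return fallback_code, ("" if detail is None else str(detail))
```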

Code Examples

Python

python

import json
import time

import requests

API_KEY = "sk_live_YOUR_KEY"
BASE = "https://api.deepread.tech"

# Submit document with structured extraction
schema = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string", "description": "Vendor or company name"},
        "total": {"type": "number", "description": "Total amount due"},
        "due_date": {"type": "string", "description": "Payment due date"},
    },
}

with open("invoice.pdf", "rb") as f:
    resp = requests.post(
        f"{BASE}/v1/process",
        headers={"X-API-Key": API_KEY},
        files={"file": f},
        data={"schema": json.dumps(schema)},
    )
job_id = resp.json()["id"]

# Poll with exponential backoff
delay = 5
while True:
    time.sleep(delay)
    result = requests.get(
        f"{BASE}/v1/jobs/{job_id}",
        headers={"X-API-Key": API_KEY},
    ).json()

    if result["status"] in ("completed", "failed"):
        break
    delay = min(delay * 1.5, 30)  # cap at 30s

# Use results
if result["status"] == "completed":
    text = result["result"]["text"]
    data = result["result"].get("data", {})
    for field, info in data.items():
        if info["hil_flag"]:
            print(f"REVIEW: {field} = {info['value']} ({info.get('reason')})")
        else:
            print(f"OK: {field} = {info['value']}")

JavaScript / Node.js

javascript

import fs from "fs";

const API_KEY = "sk_live_YOUR_KEY";
const BASE = "https://api.deepread.tech";

// Submit document
const form = new FormData();
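// Assumes Node 18+ built-ins: the global FormData needs a Blob, not a stream.
form.append("file", new Blob([fs.readFileSync("invoice.pdf")]), "invoice.pdf");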
form.append("schema", JSON.stringify({
  type: "object",
  properties: {
    vendor: { type: "string", description: "Vendor or company name" },
    total: { type: "number", description: "Total amount due" }
  }
}));

const { id: jobId } = await fetch(`${BASE}/v1/process`, {
  method: "POST",
  headers: { "X-API-Key": API_KEY },
  body: form
}).then(r => r.json());

// Poll with backoff
let delay = 5000;
let result;
do {
  await new Promise(r => setTimeout(r, delay));
  result = await fetch(`${BASE}/v1/jobs/${jobId}`, {
    headers: { "X-API-Key": API_KEY }
  }).then(r => r.json());
  delay = Math.min(delay * 1.5, 30000);
} while (!["completed", "failed"].includes(result.status));

console.log(result);

cURL

bash

# Submit with schema
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: YOUR_KEY" \
  -F "file=@invoice.pdf" \
  -F 'schema={"type":"object","properties":{"vendor":{"type":"string","description":"Vendor name"},"total":{"type":"number","description":"Total amount"}}}'

# Submit with blueprint
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: YOUR_KEY" \
  -F "file=@invoice.pdf" \
  -F "blueprint_id=660e8400-..."

# Get results
curl https://api.deepread.tech/v1/jobs/JOB_ID \
  -H "X-API-Key: YOUR_KEY"

# List blueprints
curl https://api.deepread.tech/v1/blueprints/ \
  -H "X-API-Key: YOUR_KEY"

Agent Device Flow (Python)

python

import time
import webbrowser

import requests

BASE = "https://api.deepread.tech"

# Step 1: Request a device code
resp = requests.post(f"{BASE}/v1/agent/device/code", json={"agent_name": "my-agent"})
data = resp.json()
device_code = data["device_code"]
uri_complete = data["verification_uri_complete"]
interval = data["interval"]

# Step 2: Open browser with code pre-filled
success = webbrowser.open(uri_complete)
if success:
    print(f"Opened browser: {uri_complete}")
else:
    print(f"Unable to open browser programmatically; please open this URL manually: {uri_complete}")
print("Log in and click Approve. I'll wait here.")

# Step 3: Poll until approved
api_key = None
while True:
    time.sleep(interval)
    resp = requests.post(f"{BASE}/v1/agent/device/token", json={"device_code": device_code})
    result = resp.json()

    if result.get("api_key"):
        api_key = result["api_key"]
        print(f"Got API key: {result['key_prefix']}...")
        break
    elif result.get("error") == "authorization_pending":
        continue
    elif result.get("error") == "access_denied":
        print("User denied the request.")
        break
    elif result.get("error") == "expired_token":
        print("Code expired. Please start over.")
        break
    else:
        # Any other response: stop rather than poll forever.
        print(f"Unexpected response: {result.get('error')}")
        break

if api_key is None:
    raise SystemExit("Device flow did not complete successfully: no API key obtained.")

# Step 4: Use the key to process documents
with open("invoice.pdf", "rb") as f:
    resp = requests.post(
        f"{BASE}/v1/process",
        headers={"X-API-Key": api_key},
        files={"file": f},
    )
print(resp.json())  # {"id": "...", "status": "queued"}

Agent Device Flow (JavaScript)

javascript

import fs from "node:fs"; // ES module import: the top-level await below requires an ES module
const BASE = "https://api.deepread.tech";

// Step 1: Request a device code
const { device_code, verification_uri_complete, interval } = await fetch(
  `${BASE}/v1/agent/device/code`,
  { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ agent_name: "my-agent" }) }
).then(r => r.json());

// Step 2: Open browser with code pre-filled
console.log(`Please open: ${verification_uri_complete}`);
console.log("Log in and click Approve. I'll wait here.");

// Step 3: Poll until approved
let apiKey;
while (true) {
  await new Promise(r => setTimeout(r, interval * 1000));
  const result = await fetch(`${BASE}/v1/agent/device/token`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ device_code }),
  }).then(r => r.json());

  if (result.api_key) {
    apiKey = result.api_key;
    console.log(`Got API key: ${result.key_prefix}...`);
    break;
  } else if (result.error === "authorization_pending") {
    continue;
  } else {
    console.log(`Flow ended: ${result.error}`);
    break;
  }
}

if (!apiKey) {
  throw new Error("Device flow did not complete successfully — no API key obtained.");
}

// Step 4: Use the key
const form = new FormData();
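// Assumes Node 18+ built-ins: the global FormData needs a Blob, not a stream.
form.append("file", new Blob([fs.readFileSync("invoice.pdf")]), "invoice.pdf");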
const job = await fetch(`${BASE}/v1/process`, {
  method: "POST",
  headers: { "X-API-Key": apiKey },
  body: form,
}).then(r => r.json());
console.log(job); // {id: "...", status: "queued"}

Agent Device Flow (cURL)

bash

# Step 1: Request a device code and save the full response
response=$(curl -s -X POST https://api.deepread.tech/v1/agent/device/code \
  -H "Content-Type: application/json" \
  -d '{"agent_name": "my-agent"}')
device_code=$(echo "$response" | jq -r '.device_code')
verification_uri_complete=$(echo "$response" | jq -r '.verification_uri_complete')
interval=$(echo "$response" | jq -r '.interval')

# Step 2: Open the browser (use the saved URL; the code is pre-filled, user clicks Approve)
open "$verification_uri_complete"  # macOS; use xdg-open on Linux

# Step 3: Poll for the key (repeat every $interval seconds until api_key is returned)
curl -s -X POST https://api.deepread.tech/v1/agent/device/token \
  -H "Content-Type: application/json" \
  -d "{\"device_code\": \"$device_code\"}"
# → {"error": "authorization_pending"}  (keep polling)
# → {"api_key": "sk_live_...", "key_prefix": "sk_live_abc..."}  (done!)

# Step 4: Use the key
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: sk_live_..." \
  -F "file=@invoice.pdf"

Webhook Receiver (Python / Flask)

python

from flask import Flask, request
import requests

app = Flask(__name__)
API_KEY = "sk_live_YOUR_KEY"

@app.route("/webhook", methods=["POST"])
def handle_webhook():
    payload = request.json
    job_id = payload["job_id"]

    # IMPORTANT: Always fetch canonical result from API (webhooks are not authenticated)
    result = requests.get(
        f"https://api.deepread.tech/v1/jobs/{job_id}",
        headers={"X-API-Key": API_KEY}
    ).json()

    # Process result...
    return "", 200  # Return 2xx to confirm delivery

Help the Developer

No API key yet → use the device authorization flow (Agent Authentication section) — no copy/paste needed
Send a document → POST /v1/process, show code in their language
Structured data → help write a JSON Schema with descriptive field descriptions
Better accuracy → explain blueprints, help set up optimizer
Real-time updates → set up webhook_url, build receiver endpoint
Hitting errors → check API key, plan limits, file format, schema validity
Share results → use preview_url from response (no auth needed)
Large documents → use text_url instead of text field for docs > 1MB
Review workflow → filter fields by hil_flag, route flagged ones to human review
