DeepRead API Reference
You are helping a developer integrate DeepRead into their application. You know the full API and can write working integration code in any language.
Base URL: `https://api.deepread.tech`
Auth: `X-API-Key` header with a key from `https://www.deepread.tech/dashboard` or via the device authorization flow (see Agent Authentication below)

Agent Authentication (Device Authorization Flow)
These endpoints let an AI agent obtain an API key without the user ever copy/pasting secrets. Based on OAuth 2.0 Device Authorization Grant (RFC 8628).
POST /v1/agent/device/code — Request a Device Code
Auth: None (public endpoint)
Content-Type: `application/json`

```json
{"agent_name": "my-agent"}
```

| Parameter | Type | Required | Description |
|---|---|---|---|
| `agent_name` | string | No | Display name shown to the user during approval (e.g. "Claude Code", "My CI Bot"). Optional but strongly recommended: without it, the user sees "Unknown Agent". |

Response (200 OK):

```json
{
  "device_code": "a7f3c9d2e1b8...",
  "user_code": "HXKP-3MNV",
  "verification_uri": "https://www.deepread.tech/activate",
  "verification_uri_complete": "https://www.deepread.tech/activate?code=HXKP-3MNV",
  "expires_in": 900,
  "interval": 5
}
```

| Field | Description |
|---|---|
| `device_code` | Secret code for polling. Never show this to the user |
| `user_code` | Short code the user enters in their browser (for example `HXKP-3MNV`) |
| `verification_uri` | Base URL for manual code entry |
| `verification_uri_complete` | URL with code pre-filled. Open this to skip manual entry (preferred) |
| `expires_in` | Seconds until the code expires (default: 900 = 15 minutes) |
| `interval` | Minimum seconds between poll requests |
POST /v1/agent/device/token — Poll for API Key
Auth: None (public endpoint)
Content-Type: `application/json`

```json
{"device_code": "a7f3c9d2e1b8..."}
```

Poll this endpoint every `interval` seconds after the user has been shown the code.

Responses:

| Scenario | `error` | `api_key` | Action |
|---|---|---|---|
| User hasn't acted yet | `"authorization_pending"` | `null` | Wait, then poll again |
| User approved | `null` | `"sk_live_..."` | Save the key, stop polling |
| User denied | `"access_denied"` | `null` | Stop polling, inform user |
| Code expired | `"expired_token"` | `null` | Start over with a new device code |

The response always includes all three fields (`error`, `api_key`, `key_prefix`). Check `api_key != null` to detect success; don't rely on key presence alone.

Important:
- The `api_key` is returned exactly once. After you retrieve it, the server clears it. Store it immediately.
- The `key_prefix` is a non-secret identifier for the key (useful for display/logging).
- Never show `device_code` or `api_key` to the user.

What happens on the user's side (you don't need to call these):
- User opens `verification_uri_complete`. The code is pre-filled, no typing needed
- User logs in (or signs up + confirms email for new users)
- User sees your agent name and clicks Approve → redirected to dashboard
- Once approved, the next poll to `/v1/agent/device/token` returns the `api_key`
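The polling rules for `/v1/agent/device/token` can be sketched as a small pure function. This is only an illustration: the function name `interpret_poll` and the action labels are hypothetical, and the error strings are the ones used in the device-flow examples in this document.

```python
from typing import Optional, Tuple

# Error value that means "keep polling"; taken from this document's examples.
PENDING = "authorization_pending"


def interpret_poll(body: dict) -> Tuple[str, Optional[str]]:
    """Map a token-poll response body to (action, api_key).

    action is "save" (key issued), "wait" (poll again), or "stop".
    """
    api_key = body.get("api_key")
    # All three fields are always present, so test the value, not the key.
    if api_key is not None:
        return "save", api_key
    if body.get("error") == PENDING:
        return "wait", None
    # access_denied, expired_token, or anything unrecognised: stop polling.
    return "stop", None
```

A caller would sleep `interval` seconds between polls and store the key as soon as the action is `"save"`, since the key is returned only once.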
Processing
POST /v1/process — Submit a Document
Uploads a document for async processing. Returns immediately with a job ID.

Auth: `X-API-Key: YOUR_KEY`
Content-Type: `multipart/form-data`

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `file` | File | Yes | - | PDF, PNG, JPG, or JPEG |
| `pipeline` | string | No | | One of the pipelines listed by `GET /v1/pipelines` |
| `schema` | string | No | - | JSON Schema for structured extraction |
| `blueprint_id` | string | No | - | Blueprint UUID (mutually exclusive with `schema`) |
| `include_images` | string | No | | Generate preview images and page data |
| `include_pages` | string | No | | Per-page breakdown (auto-enabled when `include_images=true`) |
| `webhook_url` | string | No | - | HTTPS URL to notify on completion |
| | string | No | - | Pipeline version for reproducibility |

Note: Provide `schema` OR `blueprint_id`, not both. Without either, only OCR text is returned.

Response (200 OK):

```json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued"
}
```

Errors:

| Status | Meaning |
|---|---|
| 400 | Invalid schema, unsupported file type, both schema and blueprint_id provided |
| 401 | Invalid or missing API key |
| 413 | File exceeds plan limit (15MB free, 50MB paid) |
| 429 | Monthly page quota exceeded or rate limit hit |
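Several of the 400 errors for `POST /v1/process` can be caught before uploading. The sketch below builds the non-file form fields and enforces the `schema` / `blueprint_id` exclusivity and the HTTPS requirement for `webhook_url`. The helper name `build_process_fields` is hypothetical; only the field names come from this reference.

```python
import json
from typing import Dict, Optional


def build_process_fields(schema: Optional[dict] = None,
                         blueprint_id: Optional[str] = None,
                         webhook_url: Optional[str] = None) -> Dict[str, str]:
    """Return the multipart form fields (excluding the file) for /v1/process."""
    if schema is not None and blueprint_id is not None:
        raise ValueError("Provide schema or blueprint_id, not both")
    if webhook_url is not None and not webhook_url.startswith("https://"):
        raise ValueError("webhook_url must be an HTTPS URL")

    fields: Dict[str, str] = {}
    if schema is not None:
        # The schema travels as a JSON string inside the form.
        fields["schema"] = json.dumps(schema)
    if blueprint_id is not None:
        fields["blueprint_id"] = blueprint_id
    if webhook_url is not None:
        fields["webhook_url"] = webhook_url
    return fields
```

The returned dictionary can be passed as `data=` to `requests.post` alongside `files={"file": f}`; with no arguments it is empty, which corresponds to an OCR-only request.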
GET /v1/jobs/{job_id} — Get Results
Poll until `status` is `completed` or `failed`. Recommended: wait 5s, then poll every 5-10s with exponential backoff, max 5 minutes.

Auth: `X-API-Key: YOUR_KEY`

Response (completed):

```json
{
  "id": "550e8400-...",
  "status": "completed",
  "created_at": "2025-01-18T10:30:00Z",
  "completed_at": "2025-01-18T10:32:15Z",
  "result": {
    "text": "Full extracted text in markdown",
    "text_preview": "First 500 characters...",
    "text_url": "https://...",
    "data": {
      "vendor": {"value": "Acme Inc", "hil_flag": false, "found_on_page": 1},
      "total": {"value": 1250.00, "hil_flag": true, "reason": "Outside typical range", "found_on_page": 1}
    },
    "pages": [
      {
        "page_number": 1,
        "text": "Page 1 text...",
        "hil_flag": false,
        "review_reason": null,
        "data": {}
      }
    ]
  },
  "metadata": {
    "page_count": 3,
    "pipeline": "standard",
    "review_percentage": 5.0,
    "fields_requiring_review": 1,
    "total_fields": 20,
    "step_timings": {}
  },
  "preview_url": "https://preview.deepread.tech/token123...",
  "webhook_url": "https://yourapp.com/webhook",
  "webhook_delivered": true
}
```

Notes:
- `text_url` is provided when full text exceeds 1MB; fetch from this URL instead
- `text_preview` is always the first 500 characters
- `data` is only present if `schema` or `blueprint_id` was provided
- `pages` is present when `include_pages=true` or `include_images=true`
- `preview_url` is a shareable link (no auth needed) to the HIL review interface

Response (failed):

```json
{
  "id": "550e8400-...",
  "status": "failed",
  "error": "PDF parsing failed: file may be corrupted"
}
```

Statuses: `queued` → `processing` → `completed` or `failed`
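The recommended polling cadence for `GET /v1/jobs/{job_id}` (start at 5s, back off to at most 10s, give up after 5 minutes) can be expressed as a generator of sleep durations. This is a sketch under assumptions: the name `poll_delays` and the growth factor of 1.5 are illustrative choices, not values defined by the API.

```python
from typing import Iterator


def poll_delays(first: float = 5.0, factor: float = 1.5,
                cap: float = 10.0, budget: float = 300.0) -> Iterator[float]:
    """Yield sleep durations (seconds) until their sum would exceed `budget`."""
    elapsed = 0.0
    delay = first
    while elapsed + delay <= budget:
        yield delay
        elapsed += delay
        # Exponential growth, clamped to the cap.
        delay = min(delay * factor, cap)
```

A polling loop would iterate over `poll_delays()`, sleep for each value, fetch the job, and break once `status` is `completed` or `failed`; if the generator is exhausted first, the job has taken longer than the 5-minute budget.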
GET /v1/preview/{token} — Public Preview (No Auth)
Returns document preview data. Anyone with the token can view it; no API key is needed. Use it to share results with stakeholders.

```json
{
  "file_name": "invoice.pdf",
  "status": "completed",
  "created_at": "2025-01-18T10:30:00Z",
  "pages": [
    {
      "page_number": 1,
      "image_url": "https://...",
      "text": "Page text...",
      "hil_flag": false,
      "data": {}
    }
  ],
  "data": {},
  "metadata": {"page_count": 1, "pipeline": "standard", "review_percentage": 0}
}
```

GET /v1/pipelines — List Pipelines (No Auth)
- standard — Multi-model consensus (GPT + Gemini), dual OCR with LLM judge, ~2-3 minutes
- searchable — Creates searchable PDF with embedded OCR text layer, ~3-4 minutes
Blueprints & Optimizer
Blueprints are optimized, versioned schemas. The optimizer takes your sample documents + expected values and enhances field descriptions for 20-30% accuracy improvement.
GET /v1/blueprints/ — List Blueprints
Auth: `X-API-Key: YOUR_KEY`
Returns all blueprints with active version and accuracy metrics.
GET /v1/blueprints/{blueprint_id} — Get Blueprint Details
Auth: `X-API-Key: YOUR_KEY`
Returns blueprint with all versions, active version schema, and accuracy metrics.
POST /v1/optimize — Start Optimization
Auth: `X-API-Key: YOUR_KEY`

```json
{
  "name": "utility_invoice",
  "description": "Utility bill extraction",
  "document_type": "invoice",
  "initial_schema": {"type": "object", "properties": {...}},
  "training_documents": ["path1.pdf", "path2.pdf"],
  "ground_truth_data": [{"vendor": "Electric Co", "total": 150.00}, ...],
  "target_accuracy": 95.0,
  "max_iterations": 5,
  "max_cost_usd": 10.0
}
```

- `initial_schema` is optional; it is auto-generated from ground truth if omitted
- Minimum 2 training documents
- `validation_split` (default 0.3): fraction held out for validation

Response:

```json
{
  "job_id": "...",
  "blueprint_id": "...",
  "status": "pending"
}
```

POST /v1/optimize/resume — Resume Optimization
Resume a failed job or start a new optimization run for an existing blueprint.
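The constraints on the `POST /v1/optimize` request body (at least two training documents, optional `initial_schema`, `validation_split` defaulting to 0.3) can be checked client-side before starting a run. The sketch below is hypothetical: the helper name is invented, and two of its checks are assumptions not stated in this reference, namely that there is one ground-truth record per training document and that `validation_split` lies strictly between 0 and 1.

```python
from typing import Any, Dict, List, Optional


def build_optimize_request(name: str,
                           training_documents: List[str],
                           ground_truth_data: List[dict],
                           initial_schema: Optional[dict] = None,
                           validation_split: float = 0.3,
                           **limits: Any) -> Dict[str, Any]:
    """Assemble a /v1/optimize body, rejecting obviously invalid input."""
    if len(training_documents) < 2:
        raise ValueError("at least 2 training documents are required")
    # Assumption: one ground-truth record per training document.
    if len(ground_truth_data) != len(training_documents):
        raise ValueError("ground_truth_data must match training_documents")
    # Assumption: the split is a fraction strictly between 0 and 1.
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be between 0 and 1")

    body: Dict[str, Any] = {
        "name": name,
        "training_documents": list(training_documents),
        "ground_truth_data": list(ground_truth_data),
        "validation_split": validation_split,
    }
    if initial_schema is not None:
        body["initial_schema"] = initial_schema
    # Pass through optional limits such as target_accuracy or max_cost_usd.
    body.update(limits)
    return body
```

Leaving `initial_schema` out of the body relies on the documented behaviour that the schema is then generated from the ground truth.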
GET /v1/blueprints/jobs/{job_id} — Optimization Job Status
Auth: `X-API-Key: YOUR_KEY`

```json
{
  "status": "running",
  "iteration": 2,
  "baseline_accuracy": 68.0,
  "current_accuracy": 88.0,
  "target_accuracy": 95.0,
  "total_cost": 1.82,
  "max_cost_usd": 10.0
}
```

Statuses: `pending` → `initializing` → `running` → `completed`, `failed`, or `cancelled`

GET /v1/blueprints/jobs/{job_id}/schema — Get Optimized Schema
Returns the optimized JSON schema after optimization completes.
Using a Blueprint
```bash
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: YOUR_KEY" \
  -F "file=@invoice.pdf" \
  -F "blueprint_id=660e8400-..."
```

Webhooks
Pass `webhook_url` when submitting a document to get notified on completion.

Payload sent to your URL:

```json
{
  "event": "job.completed",
  "job_id": "550e8400-...",
  "status": "completed",
  "result": {"text": "...", "data": {}},
  "metadata": {},
  "preview_url": "https://preview.deepread.tech/..."
}
```

Important:
- Webhooks are NOT authenticated; always fetch the canonical result via `GET /v1/jobs/{job_id}` with your API key
- Must be HTTPS
- Return 2xx to confirm delivery
- Delivery is best-effort; use polling as fallback if the webhook is not received
- Make your endpoint idempotent (may receive duplicates)
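Because the same webhook may arrive more than once, a receiver needs some way to recognise repeats. A minimal in-memory sketch follows; the class name `WebhookDeduper` is hypothetical, and keying on `(job_id, event)` is an assumption about what identifies a delivery. A real deployment would likely keep this state in a database or cache rather than in process memory.

```python
from typing import Set, Tuple


class WebhookDeduper:
    """Remember which (job_id, event) pairs have already been handled."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, str]] = set()

    def should_process(self, payload: dict) -> bool:
        job_id = payload.get("job_id")
        event = payload.get("event")
        # Payloads are unauthenticated, so ignore anything malformed.
        if not job_id or not event:
            return False
        key = (job_id, event)
        if key in self._seen:
            return False  # duplicate delivery
        self._seen.add(key)
        return True
```

The handler should still return 2xx for duplicates so that delivery is confirmed, and should fetch the canonical result from `GET /v1/jobs/{job_id}` before acting on a new payload.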
Rate Limits
Every response includes these headers:
| Header | Description |
|---|---|
| | Monthly pages in your plan |
| | Pages remaining this cycle |
| | Pages used this cycle |
| | Unix timestamp when quota resets |
Plans:
| Plan | Pages/month | Max file | Per-doc limit | Rate limit |
|---|---|---|---|---|
| Free | 2,000 | 15 MB | 50 pages | 10 req/min |
| Pro ($99/mo) | 50,000 | 50 MB | Unlimited | 100 req/min |
| Scale | 1,000,000 | 50 MB | Unlimited | 500 req/min |
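The per-plan file-size and page limits in the Plans table can be checked locally before an upload, which avoids a round trip that would end in a 413 or 429. The sketch below hard-codes the table values; the function name `plan_violations` is hypothetical, and treating 1 MB as 1,000,000 bytes is an assumption, since the reference does not say which definition the server uses.

```python
from typing import Dict, List, Optional, Tuple

MB = 1_000_000  # assumption: decimal megabytes

# plan -> (max file size in bytes, max pages per document or None for unlimited)
PLAN_LIMITS: Dict[str, Tuple[int, Optional[int]]] = {
    "free": (15 * MB, 50),
    "pro": (50 * MB, None),
    "scale": (50 * MB, None),
}


def plan_violations(plan: str, size_bytes: int, page_count: int) -> List[str]:
    """Return human-readable reasons the document would exceed the plan."""
    max_bytes, max_pages = PLAN_LIMITS[plan.lower()]
    problems: List[str] = []
    if size_bytes > max_bytes:
        problems.append(f"file is {size_bytes} bytes, limit is {max_bytes}")
    if max_pages is not None and page_count > max_pages:
        problems.append(f"document has {page_count} pages, limit is {max_pages}")
    return problems
```

An empty list means the document fits the plan's per-file limits; monthly quota and request rate still have to be tracked from the response headers.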
Error Handling
All errors return:

```json
{"detail": "Human-readable error message"}
```

| Status | Meaning |
|---|---|
| 400 | Bad request: invalid schema, unsupported file, both schema + blueprint_id |
| 401 | Invalid or missing API key |
| 404 | Job not found |
| 413 | File too large for your plan |
| 429 | Rate limit or monthly quota exceeded |
| 500 | Server error |

Quota exceeded (429):

```json
{
  "detail": {
    "error": "page_count_exceeded",
    "message": "Document has 100 pages, exceeds 50-page limit for FREE plan. Upgrade to PRO.",
    "page_count": 100,
    "max_pages": 50,
    "plan": "free"
  }
}
```

Common failure reasons in jobs:
- Document issues: corrupted, unreadable, poor scan quality, processing timeout
- Schema issues: invalid JSON Schema, required fields not found
- Plan limits: file too large, too many pages, quota exceeded
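The examples above show `detail` as either a plain string or a structured object, so client code should handle both shapes. A minimal sketch, with a hypothetical helper name `describe_error`:

```python
def describe_error(status: int, body: dict) -> str:
    """Turn an error response into a single log-friendly line."""
    detail = body.get("detail")
    if isinstance(detail, dict):
        # Structured form, as in the quota-exceeded example.
        code = detail.get("error", "unknown_error")
        message = detail.get("message", "")
        return f"{status} {code}: {message}"
    # Plain-string form, or a missing detail field.
    return f"{status}: {detail if detail is not None else 'no detail provided'}"
```

Keeping the machine-readable `error` code in the output makes it easier to branch on specific failures, such as prompting for a plan upgrade on `page_count_exceeded`.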
Code Examples
Python
```python
import json
import time

import requests

API_KEY = "sk_live_YOUR_KEY"
BASE = "https://api.deepread.tech"

# Submit document with structured extraction
schema = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string", "description": "Vendor or company name"},
        "total": {"type": "number", "description": "Total amount due"},
        "due_date": {"type": "string", "description": "Payment due date"}
    }
}
with open("invoice.pdf", "rb") as f:
    resp = requests.post(
        f"{BASE}/v1/process",
        headers={"X-API-Key": API_KEY},
        files={"file": f},
        data={"schema": json.dumps(schema)}
    )
job_id = resp.json()["id"]

# Poll with exponential backoff
delay = 5
while True:
    time.sleep(delay)
    result = requests.get(
        f"{BASE}/v1/jobs/{job_id}",
        headers={"X-API-Key": API_KEY}
    ).json()
    if result["status"] in ("completed", "failed"):
        break
    delay = min(delay * 1.5, 30)  # cap at 30s

# Use results
if result["status"] == "completed":
    text = result["result"]["text"]
    data = result["result"].get("data", {})
    for field, info in data.items():
        if info["hil_flag"]:
            print(f"REVIEW: {field} = {info['value']} ({info.get('reason')})")
        else:
            print(f"OK: {field} = {info['value']}")
```

JavaScript / Node.js
```javascript
import fs from "fs";

const API_KEY = "sk_live_YOUR_KEY";
const BASE = "https://api.deepread.tech";

// Submit document
// Node's built-in FormData expects a Blob for file parts, not a read stream.
const form = new FormData();
form.append("file", new Blob([fs.readFileSync("invoice.pdf")]), "invoice.pdf");
form.append("schema", JSON.stringify({
  type: "object",
  properties: {
    vendor: { type: "string", description: "Vendor or company name" },
    total: { type: "number", description: "Total amount due" }
  }
}));
const { id: jobId } = await fetch(`${BASE}/v1/process`, {
  method: "POST",
  headers: { "X-API-Key": API_KEY },
  body: form
}).then(r => r.json());

// Poll with backoff
let delay = 5000;
let result;
do {
  await new Promise(r => setTimeout(r, delay));
  result = await fetch(`${BASE}/v1/jobs/${jobId}`, {
    headers: { "X-API-Key": API_KEY }
  }).then(r => r.json());
  delay = Math.min(delay * 1.5, 30000);
} while (!["completed", "failed"].includes(result.status));
console.log(result);
```

cURL
```bash
# Submit with schema
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: YOUR_KEY" \
  -F "file=@invoice.pdf" \
  -F 'schema={"type":"object","properties":{"vendor":{"type":"string","description":"Vendor name"},"total":{"type":"number","description":"Total amount"}}}'

# Submit with blueprint
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: YOUR_KEY" \
  -F "file=@invoice.pdf" \
  -F "blueprint_id=660e8400-..."

# Get results
curl https://api.deepread.tech/v1/jobs/JOB_ID \
  -H "X-API-Key: YOUR_KEY"

# List blueprints
curl https://api.deepread.tech/v1/blueprints/ \
  -H "X-API-Key: YOUR_KEY"
```

Agent Device Flow (Python)
```python
import time
import webbrowser

import requests

BASE = "https://api.deepread.tech"

# Step 1: Request a device code
resp = requests.post(f"{BASE}/v1/agent/device/code", json={"agent_name": "my-agent"})
data = resp.json()
device_code = data["device_code"]
uri_complete = data["verification_uri_complete"]
interval = data["interval"]

# Step 2: Open browser with code pre-filled
success = webbrowser.open(uri_complete)
if success:
    print(f"Opened browser: {uri_complete}")
else:
    print(f"Unable to open browser programmatically; please open this URL manually: {uri_complete}")
print("Log in and click Approve. I'll wait here.")

# Step 3: Poll until approved
api_key = None
while True:
    time.sleep(interval)
    resp = requests.post(f"{BASE}/v1/agent/device/token", json={"device_code": device_code})
    result = resp.json()
    if result.get("api_key"):
        api_key = result["api_key"]
        print(f"Got API key: {result['key_prefix']}...")
        break
    elif result.get("error") == "authorization_pending":
        continue
    elif result.get("error") == "access_denied":
        print("User denied the request.")
        break
    elif result.get("error") == "expired_token":
        print("Code expired. Please start over.")
        break
    else:
        # Unexpected response: stop rather than poll forever
        print(f"Unexpected response: {result.get('error')}")
        break

if api_key is None:
    raise SystemExit("Device flow did not complete successfully: no API key obtained.")

# Step 4: Use the key to process documents
with open("invoice.pdf", "rb") as f:
    resp = requests.post(
        f"{BASE}/v1/process",
        headers={"X-API-Key": api_key},
        files={"file": f},
    )
print(resp.json())  # {"id": "...", "status": "queued"}
```

Agent Device Flow (JavaScript)
```javascript
// ES module: top-level await requires `import`, not `require`.
import fs from "fs";

const BASE = "https://api.deepread.tech";

// Step 1: Request a device code
const { device_code, verification_uri_complete, interval } = await fetch(
  `${BASE}/v1/agent/device/code`,
  { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ agent_name: "my-agent" }) }
).then(r => r.json());

// Step 2: Ask the user to open the URL (code is pre-filled)
console.log(`Please open: ${verification_uri_complete}`);
console.log("Log in and click Approve. I'll wait here.");

// Step 3: Poll until approved
let apiKey;
while (true) {
  await new Promise(r => setTimeout(r, interval * 1000));
  const result = await fetch(`${BASE}/v1/agent/device/token`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ device_code }),
  }).then(r => r.json());
  if (result.api_key) {
    apiKey = result.api_key;
    console.log(`Got API key: ${result.key_prefix}...`);
    break;
  } else if (result.error === "authorization_pending") {
    continue;
  } else {
    console.log(`Flow ended: ${result.error}`);
    break;
  }
}
if (!apiKey) {
  throw new Error("Device flow did not complete successfully: no API key obtained.");
}

// Step 4: Use the key
// Node's built-in FormData expects a Blob for file parts, not a read stream.
const form = new FormData();
form.append("file", new Blob([fs.readFileSync("invoice.pdf")]), "invoice.pdf");
const job = await fetch(`${BASE}/v1/process`, {
  method: "POST",
  headers: { "X-API-Key": apiKey },
  body: form,
}).then(r => r.json());
console.log(job); // {id: "...", status: "queued"}
```

Agent Device Flow (cURL)
```bash
# Step 1: Request a device code and save the full response
response=$(curl -s -X POST https://api.deepread.tech/v1/agent/device/code \
  -H "Content-Type: application/json" \
  -d '{"agent_name": "my-agent"}')
device_code=$(echo "$response" | jq -r '.device_code')
verification_uri_complete=$(echo "$response" | jq -r '.verification_uri_complete')
interval=$(echo "$response" | jq -r '.interval')

# Step 2: Open the browser (use the saved URL; code is pre-filled, user clicks Approve)
open "$verification_uri_complete"  # macOS; use xdg-open on Linux

# Step 3: Poll for the key (repeat every $interval seconds until api_key is returned)
curl -s -X POST https://api.deepread.tech/v1/agent/device/token \
  -H "Content-Type: application/json" \
  -d "{\"device_code\": \"$device_code\"}"
# → {"error": "authorization_pending"}  (keep polling)
# → {"api_key": "sk_live_...", "key_prefix": "sk_live_abc..."}  (done!)

# Step 4: Use the key
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: sk_live_..." \
  -F "file=@invoice.pdf"
```

Webhook Receiver (Python / Flask)
```python
from flask import Flask, request
import requests

app = Flask(__name__)
API_KEY = "sk_live_YOUR_KEY"


@app.route("/webhook", methods=["POST"])
def handle_webhook():
    payload = request.json
    job_id = payload["job_id"]
    # IMPORTANT: Always fetch canonical result from API (webhooks are not authenticated)
    result = requests.get(
        f"https://api.deepread.tech/v1/jobs/{job_id}",
        headers={"X-API-Key": API_KEY}
    ).json()
    # Process result...
    return "", 200  # Return 2xx to confirm delivery
```

Help the Developer
- No API key yet → use the device authorization flow (Agent Authentication section) — no copy/paste needed
- Send a document → POST /v1/process, show code in their language
- Structured data → help write a JSON Schema with descriptive field descriptions
- Better accuracy → explain blueprints, help set up optimizer
- Real-time updates → set up webhook_url, build receiver endpoint
- Hitting errors → check API key, plan limits, file format, schema validity
- Share results → use preview_url from response (no auth needed)
- Large documents → use text_url instead of text field for docs > 1MB
- Review workflow → filter fields by hil_flag, route flagged ones to human review
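For the review workflow, the `data` object from a completed job can be split into fields that are safe to use and fields that need a human look, based on `hil_flag`. A minimal sketch, assuming the per-field shape shown in the `GET /v1/jobs/{job_id}` example; the function name `split_by_review` is hypothetical.

```python
from typing import Any, Dict, Tuple


def split_by_review(data: Dict[str, dict]) -> Tuple[Dict[str, Any], Dict[str, dict]]:
    """Separate extracted fields into (approved values, fields needing review)."""
    approved: Dict[str, Any] = {}
    review: Dict[str, dict] = {}
    for name, info in data.items():
        if info.get("hil_flag"):
            # Keep the reason so a reviewer knows why the field was flagged.
            review[name] = {"value": info.get("value"), "reason": info.get("reason")}
        else:
            approved[name] = info.get("value")
    return approved, review
```

The `approved` mapping can flow straight into downstream systems, while `review` can be routed to a queue alongside the job's `preview_url`.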