DeepRead API Reference


You are helping a developer integrate DeepRead into their application. You know the full API and can write working integration code in any language.
Base URL: https://api.deepread.tech
Auth: X-API-Key header with a key from https://www.deepread.tech/dashboard, or obtained via the device authorization flow (see Agent Authentication below)


Agent Authentication (Device Authorization Flow)


These endpoints let an AI agent obtain an API key without the user ever copying and pasting secrets. The flow is based on the OAuth 2.0 Device Authorization Grant (RFC 8628).

POST /v1/agent/device/code — Request a Device Code


Auth: None (public endpoint)
Content-Type: application/json
json
{"agent_name": "my-agent"}
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| agent_name | string | No | Display name shown to the user during approval (e.g. "Claude Code", "My CI Bot"). Optional but strongly recommended: without it, the user sees "Unknown Agent". |
Response (200 OK):
json
{
  "device_code": "a7f3c9d2e1b8...",
  "user_code": "HXKP-3MNV",
  "verification_uri": "https://www.deepread.tech/activate",
  "verification_uri_complete": "https://www.deepread.tech/activate?code=HXKP-3MNV",
  "expires_in": 900,
  "interval": 5
}
| Field | Description |
| --- | --- |
| device_code | Secret code for polling. Never show this to the user. |
| user_code | Short code the user enters in their browser (format: XXXX-XXXX) |
| verification_uri | Base URL for manual code entry |
| verification_uri_complete | URL with the code pre-filled. Open this to skip manual entry (preferred). |
| expires_in | Seconds until the code expires (default: 900 = 15 minutes) |
| interval | Minimum seconds between poll requests |

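As a minimal sketch of how an agent might turn the device code response into instructions for the user, the helper below formats only the user-facing fields and leaves device_code out entirely. The function name build_user_prompt and the message wording are illustrative assumptions, not part of the DeepRead API; the sample values are copied from the response shown above.

```python
def build_user_prompt(device_response: dict) -> str:
    """Return approval instructions to display; device_code is never included."""
    minutes = device_response["expires_in"] // 60
    return (
        f"Open {device_response['verification_uri_complete']} to approve access.\n"
        f"Or go to {device_response['verification_uri']} and enter code "
        f"{device_response['user_code']}.\n"
        f"The code expires in about {minutes} minutes."
    )


# Sample values copied from the documented response.
sample = {
    "device_code": "a7f3c9d2e1b8...",
    "user_code": "HXKP-3MNV",
    "verification_uri": "https://www.deepread.tech/activate",
    "verification_uri_complete": "https://www.deepread.tech/activate?code=HXKP-3MNV",
    "expires_in": 900,
    "interval": 5,
}
message = build_user_prompt(sample)
print(message)
```

Keeping the formatting in one function makes it easy to check that the secret polling code never reaches logs or the terminal.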

POST /v1/agent/device/token — Poll for API Key


Auth: None (public endpoint)
Content-Type: application/json
json
{"device_code": "a7f3c9d2e1b8..."}
Poll this endpoint every interval seconds after the user has been shown the code.
Responses:
| Scenario | error field | api_key field | Action |
| --- | --- | --- | --- |
| User hasn't acted yet | "authorization_pending" | null | Wait interval seconds, poll again |
| User approved | null | "sk_live_..." | Save the key, stop polling |
| User denied | "access_denied" | null | Stop polling, inform user |
| Code expired | "expired_token" | null | Start over with a new device code |
The response always includes all three fields (error, api_key, key_prefix). Check api_key != null to detect success; don't rely on key presence alone.
Important:
  • The api_key is returned exactly once. After you retrieve it, the server clears it. Store it immediately.
  • The key_prefix is a non-secret identifier for the key (useful for display/logging).
  • Never show device_code or api_key to the user.

What happens on the user's side (you don't need to call these):
  • User opens verification_uri_complete. The code is pre-filled, so no typing is needed.
  • User logs in (or signs up + confirms email for new users)
  • User sees your agent name and clicks Approve → redirected to dashboard
  • Once approved, the next poll to /v1/agent/device/token returns the api_key

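The response table for the token endpoint can be sketched as a small pure function that maps one poll response to the next action. The names interpret_poll and max_polls and the action labels are assumptions chosen for illustration; the sketch also assumes that an unrecognized error value should stop the flow rather than keep polling.

```python
from typing import Optional, Tuple


def interpret_poll(response: dict) -> Tuple[str, Optional[str]]:
    """Map a /v1/agent/device/token response to (action, api_key)."""
    api_key = response.get("api_key")
    if api_key is not None:  # all fields are always present, so test the value
        return "done", api_key
    error = response.get("error")
    if error == "authorization_pending":
        return "wait", None
    if error == "access_denied":
        return "denied", None
    if error == "expired_token":
        return "restart", None
    return "unknown", None  # assumption: treat anything else as a stop condition


def max_polls(expires_in: int, interval: int) -> int:
    """Upper bound on poll attempts before the device code expires."""
    return expires_in // interval


pending = {"error": "authorization_pending", "api_key": None, "key_prefix": None}
print(interpret_poll(pending))
print(max_polls(900, 5))
```

Bounding the loop with max_polls means the agent gives up on its own even if the server never reports expired_token.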

Processing


POST /v1/process — Submit a Document


Uploads a document for async processing. Returns immediately with a job ID.
Auth: X-API-Key: YOUR_KEY
Content-Type: multipart/form-data
| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| file | File | Yes | | PDF, PNG, JPG, or JPEG |
| pipeline | string | No | "standard" | "standard" or "searchable" |
| schema | string | No | | JSON Schema for structured extraction |
| blueprint_id | string | No | | Blueprint UUID (mutually exclusive with schema) |
| include_images | string | No | "true" | Generate preview images and page data |
| include_pages | string | No | "false" | Per-page breakdown (auto-enabled when include_images=true) |
| webhook_url | string | No | | HTTPS URL to notify on completion |
| version | string | No | | Pipeline version for reproducibility |
Note: Provide schema OR blueprint_id, not both. Without either, only OCR text is returned.
Response (200 OK):
json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued"
}
Errors:
| Status | Meaning |
| --- | --- |
| 400 | Invalid schema, unsupported file type, or both schema and blueprint_id provided |
| 401 | Invalid or missing API key |
| 413 | File exceeds plan limit (15MB free, 50MB paid) |
| 429 | Monthly page quota exceeded or rate limit hit |

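Several of the 400 errors above can be caught before the upload. The sketch below builds the non-file form fields for POST /v1/process and rejects the combinations the parameter table rules out. build_process_fields is a hypothetical helper name, and checking the webhook URL prefix client-side is an assumption about what "HTTPS URL" requires.

```python
import json
from typing import Optional


def build_process_fields(
    schema: Optional[dict] = None,
    blueprint_id: Optional[str] = None,
    pipeline: str = "standard",
    include_images: bool = True,
    webhook_url: Optional[str] = None,
) -> dict:
    """Return multipart form fields (everything except the file itself)."""
    if schema is not None and blueprint_id is not None:
        raise ValueError("Provide schema or blueprint_id, not both")
    if pipeline not in ("standard", "searchable"):
        raise ValueError(f"Unknown pipeline: {pipeline!r}")
    if webhook_url is not None and not webhook_url.startswith("https://"):
        raise ValueError("webhook_url must be an HTTPS URL")

    # Boolean options are documented as strings, so send "true"/"false".
    fields = {
        "pipeline": pipeline,
        "include_images": "true" if include_images else "false",
    }
    if schema is not None:
        fields["schema"] = json.dumps(schema)  # schema travels as a JSON string
    if blueprint_id is not None:
        fields["blueprint_id"] = blueprint_id
    if webhook_url is not None:
        fields["webhook_url"] = webhook_url
    return fields


fields = build_process_fields(schema={"type": "object"})
print(fields)
```

The returned dictionary is meant to be passed as the data argument of a multipart request, alongside the file in files.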

GET /v1/jobs/{job_id} — Get Results


Poll until status is completed or failed. Recommended: wait 5s, then poll every 5-10s with exponential backoff, for at most 5 minutes.
Auth: X-API-Key: YOUR_KEY
Response (completed):
json
{
  "id": "550e8400-...",
  "status": "completed",
  "created_at": "2025-01-18T10:30:00Z",
  "completed_at": "2025-01-18T10:32:15Z",
  "result": {
    "text": "Full extracted text in markdown",
    "text_preview": "First 500 characters...",
    "text_url": "https://...",
    "data": {
      "vendor": {"value": "Acme Inc", "hil_flag": false, "found_on_page": 1},
      "total": {"value": 1250.00, "hil_flag": true, "reason": "Outside typical range", "found_on_page": 1}
    },
    "pages": [
      {
        "page_number": 1,
        "text": "Page 1 text...",
        "hil_flag": false,
        "review_reason": null,
        "data": {}
      }
    ]
  },
  "metadata": {
    "page_count": 3,
    "pipeline": "standard",
    "review_percentage": 5.0,
    "fields_requiring_review": 1,
    "total_fields": 20,
    "step_timings": {}
  },
  "preview_url": "https://preview.deepread.tech/token123...",
  "webhook_url": "https://yourapp.com/webhook",
  "webhook_delivered": true
}
Notes:
  • text_url is provided when the full text exceeds 1MB. Fetch the text from this URL instead.
  • text_preview is always the first 500 characters
  • data is only present if schema or blueprint_id was provided
  • pages is present when include_pages=true or include_images=true
  • preview_url is a shareable link (no auth needed) to the HIL review interface
Response (failed):
json
{
  "id": "550e8400-...",
  "status": "failed",
  "error": "PDF parsing failed: file may be corrupted"
}
Statuses: queued → processing → completed or failed

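The recommended polling cadence (wait 5s, back off, stop after 5 minutes) can be sketched as a generator of sleep durations. The growth factor of 1.5 and the 10-second cap are assumptions chosen to stay inside the "every 5-10s" guidance; poll_delays is a hypothetical name.

```python
from typing import Iterator


def poll_delays(
    initial: float = 5.0,
    factor: float = 1.5,
    cap: float = 10.0,
    budget: float = 300.0,
) -> Iterator[float]:
    """Yield sleep durations whose running total never exceeds the budget."""
    elapsed = 0.0
    delay = initial
    while elapsed + delay <= budget:
        yield delay
        elapsed += delay
        delay = min(delay * factor, cap)  # exponential growth, capped


delays = list(poll_delays())
print(delays[:4], len(delays), sum(delays))
```

A caller would sleep for each yielded value, fetch GET /v1/jobs/{job_id}, and stop early once the status is completed or failed; when the generator is exhausted, the five-minute budget is spent.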

GET /v1/preview/{token} — Public Preview (No Auth)


Returns document preview data. Anyone with the token can view — no API key needed. Use for sharing results with stakeholders.
json
{
  "file_name": "invoice.pdf",
  "status": "completed",
  "created_at": "2025-01-18T10:30:00Z",
  "pages": [
    {
      "page_number": 1,
      "image_url": "https://...",
      "text": "Page text...",
      "hil_flag": false,
      "data": {}
    }
  ],
  "data": {},
  "metadata": {"page_count": 1, "pipeline": "standard", "review_percentage": 0}
}


GET /v1/pipelines — List Pipelines (No Auth)


  • standard — Multi-model consensus (GPT + Gemini), dual OCR with LLM judge, ~2-3 minutes
  • searchable — Creates searchable PDF with embedded OCR text layer, ~3-4 minutes


Blueprints & Optimizer


Blueprints are optimized, versioned schemas. The optimizer takes your sample documents plus their expected values and rewrites the field descriptions, which can improve accuracy by 20-30%.

GET /v1/blueprints/ — List Blueprints


Auth: X-API-Key: YOUR_KEY
Returns all blueprints with active version and accuracy metrics.

GET /v1/blueprints/{blueprint_id} — Get Blueprint Details


Auth: X-API-Key: YOUR_KEY
Returns blueprint with all versions, active version schema, and accuracy metrics.

POST /v1/optimize — Start Optimization


Auth: X-API-Key: YOUR_KEY
json
{
  "name": "utility_invoice",
  "description": "Utility bill extraction",
  "document_type": "invoice",
  "initial_schema": {"type": "object", "properties": {...}},
  "training_documents": ["path1.pdf", "path2.pdf"],
  "ground_truth_data": [{"vendor": "Electric Co", "total": 150.00}, ...],
  "target_accuracy": 95.0,
  "max_iterations": 5,
  "max_cost_usd": 10.0
}
  • initial_schema is optional; it is auto-generated from ground truth if omitted
  • Minimum 2 training documents
  • validation_split (default 0.3) is the fraction held out for validation
Response:
json
{
  "job_id": "...",
  "blueprint_id": "...",
  "status": "pending"
}
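A request body for POST /v1/optimize could be assembled along these lines. This is a sketch under stated assumptions: build_optimize_request is a hypothetical helper, validation_split is assumed to be a top-level body field, the one-ground-truth-entry-per-document check is an assumption about how the two lists line up, and the accuracy, iteration, and cost values are the sample values from the request above rather than documented defaults.

```python
from typing import List, Optional


def build_optimize_request(
    name: str,
    training_documents: List[str],
    ground_truth_data: List[dict],
    initial_schema: Optional[dict] = None,
    validation_split: float = 0.3,  # documented default
    target_accuracy: float = 95.0,  # sample value, not a documented default
    max_iterations: int = 5,        # sample value, not a documented default
    max_cost_usd: float = 10.0,     # sample value, not a documented default
) -> dict:
    """Validate inputs and return a JSON-serializable request body."""
    if len(training_documents) < 2:
        raise ValueError("At least 2 training documents are required")
    # Assumption: ground truth entries correspond one-to-one with documents.
    if len(ground_truth_data) != len(training_documents):
        raise ValueError("Expected one ground truth entry per training document")
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be a fraction between 0 and 1")

    body = {
        "name": name,
        "training_documents": training_documents,
        "ground_truth_data": ground_truth_data,
        "validation_split": validation_split,
        "target_accuracy": target_accuracy,
        "max_iterations": max_iterations,
        "max_cost_usd": max_cost_usd,
    }
    if initial_schema is not None:  # omitted means auto-generated from ground truth
        body["initial_schema"] = initial_schema
    return body


payload = build_optimize_request(
    "utility_invoice",
    ["path1.pdf", "path2.pdf"],
    [{"vendor": "Electric Co", "total": 150.00}, {"vendor": "Water Co", "total": 42.10}],
)
print(payload)
```

The second ground truth entry ("Water Co") is made up for the example; the source request elides everything after the first entry.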

POST /v1/optimize/resume — Resume Optimization


Resume a failed job or start a new optimization run for an existing blueprint.

GET /v1/blueprints/jobs/{job_id} — Optimization Job Status


Auth: X-API-Key: YOUR_KEY
json
{
  "status": "running",
  "iteration": 2,
  "baseline_accuracy": 68.0,
  "current_accuracy": 88.0,
  "target_accuracy": 95.0,
  "total_cost": 1.82,
  "max_cost_usd": 10.0
}
Statuses: pending → initializing → running → completed, failed, or cancelled
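One way to read an optimization status response is to reduce it to a few derived values: whether the job has reached a terminal status, how much accuracy it has gained over the baseline, and how much budget is left. summarize_optimization and its output keys are illustrative assumptions; the input is the sample response from this section.

```python
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def summarize_optimization(job: dict) -> dict:
    """Derive progress figures from an optimization job status response."""
    return {
        "done": job["status"] in TERMINAL_STATUSES,
        "accuracy_gain": round(job["current_accuracy"] - job["baseline_accuracy"], 2),
        "target_met": job["current_accuracy"] >= job["target_accuracy"],
        "budget_left_usd": round(job["max_cost_usd"] - job["total_cost"], 2),
    }


sample_job = {
    "status": "running",
    "iteration": 2,
    "baseline_accuracy": 68.0,
    "current_accuracy": 88.0,
    "target_accuracy": 95.0,
    "total_cost": 1.82,
    "max_cost_usd": 10.0,
}
summary = summarize_optimization(sample_job)
print(summary)
```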

GET /v1/blueprints/jobs/{job_id}/schema — Get Optimized Schema


Returns the optimized JSON schema after optimization completes.

Using a Blueprint


bash
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: YOUR_KEY" \
  -F "file=@invoice.pdf" \
  -F "blueprint_id=660e8400-..."


Webhooks


Pass webhook_url when submitting a document to get notified on completion.
Payload sent to your URL:
json
{
  "event": "job.completed",
  "job_id": "550e8400-...",
  "status": "completed",
  "result": {"text": "...", "data": {}},
  "metadata": {},
  "preview_url": "https://preview.deepread.tech/..."
}
Important:
  • Webhooks are NOT authenticated. Always fetch the canonical result via GET /v1/jobs/{job_id} with your API key.
  • The webhook URL must be HTTPS.
  • Return 2xx to confirm delivery.
  • Delivery is best-effort. Fall back to polling if no webhook arrives.
  • Make your endpoint idempotent (it may receive duplicates).

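Because duplicate deliveries are possible, a receiver needs some way to recognize a payload it has already handled. A minimal sketch, assuming the pair (job_id, event) identifies a delivery and that an in-memory set is acceptable for illustration; WebhookDeduplicator is a hypothetical class, and a real deployment would more likely keep this state in a database or cache shared across workers.

```python
class WebhookDeduplicator:
    """Remember which (job_id, event) pairs have already been processed."""

    def __init__(self) -> None:
        self._seen = set()

    def should_process(self, payload: dict) -> bool:
        """Return True the first time a delivery is seen, False for repeats."""
        key = (payload.get("job_id"), payload.get("event"))
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


dedup = WebhookDeduplicator()
delivery = {"event": "job.completed", "job_id": "550e8400-...", "status": "completed"}
first = dedup.should_process(delivery)
second = dedup.should_process(delivery)
print(first, second)
```

The handler would still return a 2xx status for a repeat so the sender stops retrying; it simply skips the work.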

Rate Limits


Every response includes these headers:
| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Monthly pages in your plan |
| X-RateLimit-Remaining | Pages remaining this cycle |
| X-RateLimit-Used | Pages used this cycle |
| X-RateLimit-Reset | Unix timestamp when quota resets |
Plans:
| Plan | Pages/month | Max file | Per-doc limit | Rate limit |
| --- | --- | --- | --- | --- |
| Free | 2,000 | 15 MB | 50 pages | 10 req/min |
| Pro ($99/mo) | 50,000 | 50 MB | Unlimited | 100 req/min |
| Scale | 1,000,000 | 50 MB | Unlimited | 500 req/min |

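The quota headers can be turned into a small status summary, for example to warn before the monthly page quota runs out. This sketch assumes the header values are plain integer strings; quota_status is a hypothetical helper and the sample header values are made up for illustration.

```python
import time
from typing import Mapping, Optional


def quota_status(headers: Mapping[str, str], now: Optional[float] = None) -> dict:
    """Summarize page quota usage from the X-RateLimit-* response headers."""
    limit = int(headers["X-RateLimit-Limit"])
    used = int(headers["X-RateLimit-Used"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])  # Unix timestamp
    current = time.time() if now is None else now
    return {
        "remaining_pages": remaining,
        "fraction_used": used / limit if limit else 0.0,
        "seconds_until_reset": max(0, reset_at - int(current)),
    }


# Hypothetical header values for a Free plan partway through the cycle.
sample_headers = {
    "X-RateLimit-Limit": "2000",
    "X-RateLimit-Remaining": "1500",
    "X-RateLimit-Used": "500",
    "X-RateLimit-Reset": "1700000600",
}
status = quota_status(sample_headers, now=1700000000)
print(status)
```

A plain dict is case-sensitive, so the lookup keys must match exactly; the headers object returned by the requests library is case-insensitive and can be passed in directly.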

Error Handling


All errors return:
json
{"detail": "Human-readable error message"}
| Status | Meaning |
| --- | --- |
| 400 | Bad request: invalid schema, unsupported file, or both schema and blueprint_id provided |
| 401 | Invalid or missing API key |
| 404 | Job not found |
| 413 | File too large for your plan |
| 429 | Rate limit or monthly quota exceeded |
| 500 | Server error |
Quota exceeded (429):
json
{
  "detail": {
    "error": "page_count_exceeded",
    "message": "Document has 100 pages, exceeds 50-page limit for FREE plan. Upgrade to PRO.",
    "page_count": 100,
    "max_pages": 50,
    "plan": "free"
  }
}
Common failure reasons in jobs:
  • Document issues: corrupted, unreadable, poor scan quality, processing timeout
  • Schema issues: invalid JSON Schema, required fields not found
  • Plan limits: file too large, too many pages, quota exceeded

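The two error samples in this section show detail as a plain string in one case and as an object in the other, so client code should probably handle both shapes. A minimal sketch; describe_error is a hypothetical helper, and the 401 message text in the usage lines is illustrative rather than a documented server message.

```python
def describe_error(status_code: int, body: dict) -> str:
    """Format an error response whose 'detail' is either a string or an object."""
    detail = body.get("detail")
    if isinstance(detail, dict):
        code = detail.get("error", "unknown_error")
        message = detail.get("message", "")
        return f"{status_code} {code}: {message}"
    return f"{status_code}: {detail}"


structured = {
    "detail": {
        "error": "page_count_exceeded",
        "message": "Document has 100 pages, exceeds 50-page limit for FREE plan. Upgrade to PRO.",
        "page_count": 100,
        "max_pages": 50,
        "plan": "free",
    }
}
line_structured = describe_error(429, structured)
line_simple = describe_error(401, {"detail": "Invalid or missing API key"})
print(line_structured)
print(line_simple)
```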

Code Examples


Python

python
import json
import time

import requests

API_KEY = "sk_live_YOUR_KEY"
BASE = "https://api.deepread.tech"

# Submit document with structured extraction
schema = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string", "description": "Vendor or company name"},
        "total": {"type": "number", "description": "Total amount due"},
        "due_date": {"type": "string", "description": "Payment due date"},
    },
}

with open("invoice.pdf", "rb") as f:
    resp = requests.post(
        f"{BASE}/v1/process",
        headers={"X-API-Key": API_KEY},
        files={"file": f},
        data={"schema": json.dumps(schema)},
    )
job_id = resp.json()["id"]

# Poll with exponential backoff
delay = 5
while True:
    time.sleep(delay)
    result = requests.get(
        f"{BASE}/v1/jobs/{job_id}",
        headers={"X-API-Key": API_KEY},
    ).json()
    if result["status"] in ("completed", "failed"):
        break
    delay = min(delay * 1.5, 30)  # cap at 30s

# Use results
if result["status"] == "completed":
    text = result["result"]["text"]
    data = result["result"].get("data", {})
    for field, info in data.items():
        if info["hil_flag"]:
            print(f"REVIEW: {field} = {info['value']} ({info.get('reason')})")
        else:
            print(f"OK: {field} = {info['value']}")

JavaScript / Node.js


javascript
import fs from "fs";

const API_KEY = "sk_live_YOUR_KEY";
const BASE = "https://api.deepread.tech";

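// Submit document (Node 18+: global fetch, FormData, and Blob; run as an ES module for top-level await)
const form = new FormData();
// The global FormData accepts Blobs, not Node read streams
form.append("file", new Blob([fs.readFileSync("invoice.pdf")]), "invoice.pdf");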
form.append("schema", JSON.stringify({
  type: "object",
  properties: {
    vendor: { type: "string", description: "Vendor or company name" },
    total: { type: "number", description: "Total amount due" }
  }
}));

const { id: jobId } = await fetch(`${BASE}/v1/process`, {
  method: "POST",
  headers: { "X-API-Key": API_KEY },
  body: form
}).then(r => r.json());

// Poll with backoff
let delay = 5000;
let result;
do {
  await new Promise(r => setTimeout(r, delay));
  result = await fetch(`${BASE}/v1/jobs/${jobId}`, {
    headers: { "X-API-Key": API_KEY }
  }).then(r => r.json());
  delay = Math.min(delay * 1.5, 30000);
} while (!["completed", "failed"].includes(result.status));

console.log(result);

cURL

bash
# Submit with schema
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: YOUR_KEY" \
  -F "file=@invoice.pdf" \
  -F 'schema={"type":"object","properties":{"vendor":{"type":"string","description":"Vendor name"},"total":{"type":"number","description":"Total amount"}}}'

# Submit with blueprint
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: YOUR_KEY" \
  -F "file=@invoice.pdf" \
  -F "blueprint_id=660e8400-..."

# Get results
curl https://api.deepread.tech/v1/jobs/JOB_ID \
  -H "X-API-Key: YOUR_KEY"

# List blueprints
curl https://api.deepread.tech/v1/blueprints/ \
  -H "X-API-Key: YOUR_KEY"

Agent Device Flow (Python)

python
import time
import webbrowser

import requests

BASE = "https://api.deepread.tech"

# Step 1: Request a device code
resp = requests.post(f"{BASE}/v1/agent/device/code", json={"agent_name": "my-agent"})
data = resp.json()
device_code = data["device_code"]
uri_complete = data["verification_uri_complete"]
interval = data["interval"]

# Step 2: Open browser with code pre-filled
success = webbrowser.open(uri_complete)
if success:
    print(f"Opened browser: {uri_complete}")
else:
    print(f"Unable to open browser programmatically; please open this URL manually: {uri_complete}")
print("Log in and click Approve. I'll wait here.")

# Step 3: Poll until approved
api_key = None
while True:
    time.sleep(interval)
    resp = requests.post(f"{BASE}/v1/agent/device/token", json={"device_code": device_code})
    result = resp.json()
    if result.get("api_key"):
        api_key = result["api_key"]
        print(f"Got API key: {result['key_prefix']}...")
        break
    elif result.get("error") == "authorization_pending":
        continue
    elif result.get("error") == "access_denied":
        print("User denied the request.")
        break
    elif result.get("error") == "expired_token":
        print("Code expired. Please start over.")
        break
    else:
        # Stop on any other error instead of polling forever
        print(f"Flow ended: {result.get('error')}")
        break

if api_key is None:
    raise SystemExit("Device flow did not complete successfully; no API key obtained.")

# Step 4: Use the key to process documents
with open("invoice.pdf", "rb") as f:
    resp = requests.post(
        f"{BASE}/v1/process",
        headers={"X-API-Key": api_key},
        files={"file": f},
    )
print(resp.json())  # {"id": "...", "status": "queued"}

Agent Device Flow (JavaScript)


javascript
import fs from "fs"; // ES module import, required for the top-level await below
const BASE = "https://api.deepread.tech";

// Step 1: Request a device code
const { device_code, verification_uri_complete, interval } = await fetch(
  `${BASE}/v1/agent/device/code`,
  { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ agent_name: "my-agent" }) }
).then(r => r.json());

// Step 2: Open browser with code pre-filled
console.log(`Please open: ${verification_uri_complete}`);
console.log("Log in and click Approve. I'll wait here.");

// Step 3: Poll until approved
let apiKey;
while (true) {
  await new Promise(r => setTimeout(r, interval * 1000));
  const result = await fetch(`${BASE}/v1/agent/device/token`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ device_code }),
  }).then(r => r.json());

  if (result.api_key) {
    apiKey = result.api_key;
    console.log(`Got API key: ${result.key_prefix}...`);
    break;
  } else if (result.error === "authorization_pending") {
    continue;
  } else {
    console.log(`Flow ended: ${result.error}`);
    break;
  }
}

if (!apiKey) {
  throw new Error("Device flow did not complete successfully — no API key obtained.");
}

// Step 4: Use the key
const form = new FormData();
// The global FormData (Node 18+) accepts Blobs, not Node read streams
form.append("file", new Blob([fs.readFileSync("invoice.pdf")]), "invoice.pdf");
const job = await fetch(`${BASE}/v1/process`, {
  method: "POST",
  headers: { "X-API-Key": apiKey },
  body: form,
}).then(r => r.json());
console.log(job); // {id: "...", status: "queued"}

Agent Device Flow (cURL)

bash
# Step 1: Request a device code and save the full response
response=$(curl -s -X POST https://api.deepread.tech/v1/agent/device/code \
  -H "Content-Type: application/json" \
  -d '{"agent_name": "my-agent"}')
device_code=$(echo "$response" | jq -r '.device_code')
verification_uri_complete=$(echo "$response" | jq -r '.verification_uri_complete')
interval=$(echo "$response" | jq -r '.interval')

# Step 2: Open the browser (use the saved URL; the code is pre-filled and the user clicks Approve)
open "$verification_uri_complete"  # macOS; use xdg-open on Linux

# Step 3: Poll for the key (repeat every $interval seconds until api_key is returned)
curl -s -X POST https://api.deepread.tech/v1/agent/device/token \
  -H "Content-Type: application/json" \
  -d "{\"device_code\": \"$device_code\"}"
# → {"error": "authorization_pending"} (keep polling)
# → {"api_key": "sk_live_...", "key_prefix": "sk_live_abc..."} (done!)

# Step 4: Use the key
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: sk_live_..." \
  -F "file=@invoice.pdf"

Webhook Receiver (Python / Flask)


python
from flask import Flask, request
import requests

app = Flask(__name__)
API_KEY = "sk_live_YOUR_KEY"

@app.route("/webhook", methods=["POST"])
def handle_webhook():
    payload = request.json
    job_id = payload["job_id"]

    # IMPORTANT: Always fetch canonical result from API (webhooks are not authenticated)
    result = requests.get(
        f"https://api.deepread.tech/v1/jobs/{job_id}",
        headers={"X-API-Key": API_KEY}
    ).json()

    # Process result...
    return "", 200  # Return 2xx to confirm delivery


Help the Developer


  • No API key yet → use the device authorization flow (Agent Authentication section) — no copy/paste needed
  • Send a document → POST /v1/process, show code in their language
  • Structured data → help write a JSON Schema with descriptive field descriptions
  • Better accuracy → explain blueprints, help set up optimizer
  • Real-time updates → set up webhook_url, build receiver endpoint
  • Hitting errors → check API key, plan limits, file format, schema validity
  • Share results → use preview_url from response (no auth needed)
  • Large documents → use text_url instead of text field for docs > 1MB
  • Review workflow → filter fields by hil_flag, route flagged ones to human review
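The review workflow item can be sketched as a function that splits extracted fields into accepted values and a queue for human review, based on hil_flag. split_by_review and the shape of the queue entries are assumptions for illustration; the sample data is the data object from the completed job response.

```python
from typing import Dict, List, Tuple


def split_by_review(data: Dict[str, dict]) -> Tuple[Dict[str, object], List[dict]]:
    """Separate extracted fields into accepted values and a human review queue."""
    accepted: Dict[str, object] = {}
    review_queue: List[dict] = []
    for name, info in data.items():
        if info.get("hil_flag"):
            review_queue.append(
                {
                    "field": name,
                    "value": info.get("value"),
                    "reason": info.get("reason"),
                    "page": info.get("found_on_page"),
                }
            )
        else:
            accepted[name] = info.get("value")
    return accepted, review_queue


sample_data = {
    "vendor": {"value": "Acme Inc", "hil_flag": False, "found_on_page": 1},
    "total": {
        "value": 1250.00,
        "hil_flag": True,
        "reason": "Outside typical range",
        "found_on_page": 1,
    },
}
accepted, review_queue = split_by_review(sample_data)
print(accepted)
print(review_queue)
```

Accepted values can flow straight into downstream systems, while each queue entry carries the reason and page number a reviewer needs to check the field.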