mistral-ocr

Mistral OCR

Extract text from images and PDFs using Mistral's dedicated OCR API. No external dependencies required.

Requirements

This skill requires a Mistral API key. If you don't have one, follow the guide in reference/getting-started.md.

API Key

The user must provide their Mistral API key. Ask for it if not available.
Option 1 (Recommended for AI agents): User provides key directly in message:
"Use this Mistral key: aBc123XyZ..."
"Convert this PDF to markdown, my API key is aBc123XyZ..."
Option 2: Environment variable $MISTRAL_API_KEY
Option 3: Claude Code settings (~/.claude/settings.json)
If no key is available, guide the user to get one at console.mistral.ai.

API Endpoint

Use the dedicated OCR endpoint for all document processing:
POST https://api.mistral.ai/v1/ocr
Model: mistral-ocr-latest

Features

1. PDF → Markdown (Direct, no conversion needed!)

bash
curl -s "https://api.mistral.ai/v1/ocr" \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-ocr-latest",
    "document": {
      "type": "document_url",
      "document_url": "https://example.com/document.pdf"
    }
  }'

2. Image → Text

Works with JPG, PNG, WEBP, GIF:
bash
curl -s "https://api.mistral.ai/v1/ocr" \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-ocr-latest",
    "document": {
      "type": "image_url",
      "image_url": "https://example.com/image.jpg"
    }
  }'
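
The two requests above differ only in the `type` field and the matching URL key. As a minimal sketch, a helper could pick between them from the file extension in the URL. The function name `build_ocr_request` is hypothetical and not part of the skill, and the sketch assumes the URL contains no characters that need JSON escaping and that anything without a known image extension should be sent as a document.

```shell
# Hypothetical helper (not part of the skill): build the OCR request body for a remote URL.
# Known image extensions use "image_url"; everything else is treated as a document.
build_ocr_request() {
  url="$1"
  # Lowercase the URL and drop any query string or fragment before checking the extension.
  path=$(printf '%s' "$url" | tr '[:upper:]' '[:lower:]')
  path="${path%%[?#]*}"
  case "$path" in
    *.jpg|*.jpeg|*.png|*.webp|*.gif) kind="image_url" ;;
    *) kind="document_url" ;;
  esac
  # The "type" value and the URL key share the same name in both request shapes.
  printf '{"model":"mistral-ocr-latest","document":{"type":"%s","%s":"%s"}}\n' "$kind" "$kind" "$url"
}

# Example use (commented out because it calls the network):
# build_ocr_request "https://example.com/image.jpg" > request.json
# curl -s "https://api.mistral.ai/v1/ocr" \
#   -H "Authorization: Bearer $MISTRAL_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d @request.json
```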

3. Local Files (Base64 Data URL)

For local PDFs or images, encode the file as base64 and pass it as a data URL.
ALWAYS use curl (works on all platforms including Windows via Git Bash):
bash
# For local PDF (-w0 is the GNU coreutils flag that disables line wrapping)
BASE64=$(base64 -w0 document.pdf)
curl -s "https://api.mistral.ai/v1/ocr" \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-ocr-latest",
    "document": {
      "type": "document_url",
      "document_url": "data:application/pdf;base64,'"$BASE64"'"
    }
  }'

# For local images (PNG, JPG, etc.)
BASE64=$(base64 -w0 image.png)
curl -s "https://api.mistral.ai/v1/ocr" \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-ocr-latest",
    "document": {
      "type": "image_url",
      "image_url": "data:image/png;base64,'"$BASE64"'"
    }
  }'

**MIME types:**
- PDF: `data:application/pdf;base64,...`
- PNG: `data:image/png;base64,...`
- JPG: `data:image/jpeg;base64,...`
- WEBP: `data:image/webp;base64,...`
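
The MIME type in the data URL has to match the file being encoded. A small sketch of that mapping follows, covering only the four types listed above. The function name `data_url_prefix` is hypothetical, and matching on the file extension (rather than inspecting the file contents) is an assumption made for brevity.

```shell
# Hypothetical helper (not part of the skill): map a local file name to its data URL prefix.
data_url_prefix() {
  # Compare extensions case-insensitively so "scan.JPG" and "scan.jpg" behave the same.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *.pdf)        mime="application/pdf" ;;
    *.png)        mime="image/png" ;;
    *.jpg|*.jpeg) mime="image/jpeg" ;;
    *.webp)       mime="image/webp" ;;
    *) echo "unsupported file type: $1" >&2; return 1 ;;
  esac
  printf 'data:%s;base64,\n' "$mime"
}
```

For example, `data_url_prefix scan.JPG` prints `data:image/jpeg;base64,`, which can be placed in front of the base64 payload.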

4. Structured JSON Output

For invoices, forms, and tables, request JSON in a follow-up step or use Document AI annotations.

Response Format

The API returns JSON with the extracted markdown for each page:
json
{
  "pages": [
    {
      "index": 0,
      "markdown": "# Document Title\n\nExtracted content here...",
      "images": [],
      "tables": [],
      "dimensions": {"dpi": 200, "height": 842, "width": 595}
    }
  ],
  "model": "mistral-ocr-latest",
  "usage_info": {"pages_processed": 1, "doc_size_bytes": 12345}
}

Workflow

When the user requests OCR of an image or PDF:

1. Get API key - ask the user if it is not in the environment
2. Determine input type (URL or local file)
3. For local files, ALWAYS use the temp file approach (avoids the "Argument list too long" error):
bash
# Cross-platform temp directory
TMPDIR="${TMPDIR:-${TEMP:-/tmp}}"

# Step 1: Encode file to base64
base64 -w0 "document.pdf" > "$TMPDIR/b64.txt"

# Step 2: Create JSON request file
echo '{"model":"mistral-ocr-latest","document":{"type":"document_url","document_url":"data:application/pdf;base64,'"$(cat "$TMPDIR/b64.txt")"'"}}' > "$TMPDIR/request.json"

# Step 3: Call API with -d @file (use actual key, not variable)
curl -s "https://api.mistral.ai/v1/ocr" \
  -H "Authorization: Bearer YOUR_API_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d @"$TMPDIR/request.json" > "$TMPDIR/response.json"

# Step 4: Extract markdown with node (NOT jq - not available on all systems)
node -e "const fs=require('fs'); const r=JSON.parse(fs.readFileSync('$TMPDIR/response.json')); console.log(r.pages.map(p=>p.markdown).join('\n\n---\n\n'))"

4. **Save to .md file** using Write tool
5. Confirm file location to user
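
The encode and request-file steps of this workflow can be combined into one function that streams the base64 payload straight into the request file, so it never appears in an argument list. This is only a sketch: `write_pdf_request` is a hypothetical name, and it swaps `base64 -w0` for reading from stdin and stripping newlines with `tr`, on the assumption that this form is accepted by more `base64` implementations.

```shell
# Hypothetical helper (not part of the skill): write an OCR request body for a local PDF
# to a file without passing the base64 payload as a command-line argument.
write_pdf_request() {
  src="$1"   # path to the local PDF
  out="$2"   # path of the JSON request file to create
  {
    # Opening part of the JSON body, up to the start of the base64 payload.
    printf '{"model":"mistral-ocr-latest","document":{"type":"document_url","document_url":"data:application/pdf;base64,'
    # Encode from stdin and remove line wraps so the payload is a single line.
    base64 < "$src" | tr -d '\n'
    # Close the string and the two JSON objects.
    printf '"}}\n'
  } > "$out"
}

# Example use (commented out because the curl step calls the network):
# TMPDIR="${TMPDIR:-${TEMP:-/tmp}}"
# write_pdf_request "document.pdf" "$TMPDIR/request.json"
# curl -s "https://api.mistral.ai/v1/ocr" \
#   -H "Authorization: Bearer YOUR_API_KEY_HERE" \
#   -H "Content-Type: application/json" \
#   -d @"$TMPDIR/request.json" > "$TMPDIR/response.json"
```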

IMPORTANT: Cross-Platform Compatibility

重要提示:跨平台兼容性

  • ALWAYS use curl (works on Windows via Git Bash)
  • ALWAYS use -d @file for the request body (handles large files)
  • NEVER use jq - use node instead to parse JSON
  • Use ${TMPDIR:-${TEMP:-/tmp}} for temp files (works on all systems)
  • Copy response.json to user directory before parsing with node on Windows

Usage Examples

When the user says:

| User Request | Action |
| --- | --- |
| "Convert this PDF to markdown" | OCR the PDF, save as .md file |
| "Extract text from this image" | OCR the image, return text |
| "Give me a .md of this document" | OCR and save as .md file |
| "What does this PDF say?" | OCR and summarize content |
| "OCR this receipt" | Extract text, optionally structure as JSON |

Error Handling

| Error | Cause | Solution |
| --- | --- | --- |
| 401 Unauthorized | Invalid API key | Verify key, guide to getting-started.md |
| 400 Bad Request | Invalid document | Check format and URL accessibility |
| 3310 File fetch error | URL not accessible | Use base64 for local files |
| Rate limit | Too many requests | Wait and retry |
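
One way to act on these errors is to capture the HTTP status with curl and map it to a next step. The sketch below is illustrative only: `ocr_status_hint` is a hypothetical name, and treating 429 as the rate-limit status is an assumption based on the standard HTTP code, since the table does not name one. The 3310 file fetch error is left out because the source does not say where that code appears in the response.

```shell
# Hypothetical helper (not part of the skill): turn an HTTP status into a suggested next step.
ocr_status_hint() {
  case "$1" in
    200) echo "ok" ;;
    401) echo "invalid API key: verify the key" ;;
    400) echo "invalid document: check format and URL accessibility" ;;
    429) echo "rate limited: wait and retry" ;;  # assumed status code for rate limiting
    *)   echo "unexpected status $1: inspect the response body" ;;
  esac
}

# Example use (commented out because it calls the network):
# status=$(curl -s -o response.json -w '%{http_code}' "https://api.mistral.ai/v1/ocr" \
#   -H "Authorization: Bearer $MISTRAL_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d @request.json)
# ocr_status_hint "$status"
```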

Supported Formats

| Format | Support |
| --- | --- |
| PDF | ✅ Direct (no conversion) |
| PNG | ✅ Direct |
| JPG/JPEG | ✅ Direct |
| WEBP | ✅ Direct |
| GIF | ✅ Direct |

No external dependencies required! Unlike other OCR solutions, Mistral OCR handles PDFs directly without needing pdftoppm, ImageMagick, or any other tools.

Pricing

As of 2025, Mistral OCR pricing:
  • $2 per 1,000 pages
  • 50% discount with Batch API
Check current rates at mistral.ai/pricing
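
As a rough worked example of the rates quoted above, the following sketch estimates the cost of a job from its page count. The function name `estimate_ocr_cost` is hypothetical, and the figures are the 2025 rates from this section, so the result is only as current as they are.

```shell
# Hypothetical helper (not part of the skill): estimate cost in USD from the rates quoted above.
estimate_ocr_cost() {
  pages="$1"
  mode="${2:-standard}"   # "standard" or "batch"
  awk -v p="$pages" -v m="$mode" 'BEGIN {
    cost = p * 2 / 1000            # 2 USD per 1,000 pages
    if (m == "batch") cost /= 2    # 50% discount with the Batch API
    printf "%.2f\n", cost
  }'
}
```

For instance, `estimate_ocr_cost 1500` prints `3.00`, and `estimate_ocr_cost 1500 batch` prints `1.50`.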

References

  • Getting Started - How to get your API key
  • PDF to Markdown - PDF conversion examples
  • Output Formats - JSON, Markdown, plain text
  • Step-by-Step Guide - Complete tutorial with examples

Skill by Parlamento AI