receipt-scanner

Receipt Scanner

Extract structured data from receipt images using OCR.

Features

  • OCR Processing: Extract text from receipt images
  • Data Extraction: Vendor, date, items, amounts, total, tax
  • Pattern Matching: Smart regex patterns for receipts
  • Multi-Format Support: JPG, PNG, PDF receipts
  • JSON/CSV Export: Structured data output
  • Batch Processing: Process multiple receipts
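
As a rough illustration of the pattern-matching and data-extraction features, the sketch below pulls vendor, date, items, tax, and total out of text that OCR has already produced. The regexes, the `parse_receipt` function, and the sample receipt are hypothetical and written only for this example; they are not the patterns receipt_scanner.py actually uses, and real receipts would likely need more tolerant rules.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration; not taken from receipt_scanner.py.
TOTAL_RE = re.compile(r"^[ \t]*total\b[^\d\n]*(\d+[.,]\d{2})[ \t]*$", re.I | re.M)
TAX_RE = re.compile(r"^[ \t]*(?:sales[ \t]+)?tax\b[^\d\n]*(\d+[.,]\d{2})[ \t]*$", re.I | re.M)
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{2,4})\b")
ITEM_RE = re.compile(r"^[ \t]*(.+?)[ \t]+\$?(\d+\.\d{2})[ \t]*$", re.M)


def _amount(match: Optional[re.Match]) -> Optional[float]:
    """Convert the captured amount to a float, accepting ',' as the decimal mark."""
    return float(match.group(1).replace(",", ".")) if match else None


def parse_receipt(text: str) -> dict:
    """Extract a few structured fields from OCR text (assumed layout: one entry per line)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    date = DATE_RE.search(text)
    items = []
    for name, price in ITEM_RE.findall(text):
        # Skip summary rows such as "Subtotal", "Tax", and "Total".
        if any(word in name.lower() for word in ("total", "tax")):
            continue
        items.append({"name": name, "price": float(price)})
    return {
        "vendor": lines[0] if lines else None,  # assumption: vendor is the first line
        "date": date.group(1) if date else None,
        "items": items,
        "tax": _amount(TAX_RE.search(text)),
        "total": _amount(TOTAL_RE.search(text)),
    }


SAMPLE = """Corner Cafe
2024-03-15
Coffee 3.50
Bagel 3.50
Subtotal 7.00
Tax 0.56
Total 7.56
"""

if __name__ == "__main__":
    print(parse_receipt(SAMPLE))
```

Anchoring the total and tax patterns to the start of a line keeps "Subtotal" from being mistaken for "Total"; treating the first non-empty line as the vendor is a simplification that will not hold for every receipt.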

CLI Usage

bash
python receipt_scanner.py --input receipt.jpg --output data.json
python receipt_scanner.py --batch receipts/ --output receipts.csv
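
For readers curious how these flags might be wired up, here is a minimal `argparse` sketch. It assumes that `--input` and `--batch` are mutually exclusive and that the export format is inferred from the `--output` file extension; neither behavior is confirmed by this README, and `build_parser` and `output_format` are hypothetical names.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(
        prog="receipt_scanner.py",
        description="Extract structured data from receipt images using OCR.",
    )
    # Assumption: exactly one of --input / --batch must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input", type=Path, help="path to a single receipt file")
    source.add_argument("--batch", type=Path, help="directory containing receipts")
    parser.add_argument("--output", type=Path, required=True,
                        help="destination file (.json or .csv)")
    return parser


def output_format(path: Path) -> str:
    """Infer the export format from the file extension (assumed behavior)."""
    suffix = path.suffix.lower()
    if suffix not in (".json", ".csv"):
        raise ValueError(f"unsupported output format: {path.name}")
    return suffix[1:]


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args, output_format(args.output))
```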

Dependencies

  • pytesseract>=0.3.10
  • pillow>=10.0.0
  • opencv-python>=4.8.0
  • pandas>=2.0.0