invoice-organizer
Invoice Organizer
Bulk-categorize a CSV of invoices or receipts, detect duplicates, and produce a tax-ready monthly summary.
Keywords
invoice, invoices, receipt, receipts, expense, expenses, bookkeeping, accounting, tax, tax prep, categorization, vendor, reimbursement, monthly summary
Quick Start
Categorize 200 Receipts in 1 Minute
- Export receipts from your bank or expense tool as a CSV with columns: `date,vendor,description,amount,currency` (`currency` is optional)
- Run: `python scripts/invoice_categorizer.py receipts.csv`
- Review the categorized output and override anything wrong via the rules file
- Export the monthly summary for handoff to your accountant
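Before running the categorizer, it can help to confirm that the export has the expected header. The following is a minimal pre-flight sketch, not part of the shipped script; the helper name `missing_columns` and the case-insensitive header match are assumptions made for illustration.

```python
import csv
import io

# Required columns per this document; currency is optional.
REQUIRED_COLUMNS = {"date", "vendor", "description", "amount"}


def missing_columns(csv_text: str) -> set[str]:
    """Return the required column names absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    # Assumption: header names are compared case-insensitively.
    present = {name.strip().lower() for name in header}
    return REQUIRED_COLUMNS - present


# Illustrative sample data, not a real export.
sample = (
    "date,vendor,description,amount,currency\n"
    "2024-03-01,Acme Hosting,Monthly server plan,49.00,USD\n"
)
print(missing_columns(sample))           # empty set: header is complete
print(missing_columns("date,amount\n"))  # vendor and description are missing
```

An empty result means the file is ready to pass to `scripts/invoice_categorizer.py`.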
Core Workflows
Workflow 1: Monthly Bookkeeping
Goal: Convert a month of unstructured receipts into a categorized, tax-ready summary in under 10 minutes.
Steps:
- Export receipts as CSV from your bank, card, or expense tool
- Run: `python scripts/invoice_categorizer.py receipts.csv`
- Review the uncategorized bucket; these rows need rules added or a manual override
- Add rules to `assets/category_rules.json` for any recurring vendors
- Re-run; the uncategorized count should drop each month as the rules file grows
- Drop the monthly summary into `assets/monthly_summary_template.md`
Expected Output: Categorized expense list + monthly totals by category + duplicate-suspect list.
Time Estimate: 10 minutes/month after initial rules are seeded.
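To illustrate why the uncategorized count shrinks as rules are added, here is a rough sketch of keyword-based categorization. The rules layout (category name mapped to a list of keywords) and the `categorize` helper are hypothetical; the actual schema of `assets/category_rules.json` is not shown in this document and may differ.

```python
import json

# Hypothetical rules layout: category name -> list of lowercase keywords.
# The real assets/category_rules.json may be structured differently.
RULES_JSON = """
{
  "Software": ["github", "aws"],
  "Travel": ["airlines", "hotel"]
}
"""


def categorize(vendor: str, description: str, rules: dict[str, list[str]]) -> str:
    """Return the first category whose keyword appears in vendor or description."""
    haystack = f"{vendor} {description}".lower()
    for category, keywords in rules.items():
        if any(keyword in haystack for keyword in keywords):
            return category
    return "Uncategorized"


rules = json.loads(RULES_JSON)
print(categorize("GitHub Inc", "Team plan", rules))      # Software
print(categorize("Corner Cafe", "Client lunch", rules))  # Uncategorized
```

Under this assumed layout, adding a keyword for a recurring vendor moves all of its future receipts out of the uncategorized bucket.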
Workflow 2: Duplicate Detection
Goal: Catch double-entered receipts before they reach the books.
Steps:
- Run: `python scripts/invoice_categorizer.py receipts.csv --json`
- Inspect the `duplicates_suspected` list
- Confirm whether each is a true duplicate (same charge entered twice) or a coincidence (same amount on different days at different vendors)
- Remove confirmed duplicates from the source CSV; re-run
Expected Output: Cleaned CSV with no duplicate rows.
Time Estimate: 2-3 minutes per month.
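The categorizer is described as flagging rows with the same vendor and amount within 3 days. The sketch below shows one way such a check could work, which may help when deciding whether a flagged pair is a real duplicate. The row layout, the `suspected_duplicates` helper, the case-insensitive vendor match, and the inclusive 3-day window are assumptions for illustration, not the script's actual implementation.

```python
from datetime import date
from itertools import combinations

# Illustrative rows as (row_number, date, vendor, amount); not real data.
rows = [
    (1, date(2024, 3, 1), "Acme Hosting", "49.00"),
    (2, date(2024, 3, 2), "Acme Hosting", "49.00"),   # likely double entry
    (3, date(2024, 3, 20), "Acme Hosting", "49.00"),  # next billing cycle
    (4, date(2024, 3, 2), "Corner Cafe", "49.00"),    # same amount, other vendor
]


def suspected_duplicates(rows, window_days=3):
    """Pair up rows with the same vendor and amount dated within window_days."""
    pairs = []
    for a, b in combinations(rows, 2):
        same_charge = a[2].lower() == b[2].lower() and a[3] == b[3]
        if same_charge and abs((a[1] - b[1]).days) <= window_days:
            pairs.append((a[0], b[0]))
    return pairs


print(suspected_duplicates(rows))  # [(1, 2)]
```

Rows 1 and 3 share a vendor and amount but fall outside the window, which is the typical shape of a legitimate recurring charge.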
Workflow 3: Vendor Spend Review
Goal: Find spend creep — vendors whose monthly total grew significantly without you noticing.
Steps:
- Run the categorizer separately on each of the last 3-6 months
- Compare per-vendor totals month-over-month
- Flag any vendor whose total grew by more than 25% with no obvious business reason
- Renegotiate, switch, or accept the increase; revisit quarterly
Expected Output: Vendor-spend trend list with flagged growth.
Time Estimate: 15 minutes per quarter.
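The comparison step can be sketched in a few lines once per-vendor totals for each month are collected. This example assumes the totals have already been gathered into a dictionary by hand or from the script's output, and it measures growth from the first to the last month of the window; both choices, and the `flag_spend_creep` name, are illustrative assumptions.

```python
from decimal import Decimal

# Illustrative per-vendor monthly totals, oldest month first; not real data.
monthly_totals = {
    "Acme Hosting": [Decimal("100.00"), Decimal("105.00"), Decimal("140.00")],
    "Corner Cafe": [Decimal("80.00"), Decimal("82.00"), Decimal("85.00")],
}


def flag_spend_creep(totals, threshold=Decimal("0.25")):
    """Return vendors whose latest total grew more than threshold over the first month."""
    flagged = {}
    for vendor, series in totals.items():
        first, last = series[0], series[-1]
        if first > 0:  # skip vendors with no baseline spend
            growth = (last - first) / first
            if growth > threshold:
                flagged[vendor] = growth
    return flagged


for vendor, growth in flag_spend_creep(monthly_totals).items():
    print(f"{vendor}: +{growth:.0%}")  # Acme Hosting: +40%
```

`Decimal` is used instead of `float` so that currency totals are compared without binary rounding error.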
Tools
invoice_categorizer.py
Reads a CSV of receipts/invoices and:
- Categorizes each row by vendor + description against rules in `assets/category_rules.json` (extensible)
- Aggregates totals per category and per vendor
- Detects likely duplicates (same vendor + amount within 3 days)
- Flags uncategorized items for manual review
```bash
# Human-readable summary
python scripts/invoice_categorizer.py receipts.csv

# JSON for programmatic use
python scripts/invoice_categorizer.py receipts.csv --json

# Use a custom rules file
python scripts/invoice_categorizer.py receipts.csv --rules my-rules.json
```
**Expected CSV columns:** `date, vendor, description, amount` (currency optional)
**Date formats accepted:** `YYYY-MM-DD`, `MM/DD/YYYY`, `DD/MM/YYYY`
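Two of the accepted formats overlap: a value such as `03/04/2024` is valid as both `MM/DD/YYYY` and `DD/MM/YYYY`. The sketch below shows one plausible way to parse the three formats by trying them in a fixed order. The order used here and the `parse_receipt_date` helper are assumptions; this document does not state how the script resolves ambiguous dates.

```python
from datetime import date, datetime

# Tried in this order; the order is an assumption, not documented behavior.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d/%m/%Y")


def parse_receipt_date(text: str) -> date:
    """Parse a date string using the first format that matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {text!r}")


print(parse_receipt_date("2024-03-15"))  # 2024-03-15
print(parse_receipt_date("25/12/2024"))  # 2024-12-25 (only valid as DD/MM/YYYY)
print(parse_receipt_date("03/04/2024"))  # 2024-03-04 (ambiguous; read as MM/DD/YYYY here)
```

Exporting dates as `YYYY-MM-DD` where possible sidesteps the ambiguity entirely.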
---
Reference Guides
- `references/expense_categorization_guide.md`: Standard expense categories, common tax buckets (US Schedule C, UK self-employment, generic), how to map vendors to categories
Templates
- `assets/category_rules.json`: Default rules; extend with your recurring vendors
- `assets/monthly_summary_template.md`: Format for handing the monthly summary to an accountant
Best Practices
- Categorize monthly, not annually. Annual catch-up bookkeeping always misses receipts and produces guess-categorization.
- Grow the rules file over time. First month: 30% uncategorized. Sixth month: < 5%. The compounding return on rule-writing is high.
- Keep evidence. Categorization is bookkeeping; receipts (PDFs, photos) are tax evidence. Store separately from this script's output.
- Don't trust auto-categorization for tax filing. Use it for prep; have a human (you or your accountant) sign off before filing.
- Currency consistency. If you have multi-currency receipts, convert them to a single currency at the month-end FX rate before running this script; it does not handle FX conversion.
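Since the script does not handle FX, conversion has to happen as a pre-processing step. The following is a minimal sketch of that step; the rate table, the choice of USD as the base currency, and the `to_base_currency` helper are all hypothetical, and the rates shown are placeholders rather than real market data.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical month-end rates to USD; substitute rates from your own source.
MONTH_END_RATES = {"USD": Decimal("1"), "EUR": Decimal("1.08"), "GBP": Decimal("1.26")}


def to_base_currency(amount: str, currency: str, rates=MONTH_END_RATES) -> Decimal:
    """Convert an amount to the base currency, rounded to cents."""
    converted = Decimal(amount) * rates[currency.upper()]
    return converted.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(to_base_currency("100.00", "EUR"))  # 108.00
print(to_base_currency("19.99", "gbp"))   # 25.19
```

Applying this to the `amount` column before running the categorizer keeps the per-category and per-vendor totals in one currency.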
Integration Points
- Pairs with `finance/` skills for budgeting and forecasting
- Feeds into `c-level-advisor/cs-cfo-advisor` cash-flow workflows
- Used by the solo-founder persona for monthly close