Deterministic migration-risk analysis for OCR-based invoice extraction workloads.
Invoice Migration Analyzer
Local CLI for deterministic migration-risk analysis of OCR-based invoice extraction workloads.
What it does / what it does not do
Answers one question: "Can this workload safely tolerate a cheaper LLM under operational risk constraints?"
Not an eval harness, benchmarking suite, model quality scorer, or generic document analyzer. Validation is source-supported only — OCR text is the supporting evidence, never ground truth.
Classification ladder
A row is labeled by the worst status across total_amount, currency, and invoice_date. vendor_name is warning-only.
| Label | Trigger |
|---|---|
| RISK | Any critical field has no source support, or unparseable extraction. |
| REVIEW_AMBIGUOUS | Any critical field has multiple competing source candidates. |
| REVIEW_INFERRED | Any critical field matches by tolerance/normalization, not literally. |
| SAFE | Every critical field has direct literal source support and zero ambiguity. |
SAFE invariant: both direct source support and zero unresolved ambiguity must hold on all critical fields. Any deviation demotes the row. REVIEW_INFERRED (tolerance/normalization match) never maps to SAFE — enforced by policy and by a runtime guard.
Guarantee on the shipped corpus: zero false-SAFE labels across all 200 rows.
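The ladder above can be sketched in a few lines of Python. This is an illustration only, not the package's actual code: the `Status` enum, the `label_row` function, and the severity order (RISK worst, then REVIEW_AMBIGUOUS, REVIEW_INFERRED, SAFE) are assumptions read off the table, and treating a missing critical field as RISK is likewise an assumption.

```python
from enum import IntEnum


class Status(IntEnum):
    # Hypothetical ordering, best to worst, so that max() yields the worst status.
    SAFE = 0
    REVIEW_INFERRED = 1
    REVIEW_AMBIGUOUS = 2
    RISK = 3


# vendor_name is deliberately absent: it is warning-only and never affects the label.
CRITICAL_FIELDS = ("total_amount", "currency", "invoice_date")


def label_row(field_status: dict[str, Status]) -> Status:
    """Return the worst status across the critical fields."""
    # Assumption: a critical field with no recorded status counts as RISK.
    worst = max(field_status.get(name, Status.RISK) for name in CRITICAL_FIELDS)

    # Runtime guard mirroring the SAFE invariant: a row may only be SAFE
    # when every critical field is individually SAFE.
    if worst is Status.SAFE:
        assert all(field_status[name] is Status.SAFE for name in CRITICAL_FIELDS)
    return worst
```

Because the label is a plain maximum over an ordered enum, a single REVIEW_INFERRED field is enough to keep a row out of SAFE, which is the behavior the invariant describes.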
Installation
pip install -e ".[fuzzy]"
python corpus\generate_corpus.py --full
Usage
invoice-analyzer run ^
--sample PATH (default: corpus\sample.jsonl) ^
--keys PATH (default: corpus\keys.json) ^
--output DIR (default: .\output) ^
--cache DIR (optional, enables replay cache) ^
--baseline-cost FLOAT (default: 0.015) ^
--candidate-cost FLOAT (default: 0.002) ^
--volume INT (default: 50000) ^
--max-rows INT (default: 1000, hard cap: 1000) ^
--no-detail ^
--baseline-model STR (default: baseline) ^
--candidate-model STR (default: candidate)
invoice-analyzer version
Output files
- output\report.md: human-readable report with decision summary, label table, conservative vs optimistic cost projection, and per-row evidence.
- output\report.html: same content rendered as a single self-contained HTML page.
- output\raw_results.jsonl: one JSON object per row with label, per-field statuses, evidence strings, reasons, cache flag, and any error.
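Since raw_results.jsonl holds one JSON object per line, it can be post-processed with the standard library alone. The sketch below tallies rows per label; the key name `"label"` is an assumption (the exact schema is not documented here), and `count_labels` is a hypothetical helper, not part of the package.

```python
import json
from collections import Counter
from pathlib import Path


def count_labels(path: str) -> Counter:
    """Tally rows per label in a JSON Lines results file."""
    counts: Counter = Counter()
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            # Assumed key name; rows without it are grouped under "UNKNOWN".
            counts[record.get("label", "UNKNOWN")] += 1
    return counts
```

For example, `count_labels(r"output\raw_results.jsonl")` would return a mapping from label to row count under these assumptions.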
Cost model
Two scenarios. Conservative: only SAFE rows migrate to the cheaper model; rest stays baseline. Optimistic: also routes REVIEW_INFERRED and REVIEW_AMBIGUOUS to the cheaper model. Conservative is the planning figure. Optimistic is an upper bound contingent on a human-review pipeline absorbing the review classes.
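The two scenarios can be expressed as a blended-cost calculation. This is a minimal sketch under stated assumptions, not the tool's own formula: it assumes --baseline-cost and --candidate-cost are per-invoice costs, that label shares in the sample extrapolate linearly to --volume, and `project_costs` is a hypothetical name.

```python
def project_costs(
    label_counts: dict[str, int],
    baseline_cost: float = 0.015,   # CLI default for --baseline-cost
    candidate_cost: float = 0.002,  # CLI default for --candidate-cost
    volume: int = 50_000,           # CLI default for --volume
) -> dict[str, float]:
    """Project total cost for the baseline, conservative, and optimistic scenarios."""
    total_rows = sum(label_counts.values())
    if total_rows == 0:
        raise ValueError("label_counts must contain at least one row")

    def blended(migrating_labels: set[str]) -> float:
        # Share of the workload routed to the cheaper model.
        migrated = sum(label_counts.get(label, 0) for label in migrating_labels)
        share = migrated / total_rows
        return volume * (share * candidate_cost + (1 - share) * baseline_cost)

    return {
        "baseline": volume * baseline_cost,
        # Conservative: only SAFE rows move to the candidate model.
        "conservative": blended({"SAFE"}),
        # Optimistic: the two review classes move as well; RISK stays on baseline.
        "optimistic": blended({"SAFE", "REVIEW_INFERRED", "REVIEW_AMBIGUOUS"}),
    }
```

RISK rows stay on the baseline model in both scenarios, so the optimistic figure only reaches the full candidate price when the sample contains no RISK rows.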
Running tests
pytest tests\ -v --tb=short --basetemp=.\pytest-tmp
The --basetemp=.\pytest-tmp flag works around AppData\Temp permission constraints that affect some Windows environments.
Corpus
200 adversarial rows. Base 100 (seed 42) + expansion 100 (seed 142). The 8 base attack vectors:
- multi_occurrence: same total appears in multiple labeled positions.
- ocr_near_miss: total garbled by O/0, l/1, spacing artifacts.
- inferred_equivalence: extracted matches by tolerance only (e.g. 300.00 vs 300).
- multi_currency: two currencies present in source.
- date_collision: invoice/due/PO dates all parseable and distinct.
- low_ocr_quality: heavily corrupted OCR throughout.
- correct_clean: well-formed invoice with single supporting evidence.
- repeated_total: total repeats but supporting amounts disagree.
Expansion adds 11 further vectors (european amount format, ambiguous date format, symbol-only currency, single-line OCR, vendor edge cases, amount-in-words, amount rounded in source, relative dates, currency implied not stated, amount/date/currency collisions, plus a clean control). Every adversarial example exists to attack SAFE credibility.
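The inferred_equivalence vector turns on the difference between a literal match and a match after normalization. The sketch below illustrates that distinction for amounts using `decimal.Decimal`; `match_amount` and its return values are hypothetical, and the analyzer's real normalization and tolerance rules may differ.

```python
from decimal import Decimal, InvalidOperation


def match_amount(extracted: str, source_candidate: str) -> str:
    """Classify how an extracted amount is supported by a source string.

    Returns "literal", "normalized", or "none" (illustrative values only).
    """
    # Literal support: the exact characters appear in the source.
    if extracted == source_candidate:
        return "literal"
    try:
        # Normalized support: numerically equal, but written differently.
        if Decimal(extracted) == Decimal(source_candidate):
            return "normalized"
    except InvalidOperation:
        # Unparseable text (e.g. OCR damage such as "3OO") offers no support.
        pass
    return "none"
```

Under the classification ladder, only the "literal" case could contribute to SAFE; a "normalized" result such as 300.00 against 300 corresponds to REVIEW_INFERRED.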
Hard limits / scope
- Invoices only.
- Single-turn JSON extraction.
- Local CLI. No SaaS. No Docker. No UI. No telemetry.
- One baseline vs one candidate model.
- 1000 row hard cap.