Package to decode and extract invoice metadata from an AFIP CAE QR code link
AFIP invoice PDF QR CAE extract and decode
afipcaeqrdecode extracts AFIP CAE invoice metadata from PDF invoices. It renders the first page of the PDF, locates the AFIP QR code, decodes the QR payload, and returns the decoded metadata as a Python dict.
Installation
This package depends on qreader, which depends on pyzbar, which depends on the system zbar library.
On Linux (Ubuntu 22.04):
sudo apt-get install libzbar0
On macOS:
brew install zbar
Then install the package:
pip install afipcaeqrdecode
Quick Start
from afipcaeqrdecode import get_cae_metadata
invoice_metadata = get_cae_metadata("./tests/sample_files/2000005044986390.pdf")
Example output:
{
"ver": 1,
"fecha": "2023-02-10",
"cuit": 30710145764,
"ptoVta": 4,
"tipoCmp": 1,
"nroCmp": 25399,
"importe": 2460,
"moneda": "PES",
"ctz": 1,
"tipoDocRec": 80,
"nroDocRec": 30717336905,
"tipoCodAut": "E",
"codAut": 73064176949471,
}
Return Value
get_cae_metadata(filepath, attempt_to_repair_json=True) returns:
- a Python dict when AFIP QR metadata is decoded successfully
- None when no AFIP QR metadata can be extracted
The returned dictionary preserves the value types present in the decoded JSON payload. Some invoices encode numeric-looking values as JSON strings, and those values are preserved as strings.
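Because value types can differ between invoices, callers may want to normalize them before comparing or storing the metadata. The following is a minimal sketch, not part of the package: `normalize_numeric_fields` and `NUMERIC_FIELDS` are hypothetical names, and the field list is an assumption taken from the keys that are numeric in the example output above.

```python
from decimal import Decimal, InvalidOperation

# Assumption: these are the keys that hold numbers in the example output above.
NUMERIC_FIELDS = (
    "cuit", "ptoVta", "tipoCmp", "nroCmp", "importe",
    "ctz", "tipoDocRec", "nroDocRec", "codAut",
)


def normalize_numeric_fields(metadata):
    """Return a copy of metadata with numeric-looking strings converted to Decimal.

    Hypothetical helper: passes None through unchanged, so it can wrap the
    result of get_cae_metadata() directly.
    """
    if metadata is None:
        return None
    normalized = dict(metadata)
    for key in NUMERIC_FIELDS:
        value = normalized.get(key)
        if isinstance(value, str):
            try:
                normalized[key] = Decimal(value.strip())
            except InvalidOperation:
                # Not a number after all: keep the original string.
                pass
    return normalized
```

Values that are already numbers are left untouched, and missing keys are simply skipped, so the helper also tolerates payloads that omit fields.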
How It Works
The decoding flow has two stages:
- Render the first page of the PDF with PyMuPDF and run qreader on the resulting image.
- If that fails, extract embedded images from the first page and run qreader on those images as a fallback.
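The two-stage flow above follows a common "try each strategy in order" pattern. The sketch below illustrates only that control flow and is not the package's actual code: `first_decoded`, `decode_rendered_page`, and `decode_embedded_images` are hypothetical names, and the two stand-in functions return fixed values instead of calling PyMuPDF or qreader.

```python
from typing import Callable, Iterable, Optional


def first_decoded(strategies: Iterable[Callable[[], Optional[str]]]) -> Optional[str]:
    """Run each decoding strategy in order and return the first non-empty result."""
    for strategy in strategies:
        result = strategy()
        if result:
            return result
    return None


# Hypothetical stand-ins for the two stages described above.
def decode_rendered_page() -> Optional[str]:
    # Simulates the rendered-page stage finding no QR code.
    return None


def decode_embedded_images() -> Optional[str]:
    # Simulates the embedded-image fallback succeeding.
    return "decoded-qr-text"


qr_text = first_decoded([decode_rendered_page, decode_embedded_images])
```

Keeping each stage behind a zero-argument callable means the fallback only runs when the earlier stage returns nothing.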
If the AFIP QR payload contains malformed JSON, json-repair is used by default to repair it before parsing.
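For illustration, the payload-parsing step might look roughly like the sketch below. It assumes the QR text is a URL whose `p` query parameter carries the Base64-encoded JSON metadata; that layout, and the name `decode_afip_qr_url`, are assumptions for this example rather than the package's internals. Unlike the package, the sketch does not attempt a json-repair pass and simply returns None on malformed input.

```python
import base64
import json
from urllib.parse import parse_qs, urlparse


def decode_afip_qr_url(url):
    """Decode the JSON metadata carried by a QR URL (hypothetical helper).

    Assumes the metadata is Base64-encoded JSON in the 'p' query parameter.
    Returns None if the parameter is missing or cannot be decoded.
    """
    values = parse_qs(urlparse(url).query).get("p")
    if not values:
        return None
    # parse_qs turns '+' into a space; undo that for Base64 text.
    encoded = values[0].replace(" ", "+")
    # Restore any '=' padding that was dropped from the URL.
    encoded += "=" * (-len(encoded) % 4)
    try:
        return json.loads(base64.b64decode(encoded))
    except ValueError:
        # Covers invalid Base64, invalid UTF-8, and invalid JSON.
        return None


# Round-trip example with a small made-up payload.
sample = {"ver": 1, "cuit": 30710145764, "moneda": "PES"}
token = base64.b64encode(json.dumps(sample).encode("utf-8")).decode("ascii")
decoded = decode_afip_qr_url("https://www.afip.gob.ar/fe/qr/?p=" + token)
```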
Why qreader instead of pyzbar
Earlier versions of this project used only pyzbar. Some real-world AFIP invoice QRs did not decode reliably with pyzbar alone.
qreader still uses pyzbar for decoding, but it first uses a trained QR detector and applies preprocessing strategies that improve decode rates on difficult images.
Notes and Limitations
- On first run, qreader may download model weights before decoding.
- Some invoices omit fields such as fecha in the QR payload.
- This project is tested mainly with sample PDF integration tests in tests/sample_files/.
- This package is still experimental. Use it carefully in production workloads.
Testing
Run the sample-based integration test suite with:
python -m unittest tests.test_sample_files
License
GNU Lesser General Public License v3.0 or later (LGPL-3.0-or-later). See LICENSE.
File details
Details for the file afipcaeqrdecode-1.0.0.tar.gz.
File metadata
- Download URL: afipcaeqrdecode-1.0.0.tar.gz
- Upload date:
- Size: 8.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d6c4607937a0f6e3022de8218db586654369dcb85742c716a83396d856e32f3e |
| MD5 | 8f7f6c6e05bfffc547db19cdd8b2b740 |
| BLAKE2b-256 | 722afab3f8f19efac726b981fbe1860682fa2fe1f3488d37588f8492a4e997d5 |
File details
Details for the file afipcaeqrdecode-1.0.0-py3-none-any.whl.
File metadata
- Download URL: afipcaeqrdecode-1.0.0-py3-none-any.whl
- Upload date:
- Size: 7.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a0b99b6baba49b8fe4bbce3ca5102e8be468e1fa93d22fdba7d8170f975b32fd |
| MD5 | f9d932d6e63ca6361fb63f0d4facaa83 |
| BLAKE2b-256 | 8e1b2dd4f486adbc49703c32b10e5f9ee619a4e4fab3b338e53e614b5ba707e2 |