Package to decode and extract invoice metadata from an AFIP CAE qr code link

AFIP invoice pdf qr CAE extract and decode

This Python package uses pdf2image to convert the first page of an AFIP invoice PDF into an image, then runs qreader on that image to locate and decode the AFIP CAE QR code and extract relevant invoice metadata such as:

  • Invoice date
  • CUIT of invoice creator
  • AFIP electronic invoice point of sale (Punto de venta)
  • Invoice number
  • Amount
  • Currency
  • CUIT of invoice recipient

And other less important properties.
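
The pipeline described above can be sketched roughly as follows. This is an illustration, not the package's actual implementation: `decode_first_page_qr` is a hypothetical helper name, and the sketch assumes pdf2image (plus its Poppler system dependency), numpy and qreader are installed. The third-party imports are kept inside the function so the snippet can be loaded without them.

```python
def decode_first_page_qr(pdf_path):
    """Render page 1 of a PDF and return the strings decoded from its QR codes."""
    # Local imports: the heavy dependencies are only needed when the
    # function is actually called (hypothetical helper, not the package API).
    import numpy as np
    from pdf2image import convert_from_path
    from qreader import QReader

    # Render only the first page; convert_from_path returns a list of PIL images.
    pages = convert_from_path(pdf_path, first_page=1, last_page=1)
    if not pages:
        return []

    # qreader works on RGB numpy arrays.
    image = np.array(pages[0].convert("RGB"))

    # One entry per detected QR code; an entry is None when a code was
    # detected but could not be decoded.
    decoded = QReader().detect_and_decode(image=image)
    return [text for text in decoded if text is not None]
```

The returned strings would still need to be parsed as AFIP CAE QR URLs, which `get_cae_metadata` does for you.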

Why qreader instead of pyzbar

Initially this library used only pyzbar; however, we came across some QR codes that pyzbar alone could not decode successfully.

qreader depends on pyzbar, but uses a pre-trained AI model to detect and segment QR codes. Using the information extracted by this model, it applies different image preprocessing techniques that heavily increase pyzbar's decoding rate.

Example Usage and notes about metadata

Using the included sample file for demonstration (run from the repository root):

from afipcaeqrdecode import get_cae_metadata

invoice_metadata = get_cae_metadata('./tests/sample_files/2000005044986390.pdf')

Here, invoice_metadata will evaluate to:

{
    "ver":1,
    "fecha":"2023-02-10", #I've found this field to be missing in some decodes
    "cuit":30710145764,
    "ptoVta":4,
    "tipoCmp":1,
    "nroCmp":25399,
    "importe":2460,
    "moneda":"PES",
    "ctz":1,
    "tipoDocRec":80,
    "nroDocRec":30717336905,
    "tipoCodAut":"E",
    "codAut":73064176949471
}

# The actual output will not be pretty-printed; it will be stripped of all whitespace and formatting characters.
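
Because `fecha` can be missing, consumer code should read the metadata defensively. The following is a minimal sketch; `summarize_invoice` and its output keys are hypothetical names, and it accepts either a JSON string or a dict because this README does not pin down which of the two the package returns.

```python
import json
from datetime import date


def summarize_invoice(metadata):
    """Pick out the commonly used fields from decoded CAE metadata."""
    # Assumption: the metadata may arrive as a JSON string or as a dict.
    if isinstance(metadata, str):
        metadata = json.loads(metadata)

    # "fecha" is sometimes absent, so fall back to None instead of raising.
    raw_date = metadata.get("fecha")
    invoice_date = date.fromisoformat(raw_date) if raw_date else None

    return {
        "date": invoice_date,
        "issuer_cuit": metadata.get("cuit"),
        "point_of_sale": metadata.get("ptoVta"),
        "invoice_number": metadata.get("nroCmp"),
        "amount": metadata.get("importe"),
        "currency": metadata.get("moneda"),
        "recipient_doc_number": metadata.get("nroDocRec"),
    }


# Sample payload without "fecha", using values from the example above.
sample = '{"ver":1,"cuit":30710145764,"ptoVta":4,"nroCmp":25399,"importe":2460,"moneda":"PES","nroDocRec":30717336905}'
summary = summarize_invoice(sample)
```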

Salvaging bad QR code z-indexing on invoices, bad AFIP CAE urls, and bad JSON

Some bad PDFs have other images overlapping the AFIP CAE QR code, so we implemented a second-pass code path that uses PyMuPDF to extract all images embedded in the invoice and then runs qreader on each of them.
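
A fallback of this kind could look roughly like the sketch below. It is not the package's actual code: `decode_embedded_images` is a hypothetical name, and PyMuPDF, Pillow, numpy and qreader are assumed to be installed (they are imported inside the function so the snippet loads without them).

```python
import io


def decode_embedded_images(pdf_path):
    """Run qreader on every image embedded in a PDF and collect decoded strings."""
    # Local imports of third-party packages (hypothetical helper, not the package API).
    import fitz  # PyMuPDF
    import numpy as np
    from PIL import Image
    from qreader import QReader

    reader = QReader()
    results = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            for image_info in page.get_images(full=True):
                xref = image_info[0]  # first tuple item is the image's xref
                image_bytes = doc.extract_image(xref)["image"]
                try:
                    pil_image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
                except OSError:
                    continue  # skip image formats Pillow cannot read
                decoded = reader.detect_and_decode(image=np.array(pil_image))
                results.extend(text for text in decoded if text is not None)
    return results
```

Extracting the raw embedded images sidesteps whatever is drawn on top of the QR code when the page is rendered.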

When the AFIP CAE QR URL was constructed incorrectly or has some parts missing, we try to decode it anyway.
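
As an illustration of lenient decoding, the sketch below assumes the QR URL has the form `https://www.afip.gob.ar/fe/qr/?p=<base64 JSON>`, which is an assumption not stated in this README. `decode_cae_url` is a hypothetical helper, not the package's implementation; it tolerates a missing `p=` key and truncated base64 padding.

```python
import base64
import json
from urllib.parse import unquote, urlsplit


def decode_cae_url(url):
    """Best-effort decode of the JSON payload carried by an AFIP QR URL."""
    query = urlsplit(url).query
    # Normally the payload is the value of "p"; tolerate a missing "p=" key.
    payload = query[2:] if query.startswith("p=") else query
    # unquote() resolves %XX escapes but, unlike parse_qs, leaves "+" intact,
    # which matters because "+" is a valid base64 character.
    payload = unquote(payload).strip()
    # Re-create padding that may have been truncated.
    payload = payload.rstrip("=")
    payload += "=" * (-len(payload) % 4)
    raw = base64.b64decode(payload)
    return json.loads(raw.decode("utf-8"))


# Build a URL with its padding stripped to simulate a slightly broken link.
original = {"ver": 1, "cuit": 30710145764, "ptoVta": 4, "nroCmp": 25399}
encoded = base64.b64encode(json.dumps(original).encode("utf-8")).decode("ascii")
url = "https://www.afip.gob.ar/fe/qr/?p=" + encoded.rstrip("=")
decoded = decode_cae_url(url)
```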

We came across many decoded metadata payloads with bad JSON that had to be repaired in the consumer application. With this in mind, we included json-repair (https://pypi.org/project/json-repair/) and turned it on by default.
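
One way to combine strict parsing with json-repair is sketched below. How the package itself wires in json-repair is not documented here, so `parse_metadata` is a hypothetical helper; it only imports json-repair when strict parsing fails.

```python
import json


def parse_metadata(text):
    """Parse decoded QR text as JSON, repairing it only when strict parsing fails."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Imported lazily so the strict path works without json-repair installed.
        from json_repair import repair_json

        # repair_json returns a repaired JSON string by default.
        return json.loads(repair_json(text))


parsed = parse_metadata('{"ver":1,"moneda":"PES"}')
```

Trying the strict parser first keeps well-formed payloads untouched and limits repairs to the inputs that actually need them.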

System Dependencies and their installation

This package depends on qreader, which in turn depends on pyzbar, which in turn depends on the ZBar system library.

Check your OS documentation on what package to install to get ZBar working with pyzbar.

On Linux (Ubuntu 22.04):

sudo apt-get install libzbar0

On Mac OS X:

brew install zbar

Installation using pip

After installing the system dependencies, you can install the package from PyPI:

pip install afipcaeqrdecode

First run notice

On first run, qreader will download the weights for its QR detector AI model; program operation then resumes automatically.

WARNING

This is an experimental package, USE IN PRODUCTION AT YOUR OWN RISK.

It is barely tested; I'm sharing it so I can import it as a PyPI package in another project that consumes it.

Credits

All the other library authors this package depends on. Facundo Mainere for helping with JWT decode.

Author: Emiliano Mesquita.

License

GNU LGPLv3.
