
Package to extract and decode invoice metadata from an AFIP CAE QR code link

Project description


AFIP invoice PDF QR CAE extract and decode

afipcaeqrdecode extracts AFIP CAE invoice metadata from PDF invoices. It renders the first page of the PDF, locates the AFIP QR code, decodes the QR payload, and returns the decoded metadata as a Python dict.

Installation

This package depends on qreader, which depends on pyzbar, which depends on the system zbar library.

On Linux (Ubuntu 22.04):

sudo apt-get install libzbar0

On macOS:

brew install zbar

Then install the package:

pip install afipcaeqrdecode

Quick Start

from afipcaeqrdecode import get_cae_metadata

invoice_metadata = get_cae_metadata("./tests/sample_files/2000005044986390.pdf")

Example output:

{
    "ver": 1,
    "fecha": "2023-02-10",
    "cuit": 30710145764,
    "ptoVta": 4,
    "tipoCmp": 1,
    "nroCmp": 25399,
    "importe": 2460,
    "moneda": "PES",
    "ctz": 1,
    "tipoDocRec": 80,
    "nroDocRec": 30717336905,
    "tipoCodAut": "E",
    "codAut": 73064176949471,
}

Return Value

get_cae_metadata(filepath, attempt_to_repair_json=True) returns:

  • a Python dict when AFIP QR metadata is decoded successfully
  • None when no AFIP QR metadata can be extracted

The returned dictionary preserves the value types present in the decoded JSON payload. Some invoices encode numeric-looking values as JSON strings, and those values are preserved as strings.
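Because value types are preserved as-is, callers that need consistent types may want to normalize the result themselves. The following is a minimal sketch of one way to do that. `normalize_metadata` and `NUMERIC_FIELDS` are hypothetical names that are not part of afipcaeqrdecode, and the choice of which fields to coerce is an assumption based on the example output above.

```python
from typing import Optional

# Hypothetical list of fields that appear as integers in the example output.
NUMERIC_FIELDS = (
    "cuit", "ptoVta", "tipoCmp", "nroCmp",
    "tipoDocRec", "nroDocRec", "codAut",
)


def normalize_metadata(metadata: Optional[dict]) -> Optional[dict]:
    """Return a copy with digit-only strings in NUMERIC_FIELDS converted to int.

    None (no QR metadata extracted) is passed through unchanged, and fields
    missing from the payload stay missing.
    """
    if metadata is None:
        return None
    normalized = dict(metadata)
    for field in NUMERIC_FIELDS:
        value = normalized.get(field)
        if isinstance(value, str) and value.strip().isdecimal():
            normalized[field] = int(value.strip())
    return normalized


# Example with a payload that encodes cuit as a JSON string.
raw = {"cuit": "30710145764", "ptoVta": 4, "moneda": "PES"}
clean = normalize_metadata(raw)
```

In real use, `raw` would be the value returned by `get_cae_metadata(...)`, so the `None` case should be checked before reading any field.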

How It Works

The decoding flow has two stages:

  1. Render the first page of the PDF with PyMuPDF and run qreader on the resulting image.
  2. If that fails, extract embedded images from the first page and run qreader on those images as a fallback.
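The two stages above amount to a fallback chain: try one strategy, and move on to the next only if it yields nothing. The sketch below illustrates that pattern only. It is not the package's actual code; `first_successful`, `decode_rendered_page`, and `decode_embedded_images` are hypothetical names, and the two strategy functions are stubs standing in for the PyMuPDF and qreader work.

```python
from typing import Callable, Iterable, Optional

Strategy = Callable[[str], Optional[str]]


def first_successful(pdf_path: str, strategies: Iterable[Strategy]) -> Optional[str]:
    """Run each strategy in order and return the first non-empty result."""
    for strategy in strategies:
        result = strategy(pdf_path)
        if result:
            return result
    return None


# Hypothetical stubs: a real implementation would render the page or
# extract embedded images and pass them to a QR decoder.
def decode_rendered_page(pdf_path: str) -> Optional[str]:
    return None  # pretend stage 1 found no QR code


def decode_embedded_images(pdf_path: str) -> Optional[str]:
    return "qr-text-from-embedded-image"  # pretend stage 2 succeeded


qr_text = first_successful(
    "invoice.pdf", [decode_rendered_page, decode_embedded_images]
)
```

Ordering the strategies from most to least likely to succeed keeps the common case cheap, since later stages run only when earlier ones return nothing.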

If the AFIP QR payload contains malformed JSON, json-repair is used by default to repair it before parsing.
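To make the payload step concrete, here is a minimal sketch of decoding an AFIP QR link, assuming the publicly documented format in which the link carries base64-encoded JSON in its `p` query parameter. `decode_qr_link` and its `repair` parameter are hypothetical and may differ from how the package implements this step.

```python
import base64
import json
from typing import Callable, Optional
from urllib.parse import parse_qs, urlparse


def decode_qr_link(
    qr_link: str, repair: Optional[Callable[[str], str]] = None
) -> Optional[dict]:
    """Decode the JSON payload from the `p` parameter of an AFIP QR link.

    `repair` is an optional callable that receives malformed JSON text and
    returns repaired JSON text. Returns None when decoding fails.
    """
    encoded = parse_qs(urlparse(qr_link).query).get("p", [None])[0]
    if not encoded:
        return None
    # parse_qs turns '+' into spaces; also accept URL-safe base64 characters.
    encoded = encoded.replace(" ", "+").replace("-", "+").replace("_", "/")
    encoded += "=" * (-len(encoded) % 4)  # restore any stripped padding
    try:
        text = base64.b64decode(encoded).decode("utf-8")
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        return None
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        if repair is None:
            return None
        try:
            payload = json.loads(repair(text))
        except json.JSONDecodeError:
            return None
    return payload if isinstance(payload, dict) else None


# Round trip with a small payload taken from the example output.
sample = {"ver": 1, "cuit": 30710145764, "moneda": "PES"}
token = base64.b64encode(json.dumps(sample).encode("utf-8")).decode("ascii")
decoded = decode_qr_link("https://www.afip.gob.ar/fe/qr/?p=" + token)
```

With json-repair installed, its `repair_json` function could plausibly be passed as `repair`; the sketch keeps it as a parameter so the example runs with the standard library alone.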

Why qreader instead of pyzbar

Earlier versions of this project used only pyzbar. Some real-world AFIP invoice QR codes did not decode reliably with pyzbar alone.

qreader still uses pyzbar for decoding, but it first uses a trained QR detector and applies preprocessing strategies that improve decode rates on difficult images.

Notes and Limitations

  • On first run, qreader may download model weights before decoding.
  • Some invoices omit fields such as fecha in the QR payload.
  • This project is tested mainly through integration tests that run against the sample PDFs in tests/sample_files/.
  • This package is still experimental. Use it carefully in production workloads.

Testing

Run the sample-based integration test suite with:

python -m unittest tests.test_sample_files

License

GNU Lesser General Public License v3.0 or later (LGPL-3.0-or-later). See LICENSE.
