SR Invoice Parser is a small library (crawler) that parses invoices from a URL and extracts the relevant information. Built for the Serbian market.

SR Invoice Parser

SR Invoice Parser is a small library that parses invoices and extracts the relevant information. It is designed to work with invoices from TaxCore, the system used by the Tax Administration of the Republic of Serbia (Poreska uprava Republike Srbije).

It works with the domain suf.purs.gov.rs, where the TaxCore website displays invoices.

The QR code on an invoice encodes a URL to the invoice web page. The parser extracts the relevant information from that page, much like a crawler.
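Before handing a scanned QR code value to the parser, it can help to check that it really points at the invoice site. The following is a minimal sketch, not part of the library: `looks_like_invoice_url` is a hypothetical helper, and the requirements it enforces (an `https` scheme and a non-empty `vl` query parameter) are assumptions based on the example URL shown in this README.

```python
from urllib.parse import parse_qs, urlparse

# Host named in this README as the domain that serves invoice pages.
TAXCORE_HOST = "suf.purs.gov.rs"


def looks_like_invoice_url(url):
    """Return True if url is an https link on the invoice host with a non-empty 'vl' parameter.

    Hypothetical helper: the https and 'vl' requirements are assumptions.
    """
    parts = urlparse(url)
    if parts.scheme != "https" or parts.hostname != TAXCORE_HOST:
        return False
    # parse_qs drops blank values by default, so "vl=" counts as missing.
    return bool(parse_qs(parts.query).get("vl"))
```

A URL that fails this check can be rejected up front, without making a network request.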

Installation

Install SR Invoice Parser with pip:

pip install sr-invoice-parser

Usage

The InvoiceParser class is the entry point for using the parser.

Methods

  • get_data() - Extracts all the data from the invoice and returns it as a dictionary.
  • get_company_name() - Extracts the company name.
  • get_company_tin() - Extracts the company's tax identification number/PFR.
  • get_buyer_tin() - Extracts the buyer's tax identification number/PFR.
  • get_total_amount() - Extracts the total amount of the invoice.
  • get_dt() - Extracts the date and time of the invoice and converts it to UTC as a datetime object.
  • get_invoice_number() - Extracts the invoice number.
  • get_invoice_text() - Extracts the full text of the invoice with QR code base64.
  • get_items() - Extracts item details from the invoice. Returns a list of dictionaries with the keys name, quantity, price, and total_price.
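As an illustration of working with the get_items() result, the sketch below sums total_price across items. It runs on made-up sample data shaped like the keys listed above rather than on a live invoice. The `items_total` helper is hypothetical, and the assumption that amounts arrive as floats comes from the example response in this README.

```python
from decimal import Decimal

# Made-up sample shaped like the documented get_items() result.
items = [
    {"name": "Item 1", "quantity": 2, "price": 50.0, "total_price": 100.0},
    {"name": "Item 2", "quantity": 1, "price": 23.45, "total_price": 23.45},
]


def items_total(items):
    """Sum total_price over all items, going through Decimal to avoid float drift."""
    return sum(
        (Decimal(str(item["total_price"])) for item in items),
        Decimal("0"),
    )


total = items_total(items)
print(total)
```

The result can be compared with the value from get_total_amount() as a sanity check on the parsed invoice.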

Here's a basic example of how to use it:

from sr_invoice_parser import InvoiceParser

parser = InvoiceParser(url="https://suf.purs.gov.rs/v/?vl=...")
# or fetch and parse the HTML page instead of JSON (json=True is the default)
parser = InvoiceParser(url="https://suf.purs.gov.rs/v/?vl=...", json=False)
# or parse HTML that you have already fetched yourself
parser = InvoiceParser(html_text="<HTML source code of invoice web page>")

parser.get_data()

parser.get_company_name()
parser.get_company_tin()
parser.get_buyer_tin()
parser.get_total_amount()
parser.get_dt()
parser.get_invoice_number()
parser.get_invoice_text()
parser.get_items()

Example response data

{
    "company_name": "Company Name",
    "company_tin": "123456789",
    "buyer_tin": "987654321",
    "invoice_number": "QWERTYU1-QWERTYU1-12345",
    "invoice_datetime": datetime.datetime(2021, 1, 1, 0, 0, tzinfo=datetime.timezone.utc),
    "invoice_total_amount": 123.45,
    "invoice_text": "============ ФИСКАЛНИ РАЧУН ============.....",
    "invoice_items": [
        {
            "name": "Item 1",
            "quantity": 1,
            "price": 123.45,
            "total_price": 123.45
        }
    ]
}
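The example response contains a datetime object, which `json.dumps` cannot serialize on its own. Here is a minimal sketch of one way to store or forward the data as JSON. The `to_json` helper is hypothetical, and the sample dictionary is a shortened copy of the example above.

```python
import datetime
import json

# Shortened copy of the example response above.
data = {
    "company_name": "Company Name",
    "invoice_datetime": datetime.datetime(2021, 1, 1, 0, 0, tzinfo=datetime.timezone.utc),
    "invoice_total_amount": 123.45,
}


def to_json(data):
    """Serialize the invoice dictionary, writing datetimes as ISO 8601 strings."""

    def encode(value):
        if isinstance(value, datetime.datetime):
            return value.isoformat()
        raise TypeError(f"Cannot serialize {type(value).__name__}")

    # ensure_ascii=False keeps Cyrillic invoice text readable in the output.
    return json.dumps(data, default=encode, ensure_ascii=False)


encoded = to_json(data)
print(encoded)
```

Reading the value back with `datetime.datetime.fromisoformat` restores a timezone-aware datetime.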

Check the test_parser.py file for more examples.

Handling Exceptions

The module has custom exceptions for handling various error scenarios:

  • ParserParseException - Raised when an error occurs while parsing the HTML content.
  • ParserRequestException - Raised for errors related to fetching HTML content.
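One way to handle the two exceptions separately is sketched below. So that the snippet runs without the package or network access, it defines stand-in exception classes with the documented names. With the real package you would import them instead; the exact import path is not stated here, so check the package source. `fetch_invoice_data` is a hypothetical wrapper, not part of the library.

```python
# Stand-ins so this sketch runs on its own. With the real package, import the
# exception classes of the same name instead (import path is an assumption).
class ParserRequestException(Exception):
    """Stand-in for the library's fetch error."""


class ParserParseException(Exception):
    """Stand-in for the library's parsing error."""


def fetch_invoice_data(load):
    """Call load() and map the two documented failure modes to a (status, value) pair."""
    try:
        return "ok", load()
    except ParserRequestException as exc:
        # Fetching the page failed; a retry may be reasonable.
        return "request_failed", str(exc)
    except ParserParseException as exc:
        # The page was fetched but could not be parsed; retrying is unlikely to help.
        return "parse_failed", str(exc)
```

With the library installed, `load` could be something like `lambda: InvoiceParser(url=...).get_data()`, using the method listed under Methods.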

Package Dependencies

Thanks to the following packages:

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

If you have any questions, please contact us via email: hello@innovigo.co

License

This project is licensed under the MIT License.
