Import supplier invoices using the invoice2data lib
Account Invoice Import Invoice2data
This module is an extension of the module account_invoice_import: it adds support for regular PDF invoices, i.e. PDF invoices that don’t have an embedded XML file. It uses the invoice2data library, which extracts the text of the PDF invoice, finds a matching invoice template and executes that template to extract the useful information from the invoice.
To know the full story behind the development of this module, read this blog post.
This module requires the Python library invoice2data, available on GitHub, in version 0.2.74 (February 2018) or later.
To install the latest version of this library, run:
sudo pip install --upgrade invoice2data
If you use Ubuntu 16.04 LTS, you can use the pdftotext version 0.41.0 that is packaged in the distribution:
sudo apt-get install poppler-utils
If you want the invoice2data library to fall back on OCR when the PDF doesn’t contain text (only a very small minority of PDF invoices are image-based and require OCR), you should also install ImageMagick (to get the convert utility, which converts PDF to TIFF) and Tesseract OCR:
sudo apt-get install imagemagick tesseract-ocr
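To check that the command-line tools mentioned above are reachable, a small standalone script along these lines can help. It is only a sketch: the tool names (pdftotext, convert, tesseract) come from the instructions above, the helper name find_tools is made up for this example, and the script uses Python 3's shutil.which, so it is meant to be run outside Odoo.

```python
import shutil

# Command-line tools named in the installation notes above.
REQUIRED_TOOLS = ["pdftotext"]
OPTIONAL_OCR_TOOLS = ["convert", "tesseract"]


def find_tools(names):
    """Map each tool name to its location on PATH, or None if it is missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    found = find_tools(REQUIRED_TOOLS + OPTIONAL_OCR_TOOLS)
    for name, path in found.items():
        print("%-10s %s" % (name, path or "NOT FOUND"))
```

A missing convert or tesseract only matters if you need the OCR fallback.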
If you want to use custom invoice templates for the invoice2data lib (in addition to the templates provided by the invoice2data lib), you should add a line in your Odoo server configuration file such as:
invoice2data_templates_dir = /opt/invoice2data_local_templates
and store your invoice templates in YAML format (.yml extension) in the directory that you have configured above. If you add invoice templates to this directory, you don’t have to restart Odoo: they will be used automatically on the next invoice import.
If you want to use only your custom invoice templates and ignore the templates provided by the invoice2data lib, you should have in your Odoo server configuration file:
invoice2data_templates_dir = /opt/invoice2data_local_templates
invoice2data_exclude_built_in_templates = True
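To double-check what these two options evaluate to, the following sketch parses a configuration fragment with Python's standard configparser. It assumes the options live in the usual [options] section of the Odoo configuration file; the function name read_template_settings is hypothetical, and this is not how the module itself reads its configuration.

```python
import configparser

# Sample fragment; the [options] section header is an assumption based on
# the usual layout of an Odoo server configuration file.
SAMPLE_CONFIG = """
[options]
invoice2data_templates_dir = /opt/invoice2data_local_templates
invoice2data_exclude_built_in_templates = True
"""


def read_template_settings(config_text):
    """Return (templates_dir, exclude_built_in) from a config fragment."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # No custom directory configured -> None.
    templates_dir = parser.get(
        "options", "invoice2data_templates_dir", fallback=None)
    # Built-in templates stay enabled unless explicitly excluded.
    exclude_built_in = parser.getboolean(
        "options", "invoice2data_exclude_built_in_templates", fallback=False)
    return templates_dir, exclude_built_in


if __name__ == "__main__":
    print(read_template_settings(SAMPLE_CONFIG))
```

When the second option is absent, the sketch treats the built-in templates as still enabled, which matches the first configuration described above.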
French users should also install the module l10n_fr_business_document_import available in the French localization.
Go to the form view of the supplier and configure it with the following parameters:
- is a Company ? is True
- Supplier is True
- the TIN (i.e. VAT number) is set (the VAT number is used by default when searching the supplier in the Odoo partner database)
- in the Accounting tab, create an Invoice Import Configuration.
For the PDF invoices of your supplier that don’t have an embedded XML file, you will have to create a template file in YAML format for the invoice2data Python library. It is quite easy to do; if you are familiar with regular expressions, it should not take more than 10 minutes per supplier.
Here are some hints to help you add a template for your supplier:
- Take the Free SAS template file as an example. You will find a sample PDF invoice for this supplier under invoice2data/test/pdfs/2015-07-02-invoice_free_fiber.pdf
- Try to run the invoice2data library manually on the sample invoice of Free:
% python -m invoice2data.main --debug invoice2data/test/pdfs/2015-07-02-invoice_free_fiber.pdf
In the output, you will first get the text of the PDF, then some debug information on the parsing of the invoice and the regular expressions and, on the last line, the dict that contains the result of the parsing.
- if the VAT number of the supplier is present in the text of the PDF invoice, it is a good idea to use it as the keyword. It is good practice to add 2 other keywords: one for the language (for example, match on the word Invoice in the language of the invoice) and one for the currency, so as to match only the invoices of that supplier in this particular language and currency.
- the list of fields should contain the following entries:
- ‘vat’ with the VAT number of the supplier (if the VAT number of the supplier is not in the text of the PDF file, add a ‘partner_name’ key or, if the supplier is French and the module l10n_fr_invoice_pdf_import is installed, add a ‘siren’ key)
- ‘amount’ (‘amount’ is the total amount with taxes)
- ‘amount_untaxed’ or ‘amount_tax’ (one or the other, no need for both)
- ‘date’: the date of the invoice
- ‘date_due’, if this information is available in the text of the PDF file.
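The hints above can be illustrated with a simplified Python sketch of keyword matching and regexp-based field extraction. This is not the invoice2data implementation and not its YAML template format: the template dict, the sample text, the VAT number and the helper names (matches, extract, missing_fields) are all made up for illustration, and the extracted values are left as plain strings.

```python
import re

# Hypothetical template mirroring the hints above: three keywords
# (VAT number, language, currency) and one regexp per field.
TEMPLATE = {
    "keywords": ["FR12345678901", "Facture", "EUR"],
    "fields": {
        "vat": r"(FR\d{11})",
        "amount": r"Total TTC\s+(\d+\.\d{2})",
        "amount_untaxed": r"Total HT\s+(\d+\.\d{2})",
        "date": r"Date de facture\s+(\d{2}/\d{2}/\d{4})",
    },
}

# Made-up invoice text, standing in for the text extracted from a PDF.
SAMPLE_TEXT = """Facture
TVA FR12345678901
Date de facture 02/07/2015
Total HT 24.99 EUR
Total TTC 29.99 EUR
"""


def matches(template, text):
    """A template applies only if all of its keywords occur in the text."""
    return all(keyword in text for keyword in template["keywords"])


def extract(template, text):
    """Return the captured fields as strings, or None if no keyword match."""
    if not matches(template, text):
        return None
    result = {}
    for name, pattern in template["fields"].items():
        found = re.search(pattern, text)
        if found:
            result[name] = found.group(1)
    return result


def missing_fields(result):
    """List the entries from the field checklist above that are absent."""
    missing = [name for name in ("amount", "date") if name not in result]
    if not any(key in result for key in ("vat", "partner_name", "siren")):
        missing.append("vat, partner_name or siren")
    if "amount_untaxed" not in result and "amount_tax" not in result:
        missing.append("amount_untaxed or amount_tax")
    return missing


if __name__ == "__main__":
    data = extract(TEMPLATE, SAMPLE_TEXT)
    print(data)
    print("missing:", missing_fields(data))
```

Requiring all three keywords is what keeps a template from matching invoices of the same supplier issued in another language or currency.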
Bugs are tracked on GitHub Issues. In case of trouble, please check there whether your issue has already been reported. If you spotted it first, help us smash it by providing detailed feedback.
- Alexis de Lattre <firstname.lastname@example.org>
This module is maintained by the OCA.
OCA, or the Odoo Community Association, is a nonprofit organization whose mission is to support the collaborative development of Odoo features and promote its widespread use.
To contribute to this module, please visit https://odoo-community.org.