A supermarket receipt parser written in Python using tesseract OCR
A fuzzy receipt parser written in Python
This is a fuzzy receipt parser written in Python. It extracts information such as the shop, the date, and the total from scanned receipts. It can work as a standalone script or as part of our iOS and Android application.
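To illustrate what "fuzzy" extraction can mean for noisy OCR text, here is a minimal sketch using only the Python standard library. It is not the receipt-parser-core API: the shop list, the regular expressions, and the function names `find_shop` and `parse_receipt` are all hypothetical and chosen for illustration only.

```python
import re
from difflib import get_close_matches

# Hypothetical shop list; the real project keeps its own configuration.
KNOWN_SHOPS = ["rewe", "penny", "aldi", "lidl"]

# Assumed formats: dates like 12.03.2017, totals after a keyword such as SUMME.
DATE_RE = re.compile(r"\b(\d{2}\.\d{2}\.\d{2,4})\b")
TOTAL_RE = re.compile(r"(?:summe|total|gesamt)\D*(\d+[.,]\d{2})", re.IGNORECASE)


def find_shop(lines, cutoff=0.7):
    """Return the first known shop that approximately matches a word, or None."""
    for line in lines:
        for word in line.lower().split():
            # get_close_matches tolerates OCR errors such as "rfwe" for "rewe".
            match = get_close_matches(word, KNOWN_SHOPS, n=1, cutoff=cutoff)
            if match:
                return match[0]
    return None


def parse_receipt(text):
    """Extract shop, date, and total from OCR text; missing fields are None."""
    date = DATE_RE.search(text)
    total = TOTAL_RE.search(text)
    return {
        "shop": find_shop(text.splitlines()),
        "date": date.group(1) if date else None,
        # Normalize the decimal comma so the total is easy to convert later.
        "total": total.group(1).replace(",", ".") if total else None,
    }
```

Calling `parse_receipt` on text where OCR misread the shop name still resolves the shop, because the comparison is approximate rather than exact. The cutoff of 0.7 is an arbitrary starting point and would need tuning against real scans.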
Dependencies
The receipt-parser-core library depends on imagemagick. Please install imagemagick with your favorite package manager.
Usage
To convert all images from the data/img/ folder to text using tesseract and parse the resulting text files, run
make run
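The OCR step described above can be sketched in Python as follows. This is not a copy of what the Makefile does. It assumes the tesseract command-line tool is on the PATH, that text output goes to a data/txt folder (an assumed location), and that images use the listed file extensions. The function names are hypothetical.

```python
import subprocess
from pathlib import Path

# Assumed set of image extensions; adjust to match your scans.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def tesseract_commands(img_dir, txt_dir):
    """Build one `tesseract <image> <output base>` command per image file."""
    commands = []
    for image in sorted(Path(img_dir).iterdir()):
        if image.suffix.lower() in IMAGE_SUFFIXES:
            # tesseract appends ".txt" to the output base name itself.
            out_base = Path(txt_dir) / image.stem
            commands.append(["tesseract", str(image), str(out_base)])
    return commands


def run_ocr(img_dir="data/img", txt_dir="data/txt"):
    """Run tesseract on every image; raises if any invocation fails."""
    Path(txt_dir).mkdir(parents=True, exist_ok=True)
    for command in tesseract_commands(img_dir, txt_dir):
        subprocess.run(command, check=True)
```

Keeping command construction separate from execution makes it possible to inspect or log the commands without having tesseract installed.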
Docker
A Dockerfile is available with all dependencies needed to run the program.
To build the image, run
make docker-build
To run it on the sample files, try
make docker-run
By default, running the image will execute the make run command. To use it with your own images, run the following:
docker run -v <path_to_input_images>:/usr/src/app/data/img mre0/receipt_parser
History
This project started as a hackathon idea. Read more about it on the trivago techblog. Also read the comments on Hacker News. There's also a talk about the project. The library is now available on PyPI.
File details
Details for the file receipt_parser_core-0.2.5.tar.gz.
File metadata
- Download URL: receipt_parser_core-0.2.5.tar.gz
- Upload date:
- Size: 12.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.6 CPython/3.9.5 Linux/5.12.4-arch1-2
File hashes
Algorithm | Hash digest
---|---
SHA256 | fbe1eecf75f3c7ae19aebf6b510d4314df5ed6b318c912902bb1a59cf76400a8
MD5 | fa851a918a7a137091a6f030870f2033
BLAKE2b-256 | f553f564340519bc71a27ddba58ed2f20b2dbeb52fc4c2b06ca7a5b09cae8c5d
File details
Details for the file receipt_parser_core-0.2.5-py3-none-any.whl.
File metadata
- Download URL: receipt_parser_core-0.2.5-py3-none-any.whl
- Upload date:
- Size: 14.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.6 CPython/3.9.5 Linux/5.12.4-arch1-2
File hashes
Algorithm | Hash digest
---|---
SHA256 | dab82af428e33e56f339873979c22c2c967292dc872594594d2c0d7cf6f85c83
MD5 | 2664ec2d406e907817e1ef6be81d47bf
BLAKE2b-256 | b4bad7635ce2410e2be48f3acb2d818272027c0132e99a2ef0bfab866ae8afe6
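A downloaded file can be checked against a published SHA256 digest with the Python standard library. The following is a minimal sketch; the function names `sha256_of` and `matches_published` are hypothetical, and the file path is whatever location you downloaded the archive to.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks to bound memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and spaces."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published("receipt_parser_core-0.2.5.tar.gz", "fbe1eecf...")` with the full SHA256 value from the table should return True for an intact download of the source distribution.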