A Python wrapper for Tesseract-OCR.
Project description
tessy
tessy is a Python wrapper for Google's Tesseract-OCR, an optical character recognition engine used to detect and extract text data from various image file formats.
Features
- No initial dependencies besides Tesseract.
- Supports input images in PNG, JPG, JPEG, GIF, TIF and BMP formats.
- Supports multiple input images via a text file (.txt).
- Supports image objects from:
- Dynamically detects and imports the corresponding image module at runtime.
- Supports txt, box, pdf, hocr, tsv and osd as output file formats.
- Supports multiple output formats.
- Can convert any raw output data to string, bytes or dict (except pdf).
- Works on macOS, Linux and Windows.
- Well documented.
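
This description does not show tessy's own API, so the following is only a minimal sketch of the ideas behind three of the features above, written against the Tesseract command line directly rather than against tessy. It assumes Tesseract 4 or later is installed as `tesseract`. The function names (`write_image_list`, `build_command`, `tsv_to_dicts`) are hypothetical and are not part of tessy. The sketch relies on two Tesseract behaviours: a text file listing one image path per line is accepted as input, and several output config names can be given in one run.

```python
import csv
import io
from pathlib import Path

# Output formats this sketch passes through as Tesseract config names.
# Assumption: Tesseract 4+ (ships configs named txt, pdf, hocr and tsv).
CONFIG_NAMES = ("txt", "pdf", "hocr", "tsv")


def write_image_list(image_paths, list_file):
    """Write one image path per line, the list format Tesseract accepts."""
    lines = "\n".join(str(p) for p in image_paths) + "\n"
    Path(list_file).write_text(lines)
    return str(list_file)


def build_command(input_path, output_base, formats=("txt",)):
    """Return the argv list for one Tesseract run with one or more outputs."""
    unknown = [f for f in formats if f not in CONFIG_NAMES]
    if unknown:
        raise ValueError(f"unsupported format(s) in this sketch: {unknown}")
    return ["tesseract", str(input_path), str(output_base), *formats]


def tsv_to_dicts(raw):
    """Convert raw TSV output (str or bytes) into a list of row dicts."""
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")
    reader = csv.DictReader(
        io.StringIO(raw), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return [dict(row) for row in reader]


# Build (but do not run) a command that reads a list file and
# writes both result.txt and result.tsv.
command = build_command("images.txt", "result", formats=("txt", "tsv"))
```

The resulting list can be handed to `subprocess.run(command, check=True)` on a machine where Tesseract is available. How tessy itself shapes its dict output is not stated here, so the list-of-rows layout returned by `tsv_to_dicts` is just one plausible choice.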
Download files
Source Distribution
tessy-0.5.2.tar.gz (15.7 kB)
Built Distribution
tessy-0.5.2-py3-none-any.whl (15.7 kB)