A pytesseract wrapper enabling OCR on images and directories.
pytesseract-cli
A command-line wrapper for pytesseract, a Python wrapper for tesseract.
Description
This is a command-line wrapper that makes it easier to run the Tesseract OCR engine on multiple files and/or directories. The project is written in Python and uses pytesseract to interact with tesseract.
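As a rough illustration of the pytesseract calls such a wrapper can build on, here is a minimal sketch. It is not this project's actual source: the helper names `output_path_for` and `ocr_file`, the choice to write the output next to the input image, and the `eng` default language are all assumptions. `pytesseract.image_to_string` and `pytesseract.image_to_pdf_or_hocr` are real pytesseract functions and accept a file path.

```python
from pathlib import Path


def output_path_for(image_path: Path, filetype: str) -> Path:
    """Hypothetical naming rule: scan.png -> scan.txt or scan.pdf."""
    return image_path.with_suffix("." + filetype)


def ocr_file(image_path: Path, filetype: str = "txt", lang: str = "eng") -> Path:
    """Run OCR on one image and write the result beside it (assumed layout)."""
    # Imported lazily so output_path_for works without pytesseract installed.
    import pytesseract

    target = output_path_for(image_path, filetype)
    if filetype == "pdf":
        # image_to_pdf_or_hocr returns the rendered document as bytes.
        data = pytesseract.image_to_pdf_or_hocr(
            str(image_path), extension="pdf", lang=lang
        )
        target.write_bytes(data)
    else:
        # image_to_string returns the recognised text as a str.
        text = pytesseract.image_to_string(str(image_path), lang=lang)
        target.write_text(text, encoding="utf-8")
    return target
```

Calling `ocr_file(Path("page1.png"), filetype="pdf")` requires both pytesseract and a working Tesseract installation.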
Benefits of this interface include the ability to process multiple image files in a single invocation and to recurse into directories.
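One way to gather inputs from explicit files and from directories, with optional recursion, might look like the sketch below. The function name `collect_images` and the set of accepted image suffixes are assumptions made for illustration; the project may select files differently.

```python
from pathlib import Path

# Assumed set of image extensions; the real tool may accept a different set.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def collect_images(files=(), directories=(), recurse=False):
    """Return a sorted, de-duplicated list of image paths to process."""
    found = set()
    # Explicitly named files are taken as given, provided they exist.
    for name in files:
        path = Path(name)
        if path.is_file():
            found.add(path)
    for name in directories:
        root = Path(name)
        if not root.is_dir():
            continue
        # rglob descends into subdirectories; glob stays at the top level.
        entries = root.rglob("*") if recurse else root.glob("*")
        for entry in entries:
            if entry.is_file() and entry.suffix.lower() in IMAGE_SUFFIXES:
                found.add(entry)
    return sorted(found)
```

Using a set keeps a file from being processed twice when it is named explicitly and also lives in a listed directory.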
Requirements
Basic requirements are an up-to-date installation of Python 3 and Tesseract OCR.
Tesseract
Both the Tesseract OCR engine and the training data for each desired language must be installed.
Both are available, for example, from the Arch User Repository (AUR):
tesseract
tesseract-data
Installation
This project is available on PyPI under the page pytesseract-cli.
Using pip:
pip install pytesseract-cli
To upgrade:
pip install -U pytesseract-cli
Usage
In a terminal:
$ pytesseract-cli
usage: pytesseract-cli [-h] [-f [FILES ...]] [-d [DIRECTORIES ...]] [-r] [-t {pdf,txt}] [-l LANG] [--list-languages]
optional arguments:
  -h, --help            show this help message and exit
  -f [FILES ...]        name(s) of file(s) to process
  -d [DIRECTORIES ...]  directory(s) to process
  -r                    recurse on all directories listed
  -t {pdf,txt}          desired output filetype
  -l LANG               language of the text in any image(s)
  --list-languages      list all languages available
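For readers who want to see how the options above fit together, the following is a minimal `argparse` sketch that accepts the same flags. It is a reconstruction from the help text, not the project's code: the function name `build_parser`, the `dest` names, and the defaults (`txt` output, `eng` language) are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented pytesseract-cli options."""
    parser = argparse.ArgumentParser(prog="pytesseract-cli")
    parser.add_argument("-f", dest="files", nargs="*", default=[],
                        metavar="FILES", help="name(s) of file(s) to process")
    parser.add_argument("-d", dest="directories", nargs="*", default=[],
                        metavar="DIRECTORIES", help="directory(s) to process")
    parser.add_argument("-r", dest="recurse", action="store_true",
                        help="recurse on all directories listed")
    # Default output type is an assumption; the help text does not state one.
    parser.add_argument("-t", dest="filetype", choices=["pdf", "txt"],
                        default="txt", help="desired output filetype")
    # Default language is an assumption; the help text does not state one.
    parser.add_argument("-l", dest="lang", default="eng",
                        help="language of the text in any image(s)")
    parser.add_argument("--list-languages", action="store_true",
                        help="list all languages available")
    return parser


if __name__ == "__main__":
    # Equivalent to: pytesseract-cli -d scans archive -r -t pdf -l deu
    args = build_parser().parse_args(
        ["-d", "scans", "archive", "-r", "-t", "pdf", "-l", "deu"]
    )
    print(args.directories, args.recurse, args.filetype, args.lang)
```

Because `-f` and `-d` each take zero or more values, several files and several directories can be passed in one invocation.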
Acknowledgements
- pytesseract for providing an easy-to-use wrapper for tesseract.
- tesseract for providing a free and open-source OCR engine.
File details
Details for the file pytesseract-cli-1.2.0.tar.gz.
File metadata
- Download URL: pytesseract-cli-1.2.0.tar.gz
- Upload date:
- Size: 3.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | c657e2b42a8f5023a7d4df1f23c5e7a9cb318c5e6141f7d7b5ca468845b83082
MD5 | 1e06129d8fcd094a0fd6e2de0cb69065
BLAKE2b-256 | 61d99de4b4f663a1dc9d0ceeff4ef95a5fe88430a298eba9fd53d9f921207114
File details
Details for the file pytesseract_cli-1.2.0-py3-none-any.whl.
File metadata
- Download URL: pytesseract_cli-1.2.0-py3-none-any.whl
- Upload date:
- Size: 4.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 791345458b1330c52c4eaac197d6298f325d90b8bac0d9f152935d1bb1d94296
MD5 | 4a5c70b11d3913015ff8b933b69dea62
BLAKE2b-256 | 59f96c484a83ddaaf5652d7be56fd8558686489e90ebd97e338171dc1addebc9
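A downloaded archive can be checked against the SHA256 digests listed for each file using only the Python standard library. The sketch below is illustrative; the helper names `sha256_of` and `matches_published_hash` are hypothetical, and the file path in the usage note is assumed to be wherever the archive was saved.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex: str) -> bool:
    """Compare a file's digest with a published hex string."""
    return sha256_of(Path(path)) == expected_hex.strip().lower()
```

For example, `matches_published_hash("pytesseract-cli-1.2.0.tar.gz", "c657e2b42a8f5023a7d4df1f23c5e7a9cb318c5e6141f7d7b5ca468845b83082")` should return `True` for an intact copy of the source distribution.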