A pytesseract wrapper enabling OCR on images and directories.

pytesseract-cli

A command-line wrapper for pytesseract, a Python wrapper for tesseract.

Description

This is a command-line wrapper that makes it easier to run the Tesseract OCR engine over multiple files and/or directories. The project is written in Python and uses pytesseract to interact with Tesseract.

This interface makes it easy to process multiple image files in a single run and to recurse into directories.
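The file and directory handling described above can be sketched in a few lines of standard-library Python. This is only an illustration, not the project's actual code: the function name `collect_images` and the `IMAGE_SUFFIXES` set are assumptions made for the example.

```python
from pathlib import Path

# Assumed set of image suffixes; the real tool may accept a different set.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp", ".gif"}


def collect_images(files=(), directories=(), recurse=False):
    """Return a sorted, de-duplicated list of paths to process.

    Hypothetical helper: explicitly named files are taken as given,
    while directories are filtered by image suffix.
    """
    found = set()
    for name in files:
        path = Path(name)
        if path.is_file():
            found.add(path)
    for name in directories:
        root = Path(name)
        if not root.is_dir():
            continue
        # rglob descends into subdirectories; glob stays at the top level.
        walker = root.rglob("*") if recurse else root.glob("*")
        for path in walker:
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                found.add(path)
    return sorted(found)
```

Passing `recurse=True` corresponds to the idea behind the `-r` flag: every listed directory is walked to its full depth instead of only its top level.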

Requirements

Basic requirements are an up-to-date installation of Python 3 and Tesseract OCR.

Tesseract

Both the Tesseract OCR engine and the training data for any desired languages must be installed.

Both are available, for example, as Arch Linux packages:

  • tesseract
  • tesseract-data
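To confirm that both requirements are met before running OCR, one option is to look for the `tesseract` executable and ask it which languages it has training data for. The sketch below relies on Tesseract's `--list-langs` option; the helper names `parse_language_list` and `installed_languages` are made up for this example, and the exact header text Tesseract prints may vary between versions.

```python
import shutil
import subprocess


def parse_language_list(output):
    """Extract language codes from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The first non-empty line is assumed to be a header beginning with
    # "List of available languages"; the remaining lines are language codes.
    return [line for line in lines if not line.lower().startswith("list of")]


def installed_languages():
    """Return installed language codes, or None if tesseract is not on PATH."""
    executable = shutil.which("tesseract")
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "--list-langs"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Some Tesseract versions write the list to stderr instead of stdout.
    return parse_language_list(result.stdout or result.stderr)
```

A return value of `None` suggests the engine itself is missing, while a list that lacks the language you need suggests the matching training data package is not installed.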

Installation

This project is available on PyPI as pytesseract-cli.

Using pip:

pip install pytesseract-cli

To upgrade:

pip install -U pytesseract-cli

Usage

In a terminal:

$ pytesseract-cli
usage: pytesseract-cli [-h] [-f [FILES ...]] [-d [DIRECTORIES ...]] [-r] [-t {pdf,txt}] [-l LANG] [--list-languages]

optional arguments:
  -h, --help            show this help message and exit
  -f [FILES ...]        name(s) of file(s) to process
  -d [DIRECTORIES ...]  directory(s) to process
  -r                    recurse on all directories listed
  -t {pdf,txt}          desired output filetype
  -l LANG               language of the text in any image(s)
  --list-languages      list all languages available
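To show how these options combine, here is a minimal `argparse` sketch that mirrors the usage line above. It is not taken from the project's source: the `dest` names and the defaults (including `txt` as the default output type) are assumptions chosen for illustration.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="pytesseract-cli")
    parser.add_argument("-f", dest="files", nargs="*", default=[],
                        metavar="FILES", help="name(s) of file(s) to process")
    parser.add_argument("-d", dest="directories", nargs="*", default=[],
                        metavar="DIRECTORIES", help="directory(s) to process")
    parser.add_argument("-r", dest="recurse", action="store_true",
                        help="recurse on all directories listed")
    # Defaulting to "txt" is an assumption, not documented behaviour.
    parser.add_argument("-t", dest="filetype", choices=["pdf", "txt"],
                        default="txt", help="desired output filetype")
    parser.add_argument("-l", dest="lang", default=None, metavar="LANG",
                        help="language of the text in any image(s)")
    parser.add_argument("--list-languages", action="store_true",
                        help="list all languages available")
    return parser


# Equivalent to: pytesseract-cli -d scans -r -t pdf -l eng
# ("scans" is a hypothetical directory name.)
args = build_parser().parse_args(["-d", "scans", "-r", "-t", "pdf", "-l", "eng"])
```

Because `-f` and `-d` each accept several values, a single invocation can mix individual files with whole directories, for example `pytesseract-cli -f page1.png page2.png -d scans -r`, where the file and directory names are placeholders.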

Acknowledgements

  • pytesseract for providing an easy-to-use wrapper for tesseract.
  • tesseract for providing a free and open-source OCR engine.

Download files

Source Distribution

pytesseract-cli-1.2.0.tar.gz (3.7 kB)

Uploaded Source

Built Distribution

pytesseract_cli-1.2.0-py3-none-any.whl (4.0 kB)

Uploaded Python 3

File details

Details for the file pytesseract-cli-1.2.0.tar.gz.

File metadata

  • Download URL: pytesseract-cli-1.2.0.tar.gz
  • Size: 3.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.5

File hashes

Hashes for pytesseract-cli-1.2.0.tar.gz
Algorithm     Hash digest
SHA256        c657e2b42a8f5023a7d4df1f23c5e7a9cb318c5e6141f7d7b5ca468845b83082
MD5           1e06129d8fcd094a0fd6e2de0cb69065
BLAKE2b-256   61d99de4b4f663a1dc9d0ceeff4ef95a5fe88430a298eba9fd53d9f921207114

File details

Details for the file pytesseract_cli-1.2.0-py3-none-any.whl.

File metadata

  • Download URL: pytesseract_cli-1.2.0-py3-none-any.whl
  • Size: 4.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.5

File hashes

Hashes for pytesseract_cli-1.2.0-py3-none-any.whl
Algorithm     Hash digest
SHA256        791345458b1330c52c4eaac197d6298f325d90b8bac0d9f152935d1bb1d94296
MD5           4a5c70b11d3913015ff8b933b69dea62
BLAKE2b-256   59f96c484a83ddaaf5652d7be56fd8558686489e90ebd97e338171dc1addebc9
