A pytesseract wrapper enabling OCR on images and directories.

pytesseract-cli

A command-line wrapper for pytesseract, a Python wrapper for tesseract.

Description

pytesseract-cli is a command-line wrapper that makes it easier to run the Tesseract OCR engine on multiple files and/or directories. The project is written in Python and uses pytesseract to interact with Tesseract.
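
As a rough illustration of what "using pytesseract to interact with Tesseract" can look like, here is a minimal sketch. It is not taken from this project's source. The helper names (output_path_for, ocr_to_file) and the rule of writing the result next to the input image are assumptions made for the example. The pytesseract calls themselves (image_to_string and image_to_pdf_or_hocr) are part of pytesseract's public API.

```python
from pathlib import Path


def output_path_for(image_path, filetype):
    """Hypothetical naming rule: same folder and stem, suffix swapped."""
    if filetype not in ("pdf", "txt"):
        raise ValueError(f"unsupported output filetype: {filetype!r}")
    return Path(image_path).with_suffix("." + filetype)


def ocr_to_file(image_path, filetype="txt", lang=None):
    """Run OCR on one image and write the result as text or PDF."""
    # Imported lazily so output_path_for works without OCR dependencies.
    import pytesseract

    target = output_path_for(image_path, filetype)
    if filetype == "pdf":
        # Returns the searchable PDF as bytes.
        data = pytesseract.image_to_pdf_or_hocr(
            str(image_path), lang=lang, extension="pdf"
        )
        target.write_bytes(data)
    else:
        # Returns the recognized text as a str.
        text = pytesseract.image_to_string(str(image_path), lang=lang)
        target.write_text(text, encoding="utf-8")
    return target
```

Under these assumptions, calling ocr_to_file("scan.png", filetype="pdf", lang="eng") would write scan.pdf beside the image, provided Tesseract and the requested language data are installed.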

Benefits of this interface include the ability to process multiple images in a single call and to recurse into directories.
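
The file and directory handling described above could be sketched as follows. This is an illustrative reconstruction, not the project's actual code. The function name collect_images and the set of image suffixes are assumptions, since the README does not say which file types are picked up from a directory.

```python
from pathlib import Path

# Assumption: which suffixes count as images is not stated in the README.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp", ".gif"}


def collect_images(files=(), directories=(), recurse=False):
    """Return explicit files plus the image files found in each directory."""
    found = [Path(f) for f in files]
    for directory in directories:
        root = Path(directory)
        # rglob descends into subdirectories, glob stays at the top level.
        candidates = root.rglob("*") if recurse else root.glob("*")
        found.extend(
            sorted(
                p
                for p in candidates
                if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
            )
        )
    return found
```

Explicitly named files are passed through unchanged, on the assumption that the user knows what they are. Only directory contents are filtered by suffix.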

Requirements

Basic requirements are an up-to-date installation of Python 3 and Tesseract OCR.

Tesseract

Both the Tesseract OCR engine and the training data for any desired languages must be installed.

Both are available, for example, from the Arch User Repository (AUR):

  • tesseract
  • tesseract-data
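
To check that these requirements are met before running OCR, something like the sketch below may help. It assumes the engine is installed as an executable named tesseract on the PATH. The helper names are hypothetical, and the parsing assumes that the first line of the `tesseract --list-langs` output is a header, which may vary between Tesseract versions.

```python
import shutil
import subprocess


def find_tesseract(binary="tesseract"):
    """Return the full path of the Tesseract executable, or None if absent."""
    return shutil.which(binary)


def installed_languages(binary="tesseract"):
    """List language codes reported by `tesseract --list-langs`."""
    path = find_tesseract(binary)
    if path is None:
        return []
    result = subprocess.run(
        [path, "--list-langs"], capture_output=True, text=True, check=False
    )
    # Some older versions print the list to stderr instead of stdout.
    lines = (result.stdout or result.stderr).splitlines()
    # Assumption: the first line is a header, the rest are language codes.
    return [line.strip() for line in lines[1:] if line.strip()]
```

An empty result means either that Tesseract was not found or that no language data is installed.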

Installation

This project is available on PyPI as pytesseract-cli.

Using pip:

pip install pytesseract-cli

To upgrade:

pip install -U pytesseract-cli

Usage

In a terminal:

$ pytesseract-cli
usage: pytesseract-cli [-h] [-f [FILES ...]] [-d [DIRECTORIES ...]] [-r] [-t {pdf,txt}] [-l LANG] [--list-languages]

optional arguments:
  -h, --help            show this help message and exit
  -f [FILES ...]        name(s) of file(s) to process
  -d [DIRECTORIES ...]  directory(s) to process
  -r                    recurse on all directories listed
  -t {pdf,txt}          desired output filetype
  -l LANG               language of the text in any image(s)
  --list-languages      list all languages available
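
For readers curious how an interface like the one above can be defined, here is a minimal argparse sketch reconstructed from the help text. It is not the project's source. The attribute names (files, directories, recurse, filetype, lang) and the default output type of txt are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options listed in the help text."""
    parser = argparse.ArgumentParser(prog="pytesseract-cli")
    parser.add_argument("-f", dest="files", nargs="*", default=[],
                        metavar="FILES", help="name(s) of file(s) to process")
    parser.add_argument("-d", dest="directories", nargs="*", default=[],
                        metavar="DIRECTORIES", help="directory(s) to process")
    parser.add_argument("-r", dest="recurse", action="store_true",
                        help="recurse on all directories listed")
    # Assumption: the real default output type is not documented here.
    parser.add_argument("-t", dest="filetype", choices=["pdf", "txt"],
                        default="txt", help="desired output filetype")
    parser.add_argument("-l", dest="lang", metavar="LANG", default=None,
                        help="language of the text in any image(s)")
    parser.add_argument("--list-languages", action="store_true",
                        help="list all languages available")
    return parser


if __name__ == "__main__":
    # Example: two files, one directory searched recursively, PDF output.
    args = build_parser().parse_args(
        ["-f", "a.png", "b.png", "-d", "scans", "-r", "-t", "pdf", "-l", "eng"]
    )
    print(args)
```

Because -f and -d accept zero or more values, several files and directories can be given in one invocation, and the next option flag ends each list.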

Acknowledgements

  • pytesseract for providing an easy-to-use wrapper for tesseract.
  • tesseract for providing a free and open-source OCR engine.

Download files

Source Distribution: pytesseract-cli-1.2.0.tar.gz (3.7 kB)

Built Distribution: pytesseract_cli-1.2.0-py3-none-any.whl (4.0 kB, Python 3)
