asyncio tesseract wrapper for Tesseract-OCR

aiopytesseract

A Python asyncio wrapper for Tesseract-OCR.

Installation

Install and update using pip:

pip install aiopytesseract
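
Because the package wraps the Tesseract-OCR program, the `pip` command above is not expected to install Tesseract itself. The sketch below is one way to check for it before calling the library. It assumes the executable is named `tesseract` and is looked up on `PATH`; the helper name `tesseract_available` is hypothetical and not part of aiopytesseract.

```python
import shutil


def tesseract_available(binary: str = "tesseract") -> bool:
    """Return True if an executable with this name is found on PATH.

    Hypothetical helper for illustration; the default name "tesseract"
    is an assumption about how the OCR engine is installed.
    """
    return shutil.which(binary) is not None
```

If the check returns `False`, install Tesseract-OCR with your platform's package manager before using the examples below.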

Usage

from pathlib import Path

import aiopytesseract


# list all languages available in the tesseract installation
await aiopytesseract.languages()
await aiopytesseract.get_languages()


# tesseract version
await aiopytesseract.tesseract_version()
await aiopytesseract.get_tesseract_version()


# tesseract parameters
await aiopytesseract.tesseract_parameters()


# confidence only info
await aiopytesseract.confidence("tests/samples/file-sample_150kB.png")


# deskew info
await aiopytesseract.deskew("tests/samples/file-sample_150kB.png")


# extract text from an image: from a local file path or from bytes
await aiopytesseract.image_to_string("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_string(
	Path("tests/samples/file-sample_150kB.png").read_bytes(), dpi=220, lang='eng+por'
)


# box estimates
await aiopytesseract.image_to_boxes("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_boxes(Path("tests/samples/file-sample_150kB.png").read_bytes())


# boxes, confidence and page numbers
await aiopytesseract.image_to_data("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_data(Path("tests/samples/file-sample_150kB.png").read_bytes())


# information about orientation and script detection
await aiopytesseract.image_to_osd("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_osd(Path("tests/samples/file-sample_150kB.png").read_bytes())


# generate a searchable PDF
await aiopytesseract.image_to_pdf("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_pdf(Path("tests/samples/file-sample_150kB.png").read_bytes())


# generate HOCR output
await aiopytesseract.image_to_hocr("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_hocr(Path("tests/samples/file-sample_150kB.png").read_bytes())


# multiple output formats in a single run
async with aiopytesseract.run(
	Path('tests/samples/file-sample_150kB.png').read_bytes(),
	'output',
	'alto tsv txt'
) as resp:
	# will generate (output.xml, output.tsv and output.txt)
	print(resp)
	alto_file, tsv_file, txt_file = resp

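The calls above use a bare `await`, so they only run inside a coroutine or an async-aware REPL. The following sketch shows one way to OCR several images concurrently from a regular script while limiting how many Tesseract runs are in flight at once. The helper `ocr_many` and its `max_concurrency` parameter are hypothetical and not part of aiopytesseract; the OCR coroutine is passed in as an argument so the helper itself depends only on the standard library.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable


async def ocr_many(
    paths: Iterable[str],
    ocr: Callable[[str], Awaitable[str]],
    max_concurrency: int = 4,
) -> Dict[str, str]:
    """Run `ocr` over every path, at most `max_concurrency` at a time.

    Hypothetical helper for illustration; it is not part of aiopytesseract.
    """
    # The semaphore caps the number of OCR calls running at the same time.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(path: str):
        async with semaphore:
            return path, await ocr(path)

    # gather schedules all workers; the semaphore throttles the actual work.
    pairs = await asyncio.gather(*(worker(p) for p in paths))
    return dict(pairs)
```

With the library and Tesseract installed, a call such as `asyncio.run(ocr_many(["a.png", "b.png"], aiopytesseract.image_to_string))` should return a dictionary mapping each path to its extracted text; the file names here are placeholders.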
Examples

If you want to try aiopytesseract quickly, you can use one of the following options:

Docker

Run the following command:

docker run --rm --name aiopytesseract -p 8501:8501 amenezes/aiopytesseract

docker-compose

After cloning this repository, run the command below:

docker-compose up -d

streamlit app

This option requires aiopytesseract and streamlit to be installed first. Then run:

streamlit run https://github.com/amenezes/aiopytesseract/blob/master/examples/streamlit/app.py

Note: the streamlit example needs Python >= 3.10.
