Skip to main content

asyncio tesseract wrapper for Tesseract-OCR

Project description

ci codecov PyPI version PyPI - Python Version Code style: black

aiopytesseract

A Python asyncio wrapper for Tesseract-OCR.

Installation

Install and update using pip:

pip install aiopytesseract

Usage

List all available languages by Tesseract installation

import aiopytesseract

await aiopytesseract.languages()
await aiopytesseract.get_languages()

Tesseract version

import aiopytesseract

await aiopytesseract.tesseract_version()
await aiopytesseract.get_tesseract_version()

Tesseract parameters

import aiopytesseract

await aiopytesseract.tesseract_parameters()

Confidence only info

import aiopytesseract

await aiopytesseract.confidence("tests/samples/file-sample_150kB.png")

Deskew info

import aiopytesseract

await aiopytesseract.deskew("tests/samples/file-sample_150kB.png")

Extract text from an image: locally or bytes

from pathlib import Path

import aiopytesseract

await aiopytesseract.image_to_string("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_string(
	Path("tests/samples/file-sample_150kB.png").read_bytes(), dpi=220, lang='eng+por'
)

Box estimates

from pathlib import Path

import aiopytesseract

await aiopytesseract.image_to_boxes("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_boxes(Path("tests/samples/file-sample_150kB.png")

Boxes, confidence and page numbers

from pathlib import Path

import aiopytesseract

await aiopytesseract.image_to_data("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_data(Path("tests/samples/file-sample_150kB.png")

Information about orientation and script detection

from pathlib import Path

import aiopytesseract

await aiopytesseract.image_to_osd("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_osd(Path("tests/samples/file-sample_150kB.png")

Generate a searchable PDF

from pathlib import Path

import aiopytesseract

await aiopytesseract.image_to_pdf("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_pdf(Path("tests/samples/file-sample_150kB.png")

Generate HOCR output

from pathlib import Path

import aiopytesseract

await aiopytesseract.image_to_hocr("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_hocr(Path("tests/samples/file-sample_150kB.png")

Multi ouput

from pathlib import Path

import aiopytesseract

async with aiopytesseract.run(
	Path('tests/samples/file-sample_150kB.png').read_bytes(),
	'output',
	'alto tsv txt'
) as resp:
	# will generate (output.xml, output.tsv and output.txt)
	print(resp)
	alto_file, tsv_file, txt_file = resp

Config variables

from pathlib import Path

import aiopytesseract

async with aiopytesseract.run(
	Path('tests/samples/text-with-chars-and-numbers.png').read_bytes(),
	'output',
	'alto tsv txt'
	config=[("tessedit_char_whitelist", "0123456789")]
) as resp:
	# will generate (output.xml, output.tsv and output.txt)
	print(resp)
	alto_file, tsv_file, txt_file = resp
from pathlib import Path

import aiopytesseract

await aiopytesseract.image_to_string(
	"tests/samples/text-with-chars-and-numbers.png",
	config=[("tessedit_char_whitelist", "0123456789")]
)

await aiopytesseract.image_to_string(
	Path("tests/samples/text-with-chars-and-numbers.png").read_bytes(),
	dpi=220,
	lang='eng+por',
	config=[("tessedit_char_whitelist", "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")]
)

For more details on Tesseract best practices and the aiopytesseract, see the folder: docs.

Examples

If you want to test aiopytesseract easily, can you use some options like:

Docker / docker-compose

After clone this repo run the command below:

docker-compose up -d

streamlit app

For this option it's necessary first install aiopytesseract and streamlit, after execute:

# remote option:
streamlit run https://github.com/amenezes/aiopytesseract/blob/master/examples/streamlit/app.py
# local option:
streamlit run examples/streamlit/app.py

note: The streamlit example need python >= 3.13

Links

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

aiopytesseract-1.1.0.tar.gz (16.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

aiopytesseract-1.1.0-py3-none-any.whl (17.6 kB view details)

Uploaded Python 3

File details

Details for the file aiopytesseract-1.1.0.tar.gz.

File metadata

  • Download URL: aiopytesseract-1.1.0.tar.gz
  • Upload date:
  • Size: 16.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.24 {"installer":{"name":"uv","version":"0.9.24","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for aiopytesseract-1.1.0.tar.gz
Algorithm Hash digest
SHA256 639260f26b30357717cd8daaaa2a234a3492eee89db425fffdca4f61e33a3dfa
MD5 d61b4f29112aea4e94d28a5632e7a391
BLAKE2b-256 36d76678ac784ddbaf319dbad46c68b0f0ec1b62fed1c0fc38dbb658c2b898c0

See more details on using hashes here.

File details

Details for the file aiopytesseract-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: aiopytesseract-1.1.0-py3-none-any.whl
  • Upload date:
  • Size: 17.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.24 {"installer":{"name":"uv","version":"0.9.24","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for aiopytesseract-1.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 eb56227048943d5622c668eed2812dfa78040d29acdf431c5d4312dcf8abfaa3
MD5 84547dc1bf8564442b8ac2a1eb1b8018
BLAKE2b-256 b172229f773418225c6eb94a00111c02272f6f274952589a95e54ad71552e1da

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page