asyncio tesseract wrapper for Tesseract-OCR
aiopytesseract
A Python asyncio wrapper for Tesseract-OCR.
Installation
Install and update using pip:
pip install aiopytesseract
Usage
from pathlib import Path
import aiopytesseract
# list all languages available in the tesseract installation
await aiopytesseract.languages()
await aiopytesseract.get_languages()
# tesseract version
await aiopytesseract.tesseract_version()
await aiopytesseract.get_tesseract_version()
# tesseract parameters
await aiopytesseract.tesseract_parameters()
# confidence only info
await aiopytesseract.confidence("tests/samples/file-sample_150kB.png")
# deskew info
await aiopytesseract.deskew("tests/samples/file-sample_150kB.png")
# extract text from an image: from a local file path or from bytes
await aiopytesseract.image_to_string("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_string(
    Path("tests/samples/file-sample_150kB.png").read_bytes(), dpi=220, lang='eng+por'
)
# box estimates
await aiopytesseract.image_to_boxes("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_boxes(Path("tests/samples/file-sample_150kB.png").read_bytes())
# boxes, confidence and page numbers
await aiopytesseract.image_to_data("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_data(Path("tests/samples/file-sample_150kB.png").read_bytes())
# information about orientation and script detection
await aiopytesseract.image_to_osd("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_osd(Path("tests/samples/file-sample_150kB.png").read_bytes())
# generate a searchable PDF
await aiopytesseract.image_to_pdf("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_pdf(Path("tests/samples/file-sample_150kB.png").read_bytes())
# generate HOCR output
await aiopytesseract.image_to_hocr("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_hocr(Path("tests/samples/file-sample_150kB.png").read_bytes())
# multi output
async with aiopytesseract.run(
    Path('tests/samples/file-sample_150kB.png').read_bytes(),
    'output',
    'alto tsv txt'
) as resp:
    # will generate (output.xml, output.tsv and output.txt)
    print(resp)
    alto_file, tsv_file, txt_file = resp
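The calls above use top-level `await`, which only works inside a coroutine or an asyncio-aware REPL. As a minimal sketch of one way to drive them from a regular script, the helper below runs an OCR coroutine over several images with a cap on concurrency. The names `ocr_many`, `ocr` and `limit` are illustrative and not part of aiopytesseract; the OCR function is passed in as a parameter, on the assumption that it takes an image path and returns text.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, Tuple


async def ocr_many(
    images: Iterable[str],
    ocr: Callable[[str], Awaitable[str]],
    limit: int = 4,
) -> Dict[str, str]:
    """Run `ocr` over every image, with at most `limit` calls in flight."""
    # The semaphore caps how many OCR calls run at the same time.
    semaphore = asyncio.Semaphore(limit)

    async def worker(image: str) -> Tuple[str, str]:
        async with semaphore:
            return image, await ocr(image)

    # gather() schedules all workers and returns results in input order.
    results = await asyncio.gather(*(worker(image) for image in images))
    return dict(results)
```

With the package installed, this could be called from a script as `asyncio.run(ocr_many(["a.png", "b.png"], aiopytesseract.image_to_string))`, where the file names are placeholders. A suitable `limit` depends on the machine; the default of 4 is an arbitrary choice.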
Examples
To try aiopytesseract quickly, you can use one of these options:
- docker
- docker-compose
- streamlit
Docker
Run the following command:
docker run --rm --name aiopytesseract -p 8501:8501 amenezes/aiopytesseract
docker-compose
After cloning this repo, run the command below:
docker-compose up -d
streamlit app
This option requires installing aiopytesseract and streamlit first. Then run:
streamlit run https://github.com/amenezes/aiopytesseract/blob/master/examples/streamlit/app.py
Note: the streamlit example requires Python >= 3.10.
Links
- License: Apache License
- Code: https://github.com/amenezes/aiopytesseract
- Issue tracker: https://github.com/amenezes/aiopytesseract/issues
- Docs: https://aiopytesseract.amenezes.net
File details
Details for the file aiopytesseract-0.7.0.tar.gz.
File metadata
- Download URL: aiopytesseract-0.7.0.tar.gz
- Upload date:
- Size: 17.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.9.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3c35041c786d5e4f94d0d383ddf70c531c00e60e77e895ffc3d69f621f0cfd3a |
| MD5 | 0e839f097805acffd6eef75c7ba1d795 |
| BLAKE2b-256 | e91ff4c617b7e623a7771b6660a0170ba5075e6c48e5db17f5169e50311fc833 |
File details
Details for the file aiopytesseract-0.7.0-py2.py3-none-any.whl.
File metadata
- Download URL: aiopytesseract-0.7.0-py2.py3-none-any.whl
- Upload date:
- Size: 21.5 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.9.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e178788bff2a20844f65318788a9a87d7d041c6932befd69d8a49559688b58c2 |
| MD5 | d195c26f4e06b06922f440964d3cbb70 |
| BLAKE2b-256 | 6116fb9d9a61e5c52cdd4e0d9c74ac7b1935083720484965474ab82c6ce888f1 |
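The SHA256 digests listed for each file can be used to check a manually downloaded archive. A minimal sketch using only the standard library follows; the helper name `sha256_of` and its `chunk_size` parameter are illustrative and not part of aiopytesseract or of any PyPI tooling.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the result of `sha256_of("aiopytesseract-0.7.0.tar.gz")` with the SHA256 value listed for that file indicates whether the local copy matches the published one, assuming the file was saved under that name in the current directory.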