BookOcr

Optical character recognition (OCR) tool for printed book pages.

Usage examples:

from bookocr.ocr import Ocr

ocr = Ocr()
image_path = "my_image.png"
nested_list_structure = ocr.image_ocr(image_path)  # pages > text areas > lines > words
text = ocr.get_data_as_text()  # the same result, but joined

print(text)

With optional configuration and intermediate-result output:

from bookocr.config import OcrConfig
from bookocr.stats_config import OcrStatsConfig
from bookocr.ocr import Ocr

# Optional: custom OCR settings
config = OcrConfig()
# ... set config values here

# Also optional: provides intermediate results of image processing
stats_config = OcrStatsConfig()
stats_config.set_enabled_true("stats_folder")
# ... set stats_config values here

ocr = Ocr(config, stats_config)
image_path = "my_image.png"
nested_list_structure = ocr.image_ocr(image_path)  # pages > text areas > lines > words
text = ocr.get_data_as_text()  # the same result, but joined

print(text)
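
The comment in the examples above describes the value returned by image_ocr as nested: pages > text areas > lines > words. The exact types are not documented here, so the following is only a minimal sketch that assumes plain nested Python lists with each word as a string. The helper names join_nested_words and count_words are hypothetical and not part of bookocr, and the hand-written sample stands in for a real OCR result. The joining rules (spaces, newlines, blank lines) are also an assumption and may differ from what get_data_as_text produces.

```python
from typing import List

# Assumed shape, taken from the comment in the usage examples:
# pages > text areas > lines > words, with each word a plain string.
Word = str
Line = List[Word]
TextArea = List[Line]
Page = List[TextArea]


def join_nested_words(pages: List[Page]) -> str:
    """Join words with spaces, lines with newlines, areas and pages with blank lines."""
    page_texts = []
    for page in pages:
        area_texts = []
        for area in page:
            line_texts = [" ".join(line) for line in area]
            area_texts.append("\n".join(line_texts))
        page_texts.append("\n\n".join(area_texts))
    return "\n\n".join(page_texts)


def count_words(pages: List[Page]) -> int:
    """Count every word across all pages, text areas, and lines."""
    return sum(len(line) for page in pages for area in page for line in area)


# Hand-written sample standing in for the value returned by ocr.image_ocr(...).
sample = [
    [
        [["Chapter", "One"]],
        [["It", "was", "a", "dark"], ["and", "stormy", "night."]],
    ]
]

print(join_nested_words(sample))
print(count_words(sample))
```

Walking the structure by hand like this is useful when text areas need separate handling, for example to skip headers or page numbers before joining the rest.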
