hash-ocr

Fast OCR for reading computer-rendered text.

Installation

You can install the package via pip:

pip install hash-ocr

Usage

Demo

import cv2

from hash_ocr import MD5HashModel
from hash_ocr import draw_text_boxes

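# cv2.imread returns None instead of raising when the file cannot be read
img = cv2.imread("test_data/lorem.png")
if img is None:
    raise FileNotFoundError("test_data/lorem.png")

# Binarize the image: convert to grayscale, then apply a fixed threshold at 128
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
threshed = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY)[1]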

# A model is a glyph image plus its JSON label file
model = MD5HashModel(
    "hash_ocr/models/segoe_ui_9.png",
    "hash_ocr/models/segoe_ui_9.json",
    connected_chars=True,
)

print(model.get_text(threshed))
# Output, reproduced verbatim (note a few merged words and "ln" in place of "In"):
# Lorem ipsum dolor sit amet consectetur adipiscing elit Donec ac odio volutpat
# vestibulum mauris ut vulputate quam Etiam auctor purus id massa egestas in
# imperdiet ligula ultrices Donec sodales volutpat erat pellentesque aliquam sapien
# consequat imperdiet Vestibulum eget odio lacinia porta dui a imperdiet magna
# Mauris ullamcorpertellus at scelerisque euismod Vestibulum rutrum blandit gravida
# Sed venenatis magna lobortis dui commodo ut dignissim elit ultricies Sed
# consequat nisl at placerat aliquam Praesent neque magna lacinia quis elementum
# ut dapibus ac mi
# Sed eget erat odio Phasellus lacinia mauris vel ex maximus pretium ln sed mattis
# felis Pellentesque sollicitudin orci sed tellus fermentum dapibus ln at urna
# condimentum velittincidunt pulvinar Quisque diam libero vehicula non mi non
# efficiturvenenatis magna ln non eros tincidunt ullamcorper sem et rhoncus augue
# Duis a dolor in ex efficitur blanditvel at eros

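# Draw the detected line boxes on the original image and display the result
draw_text_boxes(img, model.get_line_boxes(threshed))

cv2.imshow("Hash OCR", img)
cv2.waitKey()  # block until a key is pressed
cv2.destroyAllWindows()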
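
The demo binarizes the image before recognition. Judging only from the class name MD5HashModel, a plausible reason is that each glyph bitmap is hashed and looked up in a table, which works only when rendered glyphs match the model pixel for pixel. The sketch below illustrates that general idea in pure Python. It is not hash-ocr's implementation: glyph_hash, split_columns, build_table, and read_line are hypothetical names, and splitting glyphs at empty columns is an assumed, simplified segmentation.

```python
import hashlib


def glyph_hash(bitmap):
    """Return the MD5 hex digest of a binary glyph (rows of 0/1 ints)."""
    h = hashlib.md5()
    # Include the shape so glyphs with the same pixels but different
    # dimensions do not collide.
    h.update(f"{len(bitmap)}x{len(bitmap[0])}".encode())
    for row in bitmap:
        h.update(bytes(row))
    return h.hexdigest()


def split_columns(image):
    """Split a binary text line into glyphs at columns that contain no ink."""
    width = len(image[0])
    glyphs, start = [], None
    # Iterate one past the last column so a glyph touching the right edge
    # is still closed.
    for x in range(width + 1):
        has_ink = x < width and any(row[x] for row in image)
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            glyphs.append([row[start:x] for row in image])
            start = None
    return glyphs


def build_table(samples):
    """Map glyph hashes to characters from {char: bitmap} samples."""
    return {glyph_hash(bitmap): char for char, bitmap in samples.items()}


def read_line(image, table):
    """Look up every glyph in the table; unknown glyphs become '?'."""
    return "".join(table.get(glyph_hash(g), "?") for g in split_columns(image))


# Toy 3-pixel-high "font" with two glyphs (hypothetical data).
table = build_table({
    "I": [[1], [1], [1]],
    "L": [[1, 0], [1, 0], [1, 1]],
})

line = [
    [1, 0, 1, 0, 0, 1],
    [1, 0, 1, 0, 0, 1],
    [1, 0, 1, 1, 0, 1],
]

print(read_line(line, table))  # ILI
```

An exact-match lookup like this is fast because it needs one hash per glyph, but a single differing pixel produces a different digest, which would explain why the approach targets computer-rendered text rather than scans or photos.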

Custom Models

A hash-ocr model consists of an image and a JSON label file.

Example image: [figure: Model Image]

Use the label tool to label your image. The tool generates a JSON label file.

python -m hash_ocr.label /path/to/image

[figure: Label Tool]

Example:

from hash_ocr.models import MD5HashModel

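# label_path should point to the JSON file generated by the label tool
# for the image given as model_path
model = MD5HashModel(
    model_path="hash_ocr/models/digits.png",
    label_path="hash_ocr/models/letters.json",
)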
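
Because a model is always an image plus a label file, it can help to resolve and check both paths together before constructing the model. The helper below is a minimal sketch, not part of hash-ocr: model_files is a hypothetical name, and it assumes the two files share a stem and use the .png and .json extensions, as segoe_ui_9.png and segoe_ui_9.json do in the demo.

```python
from pathlib import Path


def model_files(directory, name):
    """Return (image_path, label_path) for a model stored as name.png + name.json.

    Hypothetical helper: the shared-stem layout is an assumption, not a
    requirement documented by hash-ocr.
    """
    directory = Path(directory)
    image = directory / f"{name}.png"
    label = directory / f"{name}.json"
    # Report every missing file at once rather than failing on the first.
    missing = [str(p) for p in (image, label) if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing model file(s): " + ", ".join(missing))
    return str(image), str(label)


# Possible usage with the keyword arguments shown above (not executed here):
# image_path, label_path = model_files("hash_ocr/models", "segoe_ui_9")
# model = MD5HashModel(model_path=image_path, label_path=label_path)
```

Failing early with both paths in the message makes a mismatched or missing label file easier to spot than an error raised later, inside model loading.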

License

This project is licensed under the terms of the MIT license.
