
End-to-End English Optical Character Recognition

Project description

Torchless EasyOCR

This package provides English optical character recognition based on EasyOCR. Unlike EasyOCR, it uses an English language model pre-exported to ONNX, so it does not need the 1-2 GB PyTorch dependency. This is particularly useful for developing lightweight applications that use text recognition.

Comparison for virtual env size

  • 365 MB torchfree_ocr with dependencies
  • 1.52 GB easyocr with dependencies
  • 4.79 GB easyocr with dependencies + GPU enabled

Limitations

Each language requires its own .onnx file, so to keep the package small only English is supported. Additionally, there is no GPU (CUDA) support.

Performance of torchfree_ocr vs easyocr

There is no visible difference in recognition quality between torchfree_ocr and easyocr. In CPU mode, torchfree_ocr runs slightly faster than easyocr (~20% faster). easyocr generally runs much faster in GPU mode, which torchfree_ocr doesn't support.
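To check the CPU speed comparison on your own hardware, a minimal timing sketch could look like the following. The helper name time_call is hypothetical and not part of either package; it accepts any callable, so the readtext method of either reader can be passed in. The model is loaded before timing starts, so only recognition is measured.

```python
import time
from statistics import median


def time_call(fn, *args, repeats=5, **kwargs):
    """Return the median wall-clock time (seconds) of calling fn(*args, **kwargs)."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        timings.append(time.perf_counter() - start)
    return median(timings)


# Hypothetical usage, assuming both packages are installed:
#   import torchfree_ocr, easyocr
#   fast = time_call(torchfree_ocr.Reader().readtext, "english.png")
#   slow = time_call(easyocr.Reader(["en"], gpu=False).readtext, "english.png")
#   print(f"speed-up: {slow / fast:.2f}x")
```

Using the median rather than the mean keeps a single slow first run from dominating the result.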

Examples

[Three example images: example, example2, example3]

Installation

Install using pip

For the latest release:

pip install torchfree_ocr

Usage

import torchfree_ocr
reader = torchfree_ocr.Reader() # English recognition is default, other languages are not supported
result = reader.readtext('english.png')

The output is a list in which each item contains a bounding box, the detected text, and a confidence level, in that order.

[([[231, 32], [672, 32], [672, 64], [231, 64]], 'Reduce your risk of coronavirus infection:', 0.8413621448628567), 
 ([[326, 98], [598, 98], [598, 124], [326, 124]], 'Clean hands with soap and water', 0.9633979603853523), 
 ([[328, 124], [540, 124], [540, 148], [328, 148]], 'or alcohol-based hand rub', 0.802668636048309), 
 ([[248, 170], [595, 170], [595, 196], [248, 196]], 'Cover nose and mouth when coughing and', 0.9529594602295661), 
 ([[248, 196], [546, 196], [546, 222], [248, 222]], 'sneezing with tissue or flexed elbow', 0.8406205896147358), 
 ([[320, 240], [624, 240], [624, 266], [320, 266]], 'Avoid close contact with anyone with', 0.8602271367787114), 
 ([[318, 265], [528, 265], [528, 293], [318, 293]], 'cold or flu-like symptoms', 0.9378307488433589), 
 ([[248, 322], [510, 322], [510, 348], [248, 348]], 'Thoroughly cook meat and eggs', 0.7159722535422908), 
 ([[332, 370], [640, 370], [640, 396], [332, 396]], 'No unprotected contact with live wild', 0.8346977728209518), 
 ([[334, 396], [464, 396], [464, 420], [334, 420]], 'or farm animals', 0.7179850171130348), 
 ([[595, 427], [683, 427], [683, 447], [595, 447]], 'World Health', 0.9979501800152029), 
 ([[597, 445], [685, 445], [685, 463], [597, 463]], 'Organization', 0.9977550970521537)]
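As a rough illustration of working with this structure, the sketch below unpacks each item, converts the four corner points into an axis-aligned rectangle, and drops low-confidence entries. The helper names to_rect and confident_lines and the 0.8 threshold are assumptions made for this example, not part of the package.

```python
def to_rect(box):
    """Convert four [x, y] corner points into (x_min, y_min, x_max, y_max)."""
    xs = [point[0] for point in box]
    ys = [point[1] for point in box]
    return min(xs), min(ys), max(xs), max(ys)


def confident_lines(result, threshold=0.8):
    """Keep (rect, text, confidence) for items at or above the threshold."""
    return [
        (to_rect(box), text, confidence)
        for box, text, confidence in result
        if confidence >= threshold
    ]


# Two items copied from the sample output above.
sample = [
    ([[326, 98], [598, 98], [598, 124], [326, 124]],
     'Clean hands with soap and water', 0.9633979603853523),
    ([[248, 322], [510, 322], [510, 348], [248, 348]],
     'Thoroughly cook meat and eggs', 0.7159722535422908),
]

for rect, text, confidence in confident_lines(sample):
    print(rect, text, round(confidence, 2))
# (326, 98, 598, 124) Clean hands with soap and water 0.96
```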

Note 1: Instead of the filepath english.png, you can also pass an OpenCV image object (numpy array) or an image file as bytes. A URL to a raw image is also acceptable.
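The input forms listed in Note 1 might be used as sketched below. The helper read_from_bytes is a hypothetical name introduced for illustration; it takes the reader as a parameter so the snippet does not depend on the package being importable. The commented usage assumes OpenCV (cv2) is installed.

```python
from pathlib import Path


def read_from_bytes(reader, path, **kwargs):
    """Load an image file into memory and pass the raw bytes to reader.readtext."""
    image_bytes = Path(path).read_bytes()
    return reader.readtext(image_bytes, **kwargs)


# Hypothetical usage, assuming torchfree_ocr and OpenCV are installed:
#   import cv2, torchfree_ocr
#   reader = torchfree_ocr.Reader()
#   reader.readtext("english.png")               # file path
#   reader.readtext(cv2.imread("english.png"))   # OpenCV image (numpy array)
#   read_from_bytes(reader, "english.png")       # image file as bytes
```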

Note 2: The line reader = torchfree_ocr.Reader() loads the model into memory. It takes some time, but it needs to be run only once.
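One way to make sure the model is loaded only once in a longer-running application is to cache the Reader behind a small accessor. This is a minimal sketch; get_reader is a hypothetical helper, not something the package provides.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_reader():
    """Create the Reader on the first call and return the same object afterwards."""
    # Imported lazily so the loading cost is paid only when OCR is first needed.
    import torchfree_ocr
    return torchfree_ocr.Reader()


# Hypothetical usage:
#   text = get_reader().readtext("english.png", detail=0)
#   more = get_reader().readtext("other.png", detail=0)  # reuses the loaded model
```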

You can also set detail=0 for simpler output.

reader.readtext('english.png', detail = 0)

Result:

['Reduce your risk of coronavirus infection:', 'Clean hands with soap and water', 'or alcohol-based hand rub', 'Cover nose and mouth when coughing and', 'sneezing with tissue or flexed elbow', 'Avoid close contact with anyone with', 'cold or flu-like symptoms', 'Thoroughly cook meat and eggs', 'No unprotected contact with live wild', 'or farm animals', 'World Health', 'Organization']
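Because the simple output is a plain list of strings, it can be combined into a single piece of text with ordinary string handling. The helper join_lines below is a hypothetical name used for illustration only.

```python
def join_lines(lines, separator=" "):
    """Join recognized text fragments into one string, skipping empty entries."""
    return separator.join(line.strip() for line in lines if line.strip())


# Two fragments copied from the result above.
lines = ['Clean hands with soap and water', 'or alcohol-based hand rub']
print(join_lines(lines))
# Clean hands with soap and water or alcohol-based hand rub
```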

Overall, usage is the same as with EasyOCR, except that Reader in this package has only the recognizer= parameter and there is no readtextlang() method.

Usage for EasyOCR can be found in their tutorial and API Documentation.

Run on command line

$ torchfree_ocr -f english.png --detail=1


