ocrmac
A small Python wrapper to extract text from images on a Mac system. It uses Apple's Vision framework. Simply pass a path to an image or a PIL
image directly and get a list of texts with their confidences and bounding boxes.
This only works on macOS 10.15 or newer.
Example and Quickstart
Install via pip:
pip install ocrmac
Basic Usage
from ocrmac import ocrmac
annotations = ocrmac.OCR('test.png').recognize()
print(annotations)
Output (Text, Confidence, BoundingBox):
[("GitHub: Let's build from here - X", 0.5, [0.16, 0.91, 0.17, 0.01]),
('github.com', 0.5, [0.174, 0.87, 0.06, 0.01]),
('Qi &0 O M #O', 0.30, [0.65, 0.87, 0.23, 0.02]),
[...]
('P&G U TELUS', 0.5, [0.64, 0.16, 0.22, 0.03])]
(Bounding box values are rounded for readability.)
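The tuples above can be post-processed with plain Python. The following is a minimal sketch, not part of ocrmac itself: the helper names `to_pixel_box` and `confident_texts` are hypothetical, the 1000x1000 image size is made up for illustration, and it assumes each bounding box is a normalized `[x, y, width, height]` list with the origin in the lower-left corner (the convention of Apple's Vision framework).

```python
def to_pixel_box(bbox, image_width, image_height):
    """Convert a normalized [x, y, w, h] box to pixel (left, top, right, bottom)."""
    x, y, w, h = bbox
    left = x * image_width
    right = (x + w) * image_width
    # Assumption: y is measured from the bottom edge of the image, so it is
    # flipped here to match top-left-origin image coordinates (as used by PIL).
    top = (1 - y - h) * image_height
    bottom = (1 - y) * image_height
    return tuple(round(v) for v in (left, top, right, bottom))


def confident_texts(annotations, min_confidence=0.5):
    """Keep only the text of annotations at or above a confidence threshold."""
    return [text for text, confidence, _ in annotations if confidence >= min_confidence]


# Two entries taken from the sample output above.
annotations = [
    ("github.com", 0.5, [0.174, 0.87, 0.06, 0.01]),
    ("Qi &0 O M #O", 0.30, [0.65, 0.87, 0.23, 0.02]),
]

print(confident_texts(annotations))
for text, _, bbox in annotations:
    # Hypothetical image size of 1000x1000 pixels.
    print(text, to_pixel_box(bbox, 1000, 1000))
```

Filtering by confidence is a simple way to drop noisy detections such as the third entry in the sample output.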
Create Annotated Images
from ocrmac import ocrmac
ocrmac.OCR('test.png').annotate_PIL()
Functionality
- You can pass the path to an image or a PIL image as an object
- You can use it as a class (ocrmac.OCR) or as a function (ocrmac.text_from_image)
- You can pass several arguments:
  - recognition_level: fast or accurate
  - language_preference: A list with languages for post-processing, e.g. ['en', 'de'].
- You can get an annotated output either as a PIL image (annotate_PIL) or a matplotlib figure (annotate_matplotlib)
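As a rough sketch of how the function interface and the two arguments might be combined: the wrapper `run_ocr` below is hypothetical and only validates its input before delegating to `ocrmac.text_from_image`. It assumes the function accepts `recognition_level` and `language_preference` as keyword arguments, as the list above suggests. The import is done lazily because ocrmac only works on macOS.

```python
VALID_LEVELS = ("fast", "accurate")


def run_ocr(image, recognition_level="accurate", language_preference=None):
    """Run OCR on a path or PIL image with basic argument checking."""
    if recognition_level not in VALID_LEVELS:
        raise ValueError(
            f"recognition_level must be one of {VALID_LEVELS}, got {recognition_level!r}"
        )

    # Imported here so this module can still be loaded on non-macOS systems.
    from ocrmac import ocrmac

    kwargs = {"recognition_level": recognition_level}
    if language_preference is not None:
        # Only pass a language list when the caller provided one.
        kwargs["language_preference"] = list(language_preference)

    return ocrmac.text_from_image(image, **kwargs)
```

On a Mac, `run_ocr('test.png', recognition_level='fast', language_preference=['en', 'de'])` should return annotations in the same shape as the basic usage example.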
Example: Select Language Preference
You can set a language preference like so:
ocrmac.OCR('test.png', language_preference=['en'])
What abbreviation should you use for your language of choice? Here is an overview of language codes.
See also this Example Notebook for implementation details.
Speed
Timings for the above recognize-statement on a MacBook Pro (14-inch, 2021):
- accurate: 233 ms ± 1.77 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
- fast: 200 ms ± 4.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
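The numbers above are in the format printed by IPython's `%timeit`. Outside of a notebook, a comparable measurement could be sketched with the standard library as follows. The helper `time_call` is hypothetical, and the workload timed here is a stand-in so that the snippet runs on any system.

```python
import statistics
import timeit


def time_call(func, runs=7):
    """Return (mean, standard deviation) in milliseconds over `runs` single calls."""
    seconds = timeit.repeat(func, number=1, repeat=runs)
    millis = [s * 1000 for s in seconds]
    return statistics.mean(millis), statistics.stdev(millis)


if __name__ == "__main__":
    # Stand-in workload; replace with the OCR call on macOS.
    mean_ms, std_ms = time_call(lambda: sum(range(100_000)))
    print(f"{mean_ms:.1f} ms ± {std_ms:.2f} ms per call (7 runs, 1 call each)")
```

To time the OCR itself on a Mac, pass something like `lambda: ocrmac.OCR('test.png').recognize()` to `time_call`. Results will vary with the image and the hardware.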
Technical Background & Motivation
If you want to do optical character recognition (OCR) with Python, widely used tools are pytesseract
and EasyOCR
. For me, tesseract never gave great results. EasyOCR did, but it is slow on CPU. While it supports GPU acceleration via CUDA, that is not available on Mac. (Update from 9/2023: Apparently EasyOCR now has mps support for mac.)
In any case, as a Mac user you might notice that, with newer macOS versions, you can directly copy and paste text from images. The built-in OCR functionality is quite good. The underlying functionality for this is VNRecognizeTextRequest from Apple's Vision Framework. Unfortunately, it is in Swift; luckily, a wrapper for this exists: pyobjc-framework-Vision. ocrmac utilizes this wrapper and provides an easy interface to use it for OCR.
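To give an idea of what calling the Vision framework through pyobjc looks like, here is a minimal sketch. It is not ocrmac's actual implementation: the function name `recognize_text` and its `fast` parameter are made up for illustration, and the snippet assumes that pyobjc-framework-Vision is installed on macOS. The imports are kept inside the function because they are only available there.

```python
def recognize_text(image_path, fast=False):
    """Return (text, confidence, [x, y, w, h]) tuples for one image file."""
    # macOS-only imports provided by pyobjc.
    import Vision
    from Foundation import NSURL

    url = NSURL.fileURLWithPath_(image_path)
    handler = Vision.VNImageRequestHandler.alloc().initWithURL_options_(url, {})

    request = Vision.VNRecognizeTextRequest.alloc().init()
    level = (
        Vision.VNRequestTextRecognitionLevelFast
        if fast
        else Vision.VNRequestTextRecognitionLevelAccurate
    )
    request.setRecognitionLevel_(level)

    # pyobjc returns the NSError out-parameter as a second return value.
    success, error = handler.performRequests_error_([request], None)
    if not success:
        raise RuntimeError(f"Vision request failed: {error}")

    results = []
    for observation in request.results() or []:
        candidates = observation.topCandidates_(1)
        if not candidates:
            continue
        best = candidates[0]
        # Normalized bounding box of the detected text.
        box = observation.boundingBox()
        results.append(
            (
                str(best.string()),
                best.confidence(),
                [box.origin.x, box.origin.y, box.size.width, box.size.height],
            )
        )
    return results
```

Compared to this, ocrmac additionally accepts PIL images and offers the annotation helpers described above.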
I found the following resources very helpful when implementing this:
I also did a small writeup about OCR on mac in this blogpost on medium.com.
Contributing
If you have a feature request or a bug report, please post it either as an idea in the discussions or as an issue on the GitHub issue tracker. If you want to contribute, open a PR for it. Thanks!
If you like the project, consider starring it!
History
0.1.0 (2022-12-30)
- First release on PyPI.
- Basic functionality for PIL and matplotlib