Plugin to run OCRmyPDF with Apple Vision Framework OCR engine
OCRmyPDF AppleOCR
A plugin for OCRmyPDF that enables optical character recognition (OCR) using the text detection capabilities of Apple’s Vision Framework on macOS.
Apple’s proprietary OCR implementation provides excellent accuracy and speed compared to other on-device OCR engines such as Tesseract.
Installation
The package is available on PyPI.
pip install ocrmypdf-appleocr
Usage
To use the plugin, pass the --plugin option when invoking ocrmypdf. You can also specify the language(s) for OCR using the -l or --language option. If you want to enable automatic language detection, use und (undetermined) as the language code.
ocrmypdf -l jpn --plugin ocrmypdf_appleocr input.pdf output.pdf
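The same invocation can be scripted. The following is a minimal sketch that builds the command line shown above and hands it to subprocess; the helper names build_ocrmypdf_command and run_ocr are hypothetical and not part of the plugin or of OCRmyPDF. It assumes the ocrmypdf executable is on PATH and the plugin is installed.

```python
import shlex
import subprocess


def build_ocrmypdf_command(input_pdf, output_pdf, language="und"):
    """Return the argv list for an ocrmypdf run that loads the AppleOCR plugin.

    Hypothetical helper: it only assembles the flags documented above.
    """
    return [
        "ocrmypdf",
        "-l", language,                    # single language code, or "und"
        "--plugin", "ocrmypdf_appleocr",   # load the AppleOCR plugin
        str(input_pdf),
        str(output_pdf),
    ]


def run_ocr(input_pdf, output_pdf, language="und"):
    """Run ocrmypdf and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_ocrmypdf_command(input_pdf, output_pdf, language), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works anywhere.
    print(shlex.join(build_ocrmypdf_command("input.pdf", "output.pdf", language="jpn")))
```

Calling run_ocr("input.pdf", "output.pdf", language="jpn") would execute the command rather than print it.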
Options
- --appleocr-recognition-mode: Recognition mode for Apple Vision OCR. Choices: fast, accurate, or livetext. Default: livetext on macOS 13 and later, accurate on macOS 12 and earlier.
- --appleocr-disable-correction: Disable language correction in Apple Vision OCR (default: False).
- --pdf-renderer: Renderer used to embed OCR results as invisible (“phantom”) text. Choices: hocr, sandwich. Default: sandwich.
- -l or --language: Specify OCR language(s) in ISO 639-2 three-letter codes. Use und for undetermined language. Specifying multiple languages joined with + (e.g. eng+fra) for multilingual documents is not supported.
Automatic language detection (und) is not supported in livetext mode.
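The constraints described in this section can be summarised in a few lines of Python. This is only an illustrative sketch of the documented rules, not the plugin's actual validation code; the function names default_recognition_mode and check_options are assumptions made for the example.

```python
RECOGNITION_MODES = ("fast", "accurate", "livetext")


def default_recognition_mode(macos_major):
    """Documented default: livetext on macOS 13 and later, accurate before that."""
    return "livetext" if macos_major >= 13 else "accurate"


def check_options(language, mode):
    """Raise ValueError for option combinations the documentation rules out."""
    if mode not in RECOGNITION_MODES:
        raise ValueError(f"unknown recognition mode: {mode!r}")
    if "+" in language:
        # Joined codes such as eng+fra are not supported by the plugin.
        raise ValueError("multiple languages joined with '+' are not supported")
    if language == "und" and mode == "livetext":
        # Automatic language detection is unavailable in livetext mode.
        raise ValueError("language 'und' is not supported in livetext mode")


if __name__ == "__main__":
    mode = default_recognition_mode(12)
    check_options("und", mode)  # passes: und with accurate mode is allowed
    print(mode)
```

In practice this means automatic language detection needs an explicit --appleocr-recognition-mode of fast or accurate on macOS 13 and later, where livetext is the default.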
Recognition Modes
The fast and accurate modes use VNRecognizeTextRequest from Apple's Vision framework.
The livetext mode uses the newer ImageAnalyzer API from the VisionKit framework.
Although ImageAnalyzer is officially Swift-only, it can be accessed through a private API (VKCImageAnalyzer) via pyobjc.
The key difference is that LiveText supports vertical text layout in East Asian languages, which is not handled properly by the older API.
PDF Renderers
This plugin supports two OCRmyPDF renderers: hocr and sandwich.
The default is sandwich.
- sandwich: The plugin renders OCR output as a PDF layer with invisible text, which OCRmyPDF then merges with the original page image.
- hocr: The plugin outputs OCR results as hOCR markup, and OCRmyPDF converts the markup to PDF.
Because the hOCR format cannot represent vertical text in East Asian (CJK) scripts, the hocr renderer cannot accurately reproduce vertical text layouts.
However, OCRmyPDF’s built-in hOCR-to-PDF conversion is more mature and may perform better in other scenarios.
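To make the hocr path more concrete, here is a rough sketch of how one recognized word might be turned into hOCR markup. It is not the plugin's code. It assumes the OCR result arrives as a Vision-style bounding box (normalized to 0..1 with the origin at the bottom-left of the image), whereas hOCR expects pixel coordinates with the origin at the top-left. The function names are hypothetical.

```python
from html import escape


def vision_box_to_hocr_bbox(x, y, w, h, image_width, image_height):
    """Convert a normalized, bottom-left-origin box to hOCR pixel coordinates.

    Returns (x0, y0, x1, y1) with the origin at the top-left corner.
    """
    x0 = round(x * image_width)
    x1 = round((x + w) * image_width)
    # Flip the vertical axis: the top edge of the box is at y + h in Vision terms.
    y0 = round((1.0 - (y + h)) * image_height)
    y1 = round((1.0 - y) * image_height)
    return x0, y0, x1, y1


def hocr_word(text, bbox):
    """Render one word as an hOCR ocrx_word span."""
    x0, y0, x1, y1 = bbox
    return f"<span class='ocrx_word' title='bbox {x0} {y0} {x1} {y1}'>{escape(text)}</span>"


if __name__ == "__main__":
    box = vision_box_to_hocr_bbox(0.25, 0.5, 0.5, 0.25, image_width=1000, image_height=2000)
    print(hocr_word("Hello", box))
```

The box here only records where the word sits on the page and carries no writing direction of its own, which is one way to picture why vertical layouts are harder to express on this path.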
Supported Languages
As of macOS Tahoe 26, the following languages are supported by Apple Vision OCR:
| Language code | Language name | Fast mode | Accurate mode | LiveText |
|---|---|---|---|---|
| eng | English | ✓ | ✓ | ✓ |
| fra | French | ✓ | ✓ | ✓ |
| ita | Italian | ✓ | ✓ | ✓ |
| deu | German | ✓ | ✓ | ✓ |
| spa | Spanish | ✓ | ✓ | ✓ |
| por | Portuguese | ✓ | ✓ | ✓ |
| chi_sim | Chinese (Simplified) | ✓ | ✓ | |
| chi_tra | Chinese (Traditional) | ✓ | ✓ | |
| yue_sim | Cantonese (Simplified) | ✓ | ✓ | |
| yue_tra | Cantonese (Traditional) | ✓ | ✓ | |
| kor | Korean | ✓ | ✓ | |
| jpn | Japanese | ✓ | ✓ | |
| rus | Russian | ✓ | ✓ | |
| ukr | Ukrainian | ✓ | ✓ | |
| tha | Thai | ✓ | ✓ | |
| vie | Vietnamese | ✓ | ✓ | |
| ara | Arabic | ✓ | ✓ | |
| ars | Arabic (Najdi) | ✓ | ✓ | |
| tur | Turkish | ✓ | ✓ | |
| ind | Indonesian | ✓ | ✓ | |
| ces | Czech | ✓ | ✓ | |
| dan | Danish | ✓ | ✓ | |
| nld | Dutch | ✓ | ✓ | |
| nor | Norwegian | ✓ | ✓ | |
| nno | Norwegian (Nynorsk) | ✓ | ✓ | |
| nob | Norwegian (Bokmål) | ✓ | ✓ | |
| msa | Malay | ✓ | ✓ | |
| pol | Polish | ✓ | ✓ | |
| ron | Romanian | ✓ | ✓ | |
| swe | Swedish | ✓ | ✓ | |
Acknowledgements
This project incorporates and references code from the following projects:
- straussmaximilian/ocrmac - for invoking VKCImageAnalyzer (LiveText API) via pyobjc
- ocrmypdf/OCRmyPDF-EasyOCR - for PDF rendering of recognized text