Skip to main content

Plugin to run OCRmyPDF with Apple Vision Framework OCR engine

Project description

OCRmyPDF AppleOCR

A plugin for OCRmyPDF that enables optical character recognition (OCR) using the text detection capabilities of Apple’s Vision Framework on macOS.

Apple’s proprietary OCR implementation provides excellent accuracy and speed compared to other on-device OCR engines such as Tesseract.

Installation

The package is available on PyPI.

pip install ocrmypdf-appleocr

Usage

To use the plugin, pass the --plugin option when invoking ocrmypdf. You can also specify the language(s) for OCR using the -l or --language option. If you want to enable automatic language detection, use und (undetermined) as the language code.

ocrmypdf -l jpn --plugin ocrmypdf_appleocr input.pdf output.pdf

Options

  • --appleocr-recognition-mode: Recognition mode for Apple Vision OCR. Choices: fast, accurate, or livetext. Default: livetext on macOS 13 and later, accurate on macOS 12 and earlier.
  • --appleocr-disable-correction: Disable language correction in Apple Vision OCR (default: False)
  • --pdf-renderer: Renderer used to embed OCR results as invisible (“phantom”) text. Choices: hocr, sandwich. Default: sandwich.
  • -l or --language: Specify OCR language(s) in ISO 639-2 three-letter codes. Use und for undetermined language. Specifying multiple languages joined with + (e.g. eng+fra) for multilingual documents is not supported.

Automatic language detection (und) is not supported in livetext mode.

Recognition Modes

The fast and accurate modes use VNRecognizeTextRequest from Apple's Vision framework.

The livetext mode uses the newer ImageAnalyzer API from the VisionKit framework. Although officially Swift-only, it can be accessed via private API (VKCImageAnalyzer) through pyobjc.

The key difference is that LiveText supports vertical text layout in East Asian languages, which is not handled properly by the older API.

PDF Renderers

This plugin supports two OCRmyPDF renderers: hocr and sandwich. The default is sandwich.

  • sandwich: The plugin renders OCR output as a PDF layer with invisible text, which OCRmyPDF then merges with the original page image.
  • hocr: The plugin outputs OCR results as hOCR markup, and OCRmyPDF converts the markup to PDF.

Because the hOCR format cannot represent vertical text in East Asian (CJK) scripts, the hocr renderer cannot accurately reproduce vertical text layouts. However, OCRmyPDF’s built-in hOCR-to-PDF conversion is more mature and may perform better in other scenarios.

Supported Languages

As of macOS Tahoe 26, the following languages are supported by Apple Vision OCR:

Language code Language name Fast mode Accurate mode LiveText
eng English
fra French
ita Italian
deu German
spa Spanish
por Portuguese
chi_sim Chinese (Simplified)
chi_tra Chinese (Traditional)
yue_sim Cantonese (Simplified)
yue_tra Cantonese (Traditional)
kor Korean
jpn Japanese
rus Russian
ukr Ukrainian
tha Thai
vie Vietnamese
ara Arabic
ars Arabic (Najdi)
tur Turkish
ind Indonesian
ces Czech
dan Danish
nld Dutch
nor Norwegian
nno Norwegian (Nynorsk)
nob Norwegian (Bokmål)
msa Malay
pol Polish
ron Romanian
swe Swedish

Acknowledgements

This project incorporates and references code from the following projects:

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ocrmypdf_appleocr-0.3.4.tar.gz (15.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ocrmypdf_appleocr-0.3.4-py3-none-any.whl (15.3 kB view details)

Uploaded Python 3

File details

Details for the file ocrmypdf_appleocr-0.3.4.tar.gz.

File metadata

  • Download URL: ocrmypdf_appleocr-0.3.4.tar.gz
  • Upload date:
  • Size: 15.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ocrmypdf_appleocr-0.3.4.tar.gz
Algorithm Hash digest
SHA256 48fec0394851aa9485f9f40b9ef0e98fadae41f6900dd8cf2a629feb98af8ab0
MD5 6bc5deaa3136ce70bc7421fe1aeb1fb6
BLAKE2b-256 ce34b7cc8057e131189b7e8b3174d1b5cad9b43ecfc9711bd3f99e35b7d2b121

See more details on using hashes here.

Provenance

The following attestation bundles were made for ocrmypdf_appleocr-0.3.4.tar.gz:

Publisher: release.yml on mkyt/OCRmyPDF-AppleOCR

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ocrmypdf_appleocr-0.3.4-py3-none-any.whl.

File metadata

File hashes

Hashes for ocrmypdf_appleocr-0.3.4-py3-none-any.whl
Algorithm Hash digest
SHA256 7d63cb582b91683a8cef5d291c764ade20330d7c1b59dbb2d83bcc6238d53425
MD5 e2ced8190483ff8944dbcc52a7db1682
BLAKE2b-256 7a60e252425af74b7581c773d6c7ee28119341e6e0df8f23f92dc2faa4ef351f

See more details on using hashes here.

Provenance

The following attestation bundles were made for ocrmypdf_appleocr-0.3.4-py3-none-any.whl:

Publisher: release.yml on mkyt/OCRmyPDF-AppleOCR

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page