Apple Vision Framework Python Utilities

Fast and accurate OCR on images and PDFs using Apple Vision framework (pyobjc-framework-Vision) directly from command line.

Apple Vision Framework Python Utilities
- Features
- Demo
- Installation
- Usage
  - Command Line
  - As a Library
- Develop
- Test

Features

- Fast and accurate OCR with multi-language support (-l, --lang), powered by Apple's industry-strength Vision framework (pyobjc-framework-Vision).
- Supports all common input image formats: PNG, JPEG, TIFF, and WebP.
- Supports PDF input (the file is converted to images first). The tool does NOT assume a file is a PDF just because it has a .pdf extension; you must pass the -p, --pdf flag.
- Outputs only the extracted text by default; with the -j, --json flag, it outputs JSON that includes the recognition confidence for each line.
- Supports clipping the text based on start and end markers (-s, -S, -e, -E).

Demo

Below is the output of running the tests:

https://g.teddysc.me/96d5b1217b90035c163b3c97ce99112f

Installation

Requires Python >= 3.11, <4.0.

Since this package uses Apple's Vision framework, it only works on macOS.

To OCR PDFs with -p, first install the required dependency poppler: brew install poppler.
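
The two requirements above (macOS, and poppler for PDF input) can be checked up front. This is a sketch under stated assumptions: `missing_prerequisites` is a hypothetical helper, and it assumes the PDF conversion needs poppler's `pdftoppm` binary on PATH, which this README does not confirm.

```python
import platform
import shutil


def missing_prerequisites(need_pdf=False, system=None, which=shutil.which):
    """Return a list of unmet prerequisites (hypothetical helper).

    `system` and `which` can be overridden, which makes the check easy to test.
    """
    system = system or platform.system()
    problems = []
    # Apple's Vision framework is only available on macOS ("Darwin").
    if system != "Darwin":
        problems.append("macOS is required")
    # Assumption: PDF conversion relies on poppler's pdftoppm executable.
    if need_pdf and which("pdftoppm") is None:
        problems.append("poppler not found; run: brew install poppler")
    return problems
```

An empty list means nothing was found to be missing under these assumptions.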

pipx

This is the recommended installation method.

$ pipx install apple-vision-utils

pip

$ pip install apple-vision-utils

`uv tool` installation doesn't work

I tried installing this with uv tool install using several Python versions on an Apple Silicon Mac, and it didn't work. This may be caused by peculiarities of the Objective-C interfacing libraries. Use pipx for now.

Usage

Command Line

$ apple-ocr --help

usage: apple-ocr [-h] [-j] [-p] [-l LANG] [--pdf2image-only] [--pdf2image-dir PDF2IMAGE_DIR] [-s START_MARKER_INCLUSIVE] [-S START_MARKER_EXCLUSIVE] [-e END_MARKER_INCLUSIVE] [-E END_MARKER] [-V] file_path

Extract text from an image or PDF using Apple's Vision framework.

positional arguments:
  file_path             Path to the image or PDF file.

options:
  -h, --help            show this help message and exit
  -j, --json            Output results in JSON format.
  -p, --pdf             Specify if the input file is a PDF.
  -l LANG, --lang LANG  Specify the language for text recognition (e.g., eng,
                        fra, deu, zh-Hans for Simplified Chinese, zh-Hant for
                        Traditional Chinese). Default is 'zh-Hant', which
                        works with images containing both Chinese characters
                        and latin letters.
  --pdf2image-only      Only convert PDF to images without performing OCR.
  --pdf2image-dir PDF2IMAGE_DIR
                        Specify the directory to store output images. By
                        default, a secure temporary directory is created.
  -s START_MARKER_INCLUSIVE, --start-marker-inclusive START_MARKER_INCLUSIVE
                        Specify the start marker (included, as the first line of the extracted text) for text extraction in PDF.
  -S START_MARKER_EXCLUSIVE, --start-marker-exclusive START_MARKER_EXCLUSIVE
                        Specify the start marker (excluded, as the first line of the extracted text) for text extraction in PDF.
  -e END_MARKER_INCLUSIVE, --end-marker-inclusive END_MARKER_INCLUSIVE
                        Specify the end marker (included, as the last line of the extracted text) for text extraction in PDF.
  -E END_MARKER, --end-marker END_MARKER
                        Specify the end marker (excluded, as the last line of the extracted text) for text extraction in PDF.
  -V, --version         show program's version number and exit

As a Library

You can also use the utility functions in your own Python code:

from apple_vision_utils.utils import image_to_text, pdf_to_images, process_pdf, clip_results

# Extract text from an image
results = image_to_text("path/to/image.png", lang="eng")

# Convert PDF to images
images = pdf_to_images("path/to/document.pdf")

# Process PDF for text recognition
pdf_results = process_pdf("path/to/document.pdf", lang="eng")

# Clip text results based on markers
clipped_results = clip_results(results, start_marker_inclusive="Start", end_marker_exclusive="End")

Develop

$ git clone https://github.com/tddschn/apple-vision-utils.git
$ cd apple-vision-utils
$ poetry install

Test

# in the root of the project
poetry install
poetry shell
cd tests && ./test.sh

Release history

- 1.0.2 (this version): Jan 31, 2025
- 1.0.1: Jan 31, 2025
- 1.0.0: Jan 31, 2025
- 0.1.6: May 21, 2024
- 0.1.5: May 21, 2024
- 0.1.4: May 21, 2024
- 0.1.3: May 21, 2024
- 0.1.2: May 21, 2024
- 0.1.1: May 21, 2024
