Call the macOS built-in Vision OCR engine via a Python C extension (Windows support is planned)

native_ocr_py

A Python library that exposes the macOS Vision OCR engine via a Python C extension — no third-party OCR service, no bundled models, no network calls. Uses the OS-native framework directly for fast, on-device text recognition.

Platform support: macOS only for now. Windows support is planned.


Requirements

  • macOS 13 or later (for Vision framework revision 3)
  • Python 3.11 or later
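The two requirements above can be checked up front before importing the extension. The following is only a sketch: the helper name meets_requirements is hypothetical and not part of native_ocr_py, and it relies solely on the standard-library platform and sys modules.

```python
import platform
import sys
from typing import Tuple


def meets_requirements(
    system: str,
    macos_version: str,
    python_version: Tuple[int, int],
) -> bool:
    """Return True if the given environment matches the stated requirements.

    Hypothetical helper, not part of native_ocr_py.
    """
    if system != "Darwin":
        return False  # the README states macOS only for now
    # platform.mac_ver()[0] looks like "14.5" or "13.6.1"; compare the major part.
    try:
        macos_major = int(macos_version.split(".")[0])
    except ValueError:
        return False
    return macos_major >= 13 and python_version >= (3, 11)


if __name__ == "__main__":
    ok = meets_requirements(
        platform.system(),
        platform.mac_ver()[0],
        (sys.version_info[0], sys.version_info[1]),
    )
    print("native_ocr_py requirements met:", ok)
```

Passing the environment in as arguments, rather than reading it inside the function, keeps the check easy to exercise on any machine.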

Installation

Install from PyPI:

pip install native_ocr_py

Because the package includes a compiled C extension, pip installs a pre-built wheel when one matching your Python version and macOS architecture (arm64 / x86_64) is available on PyPI. If no matching wheel is found, pip attempts to build from source, which requires the Xcode Command Line Tools:

xcode-select --install

Quick start

import asyncio
import native_ocr

async def main():
    with open("screenshot.png", "rb") as f:
        data = f.read()

    results = await native_ocr.perform_ocr_on_image(data, normalized=True)

    for r in results:
        print(r.content, r.position)

asyncio.run(main())
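Since the OCR calls are coroutines, several images can be processed with asyncio.gather. The sketch below is one possible pattern, not something the library prescribes: ocr_many and max_concurrent are hypothetical names, and it assumes that concurrent calls into the extension are safe, which this page does not state. The OCR coroutine is passed in as a callable so the helper itself has no dependency on native_ocr.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Sequence


async def ocr_many(
    images: Sequence[bytes],
    ocr: Callable[[bytes], Awaitable[Any]],
    max_concurrent: int = 4,
) -> List[Any]:
    """Run `ocr` over every image, at most `max_concurrent` at a time.

    Results come back in the same order as `images`.
    Hypothetical helper, not part of native_ocr_py.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(data: bytes) -> Any:
        # The semaphore caps how many OCR calls are in flight at once.
        async with semaphore:
            return await ocr(data)

    return list(await asyncio.gather(*(run_one(data) for data in images)))
```

With the library installed, a call might look like `await ocr_many(blobs, lambda d: native_ocr.perform_ocr_on_image(d, normalized=True))`, where blobs is a list of encoded image bytes.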

API

get_supported_languages() -> list[str]

Returns BCP-47 language codes supported by the Vision OCR engine (e.g. "en-US", "zh-Hans"). The result is cached after the first call.


await perform_ocr_on_image(data, normalized, *, roi, high_accuracy, languages, custom_words) -> list[OcrResult]

Run OCR on an encoded image loaded into memory. Accepts any format supported by the OS decoder — JPEG and PNG are guaranteed; HEIC, TIFF, BMP, and WebP are available on most systems.

Parameter      Type                 Default  Description
data           bytes                -        Raw bytes of an encoded image file
normalized     bool                 -        True → result coordinates in [0.0, 1.0]; False → pixels
roi            BoundingBox | None   None     Region of interest in normalised coordinates. None = full image
high_accuracy  bool                 True     Use the accurate (slower) recognition level
languages      list[str] | None     None     BCP-47 hints from get_supported_languages(). None = auto-detect
custom_words   list[str] | None     None     Domain-specific vocabulary hints. Only applied when high_accuracy=True
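Because only JPEG and PNG are described as guaranteed, it can be useful to check what kind of file you are about to pass in. The sketch below inspects the leading signature bytes of the two guaranteed formats. The function sniff_image_format is a hypothetical helper, not part of the library, and a None result only means "not recognised by this check"; the OS decoder may still accept the data.

```python
from typing import Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"   # fixed 8-byte PNG file signature
JPEG_SIGNATURE = b"\xff\xd8\xff"       # JPEG start-of-image marker plus next marker byte


def sniff_image_format(data: bytes) -> Optional[str]:
    """Return "png" or "jpeg" if `data` starts with that format's signature.

    Hypothetical helper, not part of native_ocr_py.
    """
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None
```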

await perform_ocr_on_bgra(bgra, width, height, normalized, *, roi, high_accuracy, languages, custom_words) -> list[OcrResult]

Run OCR on a raw BGRA8 pixel buffer. Useful when you already have decoded pixel data (e.g. from a screen capture or camera frame) and want to avoid re-encoding.

The buffer must be tightly packed: len(bgra) == width * height * 4.

Parameter      Type                 Default  Description
bgra           bytes                -        Raw BGRA8 pixel data, no row padding
width          int                  -        Image width in pixels
height         int                  -        Image height in pixels
normalized     bool                 -        True → result coordinates in [0.0, 1.0]; False → pixels
roi            BoundingBox | None   None     Region of interest in normalised coordinates. None = full image
high_accuracy  bool                 True     Use the accurate (slower) recognition level
languages      list[str] | None     None     BCP-47 hints. None = auto-detect
custom_words   list[str] | None     None     Domain-specific vocabulary hints. Only applied when high_accuracy=True
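Capture APIs often hand back buffers whose rows are padded to an alignment boundary, which violates the tight-packing requirement len(bgra) == width * height * 4. A minimal sketch of removing that padding follows. The name pack_bgra_rows and its bytes_per_row parameter are hypothetical; the row stride has to come from whatever produced the buffer.

```python
def pack_bgra_rows(buffer: bytes, width: int, height: int, bytes_per_row: int) -> bytes:
    """Drop per-row padding so that len(result) == width * height * 4.

    Hypothetical helper, not part of native_ocr_py.
    """
    row_bytes = width * 4  # 4 bytes per BGRA8 pixel
    if bytes_per_row < row_bytes:
        raise ValueError("bytes_per_row is smaller than width * 4")
    if height > 0 and len(buffer) < bytes_per_row * (height - 1) + row_bytes:
        raise ValueError("buffer is too small for the given dimensions")
    if bytes_per_row == row_bytes:
        # Already tightly packed; just trim anything past the last row.
        return bytes(buffer[: row_bytes * height])
    view = memoryview(buffer)
    # Copy only the pixel bytes of each row, skipping the trailing padding.
    return b"".join(
        view[y * bytes_per_row : y * bytes_per_row + row_bytes] for y in range(height)
    )
```

The result can then be passed as the bgra argument together with the same width and height.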

BoundingBox

@dataclass
class BoundingBox:
    x: float      # distance from left edge
    y: float      # distance from top edge
    width: float
    height: float

A rectangle in a top-left-origin coordinate system. When used as roi input, it is always in normalised coordinates. When returned in an OcrResult, it is in normalised or pixel coordinates depending on the normalized flag.
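If you request normalised results but later need pixel positions (for example to draw boxes on the source image), the conversion is a per-axis scale. The sketch below assumes normalised values map linearly onto the image width and height. It defines a local stand-in dataclass with the same fields as BoundingBox so that it runs without the extension; to_pixels is a hypothetical helper, not a library function.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    """Local stand-in mirroring the fields of native_ocr.BoundingBox."""
    x: float
    y: float
    width: float
    height: float


def to_pixels(box: BoundingBox, image_width: int, image_height: int) -> BoundingBox:
    """Scale a normalised box ([0.0, 1.0] per axis) to pixel coordinates."""
    return BoundingBox(
        x=box.x * image_width,
        y=box.y * image_height,
        width=box.width * image_width,
        height=box.height * image_height,
    )
```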


OcrResult

@dataclass
class OcrResult:
    content: str        # recognised text, stripped, never empty
    position: BoundingBox

Examples

Restrict to a region of interest

# Scan only the top-right quarter of the image.
# Run inside an async function; data holds the encoded image bytes, as in Quick start.
roi = native_ocr.BoundingBox(x=0.5, y=0.0, width=0.5, height=0.5)
results = await native_ocr.perform_ocr_on_image(data, normalized=True, roi=roi)
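When the region is known in pixels (say, a window rectangle inside a screenshot), it has to be converted to normalised coordinates before being passed as roi. A minimal sketch, assuming a top-left origin as documented for BoundingBox: roi_from_pixels is a hypothetical helper, the dataclass is a local stand-in so the example runs without the extension, and clamping to the image bounds is a design choice of this sketch rather than documented library behaviour.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    """Local stand-in mirroring the fields of native_ocr.BoundingBox."""
    x: float
    y: float
    width: float
    height: float


def _clamp01(value: float) -> float:
    return min(max(value, 0.0), 1.0)


def roi_from_pixels(
    left: float, top: float, width: float, height: float,
    image_width: int, image_height: int,
) -> BoundingBox:
    """Convert a pixel rectangle to a normalised ROI, clamped to the image."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    # Normalise both corners, clamp them, then derive the size from the corners
    # so the box never extends past the image edge.
    x0 = _clamp01(left / image_width)
    y0 = _clamp01(top / image_height)
    x1 = _clamp01((left + width) / image_width)
    y1 = _clamp01((top + height) / image_height)
    return BoundingBox(x=x0, y=y0, width=x1 - x0, height=y1 - y0)
```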

Pixel coordinates

results = await native_ocr.perform_ocr_on_image(data, normalized=False)
for r in results:
    print(f"{r.content!r} at ({r.position.x:.0f}, {r.position.y:.0f})")

OCR from a raw screen capture buffer

async def ocr_frame(bgra_bytes: bytes, width: int, height: int):
    return await native_ocr.perform_ocr_on_bgra(
        bgra_bytes, width, height,
        normalized=True,
        high_accuracy=False,   # faster for real-time use
    )
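Many imaging libraries produce RGBA rather than BGRA, so the red and blue channels need to be swapped before calling perform_ocr_on_bgra. A small pure-Python sketch of that swap follows; rgba_to_bgra is a hypothetical helper, not part of the library, and it assumes a tightly packed 8-bit-per-channel buffer.

```python
def rgba_to_bgra(rgba: bytes) -> bytes:
    """Swap the R and B channels of a tightly packed RGBA8 buffer.

    Hypothetical helper, not part of native_ocr_py.
    """
    if len(rgba) % 4 != 0:
        raise ValueError("buffer length must be a multiple of 4")
    out = bytearray(rgba)
    # Every 4th byte starting at offset 0 is R, at offset 2 is B.
    out[0::4] = rgba[2::4]
    out[2::4] = rgba[0::4]
    return bytes(out)
```

Extended-slice assignment does the swap without a per-pixel Python loop, which matters for full-frame buffers.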

Language hints

langs = native_ocr.get_supported_languages()
print(langs)  # ['en-US', 'zh-Hans', 'zh-Hant', 'ja-JP', ...]

results = await native_ocr.perform_ocr_on_image(
    data, normalized=True, languages=["zh-Hans", "en-US"]
)
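Since the languages parameter is described as taking hints from get_supported_languages(), it may be worth filtering a preferred list against the supported one before passing it in. The sketch below does this in plain Python. pick_language_hints is a hypothetical helper, and how the library reacts to an unsupported code is not stated on this page, so filtering is a precaution rather than a documented requirement.

```python
from typing import List, Optional


def pick_language_hints(preferred: List[str], supported: List[str]) -> Optional[List[str]]:
    """Keep only the preferred codes that are supported, preserving their order.

    Returns None when nothing matches, which the API documents as auto-detect.
    Hypothetical helper, not part of native_ocr_py.
    """
    supported_set = set(supported)
    hints = [code for code in preferred if code in supported_set]
    return hints or None
```

With the library installed, usage might look like `languages=pick_language_hints(["zh-Hans", "en-US"], native_ocr.get_supported_languages())`.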

License

MIT
