Call the macOS built-in Vision OCR engine via a Python C extension (Windows support is planned)

Project description

native_ocr_py

A Python library that exposes the macOS Vision OCR engine via a Python C extension — no third-party OCR service, no bundled models, no network calls. Uses the OS-native framework directly for fast, on-device text recognition.

Platform support: macOS only for now. Windows support is planned.


Requirements

  • macOS 13 or later (for Vision framework revision 3)
  • Python 3.11 or later
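
The two requirements above can be checked before importing the extension. The following is a minimal sketch using only the standard library; the helper names (`macos_major`, `meets_requirements`, `current_platform_ok`) are illustrative and not part of native_ocr_py.

```python
import platform
import sys

MIN_MACOS_MAJOR = 13      # from the Requirements list above
MIN_PYTHON = (3, 11)      # from the Requirements list above


def macos_major(version: str) -> int:
    """Return the major number of a version string such as '14.5'; 0 if unknown."""
    head = version.split(".")[0]
    return int(head) if head.isdigit() else 0


def meets_requirements(system: str, mac_version: str, python_version: tuple) -> bool:
    """True when the OS is macOS 13+ and the interpreter is Python 3.11+."""
    if system != "Darwin":
        return False
    if macos_major(mac_version) < MIN_MACOS_MAJOR:
        return False
    return tuple(python_version[:2]) >= MIN_PYTHON


def current_platform_ok() -> bool:
    """Apply the check to the running interpreter."""
    return meets_requirements(
        platform.system(), platform.mac_ver()[0], tuple(sys.version_info[:2])
    )
```

Calling `current_platform_ok()` at start-up lets an application fall back to another OCR path instead of failing on import.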

Installation

Install from PyPI:

pip install native_ocr_py

The package includes a compiled C extension, so pip installs a pre-built wheel when one matching your Python version and macOS architecture (arm64 / x86_64) is available on PyPI. If no matching wheel is found, pip will attempt to build from source, which requires Xcode Command Line Tools:

xcode-select --install

Quick start

import asyncio
import native_ocr

async def main():
    with open("screenshot.png", "rb") as f:
        data = f.read()

    results = await native_ocr.perform_ocr_on_image(data, normalized=True)

    for r in results:
        print(r.content, r.position)

asyncio.run(main())

API

get_supported_languages() -> list[str]

Returns BCP-47 language codes supported by the Vision OCR engine (e.g. "en-US", "zh-Hans"). The result is cached after the first call.
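
Since the returned list names the codes the engine accepts, it can be used to filter user-supplied hints before passing them as `languages`. A minimal sketch, assuming the caller wants to fall back to auto-detect when nothing matches; `pick_languages` is a hypothetical helper, not a native_ocr function.

```python
from typing import Optional


def pick_languages(requested: list, supported: list) -> Optional[list]:
    """Keep only the requested codes that appear in `supported`, preserving order.

    Returns None when nothing matches; the API documents None as auto-detect.
    """
    supported_set = set(supported)
    kept = [code for code in requested if code in supported_set]
    return kept or None


# Intended use (requires the package to be installed):
#   hints = pick_languages(["zh-Hans", "fr-FR"], native_ocr.get_supported_languages())
#   results = await native_ocr.perform_ocr_on_image(data, normalized=True, languages=hints)
```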


await perform_ocr_on_image(data, normalized, *, roi, high_accuracy, languages, custom_words) -> list[OcrResult]

Run OCR on an encoded image loaded into memory. Accepts any format supported by the OS decoder — JPEG and PNG are guaranteed; HEIC, TIFF, BMP, and WebP are available on most systems.

Parameter      Type                Default     Description
data           bytes               (required)  Raw bytes of an encoded image file
normalized     bool                (required)  True → result coordinates in [0.0, 1.0]; False → pixels
roi            BoundingBox | None  None        Region of interest in normalised coordinates. None = full image
high_accuracy  bool                True        Use the accurate (slower) recognition level
languages      list[str] | None    None        BCP-47 hints from get_supported_languages(). None = auto-detect
custom_words   list[str] | None    None        Domain-specific vocabulary hints. Only applied when high_accuracy=True
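
JPEG and PNG are the only formats described above as guaranteed. A caller who wants to know up front whether a payload falls into that set can inspect its leading magic bytes. The sketch below is a hypothetical pre-check, not something the library is known to do internally.

```python
from typing import Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"   # fixed 8-byte PNG file signature
JPEG_SIGNATURE = b"\xff\xd8\xff"       # JPEG start-of-image marker plus next marker byte


def sniff_guaranteed_format(data: bytes) -> Optional[str]:
    """Return 'png' or 'jpeg' when the bytes start with that signature, else None."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None
```

A `None` result does not mean the image will fail to decode; it only means the format is outside the guaranteed pair and depends on the system decoder.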

await perform_ocr_on_bgra(bgra, width, height, normalized, *, roi, high_accuracy, languages, custom_words) -> list[OcrResult]

Run OCR on a raw BGRA8 pixel buffer. Useful when you already have decoded pixel data (e.g. from a screen capture or camera frame) and want to avoid re-encoding.

The buffer must be tightly packed: len(bgra) == width * height * 4.

Parameter      Type                Default     Description
bgra           bytes               (required)  Raw BGRA8 pixel data, no row padding
width          int                 (required)  Image width in pixels
height         int                 (required)  Image height in pixels
normalized     bool                (required)  True → result coordinates in [0.0, 1.0]; False → pixels
roi            BoundingBox | None  None        Region of interest in normalised coordinates. None = full image
high_accuracy  bool                True        Use the accurate (slower) recognition level
languages      list[str] | None    None        BCP-47 hints. None = auto-detect
custom_words   list[str] | None    None        Domain-specific vocabulary hints. Only applied when high_accuracy=True
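
Capture APIs often return rows padded to a stride larger than width * 4, which violates the tightly packed requirement above. The following sketch repacks such a buffer; `pack_bgra_rows` and its `stride` parameter are hypothetical and would live in caller code, not in native_ocr.

```python
def pack_bgra_rows(buffer: bytes, width: int, height: int, stride: int) -> bytes:
    """Drop per-row padding so that len(result) == width * height * 4.

    `stride` is the number of bytes from the start of one row to the next.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    row_bytes = width * 4  # 4 bytes per BGRA8 pixel
    if stride < row_bytes:
        raise ValueError("stride is smaller than one row of pixels")
    # The last row may legitimately omit its trailing padding.
    if len(buffer) < stride * (height - 1) + row_bytes:
        raise ValueError("buffer is too short for the given dimensions")
    if stride == row_bytes:
        return bytes(buffer[: row_bytes * height])  # already tightly packed
    view = memoryview(buffer)
    return b"".join(
        bytes(view[y * stride : y * stride + row_bytes]) for y in range(height)
    )
```

The result can then be passed as the `bgra` argument together with the same `width` and `height`.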

BoundingBox

@dataclass
class BoundingBox:
    x: float      # distance from left edge
    y: float      # distance from top edge
    width: float
    height: float

A rectangle in top-left-origin coordinates. When used as roi input, it is always normalised. When returned in an OcrResult, it is normalised or in pixels, depending on the normalized flag.
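
Because roi is always normalised, a region known in pixels has to be divided by the image size first. A minimal sketch of that conversion follows; the `BoundingBox` class here is a local stand-in that mirrors the fields documented above so the example runs without the package, and `pixel_rect_to_roi` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:  # local stand-in with the same fields as the documented class
    x: float
    y: float
    width: float
    height: float


def pixel_rect_to_roi(left: float, top: float, width: float, height: float,
                      image_width: int, image_height: int) -> BoundingBox:
    """Convert a top-left-origin pixel rectangle to normalised [0.0, 1.0] coordinates.

    The rectangle is clamped to the image bounds before scaling.
    """
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    x0 = min(max(left, 0), image_width)
    y0 = min(max(top, 0), image_height)
    x1 = min(max(left + width, 0), image_width)
    y1 = min(max(top + height, 0), image_height)
    return BoundingBox(
        x=x0 / image_width,
        y=y0 / image_height,
        width=(x1 - x0) / image_width,
        height=(y1 - y0) / image_height,
    )
```

With the real package installed, the same arithmetic would be used to build a `native_ocr.BoundingBox` instead of the stand-in.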


OcrResult

@dataclass
class OcrResult:
    content: str        # recognised text, stripped, never empty
    position: BoundingBox
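
The source does not state the order in which results are returned, so callers that need reading order may want to sort by position. The sketch below uses local stand-in dataclasses with the documented fields; `in_reading_order` and its `line_tolerance` parameter are hypothetical, and the tolerance is expressed in the same units as the coordinates (normalised or pixels).

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:  # local stand-in mirroring the documented fields
    x: float
    y: float
    width: float
    height: float


@dataclass
class OcrResult:  # local stand-in mirroring the documented fields
    content: str
    position: BoundingBox


def in_reading_order(results: list, line_tolerance: float = 0.01) -> list:
    """Sort top-to-bottom, then left-to-right within each line.

    A result joins the current line when its top edge is within
    `line_tolerance` of the top edge of the first result on that line.
    """
    lines = []
    for result in sorted(results, key=lambda r: r.position.y):
        if lines and result.position.y - lines[-1][0].position.y <= line_tolerance:
            lines[-1].append(result)
        else:
            lines.append([result])
    ordered = []
    for line in lines:
        ordered.extend(sorted(line, key=lambda r: r.position.x))
    return ordered
```

A fixed tolerance is a simplification; text with strongly varying font sizes may need a tolerance derived from each box's height.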

Examples

Restrict to a region of interest

# Scan only the top-right quarter of the image
roi = native_ocr.BoundingBox(x=0.5, y=0.0, width=0.5, height=0.5)
results = await native_ocr.perform_ocr_on_image(data, normalized=True, roi=roi)

Pixel coordinates

results = await native_ocr.perform_ocr_on_image(data, normalized=False)
for r in results:
    print(f"{r.content!r} at ({r.position.x:.0f}, {r.position.y:.0f})")

OCR from a raw screen capture buffer

async def ocr_frame(bgra_bytes: bytes, width: int, height: int):
    return await native_ocr.perform_ocr_on_bgra(
        bgra_bytes, width, height,
        normalized=True,
        high_accuracy=False,   # faster for real-time use
    )

Language hints

langs = native_ocr.get_supported_languages()
print(langs)  # ['en-US', 'zh-Hans', 'zh-Hant', 'ja-JP', ...]

results = await native_ocr.perform_ocr_on_image(
    data, normalized=True, languages=["zh-Hans", "en-US"]
)

License

MIT

Download files


Source Distribution

native_ocr_py-0.0.1.tar.gz (13.0 kB)

Uploaded Source

Built Distribution


native_ocr_py-0.0.1-cp314-cp314-macosx_26_0_arm64.whl (14.4 kB)

Uploaded CPython 3.14, macOS 26.0+ ARM64

File details

Details for the file native_ocr_py-0.0.1.tar.gz.

File metadata

  • Download URL: native_ocr_py-0.0.1.tar.gz
  • Upload date:
  • Size: 13.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.12

File hashes

Hashes for native_ocr_py-0.0.1.tar.gz
Algorithm Hash digest
SHA256 5036bd69f119b4ea548acf812835b6aadf854a3f9508c97b1eee685109fd53df
MD5 7b7996d50da81f12fe48bed34166d543
BLAKE2b-256 5ff0c22671b459f2a4fd666af89521e615913984eb8e117bf05fc446c6e92f0f


File details

Details for the file native_ocr_py-0.0.1-cp314-cp314-macosx_26_0_arm64.whl.

File metadata

  • Download URL: native_ocr_py-0.0.1-cp314-cp314-macosx_26_0_arm64.whl
  • Upload date:
  • Size: 14.4 kB
  • Tags: CPython 3.14, macOS 26.0+ ARM64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.12

File hashes

Hashes for native_ocr_py-0.0.1-cp314-cp314-macosx_26_0_arm64.whl
Algorithm Hash digest
SHA256 abe8253f6f17b5d69ef0e105ea556bc8a6ea7d6e1d7d48f49ecb0d7701f73daf
MD5 e88d9c5c3f35ed23cf169509a8830235
BLAKE2b-256 d8797cbc99f2079de6978fd36ed3aa6866763da684273d47260d53a4bc1b1e1c

