ONNX-only OCR runtime for mathematical documents

MathCraft OCR

MathCraft OCR is an ONNX-only OCR runtime for mathematical documents. It provides formula recognition, text recognition, mixed text/formula page OCR, explicit model-cache management, and structured block output for downstream Markdown or TeX document engines.

The package is developed for LaTeXSnipper but is usable as a standalone Python library.

Features

  • ONNX Runtime inference only; no active PyTorch OCR runtime.
  • Formula OCR: image to LaTeX.
  • Text OCR: multilingual PP-OCRv5 mobile detector/recognizer.
  • Mixed OCR: formula detection, text masking, batched recognition, and layout merge.
  • Manifest-driven model cache with SHA-256 file checks.
  • Automatic repair for missing or incomplete model directories.
  • Resumable model downloads for interrupted first-run cache repair.
  • CPU/GPU provider selection through ONNX Runtime.
  • JSONL worker mode for GUI or service integration.
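
The SHA-256 file checks listed above can be illustrated with a short, standard-library-only sketch. It is not MathCraft's own code: the helper names `sha256_of_file` and `file_matches` are hypothetical, and how the real manifest stores its digests is an assumption.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large model files are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_matches(path: Path, expected_sha256: str) -> bool:
    """True when the file exists and its digest equals the expected value."""
    if not path.is_file():
        # Assumption: a missing file is treated like a failed check.
        return False
    return sha256_of_file(path) == expected_sha256.lower()
```

A cache check built this way would call `file_matches` once per file named in the manifest and mark the model directory for repair if any call returns `False`.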

Installation

CPU backend:

pip install "mathcraft-ocr[cpu]"

GPU backend:

pip install "mathcraft-ocr[gpu]"

Install only one backend extra, and do so in a clean environment. The onnxruntime and onnxruntime-gpu packages should not be installed side by side in the same environment.
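
To check whether an environment already mixes the two backends, the installed distributions can be queried with the standard library. This is a minimal sketch rather than a MathCraft feature; the helper names `installed_backends` and `is_mixed` are hypothetical.

```python
from importlib import metadata

# Distribution names of the two ONNX Runtime backends named above.
BACKENDS = ("onnxruntime", "onnxruntime-gpu")


def installed_backends(names=BACKENDS):
    """Map each installed distribution name to its version string."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # not installed in this environment
    return found


def is_mixed(found):
    """True when more than one backend distribution is installed."""
    return len(found) > 1


if __name__ == "__main__":
    found = installed_backends()
    print("installed backends:", found)
    if is_mixed(found):
        print("both backends are present; recreate the environment with one extra")
```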

LaTeXSnipper's dependency wizard chooses the ONNX Runtime GPU wheel source from the detected CUDA toolkit: CUDA 11.x uses the ONNX Runtime CUDA 11 package feed, CUDA 12.x uses the stable PyPI GPU wheels, and CUDA 13.x uses the ONNX Runtime CUDA 13 nightly feed. The static mathcraft-ocr[gpu] package metadata cannot inspect the local CUDA toolkit, so it declares a broad range of stable PyPI releases. CUDA 11.x users who install manually should use the CUDA 11 feed shown by the wizard.
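
The CUDA-to-wheel mapping described above amounts to a lookup on the toolkit's major version. The sketch below is illustrative only: the function name and the returned labels are hypothetical placeholders, not real feed URLs or wizard internals, and rejecting other CUDA versions is an assumption.

```python
def gpu_wheel_source(cuda_version: str) -> str:
    """Return a label for the wheel source matching a CUDA version string."""
    major = int(cuda_version.strip().split(".")[0])
    if major == 11:
        return "onnxruntime-cuda11-feed"      # placeholder label, not a URL
    if major == 12:
        return "pypi-stable"                  # stable GPU wheels on PyPI
    if major == 13:
        return "onnxruntime-cuda13-nightly"   # placeholder label, not a URL
    # Assumption: versions not named in the text are rejected here.
    raise ValueError(f"no wheel source known for CUDA {cuda_version}")
```

For example, `gpu_wheel_source("12.4")` returns `"pypi-stable"`, which matches the default that the static package metadata falls back to.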

Quick Start

from mathcraft_ocr import MathCraftRuntime

runtime = MathCraftRuntime(provider_preference="auto")
result = runtime.recognize_mixed("page.png")

print(result.text)
for block in result.blocks:
    print(block.role, block.kind, block.text[:80])
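
A downstream document engine might turn the structured blocks into Markdown roughly as follows. This sketch does not import mathcraft_ocr: `Block` is a stand-in dataclass carrying only the three attributes used above, and the `"formula"` kind value, the `"body"` role value, and the `$$` wrapping are assumptions rather than documented behavior.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Stand-in for a result block; the real class may carry more fields."""
    role: str
    kind: str
    text: str


def blocks_to_markdown(blocks) -> str:
    """Join blocks into Markdown, wrapping formula blocks in display math."""
    parts = []
    for block in blocks:
        text = block.text.strip()
        if not text:
            continue  # skip empty blocks
        if block.kind == "formula":  # assumed kind value
            parts.append(f"$$\n{text}\n$$")
        else:
            parts.append(text)
    return "\n\n".join(parts)


if __name__ == "__main__":
    demo = [
        Block("body", "text", "Euler's identity:"),
        Block("body", "formula", r"e^{i\pi} + 1 = 0"),
    ]
    print(blocks_to_markdown(demo))
```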

Formula-only recognition:

from mathcraft_ocr import MathCraftRuntime

runtime = MathCraftRuntime(provider_preference="cpu")
formula = runtime.recognize_formula("formula.png")
print(formula.text)

CLI

Check model cache:

mathcraft models check

Inspect runtime:

mathcraft doctor --provider auto

Warm up models:

mathcraft warmup --profile mixed --provider auto

Recognize an image:

mathcraft ocr "C:\path\to\page.png" --profile mixed --provider auto --output result.md
mathcraft ocr "C:\path\to\page.png" --profile mixed --provider auto --output-dir "D:\MathCraft\outputs"
mathcraft ocr "C:\path\to\formula.png" --profile formula --provider auto --json

Run JSONL worker mode:

mathcraft worker --provider auto
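
JSONL framing means one JSON object per line. The worker's message schema is not described here, so the sketch below shows only the generic framing that a GUI or service client would need; the `id` and `image` fields in the usage note are hypothetical.

```python
import io
import json


def write_message(stream, message: dict) -> None:
    """Write one JSON object as a single line and flush the stream."""
    # json.dumps without indent never emits raw newlines, so each
    # message occupies exactly one line.
    stream.write(json.dumps(message, ensure_ascii=False) + "\n")
    stream.flush()


def read_messages(stream):
    """Yield one decoded JSON object per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_message(buffer, {"id": 1, "image": "page.png"})  # hypothetical fields
    buffer.seek(0)
    print(list(read_messages(buffer)))
```

In a real integration the streams would be the stdin and stdout pipes of the worker process instead of an in-memory buffer.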

Model Cache

MathCraft reads models from:

%APPDATA%\MathCraft\models

or from a custom root:

$env:MATHCRAFT_HOME="D:\MathCraft\models"
mathcraft doctor --provider auto

Persist the custom root for future PowerShell sessions:

setx MATHCRAFT_HOME "D:\MathCraft\models"

Restore the default user cache root:

[Environment]::SetEnvironmentVariable("MATHCRAFT_HOME", $null, "User")
Remove-Item Env:\MATHCRAFT_HOME -ErrorAction SilentlyContinue
mathcraft doctor --provider auto

Open a new PowerShell window after removing the persistent variable. The default root is:

%APPDATA%\MathCraft\models
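
The lookup order described in this section (custom root first, then the default under %APPDATA%) can be sketched as below. This is not the package's own resolver: the function name is hypothetical, and treating an empty MATHCRAFT_HOME as unset is an assumption. `PureWindowsPath` is used so the example runs on any platform.

```python
from pathlib import PureWindowsPath


def resolve_model_root(env) -> PureWindowsPath:
    """Return the model root implied by a mapping of environment variables."""
    custom = env.get("MATHCRAFT_HOME")
    if custom:  # assumption: an empty value counts as unset
        return PureWindowsPath(custom)
    # Default user cache root: %APPDATA%\MathCraft\models
    return PureWindowsPath(env["APPDATA"]) / "MathCraft" / "models"
```

Passing `os.environ` as `env` resolves the root for the current process; passing a plain dict makes the rule easy to test.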

Model artifacts are downloaded from the MathCraft-Models release assets declared in mathcraft_ocr/manifests/models.v1.json.

Runtime Profiles

Profile   Models                                                                    Output
formula   formula detector + formula recognizer                                     LaTeX string
text      text detector + text recognizer                                           OCR text and text blocks
mixed     formula detector + formula recognizer + text detector + text recognizer   Markdown-ready structured blocks

Provider Selection

provider_preference accepts:

  • auto: prefer CUDA when available and valid, otherwise CPU.
  • cpu: force CPU.
  • gpu: request CUDA-capable ONNX Runtime.

Each result reports the provider that was actually used in its provider field.
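
The three preferences can be read as a small decision rule over the execution providers that ONNX Runtime reports. The sketch below is an approximation, not the runtime's implementation: `resolve_provider` is a hypothetical name, the "valid" check mentioned for auto is not modeled, and raising an error when gpu is requested without CUDA is an assumption.

```python
CUDA = "CUDAExecutionProvider"  # ONNX Runtime's CUDA provider name
CPU = "CPUExecutionProvider"    # ONNX Runtime's CPU provider name


def resolve_provider(preference: str, available) -> str:
    """Pick an execution provider name for a given preference."""
    if preference == "cpu":
        return CPU
    if preference == "auto":
        # Prefer CUDA when present, otherwise fall back to CPU.
        return CUDA if CUDA in available else CPU
    if preference == "gpu":
        if CUDA in available:
            return CUDA
        # Assumption: an unsatisfied gpu request is an error in this sketch.
        raise RuntimeError("CUDA provider requested but not available")
    raise ValueError(f"unknown provider preference: {preference!r}")
```

With ONNX Runtime installed, the `available` argument could come from `onnxruntime.get_available_providers()`.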

Development

Run tests from the repository root:

cd E:\LaTexSnipper
python .\test\test_mathcraft_ocr.py
python .\test\test_mathcraft_document_engine.py

Build package artifacts:

cd E:\LaTexSnipper
python -m build --no-isolation --outdir .\release_assets\mathcraft-ocr-package\dist .

License

MIT. See LICENSE.
