Torch-free optical character recognition for manga-focused Japanese text; a lightweight version of kha-white's manga-ocr

manga-ocr-torchless

A lightweight, torch-free version of the excellent manga-ocr by kha-white.

This package runs inference with ONNX Runtime instead of PyTorch. Dropping the multi-gigabyte PyTorch dependency makes it significantly faster to install and to run on machines without a GPU.

By default, this package uses mayocream's ONNX export of the original manga-ocr model. You can also use any other ONNX export based on manga-ocr, such as l0wgear's ONNX export of jzhang533/manga-ocr-base.

Features

  • Lightweight: No PyTorch dependency (~2GB saved).
  • Parity: Achieves 100% character-level parity with the original manga-ocr on its test suite.
  • Fast: Optimized for CPU inference using ONNX.
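
As a rough illustration of what a character-level parity figure can mean, the sketch below compares reference strings against candidate strings using edit distance. This is an assumption about how such a number might be computed, not the method used by this project; the function names `edit_distance` and `character_parity` are hypothetical.

```python
from typing import Iterable, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def character_parity(pairs: Iterable[Tuple[str, str]]) -> float:
    """Share of reference characters reproduced across (reference, candidate) pairs."""
    total_chars = 0
    total_edits = 0
    for reference, candidate in pairs:
        total_chars += len(reference)
        total_edits += edit_distance(reference, candidate)
    if total_chars == 0:
        return 1.0  # nothing to compare, so nothing differs
    return max(0.0, (total_chars - total_edits) / total_chars)
```

Under this definition, a value of 1.0 means every candidate string is identical to its reference, which is what a 100% figure would correspond to.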

Installation

pip install manga-ocr-torchless

Note: The required ONNX models (~400MB) will be automatically downloaded from HuggingFace on the first run, not during installation.

Usage

CLI

Process a single image:

manga_ocr path/to/image.jpg

Monitor clipboard (auto-OCR every time you copy an image):

manga_ocr -b

Watch a directory for new screenshots:

manga_ocr -d ./screenshots
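
If you need custom handling of new screenshots, a similar loop can be written around the Python API. The sketch below is only an illustration of directory polling, not how the CLI is implemented; `new_images`, `watch`, the suffix list, and the choice to skip files that already exist at startup are all assumptions made for this example.

```python
import time
from pathlib import Path
from typing import Callable, List, Optional, Set

# Assumed set of image extensions; adjust to the formats you actually use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def new_images(directory, seen: Set[Path]) -> List[Path]:
    """Return image files not seen before, sorted by name, and remember them."""
    found = sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES and p not in seen
    )
    seen.update(found)
    return found


def watch(directory, handle: Callable[[Path], None],
          interval: float = 0.5, max_polls: Optional[int] = None) -> None:
    """Poll a directory and call handle(path) for each newly added image."""
    seen: Set[Path] = set()
    new_images(directory, seen)  # skip files that exist before watching starts
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in new_images(directory, seen):
            handle(path)
        polls += 1
        time.sleep(interval)
```

A handler could open each path with PIL, pass the image to a `MangaOcr` instance, and print the result.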

Python API

from manga_ocr import MangaOcr
from PIL import Image

mocr = MangaOcr()
img = Image.open('image.jpg')
text = mocr(img)
print(text)
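
To run OCR over a whole folder, the same call can be wrapped in a loop. The following is a minimal sketch, assuming only that a `MangaOcr` instance is callable with a PIL image as shown above; `load_image`, `ocr_directory`, and the suffix list are hypothetical helpers, not part of the package.

```python
from pathlib import Path
from typing import Callable, Dict

# Assumed set of image extensions; adjust to the formats you actually use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def load_image(path):
    """Open an image with PIL and return an in-memory RGB copy."""
    from PIL import Image  # imported lazily so the helper below stays generic
    with Image.open(path) as img:
        return img.convert("RGB")  # forces a full read before the file closes


def ocr_directory(ocr: Callable, directory, load: Callable = load_image) -> Dict[str, str]:
    """Apply ocr(load(path)) to every image in directory, keyed by file name."""
    results: Dict[str, str] = {}
    paths = sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    for path in paths:
        results[path.name] = ocr(load(path))
    return results
```

Calling `ocr_directory(MangaOcr(), "./screenshots")` would then return a dictionary mapping each file name to its recognized text. Creating the `MangaOcr` instance once and reusing it avoids loading the model for every image.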

Custom Models

To use a different ONNX model, pass a HuggingFace repo ID or a local path to the constructor, or use the --model flag in the CLI:

Python:

mocr = MangaOcr(pretrained_model_name_or_path="user/repo-id")
# OR
mocr = MangaOcr(pretrained_model_name_or_path="./local_model_directory")

CLI:

manga_ocr --model "user/repo-id" path/to/image.jpg
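
If the model should be configurable without code changes, one option is to read it from the environment. This is a sketch only: the `MANGA_OCR_MODEL` variable and the `model_kwargs` helper are hypothetical names invented for this example, and only the `pretrained_model_name_or_path` argument comes from the package.

```python
import os
from typing import Dict, Mapping, Optional


def model_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build constructor keyword arguments from an optional environment override."""
    env = os.environ if env is None else env
    # MANGA_OCR_MODEL is a made-up variable name used for illustration.
    source = env.get("MANGA_OCR_MODEL", "").strip()
    if not source:
        return {}  # fall back to the package's default model
    # The value may be a repo ID such as "user/repo-id" or a local directory.
    return {"pretrained_model_name_or_path": source}
```

The result can be passed straight to the constructor as `MangaOcr(**model_kwargs())`, so an unset variable leaves the default model in place.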

Acknowledgments

This project is a direct port of manga-ocr by kha-white. All credit for the model architecture and training belongs to them. This version simply swaps the backend to ONNX for a leaner distribution.

License

Apache-2.0
