Torchless optical character recognition (OCR) for Japanese text in manga: a lightweight version of kha-white's manga-ocr
manga-ocr-torchless
A lightweight, torch-free version of the excellent manga-ocr by kha-white.
This package runs inference with ONNX Runtime instead of PyTorch. Dropping the multi-gigabyte PyTorch dependency makes it significantly faster to install and run on machines without a GPU.
By default, this package uses mayocream's ONNX export of the original manga-ocr model, but you can use any ONNX export based on manga-ocr, such as l0wgear's ONNX export of jzhang533/manga-ocr-base.
Features
- Lightweight: No PyTorch dependency (~2GB saved).
- Parity: Achieves 100% character-level parity on the original test suite.
- Fast: Optimized for CPU inference using ONNX.
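To make the parity claim above concrete, here is a minimal sketch of one way a character-level comparison between two OCR outputs could be computed. The function name `char_parity` and the choice of `difflib.SequenceMatcher` are assumptions made for illustration; the README does not say which metric or tooling the author actually used.

```python
from difflib import SequenceMatcher


def char_parity(reference: str, candidate: str) -> float:
    """Return matched characters divided by the longer string's length.

    Hypothetical helper: 1.0 means the two strings are identical.
    """
    if not reference and not candidate:
        return 1.0  # two empty outputs agree trivially
    # autojunk=False keeps frequent characters from being ignored
    matcher = SequenceMatcher(None, reference, candidate, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / max(len(reference), len(candidate))


# Identical outputs score 1.0; one wrong character out of four scores 0.75.
print(char_parity("こんにちは", "こんにちは"))
print(char_parity("abcd", "abxd"))
```

Under this definition, 100% parity on a test suite would mean every image yields a score of 1.0 when the ONNX output is compared against the PyTorch output.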
Installation
pip install manga-ocr-torchless
Note: The required ONNX models (~400MB) will be automatically downloaded from HuggingFace on the first run, not during installation.
Usage
CLI
Process a single image:
manga_ocr path/to/image.jpg
Monitor clipboard (auto-OCR every time you copy an image):
manga_ocr -b
Watch a directory for new screenshots:
manga_ocr -d ./screenshots
Python API
from manga_ocr import MangaOcr
from PIL import Image
mocr = MangaOcr()
img = Image.open('image.jpg')
text = mocr(img)
print(text)
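When processing many images, it usually makes sense to create the `MangaOcr` instance once and reuse it. The sketch below shows one way to do that for a whole directory. The names `ocr_directory`, `run_with_manga_ocr`, and `IMAGE_SUFFIXES` are hypothetical and not part of this package; only `MangaOcr()` and calling it with a PIL image come from the example above.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to your files.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def ocr_directory(directory, ocr):
    """Apply `ocr` (any callable taking a Path) to each image in `directory`.

    Returns a dict mapping file name to recognized text, in name order.
    """
    results = {}
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            results[path.name] = ocr(path)
    return results


def run_with_manga_ocr(directory):
    """Wire ocr_directory to MangaOcr; imports are deferred until called."""
    from manga_ocr import MangaOcr
    from PIL import Image

    mocr = MangaOcr()  # load the model once, reuse for every image

    def ocr(path):
        with Image.open(path) as img:
            return mocr(img)

    return ocr_directory(directory, ocr)
```

Calling `run_with_manga_ocr("./screenshots")` would then return the text for every image in that folder. Passing the OCR step in as a callable keeps the directory walk testable without loading the model.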
Custom Models
To use a different ONNX model, pass a HuggingFace repo ID or a local path either to the constructor (Python) or to the --model flag (CLI):
Python:
mocr = MangaOcr(pretrained_model_name_or_path="user/repo-id")
# OR
mocr = MangaOcr(pretrained_model_name_or_path="./local_model_directory")
CLI:
manga_ocr --model "user/repo-id" path/to/image.jpg
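If you want to switch models without editing code, one option is to read the model identifier from an environment variable and fall back to the package default when it is unset. This is only a sketch: the variable name `MANGA_OCR_MODEL` and the helper `model_kwargs` are assumptions for illustration, not something the package defines. Only the `pretrained_model_name_or_path` argument comes from the examples above.

```python
import os


def model_kwargs(env=None, var="MANGA_OCR_MODEL"):
    """Build constructor keyword arguments from an environment variable.

    Returns an empty dict when the variable is unset or blank, so the
    package's default model is used.
    """
    env = os.environ if env is None else env
    value = env.get(var, "").strip()
    if not value:
        return {}
    # The value may be a HuggingFace repo ID or a local directory path.
    return {"pretrained_model_name_or_path": value}


# Intended use (requires manga-ocr-torchless to be installed):
#   from manga_ocr import MangaOcr
#   mocr = MangaOcr(**model_kwargs())
```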
Acknowledgments
This project is a direct port of manga-ocr by kha-white. All credit for the model architecture and training belongs to them. This version simply swaps the backend to ONNX for a leaner distribution.
License
Apache-2.0
File details
Details for the file manga_ocr_torchless-0.1.0.tar.gz.
File metadata
- Download URL: manga_ocr_torchless-0.1.0.tar.gz
- Upload date:
- Size: 9.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 619e291fa0052706a2e84e1aeaea4a08973b45f439c2532ca67cd4f665e8a582 |
| MD5 | 17fe0774b6befaa47af5e95c0920f2f0 |
| BLAKE2b-256 | cc3c1c5f12c8cb0eb8483e711b116ce87710afb7f1156f4744d8f679d26bdcc7 |
File details
Details for the file manga_ocr_torchless-0.1.0-py3-none-any.whl.
File metadata
- Download URL: manga_ocr_torchless-0.1.0-py3-none-any.whl
- Upload date:
- Size: 10.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | db48749c5dfe4f2359b954ae624d10a23d4868be70a553d5e29414f88b891bc0 |
| MD5 | 3d6052740e94e8b1d6842cc641af4406 |
| BLAKE2b-256 | 3e02b84cc84ecbe58eed54b93752f478179a85562694a143b1b957b981a215b4 |