# ppocr-lite

Lightweight PP-OCR runtime – ONNX only, no OpenCV, no heavy frameworks.

A lightweight PaddlePaddle-OCR runtime for images like screenshots.
| Dependency | Role |
|---|---|
| numpy | All numerical computation |
| Pillow | Image I/O and resize |
| onnxruntime | Model inference |
| scipy (optional) | Faster connected-component labelling |
No OpenCV, no deep-learning framework, no utility libraries.
## Install

```shell
pip install ppocr-lite        # CPU
pip install ppocr-lite[gpu]   # GPU (uses onnxruntime-gpu)
pip install ppocr-lite[fast]  # + scipy for faster CC labelling
```
Models (PP-OCRv5 mobile det/rec + a v2 direction classifier) are auto-downloaded to `~/.cache/ppocr_lite/` on first use, or can be downloaded and configured manually.

The automatically downloaded models come from RapidOCR and are fetched from their Hugging Face repository (see there for details). To download models manually, see that repository – you'll need one `det.onnx` (for text detection), one `rec.onnx` (for text recognition) and the corresponding `dict.txt` (the model-output-to-character mapping). The mobile (= smaller) models shipped by OnnxOCR also work quite well.
## Quick Start

```python
from ppocr_lite import PPOCRLite

ocr_engine = PPOCRLite()

for result in ocr_engine.run("screenshot.png"):
    print(f"{result.score:.2f} {result.text}")
    # result.box is a np.ndarray of shape (4, 2):
    # top-left, top-right, bottom-right, bottom-left
```
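Since `result.box` is a plain `(4, 2)` numpy array in the corner order documented above, downstream geometry needs nothing beyond numpy. A small sketch (the box coordinates here are made up for illustration):

```python
import numpy as np

# A made-up detection box: top-left, top-right, bottom-right, bottom-left.
box = np.array([[10, 20], [110, 20], [110, 50], [10, 50]], dtype=np.float32)

center = box.mean(axis=0)                # centroid of the quad
width = np.linalg.norm(box[1] - box[0])  # length of the top edge
height = np.linalg.norm(box[3] - box[0]) # length of the left edge

print(center, width, height)  # [60. 35.] 100.0 30.0
```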
## Use Your Own Models

```python
from pathlib import Path

from ppocr_lite import PPOCRLite, ModelConfig

ocr_engine = PPOCRLite(
    ModelConfig(
        det_model=Path("models/PP-OCRv5/det.onnx"),
        rec_model=Path("models/PP-OCRv5/rec.onnx"),
        dict_path=Path("models/PP-OCRv5/dict.txt"),
        cls_model=False,  # skip direction classifier
    )
)
```
## GPU inference

```python
ocr_engine = PPOCRLite(providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
```
## Manage Downloaded Models

A few utility functions are available to configure where models are downloaded from and to:

```python
from ppocr_lite import models

models.set_cache_directory("./my-cache-dir")
models.get_cache_directory()  # -> pathlib.Path
models.list_downloaded_models()  # -> list[pathlib.Path]
models.download_default_models()
models.download_model("https://huggingface.co/me/my-repo/resolve/main/my-model.onnx?download=true")
```
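Conceptually, the cache is a directory of model files, so listing downloaded models amounts to globbing for them. The helper below is an illustrative stand-in, not ppocr-lite's implementation, and it assumes a flat cache layout (the real layout may differ):

```python
import tempfile
from pathlib import Path

# Illustrative stand-in for list_downloaded_models(): assuming the cache is a
# flat directory of model files, listing models is just a glob.
def list_onnx_models(cache_dir: Path) -> list[Path]:
    return sorted(cache_dir.glob("*.onnx"))

cache = Path(tempfile.mkdtemp())
(cache / "det.onnx").touch()
(cache / "rec.onnx").touch()

print([p.name for p in list_onnx_models(cache)])  # ['det.onnx', 'rec.onnx']
```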
Of course, you are entirely free not to use the built-in model management functionality and instead do everything yourself – just configure your engine at initialization as described above.
## Optimized Path to Check Whether Text is Present

To efficiently check whether certain texts are present in the image, use this function:

```python
res_first_text, res_second_text = ocr_engine.check_contains(
    "./my-screenshot.png",
    # Phrases to look for:
    ["This is some text", "some other text"],
    # Optionally, position hints can speed up the search by recognizing text
    # close to them first; on images with much text, this can be a big boost:
    position_hints=[
        (0.5, 0.5),
        (0.5, 0.6),
    ],
    # Controls how far text may be from any given position hint. Text further
    # away than this distance is ignored; it essentially tells the engine how
    # precise your hints are. The value is relative to the shorter image
    # side (0 - 1.0):
    position_max_dist=0.3,
    # Fuzzy matching is supported; set to zero to disable:
    fuzzy_match_min_similarity=0.8,
)
```
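To make the two thresholds concrete, here is an illustrative sketch of how a normalized hint distance and a fuzzy similarity could be computed. This is not ppocr-lite's actual implementation – the library's similarity metric is unspecified, and `difflib` is used here purely as a stand-in:

```python
import difflib

import numpy as np

def hint_distance(box_center, hint, image_size):
    """Distance of a box center (pixels) to a relative (x, y) hint,
    normalized by the shorter image side, as position_max_dist is."""
    w, h = image_size
    hint_px = np.array([hint[0] * w, hint[1] * h])
    return np.linalg.norm(np.asarray(box_center, dtype=float) - hint_px) / min(w, h)

def fuzzy_similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; compare against fuzzy_match_min_similarity."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()

# A box at the hint has distance 0; a box in the corner is far beyond 0.3.
print(hint_distance((960, 540), (0.5, 0.5), (1920, 1080)))  # 0.0
print(fuzzy_similarity("This is some text", "This is som text"))
```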
## Design Notes
This project is very similar to the excellent RapidOCR, but more lightweight. Notably, it does not depend on OpenCV (which weighs around 200MB) and uses numpy-based alternatives instead. This does not hurt performance much, at least in my humble tests.
Please be aware that many of those numpy-based alternatives are only really feasible because this project assumes non-distorted input images (screenshots, clean document scans, …). I have not tested it, but I'd assume it doesn't work nearly as well on inputs like perspective-distorted real-world photographs.
### What's different here?

- Detection post-processing – contour finding is replaced with `scipy.ndimage.label` (or a numpy fallback). The minimum-area rectangle is simplified under the assumption of non-perspective-distorted input. Polygon offset ("unclip") is done analytically using the area/perimeter ratio and a per-vertex outward push – accurate enough for near-rectangular screenshot text.
- Resize – PIL `BILINEAR` instead of `cv2.resize`. The two are numerically equivalent at the precision OCR requires.
- Crop – an axis-aligned bounding-rect crop instead of a perspective warp. Screenshot text is always axis-aligned, making this lossless.
- No config YAML, no omegaconf – plain Python dataclasses.
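The analytic "unclip" mentioned above follows the DBNet convention: each detected polygon is expanded outward by a distance proportional to its area/perimeter ratio. A minimal sketch for the axis-aligned rectangle case (the `unclip_ratio` default of 1.5 is the usual DBNet value, assumed here, not read from ppocr-lite's source):

```python
import numpy as np

def unclip_rect(box: np.ndarray, unclip_ratio: float = 1.5) -> np.ndarray:
    """Expand an axis-aligned rectangle (shape (4, 2); TL, TR, BR, BL) outward
    by distance = area * unclip_ratio / perimeter, as in DBNet post-processing."""
    w = box[1, 0] - box[0, 0]
    h = box[3, 1] - box[0, 1]
    d = (w * h) * unclip_ratio / (2 * (w + h))
    # Push each vertex away from the center along both axes.
    offsets = np.array([[-d, -d], [d, -d], [d, d], [-d, d]])
    return box + offsets

box = np.array([[0, 0], [100, 0], [100, 30], [0, 30]], dtype=np.float64)
print(unclip_rect(box))
```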
## Limitations vs. full PaddleOCR
- No perspective correction
- Direction classifier is only a 0°/180° binary; no 90°/270° support.
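The 0°/180° classifier's job reduces to deciding whether a line crop should be rotated by 180° before recognition. Sketched in numpy (the score threshold and function name here are illustrative, not ppocr-lite's actual parameters):

```python
import numpy as np

def maybe_flip(crop: np.ndarray, p_flipped: float, threshold: float = 0.9) -> np.ndarray:
    """Rotate a text-line crop by 180° when the classifier is confident it is
    upside down. p_flipped stands in for the classifier's score for the 180° class."""
    if p_flipped >= threshold:
        return crop[::-1, ::-1]  # 180° rotation = flip both axes
    return crop

crop = np.arange(6).reshape(2, 3)
print(maybe_flip(crop, p_flipped=0.95))
# [[5 4 3]
#  [2 1 0]]
```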
## License

This project is GPL-3.0-or-later licensed. Note that the licenses of the models (whether provided by you or auto-downloaded) will likely differ; refer to their creators for more information.
## File details

Details for the file `ppocr_lite-0.5.1.tar.gz`.

### File metadata

- Download URL: ppocr_lite-0.5.1.tar.gz
- Size: 32.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `162284d8ff08b43ff481ae4f23e00ba36c3a686a9c1abcfaf3db8d4feb3b83b6` |
| MD5 | `39487971b5b694d9ad9d9909bccb298e` |
| BLAKE2b-256 | `e122f8c19f562c81f7a475a35c4774530fbdaeaebb607fb76864c763a7c631d5` |
### Provenance

The following attestation bundles were made for `ppocr_lite-0.5.1.tar.gz`:

Publisher: `python-publish.yml` on mityax/ppocr_lite

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ppocr_lite-0.5.1.tar.gz
- Subject digest: `162284d8ff08b43ff481ae4f23e00ba36c3a686a9c1abcfaf3db8d4feb3b83b6`
- Sigstore transparency entry: 1239401572
- Permalink: mityax/ppocr_lite@c6c858a0ab8b5bc9953ee680e9044bbcb1e48110
- Branch / Tag: refs/tags/v0.5.1
- Owner: https://github.com/mityax
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@c6c858a0ab8b5bc9953ee680e9044bbcb1e48110
- Trigger Event: release
## File details

Details for the file `ppocr_lite-0.5.1-py3-none-any.whl`.

### File metadata

- Download URL: ppocr_lite-0.5.1-py3-none-any.whl
- Size: 33.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `e6b5f9dd5f128bb3b45856153fb109355896f4dab7c33728e2132d9d903ec884` |
| MD5 | `3b06fb3f136da977dd1c73d3a374163c` |
| BLAKE2b-256 | `87a0a3de6d54ddee7844aad4e6eea65c785c99311e150d3cdeefc0ea9ae6b717` |
### Provenance

The following attestation bundles were made for `ppocr_lite-0.5.1-py3-none-any.whl`:

Publisher: `python-publish.yml` on mityax/ppocr_lite

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ppocr_lite-0.5.1-py3-none-any.whl
- Subject digest: `e6b5f9dd5f128bb3b45856153fb109355896f4dab7c33728e2132d9d903ec884`
- Sigstore transparency entry: 1239401578
- Permalink: mityax/ppocr_lite@c6c858a0ab8b5bc9953ee680e9044bbcb1e48110
- Branch / Tag: refs/tags/v0.5.1
- Owner: https://github.com/mityax
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@c6c858a0ab8b5bc9953ee680e9044bbcb1e48110
- Trigger Event: release