Fast CPU OCR: PaddleOCR PP-OCRv6 tiny (lightweight), reimplemented in Rust + ONNX Runtime. ~9× faster than PaddlePaddle on CPU, self-contained wheels with models bundled.

faster-paddle

Fast, CPU-only OCR in Rust with Python bindings — a self-contained reimplementation of PaddleOCR's PP-OCRv6 detection + recognition pipeline powered by ONNX Runtime.

  • ~9× faster than paddleocr on CPU for the same models and output (parallel detection pre/post-processing + a concurrent recognition session pool).
  • 📦 Self-contained — the tiny + small ONNX models are bundled inside the wheel. No paddlepaddle, no model downloads for tiny/small.
  • 🎚️ Three model sizes: tiny (default, fastest), small, and medium (higher accuracy; downloaded once on first use and cached).
  • 🦀 Pure-Rust pre/post-processing (detection DB decode, minAreaRect, perspective crop, CTC decode, reading-order text reconstruction). No OpenCV.
  • 🖥️ Prebuilt wheels for Linux (x86-64, arm64), Windows (x86-64), and macOS (arm64).
paddleocr (PaddlePaddle, CPU)        22.7 s / image
faster-paddle (Rust + ONNXRuntime)    2.5 s / image     →  ~9× faster

(test image 3157×4464, AMD Ryzen 7 5800X3D; both after warm-up, same weights.)
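A measurement like the one above can be reproduced with a small timing harness. The sketch below is not the project's benchmark script; `time_after_warmup` and `speedup` are hypothetical helper names, and the only figures used are the two per-image timings reported above.

```python
import time


def time_after_warmup(fn, warmup=1, runs=3):
    """Mean wall-clock seconds per call of fn, excluding the warm-up calls."""
    for _ in range(warmup):
        fn()  # warm-up: model loading, caches, thread pools
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# Per-image timings reported above: 22.7 s (paddleocr) vs 2.5 s (faster-paddle).
print(f"{speedup(22.7, 2.5):.1f}x")  # prints 9.1x
```

To time a real run, pass a zero-argument callable that performs one OCR call on the same image for each library.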


Install

pip install faster-paddle

Usage

import faster_paddle

# One-shot, using a shared default engine (lazily initialized):
with open("document.jpg", "rb") as f:
    result = faster_paddle.ocr(f.read())

print(result["text"])              # reading-order reconstructed text
for idx, b in result["bounds"].items():
    print(idx, b["text"], b["confidence"], b["topLeftCoord"], b["bottomRightCoord"])

Reuse an explicit engine (recommended for servers — load the models once):

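import base64
from faster_paddle import OcrEngine

# model_size: "tiny" (default), "small", or "medium"
engine = OcrEngine(model_size="tiny", threads=None, rec_batch=6)

# Load the image once; "document.jpg" is the same example file as above.
with open("document.jpg", "rb") as f:
    image_bytes = f.read()
b64_string = base64.b64encode(image_bytes).decode("ascii")

result = engine.ocr(image_bytes)                 # raw jpeg/png/webp/bmp/tiff/gif bytes
result = engine.ocr_base64(b64_string)           # base64-encoded image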

Optional preprocessing

ocr / ocr_base64 take four optional flags (all default False). The enabled ones are always applied in the same fixed order (resize, denoise, deskew, binarize), and each step runs in parallel Rust:

result = engine.ocr(
    image_bytes,
    resize=True,     # 1. downscale to ≤ 2100×3000 (aspect preserved) if larger
    denoise=True,    # 2. fast Non-Local-Means denoise (grayscale)
    deskew=True,     # 3. detect skew (Canny + Hough) and rotate to straighten
    binarize=True,   # 4. Sauvola adaptive thresholding (clean black/white)
)

Order rationale: resize first (everything downstream is then faster), denoise before angle detection and thresholding, deskew on the cleaned image, binarize last to produce the final B/W. Enabling resize typically makes OCR faster overall (less detector work). Any of denoise/deskew/binarize converts the image to grayscale.
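The resize step can be illustrated with a few lines of arithmetic. This is a minimal sketch, not the library's code: `fit_within` is a hypothetical helper, and it assumes that 2100 is the maximum width, 3000 the maximum height, and that dimensions are rounded to the nearest pixel.

```python
def fit_within(width, height, max_w=2100, max_h=3000):
    """Return (new_width, new_height, scale) for an aspect-preserving downscale.

    Images that already fit are returned unchanged (scale 1.0); nothing is upscaled.
    """
    scale = min(max_w / width, max_h / height, 1.0)
    if scale == 1.0:
        return width, height, 1.0
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    return new_w, new_h, scale


# The 3157x4464 benchmark image is limited by its width.
print(fit_within(3157, 4464)[:2])  # prints (2100, 2969)
```

Under these assumptions the benchmark image shrinks to about 44% of its original pixel count, which is why enabling resize tends to reduce detector work.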

Returned bounds are always in the original image's coordinate space — even when resize or deskew changes the working image, the boxes are mapped back so they line up with your input.
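For the resize case, mapping a box back amounts to dividing its coordinates by the scale factor that was applied. The sketch below assumes a single uniform scale and axis-aligned boxes; `to_original` is a hypothetical name, and undoing a deskew rotation (not shown) would additionally require the inverse rotation.

```python
def to_original(box, scale):
    """Map (x1, y1, x2, y2) from the resized image back to original pixels.

    scale is the factor that was applied to the original (for example 0.5
    when the working image is half the original size).
    """
    return tuple(round(v / scale) for v in box)


# A box found on a half-size working image, expressed in original pixels.
print(to_original((100, 200, 300, 400), 0.5))  # prints (200, 400, 600, 800)
```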

Model sizes

size     bundled        det+rec size   notes
tiny     ✅ yes         ~6 MB          default, fastest, lightweight
small    ✅ yes         ~31 MB         better accuracy
medium   ⬇️ on demand   ~138 MB        best accuracy; downloaded once from the GitHub release and cached under your user cache dir

tiny and small are embedded in the wheel (offline). medium exceeds PyPI's file-size limit, so the first OcrEngine(model_size="medium") downloads it once (needs network that time only) and caches it for subsequent runs.
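The download-once behaviour follows a common cache-on-first-use pattern. The sketch below shows that pattern only; it is not the library's implementation. `ensure_cached` and the `fetch` callable are hypothetical, and the real cache location and release URL are not stated here.

```python
from pathlib import Path


def ensure_cached(cache_path, fetch):
    """Return cache_path, calling fetch() -> bytes only when the file is missing."""
    cache_path = Path(cache_path)
    if not cache_path.exists():
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary name first so an interrupted download
        # never leaves a file that looks complete.
        partial = cache_path.with_suffix(cache_path.suffix + ".part")
        partial.write_bytes(fetch())
        partial.replace(cache_path)
    return cache_path
```

A second call with the same path returns immediately without invoking `fetch`, which matches the "needs network that time only" behaviour described above.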

Result shape

{
  "text": "full reconstructed text...",
  "structured_text": "layout-preserving text (see below)",
  "bounds": {
     0: {
        "topLeftCoord":     (x1, y1),
        "bottomRightCoord": (x2, y2),
        "text":             "line text",
        "confidence":       0.97,
     },
     1: { ... },
  }
}

text and bounds match the JSON contract of the original paddle-ocr-api service, so it is a drop-in replacement.
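As a usage illustration of this shape, the sketch below filters lines by confidence and derives each box's size. `confident_lines` is a hypothetical helper and `sample` is made-up data in the documented shape, not real OCR output.

```python
def confident_lines(result, min_confidence=0.9):
    """Return (index, text, width, height) for lines at or above min_confidence."""
    rows = []
    for idx in sorted(result["bounds"]):
        b = result["bounds"][idx]
        if b["confidence"] < min_confidence:
            continue
        (x1, y1), (x2, y2) = b["topLeftCoord"], b["bottomRightCoord"]
        rows.append((idx, b["text"], x2 - x1, y2 - y1))
    return rows


# Made-up result in the documented shape.
sample = {
    "text": "Invoice\nTotal",
    "structured_text": "Invoice\nTotal",
    "bounds": {
        0: {"topLeftCoord": (10, 10), "bottomRightCoord": (110, 40),
            "text": "Invoice", "confidence": 0.97},
        1: {"topLeftCoord": (10, 50), "bottomRightCoord": (70, 80),
            "text": "Total", "confidence": 0.62},
    },
}
print(confident_lines(sample))  # prints [(0, 'Invoice', 100, 30)]
```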

structured_text

A spatial reconstruction that reads left-to-right, top-to-bottom while preserving the visual layout. Vertical whitespace gaps split the page into columns/panes, and each pane is read fully before the next. Within each pane, rows are laid out on a monospace grid, so indentation (tree nesting) and aligned sub-columns (key/value tables) are kept. Single-glyph UI icon noise is dropped.

Use structured_text for screenshots, forms, table/tree UIs, and code — anything where spatial structure carries meaning. Use text for dense multi-column prose: there the absolute pixel spacing of structured_text produces very wide lines, so the column-merging text reconstruction reads better. Both are always returned, so you can pick per use case.

Example structured_text for a two-pane database UI:

PNS
 Collections (11)
   System
   CAGED
   IPCMAPS_MUNICIPIO
 Functions
 Users

Key                                                Value
        OUTRAS_DESPESAS_POTENCIAL_DE_CONSUMO_EM... 7332964
        TOTAL_DO_CONSUMO_URBANO_E_RURAL            613855113
        CD_MUNI_IBGE                               1100015
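The monospace-grid idea can be sketched in a few lines. This is an illustration of the general technique, not the library's algorithm: `layout_grid` is a hypothetical function, and the fixed `char_w` and `line_h` cell sizes are assumptions (a real implementation would presumably estimate them from the boxes and would also handle pane splitting, which is omitted here).

```python
def layout_grid(boxes, char_w=10, line_h=20):
    """Place (x, y, text) boxes on a character grid and return the text block.

    x and y are top-left pixel coordinates. Each box starts at column
    x // char_w on row round(y / line_h); overlapping boxes are pushed
    right so that at least one space separates them.
    """
    rows = {}
    for x, y, text in boxes:
        rows.setdefault(round(y / line_h), []).append((x, text))
    lines = []
    for row in sorted(rows):
        line = ""
        for x, text in sorted(rows[row]):
            col = max(x // char_w, len(line) + 1 if line else 0)
            line = line.ljust(col) + text
        lines.append(line)
    return "\n".join(lines)


print(layout_grid([
    (0, 0, "PNS"),
    (10, 20, "Collections (11)"),
    (30, 40, "System"),
    (0, 60, "Key"),
    (200, 60, "Value"),
]))
```

Indentation falls out of the x offsets (the nested tree entries) and aligned sub-columns fall out of shared x positions (the Key/Value pair).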

API

faster_paddle.ocr(image, resize=False, denoise=False, deskew=False, binarize=False) -> dict
    OCR encoded image bytes (shared default engine).
faster_paddle.ocr_base64(image_base64, resize=False, denoise=False, deskew=False, binarize=False) -> dict
    OCR a base64 image string (shared default engine).
OcrEngine(model_size="tiny", threads=None, rec_batch=None)
    Construct a reusable engine.
OcrEngine.ocr(image, resize=False, denoise=False, deskew=False, binarize=False) -> dict
    OCR encoded image bytes.
OcrEngine.ocr_base64(image_base64, resize=False, denoise=False, deskew=False, binarize=False) -> dict
    OCR a base64 image string.

  • resize/denoise/deskew/binarize: optional preprocessing (see above).
  • model_size: "tiny" (default), "small", or "medium".
  • threads: ONNX Runtime intra-op threads. Defaults to the number of physical CPU cores (SMT/logical threads tend to slow compute-bound inference down).
  • rec_batch: recognition batch size. None (the signature default) selects the built-in default of 6.

Calls are thread-safe (serialized internally) and release the GIL during inference.


How it works

The pipeline faithfully mirrors PaddleOCR's lightweight path:

  1. Detection — resize (min-side 736, clamp max-side 4000, round to ×32), normalize (BGR mean/std), run the DB detector.
  2. DB post-process — threshold 0.2, connected components, minAreaRect, box score ≥ 0.4, unclip ratio 1.4, rescale to source coordinates.
  3. Sort boxes top-to-bottom / left-to-right; crop each via perspective warp.
  4. Recognition — resize each crop to H=48, normalize, batch, run the CTC recognizer ([N, T, 6906]), greedy CTC decode.
  5. Reconstruct reading-order text with dynamic column/line detection.
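Two of the steps above are simple enough to sketch in pure Python: the detection input size from step 1 and the greedy CTC decode from step 4. These are illustrations under stated assumptions, not the library's code. `det_target_size` assumes the image is only upscaled when its shorter side is below 736 (as in PaddleOCR's "min" limit type), and `ctc_greedy_decode` assumes the blank class is index 0 and that `charset` is indexed by class id; both function names are hypothetical.

```python
def det_target_size(width, height, min_side=736, max_side=4000, multiple=32):
    """Detector input size: min-side rule, max-side clamp, snap to a multiple."""
    scale = 1.0
    if min(width, height) < min_side:
        scale = min_side / min(width, height)
    if max(width, height) * scale > max_side:
        scale = max_side / max(width, height)

    def snap(v):
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, int(round(v * scale / multiple)) * multiple)

    return snap(width), snap(height)


def ctc_greedy_decode(step_probs, charset, blank=0):
    """Greedy CTC: argmax per time step, collapse repeats, drop blanks.

    Returns (text, mean probability of the emitted characters).
    """
    chars, confs, prev = [], [], blank
    for probs in step_probs:
        idx = max(range(len(probs)), key=probs.__getitem__)
        if idx != blank and idx != prev:
            chars.append(charset[idx])
            confs.append(probs[idx])
        prev = idx
    confidence = sum(confs) / len(confs) if confs else 0.0
    return "".join(chars), confidence


print(det_target_size(3157, 4464))  # prints (2816, 4000)
```

In the decode, a blank between two identical classes is what allows a doubled letter to survive the repeat-collapsing step.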

Against PaddlePaddle, 96% of detected boxes match at IoU > 0.5, and the recognized text has 0.93 character-level similarity; the residual difference comes from ONNX Runtime vs PaddlePaddle floating-point numerics, not from the algorithm.
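A box-agreement figure of this kind is typically computed from intersection-over-union. The sketch below shows one plausible way to do it for axis-aligned boxes; how the figure above was actually measured is not stated, and `iou` and `match_rate` are hypothetical helpers.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def match_rate(reference, candidate, threshold=0.5):
    """Fraction of reference boxes that some candidate box overlaps above threshold."""
    if not reference:
        return 0.0
    hits = sum(
        1 for r in reference if any(iou(r, c) > threshold for c in candidate)
    )
    return hits / len(reference)


# One of the two reference boxes has a close counterpart.
print(match_rate([(0, 0, 10, 10), (20, 20, 30, 30)], [(1, 0, 11, 10)]))  # prints 0.5
```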

The bundled tiny models are PP-OCRv6_tiny_det and PP-OCRv6_tiny_rec, exported with paddle2onnx.

Building from source

pip install maturin
maturin develop --release      # build + install into the current environment
# or
maturin build --release        # produce a wheel in target/wheels/

Requires a Rust toolchain. ONNX Runtime is fetched automatically by the ort crate at build time and linked into the extension.

Tests

cargo test --release                 # Rust unit tests (geometry, resize, CTC)
maturin develop --release            # then the Python integration tests:
python faster_paddle/tests/test_integration.py

The integration tests check the result shape, known-text detection, that the recognition session pool is deterministic, that bounds map back to original coordinates after resize, that all preprocessing options run, and a speed regression guard.

License

MIT


Download files

Download the file for your platform.

Source Distribution

faster_paddle-0.0.6.tar.gz (32.3 MB)

Uploaded: Source

Built Distributions

faster_paddle-0.0.6-cp38-abi3-win_amd64.whl (72.9 MB)

Uploaded: CPython 3.8+, Windows x86-64

faster_paddle-0.0.6-cp38-abi3-manylinux_2_34_x86_64.whl (73.7 MB)

Uploaded: CPython 3.8+, manylinux (glibc 2.34+), x86-64

faster_paddle-0.0.6-cp38-abi3-manylinux_2_28_aarch64.whl (74.6 MB)

Uploaded: CPython 3.8+, manylinux (glibc 2.28+), ARM64

faster_paddle-0.0.6-cp38-abi3-macosx_11_0_arm64.whl (72.8 MB)

Uploaded: CPython 3.8+, macOS 11.0+, ARM64

File details

Details for the file faster_paddle-0.0.6.tar.gz.

File metadata

  • Download URL: faster_paddle-0.0.6.tar.gz
  • Size: 32.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.14.1

File hashes

Hashes for faster_paddle-0.0.6.tar.gz
Algorithm Hash digest
SHA256 fd6cdbd199176697b5c1c8c2aea7d6c05c1369729f9c896cc05e323bd7151776
MD5 8ccbfaba75730e8d9fedd0d40313c28d
BLAKE2b-256 ae06b8afba995a2aa732774b1ac3e9c35517a602ab16acee5e0725ca3d3aa903

See more details on using hashes here.

File details

Details for the file faster_paddle-0.0.6-cp38-abi3-win_amd64.whl.

File hashes

Hashes for faster_paddle-0.0.6-cp38-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 2fe650db57bbd2db57400571ae147cb3647532c3fde8c79bce58b69d0e2c8492
MD5 ffcd4eb2705281d864e6d206eb57c1a3
BLAKE2b-256 da8cadf191a9e9b1fb578bdc94b2bf8b08d2779ddf7d21eb03d9b97be04526b9

File details

Details for the file faster_paddle-0.0.6-cp38-abi3-manylinux_2_34_x86_64.whl.

File hashes

Hashes for faster_paddle-0.0.6-cp38-abi3-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 b52b306a00ca685368cc3c46fb4685c41516bfff5a68656145411014bea6746c
MD5 010be091fbce11d4b57810e368b68c0a
BLAKE2b-256 aa391d7bc5583c22f6c84383051b2c8aa2f85bf3c69a18456930938d446505be

File details

Details for the file faster_paddle-0.0.6-cp38-abi3-manylinux_2_28_aarch64.whl.

File hashes

Hashes for faster_paddle-0.0.6-cp38-abi3-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 d8c41c5d849282c30e2210b542e1949b4984ca8d10dae8169d179b0e64a9d62d
MD5 c3db274dafa0c9bdcb39e79fcedf3050
BLAKE2b-256 71a3be24d9c0a7a634b98daa8f8d1a607e5e863eb951ec151c23687c95efbba6

File details

Details for the file faster_paddle-0.0.6-cp38-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for faster_paddle-0.0.6-cp38-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 f7e56ec232ecadae8362d41cfc6d65f87f2c95cbf4cdd89db955c7644e277b28
MD5 fc21734b5ad1dabbdb1742ada0ee20c1
BLAKE2b-256 c61a48e5dc5237631b6c416a918439654049f7e4583951f158497a5cec7dcd9c
