Adaptive, modular OCR and document-AI pipeline orchestrator

DocuVision

Adaptive, modular document OCR & AI pipeline orchestrator. DocuVision detects your system's capabilities (CPU, GPU, VRAM, installed ML libraries) and dynamically picks the best feasible OCR and document-AI stack — from Tesseract on a 4 GB CPU box all the way up to TrOCR / Donut on a multi-GPU workstation.

Features

  • System profiler — OS, CPU, GPU, CUDA, VRAM, RAM, TensorRT, installed libs
  • Tiered engine selection — Tier 0 (CPU / Tesseract) → Tier 4 (GPU / TrOCR, Donut)
  • Unified OCR registry — Tesseract, EasyOCR, PaddleOCR (CPU/GPU), docTR, TrOCR, Donut
  • Text detection — CRAFT, DBNet, EAST, PaddleOCR-det + pure-OpenCV contour fallback
  • Clustering — DBSCAN / HDBSCAN bounding-box merging with anisotropic distance
  • Document classification — pluggable YOLO / ONNX / keyword backends, custom labels
  • PII masking — YOLO or regex backends, gaussian blur / black box / pixelation
  • Embossed / engraved OCR — CLAHE + Sobel + shape-from-shading + morphology, ideal for chassis numbers, industrial plates, metal serials
  • Graceful fallbacks — no hard deps on heavy libs, missing engines are skipped
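The clustering feature above can be illustrated with a small standard-library sketch. This is not DocuVision's implementation: `Box`, `anisotropic_distance`, `cluster_boxes`, `merge`, and the `x_scale`, `y_scale`, and `eps` values are all hypothetical names and numbers chosen for illustration. It uses plain single-linkage grouping (roughly what DBSCAN does with a minimum cluster size of 1) and weights vertical offsets more heavily than horizontal ones, so words on the same text line merge while separate lines stay apart.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box (hypothetical helper type)."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def center(self):
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)


def anisotropic_distance(a, b, x_scale=1.0, y_scale=4.0):
    """Centre-to-centre distance with vertical offsets weighted by y_scale."""
    (ax, ay), (bx, by) = a.center, b.center
    return math.hypot((ax - bx) * x_scale, (ay - by) * y_scale)


def cluster_boxes(boxes, eps=60.0):
    """Single-linkage grouping: boxes within eps of each other share a cluster."""
    parent = list(range(len(boxes)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if anisotropic_distance(boxes[i], boxes[j]) <= eps:
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    return list(groups.values())


def merge(group):
    """Smallest box that encloses every box in the group."""
    return Box(min(b.x0 for b in group), min(b.y0 for b in group),
               max(b.x1 for b in group), max(b.y1 for b in group))


words = [
    Box(10, 10, 50, 30),    # line 1, word 1
    Box(55, 10, 100, 30),   # line 1, word 2
    Box(10, 40, 60, 60),    # line 2, word 1
]
lines = sorted((merge(g) for g in cluster_boxes(words)), key=lambda b: b.y0)
for line in lines:
    print(line)
```

With these assumed weights, the two words on the first line are 47.5 units apart and merge, while the word on the second line is about 120 units away after vertical scaling and stays separate.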

Installation

# Core only — works with the pure-OpenCV detector but needs an OCR engine
# installed separately
pip install docuvision

# Common bundles
pip install "docuvision[tesseract]"   # + pytesseract (requires tesseract binary)
pip install "docuvision[easyocr]"     # + easyocr
pip install "docuvision[paddle-cpu]"  # + paddleocr + paddlepaddle (CPU)
pip install "docuvision[paddle-gpu]"  # + paddlepaddle-gpu
pip install "docuvision[doctr]"       # + python-doctr
pip install "docuvision[trocr]"       # + torch + transformers + TrOCR deps
pip install "docuvision[donut]"       # + torch + transformers + Donut deps

# Aggregate bundles
pip install "docuvision[cpu]"         # every CPU-capable engine
pip install "docuvision[gpu]"         # every engine, GPU-preferred
pip install "docuvision[all]"         # everything, library versions only

# Development
pip install "docuvision[dev]"

Heavy dependencies (torch, paddle, transformers…) are optional. DocuVision imports them lazily, so a missing library disables only the engines that depend on it; the rest of the pipeline keeps working.
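As a rough illustration of the lazy-import idea, the sketch below probes for optional libraries without importing them and defers the real import until an engine is used. It relies only on the standard library's `importlib`. The `ENGINE_MODULES` table and the function names are assumptions made for this example, not DocuVision's internals.

```python
import importlib
import importlib.util

# Hypothetical preference order: (engine label, importable module name).
ENGINE_MODULES = [
    ("trocr", "transformers"),
    ("easyocr", "easyocr"),
    ("tesseract", "pytesseract"),
]


def is_installed(module_name):
    """Return True if the module can be located, without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


def available_engines(candidates=ENGINE_MODULES):
    """Keep only the engines whose backing library is present."""
    return [label for label, module in candidates if is_installed(module)]


def load_engine_module(module_name):
    """Import the heavy library only when an engine is actually used."""
    return importlib.import_module(module_name)


print(available_engines())
```

Checking with `find_spec` keeps start-up cheap, because a large library such as torch is never loaded just to find out whether it exists.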

Quick start

from docuvision import DocumentPipeline, profile_system

# See what DocuVision picks for your system
print(profile_system().summary())

# Zero-config pipeline — uses the best available engine
pipe = DocumentPipeline()
result = pipe.run("invoice.png")
print(result.text)

# Full-featured pipeline
pipe = DocumentPipeline(
    detect_text=True,
    classify_doc=True,
    mask=True,
    embossed_mode=False,
    language=["en"],
)
result = pipe.run("aadhaar.jpg")
print(result.doc_class.label)  # → 'aadhaar'
print(result.text)
print(result.masks)            # detected PII regions
# result.masked_image is a numpy.ndarray ready for cv2.imwrite(...)

Command-line

docuvision profile                            # show the capability report
docuvision ocr invoice.png                    # default pipeline
docuvision ocr plate.jpg --embossed           # embossed / chassis OCR
docuvision ocr id.jpg --classify --mask --mask-method blackbox
docuvision engines                            # list registered OCR engines
docuvision detectors                          # list registered detectors

Design principles

  1. No hard deps on heavy libs. Every engine is imported lazily, so pip install docuvision installs cleanly even without torch, paddle, or transformers.
  2. Explicit tiering. The profiler scores your hardware and picks a tier; you can override it per call.
  3. Composable, not monolithic. Each stage (detect, classify, OCR, mask) is an independent, swappable component with a common ABC and a registry.
  4. Sane defaults, full control. DocumentPipeline().run(img) just works; every knob is exposed via PipelineConfig.
  5. Fails gracefully. A per-stage failure is captured in PipelineResult.metadata["failures"] rather than aborting the pipeline.
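Principles 3 and 5 can be pictured together in a minimal sketch: stages share an abstract base class, register themselves by name, and a failing stage is recorded instead of stopping the run. Everything here is hypothetical (`Stage`, `REGISTRY`, `register`, `run_pipeline`, and the shape of each failure entry). Only the `metadata["failures"]` key comes from the description above.

```python
from abc import ABC, abstractmethod

REGISTRY = {}  # hypothetical name -> stage class mapping


class Stage(ABC):
    """Common interface that every pipeline stage implements."""
    name = "stage"

    @abstractmethod
    def run(self, data: dict) -> dict:
        ...


def register(cls):
    """Class decorator that adds a stage to the registry under its name."""
    REGISTRY[cls.name] = cls
    return cls


@register
class Uppercase(Stage):
    name = "uppercase"

    def run(self, data):
        return {**data, "text": data["text"].upper()}


@register
class Broken(Stage):
    name = "broken"

    def run(self, data):
        raise RuntimeError("model weights missing")


def run_pipeline(stage_names, data):
    """Run stages in order; record failures instead of aborting."""
    metadata = {"failures": []}
    for name in stage_names:
        try:
            data = REGISTRY[name]().run(data)
        except Exception as exc:
            # Assumed entry shape: stage name plus the error message.
            metadata["failures"].append({"stage": name, "error": str(exc)})
    return data, metadata


result, meta = run_pipeline(["broken", "uppercase"], {"text": "invoice 42"})
print(result["text"])
print(meta["failures"])
```

The broken stage is skipped and logged, and the later stage still runs on the unchanged input, which is the behaviour the fifth principle describes.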

Testing

The full test suite runs against the core install only — heavy engines are not required:

pytest tests/
# or, without pytest installed:
python run_tests.py

License

Apache License 2.0 — see LICENSE.
