Adaptive, modular OCR and document-AI pipeline orchestrator
DocuVision
Adaptive, modular document OCR & AI pipeline orchestrator. DocuVision detects your system's capabilities (CPU, GPU, VRAM, installed ML libraries) and dynamically picks the best feasible OCR and document-AI stack — from Tesseract on a 4 GB CPU box all the way up to TrOCR / Donut on a multi-GPU workstation.
Features
- System profiler — OS, CPU, GPU, CUDA, VRAM, RAM, TensorRT, installed libs
- Tiered engine selection — Tier 0 (CPU / Tesseract) → Tier 4 (GPU / TrOCR, Donut)
- Unified OCR registry — Tesseract, EasyOCR, PaddleOCR (CPU/GPU), docTR, TrOCR, Donut
- Text detection — CRAFT, DBNet, EAST, PaddleOCR-det + pure-OpenCV contour fallback
- Clustering — DBSCAN / HDBSCAN bounding-box merging with anisotropic distance
- Document classification — pluggable YOLO / ONNX / keyword backends, custom labels
- PII masking — YOLO or regex backends, gaussian blur / black box / pixelation
- Embossed / engraved OCR — CLAHE + Sobel + shape-from-shading + morphology, ideal for chassis numbers, industrial plates, metal serials
- Graceful fallbacks — no hard deps on heavy libs, missing engines are skipped
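To make tiered engine selection concrete, here is a minimal sketch of how a capability profile could map to a tier. The `SystemProfile` class, the `pick_tier` function, and every threshold are assumptions made for illustration. The source only states that Tier 0 is CPU / Tesseract and Tier 4 is GPU / TrOCR, Donut, so the intermediate tiers here are guesses.

```python
from dataclasses import dataclass


@dataclass
class SystemProfile:
    """Hypothetical capability snapshot; DocuVision's real profiler may differ."""
    ram_gb: float
    has_cuda: bool = False
    vram_gb: float = 0.0
    gpu_count: int = 0


def pick_tier(profile: SystemProfile) -> int:
    """Map a profile to a tier from 0 (CPU-only) to 4 (large GPU models).

    All thresholds below are illustrative assumptions, not DocuVision's values.
    """
    if not profile.has_cuda or profile.gpu_count == 0:
        # CPU-only: Tier 0 on small boxes, Tier 1 when more RAM is available.
        return 0 if profile.ram_gb < 8 else 1
    if profile.vram_gb < 6:
        return 2
    if profile.vram_gb < 16 and profile.gpu_count == 1:
        return 3
    return 4


cpu_box = SystemProfile(ram_gb=4)
workstation = SystemProfile(ram_gb=128, has_cuda=True, vram_gb=24, gpu_count=2)
print(pick_tier(cpu_box), pick_tier(workstation))
```

Keeping the decision in one pure function makes it easy to test and to override per call.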
Installation
# Core only — works with the pure-OpenCV detector but needs an OCR engine
# installed separately
pip install docuvision
# Common bundles
pip install "docuvision[tesseract]" # + pytesseract (requires tesseract binary)
pip install "docuvision[easyocr]" # + easyocr
pip install "docuvision[paddle-cpu]" # + paddleocr + paddlepaddle (CPU)
pip install "docuvision[paddle-gpu]" # + paddlepaddle-gpu
pip install "docuvision[doctr]" # + python-doctr
pip install "docuvision[trocr]" # + torch + transformers + TrOCR deps
pip install "docuvision[donut]" # + torch + transformers + Donut deps
# Aggregate bundles
pip install "docuvision[cpu]" # every CPU-capable engine
pip install "docuvision[gpu]" # every engine, GPU-preferred
pip install "docuvision[all]" # everything, library versions only
# Development
pip install "docuvision[dev]"
Heavy engines (torch, paddle, transformers…) are optional. DocuVision lazily imports them, so a missing library simply disables that one engine; the rest of the pipeline keeps working.
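The lazy-import behaviour described above can be sketched with the standard library alone. This is not DocuVision's actual code: the `ENGINE_MODULES` mapping and both helper functions are hypothetical names, and only the general technique (probe with `importlib.util.find_spec`, import on demand) is shown.

```python
import importlib
import importlib.util

# Hypothetical mapping from engine name to the module it needs.
ENGINE_MODULES = {
    "tesseract": "pytesseract",
    "easyocr": "easyocr",
    "paddleocr": "paddleocr",
    "doctr": "doctr",
    "trocr": "transformers",
}


def available_engines(engine_modules=ENGINE_MODULES):
    """Return engines whose backing module can be found, without importing it."""
    found = []
    for engine, module in engine_modules.items():
        # find_spec on a top-level name returns None when the module is absent.
        if importlib.util.find_spec(module) is not None:
            found.append(engine)
    return found


def load_engine_module(engine, engine_modules=ENGINE_MODULES):
    """Import the heavy module only when the engine is actually requested."""
    try:
        return importlib.import_module(engine_modules[engine])
    except ImportError:
        return None  # engine disabled; callers can fall through to the next one


print(available_engines())
```

Probing with `find_spec` keeps startup cheap, because nothing heavy is imported until an engine is actually used.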
Quick start
from docuvision import DocumentPipeline, profile_system
# See what DocuVision picks for your system
print(profile_system().summary())
# Zero-config pipeline — uses the best available engine
pipe = DocumentPipeline()
result = pipe.run("invoice.png")
print(result.text)
# Full-featured pipeline
pipe = DocumentPipeline(
    detect_text=True,
    classify_doc=True,
    mask=True,
    embossed_mode=False,
    language=["en"],
)
result = pipe.run("aadhaar.jpg")
print(result.doc_class.label) # → 'aadhaar'
print(result.text)
print(result.masks) # detected PII regions
# result.masked_image is a numpy.ndarray ready for cv2.imwrite(...)
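The regex PII backend listed under Features could work roughly as follows. This is a sketch under stated assumptions, not the library's implementation: the `PII_PATTERNS` table, the `find_pii` function, and the `(text, box)` word format are all hypothetical, and the two patterns are deliberately simple.

```python
import re

# Illustrative patterns only; a production backend would need many more.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "id_12_digit": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
}


def find_pii(words):
    """Return (label, box) pairs for OCR words that match a PII pattern.

    `words` is a list of (text, (x, y, w, h)) tuples, a hypothetical shape
    for word-level OCR output.
    """
    hits = []
    for text, box in words:
        for label, pattern in PII_PATTERNS.items():
            if pattern.search(text):
                hits.append((label, box))
                break  # one label per word is enough for masking
    return hits


words = [
    ("Name:", (10, 10, 60, 20)),
    ("1234 5678 9012", (10, 40, 180, 20)),
    ("jane@example.com", (10, 70, 200, 20)),
]
print(find_pii(words))
```

The returned boxes are what a masking step would then blur, pixelate, or fill with black.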
Command-line
docuvision profile # show the capability report
docuvision ocr invoice.png # default pipeline
docuvision ocr plate.jpg --embossed # embossed / chassis OCR
docuvision ocr id.jpg --classify --mask --mask-method blackbox
docuvision engines # list registered OCR engines
docuvision detectors # list registered detectors
Design principles
- No hard deps on heavy libs. Every engine is imported lazily. pip install docuvision drops in fine even without torch, paddle, or transformers.
- Explicit tiering. The profiler scores your hardware and picks a tier; you can override it per call.
- Composable, not monolithic. Each stage (detect, classify, OCR, mask) is an independent, swappable component with a common ABC and a registry.
- Sane defaults, full control. DocumentPipeline().run(img) just works; every knob is exposed via PipelineConfig.
- Fails gracefully. A per-stage failure is captured in PipelineResult.metadata["failures"] rather than aborting the pipeline.
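The "common ABC and a registry" and "fails gracefully" principles can be illustrated together. The sketch below is an assumption about the general shape, not DocuVision's classes: `Stage`, `REGISTRY`, `register`, and `run_pipeline` are hypothetical names, and storing failures as a dict keyed by stage name is a guess at the format.

```python
from abc import ABC, abstractmethod


class Stage(ABC):
    """Common interface for every pipeline stage (hypothetical)."""

    name: str

    @abstractmethod
    def run(self, data: dict) -> dict:
        """Take the shared state and return the keys to merge into it."""


REGISTRY: dict[str, type[Stage]] = {}


def register(cls):
    """Class decorator that adds a stage to the registry under its name."""
    REGISTRY[cls.name] = cls
    return cls


@register
class UppercaseOCR(Stage):
    name = "ocr"

    def run(self, data):
        return {"text": data["image"].upper()}  # stand-in for real OCR


@register
class BrokenClassifier(Stage):
    name = "classify"

    def run(self, data):
        raise RuntimeError("model weights missing")


def run_pipeline(image, stage_names):
    """Run stages in order, recording failures instead of aborting."""
    state = {"image": image}
    failures = {}
    for stage_name in stage_names:
        try:
            state.update(REGISTRY[stage_name]().run(state))
        except Exception as exc:  # record the error and continue
            failures[stage_name] = repr(exc)
    state["metadata"] = {"failures": failures}
    return state


result = run_pipeline("invoice 42", ["classify", "ocr"])
print(result["text"], result["metadata"]["failures"])
```

Here the classifier fails, yet the OCR stage still runs and the error is available for inspection afterwards.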
Testing
The full test suite runs against the core install only — heavy engines are not required:
pytest tests/
# or, without pytest installed:
python run_tests.py
License
Apache License 2.0 — see LICENSE.
File details
Details for the file docuvision-2.0.0.tar.gz.
File metadata
- Download URL: docuvision-2.0.0.tar.gz
- Upload date:
- Size: 55.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4eba981d1ae4b19cd113854aa607742b850479b5b83da215ebfed5cb647d1f5f |
| MD5 | c32078339ce2e4395c2bec712ad218ac |
| BLAKE2b-256 | 1f2c07c5f7cb5bef637ae0ca495261ec3ca620b0e56a72a2137745508fbf942a |
Provenance
The following attestation bundles were made for docuvision-2.0.0.tar.gz:
Publisher: publish.yml on Tanupvats/docuvision
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: docuvision-2.0.0.tar.gz
  - Subject digest: 4eba981d1ae4b19cd113854aa607742b850479b5b83da215ebfed5cb647d1f5f
- Sigstore transparency entry: 1342616118
- Sigstore integration time:
- Permalink: Tanupvats/docuvision@e6c04346098c980a8c0a0e2490fa5359e4c30dac
- Branch / Tag: refs/tags/v2.0.1
- Owner: https://github.com/Tanupvats
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e6c04346098c980a8c0a0e2490fa5359e4c30dac
- Trigger Event: release
File details
Details for the file docuvision-2.0.0-py3-none-any.whl.
File metadata
- Download URL: docuvision-2.0.0-py3-none-any.whl
- Upload date:
- Size: 66.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5a757acfa04df18d318691e9e9db4babe19201c15a04298f0315e763dd883f1f |
| MD5 | 4e161f9336baad03d9cc3eb2ade2a642 |
| BLAKE2b-256 | eba532cb080e7334bb972087fc10142ebc52fcb336d4f3a56ef52f7a3f1e8c33 |
Provenance
The following attestation bundles were made for docuvision-2.0.0-py3-none-any.whl:
Publisher: publish.yml on Tanupvats/docuvision
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: docuvision-2.0.0-py3-none-any.whl
  - Subject digest: 5a757acfa04df18d318691e9e9db4babe19201c15a04298f0315e763dd883f1f
- Sigstore transparency entry: 1342616128
- Sigstore integration time:
- Permalink: Tanupvats/docuvision@e6c04346098c980a8c0a0e2490fa5359e4c30dac
- Branch / Tag: refs/tags/v2.0.1
- Owner: https://github.com/Tanupvats
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e6c04346098c980a8c0a0e2490fa5359e4c30dac
- Trigger Event: release