OCR-Ops
Project description
ocr-ops
OCR-Ops is infrastructure for performing Optical Character Recognition (OCR) at scale across large collections of images and videos. Built on top of the algo-ops framework, OCR-Ops is modular and extensible in its data processing operations.
Key Features:
- Supports building an OCRPipeline that can use multiple popular OCR annotation methods (e.g. PyTesseract, EasyOCR) and return the results in a structured and efficient fashion within a unified framework.
- Supports multiple levels of OCR output detail (e.g. text only, text with bounding boxes).
- Lets you define an image pre-processing pipeline (run before OCR) and a text-cleaning pipeline (run after OCR on detected but noisy text) to make OCR results more robust.
- Ships with several plug-and-play presets for these pre-processing and text-cleaning pipelines.
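The three stages described in the features above (image pre-processing, OCR, text cleaning) can be sketched in plain Python. This is only an illustration of the idea and does not use the ocr-ops API: every name here (`TextBox`, `binarize`, `fake_ocr`, `clean_text`, `run_pipeline`) is hypothetical, and `fake_ocr` is a stand-in that returns fixed noisy text where a real engine such as PyTesseract or EasyOCR would be called.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple

# NOTE: all names below are hypothetical; this is NOT the ocr-ops API.

Image = List[List[int]]  # grayscale image as rows of 0-255 pixel values


@dataclass
class TextBox:
    text: str
    bbox: Tuple[int, int, int, int]  # (x, y, width, height)


def binarize(image: Image, threshold: int = 128) -> Image:
    """Pre-processing step: map every pixel to pure black (0) or white (255)."""
    return [[255 if p >= threshold else 0 for p in row] for row in image]


def fake_ocr(image: Image) -> List[TextBox]:
    """Stand-in OCR engine: returns one noisy detection spanning the image."""
    return [TextBox("  Hello,,   World~ ", (0, 0, len(image[0]), len(image)))]


def clean_text(text: str) -> str:
    """Text-cleaning step: drop non-alphanumeric noise, collapse whitespace."""
    text = re.sub(r"[^A-Za-z0-9 ]", "", text)
    return re.sub(r"\s+", " ", text).strip()


def run_pipeline(
    image: Image,
    preprocess: List[Callable[[Image], Image]],
    ocr: Callable[[Image], List[TextBox]],
    cleaners: List[Callable[[str], str]],
    text_only: bool = False,
):
    """Run pre-processing, then OCR, then text cleaning, in that order."""
    for step in preprocess:
        image = step(image)
    boxes = ocr(image)
    for clean in cleaners:
        boxes = [TextBox(clean(b.text), b.bbox) for b in boxes]
    # Two output levels: plain strings, or text together with bounding boxes.
    return [b.text for b in boxes] if text_only else boxes


image = [[10, 200], [130, 90]]
print(run_pipeline(image, [binarize], fake_ocr, [clean_text], text_only=True))
print(run_pipeline(image, [binarize], fake_ocr, [clean_text]))
```

Keeping each stage as a list of plain callables is one simple way to make the steps swappable; the actual library may structure its pipelines differently.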
Download files
Source Distribution
ocr_ops-0.0.0.4.3.1.tar.gz (453.4 kB)
Built Distribution
ocr_ops-0.0.0.4.3.1-py3-none-any.whl (460.0 kB)
Hashes for ocr_ops-0.0.0.4.3.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 4b0c7efdf72877e6466a0870cc31a3d0334eb36fe8138a360d57b2b084763b89
MD5 | e59a747c2511d98cb9853630716e58bb
BLAKE2b-256 | dc52f80b29e3ee1accb37f5cffd6bb9e4ab79edf9e679043c3821778151e6c68
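A downloaded file can be checked against the digests listed above with Python's standard `hashlib` module. The sketch below is a minimal example: the helper names `file_digest` and `matches` are hypothetical, the wheel is assumed to sit in the current directory, and BLAKE2b-256 is assumed to mean BLAKE2b with a 32-byte digest.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Return the hex digest of a file, reading it in chunks."""
    if algorithm == "blake2b-256":
        # Assumption: "BLAKE2b-256" is BLAKE2b with a 32-byte (256-bit) digest.
        hasher = hashlib.blake2b(digest_size=32)
    else:
        hasher = hashlib.new(algorithm)  # e.g. "sha256" or "md5"
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches(path, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Compare a file's digest with an expected hex string, ignoring case."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the wheel was saved.
    wheel = Path("ocr_ops-0.0.0.4.3.1-py3-none-any.whl")
    expected = "4b0c7efdf72877e6466a0870cc31a3d0334eb36fe8138a360d57b2b084763b89"
    if wheel.exists():
        print("SHA256 OK" if matches(wheel, expected) else "SHA256 mismatch")
```

SHA256 is the digest to rely on for integrity checks; MD5 is shown only because the table lists it.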