Fast and robust MRZ extraction, parsing, and validation using PaddleOCR
OmniMRZ
OmniMRZ is a production-grade MRZ extraction and validation engine designed for KYC, identity verification, and document intelligence pipelines.
Unlike simple MRZ readers, OmniMRZ evaluates whether an MRZ is structurally correct, checksum-consistent (ICAO 9303 check digits), and logically plausible.
⭐ Show Your Support
If OmniMRZ helped you or saved development time, please consider starring the repository. It helps visibility and motivates continued development.
Features
At a glance
- MRZ detection and extraction from images
- Supports TD3 (passport) format
- Checksum validation (ICAO 9303)
- Logical and structural validation
- Clean Python API
Detailed features
🔍 MRZ Extraction
- PaddleOCR-based MRZ text extraction (robust on mobile & noisy images)
- Intelligent MRZ line clustering & reconstruction
- Automatic MRZ type detection (TD1 / TD2 / TD3)
- OCR noise filtering & MRZ-safe character normalization
- Works even with partially corrupted or misaligned MRZs
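As an illustration of the filtering and normalization ideas listed above, here is a minimal sketch that picks MRZ-looking rows out of raw OCR text. It is not OmniMRZ's internal code: `normalize_line` and `mrz_candidates` are hypothetical helper names, and the exact-length filter is a simplification that would reject the partially corrupted rows the library says it can handle.

```python
import re

# MRZ rows use only A-Z, 0-9 and the '<' filler character.
MRZ_CHARSET = re.compile(r"^[A-Z0-9<]+$")
# Row lengths for TD1, TD2 and TD3 documents in ICAO 9303.
VALID_LENGTHS = {30, 36, 44}


def normalize_line(text: str) -> str:
    """Uppercase the text and drop all whitespace (hypothetical helper)."""
    return "".join(text.upper().split())


def mrz_candidates(ocr_lines: list[str]) -> list[str]:
    """Keep only OCR lines that look like complete MRZ rows."""
    candidates = []
    for raw in ocr_lines:
        line = normalize_line(raw)
        if len(line) in VALID_LENGTHS and MRZ_CHARSET.match(line):
            candidates.append(line)
    return candidates


ocr_output = [
    "United Kingdom",
    "P<GBRPUDARSAN<<HENERT<<<<<<<<<<<<<<<<<<<<<<<",
    "7077979792GBR9505209M1704224<<<<<<<<<<<<<< 00",  # stray space from OCR
]
print(mrz_candidates(ocr_output))
```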
🧱 Structural Validation (ICAO 9303)
- Exact line-length enforcement
- Strict MRZ format verification
- Field-level structural checks
- Early-exit gating for invalid layouts
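The line-length rule can be sketched as a lookup on row count and row length. ICAO 9303 defines TD1 as 3 rows of 30 characters, TD2 as 2 rows of 36, and TD3 as 2 rows of 44. The function name `detect_mrz_type` is an assumption made for this example and is not taken from the OmniMRZ API.

```python
# (number of rows, characters per row) for each ICAO 9303 layout.
MRZ_LAYOUTS = {
    (3, 30): "TD1",
    (2, 36): "TD2",
    (2, 44): "TD3",
}


def detect_mrz_type(lines):
    """Return 'TD1', 'TD2' or 'TD3', or None if the layout matches none."""
    if not lines:
        return None
    lengths = {len(line) for line in lines}
    if len(lengths) != 1:
        # Rows of unequal length cannot form a valid MRZ.
        return None
    return MRZ_LAYOUTS.get((len(lines), lengths.pop()))
```

Returning `None` for an unknown layout lets a caller stop early, before spending time on checksum or logical checks.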
🔢 Checksum Validation
- Fully ICAO-9303 compliant checksum algorithm
- Field-level validation:
  - Document number
  - Date of birth
  - Expiry date
  - Composite checksum
- OCR-error tolerant digit correction (O→0, S→5, B→8, etc.)
- Detailed checksum failure diagnostics
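For reference, the ICAO 9303 check digit is a weighted sum modulo 10: characters are weighted 7, 3, 1 repeating, digits count as themselves, letters A to Z count as 10 to 35, and the `<` filler counts as 0. The sketch below is a standalone illustration rather than OmniMRZ's own code; `check_digit` and `fix_numeric` are hypothetical names, and the correction table contains only the three substitutions named in the list above.

```python
WEIGHTS = (7, 3, 1)

# Only the substitutions named above; any further mappings are not assumed here.
OCR_DIGIT_FIXES = str.maketrans({"O": "0", "S": "5", "B": "8"})


def char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value per ICAO 9303."""
    if "0" <= ch <= "9":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError(f"invalid MRZ character: {ch!r}")


def check_digit(field: str) -> int:
    """Compute the check digit for an MRZ field."""
    return sum(char_value(c) * WEIGHTS[i % 3] for i, c in enumerate(field)) % 10


def fix_numeric(field: str) -> str:
    """Replace letters that OCR commonly confuses with digits (numeric fields only)."""
    return field.translate(OCR_DIGIT_FIXES)


# TD3 second row from the output example in this README.
line2 = "7077979792GBR9505209M1704224<<<<<<<<<<<<<<00"
assert check_digit(line2[0:9]) == int(line2[9])      # document number
assert check_digit(line2[13:19]) == int(line2[19])   # date of birth
assert check_digit(line2[21:27]) == int(line2[27])   # expiry date
composite = line2[0:10] + line2[13:20] + line2[21:43]
assert check_digit(composite) == int(line2[43])      # composite check digit
```

Digit correction should be applied only to fields that must be numeric, such as dates and check digits. Applying it to names or country codes would corrupt valid letters.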
🧠 Logical & Semantic Validation
- Expired document detection
- Future date-of-birth detection
- Implausible age detection
- DOB ≥ expiry detection
- Gender value validation (M, F, X, <)
- Cross-field consistency signals (issuer vs nationality)
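The date and gender checks above might be sketched as follows. Only the `DOCUMENT_EXPIRED` code appears in this README's output example; the other error codes, the function name `logical_errors`, and the 120-year age limit are assumptions made for illustration.

```python
from datetime import date

VALID_GENDERS = {"M", "F", "X", "<"}
MAX_AGE_YEARS = 120  # assumed plausibility limit, not taken from OmniMRZ


def logical_errors(dob: date, expiry: date, gender: str, today: date) -> list[str]:
    """Return a list of error codes; an empty list means all checks passed."""
    errors = []
    if expiry < today:
        errors.append("DOCUMENT_EXPIRED")
    if dob > today:
        errors.append("DOB_IN_FUTURE")           # hypothetical code
    elif today.year - dob.year > MAX_AGE_YEARS:
        errors.append("IMPLAUSIBLE_AGE")         # hypothetical code
    if dob >= expiry:
        errors.append("DOB_NOT_BEFORE_EXPIRY")   # hypothetical code
    if gender not in VALID_GENDERS:
        errors.append("INVALID_GENDER")          # hypothetical code
    return errors


# Values from the sample passport in this README, checked against a fixed date.
print(logical_errors(date(1995, 5, 20), date(2017, 4, 22), "M", date(2024, 1, 1)))
```

Passing `today` explicitly keeps the checks deterministic and easy to test.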
📤 Output
- Clean MRZ text
- Structured JSON
- Deterministic pass / fail / warning signals
- Human-readable error messages
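A caller may want to collapse the per-stage signals into one decision. The sketch below assumes the result is a dictionary with `structural_validation`, `checksum_validation`, and `logical_validation` entries, each carrying a `status` field, as in this README's output example. The `"WARNING"` status string and the `overall_verdict` helper are assumptions; the README only shows `"PASS"` and `"FAIL"`.

```python
STAGES = ("structural_validation", "checksum_validation", "logical_validation")


def overall_verdict(result: dict) -> str:
    """Reduce per-stage statuses to a single PASS / WARNING / FAIL verdict."""
    statuses = [result.get(stage, {}).get("status") for stage in STAGES]
    if all(status == "PASS" for status in statuses):
        return "PASS"
    if any(status not in ("PASS", "WARNING") for status in statuses):
        # A FAIL, an unknown status, or a missing stage is treated as failure.
        return "FAIL"
    return "WARNING"
```

Treating a missing stage as a failure is the conservative choice for KYC-style pipelines, where an incomplete result should not be accepted silently.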
Installation
```bash
pip install omnimrz
```
Note: PaddleOCR requires additional system dependencies. Please ensure PaddlePaddle installs correctly on your platform.
```bash
pip install paddleocr
pip install paddlepaddle
```
If that fails, install the CPU build from the PaddlePaddle package index:
```bash
python -m pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
```
Quick Usage
```python
from omnimrz import OmniMRZ

omni = OmniMRZ()
result = omni.process("ukpassport.jpg")
print(result)
```
Output Example
```json
{
  "extraction": {
    "status": "SUCCESS(extraction of mrz)",
    "line1": "P<GBRPUDARSAN<<HENERT<<<<<<<<<<<<<<<<<<<<<<<",
    "line2": "7077979792GBR9505209M1704224<<<<<<<<<<<<<<00"
  },
  "structural_validation": {
    "status": "PASS",
    "mrz_type": "TD3",
    "errors": []
  },
  "checksum_validation": {
    "status": "PASS",
    "errors": []
  },
  "parsed_data": {
    "status": "PARSED",
    "data": {
      "document_type": "P",
      "issuing_country": "GBR",
      "surname": "PUDARSAN",
      "given_names": "HENERT",
      "document_number": "707797979",
      "nationality": "GBR",
      "date_of_birth": "1995-05-20",
      "gender": "M",
      "expiry_date": "2017-04-22",
      "personal_number": ""
    }
  },
  "logical_validation": {
    "status": "FAIL",
    "errors": [
      "DOCUMENT_EXPIRED"
    ]
  }
}
```
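To show how the `parsed_data` fields relate to the two MRZ rows, here is a minimal sketch that slices a TD3 MRZ at the fixed positions defined by ICAO 9303. It is an illustration, not OmniMRZ's parser: `parse_td3` is a hypothetical name, and dates are left as raw `YYMMDD` strings because expanding them to a four-digit year requires a century rule that this README does not specify.

```python
def parse_td3(line1: str, line2: str) -> dict:
    """Slice a 2x44 TD3 MRZ into fields (hypothetical helper)."""
    # The name field holds SURNAME<<GIVEN<NAMES, padded with '<'.
    surname, _, given = line1[5:44].partition("<<")
    return {
        "document_type": line1[0:2].replace("<", ""),
        "issuing_country": line1[2:5].replace("<", ""),
        "surname": surname.replace("<", " ").strip(),
        "given_names": given.replace("<", " ").strip(),
        "document_number": line2[0:9].replace("<", ""),
        "nationality": line2[10:13].replace("<", ""),
        "date_of_birth": line2[13:19],  # raw YYMMDD
        "gender": line2[20],
        "expiry_date": line2[21:27],    # raw YYMMDD
        "personal_number": line2[28:42].replace("<", ""),
    }


fields = parse_td3(
    "P<GBRPUDARSAN<<HENERT<<<<<<<<<<<<<<<<<<<<<<<",
    "7077979792GBR9505209M1704224<<<<<<<<<<<<<<00",
)
print(fields["surname"], fields["given_names"], fields["document_number"])
```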
Contributing
Contributions are welcome!🤝
- Fork the repository
- Create your feature branch: `git checkout -b feature/amazing-feature`
- Commit your changes
- Push to your branch
- Open a Pull Request
File details
Details for the file omnimrz-0.1.1.tar.gz.
File metadata
- Download URL: omnimrz-0.1.1.tar.gz
- Upload date:
- Size: 30.2 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ec724b6ea62a5fce4ca12feccad5b939c256dd199f8e32ed5b169af8d73d6986 |
| MD5 | 92d67026db407af291eec24fd65d7dc6 |
| BLAKE2b-256 | 9f743d3d3a008878ccc4c0d2fc96b4188bea339111f942ec9399e021e9140a9c |
File details
Details for the file omnimrz-0.1.1-py3-none-any.whl.
File metadata
- Download URL: omnimrz-0.1.1-py3-none-any.whl
- Upload date:
- Size: 38.2 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 86c53619e231eb1617302ee2d9000d33a517b8cf9c2455c582d41d0ebf1b7cf8 |
| MD5 | e135dc0032730f10324f96bb7568897e |
| BLAKE2b-256 | a67dba251e97d00aa522fd956cef527710c06b7b8c2318e51ae84f55495767e2 |