A Python library for reading MRZ data from passport images using Tesseract OCR

passport_mrz_extractor is a Python library for extracting and validating Machine Readable Zone (MRZ) data from passport images. It uses Tesseract OCR to read MRZ text and validates it using the mrz library.

Features

  • Extract MRZ data from passport images.

  • Validate MRZ data fields, including document type, name, nationality, date of birth, and expiry date.

  • Automatic image processing for better OCR accuracy.
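To make the validation feature more concrete, here is a minimal sketch of the check-digit calculation that the ICAO 9303 standard defines for MRZ fields. It is an illustration only, not the code this library or the mrz package runs, and the function names `mrz_char_value` and `mrz_check_digit` are hypothetical.

```python
def mrz_char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value (ICAO 9303)."""
    if "0" <= ch <= "9":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10  # A=10, B=11, ..., Z=35
    if ch == "<":
        return 0  # filler character
    raise ValueError(f"Invalid MRZ character: {ch!r}")


def mrz_check_digit(field: str) -> int:
    """Weighted sum with repeating weights 7, 3, 1, taken modulo 10."""
    weights = (7, 3, 1)
    total = sum(mrz_char_value(c) * weights[i % 3] for i, c in enumerate(field))
    return total % 10


if __name__ == "__main__":
    # Values from the ICAO 9303 specimen passport
    print(mrz_check_digit("L898902C3"))  # document number -> 6
    print(mrz_check_digit("740812"))     # date of birth   -> 2
    print(mrz_check_digit("120415"))     # expiry date     -> 9
```

A field passes validation when the computed digit equals the check digit printed next to it in the MRZ.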

Installation

You can install passport_mrz_extractor using pip:

pip install passport_mrz_extractor

Requirements

  • Python >= 3.10

  • Tesseract OCR installed on your system

To install Tesseract, follow the installation instructions for your operating system in the Tesseract documentation.

Dependencies

This library requires the following Python packages:

  • pytesseract - For performing OCR on images.

  • opencv-python - For image processing.

  • mrz - For MRZ data validation.

  • Pillow - For handling image files in Python.
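Before running OCR, it can help to confirm that the requirements listed above are actually present. The following sketch uses only the standard library. The helper name `check_environment` is hypothetical, and it assumes the Tesseract executable is named `tesseract` and is on your PATH.

```python
import importlib.util
import shutil

# PyPI package name -> importable module name
REQUIRED_MODULES = {
    "pytesseract": "pytesseract",
    "opencv-python": "cv2",
    "mrz": "mrz",
    "Pillow": "PIL",
}


def check_environment() -> dict[str, bool]:
    """Report whether the Tesseract binary and each Python package can be found."""
    status = {"tesseract": shutil.which("tesseract") is not None}
    for package, module in REQUIRED_MODULES.items():
        status[package] = importlib.util.find_spec(module) is not None
    return status


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

This only checks that the packages can be located. It does not verify their versions or that Tesseract has the language data it needs.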

Usage

Here’s how to use passport_mrz_extractor to extract MRZ data from a passport image.

### Basic Example

This example demonstrates extracting all available MRZ fields from an image and handling potential errors.

from passport_mrz_extractor import read_mrz

# Path to the passport image
image_path = 'path/to/passport_image.jpg'

try:
    mrz_data = read_mrz(image_path)
    print("Extracted MRZ Data:")
    for key, value in mrz_data.items():
        print(f"{key}: {value}")
except ValueError as e:
    print(f"Error reading MRZ: {e}")
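The same error-handling pattern extends to several images at once. The sketch below assumes, as the example above suggests, that the reader function returns a dictionary and raises `ValueError` when no MRZ can be read. The helper `read_many` is hypothetical and not part of the library. It takes the reader as a parameter, so it can be tried with a stand-in function.

```python
from typing import Callable, Iterable


def read_many(
    paths: Iterable[str],
    reader: Callable[[str], dict],
) -> tuple[dict[str, dict], dict[str, str]]:
    """Run `reader` on each path, collecting results and error messages separately."""
    results: dict[str, dict] = {}
    errors: dict[str, str] = {}
    for path in paths:
        try:
            results[path] = reader(path)
        except ValueError as exc:
            # Keep going so one unreadable image does not stop the batch
            errors[path] = str(exc)
    return results, errors


if __name__ == "__main__":
    # Stand-in reader for demonstration; in practice pass read_mrz instead
    def fake_reader(path: str) -> dict:
        if "blurry" in path:
            raise ValueError("MRZ not found")
        return {"country": "UTO"}

    ok, failed = read_many(["scan1.jpg", "blurry.jpg"], fake_reader)
    print("Read:", ok)
    print("Failed:", failed)
```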

### Example of Using Specific MRZ Fields

This example reads specific fields, such as the issuing country, document number, and date of birth, and prints each one with a label.

from passport_mrz_extractor import read_mrz

# Path to the passport image
image_path = 'path/to/passport_image.jpg'

try:
    # Extract MRZ data
    mrz_data = read_mrz(image_path)

    # Display specific fields
    print("Country of Issue:", mrz_data.get("country"))
    print("Document Number:", mrz_data.get("document_number"))
    print("Name:", mrz_data.get("name"))
    print("Surname:", mrz_data.get("surname"))
    print("Date of Birth:", mrz_data.get("birth_date"))
    print("Expiry Date:", mrz_data.get("expiry_date"))
    print("Nationality:", mrz_data.get("nationality"))
    print("Sex:", mrz_data.get("sex"))

except ValueError as e:
    print(f"Error reading MRZ: {e}")
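For context on where these fields come from, the sketch below slices the two 44-character lines of a passport (TD3) MRZ at the fixed positions defined by ICAO 9303. It is an illustration, not the library's implementation. The function `parse_td3` is hypothetical, and its keys simply mirror the names used in the example above.

```python
def parse_td3(line1: str, line2: str) -> dict[str, str]:
    """Split the two lines of a TD3 (passport) MRZ into named fields."""
    if len(line1) != 44 or len(line2) != 44:
        raise ValueError("TD3 MRZ lines must be exactly 44 characters each")

    # The name field is SURNAME<<GIVEN<NAMES, padded with '<'
    surname, _, given = line1[5:44].partition("<<")
    return {
        "document_type": line1[0:2].replace("<", ""),
        "country": line1[2:5].replace("<", ""),
        "surname": surname.replace("<", " ").strip(),
        "name": given.replace("<", " ").strip(),
        "document_number": line2[0:9].replace("<", ""),
        "nationality": line2[10:13].replace("<", ""),
        "birth_date": line2[13:19],   # YYMMDD
        "sex": line2[20],
        "expiry_date": line2[21:27],  # YYMMDD
    }


if __name__ == "__main__":
    # ICAO 9303 specimen passport
    line1 = "P<UTOERIKSSON<<ANNA<MARIA".ljust(44, "<")
    line2 = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"
    for key, value in parse_td3(line1, line2).items():
        print(f"{key}: {value}")
```

The sketch skips the check digits at positions 10, 20, 28, 43, and 44 of the second line. A real validator compares each of them against the value computed from its field.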

Contributing

If you’d like to contribute, please fork the repository and use a feature branch. Pull requests are welcome.

Issues

If you encounter any issues, please report them on the GitHub repository:

https://github.com/Azim-Kenzh/passport_mrz_extractor/issues

License

passport_mrz_extractor is licensed under the MIT License.
