A Python library for reading MRZ data from passport images using Tesseract OCR

Project description

passport_mrz_extractor is a Python library for extracting and validating Machine Readable Zone (MRZ) data from passport images. It uses Tesseract OCR to read MRZ text and validates it using the mrz library.

Features

  • Extract MRZ data from passport images.

  • Validate MRZ data fields, including document type, name, nationality, date of birth, and expiry date.

  • Automatic image processing for better OCR accuracy.
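Part of what MRZ validation involves is verifying check digits. The sketch below shows the standard ICAO 9303 check-digit calculation (weights 7, 3, 1; letters mapped to 10 through 35; the filler `<` counted as 0). It is an illustration of the general technique, not this library's code: according to the description above, validation is delegated to the mrz package. The function name `mrz_check_digit` is hypothetical.

```python
def mrz_check_digit(field: str) -> int:
    """Compute the ICAO 9303 check digit for a single MRZ field.

    Hypothetical helper for illustration; not part of passport_mrz_extractor.
    """
    weights = (7, 3, 1)  # repeating weight pattern defined by ICAO 9303
    total = 0
    for position, char in enumerate(field):
        if "0" <= char <= "9":
            value = int(char)
        elif "A" <= char <= "Z":
            value = ord(char) - ord("A") + 10  # A=10, B=11, ..., Z=35
        elif char == "<":
            value = 0  # filler character
        else:
            raise ValueError(f"Invalid MRZ character: {char!r}")
        total += value * weights[position % 3]
    return total % 10


# Document number from the ICAO 9303 specimen passport
print(mrz_check_digit("L898902C3"))  # 6
```

A field is considered consistent when the digit printed after it in the MRZ matches the computed value.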

Installation

You can install passport_mrz_extractor using pip:

pip install passport_mrz_extractor

Requirements

  • Python >= 3.10

  • Tesseract OCR installed on your system

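Tesseract is installed separately from the Python packages. See the Tesseract documentation for instructions for your platform.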
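Once Tesseract is installed, a quick check from Python can confirm that the executable is reachable. This is a minimal sketch using only the standard library; the helper name `tesseract_on_path` is hypothetical, and it assumes the executable is called `tesseract` and is expected on the system PATH.

```python
import shutil


def tesseract_on_path(executable: str = "tesseract") -> bool:
    """Return True if the given executable can be found on the PATH.

    Hypothetical helper; assumes Tesseract is installed as 'tesseract'.
    """
    return shutil.which(executable) is not None


if tesseract_on_path():
    print("Tesseract found on PATH")
else:
    print("Tesseract not found; install it or add it to PATH")
```

If Tesseract lives outside the PATH, pytesseract lets you point at it explicitly by setting `pytesseract.pytesseract.tesseract_cmd` to the full path of the executable. Whether this library respects that setting is not stated in the description above.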

Dependencies

This library requires the following Python packages:

  • pytesseract - For performing OCR on images.

  • opencv-python - For image processing.

  • mrz - For MRZ data validation.

  • Pillow - For handling image files in Python.
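pip normally installs these packages automatically. If an import fails at runtime, a small check like the following can show which module is missing. Note that the import names differ from the package names for two of them: opencv-python is imported as `cv2` and Pillow as `PIL`. The helper `missing_modules` is a hypothetical name used only for this sketch.

```python
from importlib.util import find_spec

# Import names for the packages listed above
REQUIRED_MODULES = ["pytesseract", "cv2", "mrz", "PIL"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable")
```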

Usage

Here’s how to use passport_mrz_extractor to extract MRZ data from a passport image.

### Basic Example

This example demonstrates extracting all available MRZ fields from an image and handling potential errors.

from passport_mrz_extractor import mrz_reader

# Path to the passport image
image_path = 'path/to/passport_image.jpg'

try:
    mrz_data = mrz_reader.read_mrz(image_path)
    print("Extracted MRZ Data:")
    for key, value in mrz_data.items():
        print(f"{key}: {value}")
except ValueError as e:
    print(f"Error reading MRZ: {e}")
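The same pattern extends to several images. The sketch below collects successes and failures separately instead of stopping at the first error. It assumes only what the example above shows: the reader takes an image path, returns a dict-like result, and raises ValueError on failure. The helper `read_many` is hypothetical, and the reader is passed in as an argument so the function can be tried without the library installed.

```python
def read_many(paths, reader):
    """Apply `reader` to each path and split results from errors.

    Hypothetical helper. `reader` is expected to behave like
    mrz_reader.read_mrz in the example above: return a dict on success
    and raise ValueError when the MRZ cannot be read.
    """
    results = {}
    errors = {}
    for path in paths:
        try:
            results[path] = reader(path)
        except ValueError as exc:
            errors[path] = str(exc)
    return results, errors


# Intended usage (requires the library and real image files):
#   from passport_mrz_extractor import mrz_reader
#   results, errors = read_many(["a.jpg", "b.jpg"], mrz_reader.read_mrz)
```

Errors other than ValueError, such as a missing file, are deliberately not caught here, since the description above does not say how the library reports them.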

### Example of Using Specific MRZ Fields

This example reads individual fields, such as the country, document number, and birth date, and prints each one with a label. Using get() means a field that is absent from the result is printed as None rather than raising an error.

from passport_mrz_extractor import mrz_reader

# Path to the passport image
image_path = 'path/to/passport_image.jpg'

try:
    # Extract MRZ data
    mrz_data = mrz_reader.read_mrz(image_path)

    # Display specific fields
    print("Country of Issue:", mrz_data.get("country"))
    print("Document Number:", mrz_data.get("document_number"))
    print("Name:", mrz_data.get("name"))
    print("Surname:", mrz_data.get("surname"))
    print("Date of Birth:", mrz_data.get("birth_date"))
    print("Expiry Date:", mrz_data.get("expiry_date"))
    print("Nationality:", mrz_data.get("nationality"))
    print("Sex:", mrz_data.get("sex"))

except ValueError as e:
    print(f"Error reading MRZ: {e}")
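Dates in an MRZ are encoded as six digits in YYMMDD order, so the century has to be inferred. If the library returns the date fields in that raw form (an assumption; the description above does not specify the format), a conversion along these lines may be useful. The function `mrz_date_to_date` and its century rule are hypothetical: birth dates that would fall in the future are moved to the 1900s, and expiry dates are always placed in the 2000s.

```python
from datetime import date


def mrz_date_to_date(yymmdd, *, is_expiry=False, today=None):
    """Convert a six-digit MRZ date (YYMMDD) to a datetime.date.

    Hypothetical helper with a simple century heuristic:
    - expiry dates are assumed to be in the 2000s;
    - birth dates are assumed to be in the 2000s unless that would put
      them after `today`, in which case the 1900s are used.
    """
    if len(yymmdd) != 6 or not yymmdd.isdigit():
        raise ValueError(f"Expected six digits, got {yymmdd!r}")
    today = today or date.today()
    year = 2000 + int(yymmdd[:2])
    month = int(yymmdd[2:4])
    day = int(yymmdd[4:6])
    candidate = date(year, month, day)  # raises ValueError for impossible dates
    if not is_expiry and candidate > today:
        candidate = date(year - 100, month, day)
    return candidate


print(mrz_date_to_date("740812", today=date(2024, 1, 1)))  # 1974-08-12
```

OCR errors can produce impossible dates such as month 13; those surface here as ValueError, matching the error type used in the examples above.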

Contributing

If you’d like to contribute, please fork the repository and use a feature branch. Pull requests are welcome.

Issues

If you encounter any issues, please report them on the GitHub repository:

https://github.com/Azim-Kenzh/passport_mrz_extractor/issues

License

passport_mrz_extractor is licensed under the MIT License.


