A Python library for reading MRZ data from passport images using Tesseract OCR

Project description

passport_mrz_extractor is a Python library for extracting and validating Machine Readable Zone (MRZ) data from passport images. It uses Tesseract OCR to read MRZ text and validates it using the mrz library.

Features

  • Extract MRZ data from passport images.

  • Validate MRZ data fields, including document type, name, nationality, date of birth, and expiry date.

  • Automatic image preprocessing to improve OCR accuracy.
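To give a sense of what MRZ field validation involves, here is a minimal, standalone sketch of the check digit calculation defined in ICAO Doc 9303 (weights 7, 3, 1 repeated, sum taken modulo 10). It is an illustration only: the function name `mrz_check_digit` is hypothetical, and this is not the code used by passport_mrz_extractor or the mrz library.

```python
import string

# Character values in the ICAO 9303 scheme: digits keep their value,
# A-Z map to 10-35, and the filler character '<' counts as 0.
_VALUES = {ch: i for i, ch in enumerate(string.digits + string.ascii_uppercase)}
_VALUES["<"] = 0
_WEIGHTS = (7, 3, 1)


def mrz_check_digit(field: str) -> int:
    """Return the check digit for one MRZ field (hypothetical helper).

    Raises KeyError if the field contains a character outside 0-9, A-Z, '<'.
    """
    total = sum(_VALUES[ch] * _WEIGHTS[i % 3] for i, ch in enumerate(field))
    return total % 10


# Fields from the ICAO 9303 specimen passport
print(mrz_check_digit("L898902C3"))  # document number -> 6
print(mrz_check_digit("740812"))     # date of birth   -> 2
print(mrz_check_digit("120415"))     # date of expiry  -> 9
```

A field read by OCR is plausible only if the digit computed this way matches the check digit printed next to it in the MRZ.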

Installation

You can install passport_mrz_extractor using pip:

pip install passport_mrz_extractor

Requirements

  • Python >= 3.10

  • Tesseract OCR installed on your system
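Before running the library, it can help to confirm that the Tesseract executable is reachable from Python. The sketch below uses only the standard library; the helper name `tesseract_available` is hypothetical, and it assumes the binary is called `tesseract` and is on your `PATH`.

```python
import shutil


def tesseract_available(executable: str = "tesseract") -> bool:
    """Return True if the given executable can be found on PATH."""
    return shutil.which(executable) is not None


if tesseract_available():
    print("Tesseract found at:", shutil.which("tesseract"))
else:
    print("Tesseract not found on PATH; install it or add it to PATH.")
```

If Tesseract is installed in a non-standard location, pytesseract lets you point to it by setting `pytesseract.pytesseract.tesseract_cmd` to the full path of the executable.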

To install Tesseract:

Dependencies

This library requires the following Python packages:

  • pytesseract - For performing OCR on images.

  • opencv-python - For image processing.

  • mrz - For MRZ data validation.

  • Pillow - For handling image files in Python.
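Two of these packages are imported under a different name than the one used with pip (`opencv-python` is imported as `cv2`, `Pillow` as `PIL`). The following sketch checks which of them are importable in the current environment; the helper name `missing_packages` is hypothetical and not part of this library.

```python
from importlib.util import find_spec

# pip package name -> top-level module name used in "import ..."
REQUIRED = {
    "pytesseract": "pytesseract",
    "opencv-python": "cv2",
    "mrz": "mrz",
    "Pillow": "PIL",
}


def missing_packages(required: dict[str, str] = REQUIRED) -> list[str]:
    """Return the pip names of packages whose modules cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All dependencies are importable.")
```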

Usage

Here’s how to use passport_mrz_extractor to extract MRZ data from a passport image.

### Basic Example

This example demonstrates extracting all available MRZ fields from an image and handling potential errors.

from passport_mrz_extractor import read_mrz

# Path to the passport image
image_path = 'path/to/passport_image.jpg'

try:
    mrz_data = read_mrz(image_path)
    print("Extracted MRZ Data:")
    for key, value in mrz_data.items():
        print(f"{key}: {value}")
except ValueError as e:
    print(f"Error reading MRZ: {e}")
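The same pattern extends to several images. The sketch below is one possible way to process a batch while collecting failures instead of stopping at the first one. The helper `read_many` is hypothetical, and it assumes, as in the example above, that `read_mrz` signals a failed read with `ValueError`. The reader is passed in as an argument so the helper can be tried without the library installed.

```python
from typing import Callable, Iterable


def read_many(paths: Iterable[str],
              reader: Callable[[str], dict]) -> tuple[dict, dict]:
    """Apply `reader` to each path; return (results, errors) keyed by path."""
    results: dict[str, dict] = {}
    errors: dict[str, str] = {}
    for path in paths:
        try:
            results[path] = reader(path)
        except ValueError as exc:
            # Record the failure and continue with the next image
            errors[path] = str(exc)
    return results, errors


# Usage with the library (paths are placeholders):
#   from passport_mrz_extractor import read_mrz
#   results, errors = read_many(["a.jpg", "b.jpg"], read_mrz)
```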

### Example of Using Specific MRZ Fields

This example reads individual fields, such as the country of issue, document number, and date of birth, and prints each one with a label.

from passport_mrz_extractor import read_mrz

# Path to the passport image
image_path = 'path/to/passport_image.jpg'

try:
    # Extract MRZ data
    mrz_data = read_mrz(image_path)

    # Display specific fields
    print("Country of Issue:", mrz_data.get("country"))
    print("Document Number:", mrz_data.get("document_number"))
    print("Name:", mrz_data.get("name"))
    print("Surname:", mrz_data.get("surname"))
    print("Date of Birth:", mrz_data.get("birth_date"))
    print("Expiry Date:", mrz_data.get("expiry_date"))
    print("Nationality:", mrz_data.get("nationality"))
    print("Sex:", mrz_data.get("sex"))

except ValueError as e:
    print(f"Error reading MRZ: {e}")
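Dates in the MRZ itself are encoded as six digits, YYMMDD, so the century is ambiguous. If the date fields you work with are in that raw form (an assumption; check the format this library actually returns), a sketch like the following can convert them. The helper `parse_mrz_date` and its `latest_year` parameter are hypothetical.

```python
from datetime import date


def parse_mrz_date(yymmdd: str, latest_year: int) -> date:
    """Convert a YYMMDD string to a date no later than `latest_year`.

    The two-digit year is placed in the century of `latest_year`, then
    moved back 100 years if that would land after `latest_year`.
    """
    if len(yymmdd) != 6 or not (yymmdd.isascii() and yymmdd.isdigit()):
        raise ValueError(f"Expected six digits (YYMMDD), got {yymmdd!r}")
    yy, mm, dd = int(yymmdd[:2]), int(yymmdd[2:4]), int(yymmdd[4:])
    year = (latest_year // 100) * 100 + yy
    if year > latest_year:
        year -= 100
    return date(year, mm, dd)


# A birth date cannot be in the future, so cap it at the current year.
print(parse_mrz_date("740812", latest_year=2024))  # 1974-08-12
# Expiry dates may lie ahead, so allow a window of some years.
print(parse_mrz_date("120415", latest_year=2034))  # 2012-04-15
```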

Contributing

If you’d like to contribute, please fork the repository and use a feature branch. Pull requests are welcome.

Issues

If you encounter any issues, please report them on the GitHub repository:

https://github.com/Azim-Kenzh/passport_mrz_extractor/issues

License

passport_mrz_extractor is licensed under the MIT License.

Download files

Source Distribution

passport_mrz_extractor-1.0.13.tar.gz (3.9 kB view details)

Uploaded Source

File details

Details for the file passport_mrz_extractor-1.0.13.tar.gz.

File metadata

  • Download URL: passport_mrz_extractor-1.0.13.tar.gz
  • Upload date:
  • Size: 3.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for passport_mrz_extractor-1.0.13.tar.gz
Algorithm    Hash digest
SHA256       10ab904e47b6b17d5462984d6168d0ab664cbda8d06c95310c1de929c0ee8d93
MD5          36add40eb88164792ebfa8d0f19d06f4
BLAKE2b-256  287954ea90e1b001576c9300e1cf24885215f26646a18f017508b93936a42089
