
openbharatocr is an open-source Python library for OCR of Indian government documents


openbharatocr


openbharatocr is an open-source Python library designed specifically for optical character recognition (OCR) of Indian government documents.

The features of this package:

  • It supports a range of Indian government documents: PAN card, Aadhaar card, driving licence, passport, voter ID, vehicle registration certificate, water bill, birth certificate, and degree certificate.

Setting Up a Development Environment for OpenBharatOCR

This guide explains how to set up a development environment for OpenBharatOCR on a Linux system (Ubuntu/Debian preferred). If you're using Windows or macOS, consider a virtual machine, WSL2 (on Windows), or Docker (on macOS).

Prerequisites:

  • Operating System: Linux (Ubuntu/Debian preferred)
  • Python 3.6 or later: Check the version with python3 --version or python --version in your terminal. Download the latest installer from https://www.python.org/downloads/ if needed.
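
The version requirement can also be checked from Python itself. The following is a minimal sketch; the helper name `meets_minimum` is illustrative and not part of openbharatocr:

```python
import sys

# Minimum version stated in the prerequisites above.
MINIMUM = (3, 6)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old"
    print("Python {}: {}".format(found, status))
```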

Installation:

  • Clone the OpenBharatOCR repository:
    git clone https://github.com/essentiasoftserv/openbharatocr.git
  • Create a virtual environment (recommended):

    This isolates project dependencies and avoids conflicts with system-wide packages. Use venv or virtualenv (if venv is not available):

    python3 -m venv openbharatocr_env  # Using venv
    # OR
    virtualenv openbharatocr_env      # Using virtualenv
  • Activate the virtual environment (the command is the same for venv and virtualenv):
    source openbharatocr_env/bin/activate
  • Install dependencies: Navigate to the cloned repository directory and install required packages using pip:
    cd openbharatocr
    pip install -r requirements.txt
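
Before installing dependencies, it may be worth confirming that the virtual environment is actually active. The sketch below relies only on the standard `sys` module; the function name `in_virtual_environment` is illustrative:

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv or virtualenv."""
    # venv and current virtualenv point sys.prefix at the environment
    # directory, while sys.base_prefix still names the base installation.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
```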

Installation

    pip install openbharatocr

Pan Card

This function takes the path of a PAN card image as input and returns its information in the form of a dictionary.

    import openbharatocr 
    dict_output = openbharatocr.pan(image_path)
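
Because the function is called with a file path, it can help to validate that path before running OCR. The sketch below is one way to do so. `extract_fields` is a hypothetical helper, not part of openbharatocr, and the OCR function is passed in as an argument so that nothing is assumed about the library beyond "takes a path, returns a dictionary":

```python
from pathlib import Path


def extract_fields(ocr_func, image_path):
    """Check that image_path exists, call ocr_func on it, return its dict."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError("No image found at {}".format(path))
    result = ocr_func(str(path))
    if not isinstance(result, dict):
        raise TypeError("Expected a dict, got {}".format(type(result).__name__))
    return result


# Usage (requires openbharatocr to be installed; the file name is a placeholder):
#   import openbharatocr
#   details = extract_fields(openbharatocr.pan, "pan_card.jpg")
```

The same helper works unchanged with the other single-image functions described in this document.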

Aadhaar Card

These two functions accept the file paths of the front and back images of an Aadhaar card, respectively, and each returns the extracted information as a dictionary.

    import openbharatocr 
    dict_output = openbharatocr.front_aadhaar(image_path)
    dict_output = openbharatocr.back_aadhaar(image_path)
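
When both sides of a card are scanned, the two dictionaries can be kept together in one record. This is a small sketch under the assumption that the keys of each dictionary are not known in advance, so the results are nested rather than merged. `read_both_sides` is a hypothetical helper, not part of the library:

```python
def read_both_sides(front_func, back_func, front_path, back_path):
    """Run the front and back OCR functions and keep their results apart."""
    # Nesting under "front" and "back" avoids one side silently
    # overwriting a key that the other side also returns.
    return {
        "front": front_func(front_path),
        "back": back_func(back_path),
    }


# Usage (requires openbharatocr; the file names are placeholders):
#   import openbharatocr
#   aadhaar = read_both_sides(
#       openbharatocr.front_aadhaar, openbharatocr.back_aadhaar,
#       "aadhaar_front.jpg", "aadhaar_back.jpg",
#   )
```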

Driving Licence

This function takes the path of a Driving Licence card image as input and returns its information in the form of a dictionary.

    import openbharatocr 
    dict_output = openbharatocr.driving_licence(image_path)

Passport

This function takes the path of a Passport image as input and returns its information in the form of a dictionary.

    import openbharatocr 
    dict_output = openbharatocr.passport(image_path)
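
Since the result is a plain dictionary, it can be stored or logged as JSON. The following sketch uses only the standard library; `to_json` is an illustrative name, and the structure of the dictionary is not assumed:

```python
import json


def to_json(result):
    """Serialise an OCR result dictionary to a readable JSON string."""
    # ensure_ascii=False keeps non-ASCII text (for example Devanagari)
    # readable; default=str is a fallback for values that json cannot
    # encode natively.
    return json.dumps(result, ensure_ascii=False, indent=2, default=str)


# Usage (requires openbharatocr; the file name is a placeholder):
#   import openbharatocr
#   print(to_json(openbharatocr.passport("passport.jpg")))
```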

VoterID

These two functions accept the file paths of the front and back images of a voter ID, respectively, and each returns the extracted information as a dictionary.

    import openbharatocr 
    # Download the YOLOv3 model files (see Download Resources below), then set the
    # YOLO_CFG and YOLO_WEIGHT environment variables to their local paths
    dict_output = openbharatocr.voter_id_front(image_path)
    dict_output = openbharatocr.voter_id_back(image_path)

Download Resources

Some resources must be downloaded first, and their local paths set in the corresponding environment variables (YOLO_CFG and YOLO_WEIGHT for the voter ID functions).
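
A quick check that the two variables are set and point to existing files can save a confusing failure later. This is a sketch only: `missing_yolo_settings` is a hypothetical helper, and when exactly the library reads these variables is not stated here, so it is safest to set them before importing openbharatocr:

```python
import os
from pathlib import Path

# Variable names taken from the voter ID example above.
REQUIRED_VARS = ("YOLO_CFG", "YOLO_WEIGHT")


def missing_yolo_settings(environ=os.environ):
    """Return a list of problems found with the YOLO environment variables."""
    problems = []
    for name in REQUIRED_VARS:
        value = environ.get(name)
        if not value:
            problems.append("{} is not set".format(name))
        elif not Path(value).is_file():
            problems.append("{} points to a missing file: {}".format(name, value))
    return problems


if __name__ == "__main__":
    for problem in missing_yolo_settings():
        print("warning:", problem)
```

An empty list means both variables are set and both files exist.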

Vehicle Registration Card/Certificate

This function takes the path of a Vehicle Registration Card/Certificate image as an input and returns its information in the form of a dictionary.

    import openbharatocr 
    dict_output = openbharatocr.vehicle_registration(image_path)

Water Bill

This function takes the path of a Water Bill image as an input and returns its information in the form of a dictionary.

    import openbharatocr 
    dict_output = openbharatocr.water_bill(image_path)
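
To process several scans at once, the function can be applied to every image in a folder while recording failures instead of stopping at the first one. The sketch below makes two assumptions that are not confirmed by this document: the set of file suffixes treated as images, and that a failed read raises an exception. `process_folder` is a hypothetical helper:

```python
from pathlib import Path

# Assumed image suffixes; adjust to whatever formats you actually use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def process_folder(ocr_func, folder):
    """Apply ocr_func to each image in folder; return (results, errors)."""
    results, errors = {}, {}
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        try:
            results[path.name] = ocr_func(str(path))
        except Exception as exc:  # keep going if one image fails
            errors[path.name] = str(exc)
    return results, errors


# Usage (requires openbharatocr; the folder name is a placeholder):
#   import openbharatocr
#   results, errors = process_folder(openbharatocr.water_bill, "scans/")
```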

Birth Certificate

This function takes the path of a Birth Certificate image as an input and returns its information in the form of a dictionary.

    import openbharatocr 
    dict_output = openbharatocr.birth_certificate(image_path)

Degree

This function takes the path of a Degree image as an input and returns its information in the form of a dictionary.

    import openbharatocr 
    dict_output = openbharatocr.degree(image_path)
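
All of the functions above share the same shape (image path in, dictionary out), so an application can select one by name. The following is a minimal sketch: the function names are those listed in this document, while `run_ocr` and `SUPPORTED_FUNCTIONS` are illustrative names. The module is passed in as an argument rather than imported here:

```python
# Function names as they appear in the sections above.
SUPPORTED_FUNCTIONS = (
    "pan",
    "front_aadhaar",
    "back_aadhaar",
    "driving_licence",
    "passport",
    "voter_id_front",
    "voter_id_back",
    "vehicle_registration",
    "water_bill",
    "birth_certificate",
    "degree",
)


def run_ocr(ocr_module, function_name, image_path):
    """Look up function_name on ocr_module and call it with image_path."""
    if function_name not in SUPPORTED_FUNCTIONS:
        raise ValueError("Unsupported document function: {!r}".format(function_name))
    return getattr(ocr_module, function_name)(image_path)


# Usage (requires openbharatocr; the file name is a placeholder):
#   import openbharatocr
#   details = run_ocr(openbharatocr, "degree", "degree.jpg")
```

Restricting the lookup to a fixed tuple keeps arbitrary attribute names from being called on the module.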

Contribute & support

Contributions and questions are both welcome. If you would like to help develop openbharatocr, or if you run into a problem, please create an issue here:

    https://github.com/essentiasoftserv/openbharatocr/issues

Pre Commit

Note: Before committing your changes, run the pre-commit hooks:

    pre-commit run --all-files

