Extract information from the pan and aadhar card image
Pan Aadhar OCR
Extract Text from Pan and Aadhar Cards
Pan Aadhar OCR is a Python package that takes an image of a valid Pan/Aadhar document, extracts the text from it, and returns the information in JSON format.
- Easy to use
- Returns information in JSON
- Runs faster with a GPU
- If you don't have a GPU, you can still run it on the CPU, but it will be slower
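To see which of the two modes a machine is likely to use, you can check whether a CUDA GPU is visible. This is a minimal sketch that assumes the OCR backend is EasyOCR, which is built on PyTorch; the helper name `gpu_available` is hypothetical and not part of this package, and how the package itself selects a device is not documented here.

```python
def gpu_available() -> bool:
    """Return True if PyTorch is installed and can see a CUDA GPU."""
    try:
        import torch  # assumption: PyTorch is present as an EasyOCR dependency
    except ImportError:
        # Without PyTorch there is nothing to run on the GPU.
        return False
    return bool(torch.cuda.is_available())


if gpu_available():
    print("CUDA GPU detected: extraction should run faster")
else:
    print("No CUDA GPU detected: extraction will run on the CPU")
```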
Tech
Pan Aadhar OCR uses a number of open source projects to work properly:
- EasyOCR - Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.
- Python - Python is a programming language that lets you work quickly and integrate systems more effectively.
- OpenCV - OpenCV is open source and released under the BSD 3-Clause License. It is free for commercial use.
Installation
This library requires Python 3.6+ to run. You also need to install Tesseract on your system. On a Linux-based system, run:
sudo apt install tesseract-ocr
On Windows, you will need to download Tesseract from here and add it to the PATH.
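Before moving on, it can help to confirm that the Tesseract executable is actually reachable from Python. The sketch below uses only the standard library; the helper name `find_tesseract` is hypothetical, and it assumes the executable is called `tesseract`, as installed by the command above.

```python
import shutil


def find_tesseract():
    """Return the full path of the tesseract executable, or None if it is not on PATH."""
    return shutil.which("tesseract")


path = find_tesseract()
if path is None:
    print("tesseract was not found on PATH; install it or fix PATH first")
else:
    print(f"tesseract found at {path}")
```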
Install the package.
pip install pan-aadhar-ocr
Then import the package.
from pan_aadhar_ocr import Pan_Info_Extractor
Create an instance of the extractor.
extractor = Pan_Info_Extractor()
Pass the image to the extractor to get the results.
extractor.info_extractor('/content/pan test.jpeg')
This returns a result like the following:
{
    "Pan_number": "EKAPS0276J",
    "Name": "John Kevin Doe",
    "Father_Name": "Kevin Doe",
    "DOB": "31/10/1992"
}
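OCR output can contain misread characters, so it is often worth sanity-checking a result before using it. The following is a minimal sketch, not part of the package: the helper `check_pan_result` is hypothetical, it assumes the result is either a JSON string or a dict with the keys shown above, and it assumes `DOB` is formatted as DD/MM/YYYY as in the example. The PAN check relies on the standard format of five letters, four digits, and one letter.

```python
import json
import re
from datetime import datetime

# Standard PAN layout: 5 uppercase letters, 4 digits, 1 uppercase letter.
PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")


def check_pan_result(result):
    """Return a list of problems found in an extractor result (empty if none)."""
    # Assumption: the result may arrive as a JSON string or as a dict.
    data = json.loads(result) if isinstance(result, str) else dict(result)
    problems = []

    pan = str(data.get("Pan_number", "")).strip().upper()
    if not PAN_PATTERN.fullmatch(pan):
        problems.append("Pan_number does not match the AAAAA9999A format")

    try:
        # Assumption: dates are DD/MM/YYYY, as in the sample output.
        datetime.strptime(str(data.get("DOB", "")).strip(), "%d/%m/%Y")
    except ValueError:
        problems.append("DOB is not a valid DD/MM/YYYY date")

    for key in ("Name", "Father_Name"):
        if not str(data.get(key, "")).strip():
            problems.append(f"{key} is empty")

    return problems


sample = {
    "Pan_number": "EKAPS0276J",
    "Name": "John Kevin Doe",
    "Father_Name": "Kevin Doe",
    "DOB": "31/10/1992",
}
print(check_pan_result(sample))  # prints []
```

An empty list means the fields look plausible; it does not prove the values were read correctly from the card.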