OCR extractor for PAN and Aadhaar card details
Project description
OCR Extractor
A simple and efficient OCR-based data extraction tool for Indian PAN and Aadhaar cards using Tesseract OCR.
🆕 What's New in v0.1.3
- Corrected example usage in README:
  print(pan_data.get_pan())
  print(aadhaar_data.get_aadhaar())
- Includes all features from v0.1.2:
  - Added tesseract_cmd parameter to ExtractAadhaarData and ExtractPanData for custom Tesseract paths.
  - Fixed issue with preprocessing argument (preprocess) in child classes not being passed correctly.
(For full version history, see CHANGELOG.md)
✨ Features
- Extract PAN card data with a single function call
- Extract Aadhaar card data with a single function call
- Built-in preprocessing option for better OCR accuracy
- Cross-platform support (Windows, Linux, macOS) with configurable Tesseract path
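One way to take advantage of the configurable Tesseract path is to resolve it once and pass the result as tesseract_cmd. The sketch below is only an illustration: resolve_tesseract_cmd is a hypothetical helper that is not part of this package, and the Windows fallback is simply the default path listed under Arguments.

```python
import platform
import shutil
from typing import Optional

# Windows default path as quoted in the Arguments section of this README.
WINDOWS_DEFAULT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def resolve_tesseract_cmd(explicit: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: choose a value to pass as tesseract_cmd."""
    # An explicitly supplied path always wins.
    if explicit:
        return explicit
    # Otherwise look for a "tesseract" executable on PATH.
    found = shutil.which("tesseract")
    if found:
        return found
    # On Windows, fall back to the documented default install location.
    if platform.system() == "Windows":
        return WINDOWS_DEFAULT
    # Nothing found: the caller decides how to report this.
    return None
```

The returned string could then be passed along, for example ExtractPanData("pan_image.jpg", tesseract_cmd=resolve_tesseract_cmd()), assuming a non-None result.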
📦 Installation
pip install ocr-pro
🚀 Usage
Extract PAN Card Data
from ocr import ExtractPanData
# Default usage (preprocess=False by default)
pan_data = ExtractPanData("pan_image.jpg", tesseract_cmd="/usr/bin/tesseract")
print(pan_data.get_pan())
Extract Aadhaar Card Data
from ocr import ExtractAadhaarData
# You can also enable preprocessing
aadhaar_data = ExtractAadhaarData("aadhaar_image.jpg", tesseract_cmd="/usr/bin/tesseract", preprocess=True)
print(aadhaar_data.get_aadhaar())
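OCR output can contain misread characters, so it may be worth sanity-checking an extracted number before using it. The sketch below assumes you already have the candidate number as a string; this README does not specify what get_pan() or get_aadhaar() return, so adapt it to the actual return value. The helper names are hypothetical, and the checks cover only the general shape (ten-character PAN, twelve-digit Aadhaar), not any checksum.

```python
import re

# PAN shape: five letters, four digits, one letter.
PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")
# Aadhaar shape: twelve digits (often printed in groups of four).
AADHAAR_PATTERN = re.compile(r"[0-9]{12}")


def looks_like_pan(text: str) -> bool:
    """Hypothetical check: does the string have the PAN shape?"""
    return PAN_PATTERN.fullmatch(text.strip().upper()) is not None


def looks_like_aadhaar(text: str) -> bool:
    """Hypothetical check: twelve digits once spaces and hyphens are removed."""
    digits = re.sub(r"[\s-]", "", text)
    return AADHAAR_PATTERN.fullmatch(digits) is not None
```

A value that fails these checks is a hint to retry with preprocess=True or with a cleaner image.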
Arguments
- filepath (str) → Path to the image file
- tesseract_cmd (str, optional) → Path to the Tesseract executable (default: system auto-detection or "C:\Program Files\Tesseract-OCR\tesseract.exe" on Windows)
- preprocess (bool, default=False) → Whether to apply preprocessing for better OCR results
⚙️ Requirements
- Python 3.7+
- Tesseract OCR installed on your system
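Both requirements can be checked up front before any image is processed. The following is a minimal sketch using only the standard library; check_requirements is a hypothetical helper, not something this package provides, and it assumes the Tesseract binary answers to the --version flag.

```python
import subprocess
import sys
from typing import List


def check_requirements(tesseract_cmd: str = "tesseract") -> List[str]:
    """Hypothetical pre-flight check; returns a list of problems (empty if none)."""
    problems = []
    # Requirement 1: Python 3.7 or newer.
    if sys.version_info < (3, 7):
        problems.append("Python 3.7 or newer is required")
    # Requirement 2: Tesseract must be runnable at the given command or path.
    try:
        subprocess.run(
            [tesseract_cmd, "--version"],
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (OSError, subprocess.CalledProcessError):
        problems.append(f"Could not run {tesseract_cmd!r} --version")
    return problems
```

An empty list means both checks passed; otherwise each entry describes one thing to fix.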
📜 License
MIT License
Download files
Source Distribution
ocr_pro-0.1.3.tar.gz (4.0 kB)
Built Distribution
ocr_pro-0.1.3-py3-none-any.whl (4.6 kB)
File details
Details for the file ocr_pro-0.1.3.tar.gz.
File metadata
- Download URL: ocr_pro-0.1.3.tar.gz
- Upload date:
- Size: 4.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e4b99c5ce043fe0963b23430d268b9eec9f79e97dc6d7276e5a88dc55820b412 |
| MD5 | 69dd4c19d57c9c182714df06e754ec29 |
| BLAKE2b-256 | 6b3d710f36fa536fe7702ede8b3189d28bb5522115f2489d6bf8c499dba99ebd |
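A downloaded archive can be compared against the published SHA256 digest with Python's hashlib. This is a minimal sketch; sha256_of and matches_published_hash are hypothetical helper names, and the local file path is whatever location you saved the download to.

```python
import hashlib

# SHA256 digest published above for ocr_pro-0.1.3.tar.gz.
EXPECTED_SHA256 = "e4b99c5ce043fe0963b23430d268b9eec9f79e97dc6d7276e5a88dc55820b412"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()
```

For the wheel, pass its own published SHA256 digest as the expected argument.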
File details
Details for the file ocr_pro-0.1.3-py3-none-any.whl.
File metadata
- Download URL: ocr_pro-0.1.3-py3-none-any.whl
- Upload date:
- Size: 4.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bcb8c25420f71d038214499e8cdf142aff237ff4d996bf30cf50ced73abb2081 |
| MD5 | af7475218efe5947b0e2bcb7a4f4c593 |
| BLAKE2b-256 | a72f23bc8db762918c3b01c315a9099970b8ffd0d57a7042f9369ad300c32ba5 |