Kannada OCR
Project description
AksharaJaana
AksharaJaana is a package that uses Tesseract OCR as its backend to convert Kannada text into an editable format. You can use the following sample code on Ubuntu. Its special feature is that it can separate the columns of a page.
The Requirements
A Conda environment is preferred for smooth use.
- AksharaJaana (pip package), check out the latest version available
- Tesseract
- poppler
Details for Installation
Complete installation including requirements
Ubuntu
- Installing tesseract-ocr in the system
  - open a terminal
  - sudo apt-get update -y
  - sudo apt-get install -y tesseract-ocr
- Installing poppler in the system
  - sudo apt-get install -y poppler-utils
- Installing python and pip (if pip is not installed)
  - sudo apt-get install -y python3 python3-pip (these instructions were written for Python 3.6.9)
- Installing packages for AksharaJaana
  - pip install AksharaJaana
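After the steps above, it can help to confirm that the command-line tools are actually visible before running any OCR. The following is a minimal sketch, not part of AksharaJaana: the helper names `missing_tools`, `parse_langs`, and `installed_langs` are hypothetical, and it assumes that poppler-utils provides the `pdftoppm` executable and that the Kannada language data is reported by Tesseract under the code `kan`. On Ubuntu that data is typically packaged separately as `tesseract-ocr-kan`, so it may need to be installed in addition to `tesseract-ocr`.

```python
import shutil
import subprocess


def missing_tools(tools=("tesseract", "pdftoppm"), which=shutil.which):
    """Return the executables from `tools` that are not found on PATH."""
    return [name for name in tools if which(name) is None]


def parse_langs(listing):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = (line.strip() for line in listing.splitlines())
    # The first line is a header such as "List of available languages (3):".
    return {line for line in lines if line and not line.startswith("List of")}


def installed_langs():
    """Ask the local tesseract binary which language data it can see."""
    try:
        result = subprocess.run(
            ["tesseract", "--list-langs"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except FileNotFoundError:
        return set()  # tesseract itself is not installed
    # Some tesseract releases print the listing on stderr instead of stdout.
    return parse_langs(result.stdout or result.stderr)
```

If `missing_tools()` returns an empty list and `"kan" in installed_langs()` is true, the system-level requirements are probably in place.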
Windows
- Installing tesseract-ocr in the system
  - go to the download website and click on tesseract-ocr-w64-setup-v5.0.0-alpha.20200328.exe (64 bit)
  - after downloading, open the file and press Enter; you will be given an option to choose the languages, so choose Kannada under both script and language
  - next, check whether the folder C:\Program Files\Tesseract-OCR is present
  - if yes, add C:\Program Files\Tesseract-OCR to your system PATH by doing the following:
    1. Click on the Windows start button, search for "Edit the system environment variables", and click on Environment Variables
    2. Under System variables, look for and double-click on PATH, then click on New
    3. Add C:\Program Files\Tesseract-OCR and click OK
  - if no, manually copy the Tesseract-OCR folder (it should be in the download location after extraction) into Program Files on the C drive, then follow the same procedure
- Installing poppler in the system
  - go to the poppler download page
  - click on poppler-0.54_x86
  - after downloading, extract the archive
  - next, check whether the extracted folder (poppler-0.68.0_x86 in these instructions; use the name that matches the version you downloaded) is present in C:\Program Files
  - if yes, add C:\Program Files\poppler-0.68.0_x86\bin to your system PATH by doing the following:
    1. Click on the Windows start button, search for "Edit the system environment variables", and click on Environment Variables
    2. Under System variables, look for and double-click on PATH, then click on New
    3. Add C:\Program Files\poppler-0.68.0_x86\bin and click OK
  - if no, manually copy the poppler-0.68.0_x86 folder (it should be in the download location after extraction) into Program Files on the C drive, then follow the same procedure
- Installing python and pip in the system (if pip is not installed)
  - download and install Python
- Installing packages for AksharaJaana
  - open the command prompt
  - pip install AksharaJaana
- Reboot the system before starting to use it
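Once the system has rebooted, a quick way to see whether the PATH edits took effect is to look for the two folders in the PATH value. The sketch below is only an illustration under stated assumptions: `path_entries` and `is_on_path` are hypothetical helper names, and the folder names are the ones used in the steps above, which may differ on your machine. It uses Python's `ntpath` module so the comparison ignores letter case and trailing backslashes, as Windows does.

```python
import ntpath
import os


def path_entries(path_value):
    """Split a Windows PATH string into normalized directory names."""
    entries = []
    for raw in path_value.split(";"):
        cleaned = raw.strip().strip('"')  # entries are sometimes quoted
        if cleaned:
            entries.append(ntpath.normcase(ntpath.normpath(cleaned)))
    return entries


def is_on_path(directory, path_value=None):
    """Return True if `directory` appears in the given PATH string.

    When `path_value` is omitted, the PATH of the current process is used.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    target = ntpath.normcase(ntpath.normpath(directory))
    return target in path_entries(path_value)
```

For example, `is_on_path(r"C:\Program Files\Tesseract-OCR")` and `is_on_path(r"C:\Program Files\poppler-0.68.0_x86\bin")` should both return True in a new command prompt if the folders were added as described.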
Python Script
It's in test.py in the GitHub repo.
import AksharaJaana.main as ak
text = ak.ocr_engine("Your file Path")
print(text)
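Kannada output is often easier to inspect in a text file than in a terminal, so the script above can be extended to save its result. This is a hedged sketch rather than part of the package: `ocr_to_file` is a hypothetical helper, and it assumes that `ak.ocr_engine` takes a file path and returns the recognized text as a string, as the sample script suggests. The `engine` parameter exists only so the helper can be tried out without the OCR backend installed.

```python
from pathlib import Path


def ocr_to_file(input_path, output_path, engine=None):
    """Run OCR on `input_path` and write the text to `output_path` as UTF-8."""
    source = Path(input_path)
    if not source.is_file():
        raise FileNotFoundError(f"Input file not found: {source}")
    if engine is None:
        # Imported here so the helper can be defined without the package.
        import AksharaJaana.main as ak
        engine = ak.ocr_engine
    text = engine(str(source))  # assumed to return a str
    # UTF-8 keeps the Kannada characters intact regardless of the OS default.
    Path(output_path).write_text(text, encoding="utf-8")
    return text
```

A call such as `ocr_to_file("Your file Path", "output.txt")` would then leave the recognized text in output.txt.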
Download files
File details
Details for the file AksharaJaana-0.1.9.9.tar.gz.
File metadata
- Download URL: AksharaJaana-0.1.9.9.tar.gz
- Upload date:
- Size: 5.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/3.10.1 pkginfo/1.7.1 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.62.0 CPython/3.7.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1a6ef9f45770c1c7717a579a7cb4e058c7a017fa4faf1a4f05e614bdfccb9146
MD5 | b4b86686faf4458d79bd495ecd4f5bbd
BLAKE2b-256 | b6af643d0c2aedb1cef85a7e509675cebecdc07c269f447f97f4ef0c85ba1d47
File details
Details for the file AksharaJaana-0.1.9.9-py3-none-any.whl.
File metadata
- Download URL: AksharaJaana-0.1.9.9-py3-none-any.whl
- Upload date:
- Size: 6.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/3.10.1 pkginfo/1.7.1 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.62.0 CPython/3.7.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 7f45e8944c9a9948d52cd60962389f4a99e3bda95c102a0146628e9a783eeed0
MD5 | 8344d80ee361ec76d12c8451e1d55b04
BLAKE2b-256 | 4c86312ba72d77158605c341258f3dbafc49b4c270a151982c60736eaa95fd50