Kannada OCR

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

AksharaJaana

AksharaJaana is Optical Character Recognition (OCR) software for Kannada, jointly developed by the Department of Electronics and Communication, NMAM Institute of Technology, Nitte, Karnataka, India, and Kannada GaNaka Parishattu, Bengaluru. The current focus is on adding features to the engine and making it more user-friendly.

Requirements (a Conda environment is preferred for smooth use)

python >= 3.6.9
opencv-python >= 4.2.0.32
numpy==1.18.4
pdf2image==1.13.1
pytesseract
tesseract
poppler
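
Before running the engine, it can help to confirm that the requirements listed above are actually visible to Python. The following is a minimal sketch, not part of AksharaJaana: the function name `check_requirements` is hypothetical, it relies on `importlib.metadata` (Python 3.8 or newer), and it assumes poppler is detectable through its `pdftoppm` executable.

```python
import shutil
from importlib import metadata  # available from Python 3.8

# Names taken from the requirements list above.
PACKAGES = ["opencv-python", "numpy", "pdf2image", "pytesseract"]
# Executables: tesseract itself, and pdftoppm, which ships with poppler.
TOOLS = ["tesseract", "pdftoppm"]


def check_requirements(packages=PACKAGES, tools=TOOLS):
    """Return a dict mapping each requirement to its version/path, or None if missing."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    for tool in tools:
        # shutil.which returns the full path if the tool is on PATH, else None.
        report[tool] = shutil.which(tool)
    return report


if __name__ == "__main__":
    for name, found in check_requirements().items():
        print(f"{name}: {found or 'MISSING'}")
```

Any entry reported as MISSING points to the installation step below that still needs to be completed.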

Installation details (complete installation, including requirements)

--------------------------------------------------------------Ubuntu------------------------------------------------------------

Installing tesseract-ocr on the system: open a terminal and run the following commands, pressing Enter after each one:
    sudo apt-get update -y
    sudo apt-get install -y tesseract-ocr
Installing poppler on the system: in the terminal, run:
    sudo apt-get install -y poppler-utils
Installing Python and pip (if pip is not installed): in the terminal, run the following (any Python version >= 3.6.9 is suitable):
    sudo apt-get install -y python3 python3-pip
Installing the packages for AksharaJaana: in the terminal, run:
    pip install opencv-python==4.2.0.32
    pip install numpy==1.18.4
    pip install pdf2image==1.13.1
    pip install AksharaJaana==0.1.5
    pip install pytesseract
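
The Windows instructions select the Kannada data explicitly during installation, whereas on Ubuntu the language data may be packaged separately from tesseract-ocr. The sketch below is one way to check whether Kannada is available. It assumes that `tesseract --list-langs` prints one language code per line after a header line, and that `kan` is the Tesseract code for Kannada; the function names are hypothetical.

```python
import subprocess


def parse_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines()]
    # Drop empty lines and the "List of available languages ..." header.
    return [line for line in lines if line and not line.lower().startswith("list of")]


def installed_languages():
    """Return the language codes reported by the local tesseract, or [] if it is not found."""
    try:
        result = subprocess.run(
            ["tesseract", "--list-langs"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except FileNotFoundError:
        return []
    # Some older releases print the list on stderr instead of stdout.
    return parse_langs(result.stdout or result.stderr)


if __name__ == "__main__":
    print("Kannada available:", "kan" in installed_languages())
```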

------------------------------------------------------------- Windows-----------------------------------------------------------

Installing tesseract-ocr on the system:
1. Go to https://github.com/UB-Mannheim/tesseract/wiki and download tesseract-ocr-w64-setup-v5.0.0-alpha.20200328.exe (64 bit).
2. Open the downloaded file and press Enter to start the installer. When you are given the option to choose languages, select Kannada under both the script data and the language data.
3. After installation, check whether the folder C:\Program Files\Tesseract-OCR\ is present.

If the folder is present, add C:\Program Files\Tesseract-OCR\ to your system PATH as follows:
1. Click the Windows Start button, search for "Edit the system environment variables", and click Environment Variables...
2. Under System variables, find and double-click PATH, then click New.
3. Add C:\Program Files\Tesseract-OCR and click OK.

If the folder is not present, manually copy the tesseract-ocr folder (which should be in your downloads location after extraction) into Program Files on the C drive, then follow the same PATH procedure.

Installing poppler on the system:
1. Go to http://blog.alivate.com.au/poppler-windows/ and download poppler-0.68.0_x86.
2. Extract the downloaded archive.
3. Check whether the folder C:\Program Files\poppler-0.68.0_x86\ is present.

If the folder is present, add C:\Program Files\poppler-0.68.0_x86\bin to your system PATH as follows:
1. Click the Windows Start button, search for "Edit the system environment variables", and click Environment Variables...
2. Under System variables, find and double-click PATH, then click New.
3. Add C:\Program Files\poppler-0.68.0_x86\bin and click OK.

If the folder is not present, manually copy the poppler-0.68.0_x86 folder (which should be in your downloads location after extraction) into Program Files on the C drive, then follow the same PATH procedure.
Installing Python and pip on the system (if pip is not installed): go to the Python download page and download Python (any version >= 3.6.9). Run the installer and click to allow changes when the window asking for permission appears. Then click Next, accept, and finally OK. pip is installed along with Python.
Installing the packages for AksharaJaana: open a command prompt and run the following commands, pressing Enter after each one:
    pip install opencv-python==4.2.0.32
    pip install numpy==1.18.4
    pip install pdf2image==1.13.1
    pip install AksharaJaana==0.1.5
    pip install pytesseract
Reboot the system before you start using AksharaJaana.
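
A common cause of trouble on Windows is a PATH entry that was mistyped or never saved. As a rough check, the sketch below tests whether a folder such as C:\Program Files\Tesseract-OCR appears in PATH, ignoring letter case and trailing backslashes. The helper name `is_on_path` is hypothetical and not part of AksharaJaana; it assumes Windows-style, semicolon-separated PATH values.

```python
import ntpath
import os


def _normalise(entry):
    """Normalise a Windows path: strip quotes/spaces, trailing slashes, and case."""
    return ntpath.normcase(ntpath.normpath(entry.strip().strip('"')))


def is_on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries in a Windows PATH string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    wanted = _normalise(directory)
    entries = [e for e in path_value.split(";") if e.strip()]
    return any(_normalise(e) == wanted for e in entries)


if __name__ == "__main__":
    for folder in (
        "C:\\Program Files\\Tesseract-OCR",
        "C:\\Program Files\\poppler-0.68.0_x86\\bin",
    ):
        print(folder, "on PATH:", is_on_path(folder))
```

Run it in a new command prompt after rebooting, since a prompt that was already open does not pick up changes to the environment variables.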

Python Script

The script below is available as test.py in the GitHub repo.

import AksharaJaana.main as ak
from AksharaJaana.utils import utils

# Run OCR on a PDF and collect the recognised text
text = ak.ocr_engine('/home/navaneeth/Desktop/NandD/OCR_kannada/CamScanner 06-28-2020 12.12.10.pdf')

# Save the recognised text as an RTF file
u = utils()
u.write_as_RTF(text, saving_path='/home/navaneeth/Desktop/1.rtf')
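
The script above handles a single file. To process a whole folder of scans, the same call can be wrapped in a loop. The sketch below is an illustration only: `ocr_folder` is a hypothetical helper, and it assumes that `ak.ocr_engine` accepts a file path string and returns the recognised text, as the script above suggests. The OCR function is passed in as an argument so the helper does not depend on any particular release of the package.

```python
from pathlib import Path


def ocr_folder(folder, ocr, pattern="*.pdf"):
    """Apply `ocr` to every file in `folder` matching `pattern`.

    Returns a dict mapping each file name to the text returned by `ocr`.
    """
    results = {}
    for path in sorted(Path(folder).glob(pattern)):
        results[path.name] = ocr(str(path))
    return results


# Possible usage, assuming AksharaJaana is installed and "scans" is a local folder:
#   import AksharaJaana.main as ak
#   texts = ocr_folder("scans", ak.ocr_engine)
#   for name, text in texts.items():
#       print(name, len(text))
```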

Release history

Version	Release date
1.0.1.3	Nov 8, 2023
1.0.1.2	Nov 8, 2023
1.0.1.1	Apr 15, 2023
1.0.0.1	Jun 13, 2022
0.2.1.0	Apr 15, 2023
0.1.9.9	Sep 14, 2021
0.1.9.3	Jun 1, 2021 (this version)
0.1.8	Feb 5, 2021
0.1.7	Dec 23, 2020
0.1.6	Nov 3, 2020
0.1.5	Oct 29, 2020
0.1.2.9	Oct 29, 2020
0.1.2.8	Oct 29, 2020
0.1.2.7	Oct 29, 2020
0.1.2.6	Oct 29, 2020
0.1.2.4	Aug 24, 2020
0.1.2.3	Aug 24, 2020
0.1.2.2	Aug 20, 2020
0.1.2.1	Aug 20, 2020
0.1.0	Aug 6, 2020

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

AksharaJaana-0.1.9.3.tar.gz (6.2 kB)

Uploaded Jun 1, 2021 Source

Built Distributions

AksharaJaana-0.1.9.4-py3-none-any.whl (6.7 kB)

Uploaded Sep 14, 2021 Python 3

AksharaJaana-0.1.9.3-py3-none-any.whl (6.6 kB)

Uploaded Jun 1, 2021 Python 3

Hashes for AksharaJaana-0.1.9.3.tar.gz
Algorithm	Hash digest
SHA256	`a99aaf85b49a25ea5363e9d45e9bdb751d3c62cd6f85464cd3d11277decb98f3`
MD5	`5b06b46ceb803bcf20d3600101c0f7cf`
BLAKE2b-256	`6cfc1cfe459206da1af20b78069281f1910fa35d3370ac3ba545370cff5b6e86`

Hashes for AksharaJaana-0.1.9.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c1cc881701f57a88ed8f2ff1bfdf628174ff2556388d3d14e33e16ae9033c517`
MD5	`a6ddeb0a0c8ba3d885e4fe456b900df3`
BLAKE2b-256	`96ca98199333c9c1b7cf624dae26abf3e67e6dede47505d8220390fef1644682`

Hashes for AksharaJaana-0.1.9.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`772aa8ee2981d1674425cbc6a2aa72a089ce71f1889ce4b2770cc86339ce7776`
MD5	`2390d577e2c47dd454d98cb8a3728a6d`
BLAKE2b-256	`c470e36536d26dae73434c8114fc5aeedaa6bef7a2e1234cdd822e92900ee730`
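
The digests above can be used to confirm that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib` module; the function names and the local file name in the usage comment are assumptions for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published value."""
    return sha256_of(path) == expected_hex.strip().lower()


# Possible usage, assuming the source distribution was saved in the current folder:
#   ok = matches_digest(
#       "AksharaJaana-0.1.9.3.tar.gz",
#       "a99aaf85b49a25ea5363e9d45e9bdb751d3c62cd6f85464cd3d11277decb98f3",
#   )
#   print("digest matches:", ok)
```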