Bangla English Mixed Language Scene / Printed OCR Toolkit
apsis-ocr
Apsis-OCR is a mixed-language OCR system for printed documents, developed at Apsis Solutions Limited.
The full system is built from 3 components:
- Text detection : DBNet
- Text recognition : ApsisNet (Bangla), SVTR-LCNet (English)
- Text classification : DenseNet121
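A minimal sketch of how a classifier and two recognizers could be combined once the detector has produced crops. This is not the package's internal code: `recognize_by_language`, the stand-in recognizers, and the language codes `"bn"` and `"en"` are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Sequence

# A recognizer takes a list of crops and returns one text per crop.
Recognizer = Callable[[List[str]], List[str]]


def recognize_by_language(
    crops: Sequence[str],
    langs: Sequence[str],
    recognizers: Dict[str, Recognizer],
) -> List[str]:
    """Send each crop to the recognizer for its language, keeping input order."""
    if len(crops) != len(langs):
        raise ValueError("crops and langs must have the same length")

    # Group crop indices by language so each recognizer runs once on its own crops.
    groups: Dict[str, List[int]] = {}
    for idx, lang in enumerate(langs):
        groups.setdefault(lang, []).append(idx)

    texts = [""] * len(crops)
    for lang, indices in groups.items():
        outputs = recognizers[lang]([crops[i] for i in indices])
        # Write each recognized text back to the position of its crop.
        for i, text in zip(indices, outputs):
            texts[i] = text
    return texts


# Hypothetical stand-ins for the Bangla and English recognizers.
def fake_bn(batch: List[str]) -> List[str]:
    return [f"bn:{c}" for c in batch]


def fake_en(batch: List[str]) -> List[str]:
    return [f"en:{c}" for c in batch]


texts = recognize_by_language(
    ["a", "b", "c"], ["bn", "en", "bn"], {"bn": fake_bn, "en": fake_en}
)
print(texts)  # ['bn:a', 'en:b', 'bn:c']
```

Grouping by language before recognition lets each model process its crops in one call while the output order still matches the input crops.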
Installation
As module/pypi package
cpu installation
pip install apsisocr
pip install onnxruntime
pip install fastdeploy-python -f https://www.paddlepaddle.org.cn/whl/fastdeploy.html
gpu installation
Using a conda environment is recommended, especially for GPU.
- installing cudatoolkit and cudnn:
conda install cudatoolkit
conda install cudnn
- installing packages
pip install apsisocr
pip install onnxruntime-gpu
python -m pip install -U fastdeploy-gpu-python -f https://www.paddlepaddle.org.cn/whl/fastdeploy.html
- exporting environment variables
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'export LD_LIBRARY_PATH=$CUDNN_PATH/lib:$CONDA_PREFIX/lib/:$LD_LIBRARY_PATH' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
Building from source : Linux/Ubuntu
Using a conda environment is recommended.
- clone the repository :
git clone https://github.com/mnansary/apsisOCR.git
cd apsisOCR
- create a conda environment:
conda create -n apsisocr python=3.9
- activate conda environment:
conda activate apsisocr
- cpu installation :
bash install.sh cpu
- gpu installation :
bash install.sh gpu
Usage
ApsisNet : Bangla Recognizer
- usage
from apsisocr import ApsisNet
bnocr=ApsisNet()
bnocr.infer(crops)
- docstring for ApsisNet.infer
"""
Perform inference on image crops.
Args:
crops (list[np.ndarray]): List of image crops.
batch_size (int): Batch size for inference (default: 32).
normalize_unicode (bool): Flag to normalize unicode (default: True).
Returns:
list[str]: List of inferred texts.
"""
SVTR-LCNet : English Recognizer
from apsisocr import SVTRLCNet
enocr=SVTRLCNet()
enocr.infer(crops)
- docstring for SVTRLCNet.infer
"""
Perform inference on image crops.
Args:
crops (list[np.ndarray]): List of image crops.
batch_size (int): Batch size for inference.
Returns:
list[str]: List of recognized texts.
"""
DenseNet121BnEnClassifier : Language classifier
from apsisocr import DenseNet121BnEnClassifier
lang=DenseNet121BnEnClassifier()
lang.infer(crops)
- docstring for DenseNet121BnEnClassifier.infer
"""
Perform inference on image crops.
Args:
crops (list[np.ndarray]): List of image crops.
batch_size (int): Batch size for inference (default: 32).
Returns:
list[str]: List of inferred languages.
"""
PaddleDBNet : Text Detector
- see the official PaddleOCR website for a better understanding of the model
# initialization
from apsisocr import PaddleDBNet
detector=PaddleDBNet()
# getting word boxes
word_boxes=detector.get_word_boxes(img)
# getting line boxes
line_boxes=detector.get_line_boxes(img)
# getting crop with either of the results
crops=detector.get_crops(img,word_boxes)
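When an axis-aligned region is needed from a detected box, the enclosing rectangle can be computed as sketched below. The format returned by the detector is not documented here, so this assumes each box is a sequence of four (x, y) points, matching the four-point "poly" described for the overall system. `bounding_rect` is a hypothetical helper, not part of apsisocr.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def bounding_rect(poly: Sequence[Point]) -> Tuple[int, int, int, int]:
    """Return (x_min, y_min, x_max, y_max) enclosing the polygon.

    Coordinates are rounded outward so the rectangle never cuts into the box.
    """
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    return (
        math.floor(min(xs)),
        math.floor(min(ys)),
        math.ceil(max(xs)),
        math.ceil(max(ys)),
    )


# Illustrative coordinates only, not real detector output.
poly = [(10.2, 5.0), (50.7, 6.5), (50.1, 20.9), (9.8, 19.4)]
print(bounding_rect(poly))  # (9, 5, 51, 21)
```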
ApsisOCR : Overall System
from apsisocr import ApsisOCR
ocr=ApsisOCR()
results=ocr(img_path)
- docstring for ApsisOCR.__call__
"""
Perform OCR on an image.
Args:
img_path (str): Path to the image file.
Returns:
dict: OCR results containing recognized text and associated information. The dictionary has the following structure:
{
"text" : multiline text with newline separators
"result" : list of dictionaries, each with the following structure:
{
"line_no" : the line number of the word
"word_no" : the word number in the line
"poly" : the four point polygonal bounding box of the word in the image
"text" : the recognized text
"lang" : the classified language code
}
}
"""
See useage/useage.ipynb for examples.
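The "result" list described in the ApsisOCR.__call__ docstring can be regrouped into lines using "line_no" and "word_no". The sketch below uses hand-written sample data, not real OCR output, and `lines_from_result` is a hypothetical helper rather than part of the package.

```python
from collections import defaultdict
from typing import Dict, List


def lines_from_result(result: List[dict]) -> List[str]:
    """Join words into lines, ordered by line_no and then word_no."""
    by_line: Dict[int, List[dict]] = defaultdict(list)
    for word in result:
        by_line[word["line_no"]].append(word)

    lines = []
    for line_no in sorted(by_line):
        words = sorted(by_line[line_no], key=lambda w: w["word_no"])
        lines.append(" ".join(w["text"] for w in words))
    return lines


# Illustrative entries; "poly" is omitted because it is not needed for grouping.
sample_result = [
    {"line_no": 2, "word_no": 1, "text": "second", "lang": "en"},
    {"line_no": 1, "word_no": 2, "text": "world", "lang": "en"},
    {"line_no": 1, "word_no": 1, "text": "hello", "lang": "en"},
    {"line_no": 2, "word_no": 2, "text": "line", "lang": "en"},
]
print(lines_from_result(sample_result))  # ['hello world', 'second line']
```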
Deployment
- change directory to the deployment folder:
cd deployment
- change the configs as required in config.py
# This port will be used by the api_ocr.py
OCR_API_PORT=3032
# This api address is displayed after deploying api_ocr.py and this is used in app.py
OCR_API="http://172.20.4.53:3032/ocr"
- running the api and app:
python api_ocr.py # deploys at port 3032 by default
streamlit run app.py --server.port 3033 # deploys the streamlit-built frontend at port 3033
- api_ocr.py lets the API be used without any UI (a Postman screenshot is attached below)
- app.py runs a streamlit UI
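In config.py the port and the API address are set separately, so they can drift apart when one of them is edited. A minimal sketch of deriving the address from the port is shown below; `build_ocr_api` is a hypothetical helper and is not part of the repository's config.py.

```python
# This port will be used by api_ocr.py (value taken from the config above).
OCR_API_PORT = 3032


def build_ocr_api(host: str, port: int = OCR_API_PORT, route: str = "ocr") -> str:
    """Build the address that app.py would call (hypothetical helper)."""
    return f"http://{host}:{port}/{route.lstrip('/')}"


# The host is the example address from the config above; replace it with your server's.
OCR_API = build_ocr_api("172.20.4.53")
print(OCR_API)  # http://172.20.4.53:3032/ocr
```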
TESTED GPU INFERENCE SERVER CONFIG
OS : Ubuntu 20.04.6 LTS
Memory : 62.4 GiB
Processor : Intel® Xeon(R) Silver 4214R CPU @ 2.40GHz × 24
Graphics : NVIDIA RTX A6000/PCIe/SSE2
Gnome : 3.36.8
License
Contents of this repository are restricted to non-commercial research purposes only under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
Change Log
0.0.1 (24/09/2023)
- added usage and examples
0.0.2 (24/09/2023)
- added licensing
0.0.3 (05/11/2023)
- added base application without line and word number
0.0.4 (01/05/2024)
- apsisbnocr