Bangla English Mixed Language Scene / Printed OCR Toolkit
apsis-ocr
Apsis-OCR is a mixed-language OCR system for printed documents, developed at Apsis Solutions Limited.
The full system is built from 3 components:
- Text detection : DBNet
- Text recognition : ApsisNet (Bangla), SVTR-LCNet (English)
- Text classification : DenseNet121
Installation
As module/pypi package
cpu installation
pip install apsisocr
pip install onnxruntime
pip install fastdeploy-python -f https://www.paddlepaddle.org.cn/whl/fastdeploy.html
gpu installation
Using a conda environment is recommended, especially for GPU.
- installing cudatoolkit and cudnn:
conda install cudatoolkit
conda install cudnn
- installing packages
pip install apsisocr
pip install onnxruntime-gpu
python -m pip install -U fastdeploy-gpu-python -f https://www.paddlepaddle.org.cn/whl/fastdeploy.html
- exporting environment variables
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'export LD_LIBRARY_PATH=$CUDNN_PATH/lib:$CONDA_PREFIX/lib/:$LD_LIBRARY_PATH' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
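After the GPU installation, it can help to confirm that onnxruntime actually sees CUDA before running the OCR models. The following is a minimal sketch, not part of apsisocr: the helper names are hypothetical, and it relies only on `onnxruntime.get_available_providers()`, which lists the execution providers of the installed build.

```python
def has_cuda_provider(providers):
    """Return True if the CUDA execution provider is in the given list."""
    return "CUDAExecutionProvider" in providers


def check_onnxruntime_gpu():
    """Return (cuda_available, providers) for the installed onnxruntime build."""
    # Imported lazily so has_cuda_provider() can be used without onnxruntime.
    import onnxruntime as ort

    providers = ort.get_available_providers()
    return has_cuda_provider(providers), providers


# Example (run inside the activated conda environment):
#   ok, providers = check_onnxruntime_gpu()
#   print("CUDA available:", ok, "| providers:", providers)
```

If only `CPUExecutionProvider` is listed, the `onnxruntime-gpu` package or the `LD_LIBRARY_PATH` export above is the first thing to re-check.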
Building from source : Linux/Ubuntu
Using a conda environment is recommended.
- clone the repository :
git clone https://github.com/mnansary/apsisOCR.git
cd apsisOCR
- create a conda environment:
conda create -n apsisocr python=3.9
- activate conda environment:
conda activate apsisocr
- cpu installation :
bash install.sh cpu
- gpu installation :
bash install.sh gpu
Usage
ApsisNet : Bangla Recognizer
- usage
from apsisocr import ApsisNet
bnocr=ApsisNet()
bnocr.infer(crops)
- docstring for ApsisNet.infer
"""
Perform inference on image crops.
Args:
crops (list[np.ndarray]): List of image crops.
batch_size (int): Batch size for inference (default: 32).
normalize_unicode (bool): Flag to normalize unicode (default: True).
Returns:
list[str]: List of inferred texts.
"""
SVTR-LCNet : English Recognizer
from apsisocr import SVTRLCNet
enocr=SVTRLCNet()
enocr.infer(crops)
- docstring for SVTRLCNet.infer
"""
Perform inference on image crops.
Args:
crops (list[np.ndarray]): List of image crops.
batch_size (int): Batch size for inference.
Returns:
list[str]: List of recognized texts.
"""
DenseNet121BnEnClassifier : Language classifier
from apsisocr import DenseNet121BnEnClassifier
lang=DenseNet121BnEnClassifier()
lang.infer(crops)
- docstring for DenseNet121BnEnClassifier.infer
"""
Perform inference on image crops.
Args:
crops (list[np.ndarray]): List of image crops.
batch_size (int): Batch size for inference (default: 32).
Returns:
list[str]: List of inferred languages.
"""
PaddleDBNet : Text Detector
- see the official PaddleOCR documentation for a better understanding of the model
# initialization
from apsisocr import PaddleDBNet
detector=PaddleDBNet()
# getting word boxes
word_boxes=detector.get_word_boxes(img)
# getting line boxes
line_boxes=detector.get_line_boxes(img)
# getting crop with either of the results
crops=detector.get_crops(img,word_boxes)
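The four components above can also be chained by hand, which is useful when only part of the pipeline is needed. The sketch below is an illustration under assumptions, not the internals of ApsisOCR: the function names are hypothetical, the language codes `"bn"` and `"en"` are assumed (check the actual output of `DenseNet121BnEnClassifier.infer`), and only the methods documented above (`get_word_boxes`, `get_line_boxes`, `get_crops`, `infer`) are called.

```python
def detect_and_crop(img, detector, level="word"):
    """Detect word or line boxes and return (boxes, crops)."""
    if level == "word":
        boxes = detector.get_word_boxes(img)
    elif level == "line":
        boxes = detector.get_line_boxes(img)
    else:
        raise ValueError("level must be 'word' or 'line'")
    return boxes, detector.get_crops(img, boxes)


def recognize_by_language(crops, langs, recognizers, batch_size=32):
    """Send each crop to the recognizer registered for its language code.

    recognizers maps a language code to an object with an infer() method.
    Texts are returned in the same order as crops.
    """
    if len(crops) != len(langs):
        raise ValueError("crops and langs must have the same length")
    texts = [""] * len(crops)
    # Group crop indices by language so each recognizer runs on one list.
    groups = {}
    for idx, code in enumerate(langs):
        groups.setdefault(code, []).append(idx)
    for code, indices in groups.items():
        recognizer = recognizers.get(code)
        if recognizer is None:
            continue  # unknown language code: leave the text empty
        outputs = recognizer.infer([crops[i] for i in indices], batch_size=batch_size)
        for i, text in zip(indices, outputs):
            texts[i] = text
    return texts


def run_manual_pipeline(img, detector, classifier, recognizers, level="word"):
    """Detect, crop, classify and recognize; returns one dict per box."""
    boxes, crops = detect_and_crop(img, detector, level)
    if len(crops) == 0:
        return []
    langs = classifier.infer(crops)
    texts = recognize_by_language(crops, langs, recognizers)
    return [{"poly": b, "text": t, "lang": l} for b, t, l in zip(boxes, texts, langs)]


# Example wiring (assumed language codes "bn" and "en"):
#   from apsisocr import PaddleDBNet, DenseNet121BnEnClassifier, ApsisNet, SVTRLCNet
#   words = run_manual_pipeline(
#       img, PaddleDBNet(), DenseNet121BnEnClassifier(),
#       {"bn": ApsisNet(), "en": SVTRLCNet()},
#   )
```

Grouping by language keeps each recognizer to a single `infer` call while the index bookkeeping preserves the detector's box order.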
ApsisOCR : Overall System
from apsisocr import ApsisOCR
ocr=ApsisOCR()
results=ocr(img_path)
- docstring for ApsisOCR.__call__
"""
Perform OCR on an image.
Args:
img_path (str): Path to the image file.
Returns:
dict: OCR results containing recognized text and associated information. The dictionary has the following structure:
{
"text" : multiline text with newline separators
"result" : list of dictionaries, each with the following structure:
{
"line_no" : the line number of the word
"word_no" : the word number in the line
"poly" : the four point polygonal bounding box of the word in the image
"text" : the recognized text
"lang" : the classified language code
}
}
"""
Check useage/useage.ipynb for examples.
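The result dictionary documented above can be post-processed with plain Python. The sketch below rebuilds text lines from the per-word entries and filters words by language. The helper names and the sample values are made up for illustration, and the language codes `"bn"` and `"en"` are an assumption.

```python
from collections import defaultdict


def group_words_by_line(ocr_result):
    """Rebuild text lines from result entries using line_no and word_no."""
    lines = defaultdict(list)
    for word in ocr_result["result"]:
        lines[word["line_no"]].append(word)
    ordered = []
    for line_no in sorted(lines):
        words = sorted(lines[line_no], key=lambda w: w["word_no"])
        ordered.append(" ".join(w["text"] for w in words))
    return ordered


def words_in_language(ocr_result, lang):
    """Return the recognized words whose classified language equals lang."""
    return [w["text"] for w in ocr_result["result"] if w["lang"] == lang]


# Made-up sample in the documented structure (polygons are placeholders).
sample = {
    "text": "hello world\nবাংলা",
    "result": [
        {"line_no": 1, "word_no": 2, "poly": [[60, 0], [110, 0], [110, 20], [60, 20]],
         "text": "world", "lang": "en"},
        {"line_no": 2, "word_no": 1, "poly": [[0, 30], [50, 30], [50, 50], [0, 50]],
         "text": "বাংলা", "lang": "bn"},
        {"line_no": 1, "word_no": 1, "poly": [[0, 0], [50, 0], [50, 20], [0, 20]],
         "text": "hello", "lang": "en"},
    ],
}

print(group_words_by_line(sample))
print(words_in_language(sample, "bn"))
```

With a real image, `sample` would be replaced by `ApsisOCR()(img_path)`.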
Deployment
- change directory to the deployment folder:
cd deployment
- change the configs as required in config.py:
# This port will be used by the api_ocr.py
OCR_API_PORT=3032
# This API address is displayed after deploying api_ocr.py and is used in app.py
OCR_API="http://172.20.4.53:3032/ocr"
- running the api and app:
python api_ocr.py # deploys at port 3032 by default
streamlit run app.py --server.port 3033 # deploys streamlit built frontend at 3033 port
- api_ocr.py lets the API be used without any UI (a Postman screenshot is attached below)
- app.py runs a Streamlit UI
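In config.py the port appears twice, once as `OCR_API_PORT` and once inside the `OCR_API` URL, so the two values can drift apart. One way to keep them consistent is to derive the URL from the port. This is only a sketch: `build_ocr_api_url` is a hypothetical helper, not part of the repository, and the `/ocr` route is taken from the example address above.

```python
def build_ocr_api_url(host, port=3032, route="/ocr"):
    """Build the address app.py uses to reach api_ocr.py."""
    return f"http://{host}:{port}{route}"


# Values mirroring the config.py example above.
OCR_API_PORT = 3032
OCR_API = build_ocr_api_url("172.20.4.53", OCR_API_PORT)
print(OCR_API)
```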
TESTED GPU INFERENCE SERVER CONFIG
OS : Ubuntu 20.04.6 LTS
Memory : 62.4 GiB
Processor : Intel® Xeon(R) Silver 4214R CPU @ 2.40GHz × 24
Graphics : NVIDIA RTX A6000/PCIe/SSE2
Gnome : 3.36.8
License
Contents of this repository are restricted to non-commercial research purposes only under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
Change Log
0.0.1 (24/09/2023)
- added usage and examples
0.0.2 (24/09/2023)
- added licensing
0.0.3 (05/11/2023)
- added base application without line and word number
0.0.4 (01/05/2024)
- apsisbnocr