Skip to main content

Bangla English Mixed Languge Scene / printed OCR Toolkit

Project description

apsis-ocr

Apsis-OCR is a Mixed language ocr system for Printed Documents developed at Apsis Solutions limited

The full system is build with 3 components:

  • Text detection : DBNet
  • Text recognition:
    • Bangla Text : ApsisNet
      • ApsisNet is a model developed at Apsis Solutions Limited.
      • It is used by bbOCR as the recognition model
      • ApsisNet is found to be the best among other available recognition models (such as tesseract and easyOCR) in the linked paper
    • English Text : SVTR-LCNet
  • Text classification : DenseNet121

Installation

As module/pypi package

cpu installation

pip install apsisocr
pip install onnxruntime
pip install fastdeploy-python -f https://www.paddlepaddle.org.cn/whl/fastdeploy.html

gpu installation

It is recommended to use conda environment . Specially for GPU.

  • installing cudatoolkit and cudnn:
conda install cudatoolkit
conda install cudnn
  • installing packages
pip install apsisocr
pip install onnxruntime-gpu
python -m pip install -U fastdeploy-gpu-python -f https://www.paddlepaddle.org.cn/whl/fastdeploy.html
  • exporting environment variables
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'export LD_LIBRARY_PATH=$CUDNN_PATH/lib:$CONDA_PREFIX/lib/:$LD_LIBRARY_PATH' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh

Building from source : Linux/Ubuntu

It is recommended to use conda environment .

  • clone the repository :
git clone https://github.com/mnansary/apsisOCR.git
cd apsisOCR
  • create a conda environment:
conda create -n apsisocr python=3.9
  • activate conda environment:
conda activate apsisocr
  • cpu installation :
bash install.sh cpu
  • gpu installation :
bash install.sh gpu

Useage

Apsisnet : Bangla Recognizer

  • useage
from apsisocr import ApsisNet
bnocr=ApsisNet()
bnocr.infer(crops)
  • docstring for ApsisNet.infer
"""
Perform inference on image crops.

Args:
    crops (list[np.ndarray]): List of image crops.
    batch_size (int): Batch size for inference (default: 32).
    normalize_unicode (bool): Flag to normalize unicode (default: True).

Returns:
    list[str]: List of inferred texts.
"""

SVTR-LCNet : English Recognizer

from apsisocr import SVTRLCNet
enocr=SVTRLCNet()
enocr.infer(crops)
  • docstring for SVTRLCNet.infer
"""
Perform inference on image crops.

Args:
    crops (list[np.ndarray]): List of image crops.
    batch_size (int): Batch size for inference.

Returns:
    list[str]: List of recognized texts.
"""

DenseNet121BnEnClassifier : Language classifier

from apsisocr import DenseNet121BnEnClassifier
lang=DenseNet121BnEnClassifier()
lang.infer(crops)
  • docstring for DenseNet121BnEnClassifier.infer
"""
Perform inference on image crops.

Args:
    crops (list[np.ndarray]): List of image crops.
    batch_size (int): Batch size for inference (default: 32).

Returns:
    list[str]: List of inferred languages.
"""

PaddleDBNet : Text Detector

  • check paddleOCR official website for better understanding of the model
# initialization
from apsisocr import PaddleDBNet
detector=PaddleDBNet()
# getting word boxes
word_boxes=detector.get_word_boxes(img)
# getting line boxes
line_boxes=detector.get_line_boxes(img)
# getting crop with either of the results
crops=detector.get_crops(img,word_boxes)

ApsisOCR : Overall System

from apsisocr import ApsisOCR
ocr=ApsisOCR()
results=ocr(img_path)
  • docstring for ApsisOCR.__call__
"""
Perform OCR on an image.

Args:
    img_path (str): Path to the image file.

Returns:
    dict: OCR results containing recognized text and associated information. The dictionary has the following structre
            {
            "text" : multiline text with newline separators
            "result" : list a dictionaries that contains the following structre:
                        {
                        "line_no" : the line number of the word
                        "word_no" : the word number in the line 
                        "poly"    : the four point polygonal bounding box of the word in the image
                        "text"    : the recognized text 
                        "lang"    : the classified language code
                        }
            }
"""  

check useage/useage.ipynb for examples

Deployment

  • cd deployment: change directory to deployment folder
  • change the configs as required in config.py
# This port will be used by the api_ocr.py 
OCR_API_PORT=3032
# This api address is diplayed after deploying api_ocr.py and this is used in app.py  
OCR_API="http://172.20.4.53:3032/ocr"
  • running the api and app:
python api_ocr.py # deploys at port 3032 by defautl
streamlit run app.py --server.port 3033 # deploys streamlit built frontend at 3033 port
  • The api_ocr.py lets the api to be used without any UI (a postman screenshot is attached below)

  • The app.py runs a streamlit UI

TESTED GPU INFERENCE SERVER CONFIG

OS          : Ubuntu 20.04.6 LTS      
Memory      : 62.4 GiB 
Processor   : Intel® Xeon(R) Silver 4214R CPU @ 2.40GHz × 24    
Graphics    : NVIDIA RTX A6000/PCIe/SSE2
Gnome       : 3.36.8

License

Contents of this repository are restricted to non-commercial research purposes only under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

Creative Commons License

Change Log

0.0.1 (24/09/2023)

  • added useage and examples

0.0.2 (24/09/2023)

  • added liscencing

0.0.3 (05/11/2023)

  • added base application without line and word number

0.0.4 (01/05/2024)

  • apsisbnocr

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

apsisocr-0.0.4.tar.gz (19.5 kB view hashes)

Uploaded Source

Built Distribution

apsisocr-0.0.4-py3-none-any.whl (21.2 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page