Bangla English Mixed Language Scene / Printed OCR Toolkit

apsis-ocr

Apsis-OCR is a mixed-language (Bangla and English) OCR system for printed documents, developed at Apsis Solutions Limited.

The full system is built from three components:

  • Text detection : DBNet
  • Text recognition:
    • Bangla Text : ApsisNet
      • ApsisNet is a model developed at Apsis Solutions Limited.
      • It is used by bbOCR as the recognition model
      • In the linked paper, ApsisNet performed best among the available recognition models it was compared with (such as Tesseract and EasyOCR)
    • English Text : SVTR-LCNet
  • Text classification : DenseNet121

Installation

As module/pypi package

cpu installation

pip install apsisocr
pip install onnxruntime
pip install fastdeploy-python -f https://www.paddlepaddle.org.cn/whl/fastdeploy.html

gpu installation

A conda environment is recommended, especially for GPU installation.

  • installing cudatoolkit and cudnn:
conda install cudatoolkit
conda install cudnn
  • installing packages
pip install apsisocr
pip install onnxruntime-gpu
python -m pip install -U fastdeploy-gpu-python -f https://www.paddlepaddle.org.cn/whl/fastdeploy.html
  • exporting environment variables
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'export LD_LIBRARY_PATH=$CUDNN_PATH/lib:$CONDA_PREFIX/lib/:$LD_LIBRARY_PATH' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh

Building from source : Linux/Ubuntu

A conda environment is recommended.

  • clone the repository :
git clone https://github.com/mnansary/apsisOCR.git
cd apsisOCR
  • create a conda environment:
conda create -n apsisocr python=3.9
  • activate conda environment:
conda activate apsisocr
  • cpu installation :
bash install.sh cpu
  • gpu installation :
bash install.sh gpu

Usage

ApsisNet : Bangla Recognizer

  • usage
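from apsisocr import ApsisNet
# crops: list of np.ndarray image crops (for example from PaddleDBNet.get_crops)
bnocr=ApsisNet()
# returns a list of recognized Bangla strings, one per crop
texts=bnocr.infer(crops)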
  • docstring for ApsisNet.infer
"""
Perform inference on image crops.

Args:
    crops (list[np.ndarray]): List of image crops.
    batch_size (int): Batch size for inference (default: 32).
    normalize_unicode (bool): Flag to normalize unicode (default: True).

Returns:
    list[str]: List of inferred texts.
"""

SVTR-LCNet : English Recognizer

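from apsisocr import SVTRLCNet
# crops: list of np.ndarray image crops
enocr=SVTRLCNet()
# returns a list of recognized English strings, one per crop
texts=enocr.infer(crops)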
  • docstring for SVTRLCNet.infer
"""
Perform inference on image crops.

Args:
    crops (list[np.ndarray]): List of image crops.
    batch_size (int): Batch size for inference.

Returns:
    list[str]: List of recognized texts.
"""

DenseNet121BnEnClassifier : Language classifier

from apsisocr import DenseNet121BnEnClassifier
# crops: list of np.ndarray image crops
lang=DenseNet121BnEnClassifier()
# returns a list of inferred languages, one per crop
langs=lang.infer(crops)
  • docstring for DenseNet121BnEnClassifier.infer
"""
Perform inference on image crops.

Args:
    crops (list[np.ndarray]): List of image crops.
    batch_size (int): Batch size for inference (default: 32).

Returns:
    list[str]: List of inferred languages.
"""

PaddleDBNet : Text Detector

  • see the official PaddleOCR website for details about the model
# initialization
from apsisocr import PaddleDBNet
detector=PaddleDBNet()
# getting word boxes
word_boxes=detector.get_word_boxes(img)
# getting line boxes
line_boxes=detector.get_line_boxes(img)
# getting crops from either set of boxes
crops=detector.get_crops(img,word_boxes)
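
Once crops are available, the classifier and the two recognizers can be combined so that each crop goes to the recognizer for its language. The following is a minimal sketch of that routing step, not the way ApsisOCR implements it internally. The helper name recognize_by_language is hypothetical, and the label values "bn" and "en" are assumptions; check what DenseNet121BnEnClassifier.infer actually returns before relying on them. Stub recognizers are used so the sketch runs without the models.

```python
from typing import Any, Callable, Dict, List, Sequence


def recognize_by_language(
    crops: Sequence[Any],
    langs: Sequence[str],
    recognizers: Dict[str, Callable[[List[Any]], List[str]]],
) -> List[str]:
    """Send each crop to the recognizer registered for its language label.

    Texts are returned in the same order as `crops`. Crops whose label has
    no registered recognizer are left as empty strings.
    """
    if len(crops) != len(langs):
        raise ValueError("crops and langs must have the same length")
    texts: List[str] = [""] * len(crops)
    for label, recognize in recognizers.items():
        # Positions of the crops that were classified with this label.
        indices = [i for i, lang in enumerate(langs) if lang == label]
        if not indices:
            continue
        # One call per language, so each recognizer sees a single list.
        outputs = recognize([crops[i] for i in indices])
        for i, text in zip(indices, outputs):
            texts[i] = text
    return texts


# Stub recognizers standing in for ApsisNet().infer and SVTRLCNet().infer.
demo = recognize_by_language(
    crops=["c0", "c1", "c2"],
    langs=["bn", "en", "bn"],  # assumed label values
    recognizers={
        "bn": lambda batch: [f"bn:{c}" for c in batch],
        "en": lambda batch: [f"en:{c}" for c in batch],
    },
)
print(demo)
```

With the real models, the recognizers mapping would hold the infer methods of the Bangla and English recognizers, keyed by whatever labels the classifier emits.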

ApsisOCR : Overall System

from apsisocr import ApsisOCR
ocr=ApsisOCR()
results=ocr(img_path)
  • docstring for ApsisOCR.__call__
"""
Perform OCR on an image.

Args:
    img_path (str): Path to the image file.

Returns:
    dict: OCR results containing recognized text and associated information. The dictionary has the following structure:
            {
            "text" : multiline text with newline separators
            "result" : list of dictionaries, each with the following structure:
                        {
                        "line_no" : the line number of the word
                        "word_no" : the word number in the line 
                        "poly"    : the four point polygonal bounding box of the word in the image
                        "text"    : the recognized text 
                        "lang"    : the classified language code
                        }
            }
"""  

See useage/useage.ipynb for examples.
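
The "result" list documented for ApsisOCR.__call__ can be regrouped into lines with plain Python. The sketch below assumes that line_no and word_no are sortable integers and that words in a line should be joined with single spaces; neither is stated in the docstring. The helper name lines_from_result is hypothetical and the sample values are invented for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List


def lines_from_result(result: List[Dict[str, Any]]) -> List[str]:
    """Rebuild text lines from word entries using `line_no` and `word_no`."""
    words_by_line: Dict[int, List[Dict[str, Any]]] = defaultdict(list)
    for word in result:
        words_by_line[word["line_no"]].append(word)
    lines = []
    for line_no in sorted(words_by_line):
        # Order the words inside the line before joining them.
        words = sorted(words_by_line[line_no], key=lambda w: w["word_no"])
        lines.append(" ".join(w["text"] for w in words))
    return lines


# Hypothetical output in the documented shape (all values are invented).
sample = {
    "text": "hello world\nবাংলা",
    "result": [
        {"line_no": 1, "word_no": 2, "text": "world", "lang": "en",
         "poly": [[60, 0], [110, 0], [110, 20], [60, 20]]},
        {"line_no": 2, "word_no": 1, "text": "বাংলা", "lang": "bn",
         "poly": [[0, 30], [50, 30], [50, 50], [0, 50]]},
        {"line_no": 1, "word_no": 1, "text": "hello", "lang": "en",
         "poly": [[0, 0], [50, 0], [50, 20], [0, 20]]},
    ],
}
lines = lines_from_result(sample["result"])
print(lines)
```

The same grouping can be used to filter words by "lang" or to collect the "poly" boxes of a single line.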

Deployment

  • cd deployment : change directory to the deployment folder
  • change the configs as required in config.py
# Port used by api_ocr.py
OCR_API_PORT=3032
# API address displayed after deploying api_ocr.py; app.py uses this address
OCR_API="http://172.20.4.53:3032/ocr"
  • running the api and app:
python api_ocr.py # deploys at port 3032 by default
streamlit run app.py --server.port 3033 # deploys the Streamlit frontend at port 3033
  • api_ocr.py lets the API be used without any UI (a Postman screenshot is attached below)

  • app.py runs a Streamlit UI
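
Because OCR_API repeats the port from OCR_API_PORT, the two values can drift apart when one is edited. A minimal sketch of deriving the address instead is shown here. OCR_API_HOST and build_ocr_api are hypothetical names that do not exist in the shipped config.py, and the host and "/ocr" route are copied from the example config above.

```python
# Sketch of a config.py that derives the API address from host and port.
OCR_API_HOST = "172.20.4.53"  # hypothetical setting; host from the example config
OCR_API_PORT = 3032           # port used by api_ocr.py


def build_ocr_api(host: str, port: int, route: str = "/ocr") -> str:
    """Return the HTTP address of the OCR endpoint."""
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")
    return f"http://{host}:{port}{route}"


# Address that app.py would read.
OCR_API = build_ocr_api(OCR_API_HOST, OCR_API_PORT)
print(OCR_API)
```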

TESTED GPU INFERENCE SERVER CONFIG

OS          : Ubuntu 20.04.6 LTS      
Memory      : 62.4 GiB 
Processor   : Intel® Xeon(R) Silver 4214R CPU @ 2.40GHz × 24    
Graphics    : NVIDIA RTX A6000/PCIe/SSE2
Gnome       : 3.36.8

License

Contents of this repository are restricted to non-commercial research purposes only under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

Change Log

0.0.1 (24/09/2023)

  • added usage and examples

0.0.2 (24/09/2023)

  • added licensing

0.0.3 (05/11/2023)

  • added base application without line and word number

0.0.4 (01/05/2024)

  • apsisbnocr
