Encoder-Decoder base for Vietnamese handwriting recognition

Project description

Vietnamese Handwriting Text Recognition (aka vnhtr package)

This project deploys and improves on two foundation models: TrOCR and the VGG Transformer from VietOCR.

Proposed Architecture

VGG Transformer with Rethinking Head

TrOCR with Rethinking Head

Usage

vnhtr package

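pip install vnhtr

from PIL import Image
from vnhtr.vnhtr_script.tools import VGGTransformer, TrOCR

# Load each predictor onto the first CUDA device.
vta_predictor = VGGTransformer("cuda:0")
tra_predictor = TrOCR("cuda:0")

# predict() takes a list of PIL images; the same sample is run through both models.
vta_predictor.predict([Image.open("/content/out_sample_2.jpg")])
tra_predictor.predict([Image.open("/content/out_sample_2.jpg")])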
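The usage above hard-codes "cuda:0" and passes a single-image list. For a larger set of images, a small wrapper along the following lines may help. This is only a sketch: pick_device, chunked and predict_in_batches are hypothetical helpers, not part of vnhtr. It assumes that the predictors accept "cpu" as a device string and that predict() returns one result per input image; neither is confirmed by the package description.

```python
from typing import Any, Iterator, List, Sequence


def pick_device(preferred: str = "cuda:0") -> str:
    """Return `preferred` if PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # only used to probe for CUDA support
    except ImportError:
        return "cpu"
    return preferred if torch.cuda.is_available() else "cpu"


def chunked(items: Sequence[Any], size: int) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def predict_in_batches(predictor: Any, images: Sequence[Any],
                       batch_size: int = 8) -> List[Any]:
    """Call predictor.predict() on fixed-size batches and join the results.

    Assumption: predict() returns an iterable with one entry per image.
    """
    results: List[Any] = []
    for batch in chunked(images, batch_size):
        results.extend(predictor.predict(list(batch)))
    return results
```

With the package installed, a call could look like predict_in_batches(VGGTransformer(pick_device()), images, batch_size=4), where images is a list of PIL images. Keeping batches small bounds GPU memory use at the cost of more predict() calls.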

Full implementation

git clone https://github.com/nguyenhoanganh2002/vnhtr
cd ./vnhtr/vnhtr/source
pip install -r requirements.txt
  • Pretrain/Finetune VGG Transformer/TrOCR (pretrain on a large dataset, then finetune on a wild dataset)
python VGGTransformer/train.py
python VisionEncoderDecoder/train.py
  • Pretrain VGG Transformer/TrOCR with Rethinking Head (large dataset)
python VGGTransformer/adapter_trainer.py
python VisionEncoderDecoder/adapter_trainer.py
  • Finetune VGG Transformer/TrOCR with Rethinking Head (wild dataset)
python VGGTransformer/finetune.py
python VisionEncoderDecoder/finetune.py
  • Instantiate the models directly, without going through the training or finetuning phases.
# Run from vnhtr/vnhtr/source so that these packages are importable.
from VGGTransformer.config import config as vggtransformer_cf
from VGGTransformer.models import VGGTransformer, AdapterVGGTransformer
from VisionEncoderDecoder.config import config as trocr_cf
from VisionEncoderDecoder.model import VNTrOCR, AdapterVNTrOCR

# Base models and their Rethinking Head (adapter) variants.
vt_base = VGGTransformer(vggtransformer_cf)
vt_adapter = AdapterVGGTransformer(vggtransformer_cf)
tr_base = VNTrOCR(trocr_cf)
tr_adapter = AdapterVNTrOCR(trocr_cf)
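
The training commands listed above can be chained from a single Python entry point. The sketch below builds the command for each stage and, by default, only prints it. The script paths are taken from the list above; the stage names, the build_commands and run_pipeline helpers, and the assumed order (base training, then Rethinking Head pretraining, then finetuning) are illustrative assumptions rather than something the repository documents.

```python
import subprocess
import sys
from typing import List, Optional, Sequence

# Directory of each model family inside vnhtr/vnhtr/source.
MODEL_DIRS = {
    "vgg_transformer": "VGGTransformer",
    "trocr": "VisionEncoderDecoder",
}

# Hypothetical stage names mapped to the scripts named in the README.
STAGE_SCRIPTS = {
    "base": "train.py",
    "rethinking_pretrain": "adapter_trainer.py",
    "rethinking_finetune": "finetune.py",
}


def build_commands(model: str,
                   stages: Optional[Sequence[str]] = None) -> List[List[str]]:
    """Return one argv list per requested stage, in the order given."""
    if model not in MODEL_DIRS:
        raise ValueError(f"unknown model: {model!r}")
    selected = list(STAGE_SCRIPTS) if stages is None else list(stages)
    commands = []
    for stage in selected:
        if stage not in STAGE_SCRIPTS:
            raise ValueError(f"unknown stage: {stage!r}")
        script = f"{MODEL_DIRS[model]}/{STAGE_SCRIPTS[stage]}"
        commands.append([sys.executable, script])
    return commands


def run_pipeline(model: str, stages: Optional[Sequence[str]] = None,
                 dry_run: bool = True) -> List[List[str]]:
    """Print each command; execute it only when dry_run is False."""
    commands = build_commands(model, stages)
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing stage.
            subprocess.run(cmd, check=True)
    return commands
```

Calling run_pipeline("trocr") from vnhtr/vnhtr/source prints the three commands without running them; passing dry_run=False would execute them in sequence.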

For access to the full dataset and pretrained weights, please contact: anh.nh204511@gmail.com

Download files

Source Distribution

vnhtr-0.1.5.tar.gz (32.9 kB)

Uploaded Source

Built Distribution

vnhtr-0.1.5-py3-none-any.whl (50.6 kB)

Uploaded Python 3

File details

Details for the file vnhtr-0.1.5.tar.gz.

File metadata

  • Download URL: vnhtr-0.1.5.tar.gz
  • Upload date:
  • Size: 32.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.7

File hashes

Hashes for vnhtr-0.1.5.tar.gz
Algorithm    Hash digest
SHA256       29c2df3a1f09b3f62fca7718bb0e56405998d99ee140dfa56c18873d6156ec82
MD5          06db9e7df4b750e2155e9b9d70ed2b62
BLAKE2b-256  ed2490b037830cd6d68cb44013b83f26b4e6a8abe1e3d5f11edf4cd9bff6d351

File details

Details for the file vnhtr-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: vnhtr-0.1.5-py3-none-any.whl
  • Upload date:
  • Size: 50.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.7

File hashes

Hashes for vnhtr-0.1.5-py3-none-any.whl
Algorithm    Hash digest
SHA256       e5152ae4ffa3b3512ca5ce7e427536837807cf2ee7ca234b07adb54a73a8f021
MD5          45969559f6cd05f7edc00a5dca6082b2
BLAKE2b-256  37ae7f721253bd8b9ddcf1d9c3a1ee74caab08a92e7b6314377528b3de2c1d78
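
The SHA256 digests listed for the two files can be checked locally after download. The following is a minimal sketch using only the standard library; sha256_of_file and verify_download are hypothetical helper names, and the sketch assumes the downloaded files keep the names shown above.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file details above.
EXPECTED_SHA256 = {
    "vnhtr-0.1.5.tar.gz":
        "29c2df3a1f09b3f62fca7718bb0e56405998d99ee140dfa56c18873d6156ec82",
    "vnhtr-0.1.5-py3-none-any.whl":
        "e5152ae4ffa3b3512ca5ce7e427536837807cf2ee7ca234b07adb54a73a8f021",
}


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path) -> bool:
    """Compare a downloaded file against the digest listed for its name."""
    path = Path(path)
    expected = EXPECTED_SHA256.get(path.name)
    if expected is None:
        raise KeyError(f"no published digest for {path.name}")
    return sha256_of_file(path) == expected
```

For example, verify_download("vnhtr-0.1.5.tar.gz") returns True only if the local file matches the digest listed above. pip can perform the same check itself when a requirements file pins hashes and is installed with --require-hashes.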
