Encoder-Decoder base for Vietnamese handwriting recognition

Vietnamese Handwriting Text Recognition (aka vnhtr package)

This project implements and improves on two foundational models: TrOCR and VietOCR (VGG Transformer).

Proposed Architecture

VGG Transformer with Rethinking Head

TrOCR with Rethinking Head

Usage

vnhtr package

pip install vnhtr

from PIL import Image
from vnhtr.vnhtr_script.tools import VGGTransformer, TrOCR

# Load each predictor on the first CUDA device.
vta_predictor = VGGTransformer("cuda:0")
tra_predictor = TrOCR("cuda:0")

# predict takes a list of PIL images.
image = Image.open("/content/out_sample_2.jpg")
vta_predictor.predict([image])
tra_predictor.predict([image])
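When there are many images, it can help to pass them to a predictor in fixed-size batches rather than as one large list. The sketch below is only an illustration and is not part of vnhtr: `chunked`, `predict_in_batches`, and `EchoPredictor` are hypothetical names, and the helper assumes that `predict` accepts a list and returns one result per input, which is not documented here.

```python
from typing import Any, Callable, Iterator, List, Sequence


def chunked(items: Sequence[Any], size: int) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def predict_in_batches(
    predictor: Any,
    paths: Sequence[str],
    load: Callable[[str], Any],
    batch_size: int = 8,
) -> List[Any]:
    """Load each path with `load` and call `predictor.predict` batch by batch.

    Assumption: `predictor.predict(list)` returns one result per input item.
    """
    results: List[Any] = []
    for batch in chunked(paths, batch_size):
        images = [load(path) for path in batch]
        results.extend(predictor.predict(images))
    return results


class EchoPredictor:
    """Stand-in predictor, used only to exercise the helper without a model."""

    def predict(self, images: List[Any]) -> List[str]:
        return [f"text for {image}" for image in images]


# Demo with the stand-in: the "loader" simply returns the path unchanged.
demo = predict_in_batches(
    EchoPredictor(), ["a.jpg", "b.jpg", "c.jpg"], load=lambda p: p, batch_size=2
)
print(demo)
```

With the real package, the same helper could presumably be called with `vta_predictor` or `tra_predictor` and `load=Image.open`, provided the assumption about the return value of `predict` holds.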

Full implementation

git clone https://github.com/nguyenhoanganh2002/vnhtr
cd ./vnhtr/vnhtr/source
pip install -r requirements.txt
  • Pretrain/Finetune VGG Transformer/TrOCR (pretraining on a large dataset and then finetuning on a wild dataset)
python VGGTransformer/train.py
python VisionEncoderDecoder/train.py
  • Pretrain VGG Transformer/TrOCR with Rethinking Head (large dataset)
python VGGTransformer/adapter_trainer.py
python VisionEncoderDecoder/adapter_trainer.py
  • Finetune VGG Transformer/TrOCR with Rethinking Head (wild dataset)
python VGGTransformer/finetune.py
python VisionEncoderDecoder/finetune.py
  • Access the model without going through the training or finetuning phases.
from VGGTransformer.config import config as vggtransformer_cf
from VGGTransformer.models import VGGTransformer, AdapterVGGTransformer
from VisionEncoderDecoder.config import config as trocr_cf
from VisionEncoderDecoder.model import VNTrOCR, AdapterVNTrOCR

vt_base = VGGTransformer(vggtransformer_cf)
vt_adapter = AdapterVGGTransformer(vggtransformer_cf)
tr_base = VNTrOCR(trocr_cf)
tr_adapter = AdapterVNTrOCR(trocr_cf)
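The training commands listed above can also be driven from a single script. The following is a minimal sketch, assuming the stages are meant to run in the order they are listed and from the vnhtr/vnhtr/source directory; `build_commands` and `run_pipeline` are hypothetical helpers, and any arguments the training scripts accept are not covered. By default it only prints the planned commands (dry run).

```python
import subprocess
import sys
from typing import List

# Script names and directories copied from the commands above.
# Assumption: the stages run in the order in which they are listed.
STAGES = ("train.py", "adapter_trainer.py", "finetune.py")
MODEL_DIRS = {"vgg": "VGGTransformer", "trocr": "VisionEncoderDecoder"}


def build_commands(model: str) -> List[List[str]]:
    """Return one command per training stage for the chosen model family."""
    if model not in MODEL_DIRS:
        raise ValueError(f"unknown model {model!r}; expected one of {sorted(MODEL_DIRS)}")
    return [[sys.executable, f"{MODEL_DIRS[model]}/{stage}"] for stage in STAGES]


def run_pipeline(model: str, source_dir: str = ".", dry_run: bool = True) -> List[List[str]]:
    """Print each stage command and, unless `dry_run` is set, execute it.

    `check=True` stops the pipeline at the first stage that exits with an error.
    """
    commands = build_commands(model)
    for command in commands:
        print("$", " ".join(command))
        if not dry_run:
            subprocess.run(command, cwd=source_dir, check=True)
    return commands


# Dry run: shows the three VGG Transformer stages without starting training.
planned = run_pipeline("vgg")
```

Calling `run_pipeline("trocr", source_dir="vnhtr/vnhtr/source", dry_run=False)` would then execute the TrOCR stages one after another, under the same assumptions.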

For access to the full dataset and pretrained weights, please contact: anh.nh204511@gmail.com
