Encoder-Decoder base for Vietnamese handwriting recognition
Vietnamese Handwriting Text Recognition (aka vnhtr package)
This project implements and improves two base models: TrOCR and the VGG Transformer from VietOCR.
Proposed Architecture
VGG Transformer with Rethinking Head
TrOCR with Rethinking Head
Usage
vnhtr package
pip install vnhtr
from PIL import Image
from vnhtr.vnhtr_script.tools import VGGTransformer, TrOCR

# Create both predictors on the first CUDA device
vta_predictor = VGGTransformer("cuda:0")
tra_predictor = TrOCR("cuda:0")

# predict() takes a list of PIL images
vta_predictor.predict([Image.open("/content/out_sample_2.jpg")])
tra_predictor.predict([Image.open("/content/out_sample_2.jpg")])
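For more than a handful of images, it may help to feed the predictors in fixed-size batches rather than one large list. The helper below is a minimal sketch, not part of the vnhtr package: `chunked` and `predict_in_batches` are hypothetical names, and the code assumes that `predict()` accepts a list of images and returns one result per image in input order, which the package description does not confirm.

```python
from typing import Any, Iterator, List, Sequence


def chunked(items: Sequence[Any], size: int) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def predict_in_batches(predictor: Any, images: Sequence[Any], batch_size: int = 8) -> List[Any]:
    """Call predictor.predict() on fixed-size batches and concatenate the results.

    Assumption (not confirmed by the package description): predict() accepts a
    list of images and returns one result per image, in input order.
    """
    results: List[Any] = []
    for batch in chunked(images, batch_size):
        results.extend(predictor.predict(list(batch)))
    return results
```

With the predictors above, this would be called as `predict_in_batches(vta_predictor, images, batch_size=8)`, where `images` is a list of PIL images. The batch size of 8 is an arbitrary default; a suitable value depends on available GPU memory.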
Full implementation
git clone https://github.com/nguyenhoanganh2002/vnhtr
cd ./vnhtr/vnhtr/source
pip install -r requirements.txt
- Pretrain/Finetune VGG Transformer/TrOCR (pretraining on a large dataset and then finetuning on a wild dataset)
python VGGTransformer/train.py
python VisionEncoderDecoder/train.py
- Pretrain VGG Transformer/TrOCR with Rethinking Head (large dataset)
python VGGTransformer/adapter_trainer.py
python VisionEncoderDecoder/adapter_trainer.py
- Finetune VGG Transformer/TrOCR with Rethinking Head (wild dataset)
python VGGTransformer/finetune.py
python VisionEncoderDecoder/finetune.py
- Access the model without going through the training or finetuning phases.
from VGGTransformer.config import config as vggtransformer_cf
from VGGTransformer.models import VGGTransformer, AdapterVGGTransformer
from VisionEncoderDecoder.config import config as trocr_cf
from VisionEncoderDecoder.model import VNTrOCR, AdapterVNTrOCR
vt_base = VGGTransformer(vggtransformer_cf)
vt_adapter = AdapterVGGTransformer(vggtransformer_cf)
tr_base = VNTrOCR(trocr_cf)
tr_adapter = AdapterVNTrOCR(trocr_cf)
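A quick sanity check after constructing the four models is to compare their parameter counts, for example to see how the base and Rethinking Head variants differ in size. The sketch below assumes the model classes behave like `torch.nn.Module` (exposing `parameters()`, with each parameter providing `numel()` and `requires_grad`); the source does not state this, and `count_parameters` is a hypothetical helper, not part of the repository.

```python
from typing import Any, Dict


def count_parameters(model: Any) -> Dict[str, int]:
    """Return total and trainable parameter counts for a PyTorch-style model.

    Assumption: `model` exposes parameters(), and each parameter has numel()
    and requires_grad, as torch.nn.Module and torch.Tensor do.
    """
    total = 0
    trainable = 0
    for param in model.parameters():
        n = param.numel()
        total += n
        if param.requires_grad:
            trainable += n
    return {"total": total, "trainable": trainable}
```

If the assumption holds, `count_parameters(vt_base)` and `count_parameters(vt_adapter)` can be compared side by side, and likewise for `tr_base` and `tr_adapter`.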
For access to the full dataset and pretrained weights, please contact: anh.nh204511@gmail.com
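Once the dataset is in place, the training commands listed above can be chained per model from a small driver script. This is only a sketch: the script names come from the commands in this README, but the stage order (base training, then Rethinking Head pretraining, then finetuning) is an assumption, and `stage_commands` and `run_stages` are hypothetical helpers, not part of the repository.

```python
import subprocess
import sys
from typing import List

# Script names taken from the README commands; the stage order is an assumption.
STAGE_SCRIPTS = ["train.py", "adapter_trainer.py", "finetune.py"]


def stage_commands(model_dir: str) -> List[List[str]]:
    """Build one command per stage for a model directory, e.g. 'VGGTransformer'."""
    return [[sys.executable, f"{model_dir}/{script}"] for script in STAGE_SCRIPTS]


def run_stages(model_dir: str, dry_run: bool = True) -> List[List[str]]:
    """Print each stage command and, unless dry_run is set, run it and stop on failure."""
    commands = stage_commands(model_dir)
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code
            subprocess.run(command, check=True)
    return commands
```

Called from `vnhtr/vnhtr/source`, `run_stages("VGGTransformer")` only prints the three commands; passing `dry_run=False` executes them in sequence. The same call with `"VisionEncoderDecoder"` covers the TrOCR scripts.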