ComicsOCR is a Python package for easily distributing OCR models trained for Golden Age comics.
Project description
To build locally after cloning, install with CUDA support:
pip install comics-ocr[cuda] -f https://download.pytorch.org/whl/torch_stable.html
or, for a CPU-only install:
pip install comics-ocr[cpu]
You can get the necessary model checkpoints and configs from the COMICS TEXT+ repository.
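Because the model is constructed from paths to these files, it can help to confirm they are in place before loading anything. The following is a minimal sketch, assuming the files were saved under the working directory with the names used in the usage example; `find_missing` is a hypothetical helper and not part of comics-ocr.

```python
from pathlib import Path


def find_missing(paths):
    """Return the entries of `paths` that do not exist as regular files."""
    return [p for p in paths if not Path(p).is_file()]


# File names as they appear in the usage example; adjust them to wherever
# you saved the configs and checkpoints (these locations are an assumption).
required = [
    "fcenet_r50dcnv2_fpn_1500e_ctw1500_custom/fcenet_r50dcnv2_fpn_1500e_ctw1500_custom.py",
    "fcenet_r50dcnv2_fpn_1500e_ctw1500_custom/best_0_hmean-iou:hmean_epoch_5.pth",
    "master_custom_dataset.py",
    "best_0_1-N.E.D_epoch_4.pth",
]

missing = find_missing(required)
if missing:
    print("Missing model files:")
    for path in missing:
        print("  " + path)
else:
    print("All model files found.")
```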
Usage
# Import the library
from comics_ocr import ComicsOCR

# Initialize the model with the detector and recognizer configs and checkpoints
e2e_ocr_model = ComicsOCR(
    ocr_detector_config="fcenet_r50dcnv2_fpn_1500e_ctw1500_custom/fcenet_r50dcnv2_fpn_1500e_ctw1500_custom.py",
    ocr_detector_checkpoint="fcenet_r50dcnv2_fpn_1500e_ctw1500_custom/best_0_hmean-iou:hmean_epoch_5.pth",
    recog_config="master_custom_dataset.py",
    ocr_recognition_checkpoint="best_0_1-N.E.D_epoch_4.pth",
    det="FCE_CTW_DCNv2",
    recog="MASTER",
)

# Run the model on a single image
img_path = "speech_bubble/0/3/9.jpg"
text, preprocessed_text, sanitized_text = e2e_ocr_model.extract_text(img_path)
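To process more than one image, the single-image call above can be wrapped in a loop. The sketch below assumes only what the usage example shows: that `extract_text` accepts a path string and returns a `(text, preprocessed_text, sanitized_text)` tuple. `extract_folder` and its `pattern` parameter are hypothetical names introduced for illustration, not part of comics-ocr.

```python
from pathlib import Path


def extract_folder(ocr_model, folder, pattern="*.jpg"):
    """Run `ocr_model.extract_text` on every matching image under `folder`.

    Returns a dict mapping each image path (as a string) to its sanitized text.
    """
    results = {}
    # rglob searches subdirectories too, e.g. speech_bubble/0/3/9.jpg
    for img_path in sorted(Path(folder).rglob(pattern)):
        # Only the third element of the returned tuple is kept here.
        _, _, sanitized_text = ocr_model.extract_text(str(img_path))
        results[str(img_path)] = sanitized_text
    return results
```

With the model from the usage example, this could be called as `results = extract_folder(e2e_ocr_model, "speech_bubble")`.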
Download files
Source Distribution
comics_ocr-0.1.2.tar.gz (204.2 kB)
Built Distribution
comics_ocr-0.1.2-py3-none-any.whl (931.2 kB)
Hashes for comics_ocr-0.1.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | cfd8c1a931265a4b69199de7f25b803608ee015f1259656a1c09331626c1987a
MD5 | 9cbdbf6106e8b0dab2a01067d94d815b
BLAKE2b-256 | feb4e1ef2299e295941d903a1cfbd3fb0a5f6a40d96daed22d8cec0f51af2ac6