ComicsOCR is a Python package that makes it easy to distribute OCR models trained for the golden age of comics.
Project description
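To build locally after cloning, run one of the following from the repository root. Note that `comics-ocr.[cuda]` is not a valid pip requirement; for a local checkout the project path is `.`, followed by the extra in brackets.
pip install ".[cuda]" -f https://download.pytorch.org/whl/torch_stable.html
or
pip install ".[cpu]"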
You can get the necessary model checkpoints and configs from the COMICS TEXT+ repository.
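Since the model depends on several checkpoint and config files, it can help to confirm they are in place before loading anything. The following is a minimal sketch, not part of ComicsOCR itself: `find_missing_files` is a hypothetical helper, and the paths passed to it are whatever locations you saved the downloaded files to.

```python
from pathlib import Path
from typing import Iterable, List


def find_missing_files(paths: Iterable[str]) -> List[str]:
    """Return the paths that do not point at an existing regular file.

    Hypothetical helper for illustration; it is not provided by comics_ocr.
    """
    return [p for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    # Example file names; replace them with your own download locations.
    required = [
        "master_custom_dataset.py",
        "best_0_1-N.E.D_epoch_4.pth",
    ]
    missing = find_missing_files(required)
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files are present.")
```

Running the check first tends to give a clearer error than a failure deep inside model loading.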
Usage
# Import the library
from comics_ocr import ComicsOCR

# Initialize the model with the detector and recognizer configs and checkpoints
e2e_ocr_model = ComicsOCR(
    ocr_detector_config="fcenet_r50dcnv2_fpn_1500e_ctw1500_custom/fcenet_r50dcnv2_fpn_1500e_ctw1500_custom.py",
    ocr_detector_checkpoint="fcenet_r50dcnv2_fpn_1500e_ctw1500_custom/best_0_hmean-iou:hmean_epoch_5.pth",
    recog_config="master_custom_dataset.py",
    ocr_recognition_checkpoint="best_0_1-N.E.D_epoch_4.pth",
    det="FCE_CTW_DCNv2",
    recog="MASTER",
)

# Run the model on a single image; it returns the raw, preprocessed, and sanitized text
img_path = "speech_bubble/0/3/9.jpg"
text, preprocessed_text, sanitized_text = e2e_ocr_model.extract_text(img_path)
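To process more than one image, the single call above can be wrapped in a loop. This is a sketch under one assumption taken from the usage example: `extract_text` accepts an image path and returns three values. `extract_many` is a hypothetical helper, not part of the package, and the types of the returned values are not assumed here.

```python
from typing import Any, Dict, Iterable, Tuple


def extract_many(model: Any, image_paths: Iterable[str]) -> Dict[str, Tuple[Any, Any, Any]]:
    """Run model.extract_text on each path and collect the results by path.

    Hypothetical helper: it only assumes that model.extract_text(path)
    returns three values, as in the usage example.
    """
    results: Dict[str, Tuple[Any, Any, Any]] = {}
    for path in image_paths:
        text, preprocessed_text, sanitized_text = model.extract_text(path)
        results[path] = (text, preprocessed_text, sanitized_text)
    return results
```

With the model from the usage example, `extract_many(e2e_ocr_model, ["speech_bubble/0/3/9.jpg"])` would return a dictionary keyed by image path, so the model is loaded once and reused for every image.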
Download files
Source Distribution
comics_ocr-0.1.1.tar.gz (204.2 kB)
Built Distribution
comics_ocr-0.1.1-py3-none-any.whl (931.2 kB)
Hashes for comics_ocr-0.1.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 808398a6b9398fe88075db036bcfbdfa55a3f66d9262424bb53cd9025ae5b8ae
MD5 | 5fb06c96fbe2f2b309ebf4b6674e3d65
BLAKE2b-256 | 7ca0961c7087cc1831d5cb61e6225485b4178a0f8c38c668960e984fc42a3135
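The digests listed above can be used to check a downloaded file before installing it. Here is a minimal sketch using Python's standard `hashlib`; the helper names are illustrative, and the wheel is assumed to be in the current directory.

```python
import hashlib

# SHA256 digest listed above for comics_ocr-0.1.1-py3-none-any.whl
EXPECTED_SHA256 = "808398a6b9398fe88075db036bcfbdfa55a3f66d9262424bb53cd9025ae5b8ae"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's SHA256 digest with the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("comics_ocr-0.1.1-py3-none-any.whl")` should return `True` for an intact download of that wheel. pip can also enforce digests at install time through its hash-checking mode.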