A Python library for OCR

Project description

Dulu

Dulu is a small Python library for performing OCR on images. Under the hood it uses Microsoft's Florence-2 model, a state-of-the-art vision model that handles many downstream tasks, including OCR.

Installation

pip install dulu

Example Code

Here is a snippet of the code used in the notebook:

from dulu import OCR
from PIL import Image

# Load the image; in a notebook, the bare `image` expression displays it
image = Image.open("./sample_image.png")
image

# Initialize the OCR object
ocr = OCR()

# Recognize text without bounding boxes
text = ocr.recognize_text("./sample_image.png", bbox=False)
print(text)

# Recognize text with bounding boxes
text_with_bbox = ocr.recognize_text("./sample_image.png")
print(text_with_bbox)
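
The structure of the value returned with bounding boxes is not described here. As a rough sketch only, assuming it mirrors the Florence-2 `<OCR_WITH_REGION>` output (a dict with `quad_boxes`, each a flat list of eight coordinates, and a parallel `labels` list), the hypothetical helper `quads_to_rects` below turns each quadrilateral into an axis-aligned rectangle. The helper name, the sample data, and the assumed dict layout are illustrations, not part of Dulu's documented API.

```python
def quads_to_rects(result):
    """Convert quadrilateral boxes to (left, top, right, bottom) rectangles.

    ASSUMPTION: `result` has the form
    {"quad_boxes": [[x1, y1, x2, y2, x3, y3, x4, y4], ...], "labels": [...]}
    """
    rects = []
    for quad, label in zip(result["quad_boxes"], result["labels"]):
        xs = quad[0::2]  # every even index is an x coordinate
        ys = quad[1::2]  # every odd index is a y coordinate
        rects.append((label, (min(xs), min(ys), max(xs), max(ys))))
    return rects


# Made-up sample data, used only to exercise the helper
sample = {
    "quad_boxes": [[10, 20, 110, 22, 108, 60, 9, 58]],
    "labels": ["Hello"],
}
print(quads_to_rects(sample))  # [('Hello', (9, 20, 110, 60))]
```

If the real return value has a different shape, only the two dictionary lookups need to change.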

List of models

Model Id             Model size  Model Description
Florence-2-base      0.23B       Pretrained model with FLD-5B
Florence-2-large     0.77B       Pretrained model with FLD-5B
Florence-2-base-ft   0.23B       Finetuned model on a collection of downstream tasks
Florence-2-large-ft  0.77B       Finetuned model on a collection of downstream tasks

By default, Dulu uses the Florence-2-base-ft model. You can override this with any other model from the list above when instantiating the OCR object: OCR(model_name="microsoft/{model_id}")
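
A minimal sketch of switching models, assuming only the `model_name` argument described above. The `FLORENCE_MODEL_IDS` tuple and the `model_name_for` and `build_ocr` helpers are hypothetical conveniences, not part of Dulu; they just guard against typos in the model id before any weights are downloaded.

```python
# Model ids copied from the list of models above
FLORENCE_MODEL_IDS = (
    "Florence-2-base",
    "Florence-2-large",
    "Florence-2-base-ft",
    "Florence-2-large-ft",
)


def model_name_for(model_id):
    """Return the full "microsoft/{model_id}" name, rejecting unknown ids."""
    if model_id not in FLORENCE_MODEL_IDS:
        raise ValueError(f"Unknown model id: {model_id!r}")
    return f"microsoft/{model_id}"


def build_ocr(model_id="Florence-2-base-ft"):
    """Create an OCR object for the chosen model (hypothetical wrapper)."""
    from dulu import OCR  # imported lazily so the helpers above work without dulu

    return OCR(model_name=model_name_for(model_id))


print(model_name_for("Florence-2-large"))  # microsoft/Florence-2-large
```

Calling `build_ocr("Florence-2-large")` would then load the larger pretrained model instead of the default.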

Acknowledgments

Download files

Source Distribution

dulu-0.1.3.tar.gz (292.3 kB)

Uploaded Source

Built Distribution

dulu-0.1.3-py3-none-any.whl (7.7 kB)

Uploaded Python 3

File details

Details for the file dulu-0.1.3.tar.gz.

File metadata

  • Download URL: dulu-0.1.3.tar.gz
  • Upload date:
  • Size: 292.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.4

File hashes

Hashes for dulu-0.1.3.tar.gz
Algorithm Hash digest
SHA256 48833941812aa8c1107fa2a25055c5920bcbc6663da00a4e8e8468e9ce3340f9
MD5 ca4cf6550b0d0b5bfd4baa4be97f8a60
BLAKE2b-256 47051146c28d9fbedf100ad413c9041dc4b95e632405f6eafb2de854d5ec2c5a
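
To check a downloaded file against a published digest, something like the following sketch may help. It uses only the standard-library `hashlib` module; the function names `sha256_of_file` and `matches_published_hash` are illustrative, and the file path is whatever location you saved the download to.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, `expected_hex` would be the SHA256 value listed above, and `path` the local copy of dulu-0.1.3.tar.gz.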

File details

Details for the file dulu-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: dulu-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 7.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.4

File hashes

Hashes for dulu-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 d42f90956b434b6d5d6d144477693a4d6f66eccc4bc16e2d178e7216ca97dfad
MD5 56fce5947fbd191280b173a419b5e406
BLAKE2b-256 dee7d506f35123e657b926db3afeca063e034946094e10dcb4e477cd7eed79ad
