A Python library for OCR

Dulu

Dulu is a small Python library for running OCR on images. Under the hood it uses Microsoft's Florence-2 model, a state-of-the-art vision model that handles many downstream tasks, including OCR.

Installation

pip install dulu

Example Code

Here is a snippet of the code used in the notebook:

from dulu import OCR
from PIL import Image

# Load the image (in a notebook, the bare `image` expression displays it)
image = Image.open("./sample_image.png")
image

# Initialize the OCR object
ocr = OCR()

# Recognize text without bounding boxes
text = ocr.recognize_text("./sample_image.png", bbox=False)
print(text)

# Recognize text with bounding boxes
text_with_bbox = ocr.recognize_text("./sample_image.png")
print(text_with_bbox)
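When several images need to be processed, it is usually worth creating the OCR object once and reusing it, since loading a vision model is the expensive step. The following is a minimal sketch of that pattern, not part of Dulu itself: `ocr_many` is a hypothetical helper, and it takes the recognizer as a plain callable so that it relies only on the `recognize_text(path, bbox=False)` call shown above.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable


def ocr_many(paths: Iterable[str], recognize: Callable[[str], object]) -> Dict[str, object]:
    """Apply `recognize` to every existing file in `paths`.

    Hypothetical helper (not provided by Dulu). Results are keyed by the
    path string; paths that do not point to a file are skipped.
    """
    results: Dict[str, object] = {}
    for path in paths:
        if not Path(path).is_file():
            # Skip missing files instead of failing the whole batch.
            continue
        results[str(path)] = recognize(str(path))
    return results


# Intended use with Dulu (assumes the package is installed):
#   from dulu import OCR
#   ocr = OCR()  # create once, reuse for every image
#   texts = ocr_many(
#       ["./sample_image.png"],
#       lambda p: ocr.recognize_text(p, bbox=False),
#   )
```

Passing the recognizer as a callable keeps the helper independent of the return type of `recognize_text`, which this README does not specify.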

List of models

| Model Id | Model size | Model Description |
|---|---|---|
| Florence-2-base | 0.23B | Pretrained model with FLD-5B |
| Florence-2-large | 0.77B | Pretrained model with FLD-5B |
| Florence-2-base-ft | 0.23B | Finetuned model on a collection of downstream tasks |
| Florence-2-large-ft | 0.77B | Finetuned model on a collection of downstream tasks |

By default, Dulu uses the Florence-2-base-ft model. You can override this with any other model from the list above when instantiating the OCR object: OCR(model_name="microsoft/{model_id}")
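As a rough illustration of the naming scheme, the model name is the model ID from the list prefixed with `microsoft/`. The helper below is a sketch under that assumption: `resolve_model_name` and `FLORENCE_MODEL_IDS` are hypothetical names, not part of Dulu, and the check only catches typos before a model download is attempted.

```python
# Model IDs taken from the list in this README.
FLORENCE_MODEL_IDS = (
    "Florence-2-base",
    "Florence-2-large",
    "Florence-2-base-ft",
    "Florence-2-large-ft",
)


def resolve_model_name(model_id: str = "Florence-2-base-ft") -> str:
    """Return the "microsoft/{model_id}" name for a listed model ID.

    Hypothetical helper (not provided by Dulu). Raises ValueError for
    IDs that are not in the list above.
    """
    if model_id not in FLORENCE_MODEL_IDS:
        raise ValueError(
            f"Unknown model id {model_id!r}; expected one of {FLORENCE_MODEL_IDS}"
        )
    return f"microsoft/{model_id}"


# Intended use with Dulu (assumes the package is installed):
#   from dulu import OCR
#   ocr = OCR(model_name=resolve_model_name("Florence-2-large-ft"))
```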

Acknowledgments
