A packaged and flexible version of the CRAFT text detector and Keras OCR example.

Reason this release was yanked:

The URLs for weights have changed. Please upgrade.

Project description

keras-ocr

This is a slightly polished and packaged version of the Keras OCR example and the published CRAFT text detection model. It provides a high-level API for training a text detection and OCR pipeline.

Getting Started

Installation

You must have the Cairo library installed to use the built-in image generator. Once Cairo is available, install the package with pip:

pip install git+https://github.com/faustomorales/keras-ocr.git#egg=keras-ocr

Using

The following examples show how the different components work, from running the pretrained text detector to training a simple OCR pipeline.

Using pretrained text detection model

The package ships with an easy-to-use implementation of the CRAFT text detection model with the original weights.

import keras_ocr

detector = keras_ocr.detection.Detector(pretrained=True)
image = keras_ocr.tools.read('path/to/image.jpg')

# Boxes will be an Nx4x2 array of box quadrangles
# where N is the number of detected text boxes.
boxes = detector.detect(images=[image])[0]
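Downstream code often wants plain rectangles rather than quadrangles, for example to crop a region. The sketch below reduces one 4x2 quadrangle to an axis-aligned bounding box. It assumes each point is stored as (x, y), which the text above does not state, and it uses hand-written sample data in place of real detector output. `quad_to_bbox` is a hypothetical helper, not part of keras-ocr.

```python
def quad_to_bbox(quad):
    """Return (x_min, y_min, x_max, y_max) for one 4x2 quadrangle.

    Assumes each of the four points is ordered (x, y). Works on nested
    lists and on NumPy arrays alike, since it only iterates and unpacks.
    """
    xs = [float(x) for x, _ in quad]
    ys = [float(y) for _, y in quad]
    return min(xs), min(ys), max(xs), max(ys)


# Hand-written stand-in for `boxes` (N=2 quadrangles, 4 points each).
example_boxes = [
    [[120, 40], [200, 42], [199, 70], [119, 68]],
    [[10, 10], [90, 12], [89, 38], [9, 36]],
]

bboxes = [quad_to_bbox(quad) for quad in example_boxes]
for bbox in bboxes:
    print(bbox)
```

Because slightly rotated text produces quadrangles that are not rectangles, the bounding box is a conservative superset of the detected region.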

Complete end-to-end training

You may wish to train your own end-to-end OCR pipeline! Here's an example of how you might do it. Note that the image generator has many options not documented here (such as adding backgrounds and image augmentation). Check the documentation for the keras_ocr.tools.get_image_generator function for more details.

import string

import keras_ocr

# The alphabet defines which characters
# the OCR will be trained to detect.
alphabet = string.digits + \
           string.ascii_lowercase + \
           string.ascii_uppercase + \
           string.punctuation + ' '

# For each text sample, the text generator provides
# a list of (category, string) tuples. The category
# is used to select which fonts the image generator
# should choose from when rendering those characters
# (see the image generator step below). This is useful
# for cases where you have characters that are only
# available in some fonts. You can replace this with
# your own generator; just be sure to match
# that function signature if you are using
# keras_ocr.tools.get_image_generator. Alternatively,
# you can provide your own image generator altogether.
# The default text generator uses the DocumentGenerator
# from essential-generators.
detection_text_generator = keras_ocr.get_text_generator(
    max_string_length=32,
    alphabet=alphabet
)

# We first need to build and train a text detector.
detector = keras_ocr.detection.Detector(pretrained=True)
detector.model.compile(
    loss='mse',
    optimizer='adam'
)

# The image generator generates (image, sentence)
# tuples where image is an HxWx1 image (grayscale)
# and sentence is a string using only letters
# from the selected alphabet. You can replace
# this with your own generator; just be sure to match
# that function signature.
detection_image_generator = keras_ocr.tools.get_image_generator(
    height=256,
    width=256,
    x_start=(10, 30),
    y_start=(10, 30),
    single_line=False,
    text_generator=detection_text_generator,
    font_groups={
        'characters': [
            'Century Schoolbook',
            'Courier',
            'STIX',
            'URW Chancery L',
            'FreeMono'
        ]
    }
)
detection_batch_generator = detector.get_batch_generator(
    image_generator=detection_image_generator,
    batch_size=2,
)
detector.model.fit_generator(
    generator=detection_batch_generator,
    steps_per_epoch=100,
    epochs=10,
    workers=0
)
detector.model.save_weights('v0_detector.h5')

# Great, now we need a recognizer.
recognizer = keras_ocr.Recognizer(
    alphabet=alphabet,
    width=128,
    height=64
)

# Create a training model (requires you to set
# a maximum string length).
recognizer.create_training_model(max_string_length=16)

# This next part is similar to before but now
# we adjust the image generator to provide only
# single lines of text.
recognition_image_generator = keras_ocr.tools.convert_multiline_generator_to_single_line(
    multiline_generator=detection_image_generator,
    max_string_length=recognizer.training_model.input_shape[1][1],
    target_width=recognizer.model.input_shape[2],
    target_height=recognizer.model.input_shape[1]
)
recognition_batch_generator = recognizer.get_batch_generator(
    image_generator=recognition_image_generator,
    batch_size=8
)
recognizer.training_model.fit_generator(
    generator=recognition_batch_generator,
    steps_per_epoch=100,
    epochs=100
)

# You can save the model weights to use later.
recognizer.model.save_weights('v0_recognizer.h5')

# Once training is done, you can use recognize
# to extract text.
image, _, _ = next(detection_image_generator)
boxes = detector.detect(images=[image])[0]
predictions = recognizer.recognize_from_boxes(boxes=boxes, image=image)
print(predictions)
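The comments in the training example say the default text generator can be swapped for your own, as long as each sample is a list of (category, string) tuples. Below is a minimal sketch of such a generator using only the standard library. `simple_text_generator` is a hypothetical name, and the 'characters' category is chosen only to match the font_groups key in the example above. Whether keras-ocr accepts this generator unchanged is an assumption based on those comments, not something verified against the library.

```python
import random
import string


def simple_text_generator(alphabet, max_string_length,
                          category="characters", seed=None):
    """Yield one-element lists of (category, text) tuples forever.

    Each text is 1..max_string_length characters drawn uniformly
    from `alphabet`. Hypothetical stand-in for the default generator.
    """
    rng = random.Random(seed)
    while True:
        length = rng.randint(1, max_string_length)
        text = "".join(rng.choice(alphabet) for _ in range(length))
        # A single category per sample; a richer generator could mix
        # several (category, text) tuples to use different font groups.
        yield [(category, text)]


alphabet = (string.digits + string.ascii_lowercase +
            string.ascii_uppercase + string.punctuation + " ")

generator = simple_text_generator(alphabet, max_string_length=32, seed=0)
sample = next(generator)
print(sample)
```

Random strings like these exercise every character in the alphabet evenly, but they do not look like natural text; a generator that samples real words or sentences may be a better fit when the target documents are prose.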

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

keras_ocr-0.1-py3-none-any.whl (20.0 kB view hashes)

Uploaded Python 3
