Fast & Lightweight OCR for vehicle license plates.

Fast & Lightweight License Plate OCR

Introduction

Lightweight and fast OCR models for license plate text recognition. You can train models from scratch or use the trained models for inference.

These models are meant to run after a license plate detector, since the OCR expects cropped plate images.
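
As a minimal sketch of the cropping step between a detector and the OCR, the helper below clamps a bounding box to the image and slices out the plate region. The `crop_plate` name and the `(x1, y1, x2, y2)` pixel box format are assumptions for illustration, not part of fast_plate_ocr; with a NumPy image the same crop would be `image[y1:y2, x1:x2]`.

```python
def crop_plate(image, box):
    """Crop a plate region from an image given as a list of pixel rows.

    `box` is a hypothetical detector output in (x1, y1, x2, y2) pixel
    coordinates, with x2 and y2 exclusive.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    x1, y1, x2, y2 = box
    # Clamp the box so that detections touching the border stay valid.
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    if x1 >= x2 or y1 >= y2:
        raise ValueError("box does not overlap the image")
    return [row[x1:x2] for row in image[y1:y2]]


# Toy 4x6 "image" whose pixel value encodes its position (row * 10 + column).
image = [[y * 10 + x for x in range(6)] for y in range(4)]
print(crop_plate(image, (1, 1, 4, 3)))  # [[11, 12, 13], [21, 22, 23]]
```

The cropped region is what would then be passed to the OCR model.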

Features

  • Keras 3 Backend Support: Compatible with TensorFlow, JAX, and PyTorch backends 🧠
  • Augmentation Variety: Diverse augmentations via Albumentations library 🖼️
  • Efficient Execution: Lightweight models that are cheap to run 💰
  • ONNX Runtime Inference: Fast and optimized inference with ONNX runtime ⚡
  • User-Friendly CLI: Simplified CLI for training and validating OCR models 🛠️
  • Model HUB: Access to a collection of pre-trained models ready for inference 🌟

Available Models

| Model Name | Time b=1 (ms)[1] | Throughput (plates/second)[1] | Dataset | Accuracy[2] | Dataset Description |
|---|---|---|---|---|---|
| argentinian-plates-cnn-model | 2.0964 | 477 | arg_plate_dataset.zip | 94.05% | Non-synthetic, plates up to 2020. |

[1] Inference on a Mac M1 chip using CPUExecutionProvider. Using CoreMLExecutionProvider speeds up inference by 5x.

[2] Accuracy refers to plate_acc. See the Model Metrics section.
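
The two timing columns are consistent with each other: at batch size 1, throughput is the reciprocal of the per-plate latency. A small sketch of that arithmetic (the function name is illustrative only):

```python
def throughput_from_latency(latency_ms: float) -> float:
    """Plates per second at batch size 1, given per-plate latency in ms."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


# 2.0964 ms per plate corresponds to the reported 477 plates/second.
print(round(throughput_from_latency(2.0964)))  # 477
```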

Reproduce results:
  • Calculate inference time:

    # Shell: install the package
    pip install fast_plate_ocr  # CPU
    # or
    pip install "fast_plate_ocr[inference_gpu]"  # GPU
    
    # Python: benchmark the pre-trained model
    from fast_plate_ocr import ONNXPlateRecognizer
    
    m = ONNXPlateRecognizer("argentinian-plates-cnn-model")
    m.benchmark()
    
  • Calculate model accuracy:

    pip install "fast-plate-ocr[train]"
    curl -LO https://github.com/ankandrew/fast-plate-ocr/releases/download/v1.0/arg_cnn_ocr_config.yaml
    curl -LO https://github.com/ankandrew/fast-plate-ocr/releases/download/v1.0/arg_cnn_ocr.keras
    curl -LO https://github.com/ankandrew/fast-plate-ocr/releases/download/v1.0/arg_plate_benchmark.zip
    unzip arg_plate_benchmark.zip
    fast_plate_ocr valid \
        -m arg_cnn_ocr.keras \
        --config-file arg_cnn_ocr_config.yaml \
        --annotations benchmark/annotations.csv
    

Inference

For inference only, install:

pip install fast_plate_ocr

For GPU inference, install:

pip install "fast_plate_ocr[inference_gpu]"

Usage

To run a prediction on an image stored on disk:

from fast_plate_ocr import ONNXPlateRecognizer

m = ONNXPlateRecognizer('argentinian-plates-cnn-model')
print(m.run('test_plate.png'))

To benchmark the model:

from fast_plate_ocr import ONNXPlateRecognizer

m = ONNXPlateRecognizer('argentinian-plates-cnn-model')
m.benchmark()

CLI

To train a model or use the CLI tool, install:

pip install "fast_plate_ocr[train]"

Train Model

To train the model you will need:

  1. A configuration file for the OCR model. Depending on your use case, you might need more plate slots or a different character set. Take the config for Argentinian license plates as an example:
    # Config example for Argentinian License Plates
    # The old license plates contain 6 slots/characters (e.g. JUH697)
    # and the new 'Mercosur' plates contain 7 slots/characters (e.g. AB123CD)
    
    # Max number of plate slots supported. This is the number of model classification heads.
    max_plate_slots: 7
    # All possible characters for the model output.
    alphabet: '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_'
    # Padding character for plates whose length is smaller than max_plate_slots. It must still be present in the alphabet.
    pad_char: '_'
    # Image height which is fed to the model.
    img_height: 70
    # Image width which is fed to the model.
    img_width: 140
    
  2. A labeled dataset; see arg_plate_dataset.zip for the expected data format.
  3. Run the train script:
    # You can set the backend to either TensorFlow, JAX or PyTorch
    # (just make sure it is installed)
    KERAS_BACKEND=tensorflow fast_plate_ocr train \
        --annotations path_to_the_train.csv \
        --val-annotations path_to_the_val.csv \
        --batch-size 128 \
        --epochs 750 \
        --dense \
        --early-stopping-patience 100 \
        --reduce-lr-patience 50
    
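
To illustrate how max_plate_slots, alphabet and pad_char fit together, here is a minimal sketch that pads a plate label to the fixed number of slots and maps each character to its index in the alphabet. The `encode_plate` helper is hypothetical and only shows one plausible encoding implied by the config comments; it is not necessarily how the library encodes labels internally.

```python
# Values taken from the Argentinian config example.
MAX_PLATE_SLOTS = 7
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_"
PAD_CHAR = "_"


def encode_plate(text, max_plate_slots=MAX_PLATE_SLOTS,
                 alphabet=ALPHABET, pad_char=PAD_CHAR):
    """Pad `text` to `max_plate_slots` and return one class index per slot."""
    if pad_char not in alphabet:
        raise ValueError("pad_char must be present in the alphabet")
    if len(text) > max_plate_slots:
        raise ValueError("plate is longer than max_plate_slots")
    padded = text.ljust(max_plate_slots, pad_char)
    unknown = sorted(set(padded) - set(alphabet))
    if unknown:
        raise ValueError(f"characters not in alphabet: {unknown}")
    return [alphabet.index(ch) for ch in padded]


# A 6-character plate gets one padding slot (index 36 is '_').
print(encode_plate("JUH697"))  # [19, 30, 17, 6, 9, 7, 36]
```

Each position in the returned list corresponds to one plate slot, so a 7-slot config always yields 7 class indices regardless of the plate length.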

You will probably want to adapt the augmentation pipeline to your dataset. To do this, define an Albumentations pipeline:

import albumentations as A

transform_pipeline = A.Compose(
    [
        # ...
        A.RandomBrightnessContrast(brightness_limit=0.1, contrast_limit=0.1, p=1),
        A.MotionBlur(blur_limit=(3, 5), p=0.1),
        A.CoarseDropout(max_holes=10, max_height=4, max_width=4, p=0.3),
        # ... and any other augmentation ...
    ]
)

# Export to a file (the resulting YAML can be used by the train script)
A.save(transform_pipeline, "./transform_pipeline.yaml", data_format="yaml")

You can then train with the custom transformation pipeline by passing the --augmentation-path option.

Visualize Augmentation

It's useful to visualize the augmentation pipeline before training the model. This helps you judge whether the augmentation should be heavier or lighter, since overly aggressive augmentation can hurt the model.

To see how much each image changes, show the augmented image next to the original:

fast_plate_ocr visualize-augmentation \
    --img-dir benchmark/imgs \
    --columns 2 \
    --show-original \
    --augmentation-path './transform_pipeline.yaml'

You will see something like:

Augmented Images

Validate Model

After training, you can validate the model on a labeled test dataset.

Example:

fast_plate_ocr valid \
    --model arg_cnn_ocr.keras \
    --config-file arg_plate_example.yaml \
    --annotations benchmark/annotations.csv

Visualize Predictions

Once you finish training your model, you can view the model predictions on raw data with:

fast_plate_ocr visualize-predictions \
    --model arg_cnn_ocr.keras \
    --img-dir benchmark/imgs \
    --config-file arg_cnn_ocr_config.yaml

You will see something like:

Visualize Predictions

Export as ONNX

Exporting the Keras model to ONNX format can speed up inference.

fast_plate_ocr export-onnx \
    --model arg_cnn_ocr.keras \
    --output-path arg_cnn_ocr.onnx \
    --opset 18 \
    --config-file arg_cnn_ocr_config.yaml

Keras Backend

To train the model, you can use whichever ML framework you prefer: Keras 3 supports TensorFlow, JAX and PyTorch backends.

To change the Keras backend you can either:

  1. Set the KERAS_BACKEND environment variable, e.g. to use JAX for training:
    KERAS_BACKEND=jax fast_plate_ocr train --config-file ...
    
  2. Edit your local config file at ~/.keras/keras.json.

Note: Make sure the framework you choose is installed before training.
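
The second option can also be scripted. The sketch below updates the "backend" key of a Keras JSON config file; the `set_keras_backend` helper is hypothetical, and the demo writes to a temporary directory rather than your real ~/.keras/keras.json. Note that Keras names the PyTorch backend "torch". If you set KERAS_BACKEND from Python via os.environ instead, do it before importing keras.

```python
import json
import tempfile
from pathlib import Path

# Backend names accepted here; Keras calls the PyTorch backend "torch".
VALID_BACKENDS = {"tensorflow", "jax", "torch"}


def set_keras_backend(config_path, backend):
    """Set the "backend" key in a keras.json file, keeping other settings."""
    if backend not in VALID_BACKENDS:
        raise ValueError(f"unsupported backend: {backend!r}")
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    config["backend"] = backend
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=4))
    return config


# Demo against a throwaway file so no real configuration is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / ".keras" / "keras.json"
    print(set_keras_backend(demo_path, "jax")["backend"])  # jax
```

To apply it for real, you would pass `Path.home() / ".keras" / "keras.json"` as the path.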

Model Architecture

The current model architecture is quite simple but effective. See cnn_ocr_model for the code.

The model output consists of several heads. Each head represents the prediction of a character of the plate. If the plate consists of 7 characters at most (max_plate_slots=7), then the model would have 7 heads.

Example of Argentinian plates:

Model head

Each head will output a probability distribution over the vocabulary specified during training. So the output prediction for a single plate will be of shape (max_plate_slots, vocabulary_size).
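
As an illustration of how an output of shape (max_plate_slots, vocabulary_size) maps back to text, the sketch below takes the most probable character per slot and strips trailing padding. It uses plain Python lists and the alphabet from the Argentinian config; `decode_plate` and `one_hot` are hypothetical helpers, and stripping the padding character is an assumption about how one would display the result.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_"
PAD_CHAR = "_"


def decode_plate(probs, alphabet=ALPHABET, pad_char=PAD_CHAR):
    """Decode per-slot probability rows into a plate string.

    `probs` has one row per plate slot; each row holds one probability
    per character of the alphabet.
    """
    chars = []
    for slot, row in enumerate(probs):
        if len(row) != len(alphabet):
            raise ValueError(f"slot {slot} has {len(row)} scores, "
                             f"expected {len(alphabet)}")
        best = max(range(len(row)), key=row.__getitem__)  # argmax
        chars.append(alphabet[best])
    return "".join(chars).rstrip(pad_char)


def one_hot(char, alphabet=ALPHABET):
    """Build a fake, fully confident head output for `char`."""
    return [1.0 if c == char else 0.0 for c in alphabet]


# Seven heads, the last one predicting the padding character.
probs = [one_hot(c) for c in "JUH697_"]
print(decode_plate(probs))  # JUH697
```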

Model Metrics

During training, you will see the following metrics:

  • plate_acc: Fraction of plates that were classified entirely correctly. For a single plate, if the ground truth is ABC123 and the prediction is ABC123, the score is 1. If the prediction is ABD123, the score is 0.
  • cat_acc: Fraction of plate characters that were classified correctly. For example, if the correct label is ABC123 and ABC133 is predicted, plate_acc gives 0% (the plate is not entirely correct), but cat_acc gives 83.3% (5/6).
  • top_3_k: How often the true character is among the top-3 predictions (the 3 with the highest probability).

Metrics are defined in this custom.py module.
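
The three metrics can be sketched in plain Python as follows. This is only an illustration of the definitions above, operating on strings and lists and assuming labels and predictions have equal length; it is not the code from custom.py.

```python
def plate_acc(y_true, y_pred):
    """Fraction of plates where every character matches."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cat_acc(y_true, y_pred):
    """Fraction of character slots predicted correctly, over all plates."""
    correct = total = 0
    for t, p in zip(y_true, y_pred):
        correct += sum(a == b for a, b in zip(t, p))
        total += len(t)
    return correct / total


def top_k_acc(true_indices, slot_probs, k=3):
    """Fraction of slots whose true class is among the k most probable."""
    hits = 0
    for idx, row in zip(true_indices, slot_probs):
        top = sorted(range(len(row)), key=row.__getitem__, reverse=True)[:k]
        hits += idx in top
    return hits / len(true_indices)


print(plate_acc(["ABC123"], ["ABC133"]))          # 0.0
print(round(cat_acc(["ABC123"], ["ABC133"]), 3))  # 0.833

# Two slots sharing the same scores over a toy 4-class vocabulary:
# class 1 is in the top 3, class 0 is not.
row = [0.1, 0.2, 0.3, 0.4]
print(top_k_acc([1, 0], [row, row]))              # 0.5
```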

Contributing

Contributions to the repo are greatly appreciated, whether they are bug fixes, feature enhancements, or new models.

To start contributing or to begin development, you can follow these steps:

  1. Clone the repo:
    git clone https://github.com/ankandrew/fast-plate-ocr.git
    
  2. Install all dependencies using Poetry:
    poetry install --all-extras
    
  3. Before submitting a PR, make sure your changes pass linting and tests:
    make checks
    

If you want to train a model and share it, we'll add it to the HUB 🚀

TODO

  • Expand model zoo.
  • Use synthetic image plates.
  • Finish and push TorchServe files.
