Resource utilization and Latency Estimation for ML on FPGA

rule4ml: Resource Utilization and Latency Estimation for ML

rule4ml is a tool designed for pre-synthesis estimation of FPGA resource utilization and inference latency for machine learning models.

Installation

rule4ml releases are uploaded to the Python Package Index for easy installation via pip.

pip install rule4ml

This installs only the base package and its dependencies for resource and latency prediction. The data_gen scripts and the Jupyter notebooks must be cloned from the repo if needed. The data generation dependencies are listed separately in data_gen/requirements.txt.
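
To confirm which release ended up in the active environment, the standard library can be queried. This is a minimal sketch, assuming the distribution is registered under the name rule4ml as in the pip command above; installed_version is a hypothetical helper and not part of rule4ml.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="rule4ml"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found is not None else "rule4ml is not installed in this environment")
```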

Getting Started

Tutorial

To get started with rule4ml, please refer to the detailed Jupyter Notebook tutorial. This tutorial covers:

  • Using pre-trained estimators for resource and latency predictions.
  • Generating synthetic datasets.
  • Training and testing your own predictors.

Usage

Here's a quick example of how to use rule4ml to estimate resources and latency for a given model:

import keras
from keras.layers import Input, Dense, Activation

from rule4ml.models.estimators import MultiModelEstimator

# Example of a simple keras Model
input_size = 16
inputs = Input(shape=(input_size,))
x = Dense(32, activation="relu")(inputs)
x = Dense(32, activation="relu")(x)
x = Dense(32, activation="relu")(x)
outputs = Dense(5, activation="softmax")(x)

model_to_predict = keras.Model(inputs=inputs, outputs=outputs, name="Jet Classifier")
model_to_predict.build((None, input_size))  # building keras models is required

# Loading default predictors
estimator = MultiModelEstimator()
estimator.load_default_models()

# MultiModelEstimator predictions are formatted as a pandas DataFrame
prediction_df = estimator.predict(model_to_predict)

# Further formatting can be applied to organize the DataFrame
if not prediction_df.empty:
    prediction_df = prediction_df.groupby(
        ["Model", "Board", "Strategy", "Precision", "Reuse Factor"], observed=True
    ).mean()  # each row is unique in the groupby; mean() is only called to convert the DataFrameGroupBy back to a DataFrame

# Outside of Jupyter notebooks, we recommend saving the DataFrame as HTML for better readability
prediction_df.to_html("keras_example.html")

keras_example.html (truncated)

Model           Board    Strategy  Precision        Reuse Factor  BRAM (%)  DSP (%)  FF (%)  LUT (%)  CYCLES
Jet Classifier  pynq-z2  Latency   ap_fixed<2, 1>              1      2.77     0.89    2.63    30.02   54.68
                                                               2      2.75     0.86    2.62    29.91   55.84
                                                               4      2.70     0.79    2.58    29.80   55.78
                                                               8      2.97     0.67    2.49    29.79   68.84
                                                              16      2.97     0.63    2.50    30.24   75.38
                                                              32      2.26     0.74    2.43    30.90   76.19
                                                              64      0.83     0.47    2.19    32.89  112.04
                                   ap_fixed<8, 3>              1      2.63     1.58   13.91   115.89   53.96
                                                               2      2.63     1.50   13.63   111.75   54.70
                                                               4      2.59     1.25   13.07   108.52   56.16
                                                               8      2.76     1.41   12.22   108.01   53.07
                                                              16      3.42     1.96   11.98   104.58   64.71
                                                              32      2.99     1.93   12.74    94.71   83.06
                                                              64      0.56     1.70   14.74    92.78  104.88
                                   ap_fixed<16, 6>             1      1.78   199.86   45.96   184.86   66.59
                                                               2      2.30   198.30   45.71   190.51   68.14
                                                               4      2.38   198.50   45.95   195.05   73.15
                                                               8      1.48   175.18   46.42   188.65   95.70
                                                              16      2.90    83.85   48.13   184.96  101.44
                                                              32      4.43    51.04   51.83   193.38  141.07
                                                              64      0.75    30.32   55.36   193.26  229.37
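
Assuming the utilization columns are percentages of the resources available on the board, a value above 100 suggests that the estimated design would not fit on the target. The sketch below shortlists configurations from a few rows copied from the example output. The fits and best_fitting helpers are hypothetical and not part of rule4ml.

```python
# A few rows from the example output:
# (precision, reuse_factor, BRAM %, DSP %, FF %, LUT %, cycles)
ROWS = [
    ("ap_fixed<2, 1>", 1, 2.77, 0.89, 2.63, 30.02, 54.68),
    ("ap_fixed<2, 1>", 64, 0.83, 0.47, 2.19, 32.89, 112.04),
    ("ap_fixed<8, 3>", 1, 2.63, 1.58, 13.91, 115.89, 53.96),
    ("ap_fixed<8, 3>", 32, 2.99, 1.93, 12.74, 94.71, 83.06),
    ("ap_fixed<8, 3>", 64, 0.56, 1.70, 14.74, 92.78, 104.88),
    ("ap_fixed<16, 6>", 64, 0.75, 30.32, 55.36, 193.26, 229.37),
]


def fits(row, limit=100.0):
    """True when every estimated resource percentage is at or below the limit."""
    return all(value <= limit for value in row[2:6])


def best_fitting(rows, limit=100.0):
    """Return the fitting row with the fewest estimated cycles, or None."""
    candidates = [row for row in rows if fits(row, limit)]
    return min(candidates, key=lambda row: row[6]) if candidates else None


if __name__ == "__main__":
    for row in ROWS:
        print(row[0], row[1], "fits" if fits(row) else "exceeds board")
    print("lowest-latency fitting configuration:", best_fitting(ROWS))
```

Because these are estimates, configurations close to the limit are worth confirming with an actual synthesis run.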

Datasets

Training accurate predictors requires large datasets of synthesized neural networks. We used hls4ml to synthesize neural networks whose parameters were randomly sampled from predefined ranges (the defaults of the data classes in the code). Our models' training data is publicly available at https://borealisdata.ca/dataverse/rule4ml.
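
The sampling step described above can be pictured with a short sketch. The layer count, unit, and activation ranges are placeholders chosen for illustration, not rule4ml's actual defaults. The precision and reuse factor choices are the ones that appear in the example output, and sample_network_config is a hypothetical helper.

```python
import random

# Hypothetical ranges for illustration only, not rule4ml's defaults
LAYER_COUNT_RANGE = (2, 6)
UNITS_CHOICES = [8, 16, 32, 64, 128]
ACTIVATIONS = ["relu", "tanh", "sigmoid"]
# Values taken from the example output table
PRECISIONS = ["ap_fixed<2, 1>", "ap_fixed<8, 3>", "ap_fixed<16, 6>"]
REUSE_FACTORS = [1, 2, 4, 8, 16, 32, 64]


def sample_network_config(rng):
    """Draw one random dense-network description plus synthesis settings."""
    n_layers = rng.randint(*LAYER_COUNT_RANGE)  # inclusive on both ends
    layers = [
        {"units": rng.choice(UNITS_CHOICES), "activation": rng.choice(ACTIVATIONS)}
        for _ in range(n_layers)
    ]
    return {
        "layers": layers,
        "precision": rng.choice(PRECISIONS),
        "reuse_factor": rng.choice(REUSE_FACTORS),
    }


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the draws are reproducible
    for _ in range(3):
        print(sample_network_config(rng))
```

In a real pipeline, each sampled configuration would then be built as a model, synthesized with hls4ml, and its reported resources and latency recorded as one training example.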

Limitations

In their current iteration, the predictors can process Keras or PyTorch models and produce FPGA resource (BRAM, DSP, FF, LUT) and latency (clock cycles) estimates for various synthesis configurations. However, the models used for training are limited to specific layers: Dense/Linear, ReLU, Tanh, Sigmoid, Softmax, BatchNorm, Add, Concatenate, and Dropout. They are also constrained by synthesis parameters, notably clock_period (10 ns) and io_type (io_parallel). Inputs outside these configurations may result in inaccurate predictions.
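
A quick check against the layer list above can flag models that fall outside the training coverage before their estimates are trusted. The sketch below works on layer type names only. The name set mirrors the list above with a few framework spellings added as assumptions (for instance, Keras names its batch normalization layer BatchNormalization and PyTorch uses BatchNorm1d), and unsupported_layers is a hypothetical helper, not part of rule4ml.

```python
# Layer types named in the Limitations paragraph, plus assumed framework spellings
SUPPORTED_LAYERS = {
    "Dense", "Linear", "ReLU", "Tanh", "Sigmoid", "Softmax",
    "BatchNorm", "BatchNormalization", "BatchNorm1d",
    "Add", "Concatenate", "Dropout",
}


def unsupported_layers(layer_type_names):
    """Return layer type names outside the supported set, in order, without duplicates."""
    flagged = []
    for name in layer_type_names:
        if name not in SUPPORTED_LAYERS and name not in flagged:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    # For a Keras model, the names could be collected with:
    #   [type(layer).__name__ for layer in model.layers]
    names = ["Dense", "ReLU", "Conv2D", "Dense", "Softmax", "Conv2D"]
    print(unsupported_layers(names))
```

Framework bookkeeping layers such as Keras's InputLayer would also be flagged by this check and may need to be filtered out first.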

License

This project is licensed under the GPL-3.0 License. See the LICENSE file for details.
