Project description


rule4ml: Resource Utilization and Latency Estimation for ML

rule4ml is a tool designed for pre-synthesis estimation of FPGA resource utilization and inference latency for machine learning models.

Installation

rule4ml releases are uploaded to the Python Package Index for easy installation via pip.

pip install rule4ml

This installs only the base package and its dependencies for resource and latency prediction. The data_gen scripts and the Jupyter notebooks can be cloned from the repository if needed.

The data generation dependencies are listed separately in data_gen/requirements.txt, or can be installed with:

pip install rule4ml[datagen]

Getting Started

Tutorial

To get started with rule4ml, please refer to the detailed Jupyter Notebook tutorial. This tutorial covers:

  • Using pre-trained estimators for resources and latency predictions.
  • Generating synthetic datasets.
  • Training and testing your own predictors.

Usage

Here's a quick example of how to use rule4ml to estimate resources and latency for a given model:

import keras
from keras.layers import Input, Dense, Activation

from rule4ml.models.wrappers import MultiModelWrapper

# Example of a simple keras Model
input_size = 16
inputs = Input(shape=(input_size,))
x = Dense(32, activation="relu")(inputs)
x = Dense(32, activation="relu")(x)
x = Dense(32, activation="relu")(x)
outputs = Dense(5, activation="softmax")(x)

model_to_predict = keras.Model(inputs=inputs, outputs=outputs, name="Jet Classifier")
model_to_predict.build((None, input_size))  # building keras models is required

# Loading default predictors
estimator = MultiModelWrapper()
estimator.load_default_models()

# MultiModelWrapper predictions are formatted as a pandas DataFrame
prediction_df = estimator.predict(model_to_predict)

# Further formatting can be applied to organize the DataFrame
if not prediction_df.empty:
    prediction_df = prediction_df.groupby(
        ["Model", "Board", "Strategy", "Precision", "Reuse Factor", "HLS4ML Version", "Vivado Version"], observed=True
    ).mean()  # each row is unique in the groupby, mean() is only called to convert DataFrameGroupBy

# Outside of Jupyter notebooks, we recommend saving the DataFrame as HTML for better readability
prediction_df.to_html("keras_example.html")

keras_example.html (truncated)

All rows share the same configuration: Model = Jet Classifier, Board = pynq-z2, Strategy = latency, Precision = ap_fixed<2, 1>, Reuse Factor = 1.

| HLS4ML Version | Vivado Version | BRAM | BRAM (%) | DSP | DSP (%) | FF | FF (%) | LUT | LUT (%) | CYCLES | INTERVAL |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.8.1 | 2019.1 | 2.52 | 0.90 | 0.32 | 0.14 | 1265.02 | 1.19 | 3564.90 | 6.70 | 125.77 | 1.35 |
| 0.8.1 | 2019.2 | 2.47 | 0.88 | 0.48 | 0.22 | 1262.29 | 1.19 | 3380.57 | 6.35 | 115.48 | 1.35 |
| 0.8.1 | 2020.1 | 2.29 | 0.82 | 0.49 | 0.22 | 1109.34 | 1.04 | 3279.37 | 6.16 | 115.62 | 1.35 |
| 0.8.1 | 2020.2 | 2.55 | 0.91 | 0.53 | 0.24 | 1490.04 | 1.40 | 3457.23 | 6.50 | 118.07 | 1.35 |
| 0.8.1 | 2021.1 | 2.31 | 0.83 | 0.44 | 0.20 | 1054.50 | 0.99 | 2915.67 | 5.48 | 118.99 | 1.35 |
| 0.8.1 | 2021.2 | 2.48 | 0.89 | 0.58 | 0.26 | 1085.17 | 1.02 | 3072.19 | 5.77 | 117.91 | 1.35 |
| 0.8.1 | 2022.1 | 2.53 | 0.90 | 0.47 | 0.21 | 1301.50 | 1.22 | 3093.67 | 5.82 | 119.36 | 1.35 |
| 0.8.1 | 2022.2 | 2.43 | 0.87 | 0.57 | 0.26 | 1150.09 | 1.08 | 3032.74 | 5.70 | 119.39 | 1.35 |
| 0.8.1 | 2023.1 | 2.51 | 0.90 | 0.59 | 0.27 | 1357.55 | 1.28 | 3327.19 | 6.25 | 118.30 | 1.35 |
| 0.8.1 | 2023.2 | 2.39 | 0.85 | 0.29 | 0.13 | 304.04 | 0.29 | 2689.27 | 5.06 | 108.34 | 1.35 |
| 0.8.1 | 2024.1 | 2.41 | 0.86 | 0.54 | 0.25 | 1574.28 | 1.48 | 3517.61 | 6.61 | 116.26 | 1.35 |
| 0.8.1 | 2024.2 | 2.08 | 0.74 | 0.77 | 0.35 | 936.16 | 0.88 | 2780.73 | 5.23 | 110.77 | 1.35 |
| 1.1.0 | 2019.1 | 2.57 | 0.92 | 1.16 | 0.53 | 1237.20 | 1.16 | 2434.88 | 4.58 | 37.70 | 1.35 |
| 1.1.0 | 2019.2 | 2.53 | 0.90 | 1.39 | 0.63 | 1273.41 | 1.20 | 2317.88 | 4.36 | 28.73 | 1.35 |
| 1.1.0 | 2020.1 | 2.35 | 0.84 | 1.42 | 0.65 | 1023.07 | 0.96 | 2275.59 | 4.28 | 28.97 | 1.35 |
| 1.1.0 | 2020.2 | 2.64 | 0.94 | 1.45 | 0.66 | 1314.61 | 1.24 | 2359.94 | 4.44 | 30.62 | 1.35 |
| 1.1.0 | 2021.1 | 2.34 | 0.84 | 1.35 | 0.61 | 983.35 | 0.92 | 2025.47 | 3.81 | 31.37 | 1.35 |
| 1.1.0 | 2021.2 | 2.56 | 0.91 | 1.50 | 0.68 | 1149.12 | 1.08 | 2167.54 | 4.07 | 30.66 | 1.35 |
| 1.1.0 | 2022.1 | 2.65 | 0.95 | 1.39 | 0.63 | 1104.21 | 1.04 | 2131.50 | 4.01 | 31.74 | 1.35 |
| 1.1.0 | 2022.2 | 2.47 | 0.88 | 1.49 | 0.68 | 1200.66 | 1.13 | 2120.53 | 3.99 | 31.79 | 1.35 |
| 1.1.0 | 2023.1 | 2.58 | 0.92 | 1.64 | 0.74 | 1247.67 | 1.17 | 2301.45 | 4.33 | 30.79 | 1.35 |
| 1.1.0 | 2023.2 | 2.49 | 0.89 | 1.14 | 0.52 | 499.64 | 0.47 | 1795.66 | 3.38 | 25.01 | 1.35 |
| 1.1.0 | 2024.1 | 2.46 | 0.88 | 1.45 | 0.66 | 1373.96 | 1.29 | 2405.98 | 4.52 | 29.38 | 1.35 |
| 1.1.0 | 2024.2 | 2.09 | 0.75 | 1.99 | 0.91 | 1059.89 | 1.00 | 2089.47 | 3.93 | 26.71 | 1.35 |
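Because the predictions come back as a pandas DataFrame with a MultiIndex over the configuration columns, standard pandas selection applies. The sketch below builds a small illustrative DataFrame from values copied out of the output above (the exact schema of MultiModelWrapper's output may differ by version) and pulls out the rows for one hls4ml version with `DataFrame.xs`:

```python
import pandas as pd

# Illustrative slice of the prediction output above; values are copied from
# the truncated keras_example.html table, and the index layout is assumed
# from the groupby shown in the usage example.
index = pd.MultiIndex.from_tuples(
    [
        ("Jet Classifier", "pynq-z2", "latency", "ap_fixed<2, 1>", 1, "0.8.1", "2023.2"),
        ("Jet Classifier", "pynq-z2", "latency", "ap_fixed<2, 1>", 1, "1.1.0", "2023.2"),
    ],
    names=["Model", "Board", "Strategy", "Precision", "Reuse Factor",
           "HLS4ML Version", "Vivado Version"],
)
df = pd.DataFrame(
    {"DSP (%)": [0.13, 0.52], "LUT (%)": [5.06, 3.38], "CYCLES": [108.34, 25.01]},
    index=index,
)

# Cross-section on one MultiIndex level: keep only hls4ml 1.1.0 predictions.
newer = df.xs("1.1.0", level="HLS4ML Version")
print(newer["CYCLES"].iloc[0])  # latency estimate in clock cycles
```

The same pattern (`xs`, or boolean masks on `df.index.get_level_values(...)`) works on the full DataFrame returned by `estimator.predict`.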

Datasets

Training accurate predictors requires large datasets of synthesized neural networks. We used hls4ml to synthesize neural networks generated with parameters randomly sampled from predefined ranges (defaults of data classes in the code). Our models' training data is publicly available at https://borealisdata.ca/dataverse/rule4ml.

Newer predictors were trained on wa-hls4ml, a larger dataset covering more architectures and wider parameter ranges. The dataset and the corresponding HLS project files can be found at https://huggingface.co/datasets/fastmachinelearning/wa-hls4ml and https://huggingface.co/datasets/fastmachinelearning/wa-hls4ml-projects.

Limitations

In their current iteration, the predictors accept Keras or PyTorch models and generate FPGA resource (BRAM, DSP, FF, LUT) and latency (clock cycles) estimates for various synthesis configurations. However, they were trained on a limited set of layers: Dense/Linear, ReLU, Tanh, Sigmoid, Softmax, BatchNorm, Add, Concatenate, and Dropout. They are also constrained by synthesis parameters, notably clock_period (10 ns) and io_type (io_parallel). Inputs outside these configurations may yield inaccurate predictions.
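Since out-of-scope layers can silently degrade prediction accuracy, it can be worth screening a model before trusting the estimates. The helper below is a hypothetical pre-check, not part of rule4ml's API; the supported set is transcribed from the limitations above, using Keras class names (e.g. BatchNormalization for BatchNorm) plus PyTorch's Linear:

```python
# Hypothetical pre-check (not part of rule4ml): flag layer types outside the
# set the predictors were trained on, per the limitations listed above.
SUPPORTED_LAYERS = {
    "Dense", "Linear", "ReLU", "Tanh", "Sigmoid", "Softmax",
    "BatchNormalization", "Add", "Concatenate", "Dropout",
}

def unsupported_layers(layer_class_names):
    """Return the layer class names that fall outside the trained layer set."""
    return sorted(set(layer_class_names) - SUPPORTED_LAYERS)

# For a built Keras model, the class names could come from:
#   [type(layer).__name__ for layer in model.layers]
print(unsupported_layers(["Dense", "ReLU", "Conv2D", "Softmax"]))  # ['Conv2D']
```

An empty result does not guarantee accurate predictions (the synthesis-parameter constraints still apply), but a non-empty one is a clear sign the estimates should be treated with caution.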

License

This project is licensed under the GPL-3.0 License. See the LICENSE file for details.

Project details


Download files


Source Distribution

rule4ml-0.2.0.tar.gz (8.7 MB)


Built Distribution


rule4ml-0.2.0-py3-none-any.whl (8.7 MB)


File details

Details for the file rule4ml-0.2.0.tar.gz.

File metadata

  • Download URL: rule4ml-0.2.0.tar.gz
  • Size: 8.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for rule4ml-0.2.0.tar.gz:

  • SHA256: dba36cd1b05f7f8bce60a3202f4dfb83a7ad43a71472ab16e5230782e89ec813
  • MD5: f263f14b3d3b5fc1ef91be88e5972996
  • BLAKE2b-256: aff76e8b340a247eb5123fcc2ec7c2b4172014d50a7ef30b478edbeb15ecc7bb


Provenance

The following attestation bundles were made for rule4ml-0.2.0.tar.gz:

Publisher: publish.yml on IMPETUS-UdeS/rule4ml

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file rule4ml-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: rule4ml-0.2.0-py3-none-any.whl
  • Size: 8.7 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for rule4ml-0.2.0-py3-none-any.whl:

  • SHA256: 38d28a3bf5064e79f5dc1fbe0e517a0584dd0153732a2edf02d59961cb47c1ac
  • MD5: 778873905cbea17327c061b69b611c30
  • BLAKE2b-256: b3ad89e75c1de3143ed29564d96ad572718d0086b9f0edb7831b066dc6424ce3


Provenance

The following attestation bundles were made for rule4ml-0.2.0-py3-none-any.whl:

Publisher: publish.yml on IMPETUS-UdeS/rule4ml

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
