
Translation of LightGBM model to VHDL


lgbm2vhdl

This project provides a tool that automatically converts a model trained with the LightGBM library into a digital circuit description in VHDL. The input to the conversion is an arbitrary model produced by LightGBM. The output is VHDL code that can be synthesized and loaded onto configurable hardware, for example an FPGA.

The tool also prepares simulations for the ModelSim environment to test the generated VHDL code. For each simulation it creates a testbench file together with supporting ModelSim scripts, and it automatically verifies that, for a given input test vector, the VHDL architecture produces the same predictions as the original LightGBM model executed in Python.

Installation

Install the package from PyPI with pip:

pip install lgbm2vhdl

Usage Example

import os
from lgbm2vhdl.LGBM2VHDL import LGBM2VHDL

# Example model and quantization definition
model_filename = os.path.join("data", "example_model_class3.txt")
quantization_filename = os.path.join("data", "example_model_class3-quantization.csv")

# Model loading
lvg = LGBM2VHDL(model_filename, quantization_filename, working_dir="./tmp")

# Generation of VHDL files
lvg.generate_vhdl(architecture="mem-c")

# Running simulation
lvg.run_simulation(architecture="mem-c")

Definition of quantization

Some input features may be real-valued numbers, usually stored in floating-point format (exponent and mantissa). Floating-point arithmetic consumes a large amount of FPGA resources, so these numbers need to be converted to a fixed-point format in which a fixed number of bits is reserved for the integer part and for the decimal (fractional) part. The parameters of this floating-point to fixed-point conversion must be specified for each such feature. The conversion applies not only to the input features but also to the model output values, which represent the degree of membership of the input object in the selected class (the log-odds ratio). The conversion therefore requires one more input: a CSV file that defines the quantization of the input features and of the model output values.
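The following minimal sketch illustrates what a floating-point to fixed-point conversion can look like. It is not taken from lgbm2vhdl: the function names, the rounding mode (Python's round, which rounds halves to even), the saturation on overflow, and the choice to count the sign bit separately from the integer bits are all assumptions made for illustration.

```python
def to_fixed_point(value, signed, int_bits, frac_bits):
    """Quantize a float to an integer code with frac_bits fractional bits.

    Hypothetical helper, not part of lgbm2vhdl.
    """
    scale = 1 << frac_bits
    total_bits = int_bits + frac_bits
    if signed:
        # Assumption: the sign bit is counted in addition to int_bits.
        lowest, highest = -(1 << total_bits), (1 << total_bits) - 1
    else:
        lowest, highest = 0, (1 << total_bits) - 1
    code = round(value * scale)
    # Assumption: out-of-range values saturate instead of wrapping around.
    return max(lowest, min(highest, code))


def from_fixed_point(code, frac_bits):
    """Return the real value represented by a fixed-point code."""
    return code / (1 << frac_bits)


# Unsigned format with 4 integer bits and 8 fractional bits.
code = to_fixed_point(3.14159, signed=False, int_bits=4, frac_bits=8)
approx = from_fixed_point(code, frac_bits=8)
print(code, approx)
```

With 8 fractional bits the resolution is 1/256, so the quantization error of an in-range value is at most 1/512. Choosing more fractional bits reduces this error at the cost of wider comparators and memories in the circuit.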

Format of quantization CSV file

The CSV file must contain one line for each input feature, describing the number format used for that feature at the circuit implementation level. It must also contain one additional line that defines the number format of the model output values (log-odds ratio). In total, the file therefore contains (number of features + 1) lines.

The format of each line is identical and consists of 5 items:

  1. Name of the feature
  2. Integer/Decimal flag (True if the number is an integer)
  3. Signed/Unsigned flag (True if the number is signed)
  4. Number of bits for the integer part of the number
  5. Number of bits for the decimal part of the number (zero for integer numbers)
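As an illustration of the five-item layout, the sketch below builds and parses a small quantization definition in memory. The feature names, the name of the output line, its position as the last line, the comma delimiter, the absence of a header row, and the literal spelling True/False are assumptions; check them against the example files shipped with the package.

```python
import csv
import io

# Hypothetical rows: name, is_integer, is_signed, integer bits, decimal bits
rows = [
    ("packet_count", True, False, 16, 0),    # unsigned 16-bit integer feature
    ("mean_interval", False, False, 8, 8),   # unsigned fixed point, 8.8
    ("output", False, True, 4, 12),          # model output (log-odds ratio)
]

# Write the rows as CSV text (assumed: comma-separated, no header).
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()


def parse_line(fields):
    """Convert one CSV line (five strings) into a typed dictionary."""
    name, is_integer, is_signed, int_bits, dec_bits = fields
    return {
        "name": name,
        "is_integer": is_integer == "True",
        "is_signed": is_signed == "True",
        "int_bits": int(int_bits),
        "dec_bits": int(dec_bits),
    }


parsed = [parse_line(f) for f in csv.reader(io.StringIO(csv_text, newline=""))]
n_features = len(parsed) - 1  # one line is reserved for the model output
print(csv_text)
```

A quick consistency check before running the conversion is to confirm that the number of lines equals the number of model features plus one and that every integer feature has zero decimal bits.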

Architecture specification

The lgbm2vhdl module supports four types of resulting VHDL circuit architecture:

  1. mem-c - The standard memory-centric architecture for implementing decision trees and Gradient Boosting models [1].
  2. mem-c_ord - Similar to mem-c, but the values of the input features and the thresholds (within instructions) are replaced by the rank order within the sorted sequence of all thresholds used in the model.
  3. mem-c_mem - Similar to mem-c, but process element multiplexers are replaced by a memory block into which the input features are sequentially loaded.
  4. mem-c_ord_mem - Combines both the mem-c_ord and mem-c_mem optimizations described above.
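The rank-order idea behind mem-c_ord can be sketched in a few lines of Python. This is only an illustration of the principle, not the code used by lgbm2vhdl: the threshold values are made up, the helper names are hypothetical, and the sketch assumes that each tree node tests "feature <= threshold".

```python
from bisect import bisect_left

# Hypothetical thresholds collected from all nodes of a model.
thresholds = [0.5, 2.0, 2.0, 7.25, 10.0]
sorted_thresholds = sorted(set(thresholds))  # [0.5, 2.0, 7.25, 10.0]


def threshold_rank(t):
    """Rank of a threshold: its index in the sorted distinct thresholds."""
    return bisect_left(sorted_thresholds, t)


def feature_rank(x):
    """Rank of a feature value: how many thresholds are strictly below it."""
    return bisect_left(sorted_thresholds, x)


def same_decision(x, t):
    """Check that comparing ranks gives the same result as comparing values."""
    return (x <= t) == (feature_rank(x) <= threshold_rank(t))


samples = (-1.0, 0.5, 1.9, 2.0, 2.1, 7.25, 11.0)
checks = [same_decision(x, t) for x in samples for t in sorted_thresholds]
print(all(checks))
```

Because the ranks are small integers (0 to 4 for the four distinct thresholds above, which fits in 3 bits), the comparisons can use much narrower operands than the original fixed-point values while leading to the same branch decisions.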

Acknowledgements

This project was supported by the Ministry of the Interior of the Czech Republic, grant No. VJ02010024: Flow-Based Encrypted Traffic Analysis.

References

[1] Alcolea, A.; Resano, J. FPGA Accelerator for Gradient Boosting Decision Trees. Electronics 2021, 10, 314. https://doi.org/10.3390/electronics10030314.
