Quantization algorithms to compress aidge networks.

Aidge Quantization Module

This folder contains the library that implements the quantization algorithms. For now, only Post Training Quantization (PTQ) is available. The implementation supports multi-branch architectures.

Installation

Dependencies

  • GCC
  • Make/Ninja
  • CMake
  • Python (optional; not needed if you do not intend to use this library in Python through pybind)

Aidge dependencies

  • aidge_core

The requirements for installing the library are the following:

  • GCC, Make and CMake for the compilation pipeline
  • The AIDGE modules aidge_core, aidge_onnx and aidge_backend_cpu
  • Python (> 3.7) if you intend to use the pybind wrapper

Pip installation

pip install . -v

Tip: use environment variables to change compilation options:

  • AIDGE_INSTALL: to set the installation folder. Defaults to /usr/local/lib. Warning: this path must be identical to the aidge_core install path.
  • AIDGE_PYTHON_BUILD_TYPE: to set the compilation mode to Debug or Release
  • AIDGE_BUILD_GEN: to set the build backend with

User guide

To quantize a model, you need three things: an AIDGE model (which can be loaded from an ONNX file), a calibration dataset made of AIDGE tensors (which can be built from NumPy arrays), and the number of bits to quantize to.

Performing the PTQ on your model is then a one-liner:

aidge_quantization.quantize_network(aidge_model, nb_of_bits, calibration_set)

Technical insights

The PTQ algorithm consists of 3 main steps:

- Normalization of the parameters, so that each node's set of weights fits in the [-1, 1] range.
- Normalization of the activations, so that each node's output values fit in the [-1, 1] range.
- Quantization of the scaling nodes previously inserted.
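
To make the first and last steps concrete, here is a minimal pure-Python sketch for a single node. It is not the aidge_quantization implementation: the helper names `normalize_weights` and `quantize_unit_range` are hypothetical, and a symmetric signed integer grid is assumed.

```python
def normalize_weights(weights):
    """Scale the weights so that the largest magnitude becomes 1.

    Returns the normalized weights and the scaling factor that was
    divided out (to be propagated to the rest of the network).
    """
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights), 1.0
    return [w / max_abs for w in weights], max_abs


def quantize_unit_range(values, nb_bits):
    """Map values in [-1, 1] to signed integers coded on nb_bits bits.

    Assumption: symmetric grid, e.g. [-127, 127] for 8 bits.
    """
    q_max = 2 ** (nb_bits - 1) - 1
    return [max(-q_max, min(q_max, round(v * q_max))) for v in values]


weights = [0.5, -2.0, 1.0, 0.25]
normalized, scale = normalize_weights(weights)
quantized = quantize_unit_range(normalized, nb_bits=8)
print(normalized, scale, quantized)
```

The factor returned by `normalize_weights` is what has to be carried forward through the network so that the overall function is preserved.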

To achieve those steps, the scaling factors must be propagated through the network, and the branches must be balanced wherever they merge. Particular care is needed when rescaling the biases at each step.
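
The bias rescaling can be illustrated on a toy chain of dense layers followed by ReLU. The sketch below is only an illustration under simplifying assumptions (no branches, one scaling factor per layer, plain Python lists), and the function names are hypothetical rather than part of the library. Each layer's weights are divided by their own maximum magnitude, while each bias is divided by the factor accumulated so far, because the layer's input has already been scaled down by the previous layers.

```python
def dense(weights, bias, inputs):
    """Compute W @ x + b for one layer, using plain lists."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def run(layers, inputs):
    """Run a chain of (weights, bias) layers, each followed by ReLU."""
    values = inputs
    for weights, bias in layers:
        values = relu(dense(weights, bias, values))
    return values


def normalize_layers(layers):
    """Normalize every layer's weights to [-1, 1] and rescale the biases.

    Returns the normalized layers and the accumulated scaling factor,
    i.e. the factor by which the final output has been divided.
    """
    accumulated = 1.0
    normalized = []
    for weights, bias in layers:
        scale = max(abs(w) for row in weights for w in row)
        accumulated *= scale
        new_weights = [[w / scale for w in row] for row in weights]
        # The bias must follow the total scaling seen so far.
        new_bias = [b / accumulated for b in bias]
        normalized.append((new_weights, new_bias))
    return normalized, accumulated


layers = [
    ([[4.0, -2.0], [1.0, 0.5]], [1.0, -0.5]),
    ([[2.0, -1.0]], [0.5]),
]
inputs = [1.0, 2.0]

reference = run(layers, inputs)
normalized, accumulated = normalize_layers(layers)
restored = [v * accumulated for v in run(normalized, inputs)]
print(reference, restored, accumulated)
```

This works because ReLU commutes with positive scaling. Multiplying the normalized network's output by the accumulated factor recovers the original output.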

Doing quantization step by step

It is possible to perform the PTQ step by step, thanks to the exposed functions of the API. In that case, here is the standard pipeline:

- Prepare the network for the PTQ (remove the flatten nodes, fuse the BatchNorms ...)
- Insert the scaling nodes that will allow the model calibration
- Perform the Cross Layer Equalization if possible
- Perform the parameter normalization
- Compute the node output ranges over an input calibration dataset
- Adjust the output ranges using a specified error metric (MSE, KL, ...)
- Perform the activation normalization
- Quantize the normalized network
- Convert the scaling factors to bit-shifting operations if needed
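
Two of the steps above, adjusting an output range with an MSE metric and converting a scaling factor to a bit shift, can be sketched in plain Python. This is a hedged illustration and not the library's code: the function names, the grid search over candidate ranges, and the choice of rounding in the log domain are all assumptions made for the example.

```python
import math


def quantization_mse(values, clip, nb_bits):
    """Mean squared error after clipping to [-clip, clip] and quantizing."""
    q_max = 2 ** (nb_bits - 1) - 1
    step = clip / q_max
    error = 0.0
    for v in values:
        clipped = max(-clip, min(clip, v))
        restored = round(clipped / step) * step
        error += (v - restored) ** 2
    return error / len(values)


def adjust_range_mse(values, nb_bits, nb_candidates=100):
    """Pick the clipping range that minimizes the quantization MSE.

    Assumption: candidates are evenly spaced fractions of the maximum
    absolute value observed over the calibration data.
    """
    max_abs = max(abs(v) for v in values)
    candidates = [max_abs * (i + 1) / nb_candidates
                  for i in range(nb_candidates)]
    return min(candidates,
               key=lambda clip: quantization_mse(values, clip, nb_bits))


def to_bit_shift(scaling):
    """Approximate a positive scaling factor by 2 ** (-shift).

    The exponent is rounded in the log domain. A positive result means
    a right shift, a negative one a left shift.
    """
    return -round(math.log2(scaling))


# Many small activations plus one outlier: the full range wastes precision.
activations = [i / 500 - 1 for i in range(1001)] + [4.0]
best_clip = adjust_range_mse(activations, nb_bits=4)
print(best_clip, to_bit_shift(0.0625), to_bit_shift(0.13))
```

With an outlier in the calibration data, the selected range ends up below the raw maximum, which trades a large error on one value for a finer grid on all the others.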

Further work

  • add Quantization Aware Training (QAT)
