
Quantization algorithms to compress aidge networks.

Project description

Aidge Quantization Module

This folder contains the library that implements the quantization algorithms. For the moment, only Post Training Quantization (PTQ) is available. The implementation supports multi-branch architectures.

[TOC]

Quick Start

Dependencies

Aidge dependencies

  • aidge_core
  • aidge_backend_cpu
  • aidge_backend_cuda, if AIDGE_ENABLE_CUDA is set to ON.
  • aidge_onnx
# On Windows, from aidge/
# ----------[ C++ development]----------
setup.ps1 -Modules core, backend_cpu, backend_cuda -Clean -Tests

# ----------[ Python development]----------
setup.ps1 -Modules core, backend_cpu, backend_cuda, onnx -Clean -Tests -Python

# On Unix, from aidge/
# ----------[ C++ development]----------
./setup.sh -m core -m backend_cpu -m backend_cuda --clean --tests

# ----------[ Python development]----------
./setup.sh -m core -m backend_cpu -m backend_cuda --clean --tests --python

Using setup.ps1

[!NOTE] Windows only, unless you have installed PowerShell on your system.

# ----------[ C++ development]----------
# From aidge/
setup.ps1 -Modules quantization -Clean -Tests

# ----------[ Python development]----------
# From aidge/
setup.ps1 -Modules quantization -Clean -Tests -Python

[!TIP] Run Get-Help setup.ps1 -Full to display the documentation.

Using setup.sh

[!NOTE] Unix only

# ----------[ C++ development]----------
# From aidge/
./setup.sh -m quantization --clean --tests

# ----------[ Python development]----------
# From aidge/
./setup.sh -m quantization --clean --tests --python

[!TIP] Run ./setup.sh -h to display the documentation.

Using pip

[!NOTE] If using a virtual environment, make sure to use the same one for every installation!

# ----------[ Python development]----------
# Run from aidge/aidge/aidge_quantization/
pip install . -v

# If you want to install the test dependencies as well, do this instead
pip install .[test] -v

# Launch tests using pytest
pytest

[!TIP] -v enables verbose mode.

Development mode install

[!WARNING] Experimental: untested feature, see https://scikit-build-core.readthedocs.io/en/latest/configuration/index.html#editable-installs.

pip install --no-build-isolation --config-settings=editable.rebuild=true -Cbuild-dir=build -v -e .

Using CMake

[!NOTE] Only for C++ development. Don't use this method if you wish to create a pip package; use pip instead.

A CMakePresets.json is available.

# Configure
cmake --preset clang-debug
# Build
cmake --build --preset clang-debug
# Install
cmake --install build/
# Test
ctest --test-dir build/

Feel free to create your own presets in CMakeUserPresets.json by inheriting from the ones available in CMakePresets.json.

User guide

To perform quantization, you need an Aidge model (which can be loaded from an ONNX file), a calibration dataset consisting of Aidge tensors (which can be created from NumPy arrays), and the number of quantization bits.

Performing the PTQ on your model is then a one-liner:

aidge_quantization.quantize_network(aidge_model, nb_of_bits, calibration_set)
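For reference, here is a fuller sketch of that call, assuming the model is loaded with aidge_onnx and the calibration tensors are built from NumPy arrays. Apart from aidge_quantization.quantize_network, the helper calls (aidge_onnx.load_onnx, aidge_core.Tensor) and the file path are assumptions used for illustration:

import numpy as np
import aidge_core
import aidge_onnx
import aidge_quantization

# Load the network to quantize (path is illustrative).
aidge_model = aidge_onnx.load_onnx("model.onnx")

# Build a small calibration set of Aidge tensors from NumPy arrays.
calibration_set = [
    aidge_core.Tensor(np.random.rand(1, 3, 224, 224).astype(np.float32))
    for _ in range(16)
]

# Number of bits of the quantized representation.
nb_of_bits = 8

# One-liner PTQ call from above.
aidge_quantization.quantize_network(aidge_model, nb_of_bits, calibration_set)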

Technical insights

The PTQ algorithm consists of 3 main steps:

- Normalization of the parameters, so that each node's set of weights fits in the [-1, 1] range.
- Normalization of the activations, so that each node's output values fit in the [-1, 1] range.
- Quantization of the scaling nodes previously inserted.

To achieve those steps, the scaling factors must be propagated through the network, and the different branches must be balanced where they merge. Particular care is needed when rescaling the biases at each step.
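The following NumPy sketch illustrates the idea of weight normalization and branch balancing; it is only an illustration of the principle, not the actual Aidge implementation:

import numpy as np

def normalize_weights(weights):
    """Rescale weights into [-1, 1] and return the scaling factor used."""
    factor = np.abs(weights).max()
    return weights / factor, factor

# Two parallel branches that merge: to keep the merged output consistent,
# both branches are rescaled with a common (balanced) factor.
w_branch_a = np.array([0.5, -2.0, 1.5])
w_branch_b = np.array([0.1, 0.3, -0.2])

_, factor_a = normalize_weights(w_branch_a)
_, factor_b = normalize_weights(w_branch_b)
common_factor = max(factor_a, factor_b)

w_branch_a_norm = w_branch_a / common_factor  # fits in [-1, 1]
w_branch_b_norm = w_branch_b / common_factor  # fits in [-1, 1], branches balanced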

Doing quantization step by step

It is possible to perform the PTQ step by step, using the functions exposed by the API. In that case, the standard pipeline is:

- Prepare the network for the PTQ (remove the flatten nodes, fuse the BatchNorms ...)
- Insert the scaling nodes that will allow the model calibration
- Perform the Cross Layer Equalization if possible
- Perform the parameter normalization
- Compute the node output ranges over an input calibration dataset
- Adjust the output ranges using a specified error metric (MSE, KL, ...)
- Perform the activation normalization
- Quantize the normalized network
- Convert the scaling factors to bit-shifting operations if needed (illustrated in the sketch below)
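As an illustration of that last step, the small sketch below approximates a floating-point scaling factor by a power-of-two bit-shift, so the rescaling can be done with integer arithmetic. It is a simplified illustration of the principle, not the Aidge implementation:

import math

def scaling_to_shift(scaling):
    """Return the shift amount whose power of two best approximates `scaling`."""
    return round(math.log2(scaling))

def apply_shift(value, shift):
    """Rescale an integer with a shift instead of a floating-point multiplication."""
    return value << shift if shift >= 0 else value >> -shift

scaling = 0.0312                    # example scaling factor, close to 1/32
shift = scaling_to_shift(scaling)   # -5, since 2**-5 = 0.03125
print(apply_shift(1024, shift))     # 32, i.e. 1024 * 2**-5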

Further work

  • Add Quantization Aware Training (QAT)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See the tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

aidge_quantization-0.9.0-cp312-cp312-win_amd64.whl (10.6 MB)

Uploaded: CPython 3.12, Windows x86-64

aidge_quantization-0.9.0-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (4.5 MB)

Uploaded: CPython 3.12, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64

aidge_quantization-0.9.0-cp311-cp311-win_amd64.whl (10.6 MB)

Uploaded: CPython 3.11, Windows x86-64

aidge_quantization-0.9.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (4.5 MB)

Uploaded: CPython 3.11, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64

aidge_quantization-0.9.0-cp310-cp310-win_amd64.whl (10.6 MB)

Uploaded: CPython 3.10, Windows x86-64

aidge_quantization-0.9.0-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (4.4 MB)

Uploaded: CPython 3.10, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64

File details

Details for the file aidge_quantization-0.9.0-cp312-cp312-win_amd64.whl.


File hashes

Hashes for aidge_quantization-0.9.0-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 18242a0f8b54e48551756a0ef05d361ac5a26414ae4fad10c741bd2cb730d11d
MD5 e1a6e3b04202cab5c9dabe384de711e8
BLAKE2b-256 b66b5e23d599600ab2701b4cac756e89937db68f66ad3d6bad5476f027082c16


File details

Details for the file aidge_quantization-0.9.0-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.


File hashes

Hashes for aidge_quantization-0.9.0-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 c9453042274d0497b1127882bdc0dcf314be46a72b4b2e8111d7b9ba10a19547
MD5 4b3f668b9b3fef6019106093acb6a0b8
BLAKE2b-256 7d8250285b0f4f248f3593f1563cb8b0c39d717b512d794eaccfeb5d4379ab6e


File details

Details for the file aidge_quantization-0.9.0-cp311-cp311-win_amd64.whl.


File hashes

Hashes for aidge_quantization-0.9.0-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 b7ca558449c24fb638fdea7bef610543dcea93b0caac447c626712bdc6fc6389
MD5 bd88801535916e0ede15f66bf65bc2c4
BLAKE2b-256 251733f9a11ef24befa3d65fe042dbc3bc0acafc666ff3890e3916ffd00bcc32


File details

Details for the file aidge_quantization-0.9.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.


File hashes

Hashes for aidge_quantization-0.9.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 5cc793935bbf083777837d170540fb35452a91f3746d20be207ee7a91e1db25a
MD5 e678419de259f3e0d2f34561307baa1e
BLAKE2b-256 3dd10660dccf679d6e28a20a8fcfb69ab6d9f6a66b085a908ab98a2416a5cc4f


File details

Details for the file aidge_quantization-0.9.0-cp310-cp310-win_amd64.whl.


File hashes

Hashes for aidge_quantization-0.9.0-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 f050783db9757b86ce024c96484da44922b79814f81aef53bd33d4487db59570
MD5 44e2b5e3b1bca89b09a1000d9144c54b
BLAKE2b-256 6369d8b9d5df99fe008ae769b0212cd1f9e647de361320a88df80d0e1ec8905f


File details

Details for the file aidge_quantization-0.9.0-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.


File hashes

Hashes for aidge_quantization-0.9.0-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 4dca140f5173d6f5b5d47d053d30d4a2707f8000fdcf84a64b32d6b4f3ccb14a
MD5 93b9c19fa8662217e4001960f0f6413d
BLAKE2b-256 46d7fa3a81b1273a76eafc0b5aaf39986aef3cbcc53ba375bfd08292a15d3ecb

