
Quantization algorithms to compress Aidge networks.

Project description


Aidge Quantization Module

This repository contains the library that implements the Aidge quantization algorithms. For the moment, only Post-Training Quantization (PTQ) is available. Its implementation supports multi-branch architectures.

Prerequisites:

  • aidge_core
  • aidge_backend_cpu
  • aidge_backend_cuda
  • aidge_onnx

pip install aidge-quantization

🛠 Build from Source

Prerequisites (in addition to the ones above):

1. Python or C++ installation using setup scripts

| Environment | C++ Development | Python Development |
|---|---|---|
| Windows | `.\setup.ps1 -Modules quantization -Clean -Tests` | `.\setup.ps1 -Modules quantization -Clean -Tests -Python` |
| Unix | `./setup.sh -m quantization --clean --tests` | `./setup.sh -m quantization --clean --tests --python` |

[!TIP] Use Get-Help setup.ps1 (Windows) or ./setup.sh -h (Unix) for full documentation.

2. Python Installation using pip

Run these commands from the aidge_quantization/ directory:

# Standard install
pip install . -v

# Install with testing dependencies
pip install .[test] -v && pytest

Editable Install (Experimental)

Use this for real-time development without re-installing.

pip install --no-build-isolation -ve . --config-settings=editable.rebuild=true -Cbuild-dir=build

3. C++ Installation (CMake)

A CMakePresets.json is provided for standard configurations.

# Configure, Build, and Install
cmake --preset clang-debug
cmake --build --preset clang-debug
cmake --install build

# Run C++ Tests
ctest --test-dir build/

[!TIP] Create a CMakeUserPresets.json to define your own local build configurations.

User guide

To perform quantization, you need an Aidge model (which can be loaded from an ONNX file), a calibration dataset consisting of Aidge tensors (which can be created from NumPy arrays), and the desired number of quantization bits.

Performing the PTQ on your model is then a one-liner:

aidge_quantization.quantize_network(aidge_model, nb_of_bits, calibration_set)

Technical insights

The PTQ algorithm consists of three main steps:

- Normalization of the parameters, so that each node's weights fit in the [-1, 1] range.
- Normalization of the activations, so that each node's output values fit in the [-1, 1] range.
- Quantization of the previously inserted scaling nodes.

To achieve these steps, the scaling factors must be propagated through the network, and the branches must be balanced where they merge. Particular care is needed to rescale the biases at each step.
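The three steps can be illustrated on a toy example. This is a minimal sketch, not the aidge_quantization implementation: the weights, the two-node chain `y = w2 * (w1 * x)`, and the calibration values are all made up for illustration.

```python
nb_bits = 8
w1, w2 = 3.0, -0.5                 # hypothetical per-node weights
calibration_set = [0.2, -0.9, 0.6]  # hypothetical calibration inputs

# Step 1: parameter normalization -- divide each node's weights by their
# largest magnitude so they fit in [-1, 1], keeping the scaling factors.
s1, s2 = abs(w1), abs(w2)
w1n, w2n = w1 / s1, w2 / s2

# Step 2: activation normalization -- run the calibration set through the
# normalized network and record each node's output range.
a1_max = max(abs(w1n * x) for x in calibration_set)
a2_max = max(abs(w2n * (w1n * x) / a1_max) for x in calibration_set)

# Step 3: quantization -- map normalized values onto nb_bits signed integers.
step = 2 ** (nb_bits - 1) - 1      # 127 for 8 bits

def quantize(v):
    return round(v * step)

q_weights = [quantize(w1n), quantize(w2n)]
print(q_weights)  # normalized weights are +/-1 here, so [127, -127]
```

In the real algorithm, the scaling factors `s1`, `s2`, `a1_max` and `a2_max` are absorbed into the scaling nodes mentioned above rather than kept as loose variables.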

Doing quantization step by step

The PTQ can also be performed step by step, using the functions exposed by the API. In that case, the standard pipeline is:

- Prepare the network for the PTQ (remove the flatten nodes, fuse the BatchNorms ...)
- Insert the scaling nodes that will allow the model calibration
- Perform the Cross Layer Equalization if possible
- Perform the parameter normalization
- Compute the node output ranges over an input calibration dataset
- Adjust the output ranges using a specified error metric (MSE, KL, ...)
- Perform the activation normalization
- Quantize the normalized network
- Convert the scaling factors to bit-shifting operations if needed
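The last step above can be sketched in a few lines. This is an illustrative stand-in, not the library's implementation: approximating a floating-point scaling factor by a power of two turns the rescaling into a simple bit shift.

```python
import math

def to_bit_shift(scale):
    """Return the shift n such that 2**-n best approximates `scale`,
    along with the approximated value (multiplying by `scale` then
    becomes a right shift by n in integer arithmetic)."""
    n = round(-math.log2(scale))
    return n, 2.0 ** -n

shift, approx = to_bit_shift(0.031)  # a scaling factor near 1/32
print(shift, approx)                  # -> 5 0.03125
```

The approximation error introduced here is one reason the output ranges are adjusted with an error metric (MSE, KL, ...) earlier in the pipeline.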

Further work

  • add Quantization Aware Training (QAT)

Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distributions

| File | Size | Python | Platform |
|---|---|---|---|
| aidge_quantization-0.9.0.post3-cp312-cp312-win_amd64.whl | 11.5 MB | CPython 3.12 | Windows x86-64 |
| aidge_quantization-0.9.0.post3-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl | 4.7 MB | CPython 3.12 | manylinux: glibc 2.24+/2.28+ x86-64 |
| aidge_quantization-0.9.0.post3-cp311-cp311-win_amd64.whl | 11.5 MB | CPython 3.11 | Windows x86-64 |
| aidge_quantization-0.9.0.post3-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl | 4.7 MB | CPython 3.11 | manylinux: glibc 2.24+/2.28+ x86-64 |
| aidge_quantization-0.9.0.post3-cp310-cp310-win_amd64.whl | 11.4 MB | CPython 3.10 | Windows x86-64 |
| aidge_quantization-0.9.0.post3-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl | 4.6 MB | CPython 3.10 | manylinux: glibc 2.24+/2.28+ x86-64 |

File hashes

- aidge_quantization-0.9.0.post3-cp312-cp312-win_amd64.whl
  - SHA256: 41815768217a5984448e2e53ac098d42a7781e106d592569ffdc9fcb530768b7
  - MD5: 6baf659ae4579a68a398aacc57a907dc
  - BLAKE2b-256: 0b11c860ddbf2585ed532696aa68d01ce512a479f4d912b7014ff17cf7b13964
- aidge_quantization-0.9.0.post3-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
  - SHA256: 44b57e72a48559c5516fb6cb7f4e7c70141e1c079aabb5a53d6c27f55abff51b
  - MD5: 9d0c9b61b69a8f0a99a29e52ac6c3a30
  - BLAKE2b-256: 00cf48b5a5705bdd8960f216bcb1c77704bf17f0495ec4fb30cbcc0153b361ce
- aidge_quantization-0.9.0.post3-cp311-cp311-win_amd64.whl
  - SHA256: e29185a32f70196bf057d3cf7cdb58d84d06603df6b41643ca187e8129a726fa
  - MD5: accdaca70922077ed89a6df58edee998
  - BLAKE2b-256: 07baf1022616ed2f4983e2c4dffb27918eeb4bc6f8c167b4bd1dac0b46fb2883
- aidge_quantization-0.9.0.post3-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
  - SHA256: 230b28e09ae9be6edb89c6e3f79d5654c088fd9f6202a5e3a50889ddd26987af
  - MD5: 14a35408a79f9ac8567fb51d6bad4e96
  - BLAKE2b-256: c2fd38a604f6ab3276d6a803c32ce54d778f1ea866e323b5b148cbc0aba968ae
- aidge_quantization-0.9.0.post3-cp310-cp310-win_amd64.whl
  - SHA256: 1ad7afb9097305902684ad757af884d81ddc2e75b19c2fe44a17fd2f7a821d54
  - MD5: 4aee5f10d01d29523acaff5906647a34
  - BLAKE2b-256: ac511ca4d72e07a7b646d1bb05ecbee9edafa88de5feacabe33cbe030df50bba
- aidge_quantization-0.9.0.post3-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
  - SHA256: 59ee8ee54fcebb5a22401ba322bda49958d14be25b95acf676943ac4623df42d
  - MD5: 7603cbe5f9d399e2b0b26aaca70f3e86
  - BLAKE2b-256: 2fc23fe5b4b372cb159505f7cd623fdc2f0c4aade6a0f6b6552bd1c081d88ab0
