Quantization-aware training in PyTorch
Brevitas
Brevitas is a PyTorch library for neural network quantization, with support for both post-training quantization (PTQ) and quantization-aware training (QAT).
Please note that Brevitas is a research project and not an official Xilinx product.
If you like this project please consider ⭐ this repo, as it is the simplest and best way to support it.
Requirements
- Python >= 3.8.
- PyTorch >= 1.9.1, <= 2.1 (more recent versions are untested).
- Windows, Linux or macOS.
- GPU for training-time acceleration (optional but recommended).
Installation
You can install the latest release from PyPI:
pip install brevitas
Getting Started
Brevitas currently offers quantized implementations of the most common PyTorch layers used in DNNs under brevitas.nn, such as QuantConv1d, QuantConv2d, QuantConvTranspose1d, QuantConvTranspose2d, QuantMultiheadAttention, QuantRNN, QuantLSTM, etc., for adoption within PTQ and/or QAT.
For each of these layers, the quantization of different tensors (inputs, weights, bias, outputs, etc.) can be tuned individually according to a wide range of quantization settings.
As a reference for PTQ, Brevitas provides an example user flow for ImageNet classification models under brevitas_examples.imagenet_classification.ptq that quantizes an input torchvision model using PTQ under different quantization configurations (e.g. bit-width, granularity of scale, etc.).
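To make the two settings named above concrete (bit-width and granularity of scale), here is a minimal sketch in plain Python. It does not use the Brevitas API: the function names (`scale_for`, `fake_quantize`, `quantize_per_tensor`, `quantize_per_channel`, `max_error`) and the toy weights are hypothetical, and the symmetric max-based scale is an assumption chosen for simplicity, not necessarily what any Brevitas quantizer does.

```python
# Illustrative sketch only: plain-Python "fake quantization" of a small weight
# matrix. All names here are hypothetical and are not part of Brevitas.

def scale_for(values, bit_width):
    """Symmetric scale that maps the largest magnitude onto the top integer level."""
    q_max = 2 ** (bit_width - 1) - 1
    max_abs = max(abs(v) for v in values)
    return max_abs / q_max if max_abs > 0 else 1.0

def fake_quantize(values, bit_width, scale):
    """Round to a signed integer grid, clamp to the bit-width, map back to floats."""
    q_min = -(2 ** (bit_width - 1))
    q_max = 2 ** (bit_width - 1) - 1
    out = []
    for v in values:
        q = max(q_min, min(q_max, round(v / scale)))
        out.append(q * scale)
    return out

def quantize_per_tensor(weights, bit_width):
    """One scale shared by every output channel (coarse granularity)."""
    flat = [v for row in weights for v in row]
    scale = scale_for(flat, bit_width)
    return [fake_quantize(row, bit_width, scale) for row in weights]

def quantize_per_channel(weights, bit_width):
    """One scale per output channel (finer granularity)."""
    return [fake_quantize(row, bit_width, scale_for(row, bit_width)) for row in weights]

def max_error(original, quantized):
    """Largest absolute difference between original and quantized values."""
    return max(abs(a - b)
               for row_o, row_q in zip(original, quantized)
               for a, b in zip(row_o, row_q))

# Toy weights: each row stands for one output channel.
weights = [
    [0.9, -0.5, 0.3],        # channel with large values
    [0.02, -0.012, 0.015],   # channel with small values
]

per_tensor_4bit = quantize_per_tensor(weights, bit_width=4)
per_channel_4bit = quantize_per_channel(weights, bit_width=4)
per_tensor_8bit = quantize_per_tensor(weights, bit_width=8)

print("4-bit, per-tensor scale :", per_tensor_4bit)
print("4-bit, per-channel scale:", per_channel_4bit)
print("8-bit, per-tensor scale :", per_tensor_8bit)
```

In this toy case, a single 4-bit scale sized for the large channel rounds every weight in the small channel to zero, while a per-channel scale preserves them. Raising the bit-width to 8 reduces the error even with a shared scale. These are the kinds of trade-offs the different quantization configurations explore.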
For more info, check out our getting started guide.
Cite as
If you adopt Brevitas in your work, please cite it as:
@software{brevitas,
author = {Alessandro Pappalardo},
title = {Xilinx/brevitas},
year = {2023},
publisher = {Zenodo},
doi = {10.5281/zenodo.3333552},
url = {https://doi.org/10.5281/zenodo.3333552}
}
History
- 2024/10/10 - Release version 0.11.0, see the release notes.
- 2024/07/23 - Minor release version 0.10.3, see the release notes.
- 2024/02/19 - Minor release version 0.10.2, see the release notes.
- 2024/02/15 - Minor release version 0.10.1, see the release notes.
- 2023/12/08 - Release version 0.10.0, see the release notes.
- 2023/04/28 - Minor release version 0.9.1, see the release notes.
- 2023/04/21 - Release version 0.9.0, see the release notes.
- 2023/01/10 - Release version 0.8.0, see the release notes.
- 2021/12/13 - Release version 0.7.1, fix a bunch of issues. Added TVMCon 2021 tutorial notebook.
- 2021/11/03 - Re-release version 0.7.0 (build 1) on PyPI to fix a packaging issue.
- 2021/10/29 - Release version 0.7.0, see the release notes.
- 2021/06/04 - Release version 0.6.0, see the release notes.
- 2021/05/24 - Release version 0.5.1, fix a bunch of minor issues. See release notes.
- 2021/05/06 - Release version 0.5.0, see the release notes.
- 2021/03/15 - Release version 0.4.0, add support for __torch_function__ to QuantTensor.
- 2021/03/04 - Release version 0.3.1, fix bug w/ act initialization from statistics w/ IGNORE_MISSING_KEYS=1.
- 2021/03/01 - Release version 0.3.0, implements enum and shape solvers within extended dependency injectors. This allows declarative quantizers to be self-contained.
- 2021/02/04 - Release version 0.2.1, includes various bugfixes of QuantTensor w/ zero-point.
- 2021/01/30 - First release version 0.2.0 on PyPI.