
Repository of Intel® Low Precision Optimization Tool

Project description

Intel® Low Precision Optimization Tool

Intel® Low Precision Optimization Tool is an open-source Python library that delivers a unified low-precision inference interface across multiple Intel-optimized DL frameworks on both CPU and GPU. It supports automatic accuracy-driven tuning strategies, along with additional objectives such as performance, model size, and memory footprint. It is also easy to extend with new backends, tuning strategies, metrics, and objectives.
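To make the idea of accuracy-driven tuning concrete, here is a minimal, self-contained sketch. It is not the tool's API: the function `accuracy_driven_tune`, the candidate configuration names, and the toy accuracy numbers are all hypothetical. It only assumes that a tuner tries candidate quantization configurations in order and stops at the first one whose relative accuracy loss against the FP32 baseline is within a tolerance.

```python
from typing import Callable, Iterable, Optional, Tuple


def accuracy_driven_tune(
    baseline_accuracy: float,
    candidates: Iterable[dict],
    evaluate: Callable[[dict], float],
    relative_tolerance: float = 0.01,
) -> Optional[Tuple[dict, float]]:
    """Return the first candidate whose relative accuracy loss is within tolerance.

    Hypothetical helper for illustration; returns None if no candidate qualifies.
    """
    for config in candidates:
        accuracy = evaluate(config)
        # Relative loss against the FP32 baseline (positive means worse than FP32).
        relative_loss = (baseline_accuracy - accuracy) / baseline_accuracy
        if relative_loss <= relative_tolerance:
            return config, accuracy
    return None


# Toy stand-in for a real evaluation run (made-up accuracies).
FAKE_ACCURACY = {"all_int8": 0.70, "first_conv_fp32": 0.755, "all_fp32": 0.76}


def fake_evaluate(config: dict) -> float:
    return FAKE_ACCURACY[config["name"]]


candidates = [{"name": "all_int8"}, {"name": "first_conv_fp32"}, {"name": "all_fp32"}]
result = accuracy_driven_tune(0.76, candidates, fake_evaluate, relative_tolerance=0.01)
print(result)
```

In this toy run the fully INT8 candidate loses too much accuracy, so the loop moves on and accepts the candidate that keeps the first convolution in FP32. A real tuning strategy differs mainly in how it orders and generates the candidates.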

WARNING

GPU support is under development.

Infrastructure Workflow

Supported Intel optimized DL frameworks are:

Supported tuning strategies are:

Documents

  • Introduction explains the API of Intel® Low Precision Optimization Tool.
  • Hello World demonstrates the simple steps for using Intel® Low Precision Optimization Tool for quantization, which can help you get started quickly with the tool.
  • Tutorial provides comprehensive instructions on how to use the different features of Intel® Low Precision Optimization Tool. The examples directory contains many examples that demonstrate the usage of Intel® Low Precision Optimization Tool in TensorFlow, PyTorch, and MXNet for different model categories.
  • Strategies provides a detailed explanation of how each tuning strategy works.
  • PTQ and QAT explains how Intel® Low Precision Optimization Tool works with post-training quantization and quantization-aware training.
  • Pruning on PyTorch explains how Intel® Low Precision Optimization Tool works with magnitude pruning on PyTorch.
  • Tensorboard explains how Intel® Low Precision Optimization Tool helps developers analyze tensor distributions and their impact on final accuracy during the tuning process.
  • Quantized Model Deployment on PyTorch explains how Intel® Low Precision Optimization Tool quantizes an FP32 PyTorch model, then saves and deploys the quantized model through lpot utils.
  • BF16 Mix-Precision on TensorFlow explains how Intel® Low Precision Optimization Tool supports INT8/BF16/FP32 mixed-precision model tuning on the TensorFlow backend.
  • Supported Model Types on TensorFlow explains the TensorFlow model types supported by Intel® Low Precision Optimization Tool.

Install from source

git clone https://github.com/intel/lpot.git
cd lpot
python setup.py install

Install from binary

# install from pip
pip install lpot

# install from conda
conda install lpot -c intel -c conda-forge
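After installing, a quick way to check that the package is visible to the current Python interpreter is to look up its import spec. This is a small sketch using only the standard library; it assumes the top-level import name is `lpot` (matching the pip package name above), and the helper `is_installed` is a hypothetical name used for illustration.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: the import name matches the pip package name "lpot".
if is_installed("lpot"):
    print("lpot is available in this environment")
else:
    print("lpot was not found; check that the install step succeeded")
```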

System Requirements

Intel® Low Precision Optimization Tool supports systems based on Intel 64 architecture or compatible processors, specially optimized for the following CPUs:

  • Intel Xeon Scalable processor (formerly Skylake, Cascade Lake, and Cooper Lake)
  • future Intel Xeon Scalable processor (code name Sapphire Rapids)

Intel® Low Precision Optimization Tool requires the Intel-optimized versions of TensorFlow, PyTorch, and MXNet to be installed.

Validated Hardware/Software Environment

Platform: Cascade Lake, Cooper Lake, Skylake
OS: CentOS 7.8, Ubuntu 18.04
Python: 3.6, 3.7
Framework versions:
  • tensorflow: 2.2.0, 1.15UP1, 2.3.0, 2.1.0, 1.15.2
  • pytorch: 1.5.0+cpu
  • mxnet: 1.7.0, 1.6.0

Model Zoo

Intel® Low Precision Optimization Tool provides many examples that show how small the accuracy loss can be kept while achieving the best performance gain.

TensorFlow results (TOP-1 accuracy and performance speedup):

Framework | Version | Model | Dataset | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8]
tensorflow | 2.2.0 | resnet50v1.0 | ImageNet | 73.80% | 74.30% | -0.67% | 2.25x
tensorflow | 2.2.0 | resnet50v1.5 | ImageNet | 76.80% | 76.50% | 0.39% | 2.32x
tensorflow | 2.2.0 | resnet101 | ImageNet | 77.20% | 76.40% | 1.05% | 2.75x
tensorflow | 2.2.0 | inception_v1 | ImageNet | 70.10% | 69.70% | 0.57% | 1.56x
tensorflow | 2.2.0 | inception_v2 | ImageNet | 74.00% | 74.00% | 0.00% | 1.68x
tensorflow | 2.2.0 | inception_v3 | ImageNet | 77.20% | 76.70% | 0.65% | 2.05x
tensorflow | 2.2.0 | inception_v4 | ImageNet | 80.00% | 80.30% | -0.37% | 2.52x
tensorflow | 2.2.0 | inception_resnet_v2 | ImageNet | 80.20% | 80.40% | -0.25% | 1.75x
tensorflow | 2.2.0 | mobilenetv1 | ImageNet | 71.10% | 71.00% | 0.14% | 1.88x
tensorflow | 2.2.0 | ssd_resnet50_v1 | Coco | 37.72% | 38.01% | -0.76% | 2.88x
tensorflow | 2.2.0 | mask_rcnn_inception_v2 | Coco | 28.75% | 29.13% | -1.30% | 4.14x
tensorflow | 2.2.0 | wide_deep_large_ds | criteo-kaggle | 77.61% | 77.67% | -0.08% | 1.41x
tensorflow | 2.2.0 | vgg16 | ImageNet | 72.10% | 70.90% | 1.69% | 3.71x
tensorflow | 2.2.0 | vgg19 | ImageNet | 72.30% | 71.00% | 1.83% | 3.78x
tensorflow | 2.2.0 | resnetv2_50 | ImageNet | 70.20% | 69.60% | 0.86% | 1.52x
tensorflow | 2.2.0 | resnetv2_101 | ImageNet | 72.50% | 71.90% | 0.83% | 1.59x
tensorflow | 2.2.0 | resnetv2_152 | ImageNet | 72.70% | 72.40% | 0.41% | 1.62x
tensorflow | 2.2.0 | densenet121 | ImageNet | 72.60% | 72.90% | -0.41% | 1.84x
tensorflow | 2.2.0 | densenet161 | ImageNet | 76.10% | 76.30% | -0.26% | 1.44x
tensorflow | 2.2.0 | densenet169 | ImageNet | 74.40% | 74.60% | -0.27% | 1.22x

MXNet results (TOP-1 accuracy and performance speedup):

Framework | Version | Model | Dataset | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8]
mxnet | 1.7.0 | resnet50v1 | ImageNet | 76.03% | 76.33% | -0.39% | 3.18x
mxnet | 1.7.0 | inceptionv3 | ImageNet | 77.80% | 77.64% | 0.21% | 2.65x
mxnet | 1.7.0 | mobilenet1.0 | ImageNet | 71.72% | 72.22% | -0.69% | 2.62x
mxnet | 1.7.0 | mobilenetv2_1.0 | ImageNet | 70.77% | 70.87% | -0.14% | 2.89x
mxnet | 1.7.0 | resnet18_v1 | ImageNet | 69.99% | 70.14% | -0.21% | 3.08x
mxnet | 1.7.0 | squeezenet1.0 | ImageNet | 56.88% | 56.96% | -0.14% | 2.55x
mxnet | 1.7.0 | ssd-resnet50_v1 | VOC | 80.21% | 80.23% | -0.02% | 4.16x
mxnet | 1.7.0 | ssd-mobilenet1.0 | VOC | 74.94% | 75.54% | -0.79% | 3.31x
mxnet | 1.7.0 | resnet152_v1 | ImageNet | 78.32% | 78.54% | -0.28% | 3.16x
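
The two ratio columns in the tables above follow directly from the formulas in their headers. The short sketch below shows the arithmetic; the function names are illustrative and not part of the tool, and the example uses the resnet50v1.0 accuracies from the TensorFlow table.

```python
def accuracy_ratio(int8_accuracy: float, fp32_accuracy: float) -> float:
    """Acc Ratio = (INT8 - FP32) / FP32; negative means INT8 is less accurate."""
    return (int8_accuracy - fp32_accuracy) / fp32_accuracy


def latency_speedup(fp32_latency: float, int8_latency: float) -> float:
    """Realtime Latency Ratio = FP32 latency / INT8 latency; above 1 means INT8 is faster."""
    return fp32_latency / int8_latency


# resnet50v1.0 on TensorFlow: INT8 73.80%, FP32 baseline 74.30%.
ratio = accuracy_ratio(73.80, 74.30)
print(f"{ratio:.2%}")  # -0.67%
```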

Known Issues

  1. The MSE tuning strategy doesn't work with the PyTorch adaptor layer. The MSE tuning strategy needs to compare FP32 tensors with INT8 tensors to decide which ops affect the final quantization accuracy, and the PyTorch adaptor layer doesn't implement this tensor inspection interface. If the model to tune is a PyTorch model, do not choose the MSE tuning strategy.
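
The comparison that the MSE strategy depends on can be pictured with a small sketch. This is not the tool's implementation: the helper names, the op names, and the tensor values are hypothetical, and the INT8 tensors are assumed to be already dequantized back to floats. It only illustrates ranking ops by the mean squared error between their FP32 and INT8 outputs, which is why an adaptor must be able to expose those tensors.

```python
from typing import Dict, List, Sequence, Tuple


def mean_squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equally sized flat tensors."""
    if len(a) != len(b):
        raise ValueError("tensors must have the same number of elements")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rank_ops_by_mse(
    fp32_tensors: Dict[str, Sequence[float]],
    int8_tensors: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (op name, MSE) pairs, largest error first."""
    scores = {
        name: mean_squared_error(fp32_tensors[name], int8_tensors[name])
        for name in fp32_tensors
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up per-op outputs; the INT8 values are assumed to be dequantized.
fp32 = {"conv1": [0.5, -1.0, 2.0], "fc": [1.0, 2.0, 3.0]}
int8_dequantized = {"conv1": [0.5, -1.0, 1.0], "fc": [1.0, 2.0, 3.0]}

ranking = rank_ops_by_mse(fp32, int8_dequantized)
print(ranking[0][0])  # conv1
```

Ops at the top of such a ranking are the most sensitive to quantization and are natural candidates to keep in FP32.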

Support

Please submit your questions, feature requests, and bug reports on the GitHub issues page. You may also reach out to lpot.maintainers@intel.com.

Contributing

We welcome community contributions to Intel® Low Precision Optimization Tool. If you have an idea on how to improve the library:

For additional details, see contribution guidelines.

This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

License

Intel® Low Precision Optimization Tool is licensed under Apache License Version 2.0. This software includes components with separate copyright notices and license terms. Your use of the source code for these components is subject to the terms and conditions of the following licenses.

Apache License Version 2.0:

MIT License:

See accompanying LICENSE file for full license text and copyright notices.


Legal Information

Citing

If you use Intel® Low Precision Optimization Tool in your research or wish to refer to the tuning results published in the Tuning Zoo, please use the following BibTeX entry.

@misc{lpot,
  author =       {Feng Tian and Chuanqi Wang and Guoming Zhang and Penghui Cheng and Pengxin Yuan and Haihao Shen and Jiong Gong},
  title =        {Intel® Low Precision Optimization Tool},
  howpublished = {\url{https://github.com/intel/lpot}},
  year =         {2020}
}

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

neural-compressor-0.1.tar.gz (161.2 kB view details)

Uploaded Source

Built Distribution

neural_compressor-0.1-py3-none-any.whl (258.3 kB view details)

Uploaded Python 3

File details

Details for the file neural-compressor-0.1.tar.gz.

File metadata

  • Download URL: neural-compressor-0.1.tar.gz
  • Upload date:
  • Size: 161.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.4.0.post20200518 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.8.3

File hashes

Hashes for neural-compressor-0.1.tar.gz
Algorithm Hash digest
SHA256 101b7dc8483ee117c64cd08dc9d6647485e38799be444b022f8cc3b5f05569bb
MD5 0221688885244db3bff9aa0f94bd39ac
BLAKE2b-256 6835e42d939c81038a5b67de2d48f2f3d0efd360f070595d5f7664ac97aa8308

See more details on using hashes here.
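
A downloaded file can be checked against the published SHA256 digest with the Python standard library. The sketch below is one way to do that; the helper names are illustrative, and the file path in the commented usage line assumes the archive was saved in the current directory.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# SHA256 digest listed above for the source distribution.
EXPECTED_SHA256 = "101b7dc8483ee117c64cd08dc9d6647485e38799be444b022f8cc3b5f05569bb"

# Usage (assumes the file was downloaded to the current directory):
# matches_published_hash("neural-compressor-0.1.tar.gz", EXPECTED_SHA256)
```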

File details

Details for the file neural_compressor-0.1-py3-none-any.whl.

File metadata

  • Download URL: neural_compressor-0.1-py3-none-any.whl
  • Upload date:
  • Size: 258.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.4.0.post20200518 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.8.3

File hashes

Hashes for neural_compressor-0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 20279c6fbc08d9bafbba3ac1842a487f77a93a2e2b5cdfdcfcf7f53d5f91acd7
MD5 ff0fa29065aab6ee5b0433cb1385f199
BLAKE2b-256 9b86f759ab2b2a91569ec0d91d0425a71651ede6f9e3091914d3e3273c1b9073

See more details on using hashes here.
