Package for converting TensorFlow DenseNet121 and DenseNet201 models to ONNX and TensorRT formats.
Project description
quantizedensenet
A Python package for seamlessly converting DenseNet TensorFlow models to ONNX and TensorRT formats, enabling optimized inference on NVIDIA GPUs.
Features
- TensorFlow to ONNX Conversion: Convert SavedModel or model weights to ONNX format with FP32 or FP16 precision.
- ONNX to TensorRT Conversion: Build optimized TensorRT engines with dynamic batch sizes.
- Direct TF to TRT Pipeline: One-step conversion from TensorFlow to TensorRT.
- INT8 Quantization Support: Improve inference speed with INT8 calibration using image directories or cache files.
- DenseNet Support: Specialized support for DenseNet121 and DenseNet201 architectures.
- Comprehensive Logging: Detailed conversion process tracking and validation.
Installation
Prerequisites
Before installing the package, ensure you have the following:
- Python 3.8 or higher (check with python --version)
- pip, the Python package manager (check with pip --version)
- NVIDIA GPU with CUDA support
- CUDA Toolkit 12.x (required by tensorrt_cu12 and cuda_bindings==12.9.2)
- TensorRT 10.14+
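Before converting, a quick sanity check (not part of this package) can confirm that the GPU stack is visible; this is a minimal sketch assuming TensorFlow and the TensorRT Python bindings are already installed:
import tensorflow as tf
import tensorrt as trt

# Print installed versions and the GPUs TensorFlow can see.
print("TensorFlow:", tf.__version__)
print("TensorRT:", trt.__version__)
print("GPUs:", tf.config.list_physical_devices("GPU"))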
Install from Wheel File
- Download the wheel file provided to you.
- Open a terminal and navigate to the directory containing the .whl file.
- Install the package using pip, for example: pip install quantizedensenet-0.0.5-py3-none-any.whl
- If dependencies are not automatically installed with the wheel file, you can install them using the provided requirements.txt file:
pip install -r requirements.txt
Quick Start
- Install the wheel and dependencies, then import the converter: from quantizedensenet.converter import Converter
- Create a converter instance: converter = Converter()
- Convert TensorFlow to ONNX or straight to TensorRT, with optional FP16/INT8 settings and dynamic batching; a minimal end-to-end sketch follows.
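As a minimal sketch (paths are placeholders; the call mirrors the tf_to_trt example documented below):
from quantizedensenet.converter import Converter

converter = Converter()

# Build an FP16 TensorRT engine from a Keras model in one call,
# with a dynamic batch profile from 1 to 32 images.
engine = converter.tf_to_trt(
    input_model="path/to/saved_model_dir",
    engine_file_path="path/to/model.trt",
    precision="fp16",
    min_batch=1,
    opt_batch=16,
    max_batch=32,
)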
Usage Examples
Converter Class
The Converter class provides three utilities: convert TensorFlow models to ONNX, build TensorRT engines from ONNX, and run a one-call TensorFlow to TensorRT pipeline for deployment-ready inference engines.
Keep In Mind
- The original input shape $(N, H, W, C)$ will be changed to $(N, C, H, W)$ for optimized execution.
- All the functions' output_path or engine_file_path arguments can be None; in that case the functions return the created models/engines in memory instead of writing them to disk.
- Memory Management: It is not recommended to run inference with the converted models immediately after conversion in the same script. The best practice is to restart the Python kernel to free the allocated CUDA memory; on a fresh run you can then load the engine and run inference without errors, as sketched below.
- FP16 TensorRT engines are generally the best choice, as they provide the fastest inference without significant accuracy loss.
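For example, a minimal sketch of loading a previously built engine in a fresh run, using the standard TensorRT Python runtime (the engine path is a placeholder):
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)

# Deserialize the engine written by onnx_to_trt / tf_to_trt in an earlier run.
with open("path/to/output/model.trt", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

# Create an execution context for inference.
context = engine.create_execution_context()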
tf_to_onnx
- Converts a tf.keras SavedModel or a .h5 weights file to an ONNX model.
- If you pass a .h5 file that only contains weights, you must specify the only_weigths_of_model argument (supports: 'DenseNet121', 'DenseNet201').
- The Keras model input shape must be (None, 224, 224, 3).
- Supports exporting to an FP32 or FP16 ONNX graph.
from quantizedensenet.converter import Converter
converter = Converter()
onnx_model = converter.tf_to_onnx(
input_model="path/to/densenet121_weights.h5",
output_path="path/to/output/model.onnx",
precision="fp16",
only_weigths_of_model='DenseNet121', # Specify base model if using weights only
opset=13
)
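If the input is a full SavedModel rather than a weights-only .h5 file, only_weigths_of_model can be omitted; setting output_path to None returns the ONNX model in memory instead of writing it to disk. A sketch with a placeholder SavedModel path:
from quantizedensenet.converter import Converter

converter = Converter()
onnx_model = converter.tf_to_onnx(
    input_model="path/to/densenet121_savedmodel",  # SavedModel with (None, 224, 224, 3) input
    output_path=None,   # return the ONNX model in memory
    precision="fp32",
    opset=13
)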
onnx_to_trt
- Parses an ONNX model and builds a TensorRT engine.
- The input model can be passed as a path to a .onnx file or as an onnx.ModelProto object.
- The ONNX model input shape must be (-1, 3, 224, 224).
- If engine_file_path is None and auto_generate_engine_path=True, the engine path is auto-generated from the input filename.
- Supports exporting to FP32, FP16, or INT8 TRT engines.
- Dynamic Batching: You must specify 3 arguments:
  - min_batch: The minimum number of images a batch could ever contain (usually 1).
  - opt_batch: The most common batch size for inference (should be close to max_batch).
  - max_batch: The maximum number of images a batch could ever contain.
from quantizedensenet.converter import Converter
converter = Converter()
engine = converter.onnx_to_trt(
input_model="path/to/model.onnx",
engine_file_path="path/to/output/model.trt",
precision="fp16",
min_batch=1,
opt_batch=16,
max_batch=32
)
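If engine_file_path is left as None with auto_generate_engine_path=True, the engine path is derived from the ONNX filename; a sketch of that variant (batch sizes are illustrative):
from quantizedensenet.converter import Converter

converter = Converter()
engine = converter.onnx_to_trt(
    input_model="path/to/model.onnx",
    engine_file_path=None,           # path auto-generated from the input filename
    auto_generate_engine_path=True,
    precision="fp32",
    min_batch=1,
    opt_batch=8,
    max_batch=8,
)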
- INT8 Calibration: INT8 mode requires calibration data or an existing cache. You can provide:
  - A directory path containing images.
  - A single image path.
  - A list of image paths.
  - A path to a calibration cache file.
- If calibration_cache is provided but does not exist, it will be created using the calibration_images.
from quantizedensenet.converter import Converter
converter = Converter()
model = converter.onnx_to_trt(
input_model="path/to/model.onnx",
engine_file_path="path/to/int8_densenet.trt",
min_batch=1,
max_batch=32,
opt_batch=32,
precision="int8",
calibration_images="path/to/calibration/images/dir",
calibration_cache="path/to/calibration.cache",
)
tf_to_trt
Runs the end-to-end pipeline in one call. Exports a TensorFlow model to ONNX, then builds a TensorRT engine with the selected precision and batch profiles, streamlining deployment.
- If you pass a .h5 file that only contains weights, you should also specify the base model using only_weigths_of_model.
- When using INT8, provide calibration_images or a calibration_cache.
from quantizedensenet.converter import Converter
converter = Converter()
model = converter.tf_to_trt(
input_model="path/to/densenet201_weights.h5",
engine_file_path="path/to/int8_densenet201.trt",
only_weigths_of_model='DenseNet201',
min_batch=1,
opt_batch=32,
max_batch=32,
precision="int8",
calibration_images="path/to/calibration/images/dir",
calibration_cache="path/to/calibration.cache",
auto_generate_engine_path=False
)
Project details
Download files
Source Distribution
- quantizedensenet-0.0.5.tar.gz (16.4 kB)
Built Distribution
- quantizedensenet-0.0.5-py3-none-any.whl (16.3 kB)
File details
Details for the file quantizedensenet-0.0.5.tar.gz.
File metadata
- Download URL: quantizedensenet-0.0.5.tar.gz
- Upload date:
- Size: 16.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9782053c43dabd429425e2afa5c9e91dea463e505505f54d61b086d7459498e1 |
| MD5 | ca35432d6b1af948c09ffd2fe8265c77 |
| BLAKE2b-256 | 93745661e12702cd71386f16e24c3836e69492c7cd10b59eb8bc2f84bdbf5331 |
Provenance
The following attestation bundles were made for quantizedensenet-0.0.5.tar.gz:
Publisher: python-publish.yml on Geridev/quantizedensenet
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: quantizedensenet-0.0.5.tar.gz
  - Subject digest: 9782053c43dabd429425e2afa5c9e91dea463e505505f54d61b086d7459498e1
- Sigstore transparency entry: 715830599
- Sigstore integration time:
- Permalink: Geridev/quantizedensenet@0237a38bdd3799a407d9996658b297be0d21b775
- Branch / Tag: refs/heads/master
- Owner: https://github.com/Geridev
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@0237a38bdd3799a407d9996658b297be0d21b775
- Trigger Event: push
File details
Details for the file quantizedensenet-0.0.5-py3-none-any.whl.
File metadata
- Download URL: quantizedensenet-0.0.5-py3-none-any.whl
- Upload date:
- Size: 16.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5f625cc2bc6a4fca4af8acb20b396454b1ce0fe13327994d85eec880bf5effb9 |
| MD5 | d7ccff103585a30db09791ab8b7162bf |
| BLAKE2b-256 | 06f31a5507de13e5fa430b00dd18ebcc30915ca194185a0e031bfdcda7355b2e |
Provenance
The following attestation bundles were made for quantizedensenet-0.0.5-py3-none-any.whl:
Publisher:
python-publish.yml on Geridev/quantizedensenet
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
quantizedensenet-0.0.5-py3-none-any.whl -
Subject digest:
5f625cc2bc6a4fca4af8acb20b396454b1ce0fe13327994d85eec880bf5effb9 - Sigstore transparency entry: 715830600
- Sigstore integration time:
-
Permalink:
Geridev/quantizedensenet@0237a38bdd3799a407d9996658b297be0d21b775 -
Branch / Tag:
refs/heads/master - Owner: https://github.com/Geridev
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
python-publish.yml@0237a38bdd3799a407d9996658b297be0d21b775 -
Trigger Event:
push
-
Statement type: