triton-model-navigator

Triton Model Navigator provides tools for creating production-ready deep learning inference models.

Project description

Model optimization plays a crucial role in unlocking the maximum performance of the underlying hardware. By applying various transformation techniques, models can be optimized to fully utilize the specific features of the hardware architecture, improving inference performance and reducing cost. Furthermore, in many cases these techniques allow models to be serialized, separating them from the source code. Serialization enhances portability, allowing the models to be seamlessly deployed in production environments. Decoupling models from the source code also facilitates maintenance, updates, and collaboration among developers. However, this process comprises multiple steps and offers various potential paths, making manual execution complicated and time-consuming.

The Triton Model Navigator offers a user-friendly, automated solution for optimizing and deploying machine learning models. It provides a single entry point for the supported frameworks, allowing users to start searching for the best deployment option with a single call to the dedicated optimize function. Model Navigator handles model export, conversion, correctness testing, and profiling to select the optimal model format, and saves the generated artifacts for inference deployment on PyTriton or Triton Inference Server.

The Model Navigator generates multiple optimized and production-ready models. The table below illustrates the model formats that can be obtained by using the Model Navigator with various frameworks.

Table: Supported conversion target formats for each supported Python framework or file.

PyTorch	TensorFlow 2	JAX	ONNX
Torch 2 Compile, TorchScript Trace, TorchScript Script, TorchTensorRT, ONNX, TensorRT	SavedModel, TensorRT in TensorFlow, ONNX, TensorRT	SavedModel, TensorRT in TensorFlow, ONNX, TensorRT	TensorRT

Note: Model Navigator also accepts any Python function as input. In that case, however, it only profiles the function and does not generate any serialized models.

The Model Navigator stores all artifacts within the navigator_workspace. Additionally, it provides the option to save a portable and transferable Navigator Package that includes only the models with minimal latency and maximal throughput. This package also includes base formats that can be used to regenerate the TensorRT plan on the target hardware.
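The idea of keeping only the models with minimal latency and maximal throughput can be pictured with a small sketch. The ProfilingResult class, the format names, and the numbers below are hypothetical and are not part of Model Navigator; the library's own selection logic may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfilingResult:
    """Hypothetical profiling record for one generated model format."""

    model_format: str
    latency_ms: float  # average latency per request
    throughput_ips: float  # inferences per second


def select_for_package(results):
    """Return the formats with minimal latency and maximal throughput."""
    if not results:
        return []
    fastest = min(results, key=lambda r: r.latency_ms)
    highest = max(results, key=lambda r: r.throughput_ips)
    # The same format may win both criteria; keep it only once.
    selected = [fastest.model_format]
    if highest.model_format != fastest.model_format:
        selected.append(highest.model_format)
    return selected


# Made-up numbers, for illustration only.
results = [
    ProfilingResult("torchscript-trace", latency_ms=4.1, throughput_ips=240.0),
    ProfilingResult("onnx", latency_ms=3.2, throughput_ips=310.0),
    ProfilingResult("tensorrt", latency_ms=1.4, throughput_ips=700.0),
    ProfilingResult("torch-tensorrt", latency_ms=1.2, throughput_ips=650.0),
]
selected = select_for_package(results)
print(selected)
```

In this made-up data set two different formats win, one per criterion, so both would be kept.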

Table: Model formats that can be generated from saved Navigator Package and from model sources.

From model source	From Navigator Package
SavedModel, TensorRT in TensorFlow, TorchScript Trace, TorchScript Script, Torch 2 Compile, TorchTensorRT, ONNX, TensorRT	TorchTensorRT, TensorRT in TensorFlow, ONNX, TensorRT

Installation

The package can be installed using an extra index URL:

pip install -U --extra-index-url https://pypi.ngc.nvidia.com triton-model-navigator[<extras,>]

or with nvidia-pyindex:

pip install nvidia-pyindex
pip install -U triton-model-navigator[<extras,>]

Extras:

tensorflow - Model Navigator with dependencies for TensorFlow 2
jax - Model Navigator with dependencies for JAX

No extras are needed for use with PyTorch.
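After installation, a quick way to confirm that the package is visible in the current environment is to query its distribution metadata. This is a minimal sketch that uses only the Python standard library; the helper name installed_version is an illustrative choice and not part of Model Navigator.

```python
from importlib import metadata


def installed_version(distribution="triton-model-navigator"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("triton-model-navigator is not installed in this environment")
else:
    print(f"triton-model-navigator {version} is installed")
```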

Quick Start

Optimizing models using Model Navigator is as simple as calling the optimize function. The optimization process requires at least:

model - a Python object, a callable, or a file path to the model to optimize.
dataloader - a method or class generating input data. The data is utilized to determine the maximum and minimum shapes of the model inputs and create output samples that are used during the optimization process.
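To illustrate how minimum and maximum input shapes can be derived from the samples a dataloader yields, here is a minimal sketch. It is not Model Navigator's implementation: plain tuples stand in for tensor shapes, and the function name shape_bounds is hypothetical.

```python
def shape_bounds(shapes):
    """Return per-dimension (min_shape, max_shape) for shapes of equal rank."""
    shapes = [tuple(s) for s in shapes]
    if not shapes:
        raise ValueError("at least one sample shape is required")
    rank = len(shapes[0])
    if any(len(s) != rank for s in shapes):
        raise ValueError("all sample shapes must have the same rank")
    # zip(*shapes) groups the sizes of each dimension across all samples.
    min_shape = tuple(min(dims) for dims in zip(*shapes))
    max_shape = tuple(max(dims) for dims in zip(*shapes))
    return min_shape, max_shape


# Hypothetical samples with a varying batch size and a fixed image size.
sample_shapes = [(1, 3, 256, 256), (4, 3, 256, 256), (8, 3, 256, 256)]
min_shape, max_shape = shape_bounds(sample_shapes)
print(min_shape, max_shape)
```

If every sample has the same shape, as in a list of identical random tensors, the minimum and maximum shapes coincide.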

Here is an example of running optimize on the Torch Hub ResNet50 model:

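import torch
import model_navigator as nav

# Load the pretrained ResNet50 from Torch Hub and switch it to inference mode.
model = torch.hub.load('NVIDIA/DeepLearningExamples:torchhub', 'nvidia_resnet50', pretrained=True).eval()

# Ten random samples; their shapes define the input shape range for optimization.
dataloader = [torch.randn(1, 3, 256, 256) for _ in range(10)]

# optimize returns a Package object describing the generated artifacts.
package = nav.torch.optimize(
    model=model,
    dataloader=dataloader,
)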

Once the model has been optimized, the created artifacts are stored in navigator_workspace and a Package object is returned from the function. The returned object can be used to create a Navigator Package or to deploy the model on PyTriton or Triton Inference Server. Read more about it in the documentation.

Examples

We provide step-by-step examples that demonstrate how to use various features of Model Navigator. For the sake of readability and accessibility, we use a simple torch.nn.Linear model as an example. These examples illustrate how to optimize, test, and deploy the model on PyTriton and Triton Inference Server.

Examples: https://github.com/triton-inference-server/model_navigator/tree/main/examples.

Release history

Version	Release date
0.9.0	May 7, 2024
0.8.1	Apr 4, 2024
0.8.0	Mar 22, 2024
0.7.7	Mar 15, 2024
0.7.5	Dec 20, 2023
0.7.4	Nov 8, 2023
0.7.3	Sep 27, 2023
0.7.2	Aug 30, 2023
0.7.1	Aug 21, 2023
0.7.0	Aug 11, 2023
0.6.3 (this version)	Jul 25, 2023
0.6.2	Jul 19, 2023
0.6.1	Jul 7, 2023
0.6.0	Jun 30, 2023
0.5.6	Jun 27, 2023
0.5.5	May 26, 2023
0.5.4	May 18, 2023
0.5.3	Apr 19, 2023
0.5.2	Apr 12, 2023
0.5.1	Mar 30, 2023
0.5.0	Mar 30, 2023
0.4.4	Mar 30, 2023
0.4.3	Mar 30, 2023
0.4.2	Mar 29, 2023

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

triton_model_navigator-0.6.3-py3-none-any.whl (254.5 kB)

Uploaded Jul 25, 2023 Python 3

Hashes for triton_model_navigator-0.6.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`69d5aac1ea04c1ccb94227025290b750dd87a9cf907ccdccd12095bb70498b8f`
MD5	`5bf6c6f004210efba68ffcb38c5bf9ca`
BLAKE2b-256	`a7ebaf2514291d06a890620a6e37d8e872068c946f8cbc48904bff0154b5f773`
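A downloaded wheel can be checked against the SHA256 digest listed above. The sketch below uses only hashlib from the standard library; the local file path is an assumption about where the wheel was saved, and the helper names are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed for triton_model_navigator-0.6.3-py3-none-any.whl.
EXPECTED_SHA256 = "69d5aac1ea04c1ccb94227025290b750dd87a9cf907ccdccd12095bb70498b8f"


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of(path) == expected.lower()


# Assumed location of the downloaded wheel; adjust to the actual path.
wheel = Path("triton_model_navigator-0.6.3-py3-none-any.whl")
if wheel.exists():
    print("digest matches:", matches_expected(wheel))
else:
    print(f"{wheel} not found; download it first")
```

pip can perform the same check during installation when hashes are pinned in a requirements file and the --require-hashes option is used.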