
Torch-TensorRT is a package that allows users to automatically compile PyTorch and TorchScript modules to TensorRT while remaining in PyTorch


Torch-TensorRT

Easily achieve the best inference performance for any PyTorch model on the NVIDIA platform.

Torch-TensorRT brings the power of TensorRT to PyTorch. Accelerate inference latency by up to 5x compared to eager execution in just one line of code.

Installation

Stable versions of Torch-TensorRT are published on PyPI

pip install torch-tensorrt

Nightly versions of Torch-TensorRT are published on the PyTorch package index

pip install --pre torch-tensorrt --index-url https://download.pytorch.org/whl/nightly/cu124

Torch-TensorRT is also distributed in the ready-to-run NVIDIA NGC PyTorch Container which has all dependencies with the proper versions and example notebooks included.

For more advanced installation methods, see the installation documentation.

Quickstart

Option 1: torch.compile

You can use Torch-TensorRT anywhere you use torch.compile:

import torch
import torch_tensorrt

model = MyModel().eval().cuda() # define your model here
x = torch.randn((1, 3, 224, 224)).cuda() # define what the inputs to the model will look like

optimized_model = torch.compile(model, backend="tensorrt")
optimized_model(x) # compiled on first run

optimized_model(x) # this will be fast!
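To check the speedup on your own model, a small timing helper like the following can be used. This is a generic sketch, not part of Torch-TensorRT; the `benchmark` helper and its defaults are illustrative.

```python
import time

def benchmark(fn, *args, warmup=3, iters=10):
    """Return the mean latency of fn(*args) in milliseconds."""
    for _ in range(warmup):
        fn(*args)  # warm-up runs let compilation and caching settle
    start = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    return (time.perf_counter() - start) / iters * 1e3

# e.g. compare eager vs. compiled latency:
# eager_ms = benchmark(model, x)
# trt_ms = benchmark(optimized_model, x)
```

Note that CUDA kernels launch asynchronously, so for accurate GPU timing you would also call torch.cuda.synchronize() before reading the clock (or use torch.cuda.Event timers).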

Option 2: Export

If you want to optimize your model ahead-of-time and/or deploy in a C++ environment, Torch-TensorRT provides an export-style workflow that serializes an optimized module. This module can be deployed in PyTorch or with libtorch (i.e. without a Python dependency).

Step 1: Optimize + serialize

import torch
import torch_tensorrt

model = MyModel().eval().cuda() # define your model here
inputs = [torch.randn((1, 3, 224, 224)).cuda()] # define a list of representative inputs here

trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
torch_tensorrt.save(trt_gm, "trt.ep", inputs=inputs) # PyTorch only supports Python runtime for an ExportedProgram. For C++ deployment, use a TorchScript file
torch_tensorrt.save(trt_gm, "trt.ts", output_format="torchscript", inputs=inputs)

Step 2: Deploy

Deployment in PyTorch:
import torch
import torch_tensorrt

inputs = [torch.randn((1, 3, 224, 224)).cuda()] # your inputs go here

# You can run this in a new python session!
model = torch.export.load("trt.ep").module()
# model = torch_tensorrt.load("trt.ep").module() # this also works
model(*inputs)
Deployment in C++:
#include "torch/script.h"
#include "torch_tensorrt/torch_tensorrt.h"

auto trt_mod = torch::jit::load("trt.ts");
auto input_tensor = [...]; // fill this with your inputs
auto results = trt_mod.forward({input_tensor});


Platform Support

  • Linux AMD64 / GPU: Supported
  • Windows / GPU: Supported (Dynamo only)
  • Linux aarch64 / GPU: Native compilation supported on JetPack 4.4+ (use v1.0.0 for the time being)
  • Linux aarch64 / DLA: Native compilation supported on JetPack 4.4+ (use v1.0.0 for the time being)
  • Linux ppc64le / GPU: Not supported

Note: Refer to the NVIDIA L4T PyTorch NGC container for PyTorch libraries on JetPack.

Dependencies

The following dependencies were used to verify the test cases. Torch-TensorRT may work with other versions, but the tests are not guaranteed to pass.

  • Bazel 6.3.2
  • Libtorch 2.6.0 (built with CUDA 12.6)
  • CUDA 12.6
  • TensorRT 10.7.0.23

Deprecation Policy

Deprecation is used to inform developers that some APIs and tools are no longer recommended for use. Beginning with version 2.3, Torch-TensorRT has the following deprecation policy:

  • Deprecation notices are communicated in the Release Notes.
  • Deprecated API functions will have a statement in the source documenting when they were deprecated.
  • Deprecated methods and classes will issue deprecation warnings at runtime, if they are used.
  • Torch-TensorRT provides a 6-month migration period after the deprecation. APIs and tools continue to work during the migration period.
  • After the migration period ends, APIs and tools are removed in a manner consistent with semantic versioning.
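A runtime deprecation warning of the kind described above can be implemented with a standard Python pattern. This is a generic sketch, not Torch-TensorRT's actual mechanism; `deprecated` and `old_api` are illustrative names:

```python
import functools
import warnings

def deprecated(since, alternative=None):
    """Mark a function as deprecated; callers get a DeprecationWarning at runtime."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            msg = f"{fn.__name__} is deprecated since version {since}"
            if alternative:
                msg += f"; use {alternative} instead"
            # stacklevel=2 points the warning at the caller, not this wrapper
            warnings.warn(msg, DeprecationWarning, stacklevel=2)
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@deprecated(since="2.3", alternative="new_api")
def old_api(x):
    return x * 2
```

Note that Python silences DeprecationWarning in most contexts by default; libraries typically rely on test runners (which enable it) to surface these warnings to developers.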

Contributing

Take a look at CONTRIBUTING.md.

License

The Torch-TensorRT license can be found in the LICENSE file. It is a BSD-style license.

Download files


Source Distributions

No source distribution files are available for this release.

Built Distributions


  • torch_tensorrt-2.6.0-cp312-cp312-win_amd64.whl (3.0 MB): CPython 3.12, Windows x86-64
  • torch_tensorrt-2.6.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl (3.8 MB): CPython 3.12, manylinux glibc 2.27+/2.34+ x86-64
  • torch_tensorrt-2.6.0-cp311-cp311-win_amd64.whl (3.0 MB): CPython 3.11, Windows x86-64
  • torch_tensorrt-2.6.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl (3.8 MB): CPython 3.11, manylinux glibc 2.27+/2.34+ x86-64
  • torch_tensorrt-2.6.0-cp310-cp310-win_amd64.whl (3.0 MB): CPython 3.10, Windows x86-64
  • torch_tensorrt-2.6.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl (3.8 MB): CPython 3.10, manylinux glibc 2.27+/2.34+ x86-64
  • torch_tensorrt-2.6.0-cp39-cp39-win_amd64.whl (3.0 MB): CPython 3.9, Windows x86-64
  • torch_tensorrt-2.6.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl (3.8 MB): CPython 3.9, manylinux glibc 2.27+/2.34+ x86-64

File hashes

torch_tensorrt-2.6.0-cp312-cp312-win_amd64.whl
  SHA256: 5bdbaa9dc98440f8d27907df9bee32ac70c3bd2b82aa96378ac90e84c1057254
  MD5: 91d9e7ac80e63743b56e46f56c5c31b7
  BLAKE2b-256: a4a53351fff626380f4df92675b65fe68cbfce213bacb3bd404f7c57df66c4da

torch_tensorrt-2.6.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl
  SHA256: 4ae14f4e596224f29a7b5f89bffcf3b2817d792baf3d15d3b09bb5b62a158e45
  MD5: 67869c053c10e1d4409a458d1751c7fd
  BLAKE2b-256: 6ed32b3343a0c0d6189565455e3851314ee24dd801fb59b359b661f73c2161da

torch_tensorrt-2.6.0-cp311-cp311-win_amd64.whl
  SHA256: 721e3a5135979879b7c3120294c85d22fe3dc378fbbe7a04df01c789fd6b5537
  MD5: 2701ec4e4e9fb6929a011b2b3e4791ca
  BLAKE2b-256: 7b96c3a37f8fd9628058ca49742bf21fec94cc60019750a423144ea03d84879a

torch_tensorrt-2.6.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl
  SHA256: 374b21ef43e2f6a553e4183c126be131d728d7c930f064fe7a44a3ba29c4aefa
  MD5: 236e9b49f3c72655cd8587d96a99c266
  BLAKE2b-256: e7171369dee9812079bf239e78aef7181df71448e3b16dca3a33e8dad259d5bd

torch_tensorrt-2.6.0-cp310-cp310-win_amd64.whl
  SHA256: f184d6375eafe5537c6eafe5b27e333a0d0c9041f0b2b9c7e4597c3e8a04d255
  MD5: 1542d5befd11fa141337eec93d095407
  BLAKE2b-256: 4b515cdf2a96cd766bb819a5590ae56ef83ac0c3260c22ee0a5816f60b2f1fe5

torch_tensorrt-2.6.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl
  SHA256: addb44da551d120fd130060f2339fede0e996b22538dcd398274d692de5c31a6
  MD5: 653f41a5ebd86c909a63f7e86bcf4ba0
  BLAKE2b-256: 9fdb761064704722add37399a6d8ada4f46ad411974c0a6f3ab919af0050c6be

torch_tensorrt-2.6.0-cp39-cp39-win_amd64.whl
  SHA256: 7100df39ef701a0778594302cf0841f000d81274c20fb439bc0531a0e7543351
  MD5: c7d949223808ead14a4aa8736c6c192d
  BLAKE2b-256: 66f7261ae973ae03fe4ef65b73b909026ae0c745fa3add276052210b9061c42d

torch_tensorrt-2.6.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl
  SHA256: d15634f747fb42d9f3a9f7bddf4410b6bb2a36f785387f811c7e0754f9842743
  MD5: 4febd948ff8fe9937575cc26bf81b2fe
  BLAKE2b-256: 62576646711cd5659c26113800c22695c9b515667e1f6cb1bbb84e8cb45f2391
