Torch-TensorRT is a package which allows users to automatically compile PyTorch and TorchScript modules to TensorRT while remaining in PyTorch

Torch-TensorRT

Easily achieve the best inference performance for any PyTorch model on the NVIDIA platform.


Torch-TensorRT brings the power of TensorRT to PyTorch, cutting inference latency by up to 5x compared to eager execution with just one line of code.
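Speedups like this are workload-dependent, so it is worth measuring on your own model. A minimal, framework-agnostic sketch of how one might compare the latency of two callables (the function names here are illustrative, not part of the Torch-TensorRT API; for GPU work you would also synchronize the device before stopping the timer):

```python
import time

def median_latency_ms(fn, *args, warmup=3, iters=20):
    """Median wall-clock latency of fn(*args) in milliseconds."""
    for _ in range(warmup):  # discard warmup runs (compilation cost lands here)
        fn(*args)
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - start) * 1e3)
    samples.sort()
    return samples[len(samples) // 2]

# Illustrative stand-ins for an eager and an optimized model:
def slow_sum_of_squares(n):
    return sum(i * i for i in range(n))

def closed_form(n):
    return n * (n - 1) * (2 * n - 1) // 6  # same result, constant time

speedup = (median_latency_ms(slow_sum_of_squares, 100_000)
           / median_latency_ms(closed_form, 100_000))
```

Taking the median rather than the mean keeps a single slow outlier run from skewing the result.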

Installation

Stable versions of Torch-TensorRT are published on PyPI

pip install torch-tensorrt

Nightly versions of Torch-TensorRT are published on the PyTorch package index

pip install --pre torch-tensorrt --index-url https://download.pytorch.org/whl/nightly/cu124

Torch-TensorRT is also distributed in the ready-to-run NVIDIA NGC PyTorch Container which has all dependencies with the proper versions and example notebooks included.

For more advanced installation methods, see the installation documentation.

Quickstart

Option 1: torch.compile

You can use Torch-TensorRT anywhere you use torch.compile:

import torch
import torch_tensorrt

model = MyModel().eval().cuda() # define your model here
x = torch.randn((1, 3, 224, 224)).cuda() # define what the inputs to the model will look like

optimized_model = torch.compile(model, backend="tensorrt")
optimized_model(x) # compiled on first run

optimized_model(x) # this will be fast!
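The compile-on-first-run behavior above is the standard lazy-compilation pattern: the expensive transformation happens once, on the first call, and every later call reuses the cached artifact. A pure-Python sketch of the idea (not Torch-TensorRT's actual implementation):

```python
class LazyCompiled:
    """Wraps a function; 'compiles' it on the first call, reuses the result after."""

    def __init__(self, fn, compile_fn):
        self._fn = fn
        self._compile_fn = compile_fn  # expensive one-time transformation
        self._compiled = None

    def __call__(self, *args):
        if self._compiled is None:     # first run: pay the compilation cost once
            self._compiled = self._compile_fn(self._fn)
        return self._compiled(*args)   # later runs: use the cached artifact

# Illustrative usage: the "compiler" here just records that it ran.
calls = []

def fake_compile(fn):
    calls.append("compiled")
    return fn

wrapped = LazyCompiled(lambda x: x + 1, fake_compile)
wrapped(1)
wrapped(2)
assert calls == ["compiled"]  # compiled exactly once despite two calls
```

This is why the first call to `optimized_model(x)` is slow and subsequent calls are fast.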

Option 2: Export

If you want to optimize your model ahead-of-time and/or deploy in a C++ environment, Torch-TensorRT provides an export-style workflow that serializes an optimized module. This module can be deployed in PyTorch or with libtorch (i.e. without a Python dependency).

Step 1: Optimize + serialize

import torch
import torch_tensorrt

model = MyModel().eval().cuda() # define your model here
inputs = [torch.randn((1, 3, 224, 224)).cuda()] # define a list of representative inputs here

trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
torch_tensorrt.save(trt_gm, "trt.ep", inputs=inputs) # PyTorch only supports Python runtime for an ExportedProgram. For C++ deployment, use a TorchScript file
torch_tensorrt.save(trt_gm, "trt.ts", output_format="torchscript", inputs=inputs)

Step 2: Deploy

Deployment in PyTorch:
import torch
import torch_tensorrt

inputs = [torch.randn((1, 3, 224, 224)).cuda()] # your inputs go here

# You can run this in a new python session!
model = torch.export.load("trt.ep").module()
# model = torch_tensorrt.load("trt.ep").module() # this also works
model(*inputs)
Deployment in C++:
#include "torch/script.h"
#include "torch_tensorrt/torch_tensorrt.h"

auto trt_mod = torch::jit::load("trt.ts");
auto input_tensor = [...]; // fill this with your inputs
auto results = trt_mod.forward({input_tensor});

Platform Support

Linux AMD64 / GPU: Supported
Windows / GPU: Supported (Dynamo only)
Linux aarch64 / GPU: Native compilation supported on JetPack 4.4+ (use v1.0.0 for the time being)
Linux aarch64 / DLA: Native compilation supported on JetPack 4.4+ (use v1.0.0 for the time being)
Linux ppc64le / GPU: Not supported

Note: Refer to the NVIDIA L4T PyTorch NGC container for PyTorch libraries on JetPack.

Dependencies

The following dependency versions were used to verify the test cases. Torch-TensorRT may work with other versions, but the tests are not guaranteed to pass.

  • Bazel 6.3.2
  • Libtorch 2.6.0 (built with CUDA 12.6)
  • CUDA 12.6
  • TensorRT 10.7.0.23
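Since other versions may work but are untested, it can be useful to assert minimum versions at startup. A small sketch of numeric version comparison (the baseline strings are the tested versions from the list above; the helper names are illustrative):

```python
def version_tuple(v: str) -> tuple:
    """'10.7.0.23' -> (10, 7, 0, 23); numeric components only."""
    parts = []
    for p in v.split("."):
        if not p.isdigit():
            break  # stop at non-numeric suffixes like 'rc1'
        parts.append(int(p))
    return tuple(parts)

# Tested baselines from the dependency list above
TESTED = {"CUDA": "12.6", "TensorRT": "10.7.0.23", "Libtorch": "2.6.0"}

def meets_minimum(installed: str, minimum: str) -> bool:
    return version_tuple(installed) >= version_tuple(minimum)

# e.g. CUDA 12.8 satisfies the tested 12.6 baseline:
# meets_minimum("12.8", TESTED["CUDA"]) -> True
```

Tuple comparison gives the right ordering element by element, so "12.10" correctly sorts above "12.6" where plain string comparison would not.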

Deprecation Policy

Deprecation is used to inform developers that some APIs and tools are no longer recommended for use. Beginning with version 2.3, Torch-TensorRT has the following deprecation policy:

  • Deprecation notices are communicated in the Release Notes.
  • Deprecated API functions have a statement in the source documenting when they were deprecated.
  • Deprecated methods and classes issue deprecation warnings at runtime, if they are used.
  • Torch-TensorRT provides a 6-month migration period after the deprecation, during which APIs and tools continue to work.
  • After the migration period ends, APIs and tools are removed in a manner consistent with semantic versioning.
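The runtime deprecation warnings described above follow Python's standard warnings machinery. A generic sketch of how a library can deprecate a function this way (the decorator and function names are illustrative, not actual Torch-TensorRT APIs):

```python
import functools
import warnings

def deprecated(since: str, remove_in: str):
    """Decorator that emits a DeprecationWarning when the wrapped function is called."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{fn.__name__} is deprecated since {since} and will be "
                f"removed in {remove_in}",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return fn(*args, **kwargs)
        return wrapper
    return decorate

@deprecated(since="2.3", remove_in="2.5")
def old_api(x):
    return x * 2
```

Calling `old_api(3)` still returns `6` during the migration period; it just warns first.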

Contributing

Take a look at CONTRIBUTING.md.

License

The Torch-TensorRT license can be found in the LICENSE file; it is a BSD-style license.

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions


  • torch_tensorrt-2.6.1-cp312-cp312-win_amd64.whl (3.0 MB): CPython 3.12, Windows x86-64
  • torch_tensorrt-2.6.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl (15.7 MB): CPython 3.12, manylinux glibc 2.27+/2.34+ x86-64
  • torch_tensorrt-2.6.1-cp311-cp311-win_amd64.whl (3.0 MB): CPython 3.11, Windows x86-64
  • torch_tensorrt-2.6.1-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl (15.7 MB): CPython 3.11, manylinux glibc 2.27+/2.34+ x86-64
  • torch_tensorrt-2.6.1-cp310-cp310-win_amd64.whl (3.0 MB): CPython 3.10, Windows x86-64
  • torch_tensorrt-2.6.1-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl (15.7 MB): CPython 3.10, manylinux glibc 2.27+/2.34+ x86-64
  • torch_tensorrt-2.6.1-cp39-cp39-win_amd64.whl (3.0 MB): CPython 3.9, Windows x86-64
  • torch_tensorrt-2.6.1-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl (15.6 MB): CPython 3.9, manylinux glibc 2.27+/2.34+ x86-64

File hashes

torch_tensorrt-2.6.1-cp312-cp312-win_amd64.whl
  SHA256       a923b7651fffd93577383b766734d0be809a27c53f62469f545c63a476c07dde
  MD5          c7f5fe1dac79ea26133792adf8763055
  BLAKE2b-256  1d61cf8a5eefe7bca51b8181681b8cf5df013c04c6d05bfcc09f5e53e7be7bb2

torch_tensorrt-2.6.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl
  SHA256       94a71a8c7ab65011a1aea8ab16d6c2c0398790e384ab5acaa53271d94a29dd97
  MD5          c9664df5d723d1c32460425a6b948890
  BLAKE2b-256  816742dc1652c1945dc58545f50c32655855e5f4752616045cbfefc5be10b1f2

torch_tensorrt-2.6.1-cp311-cp311-win_amd64.whl
  SHA256       0c4ce70e3db9f30aa2c8c421c06daf8ff184c3ef44b1ee1b22f3bc2276e7dbb8
  MD5          782b01f2264c4af914177fadff72e722
  BLAKE2b-256  6bf27aec4058502879735f07230ff2e106d5ca93e808e8d2faeab1b6fa8d7531

torch_tensorrt-2.6.1-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl
  SHA256       e2d8927732c1ce38663ded93d058d5a8e99893bc2610b77a81753863a2a4e701
  MD5          e6b24d710105ebd5479f664e0245ab68
  BLAKE2b-256  f6cf0193a78f5217cc3d614dfa56f64131cfdb3da90c60ac4ba9db9e0086b1f1

torch_tensorrt-2.6.1-cp310-cp310-win_amd64.whl
  SHA256       0b869b5b0b0af3c5fc37d41f27601c982b9265894cbaf49f22d90e08c71499d4
  MD5          4f828bda42bb1d3022a972be56ecf4e2
  BLAKE2b-256  4c32593b413f4e484acc3a4cd36f00d7ca87b4536ef812f8309a64b1c2bf1389

torch_tensorrt-2.6.1-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl
  SHA256       91a5ee05dd64fa0f8de5ff3656f4afa88aeb8fd3500c4dec5e68f892d3985136
  MD5          dce29ba6859157189aafa1316465518b
  BLAKE2b-256  7e4612fde0cef9764302f25f5b04f6d429999425ea030b9812b4a6c49c21f35b

torch_tensorrt-2.6.1-cp39-cp39-win_amd64.whl
  SHA256       01805d6c914fee066d0375efefb4aecb88290a9c83773c4683500818a6fac7ff
  MD5          c6cdba9916c9471c650a184da296335a
  BLAKE2b-256  132b77e9fb41de05d68a6041800aadc58df583a20167c7c93b5768788196e2bf

torch_tensorrt-2.6.1-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_34_x86_64.whl
  SHA256       112fab0dc12327fe43a4fee88b21ce5e4b9a42a3739d60b80a92670c75feaaaa
  MD5          c40b6a258f2fe1691411bf0a7fe4609d
  BLAKE2b-256  2b84d4c463397a7fe20bc971d75acf7e809b0883cc7558461d9ca7534c572d5b
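A downloaded wheel can be checked against a published SHA256 digest with Python's standard hashlib, streaming the file in chunks so large wheels do not need to fit in memory:

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Usage: compare against the digest published above for the wheel you downloaded,
# e.g. for the cp312 Windows wheel:
# sha256_of_file("torch_tensorrt-2.6.1-cp312-cp312-win_amd64.whl") should equal
# "a923b7651fffd93577383b766734d0be809a27c53f62469f545c63a476c07dde"
```

Note that pip already verifies these hashes automatically when installing from PyPI; a manual check is mainly useful for wheels fetched out of band.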
