Torch-TensorRT is a package that lets users automatically compile PyTorch and TorchScript modules to TensorRT while remaining in PyTorch.

Torch-TensorRT

Easily achieve the best inference performance for any PyTorch model on the NVIDIA platform.


Torch-TensorRT brings the power of TensorRT to PyTorch. Reduce inference latency by up to 5x compared to eager execution with just one line of code.

Installation

Stable versions of Torch-TensorRT are published on PyPI:

pip install torch-tensorrt

Nightly versions of Torch-TensorRT are published on the PyTorch package index:

pip install --pre torch-tensorrt --index-url https://download.pytorch.org/whl/nightly/cu130

Torch-TensorRT is also distributed in the ready-to-run NVIDIA NGC PyTorch Container, which includes all dependencies at the proper versions along with example notebooks.

For more advanced installation methods, see the Torch-TensorRT documentation.

Quickstart

Option 1: torch.compile

You can use Torch-TensorRT anywhere you use torch.compile:

import torch
import torch_tensorrt

model = MyModel().eval().cuda() # define your model here
x = torch.randn((1, 3, 224, 224)).cuda() # define what the inputs to the model will look like

optimized_model = torch.compile(model, backend="tensorrt")
optimized_model(x) # compiled on first run

optimized_model(x) # this will be fast!
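
Because compilation happens on the first call, a fair latency measurement needs to separate that call from the ones that follow. The sketch below is one way to do that with the standard library only. The helper name `time_calls` and its `warmup`, `iters`, and `sync` parameters are illustrative and not part of Torch-TensorRT.

```python
import time
from statistics import median


def time_calls(fn, *args, warmup=1, iters=10, sync=None):
    """Return (first_call_seconds, median_steady_state_seconds) for fn(*args)."""

    def timed():
        start = time.perf_counter()
        fn(*args)
        if sync is not None:
            sync()  # e.g. torch.cuda.synchronize, so queued GPU work is counted
        return time.perf_counter() - start

    first = timed()                      # includes any one-time compilation
    for _ in range(max(warmup - 1, 0)):  # extra warm-up calls, not reported
        timed()
    steady = median(timed() for _ in range(iters))
    return first, steady


# Hypothetical usage with the model from the quickstart above:
# first, steady = time_calls(optimized_model, x, sync=torch.cuda.synchronize)
```

Passing a synchronization callable matters on a GPU because kernel launches are asynchronous; without it the timer may stop before the work has finished.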

Option 2: Export

If you want to optimize your model ahead-of-time and/or deploy in a C++ environment, Torch-TensorRT provides an export-style workflow that serializes an optimized module. This module can be deployed in PyTorch or with libtorch (i.e. without a Python dependency).

Step 1: Optimize + serialize

import torch
import torch_tensorrt

model = MyModel().eval().cuda() # define your model here
inputs = [torch.randn((1, 3, 224, 224)).cuda()] # define a list of representative inputs here

trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)

# PyTorch only supports the Python runtime for an ExportedProgram.
torch_tensorrt.save(trt_gm, "trt.ep", inputs=inputs)

# For C++ deployment, save a TorchScript file instead.
torch_tensorrt.save(trt_gm, "trt.ts", output_format="torchscript", inputs=inputs)

Step 2: Deploy

Deployment in PyTorch:

import torch
import torch_tensorrt

inputs = [torch.randn((1, 3, 224, 224)).cuda()] # your inputs go here

# You can run this in a new Python session!
model = torch.export.load("trt.ep").module()
# model = torch_tensorrt.load("trt.ep").module() # this also works
model(*inputs)

Deployment in C++:

#include "torch/script.h"
#include "torch_tensorrt/torch_tensorrt.h"

auto trt_mod = torch::jit::load("trt.ts");
auto input_tensor = [...]; // fill this with your inputs
auto results = trt_mod.forward({input_tensor});

Further resources

Platform Support

| Platform            | Support                                         |
|---------------------|-------------------------------------------------|
| Linux AMD64 / GPU   | Supported                                       |
| Linux SBSA / GPU    | Supported                                       |
| Windows / GPU       | Supported (Dynamo only)                         |
| Linux Jetson / GPU  | Source Compilation Supported on JetPack-4.4+    |
| Linux Jetson / DLA  | Source Compilation Supported on JetPack-4.4+    |
| Linux ppc64le / GPU | Not supported                                   |

Note: Refer to the NVIDIA L4T PyTorch NGC container for PyTorch libraries on JetPack.
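
To find which row of the platform table applies to the current machine, a small lookup keyed on the standard-library `platform` module can help. This is a minimal sketch: the `SUPPORT` mapping and `support_status` helper are hypothetical, and the `(system, machine)` spellings are assumptions about what `platform.system()` and `platform.machine()` report on each platform.

```python
import platform

# Support table from this README, keyed by (system, machine).
# The key spellings are assumptions, not values defined by Torch-TensorRT.
SUPPORT = {
    ("Linux", "x86_64"): "Supported",
    ("Linux", "aarch64"): "Supported on SBSA; source compilation on Jetson (JetPack 4.4+)",
    ("Windows", "AMD64"): "Supported (Dynamo only)",
    ("Linux", "ppc64le"): "Not supported",
}


def support_status(system=None, machine=None):
    """Look up the support level, defaulting to the current machine."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return SUPPORT.get((system, machine), "Not listed")


print(support_status())
```

SBSA and Jetson both report an ARM64 machine type, so this lookup cannot tell them apart; the entry for that key simply states both cases.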

Dependencies

The following dependencies were used to verify the test cases. Torch-TensorRT can work with other versions, but the tests are not guaranteed to pass.

  • Bazel 8.1.1
  • Libtorch 2.11.0.dev (latest nightly)
  • CUDA 13.0 (CUDA 12.6 on Jetson)
  • TensorRT 10.14.1.48 (TensorRT 10.3 on Jetson)
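
A quick way to compare a local environment against the verified versions above is to read the installed distribution metadata. The sketch below uses only `importlib.metadata`. The distribution names `torch` and `tensorrt` are assumptions about how the packages are named for pip, and the `matches` helper is illustrative.

```python
from importlib import metadata

# Major.minor versions listed above as verified (non-Jetson values).
# The distribution names are assumptions about the pip package names.
VERIFIED = {"torch": "2.11", "tensorrt": "10.14"}


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def matches(installed, verified_prefix):
    """True if `installed` starts with the dotted components of `verified_prefix`."""
    if installed is None:
        return False
    want = verified_prefix.split(".")
    return installed.split(".")[: len(want)] == want


for name, prefix in VERIFIED.items():
    found = installed_version(name)
    status = "matches" if matches(found, prefix) else "differs from"
    print(f"{name}: installed={found}, {status} verified {prefix}.x")
```

A mismatch does not mean Torch-TensorRT will fail, only that the combination falls outside what the tests were verified against.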

Deprecation Policy

Deprecation is used to inform developers that some APIs and tools are no longer recommended for use. Beginning with version 2.3, Torch-TensorRT has the following deprecation policy:

  • Deprecation notices are communicated in the Release Notes.
  • Deprecated API functions will have a statement in the source documenting when they were deprecated.
  • Deprecated methods and classes will issue deprecation warnings at runtime if they are used.
  • Torch-TensorRT provides a 6-month migration period after the deprecation. APIs and tools continue to work during the migration period.
  • After the migration period ends, APIs and tools are removed in a manner consistent with semantic versioning.
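
As an illustration of what a runtime deprecation warning with a migration window can look like, here is a minimal sketch using Python's `warnings` module. It is not Torch-TensorRT's implementation: the `deprecated` decorator, the `add_months` helper, and the dates and function names are all hypothetical.

```python
import functools
import warnings
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping the day to 28 so it stays valid."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, min(d.day, 28))


def deprecated(since, replacement, migration_months=6):
    """Mark a callable as deprecated and warn each time it is called."""
    removal = add_months(since, migration_months)

    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{fn.__name__} is deprecated since {since}; use {replacement}. "
                f"It may be removed after {removal}.",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return fn(*args, **kwargs)

        return wrapper

    return decorate


@deprecated(since=date(2024, 1, 15), replacement="new_api")  # hypothetical values
def old_api(x):
    return x * 2
```

The deprecated function keeps working during the migration window; callers only see a `DeprecationWarning`, which Python hides by default outside of `__main__` unless warnings are enabled (for example with `python -W default`).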

Contributing

Take a look at CONTRIBUTING.md.

License

Torch-TensorRT is released under a BSD-style license. The full text can be found in the LICENSE file.

Built Distributions

  • torch_tensorrt_rtx-2.11.0-cp313-cp313-win_amd64.whl (1.7 MB): CPython 3.13, Windows x86-64
  • torch_tensorrt_rtx-2.11.0-cp313-cp313-manylinux_2_28_x86_64.whl (3.4 MB): CPython 3.13, manylinux (glibc 2.28+) x86-64
  • torch_tensorrt_rtx-2.11.0-cp312-cp312-win_amd64.whl (1.7 MB): CPython 3.12, Windows x86-64
  • torch_tensorrt_rtx-2.11.0-cp312-cp312-manylinux_2_28_x86_64.whl (3.4 MB): CPython 3.12, manylinux (glibc 2.28+) x86-64
  • torch_tensorrt_rtx-2.11.0-cp311-cp311-win_amd64.whl (1.7 MB): CPython 3.11, Windows x86-64
  • torch_tensorrt_rtx-2.11.0-cp311-cp311-manylinux_2_28_x86_64.whl (3.4 MB): CPython 3.11, manylinux (glibc 2.28+) x86-64
  • torch_tensorrt_rtx-2.11.0-cp310-cp310-win_amd64.whl (1.7 MB): CPython 3.10, Windows x86-64
  • torch_tensorrt_rtx-2.11.0-cp310-cp310-manylinux_2_28_x86_64.whl (3.4 MB): CPython 3.10, manylinux (glibc 2.28+) x86-64