Torch-TensorRT is a package which allows users to automatically compile PyTorch and TorchScript modules to TensorRT while remaining in PyTorch

Torch-TensorRT

Easily achieve the best inference performance for any PyTorch model on the NVIDIA platform.


Torch-TensorRT brings the power of TensorRT to PyTorch. Speed up inference by up to 5x compared to eager execution with just one line of code.

Installation

Stable versions of Torch-TensorRT are published on PyPI:

pip install torch-tensorrt

Nightly versions of Torch-TensorRT are published on the PyTorch package index:

pip install --pre torch-tensorrt --index-url https://download.pytorch.org/whl/nightly/cu130

Torch-TensorRT is also distributed in the ready-to-run NVIDIA NGC PyTorch Container which has all dependencies with the proper versions and example notebooks included.

For more advanced installation methods, please see here
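After either install command, it can help to confirm which version actually landed in the active environment. The sketch below uses only the Python standard library and does not import the package itself; the helper names `installed_version` and `report` are illustrative and not part of Torch-TensorRT.

```python
from importlib import metadata
from importlib.util import find_spec


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(dist_name="torch-tensorrt", module_name="torch_tensorrt"):
    """Summarize whether the distribution and its import name are visible."""
    version = installed_version(dist_name)
    if version is None:
        return f"{dist_name}: not installed"
    # find_spec locates a top-level module without importing it.
    importable = find_spec(module_name) is not None
    return f"{dist_name} {version} (importable: {importable})"


print(report())
```

Running this in a fresh environment before and after `pip install` shows whether the stable or a nightly build is the one being picked up.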

Quickstart

Option 1: torch.compile

You can use Torch-TensorRT anywhere you use torch.compile:

import torch
import torch_tensorrt

model = MyModel().eval().cuda() # define your model here
x = torch.randn((1, 3, 224, 224)).cuda() # define what the inputs to the model will look like

optimized_model = torch.compile(model, backend="tensorrt")
optimized_model(x) # compiled on first run

optimized_model(x) # this will be fast!
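Because compilation happens on the first call, the first call should be timed separately from later calls. The following is a minimal sketch of such a measurement; `LazyCompiled` is a stand-in that only simulates a one-time build cost with `time.sleep` and is not a Torch-TensorRT class. When timing a real CUDA model, `torch.cuda.synchronize()` should be called before each clock read, since GPU kernels launch asynchronously.

```python
import time


def time_calls(fn, arg, repeats=5):
    """Return (first_call_seconds, mean_seconds_of_later_calls)."""
    start = time.perf_counter()
    fn(arg)  # the first call pays any one-time compilation cost
    first = time.perf_counter() - start

    later = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        later.append(time.perf_counter() - start)
    return first, sum(later) / len(later)


class LazyCompiled:
    """Stand-in for a lazily compiled callable (hypothetical, for illustration)."""

    def __init__(self):
        self.compiled = False

    def __call__(self, x):
        if not self.compiled:
            time.sleep(0.05)  # pretend to build an engine once
            self.compiled = True
        return x * 2


first, steady = time_calls(LazyCompiled(), 3)
print(f"first call: {first:.4f}s, later calls: {steady:.6f}s")
```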

Option 2: Export

If you want to optimize your model ahead-of-time and/or deploy in a C++ environment, Torch-TensorRT provides an export-style workflow that serializes an optimized module. This module can be deployed in PyTorch or with libtorch (i.e. without a Python dependency).

Step 1: Optimize + serialize

import torch
import torch_tensorrt

model = MyModel().eval().cuda() # define your model here
inputs = [torch.randn((1, 3, 224, 224)).cuda()] # define a list of representative inputs here

trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
# An ExportedProgram can only be run from Python. For C++ deployment, also save a TorchScript file.
torch_tensorrt.save(trt_gm, "trt.ep", inputs=inputs)
torch_tensorrt.save(trt_gm, "trt.ts", output_format="torchscript", inputs=inputs)

Step 2: Deploy

Deployment in PyTorch:

import torch
import torch_tensorrt

inputs = [torch.randn((1, 3, 224, 224)).cuda()] # your inputs go here

# You can run this in a new python session!
model = torch.export.load("trt.ep").module()
# model = torch_tensorrt.load("trt.ep").module() # this also works
model(*inputs)

Deployment in C++:

#include "torch/script.h"
#include "torch_tensorrt/torch_tensorrt.h"

auto trt_mod = torch::jit::load("trt.ts");
auto input_tensor = [...]; // fill this with your inputs
auto results = trt_mod.forward({input_tensor});

Further resources

Platform Support

Platform              Support
Linux AMD64 / GPU     Supported
Linux SBSA / GPU      Supported
Windows / GPU         Supported (Dynamo only)
Linux Jetson / GPU    Source Compilation Supported on JetPack-4.4+
Linux Jetson / DLA    Source Compilation Supported on JetPack-4.4+
Linux ppc64le / GPU   Not supported

Note: Refer to the NVIDIA L4T PyTorch NGC container for PyTorch libraries on JetPack.
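As a rough illustration, the platform table can be turned into a lookup keyed by the values Python reports for the current machine. This is a sketch under assumptions: the mapping from table rows to `platform.system()` / `platform.machine()` values is our own, and Jetson boards report the same `Linux` / `aarch64` pair as SBSA servers, so the two cannot be told apart this way.

```python
import platform

# Support statuses copied from the platform table, keyed by (OS, CPU architecture).
# The key values are assumptions about what platform.system()/machine() return.
SUPPORT = {
    ("Linux", "x86_64"): "Supported",
    ("Linux", "aarch64"): "Supported (SBSA); source compilation on Jetson",
    ("Windows", "AMD64"): "Supported (Dynamo only)",
    ("Linux", "ppc64le"): "Not supported",
}


def support_status(system, machine):
    """Look up the documented support status, or report that none is listed."""
    return SUPPORT.get((system, machine), "Not listed")


print(support_status(platform.system(), platform.machine()))
```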

Dependencies

The following dependencies are used to verify the test cases. Torch-TensorRT can work with other versions, but the tests are not guaranteed to pass.

  • Bazel 8.1.1
  • Libtorch 2.12.0.dev (latest nightly)
  • CUDA 13.0 (CUDA 12.6 on Jetson)
  • TensorRT 10.15.1.29 (TensorRT 10.3 on Jetson)
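To see how far an environment drifts from the tested versions, a small comparison helper may be enough. The sketch below compares only the major and minor numbers; that threshold is our assumption, not a compatibility rule stated by the project, and the "installed" strings in the example are made up. In a real environment they could be read from attributes such as `torch.__version__`, `torch.version.cuda` and `tensorrt.__version__`.

```python
def version_tuple(text):
    """Turn '10.15.1.29' or '2.12.0.dev' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'dev'
        parts.append(int(piece))
    return tuple(parts)


def same_major_minor(installed, tested):
    """True when both versions share the same major and minor numbers."""
    return version_tuple(installed)[:2] == version_tuple(tested)[:2]


# Tested versions from the list above (non-Jetson values).
TESTED = {"CUDA": "13.0", "TensorRT": "10.15.1.29", "Libtorch": "2.12.0.dev"}

print(same_major_minor("10.15.0.1", TESTED["TensorRT"]))  # hypothetical installed version
print(same_major_minor("12.6", TESTED["CUDA"]))           # hypothetical installed version
```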

Deprecation Policy

Deprecation is used to inform developers that some APIs and tools are no longer recommended for use. Beginning with version 2.3, Torch-TensorRT has the following deprecation policy:

  • Deprecation notices are communicated in the Release Notes.
  • Deprecated API functions will have a statement in the source documenting when they were deprecated.
  • Deprecated methods and classes will issue deprecation warnings at runtime, if they are used.
  • Torch-TensorRT provides a 6-month migration period after the deprecation.
  • APIs and tools continue to work during the migration period.
  • After the migration period ends, APIs and tools are removed in a manner consistent with semantic versioning.
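The runtime-warning part of this policy can be pictured with a generic Python decorator. This is a sketch of the general technique only, not Torch-TensorRT's actual implementation; `deprecated`, `old_api` and `new_api` are hypothetical names.

```python
import functools
import warnings


def deprecated(since, removal, alternative):
    """Mark a function as deprecated and warn each time it is called."""

    def decorator(func):
        message = (
            f"{func.__name__} is deprecated since {since} and may be removed "
            f"after {removal}; use {alternative} instead."
        )

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 points the warning at the caller's line.
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated(since="2.3", removal="the 6-month migration period", alternative="new_api")
def old_api(x):  # hypothetical function, not part of Torch-TensorRT
    return x + 1


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    result = old_api(1)

print(result, caught[0].category.__name__)
```

The deprecated function keeps working during the migration period; only the warning is added.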

Contributing

Take a look at the CONTRIBUTING.md

License

The Torch-TensorRT license can be found in the LICENSE file. It is licensed under a BSD-style license.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.
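Wheel file names follow the pattern distribution-version-python tag-ABI tag-platform tag. As a minimal sketch, the parser below splits such a name into its fields; it assumes the optional build tag is absent and that no field contains a dash. `WheelName` and `parse_wheel_name` are illustrative names.

```python
from typing import NamedTuple


class WheelName(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename):
    """Split a wheel file name that has no optional build tag."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError(f"expected 5 dash-separated fields: {filename}")
    return WheelName(*parts)


info = parse_wheel_name("torch_tensorrt-2.12.1-cp313-cp313-manylinux_2_28_x86_64.whl")
print(info.python_tag, info.platform_tag)
```

Here `cp313` identifies CPython 3.13 and `manylinux_2_28_x86_64` identifies Linux on x86-64 with glibc 2.28 or newer.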

torch_tensorrt-2.12.1-cp313-cp313-win_amd64.whl (2.0 MB)
Uploaded: CPython 3.13, Windows x86-64

torch_tensorrt-2.12.1-cp313-cp313-manylinux_2_28_x86_64.whl (3.9 MB)
Uploaded: CPython 3.13, manylinux: glibc 2.28+ x86-64

torch_tensorrt-2.12.1-cp313-cp313-manylinux_2_28_aarch64.whl (3.7 MB)
Uploaded: CPython 3.13, manylinux: glibc 2.28+ ARM64

torch_tensorrt-2.12.1-cp312-cp312-win_amd64.whl (2.0 MB)
Uploaded: CPython 3.12, Windows x86-64

torch_tensorrt-2.12.1-cp312-cp312-manylinux_2_28_x86_64.whl (3.9 MB)
Uploaded: CPython 3.12, manylinux: glibc 2.28+ x86-64

torch_tensorrt-2.12.1-cp312-cp312-manylinux_2_28_aarch64.whl (3.7 MB)
Uploaded: CPython 3.12, manylinux: glibc 2.28+ ARM64

torch_tensorrt-2.12.1-cp311-cp311-win_amd64.whl (2.0 MB)
Uploaded: CPython 3.11, Windows x86-64

torch_tensorrt-2.12.1-cp311-cp311-manylinux_2_28_x86_64.whl (3.9 MB)
Uploaded: CPython 3.11, manylinux: glibc 2.28+ x86-64

torch_tensorrt-2.12.1-cp311-cp311-manylinux_2_28_aarch64.whl (3.7 MB)
Uploaded: CPython 3.11, manylinux: glibc 2.28+ ARM64

torch_tensorrt-2.12.1-cp310-cp310-win_amd64.whl (2.0 MB)
Uploaded: CPython 3.10, Windows x86-64

torch_tensorrt-2.12.1-cp310-cp310-manylinux_2_28_x86_64.whl (3.9 MB)
Uploaded: CPython 3.10, manylinux: glibc 2.28+ x86-64

torch_tensorrt-2.12.1-cp310-cp310-manylinux_2_28_aarch64.whl (3.7 MB)
Uploaded: CPython 3.10, manylinux: glibc 2.28+ ARM64

File details

Details for the file torch_tensorrt-2.12.1-cp313-cp313-win_amd64.whl.

File metadata

File hashes

Hashes for torch_tensorrt-2.12.1-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 2d954029f87b39de7649ca04fc2c340745bb246e0d64f4534b5cac5ab756a34f
MD5 337ee374ed7f9ad33d8f4abb624bf9e0
BLAKE2b-256 e5cbf62b1874fd9873bc4a51c20125b0f3c8695c5ecd377589891e78f44633b0

See more details on using hashes here.

Hashes for torch_tensorrt-2.12.1-cp313-cp313-manylinux_2_28_x86_64.whl
SHA256 d5b7b7f80e74672f204bb678821da7dd9b4ae1a1ecd9ca7456e9d85c4e6dcdd5
MD5 260dd56f9f15c9ba524f12529f7ae7b4
BLAKE2b-256 ae9297cc6cb05988add8cbabfd9161102c520fe1caef5a042dab3ae789636506

Hashes for torch_tensorrt-2.12.1-cp313-cp313-manylinux_2_28_aarch64.whl
SHA256 1818a6037622fb2751fcfefb0f93d924490695fd39c965da10ceb084d3317d49
MD5 3f730418bf9ee8543b0a8d7a4de9c082
BLAKE2b-256 b090a64e7f4eab7a078a48cbf031592aa1fe34f457e12729e35f6ce0c477a063

Hashes for torch_tensorrt-2.12.1-cp312-cp312-win_amd64.whl
SHA256 39a8aeabe2b60fb05df10ce8a80e2647b95e9c82697fe1a8726020c8ab8cdfad
MD5 950d43c2340bc81d4ebec13e9d7a669b
BLAKE2b-256 7721675af639c74b1a07d49b4bb1ccd08408a653dc6900dce895244b5a08c6e8

Hashes for torch_tensorrt-2.12.1-cp312-cp312-manylinux_2_28_x86_64.whl
SHA256 4d522bfab143c2313d8e0c3178da09622511606a1f6d17a135417930b103a922
MD5 be360b6f6d60752d180f59618738af3e
BLAKE2b-256 30cf15cd2bb64bd6f7fb771b8b6eff07052008dd04453c3b0aef5fc15cb18774

Hashes for torch_tensorrt-2.12.1-cp312-cp312-manylinux_2_28_aarch64.whl
SHA256 02cab18e912bf0a9a92456a6f8764bf1e9441eddf95405e4240fbf400f425520
MD5 0c59721211c9a90a3305b212f61c8e75
BLAKE2b-256 7bea8701ef5a7f7099c7cde8a79a0d59150b5ca65c81e65ba6614741d8f494e0

Hashes for torch_tensorrt-2.12.1-cp311-cp311-win_amd64.whl
SHA256 f62d0d401d76c429ea071a5408e7a66f8e01350a89fe5a63e42060713d7645fb
MD5 6ce09a479f1fc2acbe48e10454150921
BLAKE2b-256 7205b335e71e4b9a2134c7d692cc6b7ed86e7a75c823b0817fbb685e522e7230

Hashes for torch_tensorrt-2.12.1-cp311-cp311-manylinux_2_28_x86_64.whl
SHA256 d4c36294a534ea2862aa00717bf5dd76b86cb1c62f6bb75dc2d7548d60074635
MD5 373fd42e33368a1dbd35afcfd36208b1
BLAKE2b-256 d52f21436f2a5fd982cccb1368eccf6b51be64bbabd8fc00b216ab00b4bd6964

Hashes for torch_tensorrt-2.12.1-cp311-cp311-manylinux_2_28_aarch64.whl
SHA256 02ff3e84618c53cc9d34116017600399e45adcd4da41bfed7596daa2c558716b
MD5 65c3a78a88f34358169e2608212f34f1
BLAKE2b-256 85cae373aeadbb31295e1cead3bda2a11e977a21bf41f284539c597731a36f4c

Hashes for torch_tensorrt-2.12.1-cp310-cp310-win_amd64.whl
SHA256 9cc143806febbf73aaf0eea033c690fbd18b350e8863a8eae9946905c62d614c
MD5 572acd337486a2373f8f84fe53d9ac0c
BLAKE2b-256 9eca17ade4f012ffc72147735656ae83b14cf9fbe5632f5f48e0e5e2d40cb15b

Hashes for torch_tensorrt-2.12.1-cp310-cp310-manylinux_2_28_x86_64.whl
SHA256 e8d1ef96b99ec34d5c743edef1ed811350bcbebcafdbb6edc008757ed272d23f
MD5 f86f66506ac0d964ba312bce71fded60
BLAKE2b-256 dd21c8c5454efd3717ed9914d51a2b297783662df326f9c054611711166c5120

Hashes for torch_tensorrt-2.12.1-cp310-cp310-manylinux_2_28_aarch64.whl
SHA256 7b53fef8bcb59d8ccfda52ffee9739f0fe03b9dfee615d43953187393c72a7e6
MD5 f8100a02d7a751216e8d6880de7de747
BLAKE2b-256 60653277daccfd5fbd63c6dfb6bf58fb98ee3bf975331f63813356dd3091bf35
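
The published SHA256 digests can be used to check a downloaded wheel before installing it. The sketch below hashes a file in chunks with the standard library's `hashlib` and compares the result with an expected hex digest. The helper names are illustrative, and the demonstration uses a small temporary file rather than a real wheel.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large wheels need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demonstration on a temporary file standing in for a downloaded wheel.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example")
ok = matches(tmp.name, hashlib.sha256(b"example").hexdigest())
bad = matches(tmp.name, "0" * 64)
os.unlink(tmp.name)
print(ok, bad)
```

For a real download, pass the wheel's path and the SHA256 value listed for that exact file name.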
