Torch-TensorRT is a package that lets users automatically compile PyTorch and TorchScript modules to TensorRT while remaining in PyTorch.

Torch-TensorRT

Easily achieve the best inference performance for any PyTorch model on the NVIDIA platform.


Torch-TensorRT brings the power of TensorRT to PyTorch. Reduce inference latency by up to 5x compared to eager execution with a single line of code.

Installation

Stable versions of Torch-TensorRT are published on PyPI:

pip install torch-tensorrt

Nightly versions of Torch-TensorRT are published on the PyTorch package index:

pip install --pre torch-tensorrt --index-url https://download.pytorch.org/whl/nightly/cu130

Torch-TensorRT is also distributed in the ready-to-run NVIDIA NGC PyTorch Container, which includes all dependencies at the proper versions along with example notebooks.

For more advanced installation methods, see the installation guide in the Torch-TensorRT documentation.

Quickstart

Option 1: torch.compile

You can use Torch-TensorRT anywhere you use torch.compile:

import torch
import torch_tensorrt

model = MyModel().eval().cuda() # define your model here
x = torch.randn((1, 3, 224, 224)).cuda() # define what the inputs to the model will look like

optimized_model = torch.compile(model, backend="tensorrt")
optimized_model(x) # compiled on first run

optimized_model(x) # this will be fast!
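To check the speedup on your own model, it helps to time the eager and compiled modules the same way. The following is a minimal sketch of a timing helper; `measure_latency` and its parameters are hypothetical names introduced here, not part of Torch-TensorRT. It assumes that warm-up calls are enough to absorb the one-time compilation cost of the first run.

```python
import statistics
import time


def measure_latency(fn, *args, warmup=3, iters=20, sync=None):
    """Return the median wall-clock latency of fn(*args) in milliseconds.

    sync is an optional zero-argument callable run before each clock read.
    For CUDA models, pass torch.cuda.synchronize so queued GPU work is
    finished before the timer stops.
    """
    for _ in range(warmup):  # warm-up calls absorb one-time compilation cost
        fn(*args)
    if sync is not None:
        sync()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn(*args)
        if sync is not None:
            sync()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)
```

With the quickstart variables, a comparison could look like `measure_latency(model, x, sync=torch.cuda.synchronize)` against `measure_latency(optimized_model, x, sync=torch.cuda.synchronize)`. The median is used rather than the mean so that an occasional slow iteration does not dominate the result.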

Option 2: Export

If you want to optimize your model ahead-of-time and/or deploy in a C++ environment, Torch-TensorRT provides an export-style workflow that serializes an optimized module. This module can be deployed in PyTorch or with libtorch (i.e. without a Python dependency).

Step 1: Optimize + serialize

import torch
import torch_tensorrt

model = MyModel().eval().cuda() # define your model here
inputs = [torch.randn((1, 3, 224, 224)).cuda()] # define a list of representative inputs here

trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
# An ExportedProgram can only be loaded from the Python runtime.
torch_tensorrt.save(trt_gm, "trt.ep", inputs=inputs)
# For C++ deployment, save a TorchScript file instead.
torch_tensorrt.save(trt_gm, "trt.ts", output_format="torchscript", inputs=inputs)

Step 2: Deploy

Deployment in PyTorch:

import torch
import torch_tensorrt

inputs = [torch.randn((1, 3, 224, 224)).cuda()] # your inputs go here

# You can run this in a new python session!
model = torch.export.load("trt.ep").module()
# model = torch_tensorrt.load("trt.ep").module() # this also works
model(*inputs)

Deployment in C++:

#include "torch/script.h"
#include "torch_tensorrt/torch_tensorrt.h"

auto trt_mod = torch::jit::load("trt.ts");
auto input_tensor = [...]; // fill this with your inputs
auto results = trt_mod.forward({input_tensor});

Further resources

Platform Support

Platform              Support
Linux AMD64 / GPU     Supported
Linux SBSA / GPU      Supported
Windows / GPU         Supported (Dynamo only)
Linux Jetson / GPU    Source Compilation Supported on JetPack-4.4+
Linux Jetson / DLA    Source Compilation Supported on JetPack-4.4+
Linux ppc64le / GPU   Not supported

Note: Refer to the NVIDIA L4T PyTorch NGC container for PyTorch libraries on JetPack.
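The platform table can be expressed as a small lookup for scripts that need to decide whether to attempt an install. This is only a sketch: `SUPPORT` and `support_status` are hypothetical names, and the architecture strings are assumptions about what `platform.machine()` reports (for example `x86_64` on Linux and `AMD64` on Windows). SBSA and Jetson both report `aarch64`, so the lookup cannot distinguish them.

```python
import platform

# Statuses transcribed from the platform support table, keyed by
# (operating system, CPU architecture). The architecture strings are
# assumptions about the values platform.machine() returns.
SUPPORT = {
    ("Linux", "x86_64"): "Supported",
    ("Linux", "aarch64"): "Supported on SBSA; source compilation on Jetson (JetPack-4.4+)",
    ("Windows", "AMD64"): "Supported (Dynamo only)",
    ("Linux", "ppc64le"): "Not supported",
}


def support_status(system=None, machine=None):
    """Look up the support status for a platform, defaulting to this machine."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return SUPPORT.get((system, machine), "Not listed")
```

Calling `support_status()` with no arguments checks the current machine; passing explicit values, such as `support_status("Linux", "ppc64le")`, checks a deployment target.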

Dependencies

The following dependency versions are used to verify the test cases. Torch-TensorRT can work with other versions, but the tests are not guaranteed to pass.

  • Bazel 8.1.1
  • Libtorch 2.12.0.dev (latest nightly)
  • CUDA 13.0 (CUDA 12.6 on Jetson)
  • TensorRT 10.15.1.29 (TensorRT 10.3 on Jetson)
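To compare an environment against the versions listed above, the installed Python packages can be queried with the standard library. This is a minimal sketch; `installed_versions` is a hypothetical helper, and the distribution names in `PACKAGES` are assumptions (TensorRT in particular may be installed under a different name depending on how it was obtained). Bazel and CUDA are not Python packages and are not covered here.

```python
from importlib import metadata

# Distribution names are assumptions; adjust them to match your environment.
PACKAGES = ("torch", "tensorrt", "torch-tensorrt")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

Printing the result of `installed_versions()` gives a quick report to include in bug reports or to check before running the test suite.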

Deprecation Policy

Deprecation is used to inform developers that some APIs and tools are no longer recommended for use. Beginning with version 2.3, Torch-TensorRT has the following deprecation policy:

  • Deprecation notices are communicated in the Release Notes.
  • Deprecated API functions will have a statement in the source documenting when they were deprecated.
  • Deprecated methods and classes will issue deprecation warnings at runtime if they are used.
  • Torch-TensorRT provides a 6-month migration period after the deprecation. APIs and tools continue to work during the migration period.
  • After the migration period ends, APIs and tools are removed in a manner consistent with semantic versioning.
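The runtime-warning part of this policy follows a common Python pattern. The sketch below shows one way such a warning can be issued with the standard `warnings` module. The `deprecated` decorator, `old_api`, and `new_api` are hypothetical names for illustration only and are not Torch-TensorRT's actual implementation.

```python
import functools
import warnings


def deprecated(since, removal, alternative=None):
    """Mark a function as deprecated (hypothetical helper, not the library's API)."""
    def decorator(fn):
        message = f"{fn.__name__} is deprecated since {since} and will be removed in {removal}."
        if alternative:
            message += f" Use {alternative} instead."

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller's line
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return fn(*args, **kwargs)

        return wrapper
    return decorator


@deprecated(since="2.3", removal="a later release", alternative="new_api")
def old_api(x):
    # The function keeps working during the migration period.
    return x * 2
```

Python may hide `DeprecationWarning` depending on where it is triggered, so running with `python -W default` or calling `warnings.simplefilter("default")` makes these notices visible while migrating.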

Contributing

Take a look at CONTRIBUTING.md.

License

The Torch-TensorRT license can be found in the LICENSE file. Torch-TensorRT is distributed under a BSD-style license.
