Skip to main content

A flexible and efficient inference framework

Project description

TorchPipe

PyPI version License Documentation Benchmark

TorchPipe is an alternative choice for Triton Inference Server, mainly featuring similar functionalities such as Shared-momory, Ensemble, and BLS mechanism.

For serving scenarios, TorchPipe is designed to support multi-instance deployment, pipeline parallelism, adaptive batching, GPU-accelerated operators, and reduced head-of-line (HOL) blocking.It acts as a bridge between lower-level acceleration libraries (e.g., TensorRT, OpenCV, CVCUDA) and RPC frameworks (e.g., Thrift). At its core, it is an engine that enables programmable scheduling.

News

  • [2026-01-23] 📦 Available on PyPI: pip install torchpipe
  • [2026-01-04] 🔧 We switched to tvm_ffi to provide clearer C++-Python interaction.

Usage

Below are some usage examples, for more check out the examples.

Initialize and Prepare Pipeline

from torchpipe import pipe
import torch

from torchvision.models.resnet import resnet101

# create some regular pytorch model...
model = resnet101(pretrained=True).eval().cuda()

# create example model
model_path = f"./resnet101.onnx"
x = torch.ones((1, 3, 224, 224)).cuda()
torch.onnx.export(model, x, model_path, opset_version=17,
                    input_names=['input'], output_names=['output'], 
                    dynamic_axes={'input': {0: 'batch_size'},
                                'output': {0: 'batch_size'}})

thread_safe_pipe = pipe({
    "preprocessor": {
        "backend": "S[DecodeTensor,ResizeTensor,CvtColorTensor,SyncTensor]",
        # "backend": "S[DecodeMat,ResizeMat,CvtColorMat,Mat2Tensor,SyncTensor]",
        'instance_num': 2,
        'color': 'rgb',
        'resize_h': '224',
        'resize_w': '224',
        'next': 'model',
    },
    "model": {
        "backend": "SyncTensor[TensorrtTensor]",
        "model": model_path,
        "model::cache": model_path.replace(".onnx", ".trt"),
        "max": '4',
        'batching_timeout': 4,  # ms, timeout for batching
        'instance_num': 2,
        'mean': "123.675, 116.28, 103.53",
        'std': "58.395, 57.120, 57.375",  # merged into trt
    }}
)

Execute

We can execute the returned thread_safe_pipe just like the original PyTorch model, but in a thread-safe manner.

data = {'data': open('/path/to/img.jpg', 'rb').read()}
thread_safe_pipe(data) # <-- this is thread-safe
result = data['result']

Installation

  • NGC Docker containers (recommended):

test on 25.05, 25.06, 24.05, 23.05

img_name=nvcr.io/nvidia/pytorch:25.05-py3

docker run --rm --gpus all -it --network host \
    -v $(pwd):/workspace/ --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
    -w /workspace/ \
    $img_name \
    bash

pip install torchpipe
python -c "import torchpipe"

The backends it introduces will be JIT-compiled and cached.

There are one core backend group(torchpipe_core) and three optional groups (torchpipe_opencv, torchpipe_nvjpeg, and torchpipe_tensorrt) with different dependencies. For details, see here.

Dependencies such as OpenCV and TensorRT can also be provided in the following ways:

  • providing environment variables:
    Users can specify paths via the following environment variables:
    OPENCV_INCLUDE, OPENCV_LIB, TENSORRT_INCLUDE, TENSORRT_LIB.

Other installation options

How does it work?

See Basic Usage.

How to add (or override) a backend

WIP

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

omniback-0.2.3.tar.gz (1.2 MB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

omniback-0.2.3-py3-none-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (4.1 MB view details)

Uploaded Python 3manylinux: glibc 2.27+ x86-64manylinux: glibc 2.28+ x86-64

omniback-0.2.3-py3-none-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl (3.8 MB view details)

Uploaded Python 3manylinux: glibc 2.26+ ARM64manylinux: glibc 2.28+ ARM64

omniback-0.2.3-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (3.8 MB view details)

Uploaded Python 3manylinux: glibc 2.17+ x86-64

omniback-0.2.3-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl (3.6 MB view details)

Uploaded Python 3manylinux: glibc 2.17+ ARM64

File details

Details for the file omniback-0.2.3.tar.gz.

File metadata

  • Download URL: omniback-0.2.3.tar.gz
  • Upload date:
  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for omniback-0.2.3.tar.gz
Algorithm Hash digest
SHA256 8b56f81f308c760a15a2931cf740b9b0a526aa80adcee0fa9ac96a7cb6e9c33c
MD5 40b73e51efcf8b3ca58f1fbc1da05c96
BLAKE2b-256 b54f5f18ca34f3b8656c163cc7dd39c3c6c939dc0591b2d2699383dc3b36c693

See more details on using hashes here.

File details

Details for the file omniback-0.2.3-py3-none-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for omniback-0.2.3-py3-none-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 446edbebefe84c8927fbea707086e91529fe538723e2f653658ad6342bef097c
MD5 b9abc96911805164e054dbb022b1a111
BLAKE2b-256 62f756c11258c6f4c47a14977ec84ac0673c10d6d5f8559d80b4cf8df223f1d4

See more details on using hashes here.

File details

Details for the file omniback-0.2.3-py3-none-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for omniback-0.2.3-py3-none-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 de3771a4f1346fba2502ed01250d03cf4f14fe1102a3a953aad899cfaad0f7d5
MD5 5445559aa62589acb9c543246fed81c9
BLAKE2b-256 3e049de00825a6b879f71f4f6ac045162471dee29f7d34c56f7938db587f7bb6

See more details on using hashes here.

File details

Details for the file omniback-0.2.3-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.

File metadata

File hashes

Hashes for omniback-0.2.3-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
Algorithm Hash digest
SHA256 e2ca728bb2d2cb5dd59df29e4d5ac6a23fcb297a83c8e7b17948baf3d38b9447
MD5 46e928bd5807e13339450a1f2b521896
BLAKE2b-256 4c2a2eeec91ab8e1ad45b008a047cabeb38f01b21b1d59ee32ec3b8c5e513bf3

See more details on using hashes here.

File details

Details for the file omniback-0.2.3-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl.

File metadata

File hashes

Hashes for omniback-0.2.3-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl
Algorithm Hash digest
SHA256 12234ff61b589b441f1c72edf7f37810dcd1fa87259c1e6de26d47f504804d95
MD5 a839560e3c673f544e613394e4102630
BLAKE2b-256 f95c910a4f0f26be32d905aeb7342002c5a804cedc086aeee8d8b0da1ebf5209

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page