
A flexible and efficient inference framework


TorchPipe


TorchPipe is an alternative to Triton Inference Server, offering similar functionality such as its shared-memory, Ensemble, and BLS mechanisms.

For serving scenarios, TorchPipe is designed to support multi-instance deployment, pipeline parallelism, adaptive batching, GPU-accelerated operators, and reduced head-of-line (HOL) blocking. It acts as a bridge between lower-level acceleration libraries (e.g., TensorRT, OpenCV, CVCUDA) and RPC frameworks (e.g., Thrift). At its core, it is an engine that enables programmable scheduling.

News

  • [2026-01-23] 📦 Available on PyPI: pip install torchpipe
  • [2026-01-04] 🔧 We switched to tvm_ffi to provide clearer C++-Python interaction.

Usage

Below are some usage examples. For more, check out the examples.

Initialize and Prepare Pipeline

from torchpipe import pipe
import torch

from torchvision.models.resnet import resnet101

# create some regular pytorch model...
model = resnet101(pretrained=True).eval().cuda()

# export the model to ONNX with a dynamic batch dimension
model_path = "./resnet101.onnx"
x = torch.ones((1, 3, 224, 224)).cuda()
torch.onnx.export(model, x, model_path, opset_version=17,
                    input_names=['input'], output_names=['output'], 
                    dynamic_axes={'input': {0: 'batch_size'},
                                'output': {0: 'batch_size'}})

thread_safe_pipe = pipe({
    "preprocessor": {
        "backend": "S[DecodeTensor,ResizeTensor,CvtColorTensor,SyncTensor]",
        # "backend": "S[DecodeMat,ResizeMat,CvtColorMat,Mat2Tensor,SyncTensor]",
        'instance_num': 2,
        'color': 'rgb',
        'resize_h': '224',
        'resize_w': '224',
        'next': 'model',
    },
    "model": {
        "backend": "SyncTensor[TensorrtTensor]",
        "model": model_path,
        "model::cache": model_path.replace(".onnx", ".trt"),
        "max": '4',
        'batching_timeout': 4,  # ms, timeout for batching
        'instance_num': 2,
        'mean': "123.675, 116.28, 103.53",
        'std': "58.395, 57.120, 57.375",  # merged into trt
    }}
)
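The "max" and 'batching_timeout' settings above suggest a collect-until-full-or-timeout policy. The following is a minimal sketch of that general idea in plain Python, not TorchPipe's actual scheduler: the function name collect_batch and its parameters max_batch and timeout_ms are hypothetical, and the real engine may behave differently.

```python
import queue
import time


def collect_batch(requests, max_batch=4, timeout_ms=4.0):
    """Gather up to max_batch items, waiting at most timeout_ms after the first one."""
    batch = [requests.get()]  # block until at least one request is available
    deadline = time.monotonic() + timeout_ms / 1000.0
    while len(batch) < max_batch:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # timeout expired: proceed with a partial batch
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break  # nothing else arrived before the deadline
    return batch


# Usage: six queued requests are drained as one full batch and one partial batch.
pending = queue.Queue()
for request_id in range(6):
    pending.put(request_id)

first = collect_batch(pending, max_batch=4, timeout_ms=50.0)
second = collect_batch(pending, max_batch=4, timeout_ms=50.0)
print(len(first), len(second))
```

In a policy like this, a larger timeout tends to yield fuller batches at the cost of added latency for the first request in each batch.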

Execute

We can execute the returned thread_safe_pipe just like the original PyTorch model, but in a thread-safe manner.

data = {'data': open('/path/to/img.jpg', 'rb').read()}
thread_safe_pipe(data) # <-- this is thread-safe
result = data['result']
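Because the call is thread-safe, several requests can be submitted from a thread pool. The sketch below assumes only what the example above shows: the callable takes a dict with a 'data' entry and writes 'result' into the same dict. The helper run_concurrently and the stand-in fake_pipe are hypothetical names used so the snippet runs without a GPU; in practice thread_safe_pipe would be passed instead.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(pipe_fn, payloads, workers=8):
    """Call pipe_fn once per payload from a thread pool; return results in input order."""
    def call(raw):
        data = {"data": raw}
        pipe_fn(data)  # expected to fill data['result'] in place
        return data.get("result")  # None if no 'result' key was written

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(call, payloads))


def fake_pipe(data):
    # Hypothetical stand-in for thread_safe_pipe: stores the payload length.
    data["result"] = len(data["data"])


print(run_concurrently(fake_pipe, [b"a", b"bb", b"ccc"]))  # prints [1, 2, 3]
```

Each request uses its own dict, so no state is shared between threads on the caller's side.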

Installation

  • NGC Docker containers (recommended):

Tested on 25.05, 25.06, 24.05, and 23.05.

img_name=nvcr.io/nvidia/pytorch:25.05-py3

docker run --rm --gpus all -it --network host \
    -v $(pwd):/workspace/ --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
    -w /workspace/ \
    $img_name \
    bash

pip install torchpipe
python -c "import torchpipe"

The backends it introduces will be JIT-compiled and cached.

There is one core backend group (torchpipe_core) and three optional groups (torchpipe_opencv, torchpipe_nvjpeg, and torchpipe_tensorrt) with different dependencies. For details, see here.

Dependencies such as OpenCV and TensorRT can also be provided in the following ways:

  • providing environment variables:
    Users can specify paths via the following environment variables:
    OPENCV_INCLUDE, OPENCV_LIB, TENSORRT_INCLUDE, TENSORRT_LIB.
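As an illustration, these variables could be set from Python before the package is imported. This is a sketch under assumptions: the helper set_dependency_paths is hypothetical, the paths are placeholders to replace with your own, and setting the variables before the first import is assumed to be sufficient rather than confirmed.

```python
import os

# Variable names as listed above.
DEPENDENCY_VARS = ("OPENCV_INCLUDE", "OPENCV_LIB", "TENSORRT_INCLUDE", "TENSORRT_LIB")


def set_dependency_paths(paths, environ=os.environ):
    """Copy the given paths into the environment, rejecting unknown variable names."""
    unknown = set(paths) - set(DEPENDENCY_VARS)
    if unknown:
        raise KeyError(f"unrecognized variables: {sorted(unknown)}")
    for name, value in paths.items():
        environ[name] = str(value)
    # Report the current value of every supported variable (None if unset).
    return {name: environ.get(name) for name in DEPENDENCY_VARS}


# Placeholder locations; substitute the paths on your own system.
set_dependency_paths({
    "OPENCV_INCLUDE": "/usr/local/include/opencv4",
    "OPENCV_LIB": "/usr/local/lib",
})
# import torchpipe  # assumed: import only after the variables are set
```

Exporting the same variables in the shell before starting Python has the same effect.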

Other installation options

How does it work?

See Basic Usage.

How to add (or override) a backend

WIP
