A flexible and efficient inference framework

TorchPipe

TorchPipe is an alternative to Triton Inference Server, offering similar functionality such as shared memory, Ensemble, and the Business Logic Scripting (BLS) mechanism.

For serving scenarios, TorchPipe is designed to support multi-instance deployment, pipeline parallelism, adaptive batching, GPU-accelerated operators, and reduced head-of-line (HOL) blocking. It acts as a bridge between lower-level acceleration libraries (e.g., TensorRT, OpenCV, CVCUDA) and RPC frameworks (e.g., Thrift). At its core, it is an engine that enables programmable scheduling.

News

  • [2026-01-23] 📦 Available on PyPI: pip install torchpipe
  • [2026-01-04] 🔧 We switched to tvm_ffi to provide clearer C++-Python interaction.

Usage

Some usage examples follow; for more, check out the examples.

Initialize and Prepare Pipeline

from torchpipe import pipe
import torch

from torchvision.models.resnet import resnet101

# Create a regular PyTorch model.
model = resnet101(pretrained=True).eval().cuda()

# Export it to ONNX with a dynamic batch dimension.
model_path = "./resnet101.onnx"
x = torch.ones((1, 3, 224, 224)).cuda()
torch.onnx.export(model, x, model_path, opset_version=17,
                  input_names=['input'], output_names=['output'],
                  dynamic_axes={'input': {0: 'batch_size'},
                                'output': {0: 'batch_size'}})

thread_safe_pipe = pipe({
    "preprocessor": {
        "backend": "S[DecodeTensor,ResizeTensor,CvtColorTensor,SyncTensor]",
        # "backend": "S[DecodeMat,ResizeMat,CvtColorMat,Mat2Tensor,SyncTensor]",
        'instance_num': 2,
        'color': 'rgb',
        'resize_h': '224',
        'resize_w': '224',
        'next': 'model',
    },
    "model": {
        "backend": "SyncTensor[TensorrtTensor]",
        "model": model_path,
        "model::cache": model_path.replace(".onnx", ".trt"),
        "max": '4',
        'batching_timeout': 4,  # ms, timeout for batching
        'instance_num': 2,
        'mean': "123.675, 116.28, 103.53",
        'std': "58.395, 57.120, 57.375",  # merged into trt
    }}
)
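
The "max" and 'batching_timeout' settings in the "model" node suggest a size-or-timeout batching policy. The sketch below is a conceptual illustration of that idea only, not TorchPipe's implementation: it assumes "max" is the largest batch size and that the timeout is counted from the first request in a batch. The function name collect_batch and the request strings are hypothetical.

```python
import queue
import time


def collect_batch(requests, max_batch=4, timeout_ms=4.0):
    """Gather up to max_batch items, waiting at most timeout_ms after the first one."""
    batch = [requests.get()]  # block until at least one request arrives
    deadline = time.monotonic() + timeout_ms / 1000.0
    while len(batch) < max_batch:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # timeout expired: run with a partial batch
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break  # nothing else arrived before the deadline
    return batch


# Hypothetical requests, for illustration only.
pending = queue.Queue()
for i in range(6):
    pending.put(f"request-{i}")

first = collect_batch(pending)   # four requests are already queued: full batch
second = collect_batch(pending)  # only two remain: returned once the timeout expires
print(len(first), len(second))   # prints: 4 2
```

Under this policy a larger timeout trades latency for fuller batches, while a timeout of 0 forwards whatever is already queued.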

Execute

We can execute the returned thread_safe_pipe just like the original PyTorch model, but in a thread-safe manner.

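# Read the encoded image bytes; the pipeline decodes them itself.
with open('/path/to/img.jpg', 'rb') as f:
    data = {'data': f.read()}
thread_safe_pipe(data)  # <-- this is thread-safe
result = data['result']  # the output is written back into the same dict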
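
Because the pipeline is thread-safe, one instance can be shared by several worker threads. The following is a minimal sketch of that pattern, assuming only the calling convention shown above (a dict with a 'data' key goes in, and 'result' is written back). The names run_all and fake_pipe are hypothetical; fake_pipe is a stand-in so the sketch runs without a GPU, and how TorchPipe reports failures is not covered here.

```python
from concurrent.futures import ThreadPoolExecutor


def run_all(pipeline, payloads, workers=4):
    """Call one shared pipeline from several threads; return each request's result."""
    def run_one(raw):
        request = {"data": raw}   # one dict per request, never shared between threads
        pipeline(request)         # the pipeline writes its output back into the dict
        return request.get("result")  # None if no result was written

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, payloads))  # results keep the input order


def fake_pipe(request):
    """Hypothetical stand-in for thread_safe_pipe, used only so the sketch runs."""
    request["result"] = len(request["data"])


print(run_all(fake_pipe, [b"a", b"bb", b"ccc"]))  # prints: [1, 2, 3]
```

Submitting requests concurrently is also what gives a batching backend several requests to group together.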

Installation

  • NGC Docker containers (recommended):

Tested on 25.05, 25.06, 24.05, and 23.05.

img_name=nvcr.io/nvidia/pytorch:25.05-py3

docker run --rm --gpus all -it --network host \
    -v $(pwd):/workspace/ --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
    -w /workspace/ \
    $img_name \
    bash

pip install torchpipe
python -c "import torchpipe"

The backends it introduces will be JIT-compiled and cached.

There is one core backend group (torchpipe_core) and three optional groups (torchpipe_opencv, torchpipe_nvjpeg, and torchpipe_tensorrt) with different dependencies. For details, see here.

Dependencies such as OpenCV and TensorRT can also be provided in the following ways:

  • providing environment variables:
    Users can specify paths via the following environment variables:
    OPENCV_INCLUDE, OPENCV_LIB, TENSORRT_INCLUDE, TENSORRT_LIB.
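
As a small sketch, the four variable names above can be checked from Python before the backends are first compiled. The helper unset_dependency_vars and the example paths are hypothetical, and an unset variable is not necessarily an error, since the optional backend groups may not be needed or the dependency may be provided another way.

```python
import os

DEPENDENCY_VARS = ("OPENCV_INCLUDE", "OPENCV_LIB", "TENSORRT_INCLUDE", "TENSORRT_LIB")


def unset_dependency_vars(env=None):
    """Return the dependency variables that are missing or empty in env."""
    env = os.environ if env is None else env
    return [name for name in DEPENDENCY_VARS if not env.get(name)]


# Hypothetical paths, for illustration only.
example_env = {
    "OPENCV_INCLUDE": "/opt/opencv/include",
    "OPENCV_LIB": "/opt/opencv/lib",
}
print(unset_dependency_vars(example_env))  # prints: ['TENSORRT_INCLUDE', 'TENSORRT_LIB']
```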

Other installation options

How does it work?

See Basic Usage.

How to add (or override) a backend

WIP
