Skip to main content

A flexible and efficient inference framework

Project description

TorchPipe

PyPI version License Documentation Benchmark

TorchPipe is an alternative choice for Triton Inference Server, mainly featuring similar functionalities such as Shared-momory, Ensemble, and BLS mechanism.

For serving scenarios, TorchPipe is designed to support multi-instance deployment, pipeline parallelism, adaptive batching, GPU-accelerated operators, and reduced head-of-line (HOL) blocking.It acts as a bridge between lower-level acceleration libraries (e.g., TensorRT, OpenCV, CVCUDA) and RPC frameworks (e.g., Thrift). At its core, it is an engine that enables programmable scheduling.

News

  • [2026-01-23] 📦 Available on PyPI: pip install torchpipe
  • [2026-01-04] 🔧 We switched to tvm_ffi to provide clearer C++-Python interaction.

Usage

Below are some usage examples, for more check out the examples.

Initialize and Prepare Pipeline

from torchpipe import pipe
import torch

from torchvision.models.resnet import resnet101

# create some regular pytorch model...
model = resnet101(pretrained=True).eval().cuda()

# create example model
model_path = f"./resnet101.onnx"
x = torch.ones((1, 3, 224, 224)).cuda()
torch.onnx.export(model, x, model_path, opset_version=17,
                    input_names=['input'], output_names=['output'], 
                    dynamic_axes={'input': {0: 'batch_size'},
                                'output': {0: 'batch_size'}})

thread_safe_pipe = pipe({
    "preprocessor": {
        "backend": "S[DecodeTensor,ResizeTensor,CvtColorTensor,SyncTensor]",
        # "backend": "S[DecodeMat,ResizeMat,CvtColorMat,Mat2Tensor,SyncTensor]",
        'instance_num': 2,
        'color': 'rgb',
        'resize_h': '224',
        'resize_w': '224',
        'next': 'model',
    },
    "model": {
        "backend": "SyncTensor[TensorrtTensor]",
        "model": model_path,
        "model::cache": model_path.replace(".onnx", ".trt"),
        "max": '4',
        'batching_timeout': 4,  # ms, timeout for batching
        'instance_num': 2,
        'mean': "123.675, 116.28, 103.53",
        'std': "58.395, 57.120, 57.375",  # merged into trt
    }}
)

Execute

We can execute the returned thread_safe_pipe just like the original PyTorch model, but in a thread-safe manner.

data = {'data': open('/path/to/img.jpg', 'rb').read()}
thread_safe_pipe(data) # <-- this is thread-safe
result = data['result']

Installation

  • NGC Docker containers (recommended):

test on 25.05, 25.06, 24.05, 23.05

img_name=nvcr.io/nvidia/pytorch:25.05-py3

docker run --rm --gpus all -it --network host \
    -v $(pwd):/workspace/ --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
    -w /workspace/ \
    $img_name \
    bash

pip install torchpipe
python -c "import torchpipe"

The backends it introduces will be JIT-compiled and cached.

There are one core backend group(torchpipe_core) and three optional groups (torchpipe_opencv, torchpipe_nvjpeg, and torchpipe_tensorrt) with different dependencies. For details, see here.

Dependencies such as OpenCV and TensorRT can also be provided in the following ways:

  • providing environment variables:
    Users can specify paths via the following environment variables:
    OPENCV_INCLUDE, OPENCV_LIB, TENSORRT_INCLUDE, TENSORRT_LIB.

Other installation options

How does it work?

See Basic Usage.

How to add (or override) a backend

WIP

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

omniback-0.1.25.tar.gz (1.2 MB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

omniback-0.1.25-py3-none-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (4.1 MB view details)

Uploaded Python 3manylinux: glibc 2.27+ x86-64manylinux: glibc 2.28+ x86-64

omniback-0.1.25-py3-none-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl (3.8 MB view details)

Uploaded Python 3manylinux: glibc 2.26+ ARM64manylinux: glibc 2.28+ ARM64

omniback-0.1.25-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (3.8 MB view details)

Uploaded Python 3manylinux: glibc 2.17+ x86-64

omniback-0.1.25-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl (3.6 MB view details)

Uploaded Python 3manylinux: glibc 2.17+ ARM64

File details

Details for the file omniback-0.1.25.tar.gz.

File metadata

  • Download URL: omniback-0.1.25.tar.gz
  • Upload date:
  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for omniback-0.1.25.tar.gz
Algorithm Hash digest
SHA256 9a1695137946da27a69115111c6082f672ac0d2b96e57bb42b068d9f8bb89b84
MD5 ec35d9253dfa1c4a183fa397311d0197
BLAKE2b-256 1249178e844f40d3487916e3eafe54c60f25cd90d68aa72dd96387fd15d58a67

See more details on using hashes here.

File details

Details for the file omniback-0.1.25-py3-none-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for omniback-0.1.25-py3-none-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 406c8e489edbd1fbc3e00dd4ecf442ca73454f0f3e0a1abd585552172bf7a0e3
MD5 19fb3e748106fba3300bc1dcc76db895
BLAKE2b-256 04ae34d76f943460d6bc779985fb6478eeb473b924e7af90007b904b95f82034

See more details on using hashes here.

File details

Details for the file omniback-0.1.25-py3-none-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for omniback-0.1.25-py3-none-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 ff29765dc9517a89ea3c56eb43b45b808209b9437e4301971664bd96e62490db
MD5 f5565b7483ccfe9ccf09e193082fb7d9
BLAKE2b-256 381be9327ae53c20615fc073986e24102c2e0339e793e72b9b7bd8caa46191a3

See more details on using hashes here.

File details

Details for the file omniback-0.1.25-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.

File metadata

File hashes

Hashes for omniback-0.1.25-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
Algorithm Hash digest
SHA256 019cc7765ff91ff420c0abd0343692e6151f105d7a64b186c78fece637122a7a
MD5 61815ca9f927d2a2274a22da26296e31
BLAKE2b-256 ac8a2211096245ebcdde121342628e575745e1cb6ee8758837fe4178be8f40dc

See more details on using hashes here.

File details

Details for the file omniback-0.1.25-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl.

File metadata

File hashes

Hashes for omniback-0.1.25-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl
Algorithm Hash digest
SHA256 374bc4f162d7ff3078eb95ff0177a9517159f9fee6e21e28b86078ac5c42a091
MD5 26d9e95db0ab8cdd8a7aabc9968a8ae6
BLAKE2b-256 3df9a8795f8ef3155a91cae8500a8e9995a725520603c7a129c62abe9ef46876

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page