Skip to main content

A flexible and efficient inference framework

Project description

TorchPipe

PyPI version License Documentation Benchmark

TorchPipe is an alternative choice for Triton Inference Server, mainly featuring similar functionalities such as Shared-momory, Ensemble, and BLS mechanism.

For serving scenarios, TorchPipe is designed to support multi-instance deployment, pipeline parallelism, adaptive batching, GPU-accelerated operators, and reduced head-of-line (HOL) blocking.It acts as a bridge between lower-level acceleration libraries (e.g., TensorRT, OpenCV, CVCUDA) and RPC frameworks (e.g., Thrift). At its core, it is an engine that enables programmable scheduling.

News

  • [2026-01-23] 📦 Available on PyPI: pip install torchpipe
  • [2026-01-04] 🔧 We switched to tvm_ffi to provide clearer C++-Python interaction.

Usage

Below are some usage examples, for more check out the examples.

Initialize and Prepare Pipeline

from torchpipe import pipe
import torch

from torchvision.models.resnet import resnet101

# create some regular pytorch model...
model = resnet101(pretrained=True).eval().cuda()

# create example model
model_path = f"./resnet101.onnx"
x = torch.ones((1, 3, 224, 224)).cuda()
torch.onnx.export(model, x, model_path, opset_version=17,
                    input_names=['input'], output_names=['output'], 
                    dynamic_axes={'input': {0: 'batch_size'},
                                'output': {0: 'batch_size'}})

thread_safe_pipe = pipe({
    "preprocessor": {
        "backend": "S[DecodeTensor,ResizeTensor,CvtColorTensor,SyncTensor]",
        # "backend": "S[DecodeMat,ResizeMat,CvtColorMat,Mat2Tensor,SyncTensor]",
        'instance_num': 2,
        'color': 'rgb',
        'resize_h': '224',
        'resize_w': '224',
        'next': 'model',
    },
    "model": {
        "backend": "SyncTensor[TensorrtTensor]",
        "model": model_path,
        "model::cache": model_path.replace(".onnx", ".trt"),
        "max": '4',
        'batching_timeout': 4,  # ms, timeout for batching
        'instance_num': 2,
        'mean': "123.675, 116.28, 103.53",
        'std': "58.395, 57.120, 57.375",  # merged into trt
    }}
)

Execute

We can execute the returned thread_safe_pipe just like the original PyTorch model, but in a thread-safe manner.

data = {'data': open('/path/to/img.jpg', 'rb').read()}
thread_safe_pipe(data) # <-- this is thread-safe
result = data['result']

Installation

  • NGC Docker containers (recommended):

test on 25.05, 25.06, 24.05, 23.05

img_name=nvcr.io/nvidia/pytorch:25.05-py3

docker run --rm --gpus all -it --network host \
    -v $(pwd):/workspace/ --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
    -w /workspace/ \
    $img_name \
    bash

pip install torchpipe
python -c "import torchpipe"

The backends it introduces will be JIT-compiled and cached.

There are one core backend group(torchpipe_core) and three optional groups (torchpipe_opencv, torchpipe_nvjpeg, and torchpipe_tensorrt) with different dependencies. For details, see here.

Dependencies such as OpenCV and TensorRT can also be provided in the following ways:

  • providing environment variables:
    Users can specify paths via the following environment variables:
    OPENCV_INCLUDE, OPENCV_LIB, TENSORRT_INCLUDE, TENSORRT_LIB.

Other installation options

How does it work?

See Basic Usage.

How to add (or override) a backend

WIP

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

omniback-0.2.2.tar.gz (1.2 MB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

omniback-0.2.2-py3-none-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (4.1 MB view details)

Uploaded Python 3manylinux: glibc 2.27+ x86-64manylinux: glibc 2.28+ x86-64

omniback-0.2.2-py3-none-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl (3.8 MB view details)

Uploaded Python 3manylinux: glibc 2.26+ ARM64manylinux: glibc 2.28+ ARM64

omniback-0.2.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (3.8 MB view details)

Uploaded Python 3manylinux: glibc 2.17+ x86-64

omniback-0.2.2-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl (3.6 MB view details)

Uploaded Python 3manylinux: glibc 2.17+ ARM64

File details

Details for the file omniback-0.2.2.tar.gz.

File metadata

  • Download URL: omniback-0.2.2.tar.gz
  • Upload date:
  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for omniback-0.2.2.tar.gz
Algorithm Hash digest
SHA256 1461a0b91fdb5c8f8869e4f966d33b7aaa6ea4a1c27edcdf94e88fcd2033b660
MD5 2f65d6e07bfd5f24a460d16c0d543ce1
BLAKE2b-256 e849b5eeb605671156ef69bfba4887195967dc4e584bc498b65eaced9d665ecc

See more details on using hashes here.

File details

Details for the file omniback-0.2.2-py3-none-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for omniback-0.2.2-py3-none-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 5d9ffb7dc7354dcaa25a0890d35c69f98f1e5e130fe148cd4e92b6bd09276b75
MD5 f88249a87d7f8f0bbccf695f9fe88b66
BLAKE2b-256 8628c6e2143c7a149752a0dde05f36513ff2f4b42c890ad46ec44ddb4c57e1ec

See more details on using hashes here.

File details

Details for the file omniback-0.2.2-py3-none-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for omniback-0.2.2-py3-none-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 c9a0aad67df0842f9bf980b3ba6124f884f21f7d73be3e368b8ea9bb4026a22f
MD5 4692e2e8e603162ef834c5268b38430d
BLAKE2b-256 db0c15bae3f0fbe5cf469e6759b5c7e7bb1067f3a08c74762aa1ef2e754e17fc

See more details on using hashes here.

File details

Details for the file omniback-0.2.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.

File metadata

File hashes

Hashes for omniback-0.2.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
Algorithm Hash digest
SHA256 45b8245496d702ca5a86604aeb04caa3c69ff445fdf78523156715fe06641a83
MD5 20da1de0bb8f85d7e798773617aad431
BLAKE2b-256 5c716746fefd414fccba2a39c443869ece8dce108de7f684830d9361fafd0193

See more details on using hashes here.

File details

Details for the file omniback-0.2.2-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl.

File metadata

File hashes

Hashes for omniback-0.2.2-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl
Algorithm Hash digest
SHA256 147967011da7c179df1af3c9bd34a386755402af85cc08550e7eb71c49599e90
MD5 2f2b5775c6d3b6921d9575d2490df2a2
BLAKE2b-256 bcdf8c37653b90a59361861cd2cc68e0a59153e02abc8cb1f03433b77f5de7c0

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page