
Torchpipe

torchpipe is an alternative to Triton Inference Server, featuring similar functionality such as shared memory, Ensemble, and the BLS mechanism.

For serving scenarios, TorchPipe is designed to support multi-instance deployment, pipeline parallelism, adaptive batching, GPU-accelerated operators, and reduced head-of-line (HOL) blocking. It acts as a bridge between lower-level acceleration libraries (e.g., TensorRT, OpenCV, CVCUDA) and RPC frameworks (e.g., Thrift). At its core, it is an engine that enables programmable scheduling.

Updates

  • [20260104] We switched to tvm_ffi for clearer C++–Python interaction.

Usage

Below are some usage examples; for more, check out the examples.

Initialize and Prepare Pipeline

from torchpipe import pipe
import torch

from torchvision.models.resnet import resnet101

# create a regular PyTorch model...
model = resnet101(pretrained=True).eval().cuda()

# export it to ONNX with a dynamic batch dimension
model_path = "./resnet101.onnx"
x = torch.ones((1, 3, 224, 224)).cuda()
torch.onnx.export(model, x, model_path, opset_version=17,
                  input_names=['input'], output_names=['output'],
                  dynamic_axes={'input': {0: 'batch_size'},
                                'output': {0: 'batch_size'}})

thread_safe_pipe = pipe({
    "preprocessor": {
        "backend": "S[DecodeTensor,ResizeTensor,CvtColorTensor,SyncTensor]",
        # "backend": "S[DecodeMat,ResizeMat,CvtColorMat,Mat2Tensor,SyncTensor]",
        "instance_num": 2,
        "color": "rgb",
        "resize_h": "224",
        "resize_w": "224",
        "next": "model",
    },
    "model": {
        "backend": "SyncTensor[TensorrtTensor]",
        "model": model_path,
        "model::cache": model_path.replace(".onnx", ".trt"),
        "max": "4",              # max batch size
        "batching_timeout": 4,   # ms, how long to wait when assembling a batch
        "instance_num": 2,
        "mean": "123.675, 116.28, 103.53",
        "std": "58.395, 57.120, 57.375",  # merged into the TRT engine
    },
})

Execute

We can execute the returned thread_safe_pipe just like the original PyTorch model, but in a thread-safe manner.

with open('/path/to/img.jpg', 'rb') as f:
    data = {'data': f.read()}
thread_safe_pipe(data)  # <-- this call is thread-safe
result = data['result']
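Because the pipe is thread-safe, it can be called from multiple threads without external locking; each call fills in its own dict. A minimal sketch of that pattern, using a hypothetical `fake_pipe` stand-in (so the snippet is self-contained without a GPU) in place of the real `thread_safe_pipe`:

```python
import threading

def fake_pipe(data):
    # Stand-in for thread_safe_pipe: reads the input bytes from
    # data['data'] and writes a result back under data['result'].
    data['result'] = len(data['data'])

def worker(payload, out, idx):
    data = {'data': payload}
    fake_pipe(data)            # with torchpipe: thread_safe_pipe(data)
    out[idx] = data['result']  # each thread reads its own dict

payloads = [b'jpeg-bytes-0', b'jpeg-bytes-1', b'jpeg-bytes-2']
results = [None] * len(payloads)
threads = [threading.Thread(target=worker, args=(p, results, i))
           for i, p in enumerate(payloads)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the real pipe, concurrent calls like these are what the adaptive batcher coalesces into larger TensorRT batches.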

Installation

  • NGC Docker containers (recommended):

Tested on 25.05, 25.06, 24.05, 23.05, and 22.12.

img_name=nvcr.io/nvidia/pytorch:25.05-py3

docker run --rm --gpus all -it --network host \
    -v $(pwd):/workspace/ --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
    -w /workspace/ \
    $img_name \
    bash

pip install torchpipe
python -c "import torchpipe"

The backends it introduces will be JIT-compiled and cached.

Or you can try:

pip install "torch>=2.3" torchpipe

python -c "import torchpipe"

There is one core backend group (torchpipe_core) and three optional groups (torchpipe_opencv, torchpipe_nvjpeg, and torchpipe_tensorrt), each with different dependencies. For details, see here.

Dependencies such as OpenCV and TensorRT can be provided in the following ways:

  • Environment variables:
    Users can specify paths via the following environment variables:
    OPENCV_INCLUDE, OPENCV_LIB, TENSORRT_INCLUDE, TENSORRT_LIB.
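For example, pointing the build at locally installed OpenCV and TensorRT might look like the following before running `pip install torchpipe`. All paths here are illustrative placeholders, not defaults the package knows about:

```shell
# Hypothetical install locations; substitute the paths on your machine.
export OPENCV_INCLUDE=/usr/local/include/opencv4
export OPENCV_LIB=/usr/local/lib
export TENSORRT_INCLUDE=/usr/local/TensorRT/include
export TENSORRT_LIB=/usr/local/TensorRT/lib
```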

Other installation options

How does it work?

See Basic Usage.

How to add (or override) a backend

WIP
