A KServe implementation of the Open Inference Protocol

aiSSEMBLE Open Inference Protocol™ KServe

The Open Inference Protocol (OIP) specification defines a standard protocol for performing machine learning model inference across serving runtimes for different ML frameworks. This Python library can be leveraged to deploy KServe model servers that are compatible with the Open Inference Protocol.

Installation

Add aissemble-open-inference-protocol-kserve to an application

pip install aissemble-open-inference-protocol-kserve

Implementing a Handler

To create a custom handler that integrates with KServe, create a class that extends AissembleOIPKServe. Then implement methods for your model, such as load() for loading the model, and optionally preprocess() and/or postprocess() to transform input or output data between the client and the prediction model. The predict() method calls the infer() method of the DataplaneHandler, which you must implement for either REST or gRPC.

Example of Usage with a Handler

Create your custom handler class with:

from kserve import ModelServer
from aissemble_open_inference_protocol_kserve.aissemble_oip_kserve import (
    AissembleOIPKServe,
)

class KserveCustomHandler(AissembleOIPKServe):
    """
    Implements Custom predictor of AissembleOIPKServe for requesting model.
    handler refers to custom DataplaneHandler
    """
    def __init__(self, name: str, model_path: str, handler=None):
        super().__init__(name, handler)
        self.model = None
        self.name = name
        self.model_path = model_path
        self.handler = handler

    def preprocess(self, payload, headers=None):
        """preprocess() is an optional API in KServe; implement it as needed
        to transform the raw input into the format the model expects."""
        return payload

    def postprocess(self, result, headers=None):
        """postprocess() is an optional API in KServe; implement it as needed
        to transform the prediction output into the format the client expects."""
        return result

    def load(self):
        """Loading a model differs per use case, so implement load() accordingly.
        NOTE: setting self.ready to True marks the KServe model as ready to serve."""
        self.ready = True
        return self.ready

    def start(self):
        self.load()
        ModelServer().start([self])

if __name__ == "__main__":
    # DataplaneHandler is an abstract base class; extend it with your own
    # implementation based on your preferred API (REST or gRPC)
    model = KserveCustomHandler(
        name="sample_model",
        model_path="sample_model_path",
        handler=DataplaneHandler,
    )
    model.start()
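The handler passed to the constructor above must implement the DataplaneHandler contract. The sketch below illustrates the idea with a hypothetical stand-in for the package's DataplaneHandler ABC and a toy REST-style implementation; the class shape, method names, and echoed response are assumptions for illustration, not the library's actual API:

```python
from abc import ABC, abstractmethod

# Hypothetical stand-in for the package's DataplaneHandler ABC, shown here
# only so the sketch is self-contained; in real code, import it from the
# aissemble package. predict() on the model delegates to infer() here.
class DataplaneHandler(ABC):
    @abstractmethod
    def infer(self, payload: dict) -> dict:
        ...

# A toy REST-style handler. A real implementation would POST the payload to
# an OIP /v2/models/{name}/infer endpoint; this one echoes the inputs back
# so the sketch runs without a server.
class EchoRestHandler(DataplaneHandler):
    def __init__(self, base_url: str):
        self.base_url = base_url  # e.g. "http://localhost:8080"

    def infer(self, payload: dict) -> dict:
        # Real version (assumption): requests.post(
        #     f"{self.base_url}/v2/models/sample_model/infer", json=payload).json()
        return {"model_name": "sample_model", "outputs": payload.get("inputs", [])}

handler = EchoRestHandler("http://localhost:8080")
result = handler.infer({"inputs": [{"name": "x", "data": [1, 2, 3]}]})
print(result["outputs"][0]["data"])
```

A gRPC-flavored handler would follow the same pattern, with infer() issuing the equivalent gRPC call instead of a REST request.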

Once you have built your custom Python application image for KServe and the KServe setup is complete, you can run predictions through your preferred API (REST or gRPC).
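For REST, an inference request follows the Open Inference Protocol v2 body shape, POSTed to /v2/models/{model_name}/infer. The sketch below builds such a request body; the host, port, model name, and tensor values are placeholders:

```python
import json

# OIP v2 REST inference endpoint (host/port/model name are placeholders)
url = "http://localhost:8080/v2/models/sample_model/infer"

# Request body per the OIP v2 spec: a list of named, typed input tensors
request_body = {
    "inputs": [
        {
            "name": "input-0",
            "shape": [1, 3],
            "datatype": "FP32",
            "data": [0.1, 0.2, 0.3],
        }
    ]
}

# With a running server you would POST this, e.g.:
#   response = requests.post(url, json=request_body).json()
print(json.dumps(request_body, indent=2))
```

The response mirrors this structure with an "outputs" list of tensors, along with the model name and version.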

Examples

For working examples, refer to the Examples documentation.
