aiSSEMBLE™ Open Inference Protocol KServe

A KServe implementation of the Open Inference Protocol

The Open Inference Protocol (OIP) specification defines a standard protocol for performing machine learning model inference across serving runtimes for different ML frameworks. This Python library can be leveraged to deploy KServe model servers that are compatible with the Open Inference Protocol.

Installation

Add aissemble-open-inference-protocol-kserve to an application

pip install aissemble-open-inference-protocol-kserve

Usage

Prerequisite

Before standing up KServe with the aiSSEMBLE Open Inference Protocol, make sure the KServe infrastructure and environment are set up by following the official documentation. Once the KServe environment is ready, you can proceed with implementing a custom handler for KServe using the aiSSEMBLE Open Inference Protocol.

Implementing a Handler

To create a custom handler that integrates with KServe, create a class that extends ModelHandler and implement the methods your model requires, such as model_load and infer.

Example of Usage with a Handler

Create your custom handler class with:

from typing import Optional

from aissemble_open_inference_protocol_shared.handlers.model_handler import (
    ModelHandler,
)
from aissemble_open_inference_protocol_shared.types.dataplane import (
    InferenceRequest,
    InferenceResponse,
    ModelMetadataResponse,
    MetadataTensor,
    Datatype,
)

class MyHandler(ModelHandler):
    def __init__(self):
        super().__init__()

    def infer(
            self,
            payload: InferenceRequest,
            model_name: str,
            model_version: Optional[str] = None,
    ) -> InferenceResponse:
        # Stub: a real implementation would run the model on the payload
        # and populate outputs with the results
        return InferenceResponse(
            model_name=model_name, model_version=model_version, id="id", outputs=[]
        )

    def model_metadata(
            self,
            model_name: str,
            model_version: Optional[str] = None,
    ) -> ModelMetadataResponse:
        # Return a stub ModelMetadataResponse
        return ModelMetadataResponse(
            name=model_name,
            versions=[model_version] if model_version else None,
            platform="python",
            inputs=[MetadataTensor(name="input", datatype=Datatype.FP32, shape=[1])],
            outputs=[
                MetadataTensor(name="output", datatype=Datatype.FP32, shape=[1])
            ],
        )

    def model_load(self, model_name: str) -> bool:
        # Load the model into memory here; return True once it is ready to serve
        return True
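
To sanity-check the handler before wiring it into KServe, you can call it directly. A minimal sketch, assuming the shared library's InferenceRequest follows the OIP v2 schema and accepts an empty inputs list (that constructor call is an assumption, not taken from this project's docs):

# Hypothetical smoke test for MyHandler
handler = MyHandler()
# InferenceRequest(inputs=[]) is assumed valid per the OIP v2 schema
request = InferenceRequest(inputs=[])
response = handler.infer(payload=request, model_name="my_model")
print(response.model_name)  # "my_model"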

You can now use this handler to create the AissembleOIPKServe class to be loaded into the KServe inference server, as in the example below:

from aissemble_open_inference_protocol_kserve.aissemble_oip_kserve import (
    AissembleOIPKServe,
)

if __name__ == "__main__":
    model_name = "my_model"
    oip_kserve = AissembleOIPKServe(name=model_name, model_handler=MyHandler())
    # load() must be called before starting the server; it invokes the
    # handler's model_load() to ensure the model is loaded
    oip_kserve.load()
    oip_kserve.start_server()
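
Once the server is running, you can exercise it with any OIP-compliant client. A minimal sketch using the requests library, assuming the server is reachable on localhost at the default HTTP port 8080; the tensor name, shape, and data below are illustrative and should match what your model actually expects:

import requests

# OIP v2 REST inference endpoint exposed by the KServe model server
url = "http://localhost:8080/v2/models/my_model/infer"
payload = {
    "inputs": [
        # Illustrative single FP32 value; adjust to your model's signature
        {"name": "input", "shape": [1], "datatype": "FP32", "data": [1.0]}
    ]
}
response = requests.post(url, json=payload)
print(response.json())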

You are now ready to containerize the app and deploy it via the KServe Kubernetes resources.

Configurations

There are several configurations available that affect the server. These can be set via container arguments (passed through the InferenceService YAML in the args field), environment variables, or the Krausening properties file oip.properties.

| Configuration Name | Container Argument | Environment Variable | Default Value | Description |
| --- | --- | --- | --- | --- |
| kserve_http_port | --http_port | KSERVE_HTTP_PORT | 8080 | The HTTP port listened to by the model server |
| kserve_grpc_port | --grpc_port | KSERVE_GRPC_PORT | 8081 | The gRPC port listened to by the model server |
| kserve_workers | --workers | KSERVE_WORKERS | 1 | The number of uvicorn workers for multi-processing |
| kserve_max_threads | --max_threads | KSERVE_MAX_THREADS | 4 | The max number of gRPC processing threads |
| kserve_max_asyncio_workers | --max_asyncio_workers | KSERVE_MAX_ASYNCIO_WORKERS | None | The max number of asyncio workers to spawn |
| kserve_enable_grpc | --enable_grpc | KSERVE_ENABLE_GRPC | True | Enable gRPC for the model server |
| kserve_enable_docs_url | --enable_docs_url | KSERVE_ENABLE_DOCS_URL | False | Enable the '/docs' URL to display the Swagger UI |
| kserve_enable_latency_logging | --enable_latency_logging | KSERVE_ENABLE_LATENCY_LOGGING | True | Enable a log line per request with preprocess/predict/postprocess latency metrics |
| kserve_access_log_format | --access_log_format | KSERVE_ACCESS_LOG_FORMAT | None | The ASGI access logging format; allows overriding only the uvicorn.access format configuration with a richer set of fields |
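
For example, container arguments can be supplied through the InferenceService manifest. A minimal sketch of a custom predictor; the image name and port value here are illustrative:

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: my-model
spec:
  predictor:
    containers:
      - name: kserve-container
        image: my-registry/my-model:latest  # illustrative image
        args:
          - --http_port=9000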

Configuration Precedence

Configuration values are resolved in the following order of precedence (highest to lowest):

  1. Container arguments (e.g., --http_port=9000 passed via InferenceService YAML args field)
  2. Environment variables (e.g., KSERVE_HTTP_PORT=9000)
  3. Krausening properties (e.g., kserve_http_port=9000 in oip.properties)
  4. Default values (as shown in the table above)
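
For example, if oip.properties contains kserve_http_port=9000 but the container also sets the environment variable KSERVE_HTTP_PORT=9001, the server listens on port 9001; add --http_port=9002 to the container args and that value wins instead.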

Additional configuration options may be available via container arguments or environment variables. See the KServe documentation for more details.

Examples

For working examples, refer to the Examples documentation.

