
OpenVINO runtime for MLServer


Overview

This package provides an MLServer runtime compatible with OpenVINO. It offers the following features:

  1. If the server detects that the model file is in ONNX format, it automatically converts it to OpenVINO format (xml, bin) with a dynamic batch size (a manual equivalent is sketched after this list).
  2. OpenVINO dynamic batch size
  3. gRPC ready
  4. V2 Inference Protocol
  5. Model metrics
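
A minimal manual sketch of that conversion, assuming the openvino Python package (>= 2023.1) and an illustrative (28, 28, 1) input shape:

import openvino as ov

# Convert an ONNX model to OpenVINO IR; the runtime does this automatically
# when it finds model.onnx in the model folder.
model = ov.convert_model("model.onnx")

# Make the batch dimension dynamic (-1); the remaining dims are illustrative.
model.reshape([-1, 28, 28, 1])

# save_model writes model.xml plus the accompanying model.bin weights file.
ov.save_model(model, "model.xml")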

Why MLServer?

For serving OpenVINO I chose MLServer because this framework provides the V2 Inference Protocol (https://kserve.github.io/website/modelserving/inference_api/), gRPC, and metrics out of the box.

Install

pip install mlserver mlserver-openvino

Content Types

If no content type is present on the request or its metadata, the OpenVINO runtime will try to decode the payload as a NumPy array. To avoid this, either send a different content type explicitly, or define the correct one as part of your model's metadata.
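
For instance, a V2 REST request can declare the content type explicitly. A minimal sketch, assuming the mnist-onnx-openvino model from the example settings below is served locally on port 8080:

import numpy as np
import requests

x = np.random.rand(1, 28, 28, 1).astype("float32")

payload = {
    "inputs": [
        {
            "name": "input-0",
            "shape": list(x.shape),
            "datatype": "FP32",
            # Explicit content type, so the runtime does not have to guess
            "parameters": {"content_type": "np"},
            "data": x.flatten().tolist(),
        }
    ]
}

response = requests.post(
    "http://0.0.0.0:8080/v2/models/mnist-onnx-openvino/infer",
    json=payload,
)
print(response.json())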

Models repository

Add your models to the models folder. Accepted files: ["model.xml", "model.onnx"]

/example
/models/your-model-name/
/tests
setup.py
README.md

Training and serving example: https://mlserver.readthedocs.io/en/latest/examples/sklearn/README.html

Metrics

To scrape Prometheus metrics, use the endpoints below:

GET http://<your-endpoint>/metrics
GET http://0.0.0.0:8080/metrics

Start the Docker server

# Build docker image
mlserver build . -t test

# Start the server and pass MLSERVER_MODELS_DIR
docker run -it --rm -e MLSERVER_MODELS_DIR=/opt/mlserver/models/ -p 8080:8080 -p 8081:8081 test

Example queries:

For example scripts, see the files below:

/example/grpc-example.py
/example/rest-example.py
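
A minimal gRPC sketch in the same spirit, assuming MLServer's generated dataplane stubs and the mnist-onnx-openvino model served on the gRPC port 8081 (as mapped in the Docker command above):

import grpc
import numpy as np
from mlserver.grpc import dataplane_pb2 as pb
from mlserver.grpc import dataplane_pb2_grpc as pb_grpc

x = np.random.rand(1, 28, 28, 1).astype("float32")

request = pb.ModelInferRequest(
    model_name="mnist-onnx-openvino",
    inputs=[
        pb.ModelInferRequest.InferInputTensor(
            name="input-0",
            datatype="FP32",
            shape=list(x.shape),
            contents=pb.InferTensorContents(fp32_contents=x.flatten().tolist()),
        )
    ],
)

with grpc.insecure_channel("0.0.0.0:8081") as channel:
    stub = pb_grpc.GRPCInferenceServiceStub(channel)
    response = stub.ModelInfer(request)
    print(response.outputs)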

KServe usage

  1. First, create the KServe runtime once from the file: kserve/cluster-runtime.yaml
  2. Create an InferenceService from the template (applied with kubectl below):
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "my-openvino-model"
spec:
  predictor:
    model:
      modelFormat:
        name: openvino
      runtime: kserve-mlserver-openvino
      #storageUri: "gs://kfserving-examples/models/xgboost/iris"
      storageUri: https://github.com/myrepo/models/mymodel.joblib?raw=true
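
A sketch of applying both manifests, assuming the InferenceService template above is saved as inference-service.yaml (an illustrative file name):

# Create the runtime once, then the InferenceService
kubectl apply -f kserve/cluster-runtime.yaml
kubectl apply -f inference-service.yaml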

Example model-settings.json

{
    "name": "mnist-onnx-openvino",
    "implementation": "mlserver_openvino.OpenvinoRuntime",
    "parameters": {
        "uri": "./model.onnx",
        "version": "v0.1.0",
        "extra": {
            "transform": [
                {
                    "name": "Prepare Metadata",
                    "pipeline_file_path": "./pipeline.cloudpickle",
                    "input_index": 0
                }
            ]
        }
    },
    "inputs": [
        {
            "name": "input-0",
            "datatype": "FP32",
            "shape": [28,28,1]
        }
    ],
    "outputs": [
        {
            "name": "output",
            "datatype": "FP32",
            "shape": [10]
        }
    ]
}

Transformers

If you add a transformer pipeline in the extra properties, you should dump (pickle) it with the same Python version that runs MLServer.
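
For example, a hypothetical sketch of dumping a preprocessing callable with cloudpickle, matching the pipeline_file_path in the model-settings.json above (the normalize function is illustrative, not part of the package):

import cloudpickle
import numpy as np

def normalize(x: np.ndarray) -> np.ndarray:
    # Scale pixel values into [0, 1]
    return x.astype("float32") / 255.0

# Dump with the same Python version that runs MLServer
with open("pipeline.cloudpickle", "wb") as f:
    cloudpickle.dump(normalize, f)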

Tests

make test
