KServe Python SDK
Python SDK for KServe Server and Client.
Installation
The KServe Python SDK can be installed with pip or Poetry.
pip install
pip install kserve
To install KServe with storage support
pip install kserve[storage]
Poetry
Install via Poetry.
make dev_install
To install KServe with storage support
poetry install -E storage
or
poetry install --extras "storage"
KServe Python Server
KServe's Python server library provides a standardized base that model serving frameworks such as Scikit Learn, XGBoost and PyTorch extend. It encapsulates data plane API definitions and storage retrieval for models.
It provides, among other things:
- Registering a model and starting the server
- Prediction Handler
- Pre/Post Processing Handler
- Liveness Handler
- Readiness Handler
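To make the handler chain above concrete, here is a minimal plain-Python sketch of how a request might flow through pre-processing, prediction and post-processing once a model reports ready. It does not import the SDK: `DoublingModel` and `handle_predict` are hypothetical names, and the `instances` / `predictions` payload keys are an assumption made for illustration. In the real SDK these hooks would live on a subclass of the SDK's own model class and be driven by its server.

```python
from typing import Any, Dict, List


class DoublingModel:
    """Hypothetical stand-in for a served model; not the KServe base class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.ready = False  # a readiness handler would report this flag

    def load(self) -> None:
        # A real model would fetch and deserialize its artifacts here.
        self.ready = True

    def preprocess(self, payload: Dict[str, Any]) -> List[float]:
        # Pull the raw inputs out of the request body.
        return payload["instances"]

    def predict(self, instances: List[float]) -> List[float]:
        # Toy "inference": double every input value.
        return [x * 2 for x in instances]

    def postprocess(self, outputs: List[float]) -> Dict[str, Any]:
        # Wrap the raw outputs in a response body.
        return {"predictions": outputs}


def handle_predict(model: DoublingModel, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Run the preprocess -> predict -> postprocess chain for one request."""
    if not model.ready:
        raise RuntimeError(f"model {model.name} is not ready")
    return model.postprocess(model.predict(model.preprocess(payload)))


model = DoublingModel("demo")
model.load()
response = handle_predict(model, {"instances": [1, 2, 3]})
```

Keeping each step in its own method is what allows a server to time and report the steps separately.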
It supports the following storage providers:
- Google Cloud Storage with a prefix: "gs://"
  - By default, it uses the GOOGLE_APPLICATION_CREDENTIALS environment variable for user authentication.
  - If GOOGLE_APPLICATION_CREDENTIALS is not provided, an anonymous client is used to download the artifacts.
- S3 Compatible Object Storage with a prefix "s3://"
  - By default, it uses the S3_ENDPOINT, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY environment variables for user authentication.
- Azure Blob Storage with the format: "https://{$STORAGE_ACCOUNT_NAME}.blob.core.windows.net/{$CONTAINER}/{$PATH}"
  - By default, it uses an anonymous client to download the artifacts.
  - For example: https://kfserving.blob.core.windows.net/triton/simple_string/
- Local filesystem either without any prefix or with a prefix "file://". For example:
  - Absolute path: /absolute/path or file:///absolute/path
  - Relative path: relative/path or file://relative/path
  - For the local filesystem, we recommend using a relative path without any prefix.
- Persistent Volume Claim (PVC) with the format "pvc://{$pvcname}/[path]".
  - The pvcname is the name of the PVC that contains the model.
  - The [path] is the relative path to the model on the PVC.
  - For example: pvc://mypvcname/model/path/on/pvc
- Generic URI, over either HTTP, prefixed with http://, or HTTPS, prefixed with https://. For example:
  - https://<some_url>.com/model.joblib
  - http://<some_url>.com/model.joblib
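The prefix rules above can be summarized as a small dispatch function. The following is an illustrative sketch only, not the SDK's storage implementation: `classify_storage_uri`, `split_pvc_uri` and the returned labels are hypothetical names. It shows one detail worth noting: an Azure Blob URI is also a valid HTTPS URL, so the Azure host check has to come before the generic HTTP(S) case.

```python
import re
from typing import Tuple
from urllib.parse import urlparse

# Host pattern taken from the Azure Blob format listed above.
_AZURE_BLOB_HOST = re.compile(r"^[^.]+\.blob\.core\.windows\.net$")


def classify_storage_uri(uri: str) -> str:
    """Return a label for the storage provider implied by the URI prefix."""
    if uri.startswith("gs://"):
        return "gcs"
    if uri.startswith("s3://"):
        return "s3"
    if uri.startswith("pvc://"):
        return "pvc"
    if uri.startswith("file://"):
        return "local"
    parsed = urlparse(uri)
    if parsed.scheme in ("http", "https"):
        # Check the Azure host first; otherwise it would be treated
        # as a generic HTTPS download.
        if parsed.scheme == "https" and _AZURE_BLOB_HOST.match(parsed.hostname or ""):
            return "azure_blob"
        return "http"
    # No recognized prefix: treat the value as a local filesystem path.
    return "local"


def split_pvc_uri(uri: str) -> Tuple[str, str]:
    """Split pvc://{pvcname}/[path] into (pvcname, path)."""
    name, _, path = uri[len("pvc://"):].partition("/")
    return name, path


kind = classify_storage_uri("https://kfserving.blob.core.windows.net/triton/simple_string/")
pvc_name, pvc_path = split_pvc_uri("pvc://mypvcname/model/path/on/pvc")
```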
Metrics
For latency metrics, send a request to /metrics. Prometheus latency histograms are emitted for each of the steps (pre/postprocessing, explain, predict).
Additionally, the latencies of each step are logged per request.
Metric Name | Description | Type |
---|---|---|
request_preprocess_seconds | pre-processing request latency | Histogram |
request_explain_seconds | explain request latency | Histogram |
request_predict_seconds | prediction request latency | Histogram |
request_postprocess_seconds | post-processing request latency | Histogram |
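As a rough illustration of how these histograms can be consumed, the sketch below computes a mean latency from the `_sum` and `_count` series that Prometheus histograms expose in the text format. The sample text, the `model_name` label and the helper name `mean_latency` are assumptions made for this example; the labels your server actually emits may differ.

```python
import re
from typing import Optional

# Hypothetical excerpt of a /metrics response in Prometheus text format.
SAMPLE = """\
# TYPE request_predict_seconds histogram
request_predict_seconds_bucket{model_name="demo",le="0.1"} 3.0
request_predict_seconds_bucket{model_name="demo",le="+Inf"} 4.0
request_predict_seconds_count{model_name="demo"} 4.0
request_predict_seconds_sum{model_name="demo"} 0.5
"""

# Metric name, optional {labels}, then the sample value.
_LINE = re.compile(
    r"^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(?P<value>\S+)"
)


def mean_latency(metrics_text: str, metric: str) -> Optional[float]:
    """Return sum/count for a histogram, aggregated over all label sets."""
    total = 0.0
    count = 0.0
    for line in metrics_text.splitlines():
        if line.startswith("#"):
            continue  # skip HELP/TYPE comment lines
        match = _LINE.match(line)
        if not match:
            continue
        if match.group("name") == metric + "_sum":
            total += float(match.group("value"))
        elif match.group("name") == metric + "_count":
            count += float(match.group("value"))
    return total / count if count else None


predict_mean = mean_latency(SAMPLE, "request_predict_seconds")
```

In practice the text would come from an HTTP GET against the server's /metrics path; the host and port depend on your deployment.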
KServe Client
Getting Started
KServe's Python client interacts with the KServe control plane APIs to execute operations on a remote KServe cluster, such as creating, patching and deleting an InferenceService instance. See the Sample for Python SDK Client to get started.
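For orientation, the sketch below builds the kind of InferenceService resource that such operations act on, as a plain dictionary. It does not call the SDK or contact a cluster. The helper `build_inference_service`, the service name, the namespace and the storage URI are placeholders chosen for illustration; the SDK's model classes (for example V1beta1InferenceService, V1beta1InferenceServiceSpec, V1beta1PredictorSpec and V1beta1SKLearnSpec) are assumed to mirror this nesting.

```python
import json
from typing import Any, Dict


def build_inference_service(name: str, namespace: str, storage_uri: str) -> Dict[str, Any]:
    """Build a minimal v1beta1 InferenceService manifest with an sklearn predictor."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "predictor": {
                # storageUri accepts any of the storage prefixes the server supports.
                "sklearn": {"storageUri": storage_uri},
            }
        },
    }


# All three argument values are hypothetical placeholders.
manifest = build_inference_service(
    name="sklearn-demo",
    namespace="default",
    storage_uri="gs://example-bucket/models/sklearn/demo",
)
manifest_json = json.dumps(manifest, indent=2)
```

Creating, patching or deleting the resource would then be done through the client against a cluster you have access to.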
Documentation for Client API
Please review the KServe Client API docs.
Documentation For Models
- KnativeAddressable
- KnativeCondition
- KnativeURL
- KnativeVolatileTime
- NetUrlUserinfo
- V1alpha1InferenceGraph
- V1alpha1InferenceGraphList
- V1alpha1InferenceGraphSpec
- V1alpha1InferenceGraphStatus
- V1alpha1InferenceRouter
- V1alpha1InferenceStep
- V1alpha1InferenceTarget
- V1beta1AlibiExplainerSpec
- V1beta1Batcher
- V1beta1ComponentExtensionSpec
- V1beta1ComponentStatusSpec
- V1beta1CustomExplainer
- V1beta1CustomPredictor
- V1beta1CustomTransformer
- V1beta1ExplainerConfig
- V1beta1ExplainerSpec
- V1beta1ExplainersConfig
- V1beta1InferenceService
- V1beta1InferenceServiceList
- V1beta1InferenceServiceSpec
- V1beta1InferenceServiceStatus
- V1beta1InferenceServicesConfig
- V1beta1IngressConfig
- V1beta1LoggerSpec
- V1beta1ModelSpec
- V1beta1ONNXRuntimeSpec
- V1beta1PodSpec
- V1beta1PredictorConfig
- V1beta1PredictorExtensionSpec
- V1beta1PredictorSpec
- V1beta1PredictorsConfig
- V1beta1SKLearnSpec
- V1beta1TFServingSpec
- V1beta1TorchServeSpec
- V1beta1TrainedModel
- V1beta1TrainedModelList
- V1beta1TrainedModelSpec
- V1beta1TrainedModelStatus
- V1beta1TransformerConfig
- V1beta1TransformerSpec
- V1beta1TransformersConfig
- V1beta1TritonSpec
- V1beta1XGBoostSpec