A Gunicorn/Flask-based library for serving ML models, built by the ML-Ops and Science team at Aylien.
model-serving
A Flask-based Python wrapper for deploying models as a REST service, built on:

💡 Flask
💡 Gunicorn
💡 Protobuf3 (optional, for schema validation)
🥳 flask-caching
🥳 Prometheus metrics
The repo also contains examples of registering endpoints and a Makefile to run the service.
Installation
```
pip install model-serving
```
Project Structure
```
aylien_model_serving
│
├── requirements.txt
├── Makefile
│
├── app
│   ├── app_factory.py
│   └── cached_app_factory.py
│
└── examples
    ├── example_schema.proto
    ├── example_schema_pb2.py   (autogenerated by protoc)
    ├── example_serving_handler.py
    └── example_serving_handler_cached.py
```
How it works
- It runs a web service on the given port (defaults to `8000`).
- Any incoming JSON request is passed to your `ServingHandler.process_request` (see the client sketch below).
- Your `ServingHandler.process_request` is expected to return JSON.
- The request and response can optionally be validated against a protobuf schema.
- This library wraps common service code: monitoring, exception handling, etc.
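For instance, once the service is running, a client can POST JSON to it. Here is a minimal sketch using `requests`, assuming the default port and the example language-detection handler shown further below:

```python
import requests

# Assumes the service is running locally on the default port 8000,
# with the example handler registered at "/"
resp = requests.post(
    "http://localhost:8000/",
    json={"title": "Hello world", "body": "Just a plain greeting."},
)
print(resp.json())  # e.g. {"language": "en", "confidence": 0.71, ...}
```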
Usage
- Install this library as a dependency of whatever model you want to serve.
- Create a `ServingHandler` (see below for interface details).
- Run the make target: `make COMMAND_UNCACHED='ServingHandler.run_app()' example-service`
Interfaces
The main interface to flask apps defined in app_factory is the process_json
function.
This function expects to receive json input, optionally perform schema
validation, then call the callable_handler
function using each of the fields
in the json object as a keyword argument to the function. The function is expected to
return an object that can be parsed to json and sent as the response.
This design allows for a very simple but powerful interface that can easily make an endpoint out of just about any Python function.
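To illustrate the field-to-keyword-argument mapping, here is a hypothetical handler (the function and field names below are made up for this example, not part of the library):

```python
# Hypothetical handler used only to illustrate how JSON fields
# become keyword arguments
def greet(name, greeting="Hello"):
    return {"message": f"{greeting}, {name}!"}

# A POST body of {"name": "Ada"} leads process_json to call
# greet(name="Ada"), and the returned dict is serialized as the
# JSON response. A body of {"name": "Ada", "greeting": "Hi"}
# calls greet(name="Ada", greeting="Hi").
```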
Example Serving Handler
The example serving handler defined here does the following:

1. Defines a method `predict_lang`. For the purposes of this example it returns a static prediction; in practice this would be the prediction or classification from your model.
2. Imports a protobuf3-generated `.py` schema file (only needed if you require the JSON message to be schema validated).
3. Defines a function `process_request` that calls the wrapper function `process_json` with the callable from 1 and the schema from 2.
4. Registers `process_request` and its route mapping.

Repeat 1-4 for each (route, callable) pair if you have more than one service endpoint.
```python
import examples.example_schema_pb2 as schema
from aylien_model_serving.app_factory import FlaskAppWrapper, InvalidRequest


def predict_lang(text):
    # Static prediction for the purposes of this example
    return "en", 0.71


def predict(title=None, body=None, enrichments=None):
    if body is None and enrichments is not None:
        body = enrichments["extracted"]["value"]["body"]
    if title is None and body is None:
        raise InvalidRequest("Missing text")
    article_text = f"{title} {body}"
    detected_lang, confidence = predict_lang(article_text)
    return {
        'language': detected_lang,
        'confidence': confidence,
        'error': 'Not an error',
        'version': '0.0.1'
    }


def process_request():
    return FlaskAppWrapper.process_json(predict, schema=schema)


def run_app():
    routes = [
        {
            "endpoint": "/",
            "callable": process_request,
            "methods": ["POST"]
        }
    ]
    return FlaskAppWrapper.create_app(routes)
```
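If you opt into schema validation, the `example_schema_pb2.py` module is generated from `example_schema.proto` with the protobuf compiler, e.g. `protoc --python_out=. examples/example_schema.proto` (the exact paths here are assumed from the project structure above).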
Note that `FlaskAppWrapper.process_json` accepts any callable, so if you'd like to load a classifier or a `model.bin` into memory once at startup, you can use a class with `__call__`, like below 👇
```python
from aylien_model_serving.app_factory import FlaskAppWrapper


class ClassifyHandler:
    def __init__(self):
        # Classifier stands in for your own model class, or a binary
        # loaded from local file storage; loading it here keeps it in
        # memory for the lifetime of the service
        self.classifier = Classifier()

    def __call__(self, text):
        return self.classifier.predict(text)


def run_app():
    classify_handler = ClassifyHandler()

    def process_request():
        # Route requests through process_json so the JSON fields are
        # unpacked as keyword arguments, as in the example above
        return FlaskAppWrapper.process_json(classify_handler)

    routes = [
        {
            "endpoint": "/classify",
            "callable": process_request,
            "methods": ["POST"]
        }
    ]
    return FlaskAppWrapper.create_app(routes)
```
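Because `ClassifyHandler` is instantiated once in `run_app`, the model is loaded a single time at startup and reused across requests, rather than being re-loaded on every call.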