A Gunicorn- and Flask-based library for serving ML models, built by the ml-ops and science team at Aylien

Project description


A Flask-based Python wrapper for deploying models as a REST service, built on:

💡 Flask

💡 Gunicorn

💡 Protobuf3 (optional, for schema validation)

🥳 flask-caching

🥳 Prometheus metrics

The repo also contains examples of registering endpoints and a Makefile to run the service.


 pip install model-serving

Project Structure

|--- requirements.txt
|--- Makefile
│    |--
│    |--       
    | by protoc)

How it works

  • It runs a web service on the given port (defaults to 8000).
  • The JSON body of any incoming request is passed to your ServingHandler.process_request.
  • Your ServingHandler.process_request is expected to return a JSON-serializable object.
  • The request and response can optionally be validated against a protobuf schema.
  • The library wraps common service code such as monitoring and exception handling.


  1. Install this library as a dependency for whatever model you want to serve.
  2. Create a ServingHandler (see below for interface details).
  3. Run the make target: make COMMAND_UNCACHED='ServingHandler.run_app()' example-service
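
Once the service is running, a client can exercise it with a JSON POST. The following is a minimal sketch using only the standard library. It assumes the service listens on localhost at the default port 8000 and exposes a POST route at "/" that accepts title and body fields; the helper names build_request and send are hypothetical and not part of this library.

```python
import json
import urllib.request


def build_request(payload, url="http://localhost:8000/"):
    """Build a JSON POST request for the service (the URL is an assumption)."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request({"title": "Hello", "body": "world"})
# send(request) performs the actual call; it needs a running service.
```

Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.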


The main interface to Flask apps defined in app_factory is the process_json function. This function expects to receive JSON input, optionally performs schema validation, and then calls the callable_handler function, passing each field of the JSON object as a keyword argument. The handler is expected to return an object that can be serialized to JSON and sent as the response.

This design allows for a very simple but powerful interface that can easily make an endpoint out of just about any Python function.
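
To make the keyword-argument dispatch concrete, here is a rough standard-library sketch of the idea. It is not the library's implementation: dispatch_json, the local InvalidRequest class, and the word_count handler are all hypothetical names defined only for this illustration, and schema validation is left out.

```python
import json


class InvalidRequest(Exception):
    """Stand-in for a client-error exception (hypothetical, defined here)."""


def dispatch_json(raw_body, handler):
    """Decode a JSON object, call handler(**fields), and encode the result."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        raise InvalidRequest(f"Body is not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise InvalidRequest("Body must be a JSON object")
    # Each top-level field becomes a keyword argument of the handler.
    result = handler(**payload)
    return json.dumps(result)


def word_count(text, lowercase=False):
    """Toy handler: any function with keyword parameters can be served."""
    if lowercase:
        text = text.lower()
    return {"words": len(text.split())}


response = dispatch_json('{"text": "Serve any Python function"}', word_count)
print(response)
```

Because fields map directly to parameter names, optional parameters with defaults (such as lowercase here) become optional JSON fields, while an unexpected field raises a TypeError from the call itself.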

Example Serving Handler

The example serving handler defined here does the following:

  1. Defines a function predict_lang. For the purposes of this example it returns a static prediction; in practice it would return the prediction or classification from your model.
  2. Imports a protobuf3-generated .py schema file (only needed if the JSON message should be schema validated).
  3. Defines a function process_request that calls the wrapper function process_json with the callable from step 1 and the schema from step 2.
  4. Registers process_request and its route mapping.
  5. Repeats steps 1-4 for each additional (route, callable) pair if you have more than one service endpoint.

import examples.example_schema_pb2 as schema
from aylien_model_serving.app_factory import FlaskAppWrapper, InvalidRequest


def predict_lang(text):
    # Static prediction for the example; replace with a real model call.
    return "en", 0.71


def predict(title=None, body=None, enrichments=None):
    # Fall back to the extracted body when no body is given directly.
    if body is None and enrichments is not None:
        body = enrichments["extracted"]["value"]["body"]
    if title is None and body is None:
        raise InvalidRequest("Missing text")
    # Join only the parts that are present so None is never formatted as text.
    article_text = " ".join(part for part in (title, body) if part)
    detected_lang, confidence = predict_lang(article_text)
    return {
        'language': detected_lang,
        'confidence': confidence,
        'error': 'Not an error',
        'version': '0.0.1'
    }


def process_request():
    # Validate against the schema, then call predict with the JSON fields.
    return FlaskAppWrapper.process_json(predict, schema=schema)


def run_app():
    routes = [
        {
            "endpoint": "/",
            "callable": process_request,
            "methods": ["POST"]
        }
    ]
    return FlaskAppWrapper.create_app(routes)

Note that FlaskAppWrapper accepts any callable in process_json, so if you'd like to load a classifier or a model.bin into memory once, you can use a callable class like the one below 👇

from aylien_model_serving.app_factory import FlaskAppWrapper


class ClassifyHandler:
    def __init__(self):
        # Classifier is a placeholder for your own model class, e.g. one that
        # loads a binary from local file storage. It is built once at startup.
        self.classifier = Classifier()

    def __call__(self, text):
        return self.classifier.predict(text)


def run_app():
    classify_handler = ClassifyHandler()
    routes = [
        {
            "endpoint": "/classify",
            "callable": classify_handler,
            "methods": ["POST"]
        }
    ]
    return FlaskAppWrapper.create_app(routes)
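
The point of the callable class is that the model is loaded once and reused across requests. The sketch below shows that behaviour without the web layer. StubClassifier is a hypothetical stand-in for a real model, and its instance counter exists only to make the single load visible.

```python
class StubClassifier:
    """Hypothetical stand-in for a real model; counts how often it is built."""

    instances = 0

    def __init__(self):
        StubClassifier.instances += 1  # pretend this is an expensive load

    def predict(self, text):
        label = "question" if text.strip().endswith("?") else "statement"
        return {"label": label}


class ClassifyHandler:
    def __init__(self):
        # Loaded once, when the handler is constructed at startup.
        self.classifier = StubClassifier()

    def __call__(self, text):
        # Called once per request with the request's "text" field.
        return self.classifier.predict(text)


handler = ClassifyHandler()
first = handler(text="Is this loaded once?")
second = handler(text="Yes.")
```

Calling the handler directly like this is also a convenient way to unit test the prediction logic before wiring it into a route.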
