mlbaklava

This is a package for building Python-based machine learning models into Docker images that can be deployed directly to AWS SageMaker.

This is an extension to the standard python packaging utility setuptools. The official python packaging guide explains the basics of building python distributions in detail.

This extends the existing behavior of building a setuptools source distribution (sdist) by installing the built package artifact (*.tar.gz) into a Docker image. After the python distribution has been installed to the Docker image, it allows the user to configure the image for the purposes of model training and prediction.

The name was chosen because mlbaklava, like baklava, is built from small pieces and layers: it stacks technologies in many layers to create Docker images.

Installation

Install docker and then install the package:

pip install mlbaklava

Features

Installing the mlbaklava package automatically registers extensions to setuptools. New features are added to build python distributions into docker images.

When installed, this package allows you to use three new setuptools commands (similar to sdist or bdist_wheel):

  • train: Builds a training docker image for your package. A training image (python setup.py train) executes a user-provided function just once in order to produce a model artifact. This image conforms to the AWS SageMaker training image API.

  • predict: Builds a prediction docker image for your package. A prediction image (python setup.py predict) hosts the user-provided function in a web application to be able to produce many decisions over time using a RESTful service conforming to the AWS SageMaker prediction API.

  • execute: Builds a batch execution docker image for your package. A batch execution image (python setup.py execute) executes a user-provided batch function for prediction on a large number of records.

New setup keywords are also registered with setuptools (similar to install_requires or entry_points). These include:

  • python_version: Specify the version of python to build the docker image for
  • dockerlines: Add docker commands to your resulting Dockerfile

This package also defines a Python API to perform the same actions as the setuptools extension.

Usage

Train

To create a training image, your package must define a function that takes no arguments and returns nothing. It can be named anything as long as it is correctly referenced in the setup.py file.

def my_training_function():
    """
    A training function takes no arguments and returns no results
    """
    pass
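
As a slightly fuller sketch, a training function usually ends by writing a model artifact to disk. The example below assumes the SageMaker convention that artifacts are collected from /opt/ml/model; the EXAMPLE_MODEL_DIR environment variable and the model.json file name are hypothetical, added only so the function can be run outside a container.

```python
import json
import os
from pathlib import Path

# /opt/ml/model is where SageMaker collects artifacts from a training
# container. EXAMPLE_MODEL_DIR is a hypothetical override for local runs.
MODEL_DIR_ENV = "EXAMPLE_MODEL_DIR"
DEFAULT_MODEL_DIR = "/opt/ml/model"


def my_training_function():
    """Fit a trivial model and write it to the model directory."""
    model_dir = Path(os.environ.get(MODEL_DIR_ENV, DEFAULT_MODEL_DIR))
    model_dir.mkdir(parents=True, exist_ok=True)

    # Stand-in for real training: the "model" is the mean of a few numbers.
    samples = [1.0, 2.0, 3.0, 4.0]
    model = {"mean": sum(samples) / len(samples)}

    # Persist the artifact; the function itself returns nothing.
    (model_dir / "model.json").write_text(json.dumps(model))
```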

The setup.py must include a mlbaklava.train entrypoint which points to this function. The entrypoint is the full module path to the defined python function. An example of a setup.py script with a valid training entrypoint would look like the following:

from setuptools import setup, find_packages

setup(
    name='example',
    version='0.0.1',
    packages=find_packages(),
    include_package_data=True,
    entry_points={
        'mlbaklava.train': [
            'my_entrypoint = example.main:my_training_function',
        ],
    }
)
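
To make the entrypoint format concrete, the sketch below resolves a string of the form 'name = package.module:function' into a callable using only the standard library. It illustrates the format, not mlbaklava's own loading code; resolve_entrypoint is a hypothetical helper, and json:dumps stands in for example.main:my_training_function so the snippet runs without the example package.

```python
import importlib


def resolve_entrypoint(spec):
    """Resolve 'name = package.module:function' into (name, callable)."""
    name, _, target = spec.partition("=")
    module_path, _, attr = target.strip().partition(":")
    module = importlib.import_module(module_path)
    return name.strip(), getattr(module, attr)


# A standard-library function is used here in place of example.main.
name, func = resolve_entrypoint("demo = json:dumps")
print(name, func([1, 2]))
```

If the module path or function name is misspelled, the import or attribute lookup fails, which is why the reference in setup.py must match the function's location exactly.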

With this setup.py, a training docker image can be built:

python setup.py train

See the examples for full sample projects.

Predict

To create a prediction image, your package must define a function that takes one argument and returns one value. It can be named anything as long as it is correctly referenced in the setup.py file.

def my_hosted_function(payload):
    """
    A hosted function takes a dictionary input and returns a dictionary
    output.

    Arguments:
        payload (dict[str, object]): This is the payload that was sent
            to the SageMaker server using a POST request to the
            `invocations` route.

    Returns:
        result (dict[str, object]): The output of the function is
            expected to be either a dictionary (like the function input)
            or a JSON string.
    """
    return {}
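
A hosted function that does some work might look like the following sketch. The 'values' input key and the 'count', 'mean', and 'error' output keys are hypothetical; the only contract taken from the description above is dictionary in, dictionary out.

```python
def my_hosted_function(payload):
    """Summarise the numbers sent in the request payload."""
    values = payload.get("values", [])
    if not values:
        # Return an error in the response body instead of raising.
        return {"error": "no values supplied"}
    return {"count": len(values), "mean": sum(values) / len(values)}
```

Because the function is plain Python, it can be called directly with a dictionary in a unit test before any image is built.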

The setup.py must include a mlbaklava.predict entrypoint which points to this function. The entrypoint is the full module path to the defined python function. An example of a setup.py script with a valid prediction entrypoint would look like the following:

from setuptools import setup, find_packages

setup(
    name='example',
    version='0.0.1',
    packages=find_packages(),
    include_package_data=True,
    entry_points={
        'mlbaklava.predict': [
            'my_entrypoint = example.main:my_hosted_function',
        ]
    }
)

With this setup.py, a prediction docker image can be built:

python setup.py predict
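
Once a prediction container is running, the hosted function is reached by POSTing JSON to the invocations route. The sketch below only builds such a request with the standard library. It assumes the container is running locally with port 8080 published (the port SageMaker uses for inference containers); the host, the payload, and the build_invocation_request helper are hypothetical.

```python
import json
import urllib.request


def build_invocation_request(payload, host="http://localhost:8080"):
    """Build a POST request carrying a JSON payload for /invocations."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=host + "/invocations",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_invocation_request({"values": [1, 2, 3]})
# With a container listening locally, the request could be sent with:
#   with urllib.request.urlopen(request) as response:
#       print(json.loads(response.read()))
```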

See the examples for full sample projects.

Predict Initialization

There are often cases when python code needs to execute prior to running predictions. For example, it may take a long time to load a model artifact into memory.

To add a prediction initializer, your package must define a function that takes no arguments and may return anything. It can be named anything as long as it is correctly referenced in the setup.py file. The function is responsible for its own caching; a caching decorator such as functools.lru_cache is recommended so that the result is kept in memory.

import functools

@functools.lru_cache()
def my_init_function():
    """
    An initialization function takes no arguments and may return a
    result.

    Returns:
        data (object): Data necessary for prediction. Could be any type.
    """
    return 1, 2, 3
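
The sketch below shows one way the cached initializer and the hosted function can work together: the hosted function calls the initializer on every request, and functools.lru_cache ensures the expensive load runs only once. The weights, the 'features' key, and the LOAD_CALLS counter are hypothetical and exist only to make the caching visible.

```python
import functools

LOAD_CALLS = []  # records each real load, for demonstration only


@functools.lru_cache()
def my_init_function():
    """Pretend to load a model; the body runs once thanks to the cache."""
    LOAD_CALLS.append(1)
    return {"weights": [0.5, 0.25]}


def my_hosted_function(payload):
    """Score a request using the cached model."""
    model = my_init_function()  # served from the cache after the first call
    features = payload.get("features", [])
    score = sum(w * x for w, x in zip(model["weights"], features))
    return {"score": score}
```

Calling my_hosted_function repeatedly leaves LOAD_CALLS with a single entry, because only the first call executes the body of my_init_function.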

The setup.py must include a mlbaklava.initialize entrypoint which points to this function. The entrypoint is the full module path to the defined python function. An example of a setup.py script with a valid prediction initialization entrypoint would look like the following:

from setuptools import setup, find_packages

setup(
    name='example',
    version='0.0.1',
    packages=find_packages(),
    include_package_data=True,

    # Notice that we have an initializer AND a predict function
    entry_points={
        'mlbaklava.predict': [
            'my_entrypoint = example.main:my_hosted_function',
        ],
        'mlbaklava.initialize': [
            'my_initializer = example.main:my_init_function',
        ]
    }
)

With this setup.py, a prediction docker image can be built that will initialize using the my_init_function initializer:

python setup.py predict

See the examples for full sample projects.

Multiple Options

A single package may define all of the previous entrypoints if it is responsible for both training and prediction. As in the previous examples, all that is required is to add the set of entrypoints to an existing setup.py script.

In addition, we can also fix the python_version and add custom dockerlines to the final image:

from setuptools import setup, find_packages

setup(
    name='example',
    version='0.0.1',
    packages=find_packages(),
    include_package_data=True,

    # This will force the python version for the resulting image
    python_version='3.6.6',

    # This will run during the docker build stage
    dockerlines=[
        'RUN echo Hello, World!',
        'RUN echo Hello, Sailor!',
    ],

    # The predict and train entrypoints create distinct images
    entry_points={
        'mlbaklava.train': [
            'my_train_entrypoint = example.main:my_training_function',
        ],
        'mlbaklava.predict': [
            'my_predict_entrypoint = example.main:my_hosted_function',
        ],
        'mlbaklava.initialize': [
            'my_initializer = example.main:my_init_function',
        ]
    }
)

With this setup.py, both a prediction and a training docker image can be built:

python setup.py predict
python setup.py train

Community

Engage with the mlbaklava + MLCTL community on Slack at:

https://mlctl.slack.com/

Contributing

For information on how to contribute to mlbaklava, please read through the contributing guidelines.
