Baklava
Baklava is a package for building python-based Machine Learning models into docker images that can be deployed directly to AWS SageMaker.
It is an extension to setuptools, the standard python packaging utility. The official python packaging guide explains the basics of building python distributions in detail.
Baklava extends the existing behavior of building a setuptools source
distribution (sdist) by installing the built package artifact
(*.tar.gz) into a Docker image. After the python distribution has been
installed in the Docker image, the user can configure the image
for model training and prediction.
The name was chosen because baklava is made of many small pieces and layers, just as this package stacks technologies in layers to create Docker images.
Installation
Install docker and then install the package:
pip install baklava
Features
Installing the baklava package automatically registers extensions to
setuptools. These extensions add the ability to build python
distributions into docker images.
When installed, this package allows you to use three new setuptools
commands (similar to sdist or bdist_wheel):
- train: Builds a training docker image for your package. A training image (python setup.py train) executes a user-provided function just once in order to produce a model artifact. This image conforms to the AWS SageMaker training image API.
- predict: Builds a prediction docker image for your package. A prediction image (python setup.py predict) hosts the user-provided function in a web application so that it can produce many decisions over time through a RESTful service conforming to the AWS SageMaker prediction API.
- execute: Builds a batch execution docker image for your package. A batch execution image (python setup.py execute) executes a user-provided batch function for prediction on a large number of records.
Production-grade Machine Learning API using Flask, Gunicorn, Nginx, and Docker
New setup keywords are also registered with setuptools (similar to
install_requires or entry_points). These include:
- python_version: Specify the version of python to build the docker image for
- dockerlines: Add docker commands to your resulting Dockerfile
This package also defines a Python API to perform the same actions as the setuptools extension.
Usage
Train
To create a training image, your package must define a function that
takes no arguments and returns nothing. It can be named anything as long
as it is correctly referenced in the setup.py file.
def my_training_function():
    """
    A training function takes no arguments and returns no results
    """
    pass
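To make the shape of a training function more concrete, here is a minimal sketch that produces a model artifact. It assumes the SageMaker convention that training containers save model files under /opt/ml/model. The MODEL_DIR environment variable, the model.json file name, and the toy "model" are all hypothetical and are not part of baklava.

```python
import json
import os


def my_training_function():
    """Fit a trivial model and write it where SageMaker collects artifacts."""
    # SageMaker training containers conventionally save model files under
    # /opt/ml/model. MODEL_DIR is a hypothetical override (not a baklava
    # setting) so the sketch can also run outside a container.
    model_dir = os.environ.get("MODEL_DIR", "/opt/ml/model")
    os.makedirs(model_dir, exist_ok=True)

    # Stand-in for real training: the "model" is the mean of a few numbers.
    samples = [1.0, 2.0, 3.0, 4.0]
    model = {"mean": sum(samples) / len(samples)}

    # Persist the artifact; the function itself returns nothing.
    with open(os.path.join(model_dir, "model.json"), "w") as handle:
        json.dump(model, handle)
```

To try it locally, point MODEL_DIR at a writable directory before calling the function.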
The setup.py must include a baklava.train entrypoint which
points to this function. The entrypoint is the full module path to the
defined python function. An example of a setup.py script with a valid
training entrypoint would look like the following:
from setuptools import setup, find_packages

setup(
    name='example',
    version='0.0.1',
    packages=find_packages(),
    include_package_data=True,
    entry_points={
        'baklava.train': [
            'my_entrypoint = example.main:my_training_function',
        ],
    }
)
With this setup.py, a training docker image can be built:
python setup.py train
See the examples for full sample projects.
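An entrypoint value such as example.main:my_training_function follows the usual setuptools convention of a module path, a colon, and an attribute name. The sketch below shows how a string of that form can be resolved to a callable. The resolve_entrypoint helper is hypothetical and only illustrates the convention. How baklava itself loads entrypoints is not described here.

```python
import importlib


def resolve_entrypoint(value):
    """Resolve a 'package.module:attribute' string to the object it names."""
    module_path, _, attribute = value.partition(":")
    target = importlib.import_module(module_path)
    # Walk dotted attributes so 'pkg.mod:Class.method' also resolves.
    for part in attribute.split("."):
        target = getattr(target, part)
    return target


# Demonstrated with a standard-library function, because the `example`
# package from the setup.py above is not installed here.
loader = resolve_entrypoint("json:loads")
print(loader('{"ok": true}'))
```

If the module path or function name in setup.py is misspelled, this kind of lookup fails with ImportError or AttributeError, which is why the reference must match the package layout exactly.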
Predict
To create a prediction image, your package must define a function
that takes one argument and returns one value. It can be named anything
as long as it is correctly referenced in the setup.py file.
def my_hosted_function(payload):
    """
    A hosted function takes a dictionary input and returns a dictionary
    output.

    Arguments:
        payload (dict[str, object]): This is the payload that was sent to
            the SageMaker server using a POST request to the
            `invocations` route.

    Returns:
        result (dict[str, object]): The output of the function is
            expected to be either a dictionary (like the function input)
            or a JSON string.
    """
    return {}
The setup.py must include a baklava.predict entrypoint
which points to this function. The entrypoint is the full module path to
the defined python function. An example of a setup.py script with a
valid prediction entrypoint would look like the following:
from setuptools import setup, find_packages

setup(
    name='example',
    version='0.0.1',
    packages=find_packages(),
    include_package_data=True,
    entry_points={
        'baklava.predict': [
            'my_entrypoint = example.main:my_hosted_function',
        ]
    }
)
With this setup.py, a prediction docker image can be built:
python setup.py predict
See the examples for full sample projects.
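Because a hosted function is an ordinary python function from a dictionary to a dictionary (or JSON string), it can be exercised locally before any image is built. The sketch below is only an approximation of the request cycle: the simulate_invocation helper and the values, total and count keys are hypothetical, and the real web application inside the image may handle requests differently.

```python
import json


def my_hosted_function(payload):
    """Summarize the numbers found under the hypothetical 'values' key."""
    values = payload.get("values", [])
    return {"total": sum(values), "count": len(values)}


def simulate_invocation(body):
    """Mimic a JSON request/response cycle around the hosted function."""
    payload = json.loads(body)  # request body -> dictionary
    result = my_hosted_function(payload)
    # The function may return a dictionary or an already-encoded JSON string.
    return result if isinstance(result, str) else json.dumps(result)


print(simulate_invocation('{"values": [1, 2, 3]}'))
```

Keeping the function free of web-framework details like this makes it easy to unit test on its own.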
Predict Initialization
There are often cases when python code needs to execute prior to running predictions. For example, it may take a long time to load a model artifact into memory.
To add a prediction initializer, your package must define a function
that takes no arguments and may return anything. It can be named
anything as long as it is correctly referenced in the setup.py file.
The function is responsible for its own caching, but it is recommended
to use a caching function similar to functools.lru_cache to save the
function results in memory.
import functools

@functools.lru_cache()
def my_init_function():
    """
    An initialization function takes no arguments and may return a
    result.

    Returns:
        data (object): Data necessary for prediction. Could be any type.
    """
    return 1, 2, 3
The setup.py must include a baklava.initialize entrypoint
which points to this function. The entrypoint is the full module path to
the defined python function. An example of a setup.py script with a
valid prediction initialization entrypoint would look like the
following:
from setuptools import setup, find_packages

setup(
    name='example',
    version='0.0.1',
    packages=find_packages(),
    include_package_data=True,
    # Notice that we have an initializer AND a predict function
    entry_points={
        'baklava.predict': [
            'my_entrypoint = example.main:my_hosted_function',
        ],
        'baklava.initialize': [
            'my_initializer = example.main:my_init_function',
        ]
    }
)
With this setup.py, a prediction docker image can be built that will
initialize using the my_init_function initializer:
python setup.py predict
See the examples for full sample projects.
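The caching recommendation can be illustrated with a small sketch in which the hosted function calls the cached initializer itself. This is one possible arrangement, not necessarily how baklava connects the two functions. The weights, features and score names are hypothetical.

```python
import functools


@functools.lru_cache(maxsize=None)
def my_init_function():
    """Pretend to load a model artifact; the result is cached after one call."""
    # Stand-in for a slow load from disk.
    return {"weights": (0.5, 1.5)}


def my_hosted_function(payload):
    """Score a request using the cached model."""
    model = my_init_function()  # served from the cache after the first call
    features = payload.get("features", [])
    score = sum(w * x for w, x in zip(model["weights"], features))
    return {"score": score}


my_init_function()  # what an initializer hook would trigger at startup
print(my_hosted_function({"features": [2.0, 4.0]}))
print(my_init_function.cache_info().misses)  # number of real loads so far
```

Since the initializer takes no arguments, lru_cache holds exactly one entry, so every later call returns the same in-memory object without repeating the load.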
Multiple Options
A package may define all of the previous entrypoints in a single
setup.py if that package is responsible for both training and
prediction. Like the previous examples, all that is required is to add a
set of entrypoints to an existing setup.py script.
In addition, we can also fix the python_version and add custom
dockerlines to the final image:
from setuptools import setup, find_packages

setup(
    name='example',
    version='0.0.1',
    packages=find_packages(),
    include_package_data=True,
    # This will force the python version for the resulting image
    python_version='3.6.6',
    # This will run during the docker build stage
    dockerlines=[
        'RUN echo Hello, World!',
        'RUN echo Hello, Sailor!',
    ],
    # The predict and train entrypoints create distinct images
    entry_points={
        'baklava.train': [
            'my_train_entrypoint = example.main:my_training_function',
        ],
        'baklava.predict': [
            'my_predict_entrypoint = example.main:my_hosted_function',
        ],
        'baklava.initialize': [
            'my_initializer = example.main:my_init_function',
        ]
    }
)
With this setup.py, both a prediction and a training docker image can
be built:
python setup.py predict
python setup.py train
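To give a rough idea of how the python_version and dockerlines keywords could influence an image, the sketch below renders a simplified Dockerfile. It is purely illustrative: the render_dockerfile helper, the python base image, and the COPY and pip install steps are assumptions, and the Dockerfile that baklava actually generates may look quite different.

```python
def render_dockerfile(python_version, sdist_name, dockerlines=()):
    """Render a simplified, hypothetical Dockerfile as a string."""
    lines = [
        # python_version selects the interpreter the image is built for.
        f"FROM python:{python_version}",
        # Install the sdist artifact produced by setuptools.
        f"COPY dist/{sdist_name} /tmp/{sdist_name}",
        f"RUN pip install /tmp/{sdist_name}",
    ]
    # dockerlines are passed through verbatim as extra build instructions.
    lines.extend(dockerlines)
    return "\n".join(lines) + "\n"


print(render_dockerfile(
    "3.6.6",
    "example-0.0.1.tar.gz",
    ["RUN echo Hello, World!", "RUN echo Hello, Sailor!"],
))
```

Under these assumptions each dockerlines entry becomes one extra instruction in the build, which matches the comment in the setup.py above that they run during the docker build stage.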
Community
Engage with the Baklava + MLCTL community on Slack at:
Contributing
For information on how to contribute to baklava, please read through the contributing guidelines.