
Train and Deploy is a framework to automate the Machine Learning workflow.

Project description

TanD - Train and Deploy

TanD is a simple, no-code, flexible and customizable framework to automate the Machine Learning workflow.

With TanD you can go through the whole ML workflow without writing a single line of code (for both sklearn and torch based models): by creating a project template and setting some configurations in a .json file you are able to train a ML model of your choice, store it in mlflow to control its lifecycle, and create a ready-to-deploy API to serve it.

Although TanD lets you run your workflows (from train to deploy) with no code at all, it is highly customizable, letting you introduce your own chunks of code to enhance your modelling pipelines in any way you want.

Our mission is to let you avoid repetitive tasks so you can focus on what matters. TanD brings Machine-Learning laziness to a whole new level.

Roadmap

The project's roadmap (not listed in order of priority) is:

  • Create project templates (torch and sklearn) for regression tasks on structured data;
  • Create a Dockerfile in project templates to ease deployment (done);
  • Create a cron job in Docker to update model parameters (done);
  • Create tutorials for training and deploying with tand;
  • Create project templates (torch / transformers) for classification tasks on text data;
  • Create project templates (torch) for classification on image data;
  • Create documentation for the project.


Install

To install tand you can use pip:

pip install train-and-deploy

You can also clone the repo and install it locally with pip install .:

git clone https://github.com/piEsposito/TanD.git
cd TanD
pip install .

Documentation

Documentation is available for:

  • tand.util, with an explanation of the project templates;
  • tand.deployment, with utilities to ease deployment to the cloud;
  • the project templates themselves.


Quick start

After installing tand you can train and deploy a sample project using the UCI heart disease dataset. Notice that you can perform the same process on your own datasets by only changing the .csv file and setting a few configurations. By following these steps, this tutorial will let you:

  • Train a torch model with a dataset and log all of its metrics to mlflow;
  • Automatically generate a FastAPI-based API service for serving the model (which receives the features as a JSON payload);
  • Deploy the model to AWS ElasticBeanstalk with two lines of code.

To create the project with a torch-based model, in an empty folder, type:

tand-create-project --template pytorch-structured-classification

That will create all the needed files in the folder. We should first check config.json:

{
  "train": {
    "__help": "Configurations for the training of the project (of the model, etc...)",
    "data_path": "data/data.csv",
    "labels_column": "target",

    "log_every": 250,
    "epochs": 50,

    "hidden_dim": 256,
    "batch_size": 32,
    "device": "cpu",
    "labels": ["no_heart_disease", "heart_disease"],
    "to_drop": []

  },

  "test": {
    "__help": "Configurations for the testing of this project (train + app)"
  },

  "app": {
    "__help": "Configurations for the service generated by this project",
    "token": null
  },

  "mlflow": {
    "__help": "Configurations for the mlflow model manager of the project",
    "model_name": "pytorch-classifier-nn-heart-disease",
    "experiment_name": "heart-disease-experiment"
  }
}

The project is all set, but it is important to check (a quick sanity-check sketch follows this list):

  • If the data_path attribute of train is set properly;
  • If the labels_column attribute of train is set according to the dataset label column;
  • If the labels attribute of train lists the label names in the proper order.
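
A minimal sketch of such a check, run from the project folder and assuming pandas is available (this is not part of tand, just a quick verification you could do by hand):

import json
import pandas as pd

# Hypothetical sanity check: confirm that config.json points at an existing CSV
# and that the labels column and label names are consistent with the data.
with open("config.json") as f:
    cfg = json.load(f)["train"]

df = pd.read_csv(cfg["data_path"])
assert cfg["labels_column"] in df.columns, "labels_column not found in the dataset"
assert len(cfg["labels"]) == df[cfg["labels_column"]].nunique(), \
    "number of label names does not match the number of classes"
print("config.json is consistent with", cfg["data_path"])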

We should also check the mlflow paths for both the database and model logging. As we want to keep it simple, we will use sqlite and local storage, but you can point them to remote buckets and a database in a production environment. They are set in every file of the env_files folder as:

MLFLOW_TRACKING_URI=sqlite:///database.db
MLFLOW_DEFAULT_ARTIFACT_ROOT=./mlruns/

Feel free to change them according to the mlflow documentation.
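
If you want to double-check which backend will actually be used, a small sketch like this (run after sourcing one of the env files) prints the values mlflow picks up from the environment:

import os
import mlflow

# mlflow reads MLFLOW_TRACKING_URI from the environment automatically; this just
# prints which tracking backend and artifact root the next run will use.
print("tracking URI :", mlflow.get_tracking_uri())
print("artifact root:", os.environ.get("MLFLOW_DEFAULT_ARTIFACT_ROOT"))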

Training the model is as easy as running:

source env_files/train.env
python train.py

And you can see lots of metrics for the model at mlflow:

bash mlflow-server.sh
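
If you prefer to inspect the logged runs programmatically rather than through the UI, a sketch along these lines works against the same tracking backend (it assumes env_files/train.env has been sourced so the tracking URI is set):

from mlflow.tracking import MlflowClient

# List the runs logged under the experiment name from config.json, with their metrics.
client = MlflowClient()
experiment = client.get_experiment_by_name("heart-disease-experiment")
for run in client.search_runs(experiment_ids=[experiment.experiment_id]):
    print(run.info.run_id, run.data.metrics)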

If you are running an mlflow experiment for the first time in a project, the model will automatically be set for production. If you rerun the experiment with a different dataset or parameters, you can set the production model in mlflow following the documentation. That will be useful once you deploy it.
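
For reference, promoting a later model version to the production stage can also be done from Python with the mlflow client; the following is only a sketch, and the version number is illustrative:

from mlflow.tracking import MlflowClient

# Promote a specific registered model version to the "Production" stage of the
# mlflow model registry; this is the version the service will later fetch.
client = MlflowClient()
client.transition_model_version_stage(
    name="pytorch-classifier-nn-heart-disease",  # model_name from config.json
    version=2,                                   # example: the version you want to serve
    stage="Production",
)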

The training command also creates request_model.json, which is used both to validate request bodies for the API service and to reorder the features to match the model's expected input. This file is also used by the unit tests for the API (which are generated automatically as well).

{
    "age": 1,
    "sex": 1,
    "cp": 1,
    "trestbps": 1,
    "chol": 1,
    "fbs": 1,
    "restecg": 1,
    "thalach": 1,
    "exang": 1,
    "oldpeak": 1,
    "slope": 1,
    "ca": 1,
    "thal": 1
}
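
To make the role of this file concrete, here is a rough illustration (not the generated service's actual code) of how its keys can be used to validate and re-order an incoming payload:

import json

# The keys of request_model.json define the feature names and their order, so an
# incoming request body can be checked for completeness and rearranged into the
# vector the model expects.
with open("request_model.json") as f:
    reference = json.load(f)

incoming = {"age": 1, "sex": 1, "cp": 1, "trestbps": 1, "chol": 1, "fbs": 1,
            "restecg": 1, "thalach": 1, "exang": 1, "oldpeak": 1, "slope": 1,
            "ca": 1, "thal": 1}

missing = set(reference) - set(incoming)
assert not missing, f"missing features: {missing}"
features = [incoming[name] for name in reference]  # in the model's expected order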

Moving on to the API creation

If you want to generate value with a ML model, you should deploy it. tand helps you by creating the API, a Dockerfile and all the configuration files needed to deploy the model with no code. Notice that the API is protected with a token, which defaults to TOKEN123, but you can change it in env_files/app.env and env_files/docker_app.env.

The API uses a simple authentication token and exposes a / route for health checking, an /update-model route for model updating (POST to it with proper credentials and it fetches the latest production model from mlflow), and a /predict route, which grabs the features from the request body and returns the prediction.

To test the API, just run:

source env_files/app.env
pytest

All the tests should pass.

You now have some options for deployment: you can run the app on an arbitrary VM, build and run the image generated by the Dockerfile, or use tand's features for AWS ElasticBeanstalk, which let your model be deployed cheaply and scalably. We will cover all of them:

To run the app, just type:

uvicorn app:app --reload --host 0.0.0.0 --port 8000

You can test it with:

curl --header "Content-Type: application/json" \
  --header "TOKEN: $API_TOKEN" \
  --request POST \
  --data '{"age":1,"sex":1,"cp":1,"trestbps":1,"chol":1,"fbs":1,"restecg":1,"thalach":1,"exang":1,"oldpeak":1,"slope":1,"ca":1,"thal":1}' \
  http://localhost:8000/predict

Remember to source env_files/app.env before performing the request or else it will return status 401 Unauthorized.
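
If you prefer Python over curl, the same request can be made with the requests library (a sketch, assuming env_files/app.env has been sourced in the same shell so API_TOKEN is set):

import os
import requests

# Same call as the curl example above: the token travels in the "TOKEN" header
# and the features go in the JSON body.
payload = {"age": 1, "sex": 1, "cp": 1, "trestbps": 1, "chol": 1, "fbs": 1,
           "restecg": 1, "thalach": 1, "exang": 1, "oldpeak": 1, "slope": 1,
           "ca": 1, "thal": 1}
response = requests.post(
    "http://localhost:8000/predict",
    headers={"TOKEN": os.environ["API_TOKEN"]},
    json=payload,
)
print(response.status_code, response.json())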

You can build the Docker image with:

docker build . -t tand-app:v1

And run it with:

docker run -p 8000:8000 --env-file env_files/docker_app.env tand-app:v1

You can test it with:

curl --header "Content-Type: application/json" \
  --header "TOKEN: $API_TOKEN" \
  --request POST \
  --data '{"age":1,"sex":1,"cp":1,"trestbps":1,"chol":1,"fbs":1,"restecg":1,"thalach":1,"exang":1,"oldpeak":1,"slope":1,"ca":1,"thal":1}' \
  http://localhost:8000/predict

Remember to source env_files/app.env before performing the request or else it will return status 401 Unauthorized.

Last, we can deploy it to AWS ElasticBeanstalk. To do that, first set your AWS credentials on your machine and then install the eb CLI. That should be done in your root environment, not in a conda or virtualenv environment.

You can generate the configurations with:

tand-prepare-aws-eb-deployment --init-git

We pass the --init-git flag because the eb CLI uses the .git repository to determine which files to upload for deployment.

That will generate deploy-aws-eb.sh, which will be run for deployment. It will also generate .ebextensions containing:

  • cron.config - which runs, on each instance, a daily task to update the instance's ML model by fetching the latest production one from mlflow (most useful once a cloud-based mlflow backend is set; a sketch of this follows the list);
  • options.config - which sets the API token and mlflow backend environment variables for the deployment; and
  • scaling.config - which sets the scalability configurations for the deployment, including the maximum and minimum number of replicas and the scaling criterion (defaults to latency).
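
As a rough idea of what the cron task in cron.config does, fetching the current production model from the registry boils down to something like the following (a conceptual sketch, not the template's exact code; the template may load a specific flavor such as mlflow.pytorch instead):

import mlflow.pyfunc

# Load whatever model version is currently in the "Production" stage of the
# mlflow registry, so the instance serves the latest promoted model.
model = mlflow.pyfunc.load_model("models:/pytorch-classifier-nn-heart-disease/Production")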

To finally deploy it to AWS, run:

bash deploy-aws-eb.sh

It takes about 5 minutes, after which you can run eb open to get the link and then try it with:

curl --header "Content-Type: application/json" \
  --header "TOKEN: $API_TOKEN" \
  --request POST \
  --data '{"age":1,"sex":1,"cp":1,"trestbps":1,"chol":1,"fbs":1,"restecg":1,"thalach":1,"exang":1,"oldpeak":1,"slope":1,"ca":1,"thal":1}' \
  http://YOUR_LINK_GOES_HERE/predict

Remember to properly set the token for testing.

And with that, we have shown how to train and deploy a model with tand using a couple of terminal commands and no coding at all.


Made by Pi Esposito


