A minimum-lovable machine-learning pipeline, built on top of AWS SageMaker.

ML2P – or (ML)^2P – is the minimal lovable machine-learning pipeline and a friendlier interface to AWS SageMaker.

Design goals:

  • support the full machine learning lifecycle

  • support custom feature engineering

  • support building custom models in Python

  • provide reproducible training and deployment of models

  • support the use of customised base Docker images for training and deployment

Concretely, it provides a command line interface and a Python library to assist with:

  • S3:
    • Managing training data

  • SageMaker:
    • Launching training jobs

    • Deploying trained models

    • Creating notebook instances

  • On your local machine or in a SageMaker notebook:
    • Downloading training datasets from S3

    • Training models

    • Loading trained models from SageMaker / S3

Installing

Install ML2P with:

$ pip install ml2p

Mailing list

If you have questions about ML2P, would like to contribute, or have suggestions for improvements, you are welcome to join the project mailing list at https://groups.google.com/g/ml2p and post a message there.

Overview

ML2P helps manage a machine learning project. You’ll define your project by writing a small YAML file named ml2p.yml:

project: "ml2p-tutorial"
s3folder: "s3://your-s3-bucket/"
models:
  bob: "models.RegressorModel"
defaults:
  image: "XXXXX.dkr.ecr.REGION.amazonaws.com/your-docker-image:X.Y.Z"
  role: "arn:aws:iam::XXXXX:role/your-role"
train:
  instance_type: "ml.m5.large"
deploy:
  instance_type: "ml.t2.medium"
  record_invokes: true

This specifies:

  • project: the name of your project

  • s3folder: the S3 bucket that will hold the models and data sets for your project

  • models: a mapping from model names to the Python classes that will be used to train the models and make predictions

  • defaults:

    • image: the docker image that your project will use for training and prediction

    • role: the AWS role your project will run under

  • train:

    • instance_type: the AWS instance type that will be used when training your model

  • deploy:

    • instance_type: the AWS instance type that will be used when deploying your model

    • record_invokes: whether to record prediction requests in S3
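To make the structure above concrete, here is a minimal sketch of how a configuration like this could be checked and how a dotted class path such as "models.RegressorModel" could be resolved to a Python class. It is not ML2P's own loading code: the config is written as a plain dict rather than parsed from YAML, and REQUIRED_PATHS, missing_keys and resolve_class are hypothetical names chosen for illustration. Which keys ML2P actually requires is an assumption based only on the example file.

```python
import importlib

# The example ml2p.yml, written out as the dict a YAML parser would produce.
config = {
    "project": "ml2p-tutorial",
    "s3folder": "s3://your-s3-bucket/",
    "models": {"bob": "models.RegressorModel"},
    "defaults": {
        "image": "XXXXX.dkr.ecr.REGION.amazonaws.com/your-docker-image:X.Y.Z",
        "role": "arn:aws:iam::XXXXX:role/your-role",
    },
    "train": {"instance_type": "ml.m5.large"},
    "deploy": {"instance_type": "ml.t2.medium", "record_invokes": True},
}

# Assumption: these are the keys a project needs (taken from the example only).
REQUIRED_PATHS = [
    ("project",),
    ("s3folder",),
    ("models",),
    ("defaults", "image"),
    ("defaults", "role"),
    ("train", "instance_type"),
    ("deploy", "instance_type"),
]


def missing_keys(cfg):
    """Return the dotted paths from REQUIRED_PATHS that are absent in cfg."""
    missing = []
    for path in REQUIRED_PATHS:
        node = cfg
        for key in path:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing


def resolve_class(dotted_path):
    """Import 'package.module.ClassName' and return the class object."""
    module_name, _, class_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.ClassName', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


if __name__ == "__main__":
    print(missing_keys(config))
    # A standard-library class stands in for "models.RegressorModel" here,
    # because the tutorial's models module is not available in this sketch.
    print(resolve_class("collections.OrderedDict"))
```

In a real project the dict would come from parsing ml2p.yml, and the dotted path would point at a class in your own models module.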

The name of your project functions as a prefix to the names of SageMaker training jobs, models and endpoints that ML2P creates (since these names are global within a SageMaker account).

ML2P also tags all of the AWS objects it creates with your project name.
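The prefixing and tagging described above can be illustrated with a small sketch. The exact name format and tag key that ML2P uses are not stated here, so the "<project>-<name>" scheme and the "project" tag key below are assumptions, and prefixed_name and project_tags are hypothetical helpers. The list-of-Key/Value-dicts shape is the tag format that the AWS SageMaker API accepts.

```python
def prefixed_name(project, name):
    """Build a resource name from the project name (hypothetical scheme)."""
    return f"{project}-{name}"


def project_tags(project, tag_key="project"):
    """Build AWS-style tags; the tag key is an assumption, not ML2P's own."""
    return [{"Key": tag_key, "Value": project}]


if __name__ == "__main__":
    print(prefixed_name("ml2p-tutorial", "bob"))  # ml2p-tutorial-bob
    print(project_tags("ml2p-tutorial"))
```

Prefixing keeps resources from different projects apart when they share one SageMaker account, and the tags let you filter or attribute resources by project.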

Tutorial

See https://ml2p.readthedocs.io/en/latest/tutorial/ for a step-by-step tutorial.
