
Make your models look pretty.


makeup Dependency Framework

Run Machine Learning/AI models, reproducibly, from ideation to production.

makeup strives to let Data Scientists write model code without being obliged to do much else. makeup is the connective tissue that plugs the different stages of building a model together.

This is not a processor library; it's a code-organization framework built to make your development easier.

Additional features may be developed in this library to provide services, like a web API to host models.

Why?

  • write less code.
  • promote a "model interface" for interoperable models.
  • target caching for reproducible and expedient execution.
  • simplified debugging, without hacking code.
  • production deployable code that's easy to test.
  • artifact rendering, to help with deployments.

How?

We're going to break our ML code down into smaller functional parts. These parts will be simple Python functions, and we will refer to them as targets. How big should a target be? A good rule of thumb is to start a new target anywhere you may want to print, save, or inspect variables or results.

Some example targets may be: load, prep (or feature generation), split, train

Getting Started

Take an ML project, like the Sklearn Iris Example. Let's start by creating a module named iris for our model. The following code will be added to iris/__init__.py.

loading data

No matter what you're doing, you'll want to load some data first. I'm hard-pressed to find an example where the program should hardcode the data source, but this seems to happen in every Jupyter notebook I've ever seen, so let's write a method to do it. This will be our default data, but you will be able to change data sources/sets at run time.

To implement that here we will use the datasets.load_iris function. Keep in mind, this block of code could just as easily load a CSV, call a database, or load any other data source. More on this later.

# iris/__init__.py
from sklearn import datasets

def load():
    """Returns reasonable "default data" for executon. Use in Juypter Notebooks.""" 
    iris = datasets.load_iris()
    return iris.data[:, :2], iris.target

# load = "data/yourdataset.tsv"

The Iris example loads an object and extracts two useful components from it:
a data array and the target numbers.

Notice the loaded iris variable wasn't returned, though it could have been. By returning a generic tuple of Python primitives, you avoid coupling your code to a data object. By explicitly stating your data requirements in the function's arguments, it becomes much easier to plug in different data sources and to unit test each method separately.
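That decoupling pays off as soon as you swap sources. As a sketch, here is a hypothetical drop-in replacement for load that reads a delimited file instead; the path and column layout are illustrative, not part of the iris example:

```python
import csv

def load_csv(path="data/iris.csv"):
    """Return (data, target) parsed from a CSV whose last column is the label."""
    data, target = [], []
    with open(path) as fh:
        for row in csv.reader(fh):
            *features, label = row          # all columns but the last are features
            data.append([float(x) for x in features])
            target.append(int(label))
    return data, target
```

Because it returns the same (data, target) tuple shape, downstream code never has to know the source changed.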

training on data

Now that you have your data, you will want to train your model against it. Rather than procedurally continuing our code, let's make another method which takes the previous function's returned values. Let's name those returned values sensibly: data and target.

from sklearn.svm import SVC

def train(data, target):
    """
    Further describing the inputs here will help later.
    data: a DataFrame with x, y, z column requirements.
    target: a list of numbers
    """
    clf = SVC()
    clf.fit(data, target)

    return clf

prediction

You have your SVC model at this point. Here we finish up by making a prediction.

def predict(clf, row):
    # SVC.predict expects a 2-D array: one row of features per sample.
    return clf.predict([row])

This uses a generic row, as our example does, but naming your parameters more explicitly may suit you better.
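Before wiring these targets up with makeup, it's worth sanity-checking that the three interfaces line up by chaining them by hand (this assumes scikit-learn is installed):

```python
from sklearn import datasets
from sklearn.svm import SVC

def load():
    iris = datasets.load_iris()
    return iris.data[:, :2], iris.target

def train(data, target):
    clf = SVC()
    clf.fit(data, target)
    return clf

def predict(clf, row):
    # SVC.predict expects a 2-D array: one row per sample.
    return clf.predict([row])

# Chain the targets manually: each output feeds the next input.
data, target = load()
clf = train(data, target)
label = predict(clf, data[0])
```

The output of each target is exactly the input of the next, which is the property makeup exploits below.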

Running the code...

We've defined three methods: load, train, and predict. These functions have implicit dependencies between them; we could write glue code to execute them in order, but that's where makeup comes in.

in a notebook

import iris
from makeup import run, target

target(iris.train, requires=iris.load)
run(iris, 'train')

On the command line, this could be executed with:

python -m makeup iris train

You may also override the data source with a URL.

python -m makeup iris train --load file://./data.csv

You could imagine dependencies getting more intricate:

from makeup import target
import examples.iris as iris

target(iris.features, requires=iris.load)
target(plot, requires=iris.features)

target(iris.split, requires=iris.features)
target(iris.train, requires=iris.split)
This describes the dependency graph:

load -> features |-> plot
                 \-> split -> train

Or, in abbreviated form:

from makeup import workflow
import examples.iris as iris

workflow({
    iris.features: iris.load,
    plot: iris.features,
    iris.split: iris.features,
    iris.train: iris.split,
})
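To see what the target/run pattern above is doing, here is a minimal sketch of the idea in plain Python: a registry mapping each function to its prerequisite, resolved recursively at run time. This is only an illustration of the dependency pattern, not makeup's actual implementation or API.

```python
# Registry of target -> prerequisite (None means no prerequisite).
_TARGETS = {}

def target(func, requires=None):
    """Register `func` as a target, optionally depending on another target."""
    _TARGETS[func] = requires

def run(func):
    """Run `func`, first running its prerequisite and feeding in the result."""
    requires = _TARGETS.get(func)
    if requires is None:
        return func()
    result = run(requires)
    # A tuple result is unpacked into positional arguments, so a
    # (data, target) return value matches a (data, target) signature.
    args = result if isinstance(result, tuple) else (result,)
    return func(*args)
```

Caching, CLI overrides, and artifact rendering are where makeup goes well beyond a toy resolver like this.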
