PyPaDS aims to add tracking functionality to machine learning libraries.

PyPads

Building on the MLflow toolset, this project aims to extend MLflow's functionality and increase automation, thereby reducing the workload for the user. Producing structured results is an additional goal of the extension.

Installing

This tool requires the following libraries:

Python (>= 3.6)
cloudpickle (>= 1.3.3)
mlflow (>= 1.6.0)
boltons (>= 19.3.0)
loguru (>= 0.4.1)
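
The minimum versions listed above can be checked against an existing environment. The following is a minimal sketch, not part of PyPads: the helper names (as_tuple, meets_minimum, check) are hypothetical, and it assumes Python 3.8 or newer because it relies on the standard-library importlib.metadata module.

```python
import re
from importlib.metadata import PackageNotFoundError, version  # Python 3.8+

# Minimum versions copied from the requirements list.
MINIMUMS = {
    "cloudpickle": "1.3.3",
    "mlflow": "1.6.0",
    "boltons": "19.3.0",
    "loguru": "0.4.1",
}


def as_tuple(version_string):
    """Turn '1.6.0' into (1, 6, 0); stop at the first non-numeric piece."""
    parts = []
    for piece in version_string.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings numerically, padding the shorter with zeros."""
    a, b = as_tuple(installed), as_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check(minimums=MINIMUMS):
    """Map each package to True/False, or None when it is not installed."""
    report = {}
    for name, minimum in minimums.items():
        try:
            report[name] = meets_minimum(version(name), minimum)
        except PackageNotFoundError:
            report[name] = None
    return report


print(check())
```

The comparison is numeric per component, so "1.30.0" counts as newer than "1.6.0". Pre-release suffixes such as "rc1" are ignored by this sketch.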

PyPads only supports Python 3.6 and higher. To install PyPads, run one of the following in your terminal.

Using source code

First, install poetry and build the package from the root folder of the repository (pypads/):

pip install poetry
poetry build

This creates two files under pypads/dist, either of which can be used to install the package:

pip install dist/pypads-X.X.X.tar.gz
OR
pip install dist/pypads-X.X.X-py3-none-any.whl

Using pip (PyPI release)

The package is available on PyPI and can be installed with pip:

pip install pypads

Tests

The unit tests can be found under 'test/' and can be executed using:

poetry run pytest test/

Documentation

For more information, see the official PyPads documentation.

Getting Started

Usage example

PyPads is easy to use. Define what should be tracked in the config and instantiate PyPads.

A simple example looks like the following:

from pypads.app.base import PyPads
# define the configuration, in this case we want to track the parameters, 
# outputs and the inputs of each called function included in the hooks (pypads_fit, pypads_predict)
hook_mappings = {
    "parameters": {"on": ["pypads_fit"]},
    "output": {"on": ["pypads_fit", "pypads_predict"]},
    "input": {"on": ["pypads_fit"]}
}
# A simple initialization of the class will activate the tracking
PyPads(hooks=hook_mappings)

# An example using scikit-learn
from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier

# load the iris dataset
dataset = datasets.load_iris()

# fit a model to the data
model = DecisionTreeClassifier()
model.fit(dataset.data, dataset.target) # pypads will track the parameters, output, and input of the model fit function.
# get the predictions
predicted = model.predict(dataset.data) # pypads will track only the output of the model predict function.
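
To make the configuration easier to read, the following minimal sketch lists which events fire for a given hook. It only inspects the dictionary from the example above and does not use or reproduce PyPads internals; the helper events_for is a hypothetical name introduced for illustration.

```python
# Same configuration as in the usage example above.
hook_mappings = {
    "parameters": {"on": ["pypads_fit"]},
    "output": {"on": ["pypads_fit", "pypads_predict"]},
    "input": {"on": ["pypads_fit"]},
}


def events_for(hook, mappings):
    """List the events whose "on" list contains the given hook (hypothetical helper)."""
    return [event for event, config in mappings.items() if hook in config["on"]]


print(events_for("pypads_fit", hook_mappings))      # ['parameters', 'output', 'input']
print(events_for("pypads_predict", hook_mappings))  # ['output']
```

This mirrors the comments in the example: a fit call is tracked for parameters, output, and input, while a predict call is tracked for output only.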

The hooks used for each event are defined in the mapping file, where each hook represents the functions to listen to. Users can use regular expressions to group functions and can even provide paths to hook functions. In the sklearn mapping YAML file, an example entry would be:

fragments:
  default_model:
    !!python/pPath __init__:
      hooks: "pypads_init"
    !!python/rSeg (fit|.fit_predict|fit_transform)$:
      hooks: "pypads_fit"
    !!python/rSeg (fit_predict|predict|score)$:
      hooks: "pypads_predict"
    !!python/rSeg (fit_transform|transform)$:
      hooks: "pypads_transform"

mappings:
  !!python/pPath sklearn:
    !!python/pPath base.BaseEstimator:
      ;default_model: ~

For instance, "pypads_fit" is an event listener on any fit, fit_predict, or fit_transform call made by the tracked model class, which in this case is BaseEstimator, the class that most estimators inherit from.
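
The grouping done by the regular expressions in the mapping file can be tried out in isolation. The sketch below copies the three patterns from the YAML entry and applies them with re.search to dotted names. Both choices are assumptions made for illustration: how PyPads itself applies the patterns may differ, and hooks_for is a hypothetical helper.

```python
import re

# Regex segments copied from the mapping file above, keyed by hook name.
HOOK_PATTERNS = {
    "pypads_fit": r"(fit|.fit_predict|fit_transform)$",
    "pypads_predict": r"(fit_predict|predict|score)$",
    "pypads_transform": r"(fit_transform|transform)$",
}


def hooks_for(qualified_name):
    """Return the hooks whose pattern matches the end of a dotted name.

    Assumption: patterns are applied with re.search; PyPads' own matcher may differ.
    """
    return [
        hook
        for hook, pattern in HOOK_PATTERNS.items()
        if re.search(pattern, qualified_name)
    ]


for name in (
    "BaseEstimator.fit",
    "BaseEstimator.fit_predict",
    "BaseEstimator.fit_transform",
    "BaseEstimator.predict",
):
    print(name, "->", hooks_for(name))
```

Under these assumptions a single function can belong to more than one hook: fit_predict matches both "pypads_fit" and "pypads_predict", and fit_transform matches both "pypads_fit" and "pypads_transform".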

Without custom YAML types and fragments, the mapping file would be equivalent to the following definition:

mappings:
  :sklearn:
    :base.BaseEstimator:
      :__init__:
        hooks: "pypads_init"
      :{re:(fit|.fit_predict|fit_transform)$}:
        hooks: "pypads_fit"
      :{re:(fit_predict|predict|score)$}:
        hooks: "pypads_predict"
      :{re:(fit_transform|transform)$}:
        hooks: "pypads_transform"
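
The relationship between the two listings can be pictured as replacing each fragment reference with the fragment's content. The sketch below illustrates that idea with plain Python dictionaries instead of YAML. It is a conceptual illustration only: the expand function is hypothetical, the reading of the ";" prefix as a fragment reference is inferred from comparing the two listings, and PyPads' actual parser is not reproduced here.

```python
# Plain-dict stand-ins for the "fragments" and "mappings" sections.
fragments = {
    "default_model": {
        "__init__": {"hooks": "pypads_init"},
        "(fit|.fit_predict|fit_transform)$": {"hooks": "pypads_fit"},
        "(fit_predict|predict|score)$": {"hooks": "pypads_predict"},
        "(fit_transform|transform)$": {"hooks": "pypads_transform"},
    }
}
mappings = {"sklearn": {"base.BaseEstimator": {";default_model": None}}}


def expand(node, fragments):
    """Recursively inline every ';name' key with the fragment called 'name'."""
    if not isinstance(node, dict):
        return node
    expanded = {}
    for key, value in node.items():
        if isinstance(key, str) and key.startswith(";"):
            # Assumed meaning of ';name': insert the fragment's entries here.
            expanded.update(expand(fragments[key[1:]], fragments))
        else:
            expanded[key] = expand(value, fragments)
    return expanded


flat = expand(mappings, fragments)
for segment, entry in flat["sklearn"]["base.BaseEstimator"].items():
    print(segment, "->", entry["hooks"])
```

After expansion, the entry for base.BaseEstimator contains the four segments directly, which corresponds to the structure of the fragment-free listing.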

Acknowledgement

This work has been partially funded by the Bavarian Ministry of Economic Affairs, Regional Development and Energy by means of the funding program "Internetkompetenzzentrum Ostbayern" as well as by the German Federal Ministry of Education and Research in the project "Provenance Analytics" with grant agreement number 03PSIPT5C.
