A framework to define a machine learning pipeline

mlpipeline

This is a simple framework for organizing your machine learning workflow. It automates most of the basic functionality, such as logging, provides a framework for testing models, and glues together the different steps at different stages. The project came about as a result of me abstracting the boilerplate code and automating different parts of the process.

The aim of this framework is to consolidate the different sub-problems (such as loading data, model configuration, the training process, the evaluation process, exporting trained models, etc.) encountered when working with or researching machine learning models. The user defines how each sub-problem is solved using their choice of tools, and mlpipeline handles piecing them together.
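
As an illustration of this division of labour, the sketch below uses plain Python only and is not mlpipeline's actual API. The names `State`, `Step`, `run_steps`, `load_data`, `train` and `evaluate` are hypothetical. The user supplies each sub-problem as a function, and a small driver pieces them together by passing a shared state dictionary from one step to the next.

```python
from typing import Any, Callable, Dict, List

# Hypothetical names throughout; this is not mlpipeline's API.
State = Dict[str, Any]
Step = Callable[[State], State]


def load_data(state: State) -> State:
    # Toy dataset: points on the line y = 2x.
    state["data"] = [(x, 2 * x) for x in range(5)]
    return state


def train(state: State) -> State:
    # "Training": least-squares slope of a line through the origin.
    num = sum(x * y for x, y in state["data"])
    den = sum(x * x for x, _ in state["data"])
    state["slope"] = num / den
    return state


def evaluate(state: State) -> State:
    # Mean absolute error of the fitted line on the same data.
    errors = [abs(y - state["slope"] * x) for x, y in state["data"]]
    state["mae"] = sum(errors) / len(errors)
    return state


def run_steps(steps: List[Step]) -> State:
    """Glue the user-defined steps together in order."""
    state: State = {}
    for step in steps:
        state = step(state)
    return state


result = run_steps([load_data, train, evaluate])
```

Each step only needs to agree on the keys it reads and writes, so any one of them can be swapped for a version built on a different library without touching the driver.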

Core operations

This framework chains the different operations (sub-problems) depending on the mode it is executed in. mlpipeline currently has 3 modes:

  • TEST mode: No logging or tracking is performed. A temporary empty directory is created to store the artifacts of the experiment. Use this mode when developing and testing the different operations.
  • RUN mode: Logging and tracking are performed. In addition, for each experiment run (referred to as an experiment version in mlpipeline), a directory is created in which artifacts are stored.
  • EXPORT mode: The export-related operations are executed instead of the training/evaluation-related operations.
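
The mode-dependent chaining described above can be sketched as follows. This is a minimal illustration in plain Python, not mlpipeline's implementation: the `Mode` enum, the `execute` function and the operation names `train`, `evaluate` and `export` are all hypothetical. Where EXPORT mode stores its artifacts is not stated above, so the sketch assumes it reuses the per-version directory.

```python
import enum
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


class Mode(enum.Enum):
    # Hypothetical enum mirroring the three modes described above.
    TEST = "test"
    RUN = "run"
    EXPORT = "export"


def execute(mode: Mode, ops: Dict[str, Callable[[Path], None]],
            base_dir: Path, version: str) -> List[str]:
    """Run the operations selected by `mode` and return their names."""
    if mode is Mode.TEST:
        # TEST: a throwaway directory, no logging or tracking.
        artifact_dir = Path(tempfile.mkdtemp(prefix="experiment_test_"))
    else:
        # RUN: one directory per experiment version.
        # Assumption: EXPORT reuses the same per-version directory.
        artifact_dir = base_dir / version
        artifact_dir.mkdir(parents=True, exist_ok=True)

    if mode is Mode.EXPORT:
        selected = ["export"]
    else:
        selected = ["train", "evaluate"]

    # Each operation receives the directory it may write artifacts to.
    for name in selected:
        ops[name](artifact_dir)
    return selected
```

Keeping the mode check in one place means the user-defined operations do not need to know which mode they are running in; they only see the directory they are handed.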

In addition to providing different modes, the pipeline also supports logging and recording various details. Currently, mlpipeline records all logs, metrics and artifacts using basic log files as well as mlflow (https://github.com/databricks/mlflow).

The following information is recorded:

  • The scripts that were executed/imported in relation to an experiment
  • Any output results
  • The metrics and parameters
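
To give a rough idea of what recording parameters and metrics to both a basic log file and a tracker can look like, here is a minimal sketch. It is not how mlpipeline does it internally: `record_run` and its arguments are hypothetical. The `tracker` argument is assumed to be any object exposing `log_param(key, value)` and `log_metric(key, value)`; the `mlflow` module provides functions with those names, so it could be passed in directly.

```python
import json
import logging
from pathlib import Path
from typing import Any, Dict, Optional


def record_run(log_path: Path, params: Dict[str, Any],
               metrics: Dict[str, float],
               tracker: Optional[Any] = None) -> None:
    """Write params and metrics to a log file and, optionally, a tracker."""
    # Hypothetical helper; a dedicated logger keeps each file separate.
    logger = logging.getLogger(f"experiment.{log_path.stem}")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.FileHandler(log_path)
    logger.addHandler(handler)
    try:
        logger.info("params: %s", json.dumps(params, sort_keys=True))
        logger.info("metrics: %s", json.dumps(metrics, sort_keys=True))
    finally:
        # Detach and close so repeated calls do not duplicate lines.
        logger.removeHandler(handler)
        handler.close()

    if tracker is not None:
        # Assumed interface: log_param / log_metric, as in mlflow.
        for key, value in params.items():
            tracker.log_param(key, value)
        for key, value in metrics.items():
            tracker.log_metric(key, value)
```

Calling `record_run(Path("run.log"), {"lr": 0.01}, {"accuracy": 0.9})` writes two lines to `run.log`; passing `tracker=mlflow` would additionally send the same values to the active mlflow run.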

Documentation

The documentation is hosted at ReadTheDocs (https://mlpipeline.readthedocs.io/).

Installing

mlpipeline can be installed directly from the Python Package Index using pip: pip install mlpipeline

Usage

This section is a work in progress.


