A framework to define a machine learning pipeline
mlpipeline
This is a simple framework to organize your machine learning workflow. It automates most of the basic functionality, such as logging, provides a framework for testing models, and glues together the different steps at different stages. This project came about as a result of my abstracting the boilerplate code and automating different parts of the process.
The aim of this simple framework is to consolidate the different sub-problems (such as loading data, model configurations, training process, evaluation process, exporting trained models, etc.) that arise when working or researching with machine learning models. The user defines how each sub-problem is solved using their choice of tools, and mlpipeline handles piecing them together.
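As an illustration of the idea of defining each sub-problem separately and letting a runner piece them together, here is a minimal sketch in plain Python. It does not use mlpipeline's API: the class `ToyExperiment`, its method names, and the function `run_pipeline` are hypothetical names chosen for this example only.

```python
from typing import Dict, List, Tuple

Data = List[Tuple[float, float]]


class ToyExperiment:
    """Hypothetical experiment: one method per sub-problem."""

    def load_data(self) -> Data:
        # Toy dataset where y = 2 * x.
        return [(float(x), 2.0 * x) for x in range(5)]

    def configure_model(self) -> Dict[str, float]:
        # The "model" is a single weight for y = weight * x.
        return {"weight": 0.0}

    def train(self, model: Dict[str, float], data: Data) -> Dict[str, float]:
        # Closed-form least-squares slope through the origin.
        numerator = sum(x * y for x, y in data)
        denominator = sum(x * x for x, _ in data)
        model["weight"] = numerator / denominator
        return model

    def evaluate(self, model: Dict[str, float], data: Data) -> Dict[str, float]:
        # Mean squared error of the fitted weight on the same data.
        errors = [(model["weight"] * x - y) ** 2 for x, y in data]
        return {"mse": sum(errors) / len(errors)}


def run_pipeline(experiment: ToyExperiment) -> Dict[str, float]:
    """Hypothetical runner that chains the sub-problems in order."""
    data = experiment.load_data()
    model = experiment.configure_model()
    model = experiment.train(model, data)
    return experiment.evaluate(model, data)


if __name__ == "__main__":
    print(run_pipeline(ToyExperiment()))
```

The point of the split is that each method can be swapped for a different tool (another data loader, another training loop) without touching the runner.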
Core operations
This framework chains the different operations (sub-problems) depending on the mode it is executed in. mlpipeline currently has 3 modes:
- TEST mode: When in TEST mode, it doesn't perform any logging or tracking. It creates a temporary empty directory in which the experiment stores its artifacts. Use this mode when developing and testing the different operations.
- RUN mode: In this mode, logging and tracking are performed. In addition, for each experiment run (referred to as an experiment version in mlpipeline), a directory is created to store artifacts.
- EXPORT mode: In this mode, the export-related operations are executed instead of the training/evaluation-related operations.
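The behaviour of the three modes described above can be sketched roughly as follows. This is not mlpipeline's implementation: `Mode`, `artifact_dir`, `operations_for` and `tracking_enabled` are hypothetical names, and the handling of directories and tracking in EXPORT mode is an assumption, since the description does not specify it.

```python
import tempfile
from enum import Enum
from pathlib import Path
from typing import List


class Mode(Enum):
    TEST = "test"
    RUN = "run"
    EXPORT = "export"


def artifact_dir(mode: Mode, root: str, version: str) -> Path:
    """Return the directory an experiment would store artifacts in."""
    if mode is Mode.TEST:
        # TEST mode: a fresh, empty temporary directory.
        return Path(tempfile.mkdtemp(prefix="experiment_test_"))
    # RUN mode: one directory per experiment version.
    # Assumption: EXPORT mode reuses the same per-version layout.
    path = Path(root) / version
    path.mkdir(parents=True, exist_ok=True)
    return path


def operations_for(mode: Mode) -> List[str]:
    """Return the names of the operations chained in a given mode."""
    if mode is Mode.EXPORT:
        return ["export"]
    return ["train", "evaluate"]


def tracking_enabled(mode: Mode) -> bool:
    # Only RUN mode is described as logging and tracking.
    # Assumption: EXPORT mode does not track.
    return mode is Mode.RUN


if __name__ == "__main__":
    for mode in Mode:
        print(mode.name, operations_for(mode), tracking_enabled(mode))
```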
In addition to providing different modes, the pipeline also supports logging and recording various details. Currently mlpipeline records all logs, metrics and artifacts using basic log files as well as mlflow (https://github.com/databricks/mlflow).
The following information is recorded:
- The scripts that were executed/imported in relation to an experiment.
- Any output results
- The metrics and parameters
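To make the recorded items concrete, the sketch below appends one JSON line per experiment run to a basic log file using only the standard library. It mirrors the "basic log file" side of the recording only; it does not call mlflow and is not mlpipeline's own logging code. The function `record_run`, the file name `experiment.log`, and the record layout are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Iterable, List


def record_run(
    directory: str,
    scripts: Iterable[str],
    params: Dict[str, Any],
    metrics: Dict[str, float],
    outputs: List[str],
) -> Path:
    """Append one run record to a log file and return the file's path."""
    run_dir = Path(directory)
    run_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "scripts": sorted(scripts),  # scripts executed/imported for the run
        "outputs": outputs,          # names of any output results
        "params": params,            # parameters used for the run
        "metrics": metrics,          # metrics produced by the run
    }
    log_path = run_dir / "experiment.log"  # hypothetical file name
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
    return log_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = record_run(
            tmp,
            scripts=["train.py", "data.py"],
            params={"learning_rate": 0.01},
            metrics={"accuracy": 0.9},
            outputs=["model.bin"],
        )
        print(path.read_text(encoding="utf-8"))
```

Appending one line per run keeps earlier records intact, so the file doubles as a simple history of experiment versions.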
Documentation
The documentation is hosted at ReadTheDocs (https://mlpipeline.readthedocs.io/).
Installing
mlpipeline can be installed directly from the Python Package Index using pip: pip install mlpipeline
Usage
work in progress
Built Distribution
Hashes for mlpipeline-2.0a3.post1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | e0451e00f204de54b9687b12546b675ea3220de6e60249ed08eb72aa91f03682
MD5 | 20418983d2fddbb1b7f862793c749078
BLAKE2b-256 | 623c25b187c0e3e6c0738446e9e415e2a75c964ef5be144d820669cf13950a81