mle-toolbox

Machine Learning Experiment Toolbox

Coming up with the right hypothesis to test is hard; testing it should be easy. Often one needs to coordinate different types of experiments on separate remote resources. The MLE-Toolbox is designed to facilitate your workflow by providing a common interface, standardized logging, many common experiment types (multi-seed/multi-config runs, grid searches, and hyperparameter optimization pipelines), and easy retrieval of results. You can run experiments on your local machine, on Slurm and Sun Grid Engine clusters, and on Google Cloud compute instances.

Here are 4 steps to get started with running your distributed jobs:

1. Follow the instructions below to install the mle-toolbox and to set up your credentials/configurations.
2. Read the documentation explaining the pillars of the toolbox & how to compose the meta-configuration job .yaml files for your experiments.
3. Check out the examples to get started with a toy ODE integration, MNIST-CNN training or an example of how to train a PPO agent.
4. Start up your own experiments using the template files.

Installing `mle_toolbox` & dependencies

If you want to use the toolbox on your local machine, follow the instructions locally. Otherwise, do so on your respective remote resource (Slurm, SGE, or GCP). A simple PyPI installation can be done via:

pip install mle-toolbox

Alternatively, you can clone this repository and afterwards 'manually' install the toolbox (preferably in a clean Python 3.6 environment):

git clone https://github.com/RobertTLange/mle-toolbox.git
cd mle-toolbox
pip install -e .

This will install all required dependencies. Please note that the toolbox is tested only for Python 3.6.
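
Since the toolbox is tested only on Python 3.6, it can help to confirm the interpreter version of your environment before installing. The following is a minimal sketch; the helper name `is_tested_python` is hypothetical and not part of the mle-toolbox.

```python
import sys

# The README states that the toolbox is tested only on Python 3.6.
TESTED_VERSION = (3, 6)


def is_tested_python(version_info=None):
    """Return True if the (major, minor) version matches the tested release.

    Hypothetical helper for illustration; not part of the mle-toolbox.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == TESTED_VERSION


if __name__ == "__main__":
    if is_tested_python():
        print("Running on the tested Python version.")
    else:
        current = "{}.{}".format(*sys.version_info[:2])
        print("Warning: Python {} is untested; the toolbox targets 3.6.".format(current))
```

Other Python versions may still work; the check only flags that they fall outside what the README says has been tested.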

Setting up your Remote Credentials & Configuration

By default the toolbox will only run locally and without any GCS storage of your experiments. If you want to integrate the mle-toolbox with your remote resources, please edit the template_config.toml template. This consists of 4 optional steps:

1. Set whether you want to store all results and your database locally or remotely in a Google Cloud Storage bucket.
2. Add the Slurm credentials as well as cluster-specific details (headnode names, partitions, proxy server for internet) and default job arguments.
3. Add the SGE credentials as well as cluster-specific details (headnode names, queues, proxy server for internet) and default job arguments.
4. Add the path to your GCP credentials .json file as well as the project and GCS bucket name used to store your experiment data (as well as the protocol database).

Afterwards, please move the template to your home directory and rename it to mle_config.toml.

mv templates/template_config.toml ~/mle_config.toml

Note: If you only intend to use a single resource, update only the configuration for that resource.
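
To confirm that the configuration file ended up where the instructions above place it, a quick check such as the following may be useful. It is only a sketch: the function name `find_mle_config` is hypothetical, and the check looks solely for the file `~/mle_config.toml` named above without inspecting its contents.

```python
from pathlib import Path


def find_mle_config(home=None):
    """Return the path to mle_config.toml in the home directory, or None.

    Hypothetical helper for illustration; not part of the mle-toolbox.
    `home` can be overridden, which is mainly useful for testing.
    """
    home = Path(home) if home is not None else Path.home()
    config_path = home / "mle_config.toml"
    return config_path if config_path.is_file() else None


if __name__ == "__main__":
    found = find_mle_config()
    if found is None:
        print("No mle_config.toml found in the home directory.")
    else:
        print("Found configuration at {}".format(found))
```

If the file is missing, the README indicates the toolbox runs locally and without GCS storage by default.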

The 4 Commands of the Toolbox

You are now ready to dive deeper into the specifics of job configuration and can start running your first experiments from the cluster (or locally on your machine) with the following commands:

Start up an experiment: run-experiment <experiment_config>.yaml
Monitor resource utilisation: monitor-cluster
Retrieve the experiment results: retrieve-experiment
Create a one-page experiment report: report-experiment
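
After installation, the four commands listed above should be available on your search path. The sketch below uses the standard-library `shutil.which` to report any that are not found; the helper name `missing_commands` is hypothetical and not part of the mle-toolbox.

```python
import shutil

# Command names as listed in the README.
TOOLBOX_COMMANDS = (
    "run-experiment",
    "monitor-cluster",
    "retrieve-experiment",
    "report-experiment",
)


def missing_commands(commands=TOOLBOX_COMMANDS, path=None):
    """Return the commands that cannot be found on the search path.

    Hypothetical helper for illustration. `path` defaults to the PATH
    environment variable, as in shutil.which.
    """
    return [cmd for cmd in commands if shutil.which(cmd, path=path) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Not found on PATH: {}".format(", ".join(missing)))
    else:
        print("All toolbox commands are available.")
```

A missing command usually means the environment in which the toolbox was installed is not the active one.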

Examples & Getting Your First Job Running

Euler ODE - Integrate a simple ODE using forward Euler & get to know the toolbox.
MNIST CNN - Train a CNN on multiple random seeds & different training configurations.
Pendulum PPO - Search through the hyperparameter space of a PPO agent.

The PPO examples depend on another package of mine: drl-toolbox. Note: This has not been open-sourced yet. Contact me if you want to run it!

Notes, Development & Questions

If you find a bug or would like to see a feature implemented, feel free to contact me @RobertTLange or create an issue.
You can run all unit/integration tests from mle-toolbox/ with pytest (run locally & remotely).
Details on how to submit jobs with qsub
More notes on the SGE system
On Slurm it can make sense to start up a job for the experiment management in a screen/tmux session for monitoring of many jobs:

screen
source activate mle-env
salloc --job-name "InteractiveJob" --cpus-per-task 8 --mem-per-cpu 1500 --time 04:30:00 --partition standard
ssh <allocated_id>

Release history

0.3.4: Mar 8, 2023
0.3.3: Dec 10, 2021
0.3.2: Dec 10, 2021
0.3.1: Oct 21, 2021
0.3.0: Aug 21, 2021
0.2.9: Jun 23, 2021
0.2.8: May 6, 2021
0.2.7: Apr 24, 2021
0.2.6: Apr 9, 2021
0.2.5: Apr 5, 2021
0.2.4: Feb 16, 2021
0.2.3: Feb 16, 2021 (this version)
0.2.2: Oct 14, 2020
0.2.1: Oct 6, 2020

Download files

Source Distribution: mle_toolbox-0.2.3.tar.gz (4.5 kB), uploaded Feb 16, 2021

Built Distribution: mle_toolbox-0.2.3-py3-none-any.whl (4.9 kB), uploaded Feb 16, 2021, Python 3

Hashes for mle_toolbox-0.2.3.tar.gz
Algorithm	Hash digest
SHA256	`649858b1944103efaf3e82ab90f208d77de240ff02a71cf7151a890da1ed4099`
MD5	`1073f9fe2f0cc75dfe125cbb9b9a0685`
BLAKE2b-256	`72682ca4fbc1fff3aeecab057ef30d6207eedbdb440a5676d10282646804595a`

Hashes for mle_toolbox-0.2.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0ba509f7f3b6888f801538d3c705ae460ee85d1bdedf845e7fb623663e548d7c`
MD5	`69df6abe94b26241fa9e671e0c296f47`
BLAKE2b-256	`e9289f4e3e30a428f4e8eb7f314baab5f5ba0e4e84c694a22f67d6b799355aef`
