
Machine Learning Experiment Toolbox


Lightweight Distributed ML Experiments Management 🛠️


Coming up with the right hypothesis is hard - testing it should be easy.

ML researchers need to coordinate different types of experiments on separate remote resources. The Machine Learning Experiment (MLE)-Toolbox is designed to streamline this workflow by providing a simple interface, standardized logging, and many common ML experiment types (multi-seed/multi-configuration runs, grid searches and hyperparameter optimization pipelines). You can run experiments on your local machine, on high-performance compute clusters (Slurm and Sun Grid Engine), and on cloud VMs (GCP). The results are archived (locally or in a GCS bucket) and can easily be retrieved or automatically summarized/reported.


What Does The mle-toolbox Provide? 🧑‍🔧

  1. API for launching jobs on cluster/cloud computing platforms (Slurm, GridEngine, GCP).
  2. Common machine learning research experiment setups:
    • Launching and collecting multiple random seeds in parallel/batches or async.
    • Hyperparameter searches: Random, Grid, SMBO, PBT, Nevergrad, etc.
    • Pre- and post-processing pipelines for data preparation/result visualization.
  3. Automated report generation for hyperparameter search experiments.
  4. Storage/retrieval of results and database in Google Cloud Storage Bucket.
  5. Resource monitoring with dashboard visualization.
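
To make the experiment types in item 2 concrete, here is a minimal, toolbox-independent sketch in plain Python of a grid search in which every configuration is evaluated over several random seeds in parallel. It does not use the mle-toolbox API: `train`, `grid_search` and the toy quadratic objective are hypothetical names chosen purely for illustration.

```python
import itertools
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


def train(config: dict, seed: int) -> float:
    """Hypothetical training run: returns a noisy loss for one config/seed."""
    rng = random.Random(seed)  # seed-local RNG keeps each run reproducible
    loss = (config["lr"] - 0.1) ** 2 + (config["batch_size"] - 32) ** 2 * 1e-4
    return loss + rng.gauss(0.0, 1e-3)


def grid_search(param_grid: dict, seeds: list, max_workers: int = 4) -> list:
    """Evaluate every grid configuration on every seed; sort by mean loss."""
    keys = sorted(param_grid)
    configs = [dict(zip(keys, values))
               for values in itertools.product(*(param_grid[k] for k in keys))]
    # One job per (configuration, seed) pair, executed in parallel.
    jobs = [(cfg, seed) for cfg in configs for seed in seeds]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        losses = list(pool.map(lambda job: train(*job), jobs))
    results = []
    for i, cfg in enumerate(configs):
        # pool.map preserves job order, so each config owns one slice.
        per_seed = losses[i * len(seeds):(i + 1) * len(seeds)]
        results.append({"config": cfg,
                        "mean_loss": statistics.mean(per_seed),
                        "std_loss": statistics.pstdev(per_seed)})
    return sorted(results, key=lambda r: r["mean_loss"])


if __name__ == "__main__":
    grid = {"lr": [0.01, 0.1, 1.0], "batch_size": [16, 32, 64]}
    for row in grid_search(grid, seeds=[0, 1, 2])[:3]:
        print(row)
```

In the toolbox itself, this kind of loop is described in an experiment configuration file and scheduled on the chosen resource; the sketch only shows the shape of the computation being coordinated.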


The 5 Step mle-toolbox Cooking Recipe 🍲

  1. Follow the instructions below to install the mle-toolbox and set up your credentials/configuration.
  2. Learn more about the individual infrastructure subpackages with the dedicated tutorial.
  3. Read the docs explaining the pillars of the toolbox & the experiment meta-configuration job .yaml files.
  4. Check out the example workflows 📄 to get started.
  5. Run your own experiment using the template files/project and mle run.

Installation ⏳

If you want to use the toolbox on your local machine, run the installation there. Otherwise, install it on the cluster resource you plan to use (Slurm/SGE). A PyPI installation is available via:

pip install mle-toolbox

If you want to get the most recent commit, please install directly from the repository:

pip install git+https://github.com/mle-infrastructure/mle-toolbox.git@main
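
After installing, a quick sanity check is to ask Python which version of the distribution it can see. The sketch below uses only the standard library (`importlib.metadata`); the helper name `installed_version` is illustrative, and the distribution name `mle-toolbox` is taken from the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "mle-toolbox") -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("mle-toolbox is not installed in this environment.")
    else:
        print(f"mle-toolbox {version} is installed.")
```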

The Core Toolbox Subcommands 🌱

You are now ready to dive deeper into the specifics of experiment configuration and can start running your first experiments from the cluster (or locally on your machine) with the following commands:

| Command | Description |
| --- | --- |
| 🚀 mle run | Start up an experiment (multi-config/seeds, search). |
| 🖥️ mle monitor | Monitor resource utilisation (mle-monitor wrapper). |
| 📥 mle retrieve | Retrieve experiment result from GCS/cluster. |
| 💌 mle report | Create an experiment report with figures. |
| ⏳ mle init | Setup of credentials & toolbox settings. |
| 🔄 mle sync | Extract all GCS-stored results to your local drive. |
| 🗂 mle project | Initialize a new project by cloning mle-project. |
| 📝 mle protocol | List a summary of the most recent experiments. |

You can find more documentation for each subcommand here.
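
If you prefer to trigger the CLI from a Python script (for example inside a larger automation pipeline), a thin `subprocess` wrapper is enough. This is a sketch under assumptions: the subcommand names come from the table above, but passing the experiment .yaml file as a positional argument to `mle run`, the helper names, and the file name `experiment.yaml` are assumptions that should be checked against the subcommand documentation.

```python
import shutil
import subprocess
from typing import List, Sequence

# Subcommand names as listed in the table above.
SUBCOMMANDS = ("run", "monitor", "retrieve", "report",
               "init", "sync", "project", "protocol")


def build_mle_command(subcommand: str, args: Sequence[str] = ()) -> List[str]:
    """Assemble an argv list such as ['mle', 'run', 'experiment.yaml']."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"Unknown mle subcommand: {subcommand!r}")
    return ["mle", subcommand, *args]


def call_mle(subcommand: str, args: Sequence[str] = ()) -> int:
    """Run the command and return its exit code; requires `mle` on PATH."""
    if shutil.which("mle") is None:
        raise FileNotFoundError("The `mle` executable was not found on PATH.")
    return subprocess.run(build_mle_command(subcommand, args)).returncode


if __name__ == "__main__":
    # Assumption: `mle run` takes the experiment config as a positional argument.
    print(" ".join(build_mle_command("run", ["experiment.yaml"])))
```

Building the argument list separately from executing it keeps the wrapper easy to test without the toolbox installed.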

Examples 📄 & Notebook Walkthroughs 📓

| Example | Job Types | Description |
| --- | --- | --- |
| 📄 Single-Objective | multi-configs, hyperparameter-search | Core experiment types. |
| 📄 Multi-Objective | hyperparameter-search | Multi-objective tuning. |
| 📄 Multi Bash | multi-configs | Bash-based jobs. |
| 📄 Quadratic PBT | hyperparameter-search | PBT on toy quadratic surrogate. |
| 📄 Hyperband | hyperparameter-search | Hyperband on toy polynomial problem. |

| Notebook | Description | Colab |
| --- | --- | --- |
| 📓 Getting Started | Get started with the toolbox. | Colab |
| 📓 Subpackages | Get started with the toolbox subpackages. | Colab |
| 📓 MLExperiment | Introduction to MLExperiment wrapper. | Colab |
| 📓 Evaluation | Evaluation of gridsearch results. | Colab |
| 📓 GIF Animations | Walk through a set of animation helpers. | Colab |
| 📓 Testing | Perform hypothesis tests on logs. | Colab |

Acknowledgements & Citing the MLE-Infrastructure ✍️

If you use parts of the mle-toolbox in your research, please cite it as follows:

@software{mle_infrastructure2021github,
  author = {Robert Tjarko Lange},
  title = {{MLE-Infrastructure}: A Set of Lightweight Tools for Distributed Machine Learning Experimentation},
  url = {http://github.com/mle-infrastructure},
  year = {2021},
}

Development 👷

You can run the test suite via python -m pytest -vv tests/. If you find a bug or are missing your favourite feature, feel free to create an issue and/or start contributing 🤗.
