ml_benchmark - An ML Job for Benchmarking.
Project description
Basht - A Benchmarking Approach for Sustainable Hyperparameter Tuning
This repository supplies a job to benchmark hyperparameter tuning.
Start up
Prerequisites
- Ubuntu >= 16.04 (not tested on macOS)
- Python >= 3.7.0
- PyTorch
Note: If you run your benchmark on a GPU, make sure to install CUDA and the PyTorch version that matches your CUDA version.
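To verify that PyTorch can see your GPU, a quick check like the following can help (plain PyTorch, nothing specific to this benchmark):

```python
import torch

# Prints whether a CUDA-capable GPU is visible to PyTorch and the CUDA
# version this PyTorch build was compiled against (None for CPU-only builds).
print("CUDA available:", torch.cuda.is_available())
print("Built with CUDA:", torch.version.cuda)
```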
Install
- Clone the repository with
git clone <url>
- Create a Python environment with
python -m venv .venv
- Activate your environment with
source .venv/bin/activate
- Upgrade pip with
pip install pip --upgrade
- If not already installed, install PyTorch and Torchvision
- To install the benchmark and use it locally, clone the repository, switch to its root folder, and run
pip install -e .
Otherwise, you can also install the package from PyPI with pip install ml-benchmark
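To confirm the installation worked, a minimal import check can be run. This assumes the package is importable as ml_benchmark (based on the distribution name); adjust the module name if it differs:

```python
# Minimal post-install sanity check; "ml_benchmark" as the import name is an
# assumption derived from the distribution name and may differ.
import ml_benchmark
print("ml_benchmark installed at:", ml_benchmark.__file__)
```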
Ray Tune Example
To run the Ray Tune example experiment, install the following additional packages after completing the steps above:
pip install ray[tune]
pip install pandas
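As a quick smoke test that Ray Tune itself works, a generic toy run like the following can be used. This is not the repository's experiment, and the exact reporting API (tune.report vs. newer session/train.report) differs between Ray versions; the sketch follows the classic tune.run/tune.report pattern:

```python
from ray import tune

def objective(config):
    # Toy objective: the "loss" depends only on the learning rate.
    # In the actual experiment this would be a full MNIST training trial.
    loss = (config["lr"] - 0.01) ** 2
    tune.report(loss=loss)

analysis = tune.run(
    objective,
    config={"lr": tune.grid_search([0.001, 0.01, 0.1])},
    metric="loss",
    mode="min",
)
print("Best hyperparameter setting:", analysis.best_config)
```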
Optuna Minikube Example
For the Minikube example, install the requirements.txt in experiments/optuna_minikube, if present. Also make sure you have Minikube installed.
Before starting the Minikube example, you need to start your Minikube VM with minikube start. Then execute experiments/optuna_minikube/optuna_minikube_benchmark.py.
Class Explanation
Class | Description |
---|---|
MNISTTask | Use it to get the data for the model. Please do not change its configuration. |
MLPObjective | The job that needs to be executed. Adjustments should not be necessary. |
MLP | The model that is trained on the MNIST task. |
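For orientation, the following sketch shows in plain PyTorch/Torchvision roughly what these classes correspond to; it does not use the package's actual classes:

```python
import torch
from torch import nn
from torchvision import datasets, transforms

# A small MLP over flattened 28x28 MNIST images (roughly what the MLP class is).
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

# MNIST data loading (roughly what MNISTTask provides).
train_set = datasets.MNIST("data", train=True, download=True,
                           transform=transforms.ToTensor())
loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)

# One training epoch (roughly what MLPObjective executes per trial).
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
for images, labels in loader:
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```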
Benchmark Methodology
Each implementation uses a common experiment Docker container that represents the full lifecycle of a benchmarking experiment; see the lifecycle figure.
Lifecycle
The Lifecycle consists of 7 steps, which we describe in detail in the following table; a rough code sketch of the step ordering is shown after the table.
Step | Description |
---|---|
Deploy | Describes all deployment operations necessary to run the required components of a hyperparameter optimization (HPO) framework for the HPO task. With the completion of this step, the desired architecture of the HPO framework should be running on a platform; e.g., in the case of Kubernetes, this refers to the steps necessary to deploy all pods and services. |
Setup | All operations needed to initialize and start a trial, for instance the handover of relevant classes to other workers or the scheduling of a new worker. |
Trial | A Trial defines the loading, training, and validation of a model with a specific hyperparameter setting. A hyperparameter setting is one combination of hyperparameters that can be used to initialize a model, e.g. learning_rate=1e-2. |
Load | Primarily includes all I/O operations that are needed to provide a Trial with the required data, as well as the initialization of the model class with a certain hyperparameter setting. |
Train | The training procedure of a model, which computes model parameters to solve a classification or regression problem. Generally training is repeated for a fixed number of epochs. |
Validate | The trained model has to be validated on a given dataset. Validation captures the performance of a hyperparameter setting of a certain model. The performance of this model on the validation set is later used to find the best hyperparameter setting. |
Result Collection | The collection of models, classification/regression results, or other metrics for the problem at hand from each trial. After all trials have run, the results have to be consolidated for a comparison. Depending on the framework, this step might be a continuous process that ends once all trials are complete, or a process that is triggered after the framework under test observes the completion of all trials. However, for this benchmark we always measure the result collection as the time between the last completed trial and the identification of the best-performing hyperparameter set. |
Test | The final evaluation of the model that performed best across all trials on the validation set. The test results are thus the final results for the model with the best hyperparameter setting. |
Metric Collection | Describes the collection of all gathered metrics that are not used by the HPO framework itself (latencies, CPU resources, etc.). This step runs outside of the HPO framework. |
Un-Deploy | The clean-up procedure to undeploy all components of the HPO Framework that were deployed in the Deploy step. |
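As a rough illustration of the step ordering, the lifecycle can be pictured as the following skeleton. All names here are hypothetical placeholders, not the package's API:

```python
# Hypothetical lifecycle skeleton; the callables passed in are placeholders
# standing in for whatever the HPO framework under test actually does.
def run_benchmark(deploy, setup, trial, collect_results, test,
                  collect_metrics, undeploy, hyperparameter_grid):
    deploy()                                     # Deploy the HPO framework components
    setup()                                      # Setup: initialize and schedule workers
    trial_results = [trial(s) for s in hyperparameter_grid]   # Trial = Load + Train + Validate
    best_setting = collect_results(trial_results)             # Result Collection
    final_score = test(best_setting)             # Test the best setting on held-out data
    collect_metrics()                            # Metric Collection (latencies, resources, ...)
    undeploy()                                   # Un-Deploy: clean up everything from Deploy
    return final_score
```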
The docker container stub is located here.
Collaboration
Make sure to fork the repository. For information on how to fork, read here. When you start working on your code in your forked repository, make sure to create a new branch named collaborator_name/feature_name. While you work, keep the main branch of your forked repository up to date with the main branch of this repository, as it is a direct copy. How to sync your main branch with the one of this repository can be read here. Once your work is done, submit a pull request to the original repository (this one). A guide can be found here. For the pull request, the regular rules apply: provide a short description of the feature you are adding and make sure your code is in good shape (a decent amount of documentation, no extremely long script files). If you use Python, it is recommended to use Flake8.
System Components
Addresses all components that perform necessary tasks in an HPO run, e.g. scheduling, I/O operations, etc. A component is not necessarily one object within the framework; it can be a collection of multiple objects performing similar elemental tasks. A fair comparison between HPO frameworks would map all of them onto these components and measure their performed tasks against each other.
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file ml_benchmark-0.4.2.tar.gz.
File metadata
- Download URL: ml_benchmark-0.4.2.tar.gz
- Upload date:
- Size: 22.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.14
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 5104e7813e30291802dd1f559196a25bdd15f3a15d293ba2499aea03e3c62538 |
MD5 | 15f5a59bf2ee3bd7024c20fa79d7e501 |
BLAKE2b-256 | d81f647bec974e9437b0f051faf14690954112aa186549dd51c8ae67eb7009a6 |
File details
Details for the file ml_benchmark-0.4.2-py3-none-any.whl.
File metadata
- Download URL: ml_benchmark-0.4.2-py3-none-any.whl
- Upload date:
- Size: 25.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.14
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | c255bd8986c6105678e2d20b6e692cac7f11aa42efd585cad979e9770e44d5e0 |
MD5 | 47e449012043b751e8ea8c675b0b7fe0 |
BLAKE2b-256 | 8fe2877a8af515c0d9ab7bdf3890b77ff7e2a3c0155a4ae7923b70683595e5ae |