Lightning HPO

[experimental] Lightning HPO / Training Studio App

Lightning HPO provides a Pythonic implementation for scalable hyperparameter tuning.

This library relies on Optuna to provide state-of-the-art hyperparameter sampling algorithms and efficient trial pruning strategies.

It is built on the highly scalable and distributed Lightning App framework from lightning.ai.

The Training Studio App relies on Lightning HPO to run, show, stop, and delete Sweeps, Notebooks, TensorBoards, and more.

Installation

Create a new virtual environment with Python 3.8+.

python -m venv .venv
source .venv/bin/activate

Clone and install lightning-hpo.

git clone https://github.com/Lightning-AI/lightning-hpo.git
cd lightning-hpo
pip install -r requirements.txt -r requirements/test.txt --find-links https://download.pytorch.org/whl/cpu/torch_stable.html
pip install -e .

Run the tests to make sure everything works.

pytest tests --capture=no -v

Getting started

Imagine you want to optimize a simple function called objective inside an objective.py file.

def objective(x: float):
    return (x - 2) ** 2

Import a Sweep component, provide the path to your script, and declare the distributions you want to optimize over.

from lightning import LightningApp
from lightning_hpo import Sweep
from lightning_hpo.distributions import Uniform

app = LightningApp(
    Sweep(
        script_path="objective.py",
        n_trials=50,
        simultaneous_trials=10,
        direction="maximize",
        distributions={"x": Uniform(-10, 10)},
    )
)

Now, you can optimize it locally.

python -m lightning run app examples/1_app_agnostic.py

or with --cloud to run it in the cloud.

python -m lightning run app examples/1_app_agnostic.py --cloud

Note: locally, each trial runs in its own process, so there is some overhead if your objective is quick to run.
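
Each trial launches objective.py with a sampled value for x. As a rough sketch of what the full script could look like (assuming sampled values are passed as command-line arguments, which is an assumption here; the linked example shows the exact contract):

# objective.py - a minimal sketch; the argument-passing mechanism is an
# assumption, see the repository example for the real contract.
import argparse


def objective(x: float):
    return (x - 2) ** 2


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--x", type=float, required=True)
    args = parser.parse_args()
    print(f"objective({args.x}) = {objective(args.x)}")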

Find the example in examples/1_app_agnostic.py in the repository.

PyTorch Lightning Users

Here is how to launch 100 trials, 10 at a time, each on 2 nodes of 4 GPUs, in the cloud.

from lightning import LightningApp
from lightning_hpo.algorithm import OptunaAlgorithm
from lightning_hpo import Sweep, CloudCompute
from lightning_hpo.distributions import Uniform, IntUniform, Categorical, LogUniform

app = LightningApp(
    Sweep(
        script_path="train.py",
        n_trials=100,
        simultaneous_trials=10,
        distributions={
            "model.lr": LogUniform(0.001, 0.1),
            "model.gamma": Uniform(0.5, 0.8),
            "data.batch_size": Categorical([16, 32, 64]),
            "trainer.max_epochs": IntUniform(3, 15),
        },
        algorithm=OptunaAlgorithm(direction="maximize"),
        cloud_compute=CloudCompute("gpu-fast-multi", count=2),  # 2 * 4 V100
        framework="pytorch_lightning",
        logger="wandb",
        sweep_id="Optimizing a Simple CNN over MNIST with Lightning HPO",
    )
)

python -m lightning run app examples/2_app_pytorch_lightning.py --cloud --env WANDB_ENTITY={WANDB_ENTITY} --env WANDB_API_KEY={WANDB_API_KEY}
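
The keys in distributions (model.lr, data.batch_size, trainer.max_epochs, ...) mirror LightningCLI argument names. As a rough illustration of a train.py such a sweep could drive (a hypothetical, self-contained stand-in, not the repository's MNIST script; the argument-forwarding mechanism is an assumption):

# train.py - a self-contained sketch, NOT the repository's MNIST example.
# Assumption: with framework="pytorch_lightning", the sweep forwards sampled
# values as LightningCLI-style arguments (--model.lr, --model.gamma,
# --data.batch_size, --trainer.max_epochs).
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

import pytorch_lightning as pl
from pytorch_lightning.cli import LightningCLI


class RandomDataModule(pl.LightningDataModule):
    def __init__(self, batch_size: int = 32):
        super().__init__()
        self.batch_size = batch_size

    def train_dataloader(self):
        # Synthetic stand-in for a real dataset such as MNIST.
        x, y = torch.randn(256, 32), torch.randint(0, 10, (256,))
        return DataLoader(TensorDataset(x, y), batch_size=self.batch_size)


class LitClassifier(pl.LightningModule):
    def __init__(self, lr: float = 0.01, gamma: float = 0.7):
        super().__init__()
        self.save_hyperparameters()
        self.layer = nn.Linear(32, 10)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.cross_entropy(self.layer(x), y)
        self.log("train_loss", loss)  # logged metrics are what can be monitored
        return loss

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=self.hparams.lr)
        scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=self.hparams.gamma)
        return [optimizer], [scheduler]


if __name__ == "__main__":
    LightningCLI(LitClassifier, RandomDataModule)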

Find the example in examples/2_app_pytorch_lightning.py in the repository.

[Screenshot: Lightning App UI]

Convert from raw Optuna to a Lightning App

Below, we are going to convert Optuna's Efficient Optimization Algorithms example into a Lightning App.

The Optuna example optimizes the alpha value (e.g., a learning-rate-like hyperparameter) of an SGDClassifier from sklearn trained on the Iris dataset.
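
For reference, the raw Optuna version of this example looks roughly like the following (condensed from the Optuna tutorial; details may differ slightly):

import optuna
from sklearn import datasets
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split


def objective(trial):
    iris = datasets.load_iris()
    classes = list(set(iris.target))
    train_x, valid_x, train_y, valid_y = train_test_split(iris.data, iris.target, test_size=0.25, random_state=0)

    alpha = trial.suggest_float("alpha", 1e-5, 1e-1, log=True)
    clf = SGDClassifier(alpha=alpha)

    for step in range(100):
        clf.partial_fit(train_x, train_y, classes=classes)

        # Report the intermediate score so the pruner can stop bad trials early.
        trial.report(clf.score(valid_x, valid_y), step)
        if trial.should_prune():
            raise optuna.TrialPruned()

    return clf.score(valid_x, valid_y)


study = optuna.create_study(direction="maximize", pruner=optuna.pruners.MedianPruner())
study.optimize(objective, n_trials=20)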

The example above has been re-organized below in order to run as a Lightning App.

from lightning import LightningApp
from sklearn import datasets
import optuna
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from lightning_hpo.distributions import LogUniform
from lightning_hpo.algorithm import OptunaAlgorithm
from lightning_hpo import Objective, Sweep


class MyObjective(Objective):

    def objective(self, alpha: float):

        iris = datasets.load_iris()
        classes = list(set(iris.target))
        train_x, valid_x, train_y, valid_y = train_test_split(iris.data, iris.target, test_size=0.25, random_state=0)

        clf = SGDClassifier(alpha=alpha)

        # Name of the metric this objective reports to the sweep.
        self.monitor = "accuracy"

        for step in range(100):
            clf.partial_fit(train_x, train_y, classes=classes)
            intermediate_value = clf.score(valid_x, valid_y)

            # WARNING: Assign to self.reports rather than appending in place,
            # so the state is instantly sent to the flow.
            self.reports = self.reports + [[intermediate_value, step]]

        # The final score for this trial, reported back to the sweep.
        self.best_model_score = clf.score(valid_x, valid_y)


app = LightningApp(
    Sweep(
        objective_cls=MyObjective,
        n_trials=20,
        algorithm=OptunaAlgorithm(
            optuna.create_study(pruner=optuna.pruners.MedianPruner()),
            direction="maximize",
        ),
        distributions={"alpha": LogUniform(1e-5, 1e-1)}
    )
)

[Screenshot: Lightning App UI]

python -m lightning run app examples/3_app_sklearn.py

As you can see, several trials were pruned (stopped) before they finished all of the iterations, just as when using pure Optuna.

A new study created in memory with name: no-name-a93d848e-a225-4df3-a9c3-5f86680e295d
Trial 0 finished with value: 0.23684210526315785 and parameters: {'alpha': 0.006779437004523296}. Best is trial 0 with value: 0.23684210526315785.
Trial 1 finished with value: 0.07894736842105265 and parameters: {'alpha': 0.008936151407006062}. Best is trial 1 with value: 0.07894736842105265.
Trial 2 finished with value: 0.052631578947368474 and parameters: {'alpha': 0.0035836511240528008}. Best is trial 2 with value: 0.052631578947368474.
Trial 3 finished with value: 0.052631578947368474 and parameters: {'alpha': 0.0005393218926409795}. Best is trial 2 with value: 0.052631578947368474.
Trial 4 finished with value: 0.1578947368421053 and parameters: {'alpha': 6.572557493358585e-05}. Best is trial 2 with value: 0.052631578947368474.
Trial 5 finished with value: 0.02631578947368418 and parameters: {'alpha': 0.0013953760106345603}. Best is trial 5 with value: 0.02631578947368418.
Trial 6 pruned.
Trial 7 pruned.
Trial 8 pruned.
Trial 9 pruned.
Trial 10 finished with value: 0.07894736842105265 and parameters: {'alpha': 0.00555435554783454}. Best is trial 5 with value: 0.02631578947368418.
Trial 11 pruned.
Trial 12 finished with value: 0.052631578947368474 and parameters: {'alpha': 0.025624276147153992}. Best is trial 5 with value: 0.02631578947368418.
Trial 13 finished with value: 0.07894736842105265 and parameters: {'alpha': 0.014613957457075546}. Best is trial 5 with value: 0.02631578947368418.
Trial 14 pruned.
Trial 15 pruned.
Trial 16 pruned.
Trial 17 finished with value: 0.052631578947368474 and parameters: {'alpha': 0.01028208215647372}. Best is trial 5 with value: 0.02631578947368418.
Trial 18 pruned.
Trial 19 pruned.

Find the example in examples/3_app_sklearn.py in the repository.

Select your logger

Lightning HPO supports Wandb and Streamlit by default.

Sweep(..., logger="wandb")

python -m lightning run app app.py --env WANDB_ENTITY=YOUR_USERNAME --env WANDB_API_KEY=YOUR_API_KEY --cloud
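
To use the Streamlit logger instead, it should be as simple as switching the logger argument:

Sweep(..., logger="streamlit")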

Use advanced algorithms with your Lightning App

Here is how to use the latest research, such as the Hyperband paper.

from lightning_hpo.algorithm import OptunaAlgorithm
import optuna

Sweep(
    algorithm=OptunaAlgorithm(
        optuna.create_study(
            direction="maximize",
            pruner=optuna.pruners.HyperbandPruner(
                min_resource=1,
                max_resource=3,
                reduction_factor=3,
            ),
        )
    )
)

Learn more here

The Training App (WIP)

In terminal 1, run the Training Application.

python -m lightning run app examples/4_app_sweeper.py --env WANDB_ENTITY={ENTITY} --env WANDB_API_KEY={API_KEY}

In terminal 2, connect to the App and run your first sweep or start your notebook.

lightning connect localhost
lightning --help

You are connected to the local Lightning App.
Usage: lightning [OPTIONS] COMMAND [ARGS]...

  --help     Show this message and exit.

Lightning App Commands
  delete sweep
  download artefacts
  run notebook
  run sweep
  show artefacts
  show sweeps
  stop sweep

cd examples/scripts && lightning run sweep train.py --n_trials=3 --model.lr="log_uniform(0.001, 0.1)" --logger="wandb" --direction=maximize
