Lightning HPO
Project description
Research Studio App
The Research Studio App is a full-stack AI application built with the Lightning App framework for running experiments or sweeps with state-of-the-art hyper-parameter sampling algorithms, efficient experiment pruning strategies, and more.
Installation
Create a new virtual environment with python 3.8+
python -m venv .venv
source .venv/bin/activate
Clone and install lightning-hpo.
git clone https://github.com/Lightning-AI/lightning-hpo && cd lightning-hpo
pip install -e . -r requirements.txt --find-links https://download.pytorch.org/whl/cpu/torch_stable.html
Verify that everything works.
python -m lightning run app app.py
Check the documentation to learn more!
Run the Research Studio App locally
In your first terminal, run the Lightning App.
python -m lightning run app app.py
In a second terminal, connect to the Lightning App and download its CLI.
python -m lightning connect localhost -y
python -m lightning --help
Usage: lightning [OPTIONS] COMMAND [ARGS]...
--help Show this message and exit.
Lightning App Commands
create drive Create a Drive.
delete drive Delete a Drive.
delete experiment Delete an Experiment.
delete sweep Delete a Sweep.
download artifacts Download an artifact.
run experiment Run an Experiment.
run sweep Run a Sweep.
show artifacts Show artifacts.
show drives Show Drives.
show experiments Show Experiments.
show sweeps Show all Sweeps or the Experiments from a given Sweep.
stop experiment Stop an Experiment.
stop sweep Stop a Sweep.
You are connected to the local Lightning App. Return to the primary CLI with `lightning disconnect`.
Run your first Sweep from the sweep_examples/scripts folder
lightning run sweep train.py --model.lr "[0.001, 0.01, 0.1]" --data.batch "[32, 64]" --algorithm="grid_search" --requirements 'jsonargparse[signatures]>=4.15.2'
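For intuition, grid_search expands every combination of the swept values, so the command above with three learning rates and two batch sizes launches six experiments. A minimal plain-Python sketch of that expansion (an illustration only, not Lightning HPO's internal API):

```python
from itertools import product

# Hyper-parameter grid matching the sweep command above
grid = {
    "model.lr": [0.001, 0.01, 0.1],
    "data.batch": [32, 64],
}

# grid_search runs one experiment per combination of values
experiments = [dict(zip(grid.keys(), values)) for values in product(*grid.values())]

print(len(experiments))  # 6
print(experiments[0])    # {'model.lr': 0.001, 'data.batch': 32}
```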
Scale by running the Research Studio App in the Cloud
Below, we train a 1B+ parameter LLM with multi-node training in the cloud.
python -m lightning run app app.py --cloud
Connect to the App once ready.
python -m lightning connect {APP_NAME} -y
Find below an example that trains a 1.6B-parameter GPT-2 transformer model with DeepSpeed, using the Lightning Transformers library.
import pytorch_lightning as pl
from lightning_transformers.task.nlp.language_modeling import (
    LanguageModelingDataModule,
    LanguageModelingTransformer,
)
from transformers import AutoTokenizer

model_name = "gpt2-xl"
tokenizer = AutoTokenizer.from_pretrained(model_name)

model = LanguageModelingTransformer(
    pretrained_model_name_or_path=model_name,
    tokenizer=tokenizer,
    deepspeed_sharding=True,  # shard the model across GPUs at initialization
)

dm = LanguageModelingDataModule(
    batch_size=1,
    dataset_name="wikitext",
    dataset_config_name="wikitext-2-raw-v1",
    tokenizer=tokenizer,
)

trainer = pl.Trainer(
    accelerator="gpu",
    devices="auto",
    strategy="deepspeed_stage_3",
    precision=16,
    max_epochs=1,
)

trainer.fit(model, dm)
Run your first multi-node training experiment from the sweep_examples/scripts folder (2 nodes with 4 V100 GPUs each).
python -m lightning run experiment big_model.py --requirements deepspeed lightning-transformers==0.2.3 --num_nodes=2 --cloud_compute=gpu-fast-multi --disk_size=80
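As a rough sanity check on why two gpu-fast-multi nodes (8 GPUs total) suffice: under DeepSpeed ZeRO stage 3, the fp16 parameters, fp16 gradients, and fp32 Adam optimizer states are all sharded across ranks. A back-of-the-envelope estimate for a 1.6B-parameter model in mixed precision (this ignores activation memory and DeepSpeed's exact configuration, so treat it as an approximation):

```python
# Approximate per-GPU memory for model states under ZeRO stage 3.
# fp16 params (2 B) + fp16 grads (2 B) + fp32 Adam states
# (master params + momentum + variance = 12 B) ~= 16 bytes/param.
params = 1.6e9
bytes_per_param = 16
world_size = 2 * 4  # 2 nodes x 4 V100 GPUs

per_gpu_gb = params * bytes_per_param / world_size / 1024**3
print(f"{per_gpu_gb:.1f} GB of model states per GPU")  # ~3.0 GB
```

Sharded model states of roughly 3 GB per GPU leave ample headroom on 16 GB V100s for activations, even though the unsharded model would not fit on a single device.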
Project details
Download files
Source Distribution
Built Distribution
Hashes for lightning_hpo-0.0.4-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 3d78c85e5996c97ba3daeec26775e7e2d9cb001a3cce12f34f0c2acc6edae4c4
MD5 | 02b5a307ec88399e6fbbc0e9cefad471
BLAKE2b-256 | 0543b0c75f89760d4907bd82a716d66f634d4d6f647c4b5669bbb450b17ee3f0