
Project description

Experiment Launcher

Launch experiments locally or on a cluster running SLURM in a single file.
Forked from experiment_launcher.

Description

The experiment_launcher package provides a simple way to run multiple experiments using SLURM or Joblib with minimal effort: set the local parameter to True to run locally with Joblib, or to False to run on a cluster with SLURM.

It is particularly useful for running multiple seeds and/or testing multiple hyperparameter configurations, such as learning rates and batch sizes.

Installation

You can install the package from PyPI with:

pip install experiment-launcher-meco

To install from a local clone in editable mode, run:

pip install -e .

How to Use

Basic Usage

The best way to understand experiment launcher is to look at the basic example in examples/basic.

Single experiment

  • examples/basic/test.py consists of:
    • The function experiment is the entry point of your experiment

      • It takes as arguments your experiment settings (e.g., the number of layers in a neural network, the learning rate, ...)
      • The arguments need to be assigned a type and default value in the function definition
        • Currently accepted types are int, float, str, bool, and list
        • The arguments seed and results_dir must always be included
        • Python kwargs can also be added as **kwargs (accepted types are the same as above)
      • This function must be decorated, e.g. with @single_experiment, which takes care of creating the proper results directories.
    • The if __name__ == '__main__' block

      • This must contain one single line: run_experiment(experiment)
  • You can test your code by running
    cd examples/basic
    python test.py
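For reference, a minimal experiment file looks like the sketch below. The real decorator and runner come from the package (single_experiment, run_experiment); here a toy stand-in reproduces only the described behavior (creating the results directory), so the snippet is self-contained and runnable without installing anything.

```python
# Illustrative sketch of examples/basic/test.py. `single_experiment` below is
# a toy stand-in for the package decorator, which also handles seeding and
# directory naming; this mimic only creates the results directory.
import functools
import os
import tempfile


def single_experiment(func):
    @functools.wraps(func)
    def wrapper(**kwargs):
        # The real decorator creates the proper results directories for you.
        os.makedirs(kwargs['results_dir'], exist_ok=True)
        return func(**kwargs)
    return wrapper


@single_experiment
def experiment(
    n_layers: int = 2,            # experiment settings: typed, with defaults
    learning_rate: float = 1e-3,
    seed: int = 0,                # mandatory argument
    results_dir: str = './logs',  # mandatory argument
    **kwargs,                     # optional extra settings
):
    return f'layers={n_layers} lr={learning_rate} seed={seed}'


if __name__ == '__main__':
    # With the real package this block would contain the single line:
    # run_experiment(experiment)
    print(experiment(seed=0, results_dir=os.path.join(tempfile.gettempdir(), 'exp_demo')))
```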
    

Launch file

  • examples/basic/launch_test.py consists of:
    • Creating an instance of the LauncherConfig object, which contains the SLURM or Joblib (if run locally) parameters. The most important ones are listed below; for the full list, consult the class definition.
      • exp_name is the experiment name, under which a results directory will be created
      • exp_file is the path to the python file where the experiment is implemented (without the extension .py)
      • n_seeds is the number of random seeds for each single experiment configuration
    • Advanced configs are provided via nested Pydantic objects:
      • ResourceConfig specifies:
        • n_exps_in_parallel is the number of experiments to run in parallel. This is useful for running multiple jobs on a single GPU in the cluster
        • n_cores is the number of cores for each experiment. Note that if n_exps_in_parallel > 1, then n_exps_in_parallel jobs will share n_cores.
        • memory_per_core is the amount of memory in MB requested for each core in SLURM. If you set this too low, your job may fail for lack of memory.
      • SlurmConfig specifies:
        • partition is the SLURM partition, which is cluster dependent
        • gres specifies special resources (e.g., GPUs) requested for a SLURM job
        • project_name is the project name in the cluster
      • EnvironmentConfig specifies:
        • conda_env if you are using a conda environment, specify its name here
      • DurationConfig specifies the max runtime of the SLURM job (days, hours, minutes, seconds).
    • Creating an instance of the Launcher object: launcher = Launcher(config)
    • Adding experiments with launcher.add_experiment
      • Use launcher.add_experiment to create an experiment for a particular configuration (e.g., different learning rates)
      • You can use the Sweep class to automatically sweep over a list of parameters.
      • E.g. launcher.add_experiment(learning_rate=Sweep(values=[1e-3, 1e-4]), batch_size=32) creates two experiments: one with 1e-3 and another with 1e-4.
      • Swept parameters are automatically used to organize the results directories. By default, results directories are created as:
        • ./logs/exp_name_DATE/learning_rate_0.001/SEED/ and ./logs/exp_name_DATE/learning_rate_0.0001/SEED/
      • If multiple sweeps are provided, the Cartesian product of all sweep values is calculated to generate the experiments.
    • Running the experiments with launcher.run(LOCAL)
      • This runs your experiments either locally (LOCAL = True) or on the cluster (LOCAL = False)
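The Cartesian-product behavior of multiple sweeps can be sketched with plain itertools; Sweep here is a hypothetical stand-in for the package class, assumed to simply wrap a list of values:

```python
# Sketch of how multiple Sweeps expand into experiment configurations via a
# Cartesian product. `Sweep` is a hypothetical stand-in, not the package class.
import itertools


class Sweep:
    def __init__(self, values):
        self.values = values


def expand(**params):
    """Expand swept parameters into one flat config dict per experiment."""
    keys = list(params)
    # Fixed parameters behave like single-value sweeps.
    value_lists = [p.values if isinstance(p, Sweep) else [p] for p in params.values()]
    return [dict(zip(keys, combo)) for combo in itertools.product(*value_lists)]


configs = expand(learning_rate=Sweep([1e-3, 1e-4]), batch_size=Sweep([32, 64]), n_layers=2)
# Four experiments: (1e-3, 32), (1e-3, 64), (1e-4, 32), (1e-4, 64), each with n_layers=2.
```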

Running the experiment

  • To run the launcher, simply call
    cd examples/basic
    python launch_test.py
    
  • Log files will be placed in
    • ./logs if running locally or on the IAS cluster
    • /work/scratch/$USERNAME if run on the Lichtenberg-Hochleistungsrechner of TU Darmstadt

Integration with Weights and Biases

The experiment launcher provides an easy way to integrate with Weights and Biases.

  • In the experiment file add
    • **kwargs in the experiment function definition
  • In the launcher file, create the wandb options and pass them to launcher.add_experiment()
    wandb_options = dict(
      wandb_enabled=False,  # If True, runs and logs to wandb.
      wandb_entity='joaocorreiacarvalho',
      wandb_project='experiment_launcher_test',
      wandb_group='group_test'
    )
    
  • To use wandb, you need to install it with pip install wandb and log in with wandb login
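Because the experiment function accepts **kwargs, the wandb options simply flow through add_experiment into the experiment. The snippet below illustrates that kwargs plumbing with plain Python (no launcher or wandb required; the wandb.init call is only indicated in a comment):

```python
# Illustration of how wandb options reach the experiment through **kwargs:
# add_experiment just forwards keyword arguments to the experiment function.
def experiment(learning_rate: float = 1e-3, seed: int = 0,
               results_dir: str = './logs', **kwargs):
    if kwargs.get('wandb_enabled', False):
        # Here the real experiment would call
        # wandb.init(entity=..., project=..., group=...)
        pass
    return kwargs


wandb_options = dict(
    wandb_enabled=False,  # If True, runs and logs to wandb.
    wandb_entity='joaocorreiacarvalho',
    wandb_project='experiment_launcher_test',
    wandb_group='group_test',
)

received = experiment(learning_rate=1e-4, **wandb_options)
```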

Running experiments with parameters in configuration files

  • If you have many parameters to change, configuration files are a good option.
  • For an example of specifying parameters in a configuration file, see examples/config_files
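One simple pattern (illustrative only; see examples/config_files for the package's own approach) is to load parameters from a JSON file and unpack them into add_experiment:

```python
# Illustrative pattern for keeping parameters in a configuration file.
# The package's own example lives in examples/config_files; this sketch
# only shows the general idea using a JSON file.
import json
import tempfile

params = {'learning_rate': 1e-4, 'batch_size': 64, 'n_layers': 3}

# Write a config file (normally you would author this by hand).
with tempfile.NamedTemporaryFile('w', suffix='.json', delete=False) as f:
    json.dump(params, f)
    config_path = f.name

# In the launch file: load it and pass the parameters on, e.g. with
# launcher.add_experiment(**config).
with open(config_path) as f:
    config = json.load(f)
```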

Notes

  • For reproducibility, the seeds are created sequentially from 0 to n_exps-1.

Download files

Download the file for your platform.

Source Distribution

experiment_launcher_meco-4.0.1.tar.gz (25.9 kB)

Uploaded Source

Built Distribution


experiment_launcher_meco-4.0.1-py3-none-any.whl (19.3 kB)

Uploaded Python 3

File details

Details for the file experiment_launcher_meco-4.0.1.tar.gz.

File metadata

  • Download URL: experiment_launcher_meco-4.0.1.tar.gz
  • Upload date:
  • Size: 25.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for experiment_launcher_meco-4.0.1.tar.gz
Algorithm Hash digest
SHA256 cff7efe0da905d2899fcff3707127dd089d15f24ac971a1ff0fb0a71b7e62c03
MD5 627bbddec4028fdcc8b3a64fd7063ca0
BLAKE2b-256 306920c89a64a2254db296208f935942f9f7470c5f3b555180f3d30c07696aae


Provenance

The following attestation bundles were made for experiment_launcher_meco-4.0.1.tar.gz:

Publisher: publish.yaml on meco-group/experiment-launcher

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file experiment_launcher_meco-4.0.1-py3-none-any.whl.

File metadata

File hashes

Hashes for experiment_launcher_meco-4.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 19557cb0471995b72bb2bdef80a39cb8117b07be8a4d099a3f28e9a7d679616b
MD5 650ddf59af4ebfb625e9daf3dc71e47e
BLAKE2b-256 d4a29f779c5aa73ff3a12bba20161de09f8b7b82686a4759452cb617ee6a7f95


Provenance

The following attestation bundles were made for experiment_launcher_meco-4.0.1-py3-none-any.whl:

Publisher: publish.yaml on meco-group/experiment-launcher

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
