onager

Lightweight Python library for launching experiments and tuning hyperparameters, either locally or on a cluster.

By Cameron Allen & Neev Parikh

Installation

Currently requires Python 3.

pip install onager

Usage

Prelaunch

Prelaunch generates commands and adds them to a jobfile. The default behavior also prints the list of generated commands.

onager prelaunch +jobname experiment1 +command myscript +arg --learningrate 0.1 0.01 0.001 +arg --batchsize 32 64 128 +tag --mytag

Output:

myscript --learningrate 0.1 --batchsize 32 --mytag experiment1_1__learningrate_0.1__batchsize_32
myscript --learningrate 0.01 --batchsize 32 --mytag experiment1_2__learningrate_0.01__batchsize_32
myscript --learningrate 0.001 --batchsize 32 --mytag experiment1_3__learningrate_0.001__batchsize_32
myscript --learningrate 0.1 --batchsize 64 --mytag experiment1_4__learningrate_0.1__batchsize_64
myscript --learningrate 0.01 --batchsize 64 --mytag experiment1_5__learningrate_0.01__batchsize_64
myscript --learningrate 0.001 --batchsize 64 --mytag experiment1_6__learningrate_0.001__batchsize_64
myscript --learningrate 0.1 --batchsize 128 --mytag experiment1_7__learningrate_0.1__batchsize_128
myscript --learningrate 0.01 --batchsize 128 --mytag experiment1_8__learningrate_0.01__batchsize_128
myscript --learningrate 0.001 --batchsize 128 --mytag experiment1_9__learningrate_0.001__batchsize_128
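
The output above is a full grid over the two swept arguments, with the first +arg varying fastest. The following Python sketch reproduces that list for illustration only; it is not onager's implementation, and the function name generate_commands and its parameters are hypothetical.

```python
from itertools import product


def generate_commands(jobname, command, args, tag_flag):
    """Build one command string per combination of argument values.

    args is a list of (flag, values) pairs, in the order given on the
    command line. Hypothetical helper, not part of onager.
    """
    flags = [flag for flag, _ in args]
    value_lists = [values for _, values in args]
    commands = []
    # product() varies its last input fastest, so feed it the value lists in
    # reverse and flip each combination back; the first flag then varies fastest.
    for i, combo in enumerate(product(*reversed(value_lists)), start=1):
        values = list(reversed(combo))
        pairs = list(zip(flags, values))
        arg_str = " ".join(f"{flag} {value}" for flag, value in pairs)
        # Tag format copied from the output shown above.
        tag = f"{jobname}_{i}" + "".join(
            f"__{flag.lstrip('-')}_{value}" for flag, value in pairs
        )
        commands.append(f"{command} {arg_str} {tag_flag} {tag}")
    return commands


commands = generate_commands(
    "experiment1",
    "myscript",
    [
        ("--learningrate", ["0.1", "0.01", "0.001"]),
        ("--batchsize", ["32", "64", "128"]),
    ],
    "--mytag",
)
for line in commands:
    print(line)
```

Three values for each of two arguments give 3 x 3 = 9 commands, which matches the nine lines in the output.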

Launch

Launch reads a jobfile (or accepts a single user-specified command) and launches the associated job(s) on the specified backend. Currently, onager supports 'slurm' and 'gridengine' as cluster backends, and 'local' for running on a single host.

onager launch --backend slurm --jobname experiment1

Output:

sbatch -J experiment1 -t 0-01:00:00 -n 1 -p batch --mem=2G -o .onager/logs/slurm/%x_%A_%a.o -e .onager/logs/slurm/%x_%A_%a.e --parsable --array=1,2,3,4,5,6,7,8,9 .onager/scripts/experiment1/wrapper.sh
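
Each task in the array runs one of the generated commands, so the script being launched has to accept the swept flags and the tag flag. As a minimal sketch, assuming myscript is a Python program (the source does not say what it is), its argument handling might look like the following. The build_parser helper and the help strings are hypothetical; only the three flag names come from the example above.

```python
import argparse


def build_parser():
    """Argument parser for a hypothetical myscript matching the example flags."""
    parser = argparse.ArgumentParser(prog="myscript")
    parser.add_argument("--learningrate", type=float, required=True,
                        help="learning rate swept by the prelaunch example")
    parser.add_argument("--batchsize", type=int, required=True,
                        help="batch size swept by the prelaunch example")
    parser.add_argument("--mytag", type=str, default=None,
                        help="unique tag string, e.g. for naming result files")
    return parser


# Parse the first generated command's arguments explicitly instead of sys.argv.
example = "--learningrate 0.1 --batchsize 32 --mytag experiment1_1__learningrate_0.1__batchsize_32"
args = build_parser().parse_args(example.split())
print(args.learningrate, args.batchsize, args.mytag)
```

Because every tag is unique within a job, using it in output filenames is one way to keep results from different tasks from overwriting each other.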

Config

By default, onager simply launches commands for you. If you need additional initialization or cleanup, use the config subcommand to write to the header or footer fields of the appropriate backend.

onager config --write slurm header "module load python/3.7.4
module load cuda/10.2
module load cudnn/7.6.5
source ./venv/bin/activate"

List

List is useful for displaying information about launched jobs and tasks, since the backend will typically assign the same jobname to all subtasks.

onager list

Output:

  job_id    task_id  jobname      command                                                                                                   tag
--------  ---------  -----------  --------------------------------------------------------------------------------------------------------  ------------------------------------------------
13438569          1  experiment1  'myscript --learningrate 0.1 --batchsize 32 --mytag experiment1_1__learningrate_0.1__batchsize_32'        experiment1_1__learningrate_0.1__batchsize_32
13438569          2  experiment1  'myscript --learningrate 0.01 --batchsize 32 --mytag experiment1_2__learningrate_0.01__batchsize_32'      experiment1_2__learningrate_0.01__batchsize_32
13438569          3  experiment1  'myscript --learningrate 0.001 --batchsize 32 --mytag experiment1_3__learningrate_0.001__batchsize_32'    experiment1_3__learningrate_0.001__batchsize_32
13438569          4  experiment1  'myscript --learningrate 0.1 --batchsize 64 --mytag experiment1_4__learningrate_0.1__batchsize_64'        experiment1_4__learningrate_0.1__batchsize_64
13438569          5  experiment1  'myscript --learningrate 0.01 --batchsize 64 --mytag experiment1_5__learningrate_0.01__batchsize_64'      experiment1_5__learningrate_0.01__batchsize_64
13438569          6  experiment1  'myscript --learningrate 0.001 --batchsize 64 --mytag experiment1_6__learningrate_0.001__batchsize_64'    experiment1_6__learningrate_0.001__batchsize_64
13438569          7  experiment1  'myscript --learningrate 0.1 --batchsize 128 --mytag experiment1_7__learningrate_0.1__batchsize_128'      experiment1_7__learningrate_0.1__batchsize_128
13438569          8  experiment1  'myscript --learningrate 0.01 --batchsize 128 --mytag experiment1_8__learningrate_0.01__batchsize_128'    experiment1_8__learningrate_0.01__batchsize_128
13438569          9  experiment1  'myscript --learningrate 0.001 --batchsize 128 --mytag experiment1_9__learningrate_0.001__batchsize_128'  experiment1_9__learningrate_0.001__batchsize_128
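
The tag column encodes the job name, the task index, and each swept value, which makes it convenient to recover the hyperparameters when analyzing results. The sketch below splits a tag back into its parts. It is a hypothetical helper, not part of onager, and it assumes that names and values never contain a double underscore and that values contain no underscore at all.

```python
def parse_tag(tag):
    """Split a tag like 'experiment1_1__learningrate_0.1__batchsize_32'.

    Returns (jobname, task_id, params), with params values kept as strings.
    Assumes '__' separates fields and the last '_' in a field precedes its value.
    """
    head, *fields = tag.split("__")
    jobname, task_id = head.rsplit("_", 1)
    params = {}
    for field in fields:
        name, value = field.rsplit("_", 1)
        params[name] = value
    return jobname, int(task_id), params


jobname, task_id, params = parse_tag(
    "experiment1_9__learningrate_0.001__batchsize_128"
)
print(jobname, task_id, params)
# experiment1 9 {'learningrate': '0.001', 'batchsize': '128'}
```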

Cancel

Cancel quickly cancels the specified jobs (and subtasks) on the backend:

onager cancel --backend slurm --jobid 13438569 --tasklist 1-3:1,5,8-9

Output:

scancel 13438569_1 13438569_2 13438569_3 13438569_5 13438569_8 13438569_9
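
The tasklist 1-3:1,5,8-9 combines comma-separated entries, where each entry is a single task ID or a range with an optional :step suffix. The following sketch expands such a string into individual task IDs and rebuilds the ID list shown in the output above. It illustrates the notation only, under the assumption that the syntax is start-stop:step as in the example; the function name expand_tasklist is hypothetical and this is not onager's parser.

```python
def expand_tasklist(tasklist):
    """Expand a string such as '1-3:1,5,8-9' into a list of task IDs."""
    task_ids = []
    for part in tasklist.split(","):
        span, _, step = part.partition(":")   # optional ':step' suffix
        start, _, stop = span.partition("-")  # optional '-stop' range end
        start = int(start)
        stop = int(stop) if stop else start
        step = int(step) if step else 1
        task_ids.extend(range(start, stop + 1, step))  # stop is inclusive
    return task_ids


task_ids = expand_tasklist("1-3:1,5,8-9")
print(task_ids)
# [1, 2, 3, 5, 8, 9]

job_id = 13438569
print("scancel " + " ".join(f"{job_id}_{t}" for t in task_ids))
# scancel 13438569_1 13438569_2 13438569_3 13438569_5 13438569_8 13438569_9
```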

Re-launch

Launch also supports re-running selected subtasks from a previously launched job:

onager launch --backend slurm --jobname experiment1 --tasklist 1-3:1,5,8-9

Output:

sbatch -J experiment1 -t 0-01:00:00 -n 1 -p batch --mem=2G -o .onager/logs/slurm/%x_%A_%a.o -e .onager/logs/slurm/%x_%A_%a.e --parsable --array=1-3:1,5,8-9 .onager/scripts/experiment1/wrapper.sh

Help

For a list of the available subcommands and their respective arguments, use the help subcommand:

onager help
onager help launch
