A package to make submitit easier to use

:rocket: Submit it Now! :rocket:

A makeshift toolkit, built on top of submitit, for launching SLURM jobs over a range of hyperparameters from the command line. It is designed to work with existing Python scripts and to let you monitor their status interactively.

submititnow provides two command-line tools:

  • slaunch to launch a Python script as SLURM job(s).
  • jt (job-tracker) to interactively monitor the jobs.

It also provides an abstracted experiment_lib.Experiment API to create, launch and monitor an experiment, or a group of job(s), from Python scripts with customized parameter-sweeping configurations, while being able to track them with jt.

slaunch: Launching a Python script over SLURM

Let's say you have a Python script examples/annotate_queries.py which can be run using the following command:

python examples/annotate_queries.py --model='BERT-LARGE-uncased' \
    --dataset='NaturalQuestions' --fold='dev'

You can launch a job that runs this script over a SLURM cluster using the following:

slaunch examples/annotate_queries.py \
    --mem="16g" --gres="gpu:rtxa4000:1" \
    --model='BERT-LARGE-uncased' --dataset='NaturalQuestions' --fold='dev'

You can put all the SLURM parameters in a config file and pass it to slaunch using the --config flag. For example, the above command can be written as:

slaunch examples/annotate_queries.py \
    --config="examples/configs/gpu.json" \
    --model='BERT-LARGE-uncased' --dataset='NaturalQuestions' --fold='dev'
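
The contents of examples/configs/gpu.json are not shown here. As a rough sketch only, assuming the config is a flat JSON object whose keys mirror the SLURM flags from the earlier command (mem and gres), such a file could be generated like this. The key names and the temporary path are assumptions for illustration; check the bundled example config for the actual schema.

```python
import json
import os
import tempfile

# ASSUMPTION: flat JSON object with keys named after the SLURM flags
# (--mem, --gres) used in the earlier slaunch command. The real schema
# expected by slaunch may differ.
slurm_params = {"mem": "16g", "gres": "gpu:rtxa4000:1"}

# Write to a temporary directory so the sketch does not touch the repo.
config_path = os.path.join(tempfile.mkdtemp(), "gpu.json")
with open(config_path, "w") as f:
    json.dump(slurm_params, f, indent=2)

# Read the file back to confirm it round-trips.
with open(config_path) as f:
    loaded = json.load(f)
print(loaded)
```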

Launching multiple jobs with parameter-sweep

slaunch examples/annotate_queries.py \
    --config="examples/configs/gpu.json" \
    --sweep fold model \
    --model 'BERT-LARGE-uncased' 'Roberta-uncased' 'T5-cased-small' \
    --dataset='NaturalQuestions' --fold 'dev' 'train'

This launches 6 jobs in total, one for each combination of the 3 models and 2 folds, with the following configuration:

[Screenshot: slaunch terminal response]
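
To see where the 6 jobs come from, the sweep can be read as a Cartesian product over the swept parameters. The following is a minimal sketch of that idea in plain Python, not slaunch's actual implementation; the helper name expand_sweep is hypothetical.

```python
from itertools import product

# Values taken from the slaunch command above.
sweep_values = {
    "fold": ["dev", "train"],
    "model": ["BERT-LARGE-uncased", "Roberta-uncased", "T5-cased-small"],
}
# Parameters that are not swept keep a single value in every job.
fixed_args = {"dataset": "NaturalQuestions"}


def expand_sweep(sweep, fixed):
    """Return one argument dict per combination of the swept values."""
    names = list(sweep)
    return [
        {**fixed, **dict(zip(names, combo))}
        for combo in product(*(sweep[name] for name in names))
    ]


job_configs = expand_sweep(sweep_values, fixed_args)
for config in job_configs:
    print(config)
```

Each printed dict corresponds to the arguments one job would receive: 2 folds times 3 models gives 6 combinations.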

Any constraints on the target Python script that we launch?

The target Python script must have the following format:

import argparse

# User defined functions and classes

def main(args: argparse.Namespace):
    # Code goes here
    pass


def add_arguments(parser=None) -> argparse.ArgumentParser:
    parser = parser or argparse.ArgumentParser()
    # Return the parser after populating it with arguments.
    return parser


if __name__ == '__main__':
    parser = add_arguments()
    main(parser.parse_args())
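
As an illustration, here is a sketch of a script that follows this format, using the --model, --dataset and --fold arguments from the earlier commands. The body of main and the default values are placeholders chosen for this example; they are not the contents of examples/annotate_queries.py.

```python
import argparse


def main(args: argparse.Namespace):
    # Placeholder body: a real script would load the model and do its work here.
    print(f"model={args.model} dataset={args.dataset} fold={args.fold}")


def add_arguments(parser=None) -> argparse.ArgumentParser:
    # Reuse the parser passed in by a caller, or create one when run directly.
    parser = parser or argparse.ArgumentParser()
    # Defaults are illustrative assumptions, not values from the project.
    parser.add_argument("--model", type=str, default="BERT-LARGE-uncased")
    parser.add_argument("--dataset", type=str, default="NaturalQuestions")
    parser.add_argument("--fold", type=str, default="dev")
    return parser


if __name__ == "__main__":
    parser = add_arguments()
    main(parser.parse_args())
```

Keeping argument definitions in add_arguments, separate from main, lets a launcher build the same parser and hand a parsed namespace to main without going through the command line.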

jt: Looking up info on previously launched experiments

As the launch response in the screenshot above indicates, you can use the jt (short for job-tracker) command to monitor job progress.

jt jobs EXP_NAME [EXP_ID]

Executing jt jobs examples.annotate_queries 227720 will give the following response:

[Screenshot: jt jobs EXP_NAME EXP_ID terminal response]

You can also look up all examples.annotate_queries jobs by omitting [EXP_ID] from the previous command:

jt jobs examples.annotate_queries

[Screenshot: jt jobs EXP_NAME terminal response]

jt {err, out} JOB_ID

Looking up stderr and stdout of a Job

Executing jt out 227720_2 reveals the stdout output of the corresponding Job:

[Screenshot: jt out JOB_ID terminal response]

Similarly, jt err 227720_2 reveals the stderr logs.

jt sh JOB_ID

Looking up SBATCH script for a Job

The submitit tool internally creates an SBATCH shell script per experiment to launch the jobs on a SLURM cluster. This command outputs this submission.sh file for inspection.

Executing jt sh 227720_2 reveals the following:

[Screenshot: jt sh JOB_ID terminal response]

jt ls

Finally, you can use jt ls to list the experiments maintained by the submititnow tool.

[Screenshot: jt ls terminal response]

The experiment names output by this command can then be passed into the jt jobs command.

Installing

Python 3.8+ is required.

pip install -U git+https://github.com/maharshi95/submititnow.git

Experiment API

Sometimes the slaunch command-line tool is not enough. For example, one may want to launch a job with customized parameter-sweep configurations, or vary a certain parameter (e.g. output_filepath) for each job in the launch. In such cases, one can use the Experiment API provided by submititnow to launch jobs from Python scripts and also get the benefits of being able to track them with jt.

examples/launch_demo_script.py provides a demo of how to use the Experiment API to launch a job with customized parameter-sweep configurations.

python examples/launch_demo_script.py
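
The signature of the Experiment API is not shown here, so the following sketch deliberately does not call it. It only illustrates, in plain Python, the kind of customized per-job parameters described above, where output_filepath differs for each job. The helper name build_job_params, the outputs directory and the .jsonl naming scheme are assumptions for illustration.

```python
# Values reused from the earlier slaunch examples.
models = ["BERT-LARGE-uncased", "Roberta-uncased"]
folds = ["dev", "train"]


def build_job_params(models, folds, output_dir="outputs"):
    """Build one parameter dict per job, each with its own output_filepath."""
    jobs = []
    for model in models:
        for fold in folds:
            jobs.append(
                {
                    "model": model,
                    "dataset": "NaturalQuestions",
                    "fold": fold,
                    # Derived per job, which a plain sweep over fixed
                    # value lists cannot express.
                    "output_filepath": f"{output_dir}/{model}_{fold}.jsonl",
                }
            )
    return jobs


job_params = build_job_params(models, folds)
for params in job_params:
    print(params)
```

A list like job_params is the sort of input one would then hand to whatever launch mechanism is used; see examples/launch_demo_script.py for how the project itself does this.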
