A package to make submitit easier to use
:rocket: Submit it Now! :rocket:
A makeshift toolkit, built on top of submitit, to launch SLURM jobs over a range of hyperparameters from the command line. It is designed to work with existing Python scripts and to monitor their status interactively.
submititnow provides two command-line tools:
- slaunch: to launch a Python script as SLURM job(s).
- jt (job-tracker): to interactively monitor the jobs.
It also provides an abstracted experiment_lib.Experiment API to create, launch and monitor an experiment, or a group of job(s), from Python scripts with customized parameter-sweeping configurations, while being able to track them with jt.
slaunch: Launching a Python script over SLURM
Let's say you have a Python script examples/annotate_queries.py which can be run using the following command:
python examples/annotate_queries.py --model='BERT-LARGE-uncased' \
--dataset='NaturalQuestions' --fold='dev'
You can launch a job that runs this script over a SLURM cluster using the following:
slaunch examples/annotate_queries.py \
--mem="16g" --gres="gpu:rtxa4000:1" \
--model='BERT-LARGE-uncased' --dataset='NaturalQuestions' --fold='dev'
You can put all the SLURM parameters in a config file and pass it to slaunch using the --slurm_config flag. For example, the above command can be written as:
slaunch examples/annotate_queries.py \
--config="examples/configs/gpu.json" \
--model='BERT-LARGE-uncased' --dataset='NaturalQuestions' --fold='dev'
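To make the idea of a SLURM config file concrete, here is a minimal sketch that builds the kind of JSON such a file might hold. The key names are an assumption: they simply mirror the --mem and --gres flags from the earlier command, and the schema slaunch actually expects may differ, so check examples/configs/gpu.json in the repository.

```python
import json

# Hypothetical contents for a SLURM config file such as examples/configs/gpu.json.
# Assumption: keys mirror the --mem and --gres flags used on the command line;
# the real schema read by slaunch is not shown in this README.
slurm_params = {
    "mem": "16g",
    "gres": "gpu:rtxa4000:1",
}

# Serialize to JSON text that could be saved to a config file.
config_text = json.dumps(slurm_params, indent=2)
print(config_text)
```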
Launching multiple jobs with parameter-sweep
slaunch examples/annotate_queries.py \
--config="examples/configs/gpu.json" \
--sweep fold model \
--model 'BERT-LARGE-uncased' 'Roberta-uncased' 'T5-cased-small' \
--dataset='NaturalQuestions' --fold 'dev' 'train'
This will launch a total of 6 jobs, one for each combination of the 2 fold values and the 3 model values.
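The count of 6 follows from taking every combination of the swept parameters (2 folds × 3 models). The sketch below enumerates those combinations with itertools.product. It only illustrates the arithmetic: it is not how slaunch is implemented, and the order in which jobs are actually created is an assumption.

```python
from itertools import product

# Values taken from the slaunch command above.
sweep_values = {
    "fold": ["dev", "train"],
    "model": ["BERT-LARGE-uncased", "Roberta-uncased", "T5-cased-small"],
}
# Parameters that are not swept stay the same for every job.
fixed_values = {"dataset": "NaturalQuestions"}

# One configuration per combination of the swept parameters (a full grid).
names = list(sweep_values)
job_configs = [
    {**fixed_values, **dict(zip(names, combo))}
    for combo in product(*(sweep_values[name] for name in names))
]

for index, config in enumerate(job_configs):
    print(index, config)
```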
Any constraints on the target Python script that we launch?
The target Python script must have the following format:
import argparse

# User defined functions and classes

def main(args: argparse.Namespace):
    # Code goes here
    pass

def add_arguments(parser=None) -> argparse.ArgumentParser:
    parser = parser or argparse.ArgumentParser()
    # Return the parser after populating it with arguments.
    return parser

if __name__ == '__main__':
    parser = add_arguments()
    main(parser.parse_args())
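As an illustration, here is how the template might be filled in for a script like examples/annotate_queries.py. The three arguments come from the commands shown earlier; the argument types, the default, and the body of main are placeholders, not the contents of the real script. Keeping the parser in add_arguments means other code can build the same parser and call main with an explicit argument list, which the last lines demonstrate.

```python
import argparse


def add_arguments(parser=None) -> argparse.ArgumentParser:
    # Populate (or create) the parser with the script's arguments.
    parser = parser or argparse.ArgumentParser()
    parser.add_argument("--model", type=str, required=True)
    parser.add_argument("--dataset", type=str, required=True)
    parser.add_argument("--fold", type=str, default="dev")
    return parser


def main(args: argparse.Namespace) -> str:
    # Placeholder body: a real script would do its work here.
    summary = f"Annotating {args.dataset}/{args.fold} with {args.model}"
    print(summary)
    return summary


# In a real script, keep the `if __name__ == '__main__':` block from the
# template. Here an explicit argument list is parsed instead, so the snippet
# runs without command-line input.
args = add_arguments().parse_args(
    ["--model", "BERT-LARGE-uncased", "--dataset", "NaturalQuestions"]
)
result = main(args)
```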
jt: Looking up info on previously launched experiments
As the launch response indicates, you can use the jt (short for job-tracker) command to monitor job progress.
jt jobs EXP_NAME [EXP_ID]
Executing jt jobs examples.annotate_queries 227720 will give the following response:
In fact, you can also look up all examples.annotate_queries jobs simply by omitting [EXP_ID] from the previous command:
jt jobs examples.annotate_queries
jt {err, out} JOB_ID
Looking up stderr and stdout of a Job
Executing jt out 227720_2 reveals the stdout output of the corresponding job:
Similarly, jt err 227720_2 reveals the stderr logs.
jt sh JOB_ID
Looking up SBATCH script for a Job
The submitit tool internally creates an SBATCH shell script per experiment to launch the jobs on a SLURM cluster. This command outputs this submission.sh file for inspection.
Executing jt sh 227720_2 reveals the following:
jt ls
Finally, you can use jt ls to list the experiments maintained by the submititnow tool.
The experiment names output by this command can then be passed into the jt jobs command.
Installing
Python 3.8+ is required.
pip install -U git+https://github.com/maharshi95/submititnow.git
Experiment API
Sometimes the slaunch command-line tool is not enough. For example, one may want to launch a job with customized parameter-sweep configurations, or vary a certain parameter (e.g. output_filepath) for each job in the launch. In such cases, one can use the Experiment API provided by submititnow to launch jobs from Python scripts and also get the benefits of being able to track them with jt.
examples/launch_demo_script.py provides a demo of how to use the Experiment API to launch a job with customized parameter-sweep configurations.
python examples/launch_demo_script.py
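The kind of customized sweep described above can be prepared in plain Python before anything is submitted. The sketch below builds one parameter dictionary per job, skips one combination, and derives a distinct output_filepath for each job. It deliberately does not call the Experiment API, whose signature is not shown in this README; the skip rule, the outputs directory, and the file naming are all hypothetical. See examples/launch_demo_script.py for how such parameters are actually handed to an Experiment.

```python
from itertools import product
from pathlib import PurePosixPath

models = ["BERT-LARGE-uncased", "Roberta-uncased"]
folds = ["dev", "train"]
output_dir = PurePosixPath("outputs")  # hypothetical output directory

job_params = []
for model, fold in product(models, folds):
    # Custom rule a plain grid sweep cannot express: skip one combination.
    if model == "Roberta-uncased" and fold == "train":
        continue
    job_params.append(
        {
            "model": model,
            "dataset": "NaturalQuestions",
            "fold": fold,
            # Each job gets its own output file, derived from its parameters.
            "output_filepath": str(output_dir / f"{model}_{fold}.jsonl"),
        }
    )

for params in job_params:
    print(params)
```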