Submit array jobs to an SGE cluster without all the suffering

psub

Submit and monitor array jobs on Hoffman2 with minimal configuration and suffering.

psub provides an intuitive way to submit array jobs on UCLA's Hoffman2 compute cluster and monitor their output logs. Instead of writing scripts that generate scripts that in turn get submitted to the scheduler, or dealing with environment variables, you can do this with psub and forget about the rest:

psub --mem 4G --time 12:00:00 "./my_script.sh {} --argument {} ::: *.csv ::: arg1 arg2"

When run in a folder containing f1.csv, f2.csv and f3.csv, this submits a job array of 6 jobs, one for each combination of file and argument, and requests 4 GB of memory and 12 hours of run time from the scheduler:

./my_script.sh f1.csv --argument arg1
./my_script.sh f1.csv --argument arg2
./my_script.sh f2.csv --argument arg1
./my_script.sh f2.csv --argument arg2
./my_script.sh f3.csv --argument arg1
./my_script.sh f3.csv --argument arg2
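
The expansion above is a Cartesian product over the `:::`-separated groups. The following is a minimal sketch of that idea, not psub's actual implementation: the function name `expand_command` is hypothetical, and the parsing rules (split on `:::`, expand glob patterns, keep tokens that match no files as literals) are assumptions made for illustration.

```python
import glob
import shlex
from itertools import product


def expand_command(spec):
    """Expand 'template ::: group1 ::: group2' into one command per combination.

    Hypothetical helper for illustration only; psub's real parser may differ.
    """
    # The first part is the command template, the rest are value groups.
    template, *groups = [part.strip() for part in spec.split(":::")]

    value_lists = []
    for group in groups:
        values = []
        for token in shlex.split(group):
            # Assumption: tokens that match files are treated as glob patterns,
            # anything else is kept as a literal value.
            matches = sorted(glob.glob(token))
            values.extend(matches if matches else [token])
        value_lists.append(values)

    # Fill the '{}' placeholders left to right, one value from each group.
    return [template.format(*combo) for combo in product(*value_lists)]


commands = expand_command(
    "./my_script.sh {} --argument {} ::: f1.csv f2.csv f3.csv ::: arg1 arg2"
)
for command in commands:
    print(command)
```

The file names are spelled out here instead of using `*.csv` so that the result does not depend on the current directory. With these inputs the sketch yields the same six commands, in the same order, as the list above.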

psub keeps all stdouts and stderrs nice and tidy. You can view logs associated with a particular job with the psub logs subcommand.

See psub --help for all features.

There is also a Python programmatic interface:

from psub import Psub

pp = Psub(name="big_job",
          l_arch="intel*",
          l_mem="4G", 
          l_time="1:00:00", 
          l_highp=True)

for i in range(3):
    pp.add(f"echo hi {i}")  # add individually

# or add parameter combinations in one go
pp.add_parameter_combinations(
    "./my_script.sh {} --argument {}", 
    ["f1.csv", "f2.csv", "f3.csv"], ["arg1", "arg2"]
)

pp.submit()  # submit jobs

pp.status  # view job status
pp.exit_codes  # view exit codes of individual jobs

pp.rerun_failed()  # rerun any failed jobs (TBA)

psub is still in alpha. Please let me know of any bugs.

psub is for quickly running and monitoring straightforward array jobs. If your workflow has complex interdependencies, you should look into the excellent snakemake tool.

Installation:

pip install psub

psub stands for petko-submit, after Petko, the OG ernstlab member who had the core idea.

Files for psub, version 0.1.1a0:

psub-0.1.1a0-py3-none-any.whl (9.4 kB), wheel, Python version py3
psub-0.1.1a0.tar.gz (8.6 kB), source
