Slurm Job

A simple library for running any Python function as a Slurm job. It was designed for the XENONnT experiment but can be used for other purposes.

Installation

pip install git+https://github.com/Microwave-WYB/slurm_job.git

Usage (For XENONnT users)

Basic usage:

import datetime

from slurm_job.slurm import SlurmOptions
from slurm_job.xenon_slurm import xenon_job


@xenon_job(
    options=SlurmOptions(
        partition="xenon1t", time="10:00", mem_per_cpu="100M", cpus_per_task=1, output="job.out"
    ),
    singularity_image="xenonnt-2024.04.1.simg",
    is_dali=False,
    timeout=datetime.timedelta(seconds=100),
)
def add(a, b):
    print("Adding:", a, b)
    return a + b


ret = add(1, 2)
print("Return:", ret)

Output:

Submitted batch job 40125577

slurm-40125577-add | Adding: 1 2
Return: 3

In the log prefix slurm-40125577-add, slurm is the job type, 40125577 is the job ID, and add is the function name. The final line shows the return value, 3.
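If the job logs need to be post-processed, the prefix can be split back into its parts. The regular expression below is a sketch based only on the sample output shown above: the exact line format is an assumption and may differ between versions, and `parse_log_line` is a hypothetical helper, not part of slurm_job.

```python
import re

# Assumed format, taken from the sample output:
# "<type>-<job id>-<function> | <message>"
LOG_LINE = re.compile(
    r"^(?P<kind>\w+)-(?P<job_id>\d+)-(?P<func>\w+) \| (?P<message>.*)$"
)


def parse_log_line(line):
    """Return the parts of a prefixed log line, or None if it has no prefix."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    parts = match.groupdict()
    parts["job_id"] = int(parts["job_id"])
    return parts


parsed = parse_log_line("slurm-40125577-add | Adding: 1 2")
print(parsed)
```

Lines without the prefix, such as the locally printed "Return: 3", yield None.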

Check here for all available options. The options are equivalent to the options in the sbatch command. Refer to the Slurm documentation for more details.
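To make the correspondence with sbatch concrete, the sketch below renders a set of options as `#SBATCH` directives. It is an illustration only: the `ExampleOptions` dataclass and the `to_sbatch_lines` helper are hypothetical stand-ins, not part of slurm_job, and the library may build its submission script differently. The field names mirror the keyword arguments used in the example above.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ExampleOptions:
    """Hypothetical stand-in for SlurmOptions; not the library's class."""

    partition: Optional[str] = None
    time: Optional[str] = None
    mem_per_cpu: Optional[str] = None
    cpus_per_task: Optional[int] = None
    output: Optional[str] = None


def to_sbatch_lines(options: ExampleOptions) -> List[str]:
    """Render each option that is set as a '#SBATCH --flag=value' line."""
    lines = []
    for field in fields(options):
        value = getattr(options, field.name)
        if value is None:
            continue  # unset options are left to Slurm's defaults
        flag = field.name.replace("_", "-")  # mem_per_cpu -> mem-per-cpu
        lines.append(f"#SBATCH --{flag}={value}")
    return lines


opts = ExampleOptions(
    partition="xenon1t", time="10:00", mem_per_cpu="100M", cpus_per_task=1, output="job.out"
)
print("\n".join(to_sbatch_lines(opts)))
```

Each keyword argument maps to the sbatch long option of the same name with underscores replaced by hyphens, which is why the Slurm documentation is the right place to look up what a given option does.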

Error handling:

An error during job execution raises a JobFailedError exception. The complete traceback is printed to the log file.

@xenon_job(
    options=SlurmOptions(
        partition="xenon1t", time="10:00", mem_per_cpu="100M", cpus_per_task=1, output="job.out"
    ),
    singularity_image="xenonnt-2024.04.1.simg",
    is_dali=False,
    timeout=datetime.timedelta(seconds=100),
    wait=False,
)
def fail():
    raise ValueError("This job will fail")


fail()

Output will be something like:

Submitted batch job 40125621

slurm-40125621-fail | Traceback (most recent call last):
slurm-40125621-fail |   File "<string>", line 17, in <module>
slurm-40125621-fail |   File "<string>", line 13, in <module>
slurm-40125621-fail |   File "/home/yuem/slurm_job/test.py", line 33, in fail
slurm-40125621-fail |     raise ValueError("This job will fail")
slurm-40125621-fail | ValueError: This job will fail
ValueError: This job will fail

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/yuem/slurm_job/test.py", line 36, in <module>
    fail()
  File "/home/yuem/slurm_job/slurm_job/xenon_slurm.py", line 74, in wrapper
    return job.run()
  File "/home/yuem/slurm_job/slurm_job/slurm.py", line 181, in run
    result = self.result()
  File "/home/yuem/slurm_job/slurm_job/core.py", line 170, in result
    return self._load_return()
  File "/home/yuem/slurm_job/slurm_job/core.py", line 143, in _load_return
    raise JobFailedError(
slurm_job.core.JobFailedError: Job fail with id 40125621 failed with exception: This job will fail
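The "direct cause" line in the traceback above is what Python prints when one exception is raised from another. The sketch below reproduces that pattern locally and shows how a caller might catch the failure and inspect the original error. `JobFailedError` here is a locally defined stand-in and `run_job` is a hypothetical runner; according to the traceback the real class lives in `slurm_job.core`, and nothing is assumed about its attributes beyond being an exception.

```python
class JobFailedError(Exception):
    """Local stand-in for slurm_job.core.JobFailedError (illustration only)."""


def run_job(func, job_id):
    """Hypothetical runner: call func and wrap any failure in JobFailedError."""
    try:
        return func()
    except Exception as exc:
        # 'raise ... from exc' stores exc as __cause__, which produces the
        # "direct cause of the following exception" line in a traceback.
        raise JobFailedError(
            f"Job {func.__name__} with id {job_id} failed with exception: {exc}"
        ) from exc


def fail():
    raise ValueError("This job will fail")


try:
    run_job(fail, job_id=40125621)
except JobFailedError as err:
    message = str(err)
    cause = err.__cause__
    print(message)
    print("Caused by:", type(cause).__name__)
```

Catching the wrapper exception rather than letting it propagate is useful when one failed job should not abort a script that submits many.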

Define custom decorator:

Defining a custom decorator is useful when you want to use the same set of options for multiple functions. This simplifies the code and makes it easier to maintain.

import datetime
from functools import partial

from slurm_job.slurm import SlurmOptions
from slurm_job.xenon_slurm import xenon_job

my_job = partial(
    xenon_job,
    options=SlurmOptions(
        partition="xenon1t", time="10:00", mem_per_cpu="100M", cpus_per_task=1, output="job.out"
    ),
    singularity_image="xenonnt-2024.04.1.simg",
    is_dali=False,
)

@my_job(timeout=datetime.timedelta(seconds=100))
def add(a, b):
    print("Adding:", a, b)
    return a + b

"""
The above code is equivalent to:

@xenon_job(
    options=SlurmOptions(
        partition="xenon1t", time="10:00", mem_per_cpu="100M", cpus_per_task=1, output="job.out"
    ),
    singularity_image="xenonnt-2024.04.1.simg",
    is_dali=False,
    timeout=datetime.timedelta(seconds=100),
)
"""

Or, if you don't need any decorator options, you can define a simple decorator:

my_job = xenon_job(
    options=SlurmOptions(
        partition="xenon1t", time="10:00", mem_per_cpu="100M", cpus_per_task=1, output="job.out"
    ),
    singularity_image="xenonnt-2024.04.1.simg",
    is_dali=False,
    timeout=datetime.timedelta(seconds=100),
)

@my_job
def add(a, b):
    print("Adding:", a, b)
    return a + b

"""
The above code is equivalent to:

@xenon_job(
    options=SlurmOptions(
        partition="xenon1t", time="10:00", mem_per_cpu="100M", cpus_per_task=1, output="job.out"
    ),
    singularity_image="xenonnt-2024.04.1.simg",
    is_dali=False,
    timeout=datetime.timedelta(seconds=100),
)
"""
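Both patterns rely on `xenon_job` being a decorator factory: calling it with keyword arguments returns the actual decorator. The sketch below shows the mechanics with a stand-in factory that runs the function locally. `example_job` and its `label` and `timeout` parameters are hypothetical; they mimic only the call shape of `xenon_job`, not its behavior.

```python
import functools


def example_job(*, label, timeout=None):
    """Hypothetical decorator factory; runs the function locally, not on Slurm."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            # Return the settings alongside the result so they can be inspected.
            return {"label": label, "timeout": timeout, "result": result}

        return wrapper

    return decorator


# Pattern 1: partial pre-fills the shared options; the rest are given per function.
shared = functools.partial(example_job, label="shared")


@shared(timeout=100)
def add(a, b):
    return a + b


# Pattern 2: call the factory once and reuse the resulting decorator unchanged.
fixed = example_job(label="fixed", timeout=100)


@fixed
def mul(a, b):
    return a * b


print(add(1, 2))
print(mul(3, 4))
```

The partial form keeps the per-function options open, so it needs the extra call in `@shared(...)`; the second form has every option fixed, so the decorator is applied without parentheses.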

Advanced usage (with a Jupyter notebook started as a Slurm job):

TODO: Add examples
