FSL Cluster Submission Plugin for Slurm

fsl_sub_plugin_slurm

Job submission to SLURM variant cluster queues.

Copyright 2018-2023 University of Oxford, Oxford, UK.

Introduction

fsl_sub provides a consistent interface to various cluster backends, with a fallback to running tasks locally where no cluster is available. This fsl_sub plugin provides support for submitting tasks to SLURM clusters.

For installation instructions please see INSTALL.md; for building packages see BUILD.md.

Configuration

Use the command:

fsl_sub_config slurm > fsl_sub.yml

to generate an example configuration, including queue definitions gleaned from the SLURM software - check these, paying attention to any warnings generated.

Use the fsl_sub.yml file as per the main fsl_sub documentation.

The configuration for the SLURM plugin is in the method_opts section, under the key slurm.

Method options

Key Values (default/recommended in bold) Description
queues True Does this method use queues/partitions (should always be True)
memory_in_gb True/False Whether SLURM reports memory in GB - normally false
copy_environment True/False Whether to replicate the environment variables in the shell that called fsl_sub into the job's shell
has_parallel_envs False/True SLURM does not provide parallel environments so this should always be false
script_conf True/False Whether the --usescript option to fsl_sub is available via this method. This option allows you to define the grid options as comments in a shell script and then provide this to the cluster for running. Should be set to True.
mail_support True/False Whether the grid installation is configured to send email on job events.
mail_modes Dictionary of option lists If the grid has email notifications turned on, this option configures the submission options for different verbosity levels, 'b' = job start, 'e' = job end, 'a' = job abort, 'f' = all events, 'n' = no mail. Each event type should then have a list of submission mail arguments that will be applied to the submitted job. Typically, these should not be edited.
mail_mode b/e/a/f/n Which of the above mail_modes to use by default
notify_ram_usage True/False Whether to notify SLURM of the RAM you have requested. SLURM is typically configured to give jobs a small RAM allocation so you will invariably need this set to true.
set_time_limit True/False Whether to notify SLURM of the expected maximum run-time of your job. This helps the scheduler fill in reserved slots (e.g. for parallel environment jobs). However, this time limit will be enforced, so a job that exceeds it will be killed, even if the limit is less than the queue run-time limit. This can be disabled on a per-job basis by setting the environment variable FSLSUB_NOTIMELIMIT to '1' (or 'True').
array_holds True/False Enable support for array holds, e.g. sub-task 1 waits for parent sub-task 1.
array_limit True/False Enable limiting number of concurrent array tasks.
job_resources True/False Enable additional job resource specification support.
projects True/False Enable support for projects, typically used for auditing/charging purposes.
preserve_modules True/False Whether to re-load shell modules on the compute node. Required if you have multiple CPU generations and per-generation optimised libraries configured with modules.
add_module_paths []/ a list List of file system paths to search for modules in addition to the system defined ones. Useful if you have your own shell modules directory but need to allow the compute node to auto-set its MODULEPATH environment variable (e.g. to an architecture-specific folder). Only used when preserve_modules is True.
export_vars []/List List of environment variables that should be transferred with the job to the compute node
keep_jobscript True/False Whether to preserve the generated wrapper in a file wrapper_<jobid>.sh. This file contains sufficient information to resubmit this job in the future. If the job fails to submit, this will be preserved in the file wrapper_failed_<DD>-<MMM>-<YYYY>_<HH><MM><SS>.sh.
extra_args []/List List of additional SLURM arguments to pass through to the scheduler.
strict_dependencies True/False Whether to use 'afterok' (True) or 'afterany' (False) when specifying simple job dependencies. This can also be controlled by the environment variable FSLSUB_STRICTDEPS (True="1", False="0").
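
Several rows in the table above state what a setting should be. As a minimal sketch (not part of the plugin), the following Python checks a method options dictionary against that guidance. The function name check_slurm_method_opts and the example values are hypothetical; only the option names come from the table.

```python
def check_slurm_method_opts(opts):
    """Return warnings for settings that contradict the guidance in the table."""
    warnings = []
    if opts.get("queues") is not True:
        warnings.append("queues should always be True")
    if opts.get("has_parallel_envs") is not False:
        warnings.append("has_parallel_envs should always be False")
    if opts.get("script_conf") is not True:
        warnings.append("script_conf should be set to True")
    if opts.get("mail_mode") not in ("b", "e", "a", "f", "n"):
        warnings.append("mail_mode must be one of b/e/a/f/n")
    if opts.get("add_module_paths") and not opts.get("preserve_modules"):
        warnings.append("add_module_paths is only used when preserve_modules is True")
    return warnings


# Illustrative values only; these are not the plugin's defaults.
example_opts = {
    "queues": True,
    "has_parallel_envs": True,
    "script_conf": True,
    "mail_mode": "n",
    "preserve_modules": False,
    "add_module_paths": ["/opt/modules"],
}
for warning in check_slurm_method_opts(example_opts):
    print(warning)
```

With these example values the sketch reports the has_parallel_envs and add_module_paths settings.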

Coprocessor Configuration

This plugin is not capable of automatically determining all the necessary information to configure your co-processors but will advise of the information it can find and propose queue definitions for these GPU resources.

SLURM typically selects GPU resources with a GRES (Generic RESource) that defines the type and quantity of the co-processor. Where multiple classes of co-processor are available this might be selectable via the GRES or you may need to provide a constraint. If you would like to be able to support running on a class and all superior devices you need to be able to use constraints as GRES requests do not support logical combinations. The automatically generated configuration should include useful information about your GRES and constraints, but should you wish to obtain this information yourself use the commands:

  • sinfo -p <partition> -o %G - This will list all the GRES defined on <partition>.
  • sinfo -p <partition> -o %f - This will list all features selectable by a --constraint as a comma-separated list.

Typically CUDA resources will be controlled using GRES or constraints with gpu in the name, so look for these.
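
To look for GPU entries in the output of the sinfo -o %G command above, a small parser can help. This is a rough sketch that assumes each GRES field is a comma-separated list of name[:type][:count] entries, optionally followed by a bracketed annotation, and that nodes without GRES report (null). The function name parse_gres and the sample string are hypothetical.

```python
import re


def parse_gres(gres_field):
    """Split one GRES field into (name, type, count) tuples."""
    entries = []
    # Remove bracketed annotations such as "(S:0-1)" or "(null)" first.
    cleaned = re.sub(r"\([^)]*\)", "", gres_field)
    for item in cleaned.split(","):
        parts = item.strip().split(":")
        if not parts[0]:
            continue  # empty entry, e.g. a node with no GRES
        name, gres_type, count = parts[0], None, 1
        if len(parts) == 2:
            if parts[1].isdigit():
                count = int(parts[1])
            else:
                gres_type = parts[1]
        elif len(parts) >= 3:
            gres_type = parts[1]
            if parts[2].isdigit():
                count = int(parts[2])
        entries.append((name, gres_type, count))
    return entries


# Hypothetical sample of one line of sinfo output.
sample = "gpu:a100:4(S:0-1),gpu:v100:2"
gpu_entries = [e for e in parse_gres(sample) if "gpu" in e[0].lower()]
print(gpu_entries)
```

The type part of each tuple is a candidate for a class resource string, and the count is a candidate for max_quantity.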

For each coprocessor hardware type you need a sub-section given an identifier that will be used to request this type of coprocessor. For CUDA processors this sub-section must be called 'cuda' to ensure that FSL tools can auto-detect and use CUDA hardware/queues.

Key Values (default/recommended in bold) Description
resource String GRES that, when requested, selects machines with the hardware present, e.g. gpu.
classes True/False Whether more than one type of this co-processor is available
include_more_capable True/False Whether to automatically request all classes that are more capable than the requested class. This requires the class_constraint option to be set and your SLURM cluster to be set up with GPU features/constraints.
class_types Configuration dictionary This contains the definition of the GPU classes...
Key
class selector This is the letter (or word) that is used to select this class of co-processor from the fsl_sub commandline. For CUDA devices you may consider using the card name e.g. A100.
resource This is the name of the SLURM GRES 'type' or constraint that will be used to select this GPU family.
doc The description that appears in the fsl_sub help text about this device.
capability An integer defining the feature set of the device. Your most basic device should be given the value 1 and more capable devices higher values, e.g. GTX = 1, Kepler = 2, Pascal = 3, Volta = 4.
default_class Class type key The class selector for the class to assign jobs to where a class has not been specified in the fsl_sub call. For FSL tools that automatically submit to CUDA queues you should aim to select one that has good double-precision performance (K40
class_constraint False/string Whether your SLURM cluster is configured to use constraints to select co-processor models/features. If so this should be set to the name of the feature that selects between the models and the co-processor class resource strings set appropriately to match the available values.
presence_test Program path (nvidia-smi for CUDA) The name of a program that can be used to look for this coprocessor type, for example nvidia-smi for CUDA devices. Program needs to return non-zero exit status if there are no available coprocessors.
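
The interaction between capability and include_more_capable can be illustrated with a short sketch. It assumes that a request for a class and all superior devices is expressed as a SLURM constraint that joins the matching feature names with the OR operator (|). The function name constraint_for and the class definitions are hypothetical, and the plugin may build its request differently.

```python
def constraint_for(requested, class_types, include_more_capable=True):
    """Return a constraint expression for the requested class selector."""
    wanted = class_types[requested]
    if not include_more_capable:
        return wanted["resource"]
    # Keep every class at least as capable as the requested one,
    # ordered from least to most capable.
    capable = sorted(
        (c for c in class_types.values() if c["capability"] >= wanted["capability"]),
        key=lambda c: c["capability"],
    )
    return "|".join(c["resource"] for c in capable)


# Hypothetical class_types; the resource names must match your cluster's features.
class_types = {
    "K": {"resource": "k80", "doc": "Kepler", "capability": 2},
    "P": {"resource": "p100", "doc": "Pascal", "capability": 3},
    "V": {"resource": "v100", "doc": "Volta", "capability": 4},
}
print(constraint_for("P", class_types))  # p100|v100
```

Requesting class P with include_more_capable enabled therefore also admits the Volta nodes, while disabling it restricts the job to p100.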

Queue Definitions

Slurm refers to queues as partitions. The example configuration should contain definitions for the automatically discovered partitions but you should review these, in particular any warnings generated. To query SLURM for queue information you can use the following SLURM commands.

To get a list of all available partitions use:

sinfo -s -o %P

Then the details for a queue can be obtained with:

sinfo -p partitionname -O 'CPUs,MaxCPUsPerNode,Memory,Time,NodeHost'

This will return details for every node within that partition. The queue definition should then be set up as follows:

Key Value type Description
time integer in minutes The TIMELIMIT column reports in days-hours:minutes:seconds, this needs converting to minutes. Provide the maximum value observed, but if there are multiple values you should consider enabling job time notification so that SLURM can select the correct node.
max_size integer in GB This is the maximum permitted memory on a node. This is usually reported by SLURM in MB, so for example 63000 should be configured as 63 (GB). It is equal to the maximum MEMORY value reported. Once again, if there are multiple node types you should turn on RAM notification so that nodes can be correctly selected.
max_slots integer The CPUS column contains the number of CPUs (threads) available on each node. Set this option to the maximum number reported.
slot_size Null/integer in GB This is largely meaningless on SLURM and left at None. If you find that you need to get fsl_sub to split your job into multiple threads to achieve your memory requirements then set this to the figure provided by your cluster manager.
group integer (Optional) All partitions with the same group number will be considered together when scheduling, typically this would be all queues with the same run time but differing memory/core counts.
priority integer (Optional) Priority within a group - higher wins.
default True Is this the default queue when no time/RAM details are provided.
copros Co-processor dictionary Optional If this queue has hosts with co-processors (e.g. CUDA devices), then provide this entry, with a key identical to the associated co-processor definition, e.g. cuda.
max_quantity An integer representing the maximum number of this coprocessor type available on a single compute node. This can be obtained from the GRES reported by sinfo -p <partition> -o %G for the nodes in this partition. If the GRES is gpu then an entry of gpu:2 would indicate that this value should be set to 2.
classes A list of coprocessor classes (as defined in the coprocessor configuration section) that this queue has hardware for.
exclusive True/False
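
The conversions described for time, max_size and max_slots can be sketched in Python. This assumes you have already extracted the CPUS, MEMORY (in MB) and TIMELIMIT values for each node into a list of dictionaries. The function names and sample values are hypothetical, the MB to GB step follows the 63000 to 63 example above, and a TIMELIMIT of "infinite" is not handled.

```python
def timelimit_to_minutes(timelimit):
    """Convert a SLURM time string to whole minutes, discarding any seconds."""
    days = 0
    has_days = "-" in timelimit
    if has_days:
        day_part, timelimit = timelimit.split("-", 1)
        days = int(day_part)
    fields = [int(f) for f in timelimit.split(":")]
    if has_days:
        # days-hours[:minutes[:seconds]]
        fields += [0] * (3 - len(fields))
        hours, minutes, _seconds = fields
    elif len(fields) == 1:
        hours, minutes = 0, fields[0]        # minutes
    elif len(fields) == 2:
        hours, minutes = 0, fields[0]        # minutes:seconds
    else:
        hours, minutes, _seconds = fields    # hours:minutes:seconds
    return (days * 24 + hours) * 60 + minutes


def queue_definition(nodes):
    """Take the maximum observed value of each quantity across the nodes."""
    return {
        "time": max(timelimit_to_minutes(n["timelimit"]) for n in nodes),
        "max_size": max(n["memory_mb"] for n in nodes) // 1000,
        "max_slots": max(n["cpus"] for n in nodes),
    }


# Hypothetical nodes in one partition.
nodes = [
    {"cpus": 48, "memory_mb": 191000, "timelimit": "1-00:00:00"},
    {"cpus": 24, "memory_mb": 63000, "timelimit": "1-00:00:00"},
]
print(queue_definition(nodes))
```

For the two sample nodes this gives a time of 1440 minutes, a max_size of 191 GB and a max_slots of 48.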

Where a partition has obvious GRES or features that define GPUs a proposed GPU configuration will be added as comments to the start of the queue definition. You should review this, create/update the coproc_opts>cuda record with the information in the comments and then this section can be uncommented to enable GPU support.

Compound Queues

Some clusters may be configured with multiple variants of the same partition, e.g. short.a, short.b, with each queue having different hardware, perhaps CPU generation, maximum memory or memory available per slot. To maximise scheduling options you can define compound queues which have the configuration of the least capable constituent. To define a compound queue, the queue name (key of the YAML dictionary) should be a comma-separated list of queue names (no spaces).
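
As an illustration of "the configuration of the least capable constituent", the sketch below derives a compound queue entry by taking the minimum of each numeric limit across the member queues. The function name compound_queue and the queue values are hypothetical, and only the time, max_size and max_slots keys are considered.

```python
def compound_queue(queues, names):
    """Return (key, definition) for a compound queue built from the named queues."""
    key = ",".join(names)  # comma-separated, no spaces
    members = [queues[name] for name in names]
    definition = {
        field: min(member[field] for member in members)
        for field in ("time", "max_size", "max_slots")
    }
    return key, definition


# Hypothetical partition variants.
queues = {
    "short.a": {"time": 1440, "max_size": 63, "max_slots": 24},
    "short.b": {"time": 1440, "max_size": 191, "max_slots": 48},
}
key, definition = compound_queue(queues, ["short.a", "short.b"])
print(key, definition)
```

Here the compound key is short.a,short.b and it inherits the 63 GB and 24 slot limits of short.a.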

SLURM specific usage

Job dependencies

The default dependency handler (-j) takes a job id and uses --dependency=afterany:<jobid> to control job hierarchies of non-array tasks. To switch to the old behaviour (pre-version 1.6.0) of using afterok (ancestor job must complete successfully), either modify the method configuration to set strict_dependencies to True or set the environment variable FSLSUB_STRICTDEPS to "1". Where you need to specify complex dependencies you may pass the raw SLURM dependency description (without the --dependency=) in the -j argument. For array tasks, the array_hold argument is used instead of the -j argument. This can take the same forms as the -j argument (job ids or complex hold descriptions).
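
The mapping from a -j value to a SLURM option can be sketched as follows. This is an illustration of the behaviour described above, not the plugin's code. The function name dependency_option is hypothetical, it handles only a single numeric job id or a raw dependency description, and it assumes that FSLSUB_STRICTDEPS, when set, overrides the strict_dependencies configuration value.

```python
import os


def dependency_option(jobhold, strict_config=False, environ=None):
    """Build a --dependency option from a job id or a raw dependency description."""
    environ = os.environ if environ is None else environ
    strict = strict_config
    # Assumption: the environment variable takes precedence when present.
    if "FSLSUB_STRICTDEPS" in environ:
        strict = environ["FSLSUB_STRICTDEPS"] == "1"
    hold = str(jobhold)
    if not hold.isdigit():
        # Anything that is not a plain job id is passed through unchanged.
        return "--dependency=" + hold
    kind = "afterok" if strict else "afterany"
    return f"--dependency={kind}:{hold}"


print(dependency_option("12345", environ={}))
print(dependency_option("12345", environ={"FSLSUB_STRICTDEPS": "1"}))
print(dependency_option("afterok:100:101,afternotok:102", environ={}))
```

The first call yields an afterany dependency, the second an afterok dependency, and the third passes the complex description through as given.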
