
Meta-scheduler that merges Slurm job arrays

Project description

sarray

Merge multiple independent Slurm job arrays into a single sbatch submission.

Instead of flooding the scheduler with N separate job arrays, sarray combines them into one array job where each task is routed to the right script with the right arguments. This reduces scheduler overhead and makes queue management easier.

How it works

Given two scripts:

# train.slurm  →  --array=0-2  (3 tasks)
# eval.slurm   →  --array=0-4  (5 tasks)

sarray generates a single array job with --array=0-7 (8 tasks) and a dispatcher that maps each global task ID back to the right script and local task ID:

global 0,1,2     → train.slurm  with SLURM_ARRAY_TASK_ID = 0,1,2
global 3,4,5,6,7 → eval.slurm   with SLURM_ARRAY_TASK_ID = 0,1,2,3,4

All SLURM_ARRAY_TASK_* environment variables are set correctly in each task.
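The routing boils down to an offset calculation per script. A minimal sketch of the idea in shell (illustrative only — not sarray's actual generated code), using the two-script example above:

```shell
# Illustrative dispatcher: map a global array task ID back to
# (script, local ID) for train.slurm (3 tasks) + eval.slurm (5 tasks).
dispatch() {
  gid=$1
  if [ "$gid" -lt 3 ]; then
    # Global IDs 0-2 belong to train.slurm, local IDs 0-2.
    echo "train.slurm $gid"
  else
    # Global IDs 3-7 belong to eval.slurm, local IDs 0-4.
    echo "eval.slurm $((gid - 3))"
  fi
}

dispatch 1   # → train.slurm 1
dispatch 5   # → eval.slurm 2
```

In the real generated script, the selected target runs with SLURM_ARRAY_TASK_ID rewritten to the local ID.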

Constraint: all merged jobs must have identical #SBATCH options (same resources, partition, etc.) — only --array can differ.


Installation

pip install sarray
# or
uv add sarray

Usage

Interactive mode (recommended)

Start a listen session — this spawns a subshell where sbatch is intercepted:

sarray listen

Your prompt changes to [sarray] (bold yellow) to indicate you're in a session.

Inside the session, call sbatch normally. Every call is queued instead of submitted:

sbatch --array=0-4 train.slurm model_a
sbatch --array=0-4 train.slurm model_b
sbatch eval.slurm

When ready, submit everything as one merged array:

sarray submit

Or discard and exit without submitting:

sarray cancel

Both commands exit the subshell automatically.


Standalone mode

Pass a queue file directly — no subshell needed:

sarray submit jobs.conf

Where jobs.conf contains one sbatch call per line:

sbatch --array=0-2 train.slurm lr=0.01
sbatch --array=0-2 train.slurm lr=0.001
sbatch --array=0-2 train.slurm lr=0.0001

Read from stdin:

echo "sbatch job.slurm" | sarray submit -
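Since the queue file is just one sbatch call per line, it can also be generated programmatically. A sketch for a learning-rate sweep (the file name and parameter values are illustrative):

```shell
# Write one sbatch line per learning rate, then submit them as one array.
for lr in 0.01 0.001 0.0001; do
  echo "sbatch --array=0-2 train.slurm lr=$lr"
done > jobs.conf

cat jobs.conf
# sarray submit jobs.conf   # uncomment to actually submit
```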

Commands

sarray listen

Spawns an interactive subshell with a fake sbatch that queues calls into a temporary file. The real sbatch is shadowed only inside this subshell — your parent shell is unaffected.

Exit the session with sarray submit or sarray cancel.
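Shadowing a command inside a subshell can be pictured as a stub placed earlier in PATH that records calls instead of executing them — a sketch of the general technique, not necessarily sarray's exact implementation (QUEUE, STUBDIR, and SARRAY_QUEUE are illustrative names):

```shell
# Create a stub 'sbatch' that appends each call to a queue file.
QUEUE=$(mktemp)
STUBDIR=$(mktemp -d)
cat > "$STUBDIR/sbatch" <<'EOF'
#!/bin/sh
echo "sbatch $*" >> "$SARRAY_QUEUE"
EOF
chmod +x "$STUBDIR/sbatch"

# Inside the subshell, the stub wins PATH lookup over the real sbatch,
# so calls are queued rather than submitted.
SARRAY_QUEUE=$QUEUE PATH="$STUBDIR:$PATH" sh -c '
  sbatch --array=0-4 train.slurm
  sbatch eval.slurm
'
cat "$QUEUE"
```

Because only the subshell's PATH is modified, the parent shell keeps the real sbatch untouched.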


sarray submit [FILE|-]

Generate and submit the merged job array.

Argument / Flag             Description
FILE                        Queue file to read (one sbatch ... line per job). Omit to use the active listen session.
-                           Read the queue from stdin.
-o, --output FILE           Save the generated script to this file (default: sarray.slurm in the current directory).
-n, --dry-run               Print the generated script to stdout (syntax-highlighted) without submitting.
-t, --throttle N            Limit the number of simultaneously running tasks (%N appended to --array).
any sbatch flag             Any unknown flag is treated as an sbatch option and applied to the merged script (see below).
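The -t/--throttle flag uses Slurm's % suffix on --array. For example, merging 8 tasks and passing -t 4 would produce a directive like:

```
#SBATCH --array=0-7%4   # 8 tasks total, at most 4 running at once
```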

The generated script is always written to disk before submission (sarray.slurm by default), so you can inspect exactly what was submitted.

Submit-time sbatch overrides. Any flag not recognized by sarray submit is passed through as an #SBATCH directive in the generated script, overriding whatever the individual jobs had. Useful for options you can't know ahead of time:

# Chain two job arrays
sarray submit --dependency=aftercorr:12345

# Use a reservation you just got
sarray submit --reservation=my_nodes

# Delay start
sarray submit --begin=2026-04-04T08:00:00

CLI flags override #SBATCH directives. For example:

sbatch --mem=8GB job.slurm    # overrides #SBATCH --mem in job.slurm
sbatch --array=0-9 job.slurm  # overrides #SBATCH --array in job.slurm

--wrap is also supported (no script file needed):

sbatch --wrap "python train.py" --array=0-4 --mem=16GB

sarray cancel

Discard the current listen session queue and exit the subshell.


sarray throttle JOBID -n N [--requeue] [--kill]

Update the concurrent task limit of a running job array without cancelling it.

Argument / Flag               Description
JOBID                         ID of the running job array.
-n, --max, --max-tasks N      New maximum number of simultaneously running tasks.
-r, --requeue                 Requeue the most recently started tasks running above the new limit (they will run again later).
-k, --kill                    Cancel (via scancel) the most recently started tasks running above the new limit.

Excess tasks are always selected by recency — the most recently started ones are acted on first. --requeue and --kill are mutually exclusive.

# Slow down a running array to 2 concurrent tasks
sarray throttle 123456 --max 2

# Slow down and requeue the excess running tasks (they will be rescheduled)
sarray throttle 123456 --max 2 --requeue

# Slow down and permanently cancel the excess running tasks
sarray throttle 123456 --max 2 --kill

The command checks that the job exists, belongs to you, and is a job array before updating.


Example workflow

$ sarray listen
[sarray] $ sbatch --array=0-9 experiments/baseline.slurm
[sarray] $ sbatch --array=0-9 experiments/ablation.slurm
[sarray] $ sbatch --array=0-9 experiments/ablation2.slurm
[sarray] $ sarray submit --dry-run   # preview the merged script
[sarray] $ sarray submit             # submit and exit the session
Submitted batch job 42137
$

Result: one job array with 30 tasks instead of 3 separate submissions.



Download files

Download the file for your platform.

Source Distribution

sarray-0.3.1.tar.gz (21.3 kB)

Uploaded Source

Built Distribution


sarray-0.3.1-py3-none-any.whl (17.5 kB)

Uploaded Python 3

File details

Details for the file sarray-0.3.1.tar.gz.

File metadata

  • Download URL: sarray-0.3.1.tar.gz
  • Size: 21.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sarray-0.3.1.tar.gz
Algorithm Hash digest
SHA256 02947242fecb106977d744fea887e15e8a6e89bacb729167a76bbf9ad6840c60
MD5 b04edce83c79b2ee0a270812a4bac53f
BLAKE2b-256 4af4c878b929f49cd20c02f98dce93ddea1eccdb3ce5bcbaabd86dbdafd6cd3a


Provenance

The following attestation bundles were made for sarray-0.3.1.tar.gz:

Publisher: release.yml on ncassereau/sarray

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file sarray-0.3.1-py3-none-any.whl.

File metadata

  • Download URL: sarray-0.3.1-py3-none-any.whl
  • Size: 17.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sarray-0.3.1-py3-none-any.whl
Algorithm Hash digest
SHA256 4dced48d726d053a4981e6c9b30c09f3a91a800723558b38b485d96194829857
MD5 f2554b448f2080ba481b50ea9d4c72fc
BLAKE2b-256 8ca7cd08d9eba37b049dc4bef647cd3f9ff8155ea18e1fd2e0ddf6d0397282dd


Provenance

The following attestation bundles were made for sarray-0.3.1-py3-none-any.whl:

Publisher: release.yml on ncassereau/sarray

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
