A meta-scheduler that merges Slurm job arrays

Project description

sarray

Merge multiple independent Slurm job arrays into a single sbatch submission.

Instead of flooding the scheduler with N separate job arrays, sarray combines them into one array job where each task is routed to the right script with the right arguments. This reduces scheduler overhead and makes queue management easier.

How it works

Given two scripts:

# train.slurm  →  --array=0-2  (3 tasks)
# eval.slurm   →  --array=0-4  (5 tasks)

sarray generates a single array job with --array=0-7 (8 tasks) and a dispatcher that maps each global task ID back to the right script and local task ID:

global 0,1,2     → train.slurm  with SLURM_ARRAY_TASK_ID = 0,1,2
global 3,4,5,6,7 → eval.slurm   with SLURM_ARRAY_TASK_ID = 0,1,2,3,4

All SLURM_ARRAY_TASK_* environment variables are set correctly in each task.
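The dispatch step can be sketched as a small shell function. The sizes and names below mirror the example above; this is an illustrative sketch, not sarray's actual generated code:

```shell
# Illustrative dispatcher: map a global task ID back to a script and a
# local task ID, given the example sizes above (train=3, eval=5).
map_task() {
  gid=$1
  if [ "$gid" -le 2 ]; then
    echo "train.slurm $gid"          # local IDs 0-2
  else
    echo "eval.slurm $((gid - 3))"   # local IDs 0-4
  fi
}

map_task 2   # train.slurm 2
map_task 3   # eval.slurm 0
```

In the real generated script, the selected script would then be executed with SLURM_ARRAY_TASK_ID re-exported to the local ID.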

Constraint: all merged jobs must have identical #SBATCH options (same resources, partition, etc.) — only --array can differ.


Installation

pip install sarray
# or
uv add sarray

Usage

Interactive mode (recommended)

Start a listen session — this spawns a subshell where sbatch is intercepted:

sarray listen

Your prompt changes to [sarray] (bold yellow) to indicate you're in a session.

Inside the session, call sbatch normally. Every call is queued instead of submitted:

sbatch --array=0-4 train.slurm model_a
sbatch --array=0-4 train.slurm model_b
sbatch eval.slurm

When ready, submit everything as one merged array:

sarray submit

Or discard and exit without submitting:

sarray cancel

Both commands exit the subshell automatically.


Standalone mode

Pass a queue file directly — no subshell needed:

sarray submit jobs.conf

Where jobs.conf contains one sbatch call per line:

sbatch --array=0-2 train.slurm lr=0.01
sbatch --array=0-2 train.slurm lr=0.001
sbatch --array=0-2 train.slurm lr=0.0001

Read from stdin:

echo "sbatch job.slurm" | sarray submit -
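For the jobs.conf above, the three 3-task arrays would merge into a single 9-task array. A quick sanity check of the arithmetic:

```shell
# Three sweeps of --array=0-2 (3 tasks each) merge into one array.
tasks_per_job=3
n_jobs=3
total=$((tasks_per_job * n_jobs))
echo "--array=0-$((total - 1))"   # --array=0-8
```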

Commands

sarray listen

Spawns an interactive subshell with a fake sbatch that queues calls into a temporary file. The real sbatch is shadowed only inside this subshell — your parent shell is unaffected.
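A minimal sketch of how such a shim can work (sarray's real implementation may differ, and the SARRAY_QUEUE name is illustrative): a shell function shadows the sbatch binary and appends each call to a queue file instead of submitting it.

```shell
# Hypothetical sbatch shim: shadow the real binary with a function
# that records each call in a temporary queue file.
SARRAY_QUEUE=$(mktemp)

sbatch() {
  printf 'sbatch %s\n' "$*" >> "$SARRAY_QUEUE"
  echo "queued: sbatch $*"
}

sbatch --array=0-4 train.slurm model_a
sbatch eval.slurm
cat "$SARRAY_QUEUE"
```

Because the function only exists in the subshell's environment, exiting the session restores the real sbatch.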

Exit the session with sarray submit or sarray cancel.


sarray submit [FILE|-]

Generate and submit the merged job array.

Argument / Flag           Description
FILE                      Queue file to read (one sbatch ... line per job). Omit to use the active listen session.
-                         Read the queue from stdin.
-o, --output FILE         Save the generated script to this file (default: sarray.slurm in the current directory).
-n, --dry-run             Print the generated script to stdout (syntax-highlighted) without submitting.
-t, --throttle N          Limit the number of simultaneously running tasks (%N appended to --array).

The generated script is always written to disk before submission (sarray.slurm by default), so you can inspect exactly what was submitted.
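The %N suffix used by --throttle is standard Slurm syntax: --array=0-29%4 runs at most 4 of the 30 tasks at once. A small sketch of applying it to a merged range (variable names are illustrative):

```shell
# Append a %N throttle to a merged array spec, using Slurm's standard
# "max simultaneous tasks" syntax.
array_spec="0-29"
throttle=4
echo "--array=${array_spec}%${throttle}"   # --array=0-29%4
```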

CLI flags override #SBATCH directives. For example:

sbatch --mem=8GB job.slurm    # overrides #SBATCH --mem in job.slurm
sbatch --array=0-9 job.slurm  # overrides #SBATCH --array in job.slurm

--wrap is also supported (no script file needed):

sbatch --wrap "python train.py" --array=0-4 --mem=16GB

sarray cancel

Discard the current listen session queue and exit the subshell.


sarray throttle JOBID -n N [--kill]

Update the concurrent task limit of a running job array without cancelling it.

Argument / Flag             Description
JOBID                       ID of the running job array.
-n, --max, --max-tasks N    New maximum number of simultaneously running tasks.
-k, --kill                  Requeue tasks currently running above the new limit.

# Slow down a running array to 2 concurrent tasks
sarray throttle 123456 --max 2

# Slow down and immediately requeue the excess running tasks
sarray throttle 123456 --max 2 --kill

The command checks that the job exists, belongs to you, and is a job array before updating.
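Slurm itself exposes this limit through scontrol's ArrayTaskThrottle field. Assuming sarray wraps that mechanism (an assumption; the source does not say), a manual equivalent of the update, without sarray's existence and ownership checks, would be:

```shell
# Build the scontrol command that updates a running array's throttle
# (assumption: sarray wraps scontrol's ArrayTaskThrottle update).
jobid=123456
limit=2
cmd="scontrol update JobId=${jobid} ArrayTaskThrottle=${limit}"
echo "$cmd"
# On a cluster you would then run: eval "$cmd"
```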


Example workflow

$ sarray listen
[sarray] $ sbatch --array=0-9 experiments/baseline.slurm
[sarray] $ sbatch --array=0-9 experiments/ablation.slurm
[sarray] $ sbatch --array=0-9 experiments/ablation2.slurm
[sarray] $ sarray submit --dry-run   # preview the merged script
[sarray] $ sarray submit             # submit and exit the session
Submitted batch job 42137
$

Result: one job array with 30 tasks instead of 3 separate submissions.
