A Snakemake executor plugin for submitting jobs to an AWS Parallel Cluster (pcluster) SLURM cluster.

Snakemake executor plugin: pcluster-slurm v0.0.7

Snakemake Executor Plugins (generally)

See the Snakemake plugin catalog docs for general background on executor plugins.

pcluster-slurm plugin

AWS ParallelCluster (pcluster) SLURM

AWS ParallelCluster is a framework for deploying and managing dynamically scalable HPC clusters on AWS, running SLURM as the batch system; pcluster handles all of the creating, configuring, and deleting of the cluster compute nodes. Nodes may be spot or dedicated instances. Note: the AWS ParallelCluster port of SLURM has a few small but critical differences from the standard SLURM distribution. This plugin enables using SLURM from pcluster head and compute nodes via Snakemake >=8.

Daylily Bfx Framework

Daylily is a bioinformatics framework that automates and standardizes all aspects of creating a self-scaling, ephemeral cluster, which can grow from one head node to many thousands of as-needed spot compute instances (modulo your quotas and budget). It uses AWS ParallelCluster to manage the cluster and Snakemake to manage the bfx workflows; in this context, SLURM is the intermediary between Snakemake and the cluster's resource management. The pcluster SLURM variant does not play nicely with vanilla SLURM, and to date the standard slurm Snakemake executor has not worked with pcluster SLURM. This plugin bridges Snakemake and pcluster SLURM.

Pre-requisites

Snakemake >=8

Conda

conda create -n snakemake -c conda-forge -c bioconda snakemake==8.20.6
conda activate snakemake
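
A quick sanity check that the pinned version is active:

snakemake --version   # should print 8.20.6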

Installation (pip)

From an environment with snakemake and pip installed:

pip install snakemake-executor-plugin-pcluster-slurm
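
To confirm the plugin landed in the same environment as snakemake:

pip show snakemake-executor-plugin-pcluster-slurm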

Example Usage (daylily cluster headnode)

mkdir -p /fsx/resources/environments/containers/ubuntu/cache/
export SNAKEMAKE_OUTPUT_CACHE=/fsx/resources/environments/containers/ubuntu/cache/
snakemake --use-conda --use-singularity -j 10 \
  --singularity-prefix /fsx/resources/environments/containers/ubuntu/ip-10-0-0-240/ \
  --singularity-args "  -B /tmp:/tmp -B /fsx:/fsx  -B /home/$USER:/home/$USER -B $PWD/:$PWD" \
  --conda-prefix /fsx/resources/environments/containers/ubuntu/ip-10-0-0-240/ \
  --executor pcluster-slurm \
  --default-resources slurm_partition='i64,i128,i192' \
  --cache --verbose -k
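
Partitions can also be pinned per rule from the command line. A minimal sketch (the rule name align is a hypothetical stand-in for one of your own rules):

snakemake --executor pcluster-slurm -j 2 \
  --default-resources slurm_partition='i8' \
  --set-resources "align:slurm_partition=i128" \
  --dry-run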

What Partitions Are Available?

Use sinfo to learn about your cluster (note: sinfo reports on all potential and active compute nodes; read the docs to interpret which are active, which are not-yet-requested spot instances, etc.). Below is what the daylily AWS ParallelCluster looks like.

sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
i8*          up   infinite     12  idle~ i8-dy-gb64-[1-12]
i64          up   infinite     16  idle~ i64-dy-gb256-[1-8],i64-dy-gb512-[1-8]
i96          up   infinite     16  idle~ i96-dy-gb384-[1-8],i96-dy-gb768-[1-8]
i128         up   infinite     28  idle~ i128-dy-gb256-[1-8],i128-dy-gb512-[1-10],i128-dy-gb1024-[1-10]
i192         up   infinite     30  idle~ i192-dy-gb384-[1-10],i192-dy-gb768-[1-10],i192-dy-gb1536-[1-10]
a192         up   infinite     30  idle~ a192-dy-gb384-[1-10],a192-dy-gb768-[1-10],a192-dy-gb1536-[1-10]
  • If slurm_partition is unset, jobs should fall back to the cluster's default partition, which sinfo marks with an asterisk (i8 in the output above).
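
A couple of standard sinfo one-liners help sort this out:

sinfo -h -o "%P"   # partition names only; the default partition is suffixed with '*'
sinfo -s           # per-partition summary of node counts and states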

Other Cool Stuff

Real-Time Cost Tracking & Usage Throttling via Budgets, Tagging ... and the sbatch --comment Flag

I make extensive use of cost allocation tags with AWS ParallelCluster in the daylily omics analysis framework ($3 30x WGS analysis) to track AWS cluster usage costs in real time and to impose limits where appropriate (by user and project). This works by overriding the --comment flag to hold project/budget tags, which are applied to the ephemeral AWS resources and thereby enable cost tracking and controls.

  • To change the --comment flag in v0.0.8 of the pcluster-slurm plugin, set the desired value in the environment variable SMK_SLURM_COMMENT (default: RandD), as sketched below.
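
A minimal sketch of the round trip (projectX is a hypothetical budget tag; the squeue format flags are standard SLURM):

export SMK_SLURM_COMMENT=projectX   # hypothetical project/budget tag
snakemake --executor pcluster-slurm -j 10 ...   # jobs now carry sbatch --comment=projectX
squeue -o "%.10i %.20j %k"   # %k prints each job's comment field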
