ncar-jobqueue

ncar-jobqueue provides utilities for configuring dask-jobqueue with appropriate default settings for NCAR's clusters.
The following compute servers are supported:
- Cheyenne (cheyenne.ucar.edu)
- Casper (DAV) (casper.ucar.edu)
- Hobart (hobart.cgd.ucar.edu)
- Izumi (izumi.unified.ucar.edu)
Installation
NCAR-jobqueue can be installed from PyPI with pip:
```bash
python -m pip install ncar-jobqueue
```
NCAR-jobqueue is also available from conda-forge for conda installations:
```bash
conda install -c conda-forge ncar-jobqueue
```
Configuration
ncar-jobqueue provides a custom configuration file with appropriate default settings for different clusters. This configuration file resides in ~/.config/dask/ncar-jobqueue.yaml:
```yaml
cheyenne:
  pbs:
    #project: XXXXXXXX
    name: dask-worker-cheyenne
    cores: 18 # Total number of cores per job
    memory: '109GB' # Total amount of memory per job
    processes: 18 # Number of Python processes per job
    interface: ib0 # Network interface to use like eth0 or ib0
    queue: regular
    walltime: '01:00:00'
    resource-spec: select=1:ncpus=36:mem=109GB
    log-directory: '/glade/scratch/${USER}/dask/cheyenne/logs'
    local-directory: '/glade/scratch/${USER}/dask/cheyenne/local-dir'
    job-extra: []
    env-extra: []
    death-timeout: 60

casper-dav:
  pbs:
    #project: XXXXXXXX
    name: dask-worker-casper-dav
    cores: 2 # Total number of cores per job
    memory: '25GB' # Total amount of memory per job
    processes: 1 # Number of Python processes per job
    interface: ib0
    walltime: '01:00:00'
    resource-spec: select=1:ncpus=1:mem=25GB
    queue: casper
    log-directory: '/glade/scratch/${USER}/dask/casper-dav/logs'
    local-directory: '/glade/scratch/${USER}/dask/casper-dav/local-dir'
    job-extra: []
    env-extra: []
    death-timeout: 60

hobart:
  pbs:
    name: dask-worker-hobart
    cores: 10 # Total number of cores per job
    memory: '96GB' # Total amount of memory per job
    processes: 10 # Number of Python processes per job
    # interface: null # ib0 doesn't seem to be working on Hobart
    queue: medium
    walltime: '08:00:00'
    resource-spec: nodes=1:ppn=48
    log-directory: '/scratch/cluster/${USER}/dask/hobart/logs'
    local-directory: '/scratch/cluster/${USER}/dask/hobart/local-dir'
    job-extra: ['-r n']
    env-extra: []
    death-timeout: 60

izumi:
  pbs:
    name: dask-worker-izumi
    cores: 10 # Total number of cores per job
    memory: '96GB' # Total amount of memory per job
    processes: 10 # Number of Python processes per job
    # interface: null # ib0 doesn't seem to be working on Izumi
    queue: medium
    walltime: '08:00:00'
    resource-spec: nodes=1:ppn=48
    log-directory: '/scratch/cluster/${USER}/dask/izumi/logs'
    local-directory: '/scratch/cluster/${USER}/dask/izumi/local-dir'
    job-extra: ['-r n']
    env-extra: []
    death-timeout: 60
```
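Because this file sits in Dask's configuration directory, its values are merged into Dask's global configuration when Dask is imported. The following is a minimal sketch of reading those values back, assuming the file above is in place and is loaded under the top-level machine keys shown (this inspection pattern is standard `dask.config` usage, not an ncar-jobqueue API):

```python
import dask
import ncar_jobqueue  # noqa: F401  # importing ensures the package's config is available

# Read back the Cheyenne defaults shown above; dask.config.get pulls values
# from all YAML files Dask found in ~/.config/dask/ (assumption: the
# settings live under the top-level 'cheyenne' key, as in the file above).
print(dask.config.get('cheyenne.pbs.cores'))     # 18
print(dask.config.get('cheyenne.pbs.walltime'))  # 01:00:00
```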
Note:
- To configure a default project account that is used by dask-jobqueue when submitting batch jobs, uncomment the project key/line in ~/.config/dask/ncar-jobqueue.yaml and set it to an appropriate value (a runtime alternative is sketched below).
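If you would rather not edit the file, the same key can in principle be overridden at runtime through Dask's configuration API. This is a hypothetical sketch: `UABC0001` is a placeholder project code, and it assumes ncar-jobqueue reads the setting from the top-level keys shown above; editing the YAML file is the documented route.

```python
import dask

# Hypothetical runtime override of the commented-out 'project' key;
# 'UABC0001' is a placeholder, not a real project code.
dask.config.set({'cheyenne.pbs.project': 'UABC0001'})
```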
Usage
Note:
⚠️ Online documentation for dask-jobqueue is available at https://jobqueue.dask.org/. ⚠️
Casper
```python
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
```
Cheyenne
```python
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
```
Hobart
```python
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
```
Izumi
```python
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
```
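On each of these machines the pattern is the same: create the cluster, scale it, attach a client, and compute. Below is a minimal end-to-end sketch; the project code 'XXXXXXXX' is a placeholder and the array sizes are arbitrary:

```python
import dask.array as da
from dask.distributed import Client
from ncar_jobqueue import NCARCluster

cluster = NCARCluster(project='XXXXXXXX')  # placeholder project code
cluster.scale(jobs=2)      # submit two batch jobs' worth of workers
client = Client(cluster)   # connect to the cluster's scheduler

# Any Dask computation now runs on the batch-job workers.
x = da.random.random((10_000, 10_000), chunks=(1_000, 1_000))
print(x.mean().compute())

client.close()
cluster.close()
```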
Non-NCAR machines
On non-NCAR machines, ncar-jobqueue will warn the user and use distributed.LocalCluster:
```python
>>> from ncar_jobqueue import NCARCluster
.../ncar_jobqueue/cluster.py:17: UserWarning: Unable to determine which NCAR cluster you are running on... Returning a `distributed.LocalCluster` class.
  warn(message)
>>> from dask.distributed import Client
>>> cluster = NCARCluster()
>>> cluster
LocalCluster(3a7dd0f6, 'tcp://127.0.0.1:64184', workers=4, threads=8, memory=17.18 GB)
```
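Because of this fallback, a script written against NCARCluster can in principle run unchanged on a laptop. A minimal sketch (no project code is needed for the local fallback):

```python
from dask.distributed import Client
from ncar_jobqueue import NCARCluster

# On a non-NCAR machine this returns a distributed.LocalCluster (with a
# warning), so the same code path works both on and off NCAR systems.
cluster = NCARCluster()
client = Client(cluster)
print(client.dashboard_link)  # e.g. http://127.0.0.1:8787/status

client.close()
cluster.close()
```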