Emulate Slurm/PBS/LSF HPC scheduler in Azure ML

Project description

amlhpc

A package that provides a "just enough" Slurm or PBS experience on Azure Machine Learning (AML). Use the well-known sbatch/qsub/sinfo commands to submit jobs and inspect the state of the HPC system in a familiar way. This lets applications interact with AML without the need to program another integration.

For the commands to function, the following environment variables have to be set:

SUBSCRIPTION=<guid of your Azure subscription e.g. 12345678-1234-1234-1234-1234567890ab>
CI_RESOURCE_GROUP=<name of the resource group where your Azure Machine Learning Workspace is created>
CI_WORKSPACE=<name of your Azure Machine Learning Workspace>

In the Azure Machine Learning environment, CI_RESOURCE_GROUP and CI_WORKSPACE are normally already set, so you only need to export SUBSCRIPTION.
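
Before calling any of the commands, it can help to confirm that the three variables are present. The following is a minimal sketch and not part of amlhpc; the helper name missing_amlhpc_vars is hypothetical, and the only thing taken from this page is the list of variable names.

```python
import os

# Variable names as listed above; everything else here is illustrative.
REQUIRED_VARS = ("SUBSCRIPTION", "CI_RESOURCE_GROUP", "CI_WORKSPACE")


def missing_amlhpc_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_amlhpc_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing a plain dict instead of os.environ makes the check easy to try without touching the real environment.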

sinfo

Show the available partitions. sinfo does not take any options.

(azureml_py38) azureuser@login-vm:~/cloudfiles/code/Users/username$ sinfo
PARTITION       AVAIL   VM_SIZE                 NODES   STATE
f16s            UP      STANDARD_F16S_V2        37
hc44            UP      STANDARD_HC44RS         3
hbv2            UP      STANDARD_HB120RS_V2     4
login-vm        UP      STANDARD_DS12_V2        None
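
An application that wants to pick a partition programmatically could parse this table. The sketch below assumes the output is whitespace-separated with a single header row, as in the sample above, and that no value contains spaces; parse_sinfo is a hypothetical helper, not an amlhpc function.

```python
def parse_sinfo(text):
    """Parse sinfo-style output into a list of dicts keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        # The STATE column is empty in the sample, so pad missing trailing fields.
        fields += [""] * (len(header) - len(fields))
        rows.append(dict(zip(header, fields)))
    return rows


SAMPLE = """\
PARTITION       AVAIL   VM_SIZE                 NODES   STATE
f16s            UP      STANDARD_F16S_V2        37
hc44            UP      STANDARD_HC44RS         3
hbv2            UP      STANDARD_HB120RS_V2     4
login-vm        UP      STANDARD_DS12_V2        None
"""

partitions = parse_sinfo(SAMPLE)
print([p["PARTITION"] for p in partitions])
# prints ['f16s', 'hc44', 'hbv2', 'login-vm']
```

For live data, the text could come from subprocess.run(["sinfo"], capture_output=True, text=True).stdout instead of SAMPLE.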

squeue

Show the queue with historical jobs. squeue does not take any options.

(azureml_py38) azureuser@login-vm:~/cloudfiles/code/Users/username$ squeue
JOBID                           NAME            PARTITION       STATE   TIME
crimson_root_52y4l9yfjd         sbatch          f16s
polite_lock_v8wyc9gnx9          runscript.sh    f16s

sbatch

Submit a job, either as a command through the --wrap option or as a (shell) script. sbatch takes several options, which are listed in sbatch --help. Many sbatch options are supported, such as running multi-node MPI jobs with a set number of nodes. Array jobs are also supported through the standard --array option.

Some additional options are introduced to support, for example, the data-handling methods available in AML. These are explained in data.md.

(azureml_py38) azureuser@login-vm:~/cloudfiles/code/Users/username$ sbatch -p f16s --wrap="hostname"
gifted_engine_yq801rygm2
(azureml_py38) azureuser@login-vm:~/cloudfiles/code/Users/username$ sbatch --help
usage: sbatch [-h] [-a ARRAY] -p PARTITION [-N NODES] [-w WRAP] [script]

sbatch: submit jobs to Azure Machine Learning

positional arguments:
  script                script to be executed

optional arguments:
  -h, --help            show this help message and exit
  -a ARRAY, --array ARRAY
                        index for array jobs
  -p PARTITION, --partition PARTITION
                        set compute partition where the job should be run. Use <sinfo> to view available partitions
  -N NODES, --nodes NODES
                        amount of nodes to use for the job
  -w WRAP, --wrap WRAP  command line to be executed, should be enclosed with quotes
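
To submit from an application rather than a shell, the command line can be assembled from the options in the help text above. This is a sketch under assumptions: build_sbatch_command and submit are hypothetical helpers, the rule that exactly one of wrap or script is given is inferred from the description, and submit assumes sbatch prints only the job name on stdout, as in the example session.

```python
import subprocess


def build_sbatch_command(partition, wrap=None, script=None, nodes=None, array=None):
    """Build an argv list using only the options shown in sbatch --help."""
    # Assumption: a job is either a wrapped command or a script, never both.
    if (wrap is None) == (script is None):
        raise ValueError("provide exactly one of wrap or script")
    cmd = ["sbatch", "-p", partition]
    if nodes is not None:
        cmd += ["-N", str(nodes)]
    if array is not None:
        cmd += ["-a", str(array)]
    if wrap is not None:
        cmd += ["--wrap", wrap]
    else:
        cmd.append(script)
    return cmd


def submit(cmd):
    """Run sbatch and return whatever it prints on stdout, stripped."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


print(build_sbatch_command("f16s", wrap="hostname"))
# prints ['sbatch', '-p', 'f16s', '--wrap', 'hostname']
```

Building an argument list instead of a single shell string avoids quoting problems in the wrapped command, since no shell is involved when subprocess.run receives a list.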

If you encounter a scenario or option that is not yet supported or that behaves unexpectedly, please create an issue describing the option and the scenario.

Download files

Source Distribution

amlhpc-0.2.2.tar.gz (7.5 kB view details)

Uploaded Source

Built Distribution

amlhpc-0.2.2-py3-none-any.whl (9.1 kB view details)

Uploaded Python 3

File details

Details for the file amlhpc-0.2.2.tar.gz.

File metadata

  • Download URL: amlhpc-0.2.2.tar.gz
  • Upload date:
  • Size: 7.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.12

File hashes

Hashes for amlhpc-0.2.2.tar.gz
Algorithm Hash digest
SHA256 f836c020d75bb13f57d496493a216c151324964cb79580a347c6d5a1efba9e36
MD5 0632d69a55a7c051ddb81b18caaebde3
BLAKE2b-256 153693d99b4972fa4a70a57af2b2ac98f5c1571d704f33bb72d50563b448ef2d

File details

Details for the file amlhpc-0.2.2-py3-none-any.whl.

File metadata

  • Download URL: amlhpc-0.2.2-py3-none-any.whl
  • Upload date:
  • Size: 9.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.12

File hashes

Hashes for amlhpc-0.2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 bf6f53a6e6f8a7e0d7d40909f4b19b239d610ff505e253c9fc487b8e28b9c2c2
MD5 f59daafb455d5ed26a5e751345792cc3
BLAKE2b-256 e6a5ceda76e8d6b23e256754c45bcd46949d56f3618b711c31febd8022beee99
