A tool to run Slurm sbatch commands over SSH

🚀 HPC Rocket

HPC Rocket is a tool for sending Slurm commands to a remote machine and monitoring job progress. It was written primarily to launch Slurm jobs from a CI pipeline.

Installation

You can get the latest version of HPC Rocket on PyPI:

python3 -m pip install hpc-rocket

Authentication

HPC Rocket supports authentication via password and private key. Both can be set via environment variables.

Slurm configuration

Currently, all sbatch configuration must happen in the job file. HPC Rocket does not offer any other way to configure your batch jobs.
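
Because the job file carries all sbatch settings, a minimal one might look like the sketch below. The #SBATCH options are standard Slurm directives; the job name, limits and payload are placeholders, and the output file name is only chosen to match the collect entry in the configuration example further down. Whether the file ends up where the collect step looks for it depends on the job's working directory, which is an assumption here.

```shell
#!/bin/bash
# All Slurm settings live in this file as #SBATCH directives.
# Values are illustrative placeholders, not recommendations.
#SBATCH --job-name=rocket-demo
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --output=remote_slurmresult.out

# Payload: replace with the real workload, for example the
# executable copied to the remote machine:
# ./myexecutable
status_message="job finished on $(uname -n)"
echo "$status_message"
```

To the shell the #SBATCH lines are ordinary comments, so the script can also be run locally as a quick smoke test before it is submitted.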

Configuration file

HPC Rocket uses a configuration file in YAML format that contains the credentials for connecting to the remote machine. It also lets you copy files to the remote machine, copy results back to the local machine (collecting), and finally clean up copied or produced files.

Paths in the configuration file are normally relative. On the local machine they are evaluated from the current working directory, on the remote machine from the user's home directory, unless absolute paths are specified.

To overwrite existing files on the remote machine, specify the overwrite instruction for each file you want to overwrite.

HPC Rocket evaluates environment variables of the form ${VAR} and $VAR on the LOCAL machine when parsing the config file.

host: cluster.example.com
user: myuser
private_keyfile: ~/.ssh/id_rsa

proxyjumps:
  - host: myproxy.example.com
    user: myproxy-user
    private_keyfile: ~/.ssh/proxy_keyfile

copy:
  - from: jobs/slurm.job
    to: slurm.job
    overwrite: true

  - from: bin/myexecutable
    to: myexecutable

collect:
  - from: remote_slurmresult.out
    to: local_slurmresult.out
    overwrite: true

clean:
  - slurm.job
  - myexecutable

sbatch: slurm.job
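
As a sketch of the environment-variable expansion described above, the connection data can be kept out of the file and referenced by name. The variable names (CLUSTER_HOST, CLUSTER_USER, CLUSTER_KEYFILE) and the file name rocket-env-demo.yml are hypothetical; only keys that already appear in the example configuration are used.

```shell
#!/bin/sh
# Hypothetical variable names. In a CI pipeline these would
# typically come from the pipeline's secret store.
export CLUSTER_HOST="cluster.example.com"
export CLUSTER_USER="myuser"
export CLUSTER_KEYFILE="$HOME/.ssh/id_rsa"

# The quoted 'EOF' stops the shell from expanding the variables,
# so the file keeps the literal ${VAR} and $VAR references for
# HPC Rocket to resolve when it parses the config.
cat > rocket-env-demo.yml <<'EOF'
host: ${CLUSTER_HOST}
user: $CLUSTER_USER
private_keyfile: ${CLUSTER_KEYFILE}

copy:
  - from: jobs/slurm.job
    to: slurm.job
    overwrite: true

sbatch: slurm.job
EOF

# Show the generated file; the references are still unexpanded.
cat rocket-env-demo.yml
```

Since expansion happens on the local machine, the variables have to be set in the environment that runs hpc-rocket, not on the cluster.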

Usage

Launching a job on the remote machine

Use the launch command to launch a job on the remote machine. You must provide a configuration file. The optional --watch flag makes hpc-rocket wait until your job is finished (defaults to false). The collection and cleaning steps in the configuration file are only executed if --watch is set.

hpc-rocket launch --watch config.yml

Checking a job's status

If a job was launched without --watch, you can still check its status using the status command. You need to provide a configuration file with the connection data and the ID of the job to check.

hpc-rocket status config.yml 12345

Monitoring a job until it finishes

Similar to the status command, the watch command monitors a job's status continuously. It takes a config file and a job ID.

hpc-rocket watch config.yml 12345

Canceling a running job

Jobs can also be canceled using the cancel command. Like the previous commands, it accepts a config file and the ID of a running job.

hpc-rocket cancel config.yml 12345
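
The four commands can be combined into a detached workflow: launch without --watch, then check on the job later. The sketch below only prints the commands by default so that it runs without a cluster. The rocket wrapper and the DRY_RUN switch are hypothetical helpers, and 12345 is the placeholder job ID from the examples above; how you obtain the real ID after launching is not covered here.

```shell
#!/bin/sh
# Hypothetical wrapper: prints the command unless DRY_RUN=0 is set,
# in which case the real hpc-rocket executable is invoked.
rocket() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "hpc-rocket $*"
  else
    hpc-rocket "$@"
  fi
}

CONFIG="config.yml"
JOB_ID="12345"  # placeholder job ID

# 1. Submit the job and return immediately. Without --watch the
#    collect and clean steps of the config are not executed.
rocket launch "$CONFIG"

# 2. Later: a one-off status check.
rocket status "$CONFIG" "$JOB_ID"

# 3. Either follow the job until it finishes ...
rocket watch "$CONFIG" "$JOB_ID"

# 4. ... or give up on it instead:
# rocket cancel "$CONFIG" "$JOB_ID"
```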
