A tool to retrieve and display Slurm usage data

slurm-scredits

scredits is a Slurm utility for checking account balances. It calculates the service units (SU) remaining in each account and reports SU as an aggregate of CPU, GPU, and memory usage.
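
As a rough illustration of how a single SU figure can aggregate several resources, the sketch below folds cpu, mem, and gpu usage into one number using per-resource weights, in the spirit of Slurm's TRESBillingWeights. The weights and the `billing_su` helper are hypothetical and not taken from scredits; the real figure depends on how billing is configured on your cluster.

```python
# Hypothetical per-resource weights (SU charged per resource-minute).
# Real clusters define their own weights; these are for illustration only.
WEIGHTS = {"cpu": 1.0, "mem": 0.25, "gpu": 10.0}


def billing_su(usage_minutes, weights=WEIGHTS):
    """Return the weighted sum of resource-minutes as a single SU figure."""
    return sum(weights.get(name, 0.0) * minutes
               for name, minutes in usage_minutes.items())


if __name__ == "__main__":
    # 120 CPU-minutes, 480 GB-minutes of memory, 30 GPU-minutes
    usage = {"cpu": 120, "mem": 480, "gpu": 30}
    print(billing_su(usage))  # 120*1.0 + 480*0.25 + 30*10.0 = 540.0
```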

A companion script (scredits-crontab-script.sh) can automatically reset account credits every X months on one or more clusters.

Prerequisites

  • Slurm with Accounting enabled.
  • TRES resources enabled and GrpTRESMins billing set.
  • Optionally gres/gpu enabled and configured.

Usage

usage: scredits [-h] [-v] [-V] [-d] [-a ACCOUNT] [-j]

Retrieve and display Slurm usage data.

options:
  -h, --help            show this help message and exit
  -v, --verbose         Print debug messages
  -V, --version         Print program version
  -d, --detailed        Show detailed account and user association
  -a ACCOUNT, --account ACCOUNT
                        Account name to filter results
  -j, --json            Print the output in JSON. Compatible with Open OnDemand

Installation

Main command

pip install scredits

To automate credit resets with the companion script

mkdir /etc/scredits && cd /etc/scredits
wget https://raw.githubusercontent.com/giuliolibrando/slurm-scredits/main/scredits-crontab-script.sh
chmod +x scredits-crontab-script.sh

Add this line to root's crontab to run the script every day at midnight (append flags if you need them)

sudo crontab -e
0 0 * * * /etc/scredits/scredits-crontab-script.sh

Setting up Slurm

scredits currently supports the following setup.

  • The balance is limited per account.
  • The account limit is set through GrpTRESMins using the billing parameter.

The following is an example setup.

Create account test_account with a billing balance of 1000

sacctmgr add account test_account set GrpTRESMins=billing=1000

Add users test_user and test_user2 to account test_account

sacctmgr add user test_user set Account=test_account
sacctmgr add user test_user2 set Account=test_account
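
To confirm that the limit was stored, one option is to read the associations back and pick out the billing entry from GrpTRESMins. The sketch below only parses text; the sample is hypothetical and merely shaped like what `sacctmgr -nP show assoc format=Account,User,GrpTRESMins` is assumed to print (pipe-delimited, no header). The `parse_billing_limits` helper is illustrative and not part of scredits.

```python
def parse_billing_limits(text):
    """Map account name -> billing limit from pipe-delimited association rows.

    Expects rows of the form "account|user|GrpTRESMins". Rows that name a
    user are per-user associations and are skipped.
    """
    limits = {}
    for line in text.strip().splitlines():
        account, user, tres = line.split("|")
        if user:
            continue
        # GrpTRESMins is a comma-separated list such as "billing=1000".
        for item in tres.split(","):
            key, _, value = item.partition("=")
            if key == "billing" and value:
                limits[account] = int(value)
    return limits


# Hypothetical sample, assumed to resemble the output of:
#   sacctmgr -nP show assoc format=Account,User,GrpTRESMins
SAMPLE = """\
root||
root|root|
test_account||billing=1000
test_account|test_user|
test_account|test_user2|
"""

if __name__ == "__main__":
    print(parse_billing_limits(SAMPLE))
```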

Check the balance for all accounts

[test@localhost ~]$ scredits

Last credits reset: 09/07/2024 00:01
Next credits reset: 31/07/2024 23:59

Account         | Allocation(SU)  | Remaining(SU)   | Used(SU)   | Used(%) |
-----------------------------------------------------------------------------
test_account    | 1000.0          | 1000.0          | 0          | 0.0
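
The columns in this table appear to relate in the obvious way: remaining is allocation minus usage, and the percentage is usage over allocation. The sketch below spells that arithmetic out, assuming this relationship holds; `balance_row` is a hypothetical helper, not a function from scredits.

```python
def balance_row(allocation, used):
    """Derive the Remaining(SU) and Used(%) columns from allocation and usage."""
    remaining = allocation - used
    # Guard against accounts with no limit set (allocation of 0).
    used_pct = (used / allocation * 100) if allocation else 0.0
    return remaining, round(used_pct, 1)


if __name__ == "__main__":
    print(balance_row(1000.0, 0))    # fresh account, as in the table above
    print(balance_row(1000.0, 250))  # same account after 250 SU of usage
```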

For more details, use the -d flag.

[test@localhost ~]$ scredits -d

Last credits reset: 09/07/2024 00:01
Next credits reset: 31/07/2024 23:59

------------------------------------------------------------------------------------------
Account              | User            | Consumed (SU)   | % SU Usage      | Used Resources
------------------------------------------------------------------------------------------
root                 |                 |                 |                 |
                     | root            | 0               | 0.00%           | cpu=0, mem=0, gpu=0
                     |                 |                 |                 |
                     | Total:          | 0/0             | 0.00%           | cpu=0, mem=0, gpu=0
------------------------------------------------------------------------------------------
test_account         |                 |                 |                 |
                     | test_account    | 0               | 0.00%           | cpu=0, mem=0, gpu=0
                     | test_account2   | 0               | 0.00%           | cpu=0, mem=0, gpu=0
                     |                 |                 |                 |
                     | Total:          | 0/1000          | 0.00%           | cpu=0, mem=0, gpu=0
------------------------------------------------------------------------------------------

You can filter by account with the -a flag.

[test@localhost ~]$ scredits -d -a test_account

Last credits reset: 09/07/2024 00:01
Next credits reset: 31/07/2024 23:59

------------------------------------------------------------------------------------------
Account              | User            | Consumed (SU)   | % SU Usage      | Used Resources
------------------------------------------------------------------------------------------
test_account         |                 |                 |                 |
                     | test_account    | 0               | 0.00%           | cpu=0, mem=0, gpu=0
                     | test_account2   | 0               | 0.00%           | cpu=0, mem=0, gpu=0
                     |                 |                 |                 |
                     | Total:          | 0/1000          | 0.00%           | cpu=0, mem=0, gpu=0
------------------------------------------------------------------------------------------

Use the --json flag if you need JSON output compatible with the Open OnDemand Balance Warning feature.

 root@master1:~/slurm-scredits# scredits --json
{
  "version": 1,
  "timestamp": 1729506306,
  "config": {
    "unit": "SU",
    "project_type": "project"
  },
  "balances": [
    {
      "user": "userA",
      "project": "projectA",
      "value": 792
    },
    {
      "user": "userB",
      "project": "projectA",
      "value": 792
    },
    {
      "user": "userA",
      "project": "projectB",
      "value": 73445
    }
  ]
}
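
A consumer of this JSON might flag balances that are running low. The sketch below parses a trimmed, hand-written sample with the same shape as the output above and lists entries under a threshold. The threshold value and the `low_balances` helper are assumptions made for illustration.

```python
import json

# Hand-written sample with the same keys as the scredits --json output.
# In practice the text could come from running `scredits --json`.
SAMPLE = """
{
  "version": 1,
  "timestamp": 1729506306,
  "config": {"unit": "SU", "project_type": "project"},
  "balances": [
    {"user": "userA", "project": "projectA", "value": 792},
    {"user": "userA", "project": "projectB", "value": 73445}
  ]
}
"""


def low_balances(report, threshold):
    """Return (user, project, value) tuples whose balance is below threshold."""
    return [(b["user"], b["project"], b["value"])
            for b in report["balances"]
            if b["value"] < threshold]


if __name__ == "__main__":
    report = json.loads(SAMPLE)
    # 1000 SU is an arbitrary warning threshold chosen for this example.
    print(low_balances(report, threshold=1000))
```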

N.B. "Last credits reset" and "Next credits reset" are shown only if the companion crontab script is enabled.

Crontab script

[test@localhost ~]$ /etc/scredits/scredits-crontab-script.sh -h
Usage: ./scredits-crontab-script.sh [-v] [-c CLUSTER] [-h] [-m MONTHS]

Options:
  -v          Enable verbose mode.
  -c CLUSTER  Specify the cluster name(s), separated by commas.
  -m MONTHS   Specify the number of months before the next prune.
  -h          Show this help message.

The script accepts multiple clusters with the -c parameter.

root@master1:~/slurm-scredits# ./scredits-crontab-script.sh -v -c clusterA,clusterB
Modifying account aaaaaaa in cluster clusterA
Modifying account bbbbbbb in cluster clusterB
SCREDITS_LAST_PRUNE set to: 2024-07-09-14-39
SCREDITS_NEXT_PRUNE set to: 2024-07-31-23-59

By default the script resets credits every 3 months. To set a different interval, use -m.

root@master1:~/slurm-scredits# ./scredits-crontab-script.sh -v -c clusterA,clusterB -m 5
Modifying account aaaaaaa in cluster clusterA
Modifying account bbbbbbb in cluster clusterB
SCREDITS_LAST_PRUNE set to: 2024-07-09-14-39
SCREDITS_NEXT_PRUNE set to: 2024-12-31-23-59
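
The -m 5 run above moves the next reset from 2024-07-09 to the last minute of December 2024. One way to reproduce that date, assuming the reset always lands at 23:59 on the last day of the month that is MONTHS months after the last reset, is sketched below. This is inferred from the sample output only; the script's actual logic may differ, and `next_prune` is a hypothetical helper.

```python
import calendar
from datetime import datetime


def next_prune(last_prune, months):
    """Return 23:59 on the last day of the month `months` after last_prune."""
    # Use a zero-based month index so year rollover is plain arithmetic.
    index = last_prune.year * 12 + (last_prune.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return datetime(year, month, last_day, 23, 59)


if __name__ == "__main__":
    last = datetime(2024, 7, 9, 14, 39)
    # Same timestamp layout as SCREDITS_NEXT_PRUNE in the output above.
    print(next_prune(last, 5).strftime("%Y-%m-%d-%H-%M"))  # 2024-12-31-23-59
```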

Build yourself

Clone the repo

git clone https://github.com/giuliolibrando/slurm-scredits.git

Enter the folder

cd slurm-scredits

Install via pip

pip install .

