
A standalone Python application for killing errant processes on Slurm-based compute nodes.


Shinigami

Shinigami is a standalone Python application for killing errant processes on Slurm-based compute nodes. The application scans for and terminates any running processes not associated with a currently running Slurm job. Processes belonging to whitelisted users (root, administrators, service accounts, etc.) are ignored.
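The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not Shinigami's actual implementation: the `Process` record, the `find_errant_processes` function, and the choice to match processes to Slurm jobs by user name are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Process:
    """Hypothetical record for a process found on a compute node."""
    pid: int
    user: str
    command: str


def find_errant_processes(
    processes: Iterable[Process],
    job_users: Iterable[str],
    whitelist: Iterable[str],
) -> List[Process]:
    """Return processes whose owner has no running job and is not whitelisted.

    Assumption: a process counts as "associated with a Slurm job" when its
    owner currently has a job running on the node.
    """
    protected = set(job_users) | set(whitelist)
    return [proc for proc in processes if proc.user not in protected]


# Example data (all names are made up for illustration)
running = [
    Process(101, "root", "sshd"),
    Process(202, "alice", "python train.py"),
    Process(303, "bob", "leftover_worker"),
]
errant = find_errant_processes(running, job_users=["alice"], whitelist=["root"])
print([proc.pid for proc in errant])  # [303]
```

Here `alice` is protected by her running job and `root` by the whitelist, so only the process owned by `bob` would be selected for termination.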

Installation and Setup

The shinigami command-line utility is installable via the pip (or pipx) package manager:

pipx install crc-shinigami

For best results, run the utility every half hour, though a different cadence may suit your cluster size and use case better. A simple cron job runs the utility automatically:

0,30 * * * * shinigami
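To make the schedule concrete, the sketch below expands a cron minute field into the minutes it matches. It is a simplified illustration that handles only `*`, single values, comma lists, ranges, and steps; `expand_minute_field` is a hypothetical helper, not part of Shinigami or cron.

```python
from typing import List


def expand_minute_field(field: str) -> List[int]:
    """Expand a cron minute field into the sorted list of matching minutes.

    Supported forms: '*', 'n', 'a,b,c', 'a-b', '*/n', and 'a-b/n'.
    """
    minutes = set()
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = 0, 59
        elif "-" in base:
            low, high = base.split("-")
            start, end = int(low), int(high)
        else:
            start = end = int(base)
        minutes.update(range(start, end + 1, step))
    return sorted(minutes)


print(expand_minute_field("0,30"))  # [0, 30]
print(expand_minute_field("*/30"))  # [0, 30]
print(expand_minute_field("*/15"))  # [0, 15, 30, 45]
```

The `0,30` field used above fires on the hour and on the half hour. Changing the cadence only requires a different minute field, for example `*/15` for a run every fifteen minutes.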

You may wish to configure the cron job to run under a dedicated service account. When doing so, ensure the user is added to the admin list and satisfies the following criteria:

  • Exists on all compute nodes
  • Has appropriate permissions to terminate system processes on compute nodes
  • Has established SSH keys for connecting to compute nodes
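One way to spot-check the criteria above is to attempt a non-interactive SSH login to each node as the service account. The sketch below is an assumption-laden example rather than a Shinigami feature: `ssh_check_command` and `can_connect` are hypothetical helpers, and the account and node names are placeholders. A successful login suggests the account exists on the node and has working keys. It does not verify permission to terminate processes.

```python
import subprocess
from typing import List


def ssh_check_command(user: str, node: str, timeout: int = 5) -> List[str]:
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",              # key-based auth only, never prompt
        "-o", f"ConnectTimeout={timeout}",  # give up quickly on unreachable nodes
        f"{user}@{node}",
        "true",                             # no-op remote command
    ]


def can_connect(user: str, node: str) -> bool:
    """Return True if a key-based SSH login to the node succeeds."""
    result = subprocess.run(ssh_check_command(user, node), capture_output=True)
    return result.returncode == 0


# Placeholder account and node names; substitute your own.
print(" ".join(ssh_check_command("shinigami-svc", "node1")))
```

Calling `can_connect("shinigami-svc", node)` for each compute node before enabling the cron job would flag nodes where the account or its keys are missing.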

Download files

Source Distribution: crc_shinigami-0.5.0.tar.gz (18.5 kB)

Built Distribution: crc_shinigami-0.5.0-py3-none-any.whl (20.0 kB, Python 3)
