A lightweight HPC monitoring and predictive analytics tool

NØMAD-HPC

NØde Monitoring And Diagnostics: lightweight HPC monitoring, visualization, and predictive analytics.

"Travels light, adapts to its environment, and doesn't need permanent infrastructure."


📖 Full Documentation: installation guides, configuration, CLI reference, network methodology, ML framework, and more.


Quick Start

pip install nomad-hpc
nomad demo                    # Try with synthetic data

For production:

nomad init                    # Configure for your cluster
nomad collect                 # Start data collection
nomad dashboard               # Launch web interface

Features

| Feature | Description | Command |
|---|---|---|
| Dashboard | Real-time multi-cluster monitoring with partition views | nomad dashboard |
| Educational Analytics | Track computational proficiency development | nomad edu explain <job> |
| Alerts | Threshold + predictive alerts (email, Slack, webhook) | nomad alerts |
| ML Prediction | Job failure prediction using similarity networks | nomad predict |
| Community Export | Anonymized datasets for cross-institutional research | nomad community export |
| Interactive Sessions | Monitor RStudio/Jupyter sessions | nomad report-interactive |
| Derivative Analysis | Detect accelerating trends before thresholds | Built into alerts |
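
To make the "Derivative Analysis" row concrete, here is a minimal sketch of how an accelerating trend can be flagged before a threshold is crossed. It is not NØMAD's implementation: the function names (`finite_differences`, `is_accelerating`, `time_to_threshold`) and the assumption of evenly spaced samples are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence


def finite_differences(values: Sequence[float], dt: float = 1.0) -> List[float]:
    """Return differences between consecutive samples, divided by the sample spacing dt."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def is_accelerating(values: Sequence[float], dt: float = 1.0) -> bool:
    """True when the latest rate of change is positive and itself still growing."""
    rate = finite_differences(values, dt)   # first derivative estimate
    accel = finite_differences(rate, dt)    # second derivative estimate
    return bool(accel) and rate[-1] > 0 and accel[-1] > 0


def time_to_threshold(
    values: Sequence[float], threshold: float, dt: float = 1.0
) -> Optional[float]:
    """Linear estimate of the time left before `threshold` is reached.

    Returns None when the metric is not rising (or there are too few samples),
    and 0.0 when the threshold is already met.
    """
    rate = finite_differences(values, dt)
    if not rate or rate[-1] <= 0:
        return None
    remaining = threshold - values[-1]
    return max(remaining, 0.0) / rate[-1]


# Hypothetical disk usage percentages sampled once per hour.
usage = [50.0, 52.0, 55.0, 59.0, 64.0]
if is_accelerating(usage):
    print("accelerating; hours until 90%:", time_to_threshold(usage, 90.0))
```

The linear estimate is optimistic when growth is accelerating, so a sketch like this would normally trigger an alert on the acceleration flag itself rather than wait for the projected time to shrink.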

Architecture

┌────────────────────────────────────────────────────────────┐
│                         NØMAD                              │
├──────────────┬──────────────┬──────────────┬───────────────┤
│  Collectors  │   Analysis   │     Viz      │    Alerts     │
├──────────────┼──────────────┼──────────────┼───────────────┤
│ disk         │ derivatives  │ dashboard    │ thresholds    │
│ iostat       │ similarity   │ network 3D   │ predictive    │
│ slurm        │ ML ensemble  │ partitions   │ email/slack   │
│ gpu          │ edu scoring  │ edu views    │ webhooks      │
│ nfs          │              │              │               │
└──────────────┴──────────────┴──────────────┴───────────────┘
                              │
                    ┌─────────┴─────────┐
                    │  SQLite Database  │
                    └───────────────────┘
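
The diagram shows collectors, analysis, visualization, and alerts sharing one SQLite database. As a rough illustration of that pattern, the sketch below has a disk collector write samples to SQLite and a reader query the latest value. The table name `disk_samples`, its columns, and the function names are assumptions made for this example and do not reflect NØMAD's actual schema or code.

```python
import shutil
import sqlite3
import time
from typing import Optional


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) sample table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS disk_samples (
               ts          REAL    NOT NULL,
               path        TEXT    NOT NULL,
               total_bytes INTEGER NOT NULL,
               used_bytes  INTEGER NOT NULL
           )"""
    )


def collect_disk(conn: sqlite3.Connection, path: str) -> None:
    """Collector side: record one usage sample for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    conn.execute(
        "INSERT INTO disk_samples VALUES (?, ?, ?, ?)",
        (time.time(), path, usage.total, usage.used),
    )
    conn.commit()


def latest_used_fraction(conn: sqlite3.Connection, path: str) -> Optional[float]:
    """Reader side: fraction of space used in the newest sample, or None if no data."""
    row = conn.execute(
        "SELECT used_bytes, total_bytes FROM disk_samples "
        "WHERE path = ? ORDER BY ts DESC LIMIT 1",
        (path,),
    ).fetchone()
    if row is None or row[1] == 0:
        return None
    return row[0] / row[1]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # a real deployment would use a file on disk
    init_db(conn)
    collect_disk(conn, ".")
    print("used fraction:", latest_used_fraction(conn, "."))
```

Keeping every component on a single SQLite file fits the "no permanent infrastructure" goal stated above: there is no database server to install or keep running.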

CLI Reference

Core Commands

nomad init                    # Setup wizard
nomad collect                 # Start collectors
nomad dashboard               # Web interface
nomad demo                    # Demo mode
nomad status                  # System status

Educational Analytics

nomad edu explain <job_id>    # Job analysis with recommendations
nomad edu trajectory <user>   # User proficiency over time
nomad edu report <group>      # Course/group report

Analysis & Prediction

nomad disk /path              # Filesystem trends
nomad jobs --user <user>      # Job history
nomad similarity              # Network analysis
nomad train                   # Train ML models
nomad predict                 # Run predictions
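
For readers unfamiliar with similarity networks, the following sketch shows the general idea behind linking similar jobs and scoring a job by the outcomes of its neighbors. It is an illustration under stated assumptions, not the method used by `nomad similarity` or `nomad predict`: cosine similarity, the 0.9 edge threshold, the feature vectors, and all function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple

Edge = Tuple[str, str, float]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_edges(features: Dict[str, Sequence[float]], threshold: float = 0.9) -> List[Edge]:
    """Connect every pair of jobs whose similarity is at least `threshold`."""
    ids = sorted(features)
    edges: List[Edge] = []
    for i, u in enumerate(ids):
        for v in ids[i + 1:]:
            sim = cosine_similarity(features[u], features[v])
            if sim >= threshold:
                edges.append((u, v, sim))
    return edges


def neighbor_failure_score(job: str, edges: List[Edge], failed: Set[str]) -> float:
    """Similarity-weighted share of a job's neighbors that failed (0.0 if isolated)."""
    total = 0.0
    bad = 0.0
    for u, v, sim in edges:
        if job in (u, v):
            other = v if u == job else u
            total += sim
            if other in failed:
                bad += sim
    return bad / total if total else 0.0


# Hypothetical jobs described by (normalized CPU use, normalized memory use).
jobs = {"a": [1.0, 0.0], "b": [1.0, 0.1], "c": [0.0, 1.0]}
edges = build_edges(jobs)
print(edges)
print(neighbor_failure_score("a", edges, failed={"b"}))
```

The pairwise loop is quadratic in the number of jobs, which is fine for a sketch but would need an approximate nearest-neighbor index or batching on a large job history.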

Community & Alerts

nomad community export        # Export anonymized data
nomad community preview       # Preview export
nomad alerts                  # View alerts
nomad alerts --unresolved     # Unresolved only
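
One common way to anonymize records before sharing them across institutions is to replace identifiers with keyed hashes. The sketch below illustrates that technique only; it makes no claim about how `nomad community export` works. The field names `user` and `group`, the 12-character pseudonym length, and both function names are assumptions.

```python
import hashlib
import hmac
from typing import Any, Dict, Iterable


def pseudonymize(value: str, secret: bytes, length: int = 12) -> str:
    """Map an identifier to a stable pseudonym using HMAC-SHA256 with a site secret."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def anonymize_record(
    record: Dict[str, Any],
    secret: bytes,
    fields: Iterable[str] = ("user", "group"),
) -> Dict[str, Any]:
    """Return a copy of `record` with the identifying fields replaced by pseudonyms."""
    out = dict(record)  # shallow copy so the caller's record is left untouched
    for field in fields:
        if out.get(field) is not None:
            out[field] = pseudonymize(str(out[field]), secret)
    return out


# Hypothetical job record; the secret would be generated once and kept private per site.
job = {"user": "alice", "group": "bio101", "runtime_s": 3600}
print(anonymize_record(job, secret=b"site-local-secret"))
```

Using a keyed hash rather than a plain hash means the same user maps to the same pseudonym within one site's export, while someone without the secret cannot confirm a guess by hashing candidate usernames.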

Installation

From PyPI

pip install nomad-hpc

From Source

git clone https://github.com/jtonini/nomad-hpc
cd nomad-hpc && pip install -e .

Requirements

  • Python 3.9+
  • SQLite 3.35+
  • sysstat package (iostat, mpstat)
  • Optional: SLURM, nvidia-smi, nfsiostat
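
The requirements above can be checked by hand with a few lines of standard-library Python. This sketch is independent of `nomad syscheck` and makes no claim about what that command does; the function name `check_requirements` and the result keys are hypothetical.

```python
import shutil
import sqlite3
import sys
from typing import Dict, Sequence, Tuple


def check_requirements(
    min_python: Tuple[int, int] = (3, 9),
    min_sqlite: Tuple[int, int, int] = (3, 35, 0),
    tools: Sequence[str] = ("iostat", "mpstat"),
) -> Dict[str, bool]:
    """Report whether the interpreter, the linked SQLite library, and tools meet the minimums."""
    results = {
        "python": sys.version_info[:2] >= min_python,
        # Version of the SQLite library that Python's sqlite3 module is linked against.
        "sqlite": sqlite3.sqlite_version_info >= min_sqlite,
    }
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name:10s} {'ok' if ok else 'MISSING or too old'}")
```

Note that the SQLite version reported here is the one bundled with the Python build, which can differ from the `sqlite3` command-line tool installed on the same machine.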

System Check

nomad syscheck

Documentation

📖 jtonini.github.io/nomad-hpc


License

Dual-licensed:

  • AGPL v3: Free for academic, educational, and open-source use
  • Commercial License: Available for proprietary deployments

Citation

@software{nomad2026,
  author = {Tonini, João Filipe Riva},
  title = {NØMAD: Lightweight HPC Monitoring with Machine Learning-Based Failure Prediction},
  year = {2026},
  url = {https://github.com/jtonini/nomad-hpc},
  doi = {10.5281/zenodo.18614517}
}

Contributing

See CONTRIBUTING.md for guidelines.

