A lightweight HPC monitoring and predictive analytics tool
NØMAD-HPC
NØde Monitoring And Diagnostics: lightweight HPC monitoring, visualization, and predictive analytics.
"Travels light, adapts to its environment, and doesn't need permanent infrastructure."
Full Documentation: installation guides, configuration, CLI reference, network methodology, ML framework, and more.
Quick Start
pip install nomad-hpc
nomad demo # Try with synthetic data
For production:
nomad init # Configure for your cluster
nomad collect # Start data collection
nomad dashboard # Launch web interface
Features
| Feature | Description | Command |
|---|---|---|
| Dashboard | Real-time multi-cluster monitoring with partition views | nomad dashboard |
| Educational Analytics | Track computational proficiency development | nomad edu explain <job_id> |
| Alerts | Threshold + predictive alerts (email, Slack, webhook) | nomad alerts |
| ML Prediction | Job failure prediction using similarity networks | nomad predict |
| Community Export | Anonymized datasets for cross-institutional research | nomad community export |
| Interactive Sessions | Monitor RStudio/Jupyter sessions | nomad report-interactive |
| Derivative Analysis | Detect accelerating trends before thresholds | Built into alerts |
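The "Derivative Analysis" row describes detecting accelerating trends before a threshold is reached. The following is a minimal sketch of that idea, not NØMAD's actual implementation: it assumes evenly spaced samples and uses plain finite differences. The function names and the sample values are hypothetical.

```python
from typing import Sequence


def finite_differences(values: Sequence[float]) -> list[float]:
    """Differences between consecutive samples (assumes even spacing)."""
    return [b - a for a, b in zip(values, values[1:])]


def is_accelerating(values: Sequence[float], min_accel: float = 0.0) -> bool:
    """True if the series is rising and its growth rate is itself increasing."""
    first = finite_differences(values)   # approximate first derivative
    second = finite_differences(first)   # approximate second derivative
    if not second:
        return False  # fewer than three samples: no acceleration estimate
    mean_rate = sum(first) / len(first)
    mean_accel = sum(second) / len(second)
    return mean_rate > 0 and mean_accel > min_accel


# Hypothetical disk usage (percent), both series still well below a 90% threshold.
steady = [50.0, 52.0, 54.0, 56.0, 58.0]
speeding_up = [50.0, 51.0, 53.0, 57.0, 65.0]
print(is_accelerating(steady), is_accelerating(speeding_up))
```

Both series are under any plausible threshold, but only the second one grows faster at each step, which is the situation a derivative-based alert is meant to flag early.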
Architecture
┌────────────────────────────────────────────────────────────┐
│                           NØMAD                            │
├──────────────┬──────────────┬──────────────┬───────────────┤
│  Collectors  │   Analysis   │     Viz      │    Alerts     │
├──────────────┼──────────────┼──────────────┼───────────────┤
│ disk         │ derivatives  │ dashboard    │ thresholds    │
│ iostat       │ similarity   │ network 3D   │ predictive    │
│ slurm        │ ML ensemble  │ partitions   │ email/slack   │
│ gpu          │ edu scoring  │ edu views    │ webhooks      │
│ nfs          │              │              │               │
└──────────────┴──────────────┴──────────────┴───────────────┘
                              │
                    ┌─────────┴─────────┐
                    │  SQLite Database  │
                    └───────────────────┘
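The diagram shows every component sharing a single SQLite database. As a rough illustration of that pattern, the sketch below has a collector write samples to SQLite and a consumer read the latest value back. The `samples` table, its columns, and the helper names are assumptions made for this example; NØMAD's real schema is not described here.

```python
import sqlite3
import time
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and ensure the hypothetical table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS samples (
               ts        REAL NOT NULL,
               collector TEXT NOT NULL,
               metric    TEXT NOT NULL,
               value     REAL NOT NULL
           )"""
    )
    return conn


def record(conn: sqlite3.Connection, collector: str, metric: str,
           value: float, ts: Optional[float] = None) -> None:
    """Store one sample; the timestamp defaults to the current time."""
    conn.execute(
        "INSERT INTO samples (ts, collector, metric, value) VALUES (?, ?, ?, ?)",
        (time.time() if ts is None else ts, collector, metric, value),
    )
    conn.commit()


def latest(conn: sqlite3.Connection, collector: str, metric: str) -> Optional[float]:
    """Return the most recent value for a metric, or None if nothing is stored."""
    row = conn.execute(
        "SELECT value FROM samples WHERE collector = ? AND metric = ? "
        "ORDER BY ts DESC LIMIT 1",
        (collector, metric),
    ).fetchone()
    return None if row is None else row[0]


conn = open_store()  # in-memory database for the example
record(conn, "disk", "used_pct", 71.5, ts=1.0)
record(conn, "disk", "used_pct", 72.0, ts=2.0)
print(latest(conn, "disk", "used_pct"))
```

A single-file store like this needs no database server, which fits the "no permanent infrastructure" goal stated at the top of this README.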
CLI Reference
Core Commands
nomad init # Setup wizard
nomad collect # Start collectors
nomad dashboard # Web interface
nomad demo # Demo mode
nomad status # System status
Educational Analytics
nomad edu explain <job_id> # Job analysis with recommendations
nomad edu trajectory <user> # User proficiency over time
nomad edu report <group> # Course/group report
Analysis & Prediction
nomad disk /path # Filesystem trends
nomad jobs --user <user> # Job history
nomad similarity # Network analysis
nomad train # Train ML models
nomad predict # Run predictions
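To give a feel for what "similarity network" can mean in this context, here is a small sketch: jobs become nodes, pairs whose feature vectors have high cosine similarity get an edge, and a new job's risk is estimated from how many of its neighbours failed. The features, the 0.95 threshold, and the neighbour-based score are all assumptions for illustration; the documentation's Network Methodology and ML Framework pages describe what `nomad similarity` and `nomad predict` actually do.

```python
import math
from itertools import combinations
from typing import Optional


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_network(features: dict[str, list[float]],
                  threshold: float = 0.95) -> set[tuple[str, str]]:
    """Connect every pair of jobs whose similarity reaches the threshold."""
    edges = set()
    for (job_a, feat_a), (job_b, feat_b) in combinations(sorted(features.items()), 2):
        if cosine_similarity(feat_a, feat_b) >= threshold:
            edges.add((job_a, job_b))
    return edges


def neighbour_failure_rate(job: str, edges: set[tuple[str, str]],
                           failed: set[str]) -> Optional[float]:
    """Fraction of a job's neighbours that failed, or None if it has no neighbours."""
    neighbours = {b if a == job else a for a, b in edges if job in (a, b)}
    if not neighbours:
        return None
    return sum(1 for n in neighbours if n in failed) / len(neighbours)


# Hypothetical features: (CPU efficiency, memory fraction, I/O wait fraction).
features = {
    "j1": [0.90, 0.20, 0.10],
    "j2": [0.88, 0.22, 0.12],
    "j3": [0.10, 0.90, 0.80],
    "j4": [0.12, 0.85, 0.82],
    "new": [0.11, 0.88, 0.79],
}
failed = {"j3", "j4"}

edges = build_network(features)
print(sorted(edges))
print(neighbour_failure_rate("new", edges, failed))
```

In this toy data the new job resembles only the two failed jobs, so its neighbour failure rate is high, while `j1` and `j2` form a separate, healthy cluster.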
Community & Alerts
nomad community export # Export anonymized data
nomad community preview # Preview export
nomad alerts # View alerts
nomad alerts --unresolved # Unresolved only
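The community export is described as producing anonymized datasets. One common building block for that kind of export is keyed hashing of identifiers, sketched below with Python's standard `hmac` module. This is an assumption about how such an export could work, not a description of `nomad community export`; the field names, the secret, and the 12-character length are hypothetical. Note that keyed hashing is pseudonymization: it hides names but does not by itself prevent re-identification from other fields.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret: bytes, length: int = 12) -> str:
    """Map an identifier to a stable pseudonym using HMAC-SHA256 with a site secret."""
    digest = hmac.new(secret, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def anonymize_record(record: dict, secret: bytes,
                     fields: tuple = ("user", "group")) -> dict:
    """Return a copy of the record with the identifying fields pseudonymized."""
    cleaned = dict(record)
    for field in fields:
        if field in cleaned:
            cleaned[field] = pseudonymize(str(cleaned[field]), secret)
    return cleaned


secret = b"site-local-secret"  # hypothetical; keep the real one private
job = {"user": "alice", "group": "bio101", "runtime_s": 3600, "state": "FAILED"}
clean = anonymize_record(job, secret)
print(clean)
```

Because the same secret always yields the same pseudonym, per-user trends survive the export, while sites using different secrets cannot link users across datasets.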
Installation
From PyPI
pip install nomad-hpc
From Source
git clone https://github.com/jtonini/nomad.git
cd nomad && pip install -e .
Requirements
- Python 3.9+
- SQLite 3.35+
- sysstat package (iostat, mpstat)
- Optional: SLURM, nvidia-smi, nfsiostat
System Check
nomad syscheck
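For readers who want to verify the listed requirements by hand, the sketch below checks the Python version, the SQLite library version, and whether the sysstat tools are on `PATH`. It is an independent illustration built on the standard library and is not the implementation of `nomad syscheck`, whose checks and output may differ.

```python
import shutil
import sqlite3
import sys


def check_requirements() -> dict[str, bool]:
    """Report whether each requirement from the list above appears to be met."""
    return {
        "python>=3.9": sys.version_info >= (3, 9),
        # Version of the SQLite library that Python's sqlite3 module is linked against.
        "sqlite>=3.35": sqlite3.sqlite_version_info >= (3, 35),
        # sysstat tools must be discoverable on PATH.
        "iostat": shutil.which("iostat") is not None,
        "mpstat": shutil.which("mpstat") is not None,
    }


for name, ok in check_requirements().items():
    print(f"{name:14} {'ok' if ok else 'MISSING'}")
```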
Documentation
jtonini.github.io/nomad-hpc
- Installation & Configuration
- System Install (--system)
- Dashboard Guide
- Educational Analytics
- Network Methodology
- ML Framework
- Proficiency Scoring
- CLI Reference
- Configuration Options
License
Dual-licensed:
- AGPL v3: Free for academic, educational, and open-source use
- Commercial License: Available for proprietary deployments
Citation
@software{nomad2026,
author = {Tonini, João Filipe Riva},
title = {NØMAD: Lightweight HPC Monitoring with Machine Learning-Based Failure Prediction},
year = {2026},
url = {https://github.com/jtonini/nomad},
doi = {10.5281/zenodo.18614517}
}
Contributing
See CONTRIBUTING.md for guidelines.
Contact
- Author: João Tonini
- Email: jtonini@richmond.edu
- Issues: GitHub Issues