
Look up GPU/CPU/RAM usage on multiple servers at the same time


Multiple smi

Look up GPU/CPU/RAM usage on multiple machines at the same time!

Intended to work with Python 3+.

Based on pyNVML and psutil.

Features

  • Lets you get nvidia-smi output and psutil information for multiple connected computers at once, and display it in a selected GUI.
    • Available frontends:
      • Ubuntu Appindicator
        • works best on Unity, partially supported on GNOME Shell
      • Argos
        • works on GNOME Shell, and also on macOS thanks to BitBar compatibility


  • Lets you get a notification every time a process is launched or finishes. By default, a process must use at least 1 GB of memory for the notification to appear.


  • This tool is aimed at small research teams with several GPU-equipped computers that you can manually ssh to. At a glance, you can see the usage of every computer in your pool and where you can launch your computation. It also provides a basis for developing a tool that automatically launches your computation on the least busy computer of your network.
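
As a rough illustration of two ideas from the list above (process notifications with a memory threshold, and picking the least busy machine), here is a minimal sketch. It does not use multiple-smi's own code: `GpuProcess`, `diff_processes` and `least_busy` are hypothetical names, the 1 GB threshold is treated as 1 GiB, and "least busy" is assumed to mean "most free GPU memory".

```python
from dataclasses import dataclass

GIB = 1024 ** 3  # assumption: the "1 GB" default threshold is treated as 1 GiB


@dataclass(frozen=True)
class GpuProcess:
    """Hypothetical record for one GPU process reported by a host."""
    host: str
    pid: int
    used_memory: int  # bytes


def diff_processes(previous, current, min_memory=GIB):
    """Return (started, finished) processes using at least min_memory bytes."""
    prev = {(p.host, p.pid): p for p in previous}
    curr = {(p.host, p.pid): p for p in current}
    started = [p for key, p in curr.items()
               if key not in prev and p.used_memory >= min_memory]
    finished = [p for key, p in prev.items()
                if key not in curr and p.used_memory >= min_memory]
    return started, finished


def least_busy(free_memory_by_host):
    """Pick the host with the most free GPU memory (one possible definition)."""
    return max(free_memory_by_host, key=free_memory_by_host.get)
```

Calling `diff_processes` on two successive snapshots gives the processes worth notifying about, and `least_busy({"node1": 2 * GIB, "node2": 8 * GIB})` would select `node2`.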

Installation

Server side

[sudo] pip3 install multiple-smi[server]

You will then be able to install it as a service with the install_server_service command. See Usage/Server side below.

If you installed it with sudo, simply do

sudo install_server_service

If you installed it in a virtual environment, you will need to provide the path to the binary executable

sudo /path/to/venv/bin/install_server_service

In both cases, you can add -h to get help.

Client side

You need to install these with your package manager (e.g. apt on Ubuntu or brew on macOS):

  • nmap
  • libcairo2-dev
  • libzmq3-dev

On Ubuntu:
sudo apt install nmap libcairo2-dev libzmq3-dev
On macOS, where Homebrew names the equivalent packages cairo and zeromq:
brew install nmap cairo zeromq

Gnome

If you use the appindicator frontend or the GNOME notifier, you will also need to install GNOME-related libraries with apt:

Ubuntu 20+
sudo apt install libgirepository1.0-dev
Ubuntu 18
sudo apt install gir1.2-appindicator3-0.1

You will finally be able to install it with

[sudo] pip3 install multiple-smi[client]

Usage

Server side

To allow clients to access your computer's smi stats, simply run server_smi.

You can also enable it as a service that is launched at boot.

Ubuntu 16+

A script is provided to automatically create the service file, which allows server_smi to run automatically at boot (some options are available):

sudo install_server_service

To uninstall:

sudo install_server_service -u

(If you specified a --systemd-path folder during installation, pass the same one when uninstalling.)

Ubuntu 14

You have to daemonize the script and put it in init.d. You can do this with the provided script server_smi_daemon.sh:

sudo cp server_smi_daemon.sh /etc/init.d/.
sudo chmod 0755 /etc/init.d/server_smi_daemon.sh
sudo update-rc.d server_smi_daemon.sh defaults

To uninstall:

sudo update-rc.d -f server_smi_daemon.sh remove

GPU usage stats:

Server-side, GPU usage history is stored in ~/.server_smi/{date}.csv if launched from the CLI, or in /etc/server_smi/{date}.csv if launched from systemctl/init.d. Usage is written to it roughly every 60 seconds, so feel free to do some data science with it.

To enable it, use the -s option of install_server_service, or add it in server_smi_daemon.sh (line 6) before installing.
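
As a starting point for exploring the CSV history described above, the following sketch reads every CSV file in a directory using only the Python standard library. The column layout of the files is not documented here, so rows are returned as plain lists of strings; `load_usage_rows` is a hypothetical helper name, not part of multiple-smi.

```python
import csv
from pathlib import Path


def load_usage_rows(directory):
    """Read every *.csv file in `directory` and return (file name, row) pairs.

    Rows are kept as lists of strings because the column layout is an
    unknown here; inspect a file before relying on specific columns.
    """
    rows = []
    for path in sorted(Path(directory).expanduser().glob("*.csv")):
        with path.open(newline="") as handle:
            for row in csv.reader(handle):
                rows.append((path.name, row))
    return rows
```

For example, `load_usage_rows("~/.server_smi")` would collect the rows written by a server launched from the CLI.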

Client side

To run client_smi as a CLI-only tool, with no GUI or notifications:

client_smi

To run it with a GUI frontend and a notification backend:

client_smi --frontend {argos,appindicator} --notify-backend {gnome,ntfy}

Configuration:

To find out which servers on your local network are running server_smi, you can use the discover_hosts script. It automatically populates the JSON file ~/.client_smi/hosts_to_smi.json with the machines it finds.

The following command will try to connect to all ip addresses from 192.168.30.0 to 192.168.30.255 with the port 26110 and populate the hosts file.

discover_hosts --ip 192.168.30.0 --level 1 -p 26110
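
To make the scanned range concrete, this sketch enumerates the addresses covered by the command above using the standard `ipaddress` module. It assumes that `--level 1` means "vary the last octet", which matches the 192.168.30.0 to 192.168.30.255 range stated above; the meaning of level 2 here is a guess, and `candidate_addresses` is a hypothetical helper, not the code discover_hosts runs.

```python
import ipaddress


def candidate_addresses(ip, level=1):
    """List the IPv4 addresses obtained by varying the last `level` octets.

    Assumption: level 1 covers 256 addresses (a /24), level 2 covers
    65536 addresses (a /16). Higher levels are rejected to keep the
    list small.
    """
    if level not in (1, 2):
        raise ValueError("level must be 1 or 2 in this sketch")
    network = ipaddress.ip_network(f"{ip}/{32 - 8 * level}", strict=False)
    return [str(address) for address in network]
```

`candidate_addresses("192.168.30.0", 1)` yields 256 addresses, from 192.168.30.0 to 192.168.30.255.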

To add your own hosts manually, run client_smi or discover_hosts once, then add your entries to the JSON file that should have been created at ~/.client_smi/hosts_to_smi.json
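
After editing the hosts file by hand, it can be worth checking that it is still valid JSON. The sketch below only parses the file and reports where a syntax error occurs; it makes no assumption about the schema of the entries, and `check_hosts_file` is a hypothetical helper name.

```python
import json
from pathlib import Path


def check_hosts_file(path="~/.client_smi/hosts_to_smi.json"):
    """Parse the hosts file and return its content.

    Raises ValueError naming the file, line and column if the JSON is invalid.
    """
    file_path = Path(path).expanduser()
    try:
        return json.loads(file_path.read_text())
    except json.JSONDecodeError as error:
        raise ValueError(
            f"{file_path}: line {error.lineno}, column {error.colno}: {error.msg}"
        ) from error
```

Looking at the entries written by discover_hosts is the safest way to learn the expected structure before adding your own.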

Tunnel connection

Thanks to the pyzmq network backend, a tunnel connection is available for when you are outside your usual local network and have to go through a bastion.

Simply launch client_smi with the --tunnel option set to your bastion address:

client_smi --tunnel user@bastion_ip
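
For readers unfamiliar with bastion tunnels, the idea is comparable to an SSH local port forward. The sketch below only builds such an `ssh` command as an argument list and does not run it; it is an illustration of the concept, not what client_smi does internally. `port_forward_command` is a hypothetical helper, and the default port 26110 is taken from the discover_hosts example above as an assumption.

```python
def port_forward_command(bastion, target_host, target_port=26110, local_port=26110):
    """Build an ssh command forwarding local_port to target_host:target_port.

    -N opens the tunnel without running a remote command;
    -L declares the local port forward through the bastion.
    """
    forward = f"{local_port}:{target_host}:{target_port}"
    return ["ssh", "-N", "-L", forward, bastion]
```

The resulting list can be passed to `subprocess.run` if you want to open such a tunnel manually.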
