A Dask Scheduler Plugin to monitor GPU memory usage for tasks
Dask Memory Usage Plugin for GPUs
If you are familiar with the dask-memusage plugin, this is an alternative version that profiles GPU memory usage using the nvidia-smi command and its XML output.
This plugin is a low-impact memory tracker. If you need something more advanced, check Scalene and other profilers. For the pros and cons of this plugin, see the FAQ file.
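To make the nvidia-smi approach concrete, here is a minimal sketch of reading per-GPU memory from the XML report produced by nvidia-smi -q -x. It is not the plugin's own code: the function names and the trimmed SAMPLE_XML document are hypothetical and only illustrate the general idea. It assumes the report exposes fb_memory_usage entries with values such as "1024 MiB".

```python
import subprocess
import xml.etree.ElementTree as ET


def parse_gpu_memory(xml_text):
    """Return a list of (gpu_id, used_mib, total_mib) from an nvidia-smi XML report."""
    root = ET.fromstring(xml_text)
    usage = []
    for gpu in root.iter("gpu"):
        fb = gpu.find("fb_memory_usage")
        if fb is None:
            continue
        # Values look like "1024 MiB"; keep only the integer part.
        used = int(fb.findtext("used").split()[0])
        total = int(fb.findtext("total").split()[0])
        usage.append((gpu.get("id"), used, total))
    return usage


def query_gpu_memory():
    """Run nvidia-smi and parse its XML report (needs an NVIDIA driver installed)."""
    report = subprocess.run(
        ["nvidia-smi", "-q", "-x"], capture_output=True, text=True, check=True
    ).stdout
    return parse_gpu_memory(report)


# Hand-written, heavily trimmed sample used only for illustration.
SAMPLE_XML = """<nvidia_smi_log>
  <gpu id="00000000:01:00.0">
    <fb_memory_usage><total>16384 MiB</total><used>1024 MiB</used></fb_memory_usage>
  </gpu>
  <gpu id="00000000:02:00.0">
    <fb_memory_usage><total>16384 MiB</total><used>0 MiB</used></fb_memory_usage>
  </gpu>
</nvidia_smi_log>"""

for gpu_id, used, total in parse_gpu_memory(SAMPLE_XML):
    print(f"{gpu_id}: {used} / {total} MiB")
```

On a machine with GPUs, query_gpu_memory() would return the same structure for the live devices.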
Code Example
Import the blob generator and the KMeans model from cuML, plus the Dask Client used to connect to the scheduler.
import argparse
import cupy as cp
from cuml.dask.datasets import make_blobs as make_blobs_MGPU
from cuml.dask.cluster import KMeans as KMeans_MGPU
from dask.distributed import Client
Now we can write the main client code. Dask code should preferably run inside the main section. This example uses RAPIDS AI, so check that its libraries are properly installed. We strongly recommend using a pre-built container.
def main():
    parser = argparse.ArgumentParser(description="Test KMeans experiment.")
    parser.add_argument('--scheduler-file', type=str, required=True,
                        help='Location of the scheduler file.')
    parser.add_argument('--nodes', type=int, required=True,
                        help='Number of worker nodes.')
    args = parser.parse_args()

    # Connect to the running scheduler and wait until all workers have joined.
    client = Client(scheduler_file=args.scheduler_file)
    client.wait_for_workers(n_workers=args.nodes)

    n_samples = 500000000
    n_bins = 3

    # Generate three 2-D blobs distributed across the GPUs.
    centers = cp.asarray([(-6, -6), (0, 0), (9, 1)])
    X, y = make_blobs_MGPU(n_samples=n_samples, centers=centers, shuffle=False, random_state=42, client=client)

    # Fit the multi-GPU KMeans model and trigger the prediction.
    model = KMeans_MGPU(n_clusters=3, random_state=42, max_iter=100)
    model.fit(X=X)
    pred = model.predict(X=X)
    pred.compute()

    client.close()


if __name__ == '__main__':
    main()
Run this script by passing the required --scheduler-file and --nodes arguments on the command line.
Scheduler CLI usage
To monitor GPU memory usage, start the scheduler with the plugin module preloaded, as in the example below.
$ dask scheduler --preload dask_memusage_gpus_plugin --memusage-gpus-path memusage-gpus.csv --memusage-gpus-record-type csv --memusage-gpus-max
The plugin also supports other output formats, such as Parquet and Excel. Workers and threads are not a problem, because a Dask CUDA worker runs only one thread per GPU.
The results of this execution, with the plugin enabled in the cluster, are recorded in the file passed via --memusage-gpus-path (memusage-gpus.csv in the command above).
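Once the scheduler has recorded a CSV file, it can be inspected with a few lines of Python. The sketch below makes no assumption about the plugin's column names, which are not documented here: it reads whatever header the file has and reports the maximum of every fully numeric column. The function name column_maxima is hypothetical, and the sketch covers the CSV record type only, not Parquet or Excel.

```python
import csv


def column_maxima(csv_path):
    """Return {column: max value} for every column whose cells are all numeric."""
    with open(csv_path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    maxima = {}
    if not rows:
        return maxima
    for column in rows[0]:
        try:
            values = [float(row[column]) for row in rows]
        except (TypeError, ValueError):
            continue  # skip non-numeric columns (for example, text identifiers)
        maxima[column] = max(values)
    return maxima
```

For example, column_maxima("memusage-gpus.csv") would return one entry per numeric column found in the file written by the command above.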
Limitations and Useful Content
For further information and hints about this plugin, see the FAQ document.
Authors
- Julio Faracco