Skip to main content

Nodify Tensorboard Plugin.

Project description

Nodify Tensorboard Plugin

License GitHub Repo stars

This is a tensorboard plugin by Trainy to supplement the existing PyTorch Profiler. This provides additional visualizations to effectively characterize traces for runs involving multiple GPUs. The plugin expects all traces to be collected using torch.profile and to be located in the same folder.

Installation & Quickstart

Install tensorboard and the plugin.

pip install tensorboard
pip install nodify-plugin

Generate PyTorch profiler traces as shown here and bring up the tensorboard where your traces are living. A set of example logs are provided in this repo under log/resnet18

tensorboard --logdir log/resnet18/

Take a look at our quickstart guide to learn about the different graph views and how you can use them to debug your multinode training.

Development

To view the plugin for development, create a virtual environment, install the requirements, and install the plugin.

python -m venv venv
. venv/bin/activate
pip install -e .

Feature roadmap

A lot of the features on the roadmap use Meta's Dynolog, kineto, and holistic trace analyzer.

  • On-demand tracing and metrics through dynologger
  • Recommendations for fixing multinode bottlenecks
  • reading from logs stored on cloud object-stores (e.g. Amazon S3, Azure Blob)

Contributing

For feature requests or bug reports/fixes, feel free to open a Github issue or make a pull request. We'd love to connect with any interested developers and we check our Discord to discuss the direction of our projects everyday. Connect with us either throught the text channels or through DM's.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

nodify_plugin-0.1.1-py3-none-any.whl (122.1 kB view details)

Uploaded Python 3

File details

Details for the file nodify_plugin-0.1.1-py3-none-any.whl.

File metadata

File hashes

Hashes for nodify_plugin-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 eb8da63e5f2abfbe3009efc8e722567f1cde3a57d1cd4517f74b5e12be44e400
MD5 a46ecfa1c66d60aae8d72eee608ec83f
BLAKE2b-256 4ceb19697ab1bc8b9aac3ad54f5477fd63f7eccf6bed16bbfed5e967e1f6f2af

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page