Amazon SageMaker Debugger is an AWS offering that helps you automate the debugging of machine learning training jobs.

Project description

This library powers Amazon SageMaker Debugger and helps you develop better, faster, and cheaper models by catching common errors quickly. It lets you save tensors from training jobs and makes those tensors available for analysis through a flexible API. It supports TensorFlow, PyTorch, MXNet, and XGBoost on Python 3.6+.

  • Zero Script Change experience on SageMaker when using supported versions of SageMaker Framework containers or AWS Deep Learning containers
  • Full visibility into any tensor which is part of the training process
  • Real-time training job monitoring through Rules
  • Automated anomaly detection and state assertions
  • Interactive exploration of saved tensors
  • Distributed training support
  • TensorBoard support
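
To make the idea of Rules and state assertions more concrete, here is a minimal, library-independent sketch of a check evaluated over saved tensor values. It does not use this library's API: the `SavedTensors` layout, the `gradient/` name prefix, the function name, and the threshold are all assumptions made for illustration only.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for saved tensors (not this library's format):
# step number -> tensor name -> flat list of values.
SavedTensors = Dict[int, Dict[str, List[float]]]


def first_vanishing_gradient_step(
    saved: SavedTensors, threshold: float = 1e-7
) -> Optional[int]:
    """Return the earliest step where any gradient's mean absolute value
    falls below `threshold`, or None if no step triggers the check."""
    for step in sorted(saved):
        for name, values in saved[step].items():
            # Only inspect tensors that follow the assumed gradient naming.
            if not name.startswith("gradient/") or not values:
                continue
            mean_abs = sum(abs(v) for v in values) / len(values)
            if mean_abs < threshold:
                return step
    return None


# Made-up values, used only to exercise the check.
saved_example: SavedTensors = {
    0: {"gradient/dense": [0.2, -0.1], "loss": [1.3]},
    100: {"gradient/dense": [0.01, -0.02], "loss": [0.9]},
    200: {"gradient/dense": [1e-9, -2e-9], "loss": [0.9]},
}

triggered_step = first_vanishing_gradient_step(saved_example)
```

In this sketch the check walks the steps in order and stops at the first violation, which mirrors the general idea of monitoring a job while it trains rather than inspecting it only after it finishes.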
