TraceML: Lightweight ML Profiler

TraceML

If you find it useful, consider giving it a ⭐ on GitHub; it helps others discover the project!

A lightweight library to make PyTorch training memory visible in real time (in CLI and Notebook).

The Problem

Training large machine learning models often feels like a black box. One minute everything's running and the next, you're staring at a cryptic "CUDA out of memory" error.

Pinpointing which part of the model is consuming too much memory or slowing things down is frustrating and time-consuming. Traditional profiling tools can be overly complex or lack the granularity deep learning developers need.

💡 Why TraceML?

TraceML is designed to give you real-time, granular insight into memory usage without heavy overhead. It works both in the terminal (CLI) and inside Jupyter notebooks, so you can pick the workflow that fits you best:

✅ System + process-level usage (CPU, RAM, GPU)

✅ PyTorch layer-level memory allocation (via decorator/instance tracing)

✅ Live activation & gradient memory

No config, no setup, just plug-and-trace.

📦 Installation

From the root of a local clone of the repository:

pip install .

For developer mode:

pip install '.[dev]'

🚀 Usage

Registering your model for tracing

To capture memory usage, you first need to register your model with TraceML. There are two simple ways:

1. With a class decorator (recommended)

import torch.nn as nn
from traceml.decorator import trace_model

@trace_model()
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(100, 10)

    def forward(self, x):
        return self.fc(x)

✅ Any instance of TinyNet will now be automatically traced.
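For intuition, here is a minimal sketch of how a class decorator can make every instance visible to a tracer. It is not TraceML's implementation: `register_instances` and `TRACED_INSTANCES` are hypothetical names invented for this illustration, and a real tracer would attach hooks where this sketch only records the instance.

```python
import functools

TRACED_INSTANCES = []  # hypothetical registry, for illustration only


def register_instances():
    """Class decorator factory: record every instance once __init__ has run."""
    def decorate(cls):
        original_init = cls.__init__

        @functools.wraps(original_init)
        def wrapped_init(self, *args, **kwargs):
            original_init(self, *args, **kwargs)
            # A real tracer would attach its hooks to `self` at this point.
            TRACED_INSTANCES.append(self)

        cls.__init__ = wrapped_init
        return cls
    return decorate


@register_instances()
class Tiny:
    def __init__(self, width):
        self.width = width


a, b = Tiny(4), Tiny(8)
```

Because the wrapper runs after the original `__init__`, the instance is fully constructed by the time it is registered, which is what a hook-based tracer would need.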

2. With an explicit model instance

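import torch
import torch.nn as nn
from traceml.decorator import trace_model_instance

# Fall back to CPU when no GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Sequential(
    nn.Linear(100, 50),
    nn.ReLU(),
    nn.Linear(50, 10)
).to(device)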

# Attach hooks so TraceML can see memory events
trace_model_instance(model)

✅ Best when you build models dynamically or don't want to decorate the class.

Then, choose whichever fits your workflow.

📓 Notebook

Run TraceML directly in Jupyter/Colab:

from traceml.decorator import trace_model_instance
from traceml.manager.tracker_manager import TrackerManager

# Attach TraceML hooks
trace_model_instance(model)

# Start live tracker
tracker = TrackerManager(interval_sec=1.0, mode="notebook")
tracker.start()

# 🔄 Train as usual (train_model and its arguments stand in for your own training code)
train_model(model, train_loader, val_loader, optimizer, scheduler, scaler, device, dtype)

# Stop and show summaries
tracker.stop()
tracker.log_summaries()

Terminal/CLI

Wrap your training script to see live dashboards in your terminal:

traceml run <your_training_script.py>

Examples

# Trace an explicitly defined model instance
traceml run src/examples/tracing_with_model_instance

# Trace a model using a class decorator (recommended)
traceml run src/examples/tracing_with_class_decorator

[Image: TraceML Live Dashboard]

📓 Notebook Example

You can also run TraceML inside Jupyter/Colab. See the full example notebook for a working demo.

Notebook output will refresh live per interval, similar to the terminal dashboard.

🔎 How the Samplers Work

TraceML uses samplers that collect memory usage at regular intervals, rather than relying only on layer-by-layer traces:

  • SystemSampler → CPU, RAM, GPU usage sampled at a fixed frequency.

  • LayerMemorySampler → Parameter allocation (per module, not per parameter).

  • ActivationMemorySampler → Tracks per-layer forward activations. Maintains current and global peak values, and estimates total activation memory for a forward pass.

  • GradientMemorySampler → Tracks per-layer backward gradients. Maintains current and global peak values, and estimates total gradient memory during backpropagation.
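The per-module and per-layer numbers described above can be approximated with standard PyTorch APIs. The sketch below is an assumption about how such tracking could work, not TraceML's actual sampler code: `parameter_bytes` and `ActivationTracker` are hypothetical names, and activation size is estimated from each leaf module's output tensor only.

```python
import torch
import torch.nn as nn


def parameter_bytes(module: nn.Module) -> int:
    """Bytes held by parameters owned directly by this module."""
    return sum(p.numel() * p.element_size() for p in module.parameters(recurse=False))


class ActivationTracker:
    """Records output-tensor sizes per leaf module via forward hooks."""

    def __init__(self, model: nn.Module):
        self.current = {}  # layer name -> bytes from the latest forward pass
        self.peak = {}     # layer name -> largest value seen so far
        self.handles = []
        for name, module in model.named_modules():
            if len(list(module.children())) == 0:  # leaf modules only
                handle = module.register_forward_hook(self._make_hook(name))
                self.handles.append(handle)

    def _make_hook(self, name):
        def hook(module, inputs, output):
            if isinstance(output, torch.Tensor):
                size = output.numel() * output.element_size()
                self.current[name] = size
                self.peak[name] = max(self.peak.get(name, 0), size)
        return hook

    def total(self) -> int:
        """Estimated activation bytes for the most recent forward pass."""
        return sum(self.current.values())

    def remove(self):
        for handle in self.handles:
            handle.remove()


model = nn.Sequential(nn.Linear(100, 50), nn.ReLU(), nn.Linear(50, 10))
tracker = ActivationTracker(model)
model(torch.randn(8, 100))
model(torch.randn(2, 100))  # smaller batch: current drops, peak stays
param_mem = {name: parameter_bytes(m) for name, m in model.named_modules() if name}
tracker.remove()
```

After the second, smaller batch, `tracker.current` reflects the latest pass while `tracker.peak` still holds the values from the larger one, which mirrors the current versus global peak distinction.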

This means what you see in your terminal is a rolling snapshot of memory over time, giving you:

  • Live per-layer breakdowns

  • Current vs global peaks

  • Running totals of activation + gradient memory
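As a rough illustration of a per-layer gradient breakdown with a running total, the sketch below sums the bytes held in each module's `.grad` tensors after a backward pass. It is a simplified stand-in under stated assumptions, not TraceML's method: `gradient_bytes_per_module` is a hypothetical helper, and it counts parameter gradients only, not intermediate gradients that exist briefly during backpropagation.

```python
import torch
import torch.nn as nn


def gradient_bytes_per_module(model: nn.Module) -> dict:
    """Map module name -> bytes held by gradients of its own parameters."""
    result = {}
    for name, module in model.named_modules():
        size = sum(
            p.grad.numel() * p.grad.element_size()
            for p in module.parameters(recurse=False)
            if p.grad is not None
        )
        if size:
            result[name] = size
    return result


model = nn.Sequential(nn.Linear(100, 50), nn.ReLU(), nn.Linear(50, 10))
before = gradient_bytes_per_module(model)  # no backward pass yet, so no gradients

loss = model(torch.randn(8, 100)).sum()
loss.backward()

grad_mem = gradient_bytes_per_module(model)
total_grad_mem = sum(grad_mem.values())
```

Parameter gradients have the same shape and dtype as the parameters themselves, so for this float32 model the totals match the parameter memory of each `nn.Linear` layer.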

This design makes TraceML lightweight compared to full profilers — you get practical insights without slowing training to a crawl.
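To show why interval sampling stays cheap, here is a minimal standard-library sketch of a background sampler that reads a value at a fixed interval and keeps the current and peak readings. `IntervalSampler` and its `read_fn` parameter are hypothetical names for this illustration and are not part of TraceML; the live example uses `tracemalloc` to read the Python heap size as a stand-in for real memory counters.

```python
import threading
import tracemalloc


class IntervalSampler:
    """Calls read_fn every interval_sec and tracks current and peak values."""

    def __init__(self, read_fn, interval_sec=1.0):
        self.read_fn = read_fn
        self.interval_sec = interval_sec
        self.current = 0
        self.peak = 0
        self.samples = 0
        self._stop_event = threading.Event()
        self._thread = None

    def sample_once(self):
        value = self.read_fn()
        self.current = value
        self.peak = max(self.peak, value)
        self.samples += 1

    def _loop(self):
        while True:
            self.sample_once()
            # wait() returns True once stop() has been called
            if self._stop_event.wait(self.interval_sec):
                break

    def start(self):
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop_event.set()
        self._thread.join()


# Deterministic check with fake readings
readings = iter([10, 40, 25])
fake = IntervalSampler(lambda: next(readings))
for _ in range(3):
    fake.sample_once()

# Live use: sample the traced Python heap size in the background
tracemalloc.start()
live = IntervalSampler(lambda: tracemalloc.get_traced_memory()[0], interval_sec=0.01)
live.start()
buffers = [bytearray(1024) for _ in range(100)]  # stand-in for a training step
live.stop()
tracemalloc.stop()
```

The work done per sample is a single read and a comparison, so the cost scales with the sampling interval rather than with the number of operations in the training loop.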

📊 Current Features

  • Live CPU, RAM, and GPU usage (System + Current Process)
  • PyTorch module-level memory tracking
  • Live activation memory tracking (per layer, plus totals)
  • Live gradient memory tracking (per layer, plus totals)
  • Real-time terminal dashboards via Rich
  • Notebook support

Coming Soon

  • Step & operation timers (forward, backward, optimizer)
  • Export logs as JSON / CSV
  • More visual dashboards

🙌 Contribute & Feedback

TraceML is early-stage and evolving quickly. Contributions, feedback, and ideas are welcome!

  • Found it useful? Please ⭐ the repo to support development.

  • Issues / feature requests → open a GitHub issue.

  • Want to contribute? See CONTRIBUTING.md (coming soon).

📧 Contact: traceml.ai@gmail.com


TraceML - Making PyTorch memory usage visible, one trace at a time.
