Skip to main content

Delve lets you monitor PyTorch model layer saturation during training

Project description

Delve: Deep Live Visualization and Evaluation logo

PyPI version Build Status License: MIT

Delve is a Python package for analyzing the inference dynamics of your model.

playground

Use Delve if you need a lightweight PyTorch extension that:

  • Gives you insight into the inference dynamics of your architecture
  • Allows you to optimize and adjust neural networks models to your dataset without much trial and error
  • Allows you to analyze the eigenspaces your data at different stages of inference
  • Provides you basic tooling for experiment logging

Motivation

Designing a deep neural network is a trial and error heavy process that mostly revolves around comparing performance metrics of different runs. One of the key issues with this development process is that the results of metrics not realy propagte back easily to concrete design improvements. Delve provides you with spectral analysis tools that allow you to investigate the inference dynamic evolving in the model while training. This allows you to spot underutilized and unused layers. Missmatches between object size and neural architecture among other inefficiencies. These observations can be propagated back directly to design changes in the architecture even before the model has fully converged, allowing for a quicker and mor guided design process.

Installation

pip install delve

Using Layer Saturation to improve model performance

The saturation metric is the core feature of delve. By default saturation is a value between 0 and 1.0 computed for any convolutional, lstm or dense layer in the network. The saturation describes the percentage of eigendirections required for explaining 99% of the variance. Simply speaking, it tells you how much your data is "filling up" the individual layers inside your model.

In the image below you can see how saturation portraits inefficiencies in your neural network. The depicted model is ResNet18 trained on 32 pixel images, which is way to small for a model with a receptive field exceeding 400 pixels in the final layers.

resnet.PNG

To visualize what this poorly chosen input resolution does to the inference, we trained logistic regressions on the output of every layer to solve the same task as the model. You can clearly see that only the first half of the model (at best) is improving the intermedia solutions of our logistic regression "probes". The layers following this are contributing nothing to the quality of the prediction! You also see that saturation is extremly low for this layers!

We call this a tail and it can be removed by either increasing the input resolution or (which is more economical) reducing the receptive field size to match the object size of your dataset.

resnetBetter.PNG

We can do this by removing the first two downsampling layers, which quarters the growth of the receptive field of your network, which reduced not only the number of parameters but also makes more use of the available parameters, by making more layers contribute effectivly!

For more details check our publication on this topics

Demo

import torch
from delve import CheckLayerSat
from torch.cuda import is_available
from torch.nn import CrossEntropyLoss
from torchvision.datasets import CIFAR10
from torchvision.transforms import ToTensor, Compose
from torch.utils.data.dataloader import DataLoader
from torch.optim import Adam
from torchvision.models.vgg import vgg16

# setup compute device
from tqdm import tqdm

if __name__ == "__main__":

    device = "cuda:0" if is_available() else "cpu"

    # Get some data
    train_data = CIFAR10(root="./tmp", train=True,
                         download=True, transform=Compose([ToTensor()]))
    test_data = CIFAR10(root="./tmp", train=False, download=True, transform=Compose([ToTensor()]))

    train_loader = DataLoader(train_data, batch_size=1024,
                              shuffle=True, num_workers=6,
                              pin_memory=True)
    test_loader = DataLoader(test_data, batch_size=1024,
                             shuffle=False, num_workers=6,
                             pin_memory=True)

    # instantiate model
    model = vgg16(num_classes=10).to(device)

    # instantiate optimizer and loss
    optimizer = Adam(params=model.parameters())
    criterion = CrossEntropyLoss().to(device)

    # initialize delve
    tracker = CheckLayerSat("my_experiment", save_to="plotcsv", modules=model, device=device)

    # begin training
    for epoch in range(10):
        model.train()
        for (images, labels) in tqdm(train_loader):
            images, labels = images.to(device), labels.to(device)
            prediction = model(images)
            optimizer.zero_grad(set_to_none=True)
            with torch.cuda.amp.autocast():
                outputs = model(images)
                _, predicted = torch.max(outputs.data, 1)

                loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

        total = 0
        test_loss = 0
        correct = 0
        model.eval()
        for (images, labels) in tqdm(test_loader):
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            loss = criterion(outputs, labels)
            _, predicted = torch.max(outputs.data, 1)

            total += labels.size(0)
            correct += torch.sum((predicted == labels)).item()
            test_loss += loss.item()
    
        # add some additional metrics we want to keep track of
        tracker.add_scalar("accuracy", correct / total)
        tracker.add_scalar("loss", test_loss / total)

        # add saturation to the mix
        tracker.add_saturations()

    # close the tracker to finish training
    tracker.close()

Why this name, Delve?

delve (verb):

  • reach inside a receptacle and search for something
  • to carry on intensive and thorough research for data, information, or the like

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

delve-0.1.44.tar.gz (26.5 kB view details)

Uploaded Source

Built Distribution

delve-0.1.44-py2.py3-none-any.whl (25.9 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file delve-0.1.44.tar.gz.

File metadata

  • Download URL: delve-0.1.44.tar.gz
  • Upload date:
  • Size: 26.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/54.1.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.7.10

File hashes

Hashes for delve-0.1.44.tar.gz
Algorithm Hash digest
SHA256 93fec208c7d43212e138162645c02a4cfaa93b115fc249994e6c08c17a1f331a
MD5 72c2f3166ddaad40d5f60bbcc4204ab4
BLAKE2b-256 298cbcd50b0f2c9c6bc448f07abfd837f400b3b39e7cc371763efdce83a515c5

See more details on using hashes here.

Provenance

File details

Details for the file delve-0.1.44-py2.py3-none-any.whl.

File metadata

  • Download URL: delve-0.1.44-py2.py3-none-any.whl
  • Upload date:
  • Size: 25.9 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.9.6

File hashes

Hashes for delve-0.1.44-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 c87bc64986db397cd5c0951811a55cb53a6835feea9435d262c1f1f9734753ad
MD5 afda4013ac5335f86718e0e9cc07999b
BLAKE2b-256 bf731e9bff2a65f7195cdca056f2add56a118859e6289aadc0e3a71e0cf9ff97

See more details on using hashes here.

Provenance

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page