neptune.ai pytorch integration library

Project description

Neptune - PyTorch integration

The Neptune-PyTorch integration simplifies tracking your PyTorch experiments by automatically logging model internals, including activations, gradients, and parameters, to Neptune.

Installation

pip install -U neptune-pytorch

Requirements

  • Neptune 3.x: Requires a Neptune 3.x account. See the Getting Started Guide for setup instructions.
  • Python 3.10+: Minimum Python version requirement
  • PyTorch 1.11+: For tensor operations and model support
  • NumPy 1.20+: For numerical computations

Quickstart

The quickstart example below logs the following data to Neptune:

  • Model architecture: Visual diagram and summary of the neural network
  • Training metrics: Loss curves and epoch progress
  • Layer activations: Mean, norm, and histograms for each layer
  • Gradient analysis: Gradient statistics to detect vanishing/exploding gradients
  • Parameter tracking: Weight and bias distributions over time

import torch
import torch.nn as nn
import torch.optim as optim
from neptune_scale import Run
from neptune_pytorch import NeptuneLogger

# Initialize Neptune run
run = Run(project="your-project/experiment-tracking")

# Create your PyTorch model
model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10)
)

# Initialize Neptune logger with model tracking
neptune_logger = NeptuneLogger(
    run=run,
    model=model,
    base_namespace="mnist_classification",  # Organizes all metrics under this folder
    log_model_diagram=True,  # Generates model architecture diagram
)

# Training setup
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())

# Placeholder data and schedule; replace with your own DataLoader
train_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(
        torch.randn(640, 784), torch.randint(0, 10, (640,))
    ),
    batch_size=32,
)
num_epochs = 5

# Training loop with comprehensive tracking
for epoch in range(num_epochs):
    for batch_idx, (data, target) in enumerate(train_loader):
        # Forward pass
        output = model(data)
        loss = criterion(output, target)

        # Backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Log training metrics to Neptune
        run.log_metrics({
            f"{neptune_logger.base_namespace}/batch/loss": loss.item(),
            f"{neptune_logger.base_namespace}/epoch": epoch,
        })

        # Track model internals every 10 steps
        if batch_idx % 10 == 0:
            neptune_logger.log_model_internals(
                step=batch_idx,
                prefix="train",
                track_activations=True,   # Monitor activation patterns
                track_gradients=True,     # Track gradient flow
                track_parameters=True     # Log parameter statistics
            )

Advanced configuration

The example below demonstrates the following additional features:

  • Layer filtering: Only track Conv2d and Linear layers (reduces overhead)
  • Custom statistics: Log only mean, norm, and hist instead of all 8 statistics
  • Phase-specific tracking: Different tracking strategies for train/validation
  • Frequency control: Track every 20 steps in training, every 50 in validation

import torch
import torch.nn as nn
import torch.optim as optim
from neptune_scale import Run
from neptune_pytorch import NeptuneLogger

# Initialize Neptune run
run = Run(project="your-project/advanced-tracking")

# Create a more complex model (e.g., CNN for image classification)
class CNNModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        self.fc1 = nn.Linear(64 * 8 * 8, 128)
        self.fc2 = nn.Linear(128, 10)
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        x = self.pool(self.relu(self.conv2(x)))
        x = x.view(x.size(0), -1)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = CNNModel()

# Advanced Neptune logger configuration
neptune_logger = NeptuneLogger(
    run=run,
    model=model,
    base_namespace="cnn_experiment",  # Custom organization folder
    track_layers=[nn.Conv2d, nn.Linear],  # Only track conv and linear layers
    tensor_stats=["mean", "norm", "hist"],  # Log only selected statistics
    log_model_diagram=True,  # Log model summary and diagram
)

# Training setup with placeholder data; replace with your own pipeline
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())
num_epochs = 5
train_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(
        torch.randn(320, 3, 32, 32), torch.randint(0, 10, (320,))
    ),
    batch_size=32,
)
val_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(
        torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,))
    ),
    batch_size=32,
)

# Training with phase-specific tracking
for epoch in range(num_epochs):
    # Training phase - comprehensive tracking
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        # ... your training code ...
        output = model(data)
        loss = criterion(output, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Track everything during training
        if batch_idx % 20 == 0:  # Every 20 steps
            neptune_logger.log_model_internals(
                step=batch_idx,
                prefix="train",
                track_activations=True,   # Monitor activation patterns
                track_gradients=True,     # Track gradient flow
                track_parameters=True     # Log parameter statistics
            )

    # Validation phase - lightweight tracking
    model.eval()
    with torch.no_grad():
        for batch_idx, (data, target) in enumerate(val_loader):
            # ... your validation code ...
            output = model(data)
            val_loss = criterion(output, target)

            # Only track activations during validation (faster)
            if batch_idx % 50 == 0:  # Every 50 steps
                neptune_logger.log_model_internals(
                    step=batch_idx,
                    prefix="validation",
                    track_activations=True,   # Monitor activation patterns
                    track_gradients=False,    # Skip gradients (no backward pass)
                    track_parameters=False    # Skip parameters (expensive)
                )

Features

Model monitoring

  • Layer activations: Track activation patterns across all layers with 8 different statistics
  • Gradient analysis: Monitor gradient flow and detect vanishing/exploding gradients
  • Parameter tracking: Log parameter statistics and distributions for model analysis
  • Custom statistics: Choose from mean, std, norm, min, max, var, abs_mean, and hist

Configuration options

  • Layer filtering: Track only specific layer types (Conv2d, Linear, etc.)
  • Phase organization: Separate tracking for training/validation phases with custom prefixes
  • Custom namespaces: Organize experiments with custom folder structures

Visualizations

  • Model architecture: Automatic model diagram generation with torchviz
  • Distribution histograms: 50-bin histograms for all tracked metrics
  • Real-time monitoring: Live tracking during training with Neptune
  • Comparative analysis: Easy comparison across experiments and runs

Integration

  • Minimal setup: Simple integration with existing code
  • PyTorch native: Works with existing PyTorch workflows

Performance optimization

Since parameter logging can be expensive for large models, you can control the frequency explicitly:

for step in range(num_steps):
    # ... training code ...

    # Log lightweight metrics every step
    neptune_logger.log_model_internals(
        step=step,
        track_activations=True,
        track_gradients=True,
        track_parameters=False  # Skip expensive parameter logging
    )

    # Log expensive parameters less frequently
    if step % 100 == 0:
        neptune_logger.log_model_internals(
            step=step,
            track_activations=False,
            track_gradients=False,
            track_parameters=True
        )

Namespace structure

The integration organizes all logged data under a clear, customizable hierarchical namespace:

{base_namespace}/                   # Optional custom top-level folder
├── batch/
│   └── loss                        # Training loss per batch (logged by the user)
├── model/
│   ├── summary                     # Model architecture (if log_model_diagram=True)
│   └── internals/                  # Model internals tracking
│       └── {prefix}/               # Optional prefix (e.g., "train", "validation")
│           ├── activations/        # Layer activations
│           │   └── {layer_name}/
│           │       ├── mean        # Mean activation value
│           │       ├── std         # Standard deviation
│           │       ├── norm        # L2 norm
│           │       ├── min         # Minimum value
│           │       ├── max         # Maximum value
│           │       ├── var         # Variance
│           │       ├── abs_mean    # Mean of absolute values
│           │       └── hist        # Histogram (50 bins)
│           ├── gradients/          # Layer gradients
│           │   └── {layer_name}/
│           │       └── {statistic} # Same statistics as activations
│           └── parameters/         # Model parameters
│               └── {layer_name}/
│                   └── {statistic} # Same statistics as activations

Example namespaces:

With base_namespace="my_experiment":

  • my_experiment/batch/loss - Training loss
  • my_experiment/model/summary - Model architecture
  • my_experiment/model/internals/activations/conv/1/mean - Mean activation (no prefix)
  • my_experiment/model/internals/train/activations/conv/1/mean - Mean activation (with "train" prefix)
  • my_experiment/model/internals/validation/gradients/linear1/norm - L2 norm of gradients (with "validation" prefix)

With base_namespace=None:

  • batch/loss - Training loss
  • model/summary - Model architecture
  • model/internals/activations/conv/1/mean - Mean activation (no prefix)
  • model/internals/train/activations/conv/1/mean - Mean activation (with "train" prefix)
  • model/internals/validation/gradients/linear1/norm - L2 norm of gradients (with "validation" prefix)

Layer name handling:

  • Dots in layer names are automatically replaced with forward slashes for proper namespace organization
  • Example: seq_model.0.weight becomes seq_model/0/weight in the namespace
  • Example: module.submodule.layer becomes module/submodule/layer in the namespace
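
The sketch below (a hypothetical helper, not part of the library's API) shows how the pieces above combine into a full metric path, including the dot-to-slash replacement:

def metric_path(base_namespace, prefix, kind, layer_name, statistic):
    # kind is "activations", "gradients", or "parameters"
    parts = [base_namespace, "model", "internals", prefix, kind,
             layer_name.replace(".", "/"), statistic]
    return "/".join(p for p in parts if p)  # skip None/empty segments

print(metric_path("my_experiment", "train", "activations", "conv.1", "mean"))
# my_experiment/model/internals/train/activations/conv/1/mean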

Available statistics: mean, std, norm, min, max, var, abs_mean, hist

API reference

NeptuneLogger

NeptuneLogger(
    run: Run,
    model: torch.nn.Module,
    base_namespace: Optional[str] = None,
    track_layers: Optional[List[Type[nn.Module]]] = None,
    tensor_stats: Optional[List[TensorStatType]] = None,
    log_model_diagram: bool = False
)

Parameters:

  • run: Neptune run object for logging
  • model: PyTorch model to track
  • base_namespace: Optional top-level folder for organization (default: None)
  • track_layers: List of layer types to track (default: None = all layers)
  • tensor_stats: Statistics to compute (default: ["mean", "norm", "hist"])
  • log_model_diagram: Log the model summary and diagram (default: False)
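
For reference, a minimal construction using only the defaults might look like this (run and model are the Neptune run and PyTorch model from the examples above):

# Tracks all layers with the default statistics ("mean", "norm", "hist"),
# logs at the run root (base_namespace=None), and skips the model diagram.
neptune_logger = NeptuneLogger(run=run, model=model)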

log_model_internals()

log_model_internals(
    step: int,
    track_activations: bool = True,
    track_gradients: bool = True,
    track_parameters: bool = False,
    prefix: Optional[str] = None
)

Parameters:

  • step: Current training step for logging
  • track_activations: Track layer activations (default: True)
  • track_gradients: Track layer gradients (default: True)
  • track_parameters: Track model parameters (default: False)
  • prefix: Optional phase identifier (e.g., "train", "validation")
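
A typical call given these defaults might look as follows (global_step is a hypothetical step counter maintained by your training loop):

# Logs activations and gradients under .../internals/train/...;
# parameters are skipped unless track_parameters=True is passed.
neptune_logger.log_model_internals(step=global_step, prefix="train")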

Available statistics

Statistic   Description               Use case
mean        Mean value                Monitor activation levels
std         Standard deviation        Detect activation variance
norm        L2 norm                   Monitor gradient/activation magnitude
min         Minimum value             Detect dead neurons
max         Maximum value             Detect saturation
var         Variance                  Monitor activation spread
abs_mean    Mean of absolute values   Monitor activation strength
hist        50-bin histogram          Visualize distributions
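
As a rough illustration of what each statistic computes, here is a plain-PyTorch sketch (not the library's internal code):

import torch

t = torch.randn(1000)  # stand-in for an activation, gradient, or parameter tensor
stats = {
    "mean": t.mean().item(),           # mean value
    "std": t.std().item(),             # standard deviation
    "norm": t.norm(p=2).item(),        # L2 norm
    "min": t.min().item(),             # minimum value
    "max": t.max().item(),             # maximum value
    "var": t.var().item(),             # variance
    "abs_mean": t.abs().mean().item(), # mean of absolute values
}
hist = torch.histc(t, bins=50)         # counts for the 50-bin histogram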

Contributing

Contributions to neptune-pytorch are welcome. Here's how you can help:

Report issues

  • Found a bug? Open an issue
  • Include Python version, PyTorch version, and error traceback
  • Provide a minimal reproducible example

Suggest features

  • Have an idea? Create a feature request
  • Describe the use case and expected behavior
  • Check existing issues first to avoid duplicates

Contribute code

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes and add tests
  4. Run tests: pytest tests/
  5. Commit your changes: git commit -m 'Add amazing feature'
  6. Push to remote: git push origin feature/amazing-feature
  7. Open a Pull Request

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.


Made with ❤️ by the Neptune team

Download files

Download the file for your platform.

Source Distribution

neptune_pytorch-3.0.0.tar.gz (18.9 kB)

Built Distribution

neptune_pytorch-3.0.0-py3-none-any.whl (18.0 kB)

File details

Details for the file neptune_pytorch-3.0.0.tar.gz.

File metadata

  • Download URL: neptune_pytorch-3.0.0.tar.gz
  • Size: 18.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.24

File hashes

Hashes for neptune_pytorch-3.0.0.tar.gz
Algorithm    Hash digest
SHA256       8163af4abf396bec248450d7a2b387ae26ba46277904cf45d858f39515763d53
MD5          5e0437295c69acb864baaa0464712413
BLAKE2b-256  9f4855c238ee66c29d3f27be4210d4722af90bd394a3bca04add1f79f2d530b3

File details

Details for the file neptune_pytorch-3.0.0-py3-none-any.whl.

File hashes

Hashes for neptune_pytorch-3.0.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       1fe412453bd2df9787daed4c4ea3d5d92c41902e3a80117e3ab37d7004dd5501
MD5          b28eb4b54ad7591fef10448fd769156d
BLAKE2b-256  a0d7eeb06a17cd0856f19cfba715455a2fa3a7fdf4b1d27a10649d4be0ccaede
