
Pytorch Zoo

A collection of useful modules and utilities for kaggle not available in Pytorch


Overview · Installation · Documentation · Contributing · Authors · License

Overview

Installation

pytorch_zoo can be installed from PyPI with pip:

pip install pytorch_zoo

Documentation

Notifications

Sending yourself notifications when your models finish training

IFTTT allows you to easily do this. Follow https://medium.com/datadriveninvestor/monitor-progress-of-your-training-remotely-f9404d71b720 to set up an IFTTT webhook and get a secret key.

Once you have a key, you can send yourself a notification with:

from pytorch_zoo.utils import notify

message = f'Validation loss: {val_loss}'
obj = {'value1': 'Training Finished', 'value2': message}

notify(obj, [YOUR_SECRET_KEY_HERE])

Viewing training progress with tensorboard in a kaggle kernel

Make sure tensorboard is installed in the kernel and run the following in a code cell near the beginning of your kernel:

!mkdir logs
!wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
!unzip -o ngrok-stable-linux-amd64.zip

# start tensorboard and an ngrok tunnel to it, both in the background
LOG_DIR = './logs'
get_ipython().system_raw(
    'tensorboard --logdir {} --host 0.0.0.0 --port 6006 &'
    .format(LOG_DIR)
)
get_ipython().system_raw('./ngrok http 6006 &')

# ask ngrok's local API for the public URL of the tunnel
temp = !curl -s http://localhost:4040/api/tunnels | python3 -c "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"

# send yourself the public tensorboard URL
from pytorch_zoo.utils import notify

obj = {'value1': 'Tensorboard URL', 'value2': temp[0]}
notify(obj, [YOUR_SECRET_KEY_HERE])

# clean up the downloaded ngrok files
!rm ngrok
!rm ngrok-stable-linux-amd64.zip

This will start tensorboard, set up an HTTP tunnel, and send you a notification with a URL where you can access tensorboard.

Data

DynamicSampler(sampler, batch_size=32)

A dynamic batch length data sampler. To be used with trim_tensors.

train_dataset = data.TensorDataset(data)
sampler = data.RandomSampler(train_dataset)
sampler = DynamicSampler(sampler, batch_size=32, drop_last=False)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_sampler=sampler)

for epoch in range(10):
    for batch in train_loader:
        batch = trim_tensors(batch)
        train_batch(...)

Arguments:
sampler (torch.utils.data.Sampler): Base sampler.
batch_size (int): Size of minibatch.
drop_last (bool): If True, the sampler will drop the last batch if its size would be less than batch_size.

trim_tensors(tensors)

Trim padding off of a batch of tensors to the smallest possible length. To be used with DynamicSampler.

train_dataset = data.TensorDataset(data)
sampler = data.RandomSampler(train_dataset)
sampler = DynamicSampler(sampler, batch_size=32, drop_last=False)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_sampler=sampler)

for epoch in range(10):
    for batch in train_loader:
        batch = trim_tensors(batch)
        train_batch(...)

Arguments:
tensors ([torch.tensor]): list of tensors to trim.

Returns:
([torch.tensor]): list of trimmed tensors.
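
To make the trimming concrete, here is a minimal sketch, assuming the pytorch_zoo.data import path and that 0 is the padding value:

import torch
from pytorch_zoo.data import trim_tensors  # assumed import path

# a batch of token ids, zero-padded to a common length of 5
tokens = torch.tensor([[5, 3, 9, 0, 0],
                       [7, 2, 0, 0, 0]])

trimmed = trim_tensors([tokens])
print(trimmed[0].shape)  # torch.Size([2, 3]): trailing all-padding columns are removed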

Loss

lovasz_hinge(logits, labels, per_image=True)

The binary Lovász hinge loss for semantic segmentation.

loss = lovasz_hinge(logits, labels)

Arguments:
logits (torch.tensor): Logits at each pixel (between -∞ and +∞).
labels (torch.tensor): Binary ground truth masks (0 or 1).
per_image (bool, optional): Compute the loss per image instead of per batch. Defaults to True.

Shape:

  • Input:
    • logits: (batch, height, width)
    • labels: (batch, height, width)
  • Output: (batch)

Returns:
(torch.tensor): The Lovász hinge loss.
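
A minimal end-to-end sketch, assuming the pytorch_zoo.loss import path and using random tensors purely to illustrate the expected shapes:

import torch
from pytorch_zoo.loss import lovasz_hinge  # assumed import path

logits = torch.randn(4, 128, 128, requires_grad=True)  # (batch, height, width)
labels = (torch.rand(4, 128, 128) > 0.5).float()       # binary masks, same shape

loss = lovasz_hinge(logits, labels, per_image=True)
loss.mean().backward()  # reduce first in case the loss is returned per image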

Metrics

iou(y_true_in, y_pred_in)

Calculates the average IOU (intersection over union) score on thresholds from 0.5 to 0.95 with a step size of 0.05.

val_iou = iou(y_val, val_preds)

Arguments:
y_true_in (numpy array): Ground truth labels.
y_pred_in (numpy array): Predictions from model.

Returns:
(float): Averaged IoU score over predictions.
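
A small sketch of how the metric is typically called, assuming the pytorch_zoo.metrics import path and hard 0/1 predictions:

import numpy as np
from pytorch_zoo.metrics import iou  # assumed import path

y_val = np.random.randint(0, 2, (8, 96, 96))               # binary ground truth masks
val_preds = (np.random.rand(8, 96, 96) > 0.5).astype(int)  # binarized predictions

val_iou = iou(y_val, val_preds)  # mean IoU over thresholds 0.5, 0.55, ..., 0.95
print(val_iou)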

Modules

SqueezeAndExcitation(in_ch, r=16)

The channel-wise SE (Squeeze and Excitation) block from the Squeeze-and-Excitation Networks paper.

# in __init__()
self.SE = SqueezeAndExcitation(in_ch, r=16)

# in forward()
x = self.SE(x)

Arguments:
in_ch (int): The number of channels in the feature map of the input.
r (int): The reduction ratio of the intermediate channels. Default: 16.

Shape:

  • Input: (batch, channels, height, width)
  • Output: (batch, channels, height, width) (same shape as input)
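
As a self-contained sketch of where the block usually sits, assuming the pytorch_zoo.modules import path (the ConvSEBlock wrapper is hypothetical, for illustration only):

import torch
import torch.nn as nn
from pytorch_zoo.modules import SqueezeAndExcitation  # assumed import path

class ConvSEBlock(nn.Module):  # hypothetical wrapper for illustration
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.se = SqueezeAndExcitation(out_ch, r=16)  # channel-wise recalibration

    def forward(self, x):
        return self.se(torch.relu(self.conv(x)))

x = torch.randn(2, 3, 32, 32)
print(ConvSEBlock(3, 64)(x).shape)  # torch.Size([2, 64, 32, 32]): shape unchanged
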
ChannelSqueezeAndSpatialExcitation(in_ch)

The sSE (Channel Squeeze and Spatial Excitation) block from the Concurrent Spatial and Channel ‘Squeeze & Excitation’ in Fully Convolutional Networks paper.

# in __init__()
self.sSE = ChannelSqueezeAndSpatialExcitation(in_ch)

# in forward()
x = self.sSE(x)

Arguments:
in_ch (int): The number of channels in the feature map of the input.

Shape:

  • Input: (batch, channels, height, width)
  • Output: (batch, channels, height, width) (same shape as input)

ConcurrentSpatialAndChannelSqueezeAndChannelExcitation(in_ch, r=16)

The scSE (Concurrent Spatial and Channel Squeeze and Channel Excitation) block from the Concurrent Spatial and Channel ‘Squeeze & Excitation’ in Fully Convolutional Networks paper.

# in __init__()
self.scSE = ConcurrentSpatialAndChannelSqueezeAndChannelExcitation(in_ch, r=16)

# in forward()
x = self.scSE(x)

Arguments:
in_ch (int): The number of channels in the feature map of the input.
r (int): The reduction ratio of the intermediate channels. Default: 16.

Shape:

  • Input: (batch, channels, height, width)
  • Output: (batch, channels, height, width) (same shape as input)

GaussianNoise(stddev=0.1)

A Gaussian noise module.

# in __init__()
self.gaussian_noise = GaussianNoise(0.1)

# in forward()
x = self.gaussian_noise(x)

Arguments:
stddev (float): The standard deviation of the normal distribution. Default: 0.1.

Shape:

  • Input: (batch, *)
  • Output: (batch, *) (same shape as input)

Schedulers

CyclicalMomentum(optimizer, base_momentum=0.8, max_momentum=0.9, step_size=2000, mode="triangular")

Pytorch's cyclical learning rates, but for momentum, which leads to better results when used with cyclic learning rates, as shown in A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay.

from pytorch_zoo.schedulers import CyclicalMomentum  # assumed import path

optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = CyclicalMomentum(optimizer)
data_loader = torch.utils.data.DataLoader(...)
for epoch in range(10):
    for batch in data_loader:
        scheduler.batch_step()
        train_batch(...)

Arguments:
optimizer (Optimizer): Wrapped optimizer.
base_momentum (float or list): Initial momentum, the lower boundary in the cycle for each param group. Default: 0.8.
max_momentum (float or list): Upper boundary in the cycle for each param group. Default: 0.9.
step_size (int): Number of training iterations per half cycle. The authors suggest setting step_size to 2-8x the training iterations in an epoch. Default: 2000.
mode (str): One of {'triangular', 'triangular2', 'exp_range'}. Default: 'triangular'.
gamma (float): Constant in the 'exp_range' scaling function. Default: 1.0.
scale_fn (function): Custom scaling policy defined by a single-argument lambda function. If set, the mode parameter is ignored. Default: None.
scale_mode (str): One of {'cycle', 'iterations'}. Defines whether scale_fn is evaluated on the cycle number or on cycle iterations (training iterations since the start of the cycle). Default: 'cycle'.
last_batch_iteration (int): The index of the last batch. Default: -1.

Utils

notify({'value1': 'Notification title', 'value2': 'Notification body'}, key)

Send a notification to your phone with IFTTT

Set up an IFTTT webhook with https://medium.com/datadriveninvestor/monitor-progress-of-your-training-remotely-f9404d71b720

notify({'value1': 'Notification title', 'value2': 'Notification body'}, key=[YOUR_PRIVATE_KEY_HERE])

Arguments:
obj (Object): The object to send to IFTTT.
key (String): Your IFTTT webhook key.

seed_environment(seed=42)

Set random seeds for python, numpy, and pytorch to ensure reproducible research.

seed_environment(42)

Arguments:
seed (int): The random seed to set. Default: 42.

gpu_usage(device, digits=4)

Prints the amount of GPU memory currently allocated in GB.

gpu_usage(device, digits=4)

Arguments:
device (torch.device, optional): The device you want to check. Defaults to device.
digits (int, optional): The number of digits of precision. Defaults to 4.

n_params(model)

Return the number of parameters in a pytorch model.

print(n_params(model))

Arguments:
model (nn.Module): The model to analyze.

Returns:
(int): The number of parameters in the model.

save_model(model, fold=0)

Save a trained pytorch model on a particular cross-validation fold to disk.

save_model(model, fold=0)

Arguments:
model (nn.Module): The model to save.
fold (int): The cross-validation fold the model was trained on.

load_model(model, fold=0)

Load a trained pytorch model saved to disk using save_model.

model = load_model(model, fold=0)

Arguments:
model (nn.Module): The model to load the saved weights into.
fold (int): Which saved model fold to load.

Returns:
(nn.Module): The same model that was passed in, but with the pretrained weights loaded.

save(obj, 'obj.pkl')

Save an object to disk.

save(tokenizer, 'tokenizer.pkl')

Arguments:
obj (Object): The object to save.
filename (String): The name of the file to save the object to.

load('obj.pkl')

Load an object saved to disk with save.

tokenizer = load('tokenizer.pkl')

Arguments:
path (String): The path to the saved object.

Returns:
(Object): The loaded object.

masked_softmax(logits, mask, dim=-1)

A masked softmax module to correctly implement attention in Pytorch.

out = masked_softmax(logits, mask, dim=-1)

Arguments:
logits (torch.tensor): The tensor to softmax.
mask (torch.tensor): A tensor indicating which indices are to be masked out of the softmax operation.
dim (int, optional): The dimension to softmax over. Defaults to -1.
memory_efficient (bool, optional): Whether to use a less precise but more memory-efficient implementation of masked softmax. Defaults to False.
mask_fill_value (float, optional): The value to fill masked values with if memory_efficient is True. Defaults to -1e32.

Returns:
(torch.tensor): The masked softmax output.
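
A small worked example of the usual attention use case, masking out a padding position (the values are arbitrary; the import path follows the Utils grouping above):

import torch
from pytorch_zoo.utils import masked_softmax

logits = torch.tensor([[2.0, 1.0, 0.5, -1.0]])
mask = torch.tensor([[1.0, 1.0, 1.0, 0.0]])  # the last position is padding

weights = masked_softmax(logits, mask, dim=-1)
print(weights)  # the padded position receives zero weight; the rest sum to 1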

masked_log_softmax(logits, mask, dim=-1)

A masked log-softmax module to correctly implement attention in Pytorch.

out = masked_log_softmax(logits, mask, dim=-1)

Arguments:
logits (torch.tensor): The tensor to log-softmax.
mask (torch.tensor): A tensor indicating which indices are to be masked out of the log-softmax operation.
dim (int, optional): The dimension to log-softmax over. Defaults to -1.

Returns:
(torch.tensor): The masked log-softmax output.

Contributing

This repository is still a work in progress, so if you find a bug, think something is missing, or have a suggestion for a new feature or module, feel free to open an issue or a pull request.

Authors

  • Bilal Khan - Initial work

License

This project is licensed under the MIT License - see the license file for details.

Acknowledgements

This project contains code adapted from:

This README is based on:

