A collection of useful modules and utilities for Kaggle that are not available in PyTorch


License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Pytorch Zoo

A collection of useful modules and utilities (especially helpful for Kaggle competitions) that are not available in PyTorch

Overview • Installation • Documentation • Contributing • Authors • License • Acknowledgements

Made by Bilal Khan • https://bilal.software


Installation

pytorch_zoo can be installed with pip:

pip install pytorch_zoo

Documentation

Notifications

Sending yourself notifications when your models finish training

IFTTT allows you to do this easily. Follow https://medium.com/datadriveninvestor/monitor-progress-of-your-training-remotely-f9404d71b720 to set up an IFTTT webhook and get a secret key.

Once you have a key, you can send yourself a notification with:

from pytorch_zoo.utils import notify

message = f'Validation loss: {val_loss}'
obj = {'value1': 'Training Finished', 'value2': message}

notify(obj, [YOUR_SECRET_KEY_HERE])

Viewing training progress with tensorboard in a kaggle kernel

Make sure tensorboard is installed in the kernel and run the following in a code cell near the beginning of your kernel:

!mkdir logs
!wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
!unzip -o ngrok-stable-linux-amd64.zip
LOG_DIR = './logs'
get_ipython().system_raw(
    'tensorboard --logdir {} --host 0.0.0.0 --port 6006 &'
    .format(LOG_DIR)
)
get_ipython().system_raw('./ngrok http 6006 &')

!curl -s http://localhost:4040/api/tunnels | python3 -c \
    "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"

temp = !curl -s http://localhost:4040/api/tunnels | python3 -c "import sys,json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"

from pytorch_zoo.utils import notify

obj = {'value1': 'Tensorboard URL', 'value2': temp[0]}
notify(obj, [YOUR_SECRET_KEY_HERE])

!rm ngrok
!rm ngrok-stable-linux-amd64.zip

This will start tensorboard, set up an HTTP tunnel with ngrok, and send you a notification with a URL where you can access tensorboard.

Data

DynamicSampler(sampler, batch_size=32)

A dynamic batch length data sampler. To be used with trim_tensors.

Implementation adapted from https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/discussion/94779 and https://github.com/pytorch/pytorch/blob/master/torch/utils/data/sampler.py

train_dataset = data.TensorDataset(data)
sampler = data.RandomSampler(train_dataset)
sampler = DynamicSampler(sampler, batch_size=32, drop_last=False)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_sampler=sampler)

for epoch in range(10):
    for batch in train_loader:
        batch = trim_tensors(batch)
        train_batch(...)

Arguments:
sampler (torch.utils.data.Sampler): Base sampler.
batch_size (int): Size of minibatch.
drop_last (bool): If True, the sampler will drop the last batch if its size would be less than batch_size.
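
To make the idea of a dynamic batch length sampler concrete, here is a minimal pure-Python sketch of length-bucketed batching. It is an illustration only, not pytorch_zoo's implementation: the function `length_bucketed_batches` and its `bucket_width` parameter are hypothetical names, and the bucketing rule (integer division of the sequence length) is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Sequence


def length_bucketed_batches(
    indices: Iterable[int],
    lengths: Sequence[int],
    batch_size: int = 32,
    bucket_width: int = 64,  # hypothetical parameter: width of one length bucket
    drop_last: bool = False,
) -> Iterator[List[int]]:
    """Yield batches of indices whose sequences have similar lengths."""
    buckets: Dict[int, List[int]] = defaultdict(list)
    for idx in indices:
        bucket = buckets[lengths[idx] // bucket_width]
        bucket.append(idx)
        if len(bucket) == batch_size:
            yield list(bucket)  # copy, because the bucket is reused
            bucket.clear()
    # Whatever is left in the buckets is merged into the final batches.
    leftover = [idx for bucket in buckets.values() for idx in bucket]
    for start in range(0, len(leftover), batch_size):
        batch = leftover[start:start + batch_size]
        if len(batch) == batch_size or not drop_last:
            yield batch


lengths = [5, 70, 6, 130, 7, 72, 8]
batches = list(length_bucketed_batches(range(len(lengths)), lengths, batch_size=2))
short_batches = list(
    length_bucketed_batches(range(len(lengths)), lengths, batch_size=2, drop_last=True)
)
```

Because each batch holds sequences of similar length, trimming the padding afterwards removes more columns than it would for randomly mixed batches.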

trim_tensors(tensors)

Trim padding off of a batch of tensors to the smallest possible length. To be used with DynamicSampler.

Implementation adapted from https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/discussion/94779

train_dataset = data.TensorDataset(data)
sampler = data.RandomSampler(train_dataset)
sampler = DynamicSampler(sampler, batch_size=32, drop_last=False)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_sampler=sampler)

for epoch in range(10):
    for batch in train_loader:
        batch = trim_tensors(batch)
        train_batch(...)

Arguments:
tensors ([torch.tensor]): list of tensors to trim.

Returns:
([torch.tensor]): list of trimmed tensors.
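
The trimming step can be sketched as follows. This is a hedged illustration rather than the library's code: the helper name `trim_padding`, the `pad_value` parameter, and the assumptions that the first tensor holds right-padded token ids and that padding uses the value 0 are all mine.

```python
from typing import List

import torch


def trim_padding(tensors: List[torch.Tensor], pad_value: int = 0) -> List[torch.Tensor]:
    """Cut every sequence tensor down to the longest real sequence in the batch."""
    # Assumption: tensors[0] has shape (batch, seq_len) and is padded on the right.
    lengths = (tensors[0] != pad_value).sum(dim=1)
    max_len = max(int(lengths.max()), 1)
    seq_len = tensors[0].shape[1]
    # Only tensors that share the sequence dimension are trimmed; others pass through.
    return [
        t[:, :max_len] if t.dim() > 1 and t.shape[1] == seq_len else t
        for t in tensors
    ]


tokens = torch.tensor([[4, 9, 0, 0, 0], [7, 2, 5, 0, 0]])
labels = torch.tensor([1, 0])
trimmed_tokens, trimmed_labels = trim_padding([tokens, labels])
```

Here the longest real sequence has three tokens, so the token tensor shrinks from five columns to three while the label tensor is left alone.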

Loss

lovasz_hinge(logits, labels, per_image=True)

The binary Lovasz Hinge loss for semantic segmentation.

Implementation adapted from https://github.com/bermanmaxim/LovaszSoftmax

loss = lovasz_hinge(logits, labels)

Arguments:
logits (torch.tensor): Logits at each pixel (between -∞ and +∞).
labels (torch.tensor): Binary ground truth masks (0 or 1).
per_image (bool, optional): Compute the loss per image instead of per batch. Defaults to True.

Shape:

Input:
- logits: (batch, height, width)
- labels: (batch, height, width)
Output: (batch)

Returns:
(torch.tensor): The lovasz hinge loss
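
For readers who want to see the arithmetic, the following sketch computes the binary Lovasz hinge on flattened predictions, following the structure of the reference repository linked above. It is an illustration under assumptions (flat 1-D inputs, no per-image handling, no ignore label); the function names are local to this example.

```python
import torch
import torch.nn.functional as F


def lovasz_grad(gt_sorted: torch.Tensor) -> torch.Tensor:
    """Gradient of the Lovasz extension of the Jaccard loss for sorted errors."""
    gts = gt_sorted.sum()
    intersection = gts - gt_sorted.cumsum(0)
    union = gts + (1.0 - gt_sorted).cumsum(0)
    jaccard = 1.0 - intersection / union
    if gt_sorted.numel() > 1:
        # Turn cumulative Jaccard values into per-position increments.
        jaccard[1:] = jaccard[1:] - jaccard[:-1]
    return jaccard


def lovasz_hinge_flat(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Binary Lovasz hinge loss for 1-D logits and 0/1 labels."""
    labels = labels.float()
    signs = 2.0 * labels - 1.0
    errors = 1.0 - logits * signs  # hinge margins
    errors_sorted, perm = torch.sort(errors, descending=True)
    grad = lovasz_grad(labels[perm])
    return torch.dot(F.relu(errors_sorted), grad)


labels = torch.tensor([1, 0, 1])
perfect = lovasz_hinge_flat(torch.tensor([5.0, -5.0, 5.0]), labels)
one_miss = lovasz_hinge_flat(torch.tensor([-5.0, -5.0, 5.0]), labels)
```

Confident, correct predictions have negative margins and contribute nothing, while a confidently wrong pixel is weighted by how much it changes the Jaccard index.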

DiceLoss()

The dice loss for semantic segmentation.

Implementation adapted from https://www.kaggle.com/soulmachine/siim-deeplabv3

criterion = DiceLoss()
loss = criterion(logits, targets)

Shape:

Input:
- logits: (batch, *)
- targets: (batch, *) same as logits
Output: (1)

Returns:
(torch.tensor): The dice loss
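
A soft dice loss can be sketched in a few lines. The version below is an assumption-laden illustration, not necessarily what DiceLoss does internally: the class name `SoftDiceLoss` and the additive `smooth` term are hypothetical, and the loss is computed over the whole batch at once.

```python
import torch
from torch import nn


class SoftDiceLoss(nn.Module):
    """Soft dice loss computed on sigmoid probabilities (illustrative sketch)."""

    def __init__(self, smooth: float = 1.0):
        super().__init__()
        self.smooth = smooth  # assumption: smoothing term that avoids 0 / 0

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        probs = torch.sigmoid(logits).reshape(logits.size(0), -1)
        targets = targets.reshape(targets.size(0), -1).float()
        intersection = (probs * targets).sum()
        dice = (2.0 * intersection + self.smooth) / (
            probs.sum() + targets.sum() + self.smooth
        )
        return 1.0 - dice


criterion = SoftDiceLoss()
targets = torch.tensor([[1.0, 0.0, 1.0, 0.0]])
good_loss = criterion(torch.tensor([[10.0, -10.0, 10.0, -10.0]]), targets)
bad_loss = criterion(torch.tensor([[-10.0, 10.0, -10.0, 10.0]]), targets)
```

The loss approaches 0 when the predicted mask overlaps the target and grows toward 1 as the overlap vanishes.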

Metrics

Modules

SqueezeAndExcitation(in_ch, r=16)

The channel-wise SE (Squeeze and Excitation) block from the Squeeze-and-Excitation Networks paper.

Implementation adapted from https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/65939 and https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/66178

# in __init__()
self.SE = SqueezeAndExcitation(in_ch, r=16)

# in forward()
x = self.SE(x)

Arguments:
in_ch (int): The number of channels in the feature map of the input.
r (int): The reduction ratio of the intermediate channels. Default: 16.

Shape:

Input: (batch, channels, height, width)
Output: (batch, channels, height, width) (same shape as input)
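
As a rough picture of what a channel-wise SE block computes, here is a minimal sketch based on the description in the Squeeze-and-Excitation Networks paper. The class name `ChannelSE` and the exact layer choices are assumptions and may differ from the module shipped in pytorch_zoo.

```python
import torch
from torch import nn


class ChannelSE(nn.Module):
    """Channel-wise squeeze and excitation (illustrative sketch)."""

    def __init__(self, in_ch: int, r: int = 16):
        super().__init__()
        # Squeeze: global average pooling, (B, C, H, W) -> (B, C, 1, 1).
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Excitation: bottleneck MLP followed by a sigmoid gate per channel.
        self.gate = nn.Sequential(
            nn.Linear(in_ch, in_ch // r),
            nn.ReLU(inplace=True),
            nn.Linear(in_ch // r, in_ch),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        weights = self.gate(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * weights  # rescale each channel


x = torch.randn(2, 32, 8, 8)
out = ChannelSE(in_ch=32, r=16)(x)
```

Each channel is multiplied by a learned weight between 0 and 1, so the output keeps the input's shape.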

ChannelSqueezeAndSpatialExcitation(in_ch)

The sSE (Channel Squeeze and Spatial Excitation) block from the Concurrent Spatial and Channel ‘Squeeze & Excitation’ in Fully Convolutional Networks paper.

Implementation adapted from https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/66178

# in __init__()
self.sSE = ChannelSqueezeAndSpatialExcitation(in_ch)

# in forward()
x = self.sSE(x)

Arguments:
in_ch (int): The number of channels in the feature map of the input.

Shape:

Input: (batch, channels, height, width)
Output: (batch, channels, height, width) (same shape as input)
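
The spatial counterpart can be sketched in the same way. This is an illustration under assumptions (a single 1x1 convolution followed by a sigmoid), with the hypothetical class name `SpatialSE`; the library's module may differ in detail.

```python
import torch
from torch import nn


class SpatialSE(nn.Module):
    """Channel squeeze and spatial excitation (illustrative sketch)."""

    def __init__(self, in_ch: int):
        super().__init__()
        # Squeeze the channel dimension to a single map with a 1x1 convolution.
        self.conv = nn.Conv2d(in_ch, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gate = torch.sigmoid(self.conv(x))  # (B, 1, H, W), one weight per pixel
        return x * gate  # broadcast over channels


x = torch.randn(2, 32, 8, 8)
out = SpatialSE(in_ch=32)(x)
```

Here every spatial location receives one weight that is shared across all channels.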

ConcurrentSpatialAndChannelSqueezeAndChannelExcitation(in_ch, r=16)

The scSE (Concurrent Spatial and Channel Squeeze and Channel Excitation) block from the Concurrent Spatial and Channel ‘Squeeze & Excitation’ in Fully Convolutional Networks paper.

Implementation adapted from https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/66178

# in __init__()
self.scSE = ConcurrentSpatialAndChannelSqueezeAndChannelExcitation(in_ch, r=16)

# in forward()
x = self.scSE(x)

Arguments:
in_ch (int): The number of channels in the feature map of the input.
r (int): The reduction ratio of the intermediate channels. Default: 16.

Shape:

Input: (batch, channels, height, width)
Output: (batch, channels, height, width) (same shape as input)

GaussianNoise(stddev=0.1)

A gaussian noise module.

# in __init__()
self.gaussian_noise = GaussianNoise(0.1)

# in forward()
if self.training:
    x = self.gaussian_noise(x)

Arguments:
stddev (float): The standard deviation of the normal distribution. Default: 0.1.

Shape:

Input: (batch, *)
Output: (batch, *) (same shape as input)
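
A gaussian noise layer is small enough to sketch in full. The class below, named `AdditiveGaussianNoise` for this example, assumes zero-mean additive noise and always applies it, leaving the training-mode check to the caller as in the usage above; the library's module may behave differently.

```python
import torch
from torch import nn


class AdditiveGaussianNoise(nn.Module):
    """Add zero-mean gaussian noise to the input (illustrative sketch)."""

    def __init__(self, stddev: float = 0.1):
        super().__init__()
        self.stddev = stddev

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # randn_like matches the shape, dtype, and device of x.
        return x + torch.randn_like(x) * self.stddev


torch.manual_seed(0)
x = torch.zeros(4, 3)
noisy = AdditiveGaussianNoise(0.1)(x)
unchanged = AdditiveGaussianNoise(0.0)(x)
```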

Schedulers

CyclicalMomentum(optimizer, base_momentum=0.8, max_momentum=0.9, step_size=2000, mode="triangular")

PyTorch's cyclical learning rate schedule, applied to momentum instead. Cycling momentum together with a cyclical learning rate leads to better results, as shown in A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay.

optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = CyclicalMomentum(optimizer)
data_loader = torch.utils.data.DataLoader(...)
for epoch in range(10):
    for batch in data_loader:
        scheduler.batch_step()
        train_batch(...)

Arguments:
optimizer (Optimizer): Wrapped optimizer.
base_momentum (float or list): Initial momentum, which is the lower boundary in the cycle for each parameter group. Default: 0.8
max_momentum (float or list): Upper momentum boundary in the cycle for each parameter group. Default: 0.9
step_size (int): Number of training iterations per half cycle. The authors suggest setting step_size to 2-8 times the number of training iterations per epoch. Default: 2000
mode (str): One of {triangular, triangular2, exp_range}. Default: 'triangular'
gamma (float): Constant in 'exp_range' scaling function. Default: 1.0
scale_fn (function): Custom scaling policy defined by a single-argument lambda function. If given, the mode parameter is ignored. Default: None
scale_mode (str): {'cycle', 'iterations'}. Defines whether scale_fn is evaluated on cycle number or cycle iterations (training iterations since start of cycle). Default: 'cycle'
last_batch_iteration (int): The index of the last batch. Default: -1
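
The 'triangular' policy can be written down as a short formula. The sketch below reuses the arithmetic of PyTorch's cyclical learning rate schedule and assumes, based on the argument descriptions above, that momentum starts at base_momentum and rises to max_momentum over the first half cycle. The function name `triangular_momentum` is hypothetical, and the real scheduler may cycle in the opposite direction.

```python
import math


def triangular_momentum(
    iteration: int,
    base_momentum: float = 0.8,
    max_momentum: float = 0.9,
    step_size: int = 2000,
) -> float:
    """Momentum at a given batch iteration under a triangular cycle."""
    cycle = math.floor(1 + iteration / (2 * step_size))
    # x runs 1 -> 0 -> 1 over one full cycle of 2 * step_size iterations.
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_momentum + (max_momentum - base_momentum) * max(0.0, 1.0 - x)


schedule = [triangular_momentum(i) for i in (0, 1000, 2000, 3000, 4000)]
```

With the default arguments the momentum climbs linearly for 2000 iterations, falls for the next 2000, and then repeats.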

Utils

notify({'value1': 'Notification title', 'value2': 'Notification body'}, key)

Send a notification to your phone with IFTTT.

Set up an IFTTT webhook by following https://medium.com/datadriveninvestor/monitor-progress-of-your-training-remotely-f9404d71b720

notify({'value1': 'Notification title', 'value2': 'Notification body'}, key=[YOUR_PRIVATE_KEY_HERE])

Arguments:
obj (Object): Object to send to IFTTT
key (str): IFTTT webhook key
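
For orientation, an IFTTT Webhooks trigger is an HTTP POST whose JSON body carries the fields value1, value2, and optionally value3. The sketch below only builds such a request with the standard library and does not send it. The helper `build_ifttt_request` and the event name `training_finished` are hypothetical, and notify may be implemented differently.

```python
import json
import urllib.request


def build_ifttt_request(event: str, key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the IFTTT Webhooks service."""
    url = f"https://maker.ifttt.com/trigger/{event}/with/key/{key}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_ifttt_request(
    event="training_finished",  # hypothetical event name chosen in the IFTTT applet
    key="YOUR_SECRET_KEY_HERE",
    payload={"value1": "Training Finished", "value2": "Validation loss: 0.123"},
)
# urllib.request.urlopen(request) would send the notification.
```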

seed_environment(seed=42)

Set random seeds for python, numpy, and pytorch to ensure reproducible research.

seed_environment(42)

Arguments:
seed (int): The random seed to set.
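
Seeding all three libraries typically looks like the sketch below. It is an assumption about what such a helper does, not a copy of seed_environment; the function name `seed_everything` and the cuDNN settings are choices made for this example.

```python
import os
import random

import numpy as np
import torch


def seed_everything(seed: int = 42) -> None:
    """Seed Python, NumPy, and PyTorch random number generators."""
    random.seed(seed)
    # Only affects Python processes started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
    # Trade some speed for deterministic cuDNN kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


seed_everything(42)
first = (random.random(), np.random.rand(), torch.rand(1).item())
seed_everything(42)
second = (random.random(), np.random.rand(), torch.rand(1).item())
```

Re-seeding with the same value reproduces the same draws from all three generators.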

gpu_usage(device, digits=4)

Prints the amount of GPU memory currently allocated in GB.

gpu_usage(device, digits=4)

Arguments:
device (torch.device, optional): The device you want to check. Defaults to device.
digits (int, optional): The number of digits of precision. Defaults to 4.

n_params(model)

Return the number of parameters in a pytorch model.

print(n_params(model))

Arguments:
model (nn.Module): The model to analyze.

Returns:
(int): The number of parameters in the model.
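
Counting parameters is usually a one-liner over model.parameters(). The sketch below counts every parameter; whether n_params also counts frozen parameters is not stated here, so treat that as an assumption. The name `count_parameters` is local to this example.

```python
from torch import nn


def count_parameters(model: nn.Module) -> int:
    """Total number of scalar parameters in a model."""
    return sum(p.numel() for p in model.parameters())


model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 1))
total = count_parameters(model)  # (10 * 5 + 5) + (5 * 1 + 1)
```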

save_model(model, fold=0)

Save a trained pytorch model on a particular cross-validation fold to disk.

Implementation adapted from https://github.com/floydhub/save-and-resume.

save_model(model, fold=0)

Arguments:
model (nn.Module): The model to save.
fold (int): The cross-validation fold the model was trained on.

load_model(model, fold=0)

Load a trained pytorch model saved to disk using save_model.

model = load_model(model, fold=0)

Arguments:
model (nn.Module): The model to load the saved weights into.
fold (int): Which saved model fold to load.

Returns:
(nn.Module): The same model that was passed in, but with the pretrained weights loaded.
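
Saving and restoring per-fold weights can be sketched with state dicts. The helper names `save_fold` and `load_fold`, the explicit `directory` argument, and the file name pattern are hypothetical; the library's own file layout is not documented here.

```python
import os
import tempfile

import torch
from torch import nn


def save_fold(model: nn.Module, fold: int, directory: str) -> str:
    """Write the model's weights to a fold-specific file and return its path."""
    path = os.path.join(directory, f"model_fold_{fold}.pt")  # hypothetical naming
    torch.save(model.state_dict(), path)
    return path


def load_fold(model: nn.Module, fold: int, directory: str) -> nn.Module:
    """Load fold-specific weights into an already constructed model."""
    path = os.path.join(directory, f"model_fold_{fold}.pt")
    model.load_state_dict(torch.load(path, map_location="cpu"))
    return model


with tempfile.TemporaryDirectory() as tmp:
    trained = nn.Linear(4, 2)
    save_fold(trained, fold=0, directory=tmp)
    restored = load_fold(nn.Linear(4, 2), fold=0, directory=tmp)

weights_match = torch.equal(trained.weight, restored.weight)
```

Storing only the state dict means the model class has to be constructed first, which is why the loader takes a model instance as input.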

save(obj, 'obj.pkl')

Save an object to disk.

save(tokenizer, 'tokenizer.pkl')

Arguments:
obj (Object): The object to save.
filename (String): The name of the file to save the object to.

load('obj.pkl')

Load an object saved to disk with save.

tokenizer = load('tokenizer.pkl')

Arguments:
path (String): The path to the saved object.

Returns:
(Object): The loaded object.
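
The `.pkl` extension suggests pickle, so a round trip might look like the sketch below. That is an assumption, and the helper names `save_object` and `load_object` are hypothetical.

```python
import os
import pickle
import tempfile


def save_object(obj, filename: str) -> None:
    """Serialize an object to disk with pickle."""
    with open(filename, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_object(path: str):
    """Load an object previously written by save_object."""
    with open(path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vocab.pkl")
    save_object({"hello": 0, "world": 1}, path)
    restored = load_object(path)
```

Unpickling can execute arbitrary code, so only load files that come from a trusted source.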

masked_softmax(logits, mask, dim=-1)

A masked softmax module to correctly implement attention in Pytorch.

Implementation adapted from: https://github.com/allenai/allennlp/blob/master/allennlp/nn/util.py

out = masked_softmax(logits, mask, dim=-1)

Arguments:
vector (torch.tensor): The tensor to softmax.
mask (torch.tensor): The tensor to indicate which indices are to be masked and not included in the softmax operation.
dim (int, optional): The dimension to softmax over. Defaults to -1.
memory_efficient (bool, optional): Whether to use a less precise, but more memory efficient implementation of masked softmax. Defaults to False.
mask_fill_value (float, optional): The value to fill masked values with if memory_efficient is True. Defaults to -1e32.

Returns:
(torch.tensor): The masked softmaxed output
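
One common way to implement a masked softmax is to fill the masked positions with negative infinity before the softmax. The sketch below shows that variant only; it assumes a mask in which 1 means "keep" and 0 means "ignore", and the name `masked_softmax_sketch` is local to this example.

```python
import torch


def masked_softmax_sketch(
    logits: torch.Tensor, mask: torch.Tensor, dim: int = -1
) -> torch.Tensor:
    """Softmax over positions where mask is nonzero; masked positions get 0."""
    # Assumption: mask == 1 keeps a position, mask == 0 excludes it.
    filled = logits.masked_fill(~mask.bool(), float("-inf"))
    return torch.softmax(filled, dim=dim)


logits = torch.tensor([[1.0, 2.0, 3.0]])
mask = torch.tensor([[1, 1, 0]])
probs = masked_softmax_sketch(logits, mask)
```

A row in which every position is masked would produce NaN with this approach, which is one reason implementations often use a large negative fill value instead.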

masked_log_softmax(logits, mask, dim=-1)

A masked log-softmax module to correctly implement attention in Pytorch.

Implementation adapted from: https://github.com/allenai/allennlp/blob/master/allennlp/nn/util.py

out = masked_log_softmax(logits, mask, dim=-1)

Arguments:
vector (torch.tensor): The tensor to log-softmax.
mask (torch.tensor): The tensor to indicate which indices are to be masked and not included in the log-softmax operation.
dim (int, optional): The dimension to log-softmax over. Defaults to -1.

Returns:
(torch.tensor): The masked log-softmaxed output

Contributing

This repository is still a work in progress, so if you find a bug, think something is missing, or have suggestions for new features or modules, feel free to open an issue or a pull request. Feel free to use the library or code from it in your own projects, and if you feel that some code used in this project hasn't been properly credited, please open an issue.

Authors

Bilal Khan - Initial work

License

This project is licensed under the MIT License. See the license file for details.

Acknowledgements

This project contains code adapted from:

This README is based on:


Release history

1.2.2 (this version): Nov 30, 2019
1.2.1: Aug 24, 2019
1.2.0: Aug 24, 2019
1.1.3: Jul 19, 2019
1.1.2: Jul 18, 2019
1.1.1: Jun 8, 2019
1.1.0: Jun 8, 2019
1.0.0: Jun 7, 2019

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pytorch_zoo-1.2.2.tar.gz (18.2 kB)

Uploaded Nov 30, 2019 Source

Built Distribution

pytorch_zoo-1.2.2-py3-none-any.whl (17.0 kB)

Uploaded Nov 30, 2019 Python 3

File details

Details for the file pytorch_zoo-1.2.2.tar.gz.

File metadata

Download URL: pytorch_zoo-1.2.2.tar.gz
Upload date: Nov 30, 2019
Size: 18.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.1.post20191125 requests-toolbelt/0.9.1 tqdm/4.39.0 CPython/3.7.5

File hashes

Hashes for pytorch_zoo-1.2.2.tar.gz
Algorithm	Hash digest
SHA256	`d81711ea67aea4064b21d3a3822c8ff43a11ebdb041c41faa60a19240668be20`
MD5	`cfdb717e80f9313a230de91f0622e337`
BLAKE2b-256	`a57017351c8ae69a5e6588534a5b4c2f9a2cf8dc7dfc20375483fddf25088e5f`


File details

Details for the file pytorch_zoo-1.2.2-py3-none-any.whl.

File metadata

Download URL: pytorch_zoo-1.2.2-py3-none-any.whl
Upload date: Nov 30, 2019
Size: 17.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.1.post20191125 requests-toolbelt/0.9.1 tqdm/4.39.0 CPython/3.7.5

File hashes

Hashes for pytorch_zoo-1.2.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6044fad6b6e2b1fa2f297e2e44448753636732f6458c258c377e3ac8780433f4`
MD5	`43d0caef07e33e7afa447f70528e28d1`
BLAKE2b-256	`5c9bbd9065c8971b950c6e5b979f97ce057b5e774dabbcc87a8790ee681b5215`

