batchify anything

batchify: structured data in PyTorch

PyTorch can already batchify tensors (and tuples of tensors), but what about arbitrary classes? If your neural network deals with complex data types, structuring your data in classes is a natural solution. With batchify, you can seamlessly return class instances from a PyTorch dataset.
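For reference, the built-in behaviour looks roughly like this. The sketch uses `torch.utils.data.default_collate` (available in recent PyTorch releases) on samples that are plain tuples of tensors; the sample shapes are made up for illustration.

```python
import torch
from torch.utils.data import default_collate

# Two samples, each a (face, name) tuple of tensors; shapes are illustrative.
samples = [
    (torch.zeros(3, 4, 4), torch.zeros(5)),
    (torch.ones(3, 4, 4), torch.ones(5)),
]

# default_collate stacks position-wise, adding a leading batch dimension.
faces, names = default_collate(samples)
print(faces.shape)  # torch.Size([2, 3, 4, 4])
print(names.shape)  # torch.Size([2, 5])
```

The result is purely positional: nothing in `faces` or `names` records which field it came from, which is the gap that class-based batches are meant to fill.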

As an example, let's say you want your neural network to do something with people. People, as we all know, are faces and names:

import torch
from batchify import Batch, Batchable

MAX_NAME_LEN = 128
IMG_SIZE = 256
class Person(Batchable):
    
    face: torch.Tensor
    name: torch.Tensor
    
    def __init__(self): # not a very interesting person
        self.face = torch.zeros((3, IMG_SIZE, IMG_SIZE))
        self.name = torch.zeros((MAX_NAME_LEN,))
    

Now here's the fun part: we can make a batch of people. This automatically batchifies both the face and the name:

dave = Person()
rhonda = Person()
batch = Batch([dave, rhonda])
print(len(batch))
print(dave.name.shape)
print(batch.name.shape) # notice the extra batch dimension
print(batch[0].name.shape) # un-batchification
2
torch.Size([128])
torch.Size([2, 128])
torch.Size([128])
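
Conceptually, batching an object means stacking each annotated tensor field along a new leading dimension. The following is a minimal sketch of that idea in plain PyTorch, not batchify's actual implementation; `PlainPerson` and `stack_fields` are hypothetical names introduced here, and the tensor sizes are shrunk for brevity.

```python
import typing
import torch

class PlainPerson:
    # Hypothetical stand-in class; the annotations list the fields to stack.
    face: torch.Tensor
    name: torch.Tensor

    def __init__(self):
        self.face = torch.zeros(3, 8, 8)
        self.name = torch.zeros(16)

def stack_fields(items):
    """Stack every annotated field of `items` along a new batch dimension."""
    fields = typing.get_type_hints(type(items[0]))
    return {f: torch.stack([getattr(it, f) for it in items]) for f in fields}

stacked = stack_fields([PlainPerson(), PlainPerson()])
print(stacked["name"].shape)     # torch.Size([2, 16])
print(stacked["name"][0].shape)  # torch.Size([16])
```

Indexing the stacked tensor along the first dimension recovers a single item's field, which mirrors the un-batchification step above.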

But what about a custom person dataset? That's pretty easy with the batchify DataLoader:

from batchify import DataLoader

class PersonDataset(torch.utils.data.Dataset):
    
    def __len__(self):
        return 16
    
    def __getitem__(self, index):
        return Person()
    
batch_size = 8
dataset = PersonDataset()
loader = DataLoader(dataset, batch_size=batch_size)
for batch in loader:
    print(batch.face.shape)
torch.Size([8, 3, 256, 256])
torch.Size([8, 3, 256, 256])
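
For comparison, a similar effect can be had with the stock `torch.utils.data.DataLoader` by passing a custom `collate_fn`. This sketch assumes nothing about how batchify's `DataLoader` works internally; `Item`, `ItemDataset`, and `collate_items` are hypothetical names, and the tensor sizes are shrunk for brevity.

```python
import torch
from torch.utils.data import DataLoader, Dataset

class Item:
    # Hypothetical plain class holding two tensor attributes.
    def __init__(self):
        self.face = torch.zeros(3, 8, 8)
        self.name = torch.zeros(16)

class ItemDataset(Dataset):
    def __len__(self):
        return 16

    def __getitem__(self, index):
        return Item()

def collate_items(items):
    # Stack each attribute by hand; every new field needs a new line here.
    return {
        "face": torch.stack([it.face for it in items]),
        "name": torch.stack([it.name for it in items]),
    }

loader = DataLoader(ItemDataset(), batch_size=8, collate_fn=collate_items)
shapes = [tuple(batch["face"].shape) for batch in loader]
print(shapes)  # [(8, 3, 8, 8), (8, 3, 8, 8)]
```

The hand-written `collate_fn` has to be kept in sync with the class definition, which is the bookkeeping a class-aware loader saves you.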

This is all great if you want to input a Person into your network. But what if you want to output a person?

(Warning: this functionality only works if your Batchable classes have correct type annotations.)

out_batch = Batch(Person,
                  face=torch.zeros((batch_size, 3, IMG_SIZE, IMG_SIZE)),
                  name=torch.zeros((batch_size, MAX_NAME_LEN)))
print(len(out_batch))
print(out_batch[0].name.shape)
8
torch.Size([128])
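
The type-annotation requirement is plausibly there because building objects from already-batched tensors means knowing which fields exist without running `__init__`. The sketch below illustrates that idea only; `Point` and `unbatch` are hypothetical, and batchify may well do this differently.

```python
import typing
import torch

class Point:
    # Hypothetical class; only the annotations describe its fields.
    xy: torch.Tensor
    label: torch.Tensor

def unbatch(cls, index, **batched):
    """Build one `cls` instance from row `index` of each batched field."""
    hints = typing.get_type_hints(cls)
    missing = set(hints) - set(batched)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    obj = cls.__new__(cls)  # skip __init__; fields come from the batch
    for field in hints:
        setattr(obj, field, batched[field][index])
    return obj

p = unbatch(Point, 0, xy=torch.zeros(8, 2), label=torch.zeros(8, 4))
print(p.label.shape)  # torch.Size([4])
```

Without the annotations, `unbatch` would have no way to tell which keyword arguments correspond to fields of the class, or to detect that one is missing.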
