batchify anything

batchify: structured data in PyTorch

PyTorch can already batchify tensors (and tuples of tensors), but what about arbitrary classes? If your neural network deals with complex data types, structuring your data in classes is the solution. With batchify, you can seamlessly return class instances from a PyTorch dataset.

As an example, let's say you want your neural network to do something with people. People, as we all know, are faces and names:

import torch
from batchify import Batch, Batchable

MAX_NAME_LEN = 128  # length of the name tensor
IMG_SIZE = 256      # faces are 3 x IMG_SIZE x IMG_SIZE

class Person(Batchable):
    
    face: torch.Tensor
    name: torch.Tensor
    
    def __init__(self): # not a very interesting person
        self.face = torch.zeros((3, IMG_SIZE, IMG_SIZE))
        self.name = torch.zeros((MAX_NAME_LEN,))
    

Now here's the fun part: we can make a batch of people. This automatically batchifies both the face and the name:

dave = Person()
rhonda = Person()
batch = Batch([dave, rhonda])
print(len(batch))
print(dave.name.shape)
print(batch.name.shape) # notice the extra batch dimension
print(batch[0].name.shape) # un-batchification
2
torch.Size([128])
torch.Size([2, 128])
torch.Size([128])
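
For intuition about the shapes printed above: the extra batch dimension is what you get from stacking per-sample tensors. The snippet below is a plain-PyTorch sketch of that idea using torch.stack. It is an illustration only and makes no claim about how batchify is implemented internally; name_a and name_b are made-up stand-ins for dave.name and rhonda.name.

```python
import torch

MAX_NAME_LEN = 128  # same constant as in the example above

# Two per-sample tensors, standing in for dave.name and rhonda.name.
name_a = torch.zeros((MAX_NAME_LEN,))
name_b = torch.ones((MAX_NAME_LEN,))

# Stacking adds a new leading (batch) dimension.
stacked = torch.stack([name_a, name_b], dim=0)
print(stacked.shape)  # torch.Size([2, 128])

# Indexing along that dimension recovers a single sample.
print(torch.equal(stacked[1], name_b))  # True
```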

But what about a custom person dataset? That's pretty easy with the batchify DataLoader:

from batchify import DataLoader

class PersonDataset(torch.utils.data.Dataset):
    
    def __len__(self):
        return 16
    
    def __getitem__(self, index):
        return Person()
    
batch_size = 8
dataset = PersonDataset()
loader = DataLoader(dataset, batch_size=batch_size)
for batch in loader:
    print(batch.face.shape)
torch.Size([8, 3, 256, 256])
torch.Size([8, 3, 256, 256])
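
For comparison, here is roughly what you would write by hand with the stock torch.utils.data.DataLoader and a custom collate_fn. This is a sketch under assumptions, not batchify's implementation: PlainPerson, PlainPersonDataset and collate_people are hypothetical names, and the image size is shrunk to 32 to keep the example light.

```python
from dataclasses import dataclass, fields

import torch
from torch.utils.data import DataLoader, Dataset


@dataclass
class PlainPerson:  # hypothetical stand-in, not a batchify class
    face: torch.Tensor
    name: torch.Tensor


class PlainPersonDataset(Dataset):
    def __len__(self):
        return 16

    def __getitem__(self, index):
        return PlainPerson(face=torch.zeros((3, 32, 32)),
                           name=torch.zeros((128,)))


def collate_people(samples):
    """Stack each dataclass field across the samples into one tensor."""
    return {
        f.name: torch.stack([getattr(s, f.name) for s in samples])
        for f in fields(PlainPerson)
    }


loader = DataLoader(PlainPersonDataset(), batch_size=8,
                    collate_fn=collate_people)
shapes = [tuple(batch["face"].shape) for batch in loader]
print(shapes)  # [(8, 3, 32, 32), (8, 3, 32, 32)]
```

Note that this hand-rolled version returns a dict of tensors, so you lose attribute access (batch.face) and the ability to index a single person back out of the batch.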

This is all great if you want to input a Person into your network. But what if you want to output a person?

(Warning: this functionality only works if you have correct type annotations on your Batchable classes.)

out_batch = Batch(Person,
                  face=torch.zeros((batch_size, 3, IMG_SIZE, IMG_SIZE)),
                  name=torch.zeros((batch_size, MAX_NAME_LEN)))
print(len(out_batch))
print(out_batch[0].name.shape)
8
torch.Size([128])
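
A guess at why the annotations matter here: when a batch is built from a class plus keyword tensors, there are no instances to inspect, so the field names have to come from the class itself. The sketch below shows that class-level annotations are discoverable with typing.get_type_hints, while attributes assigned only in __init__ are not. It is plain Python, uses list as a placeholder for torch.Tensor, and is an assumption about the mechanism rather than a description of batchify's code; all class and function names are hypothetical.

```python
from typing import get_type_hints


class WithAnnotations:  # hypothetical example class
    face: list
    name: list


class WithoutAnnotations:  # same attributes, but only assigned in __init__
    def __init__(self):
        self.face = []
        self.name = []


def declared_fields(cls):
    """Return the field names that can be discovered from the class alone."""
    return list(get_type_hints(cls).keys())


print(declared_fields(WithAnnotations))     # ['face', 'name']
print(declared_fields(WithoutAnnotations))  # []
```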
