
A DataLoader library for Continual Learning in PyTorch.


Continual Loader (CLLoader)


A library for loading datasets in PyTorch for Continual Learning (a.k.a. Lifelong Learning, Incremental Learning, etc.).

Example:

Install from PyPI:

pip3 install clloader

And run:

from torch.utils.data import DataLoader

from clloader import CLLoader
from clloader.datasets import MNIST

clloader = CLLoader(
    MNIST("my/data/path", download=True),
    increment=1,
    initial_increment=5
)

print(f"Number of classes: {clloader.nb_classes}.")
print(f"Number of tasks: {clloader.nb_tasks}.")

for task_id, (train_dataset, test_dataset) in enumerate(clloader):
    train_loader = DataLoader(train_dataset)
    test_loader = DataLoader(test_dataset)

    # Do your cool stuff here

Supported Scenarios

| Name | Acronym | Supported |
|------|---------|-----------|
| New Instances | NI | :x: |
| New Classes | NC | :white_check_mark: |
| New Instances & Classes | NIC | :x: |

Supported Datasets:

Note that the task sizes are fully customizable.

| Name | Nb classes | Image Size | Automatic Download |
|------|------------|------------|--------------------|
| MNIST | 10 | 28x28x1 | :white_check_mark: |
| Fashion MNIST | 10 | 28x28x1 | :white_check_mark: |
| KMNIST | 10 | 28x28x1 | :white_check_mark: |
| EMNIST | 10 | 28x28x1 | :white_check_mark: |
| QMNIST | 10 | 28x28x1 | :white_check_mark: |
| MNIST Fellowship | 30 | 28x28x1 | :white_check_mark: |
| CIFAR10 | 10 | 32x32x3 | :white_check_mark: |
| CIFAR100 | 100 | 32x32x3 | :white_check_mark: |
| CIFAR Fellowship | 110 | 32x32x3 | :white_check_mark: |
| ImageNet100 | 100 | 224x224x3 | :x: |
| ImageNet1000 | 1000 | 224x224x3 | :x: |
| Permuted MNIST | 10 | 28x28x1 | :white_check_mark: |
| Rotated MNIST | 10 | 28x28x1 | :white_check_mark: |

Furthermore, some "meta"-datasets are available:

InMemoryDataset, for in-memory numpy arrays:

from clloader import CLLoader
from clloader.datasets import InMemoryDataset

# gen_numpy_array() is a placeholder for your own data-generation code.
x_train, y_train = gen_numpy_array()
x_test, y_test = gen_numpy_array()

clloader = CLLoader(
    InMemoryDataset(x_train, y_train, x_test, y_test),
    increment=10,
)

PyTorchDataset, for any dataset defined in torchvision:

import torchvision

from clloader.datasets import PyTorchDataset

clloader = CLLoader(
    PyTorchDataset("/my/data/path", dataset_type=torchvision.datasets.CIFAR10),
    increment=10,
)

ImageFolderDataset, for datasets having a tree-like structure, with one folder per class:

clloader = CLLoader(
    ImageFolderDataset("/my/train/folder", "/my/test/folder"),
    increment=10,
)

Fellowship, to combine several continual datasets:

clloader = CLLoader(
    Fellowship("/my/data/path", dataset_list=[CIFAR10, CIFAR100]),
    increment=10,
)

Some datasets cannot provide an automatic download of the data for miscellaneous reasons. For ImageNet, for example, you'll need to download the data from the official page, then load it as follows:

clloader = CLLoader(
    ImageNet1000("/my/train/folder", "/my/test/folder"),
    increment=10,
)

Some papers use a subset, called ImageNet100 or ImageNetSubset. You'll need to get the subset ids, either as a file in the following format:

my/path/to/image0.JPEG target0
my/path/to/image1.JPEG target1

Or as a list of tuples [("my/path/to/image0.JPEG", target0), ...]. Loading the continual loader is then very simple:

clloader = CLLoader(
    ImageNet100(
        "/my/train/folder",
        "/my/test/folder",
        train_subset=...,  # My subset ids
        test_subset=...,   # My subset ids
    ),
    increment=10,
)
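For illustration, here is a minimal sketch of turning the text-file format above into the list-of-tuples form. The helper below is hypothetical and not part of the library's API; it only assumes each non-empty line ends with an integer target:

```python
# Hedged sketch: convert the "path target" text format shown above into
# the [(path, target), ...] list form. `parse_subset_lines` is a
# hypothetical helper, not part of clloader's API.
def parse_subset_lines(lines):
    """Parse lines of the form 'my/path/to/image0.JPEG target0'."""
    subset = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        path, target = line.rsplit(" ", 1)
        subset.append((path, int(target)))
    return subset

lines = [
    "my/path/to/image0.JPEG 0",
    "my/path/to/image1.JPEG 1",
]
print(parse_subset_lines(lines))
# [('my/path/to/image0.JPEG', 0), ('my/path/to/image1.JPEG', 1)]
```

The resulting list could then be passed as the subset ids.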

Continual Loader

The Continual Loader CLLoader loads the data and batches it into several tasks. Here are some example arguments:

from torchvision import transforms

clloader = CLLoader(
    my_continual_dataset,
    increment=10,
    initial_increment=2,
    train_transformations=[transforms.RandomHorizontalFlip()],
    common_transformations=[
        transforms.ToTensor(),
        transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))
    ],
    evaluate_on="seen"
)

Here the first task is made of 2 classes, then all following tasks have 10 classes each. You can have a more fine-grained increment by providing a list: `increment=[2, 10, 5, 10]`.
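To picture how these arguments carve the classes into tasks, here is a hedged, library-independent sketch of the semantics just described. The helper is hypothetical, not clloader's internal code:

```python
# Hedged sketch: how increment / initial_increment could partition class ids
# into tasks, following the semantics described above. Not clloader's
# actual implementation.
def split_classes(nb_classes, increment, initial_increment=None):
    if isinstance(increment, list):
        sizes = list(increment)  # fine-grained, explicit task sizes
    else:
        sizes = [initial_increment if initial_increment is not None else increment]
        while sum(sizes) < nb_classes:
            # last task may be smaller if classes don't divide evenly
            sizes.append(min(increment, nb_classes - sum(sizes)))
    tasks, start = [], 0
    for size in sizes:
        tasks.append(list(range(start, start + size)))
        start += size
    return tasks

# The MNIST example above: 10 classes, first task of 5, then 1 class per task.
print([len(t) for t in split_classes(10, 1, initial_increment=5)])
# [5, 1, 1, 1, 1, 1]  -> 6 tasks
```

This matches the first example, where `nb_tasks` would be 6 for MNIST with `initial_increment=5` and `increment=1`.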

The train_transformations are applied only to the training data, while the common_transformations are applied to both the training and testing data.

By default, the model is evaluated after each task on all seen classes. But you can evaluate only on the current classes, or even on all classes.
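The three evaluation modes can be pictured with a small hypothetical helper, again a sketch of the semantics described above rather than the library's code:

```python
# Hedged sketch of the three evaluate_on modes described above, applied to
# hypothetical per-task class lists. Not clloader's actual implementation.
def eval_classes(task_classes, task_id, mode="seen"):
    if mode == "seen":      # all classes up to and including the current task
        return [c for t in task_classes[:task_id + 1] for c in t]
    if mode == "current":   # only the current task's classes
        return task_classes[task_id]
    if mode == "all":       # every class, even those not yet seen
        return [c for t in task_classes for c in t]
    raise ValueError(f"Unknown mode: {mode}")

tasks = [[0, 1], [2, 3], [4, 5]]
print(eval_classes(tasks, 1, "seen"))     # [0, 1, 2, 3]
print(eval_classes(tasks, 1, "current"))  # [2, 3]
print(eval_classes(tasks, 1, "all"))      # [0, 1, 2, 3, 4, 5]
```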

Sample Images

Per-task sample images (Task 0, Task 1, ...) are available for MNIST, FashionMNIST, CIFAR10, MNIST Fellowship (MNIST + FashionMNIST + KMNIST), and PermutedMNIST.

