A DataLoader library for Continual Learning in PyTorch.

Continual Loader (CLLoader)

A library of PyTorch data loaders for Continual Learning (also known as Lifelong Learning or Incremental Learning).

Example:

Install from PyPI:

pip3 install clloader

And run!

from torch.utils.data import DataLoader

from clloader import CLLoader
from clloader.datasets import MNIST

clloader = CLLoader(
    MNIST("my/data/path", download=True),
    increment=1,
    initial_increment=5
)

print(f"Number of classes: {clloader.nb_classes}.")
print(f"Number of tasks: {clloader.nb_tasks}.")

for task_id, (train_dataset, test_dataset) in enumerate(clloader):
    train_loader = DataLoader(train_dataset)
    test_loader = DataLoader(test_dataset)

    # Do your cool stuff here

Supported Scenarios

| Name | Acronym | Supported |
|:-----|:-------:|:---------:|
| New Instances | NI | :x: |
| New Classes | NC | :white_check_mark: |
| New Instances & Classes | NIC | :x: |

Supported Datasets:

Note that the task sizes are fully customizable.

| Name | Nb classes | Image Size | Automatic Download |
|:-----|:----------:|:----------:|:------------------:|
| MNIST | 10 | 28x28x1 | :white_check_mark: |
| Fashion MNIST | 10 | 28x28x1 | :white_check_mark: |
| KMNIST | 10 | 28x28x1 | :white_check_mark: |
| EMNIST | 10 | 28x28x1 | :white_check_mark: |
| QMNIST | 10 | 28x28x1 | :white_check_mark: |
| MNIST Fellowship | 30 | 28x28x1 | :white_check_mark: |
| CIFAR10 | 10 | 32x32x3 | :white_check_mark: |
| CIFAR100 | 100 | 32x32x3 | :white_check_mark: |
| CIFAR Fellowship | 110 | 32x32x3 | :white_check_mark: |
| ImageNet100 | 100 | 224x224x3 | :x: |
| ImageNet1000 | 1000 | 224x224x3 | :x: |
| Permuted MNIST | 10 | 28x28x1 | :white_check_mark: |
| Rotated MNIST | 10 | 28x28x1 | :white_check_mark: |

Furthermore, some "meta"-datasets are available:

InMemoryDataset, for in-memory numpy arrays:

from clloader.datasets import InMemoryDataset

# gen_numpy_array() is a placeholder for your own data-generation code.
x_train, y_train = gen_numpy_array()
x_test, y_test = gen_numpy_array()

clloader = CLLoader(
    InMemoryDataset(x_train, y_train, x_test, y_test),
    increment=10,
)
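Since gen_numpy_array is only a placeholder, here is a concrete sketch of what it might return (an assumption for illustration: uint8 grayscale images with integer labels; the shapes are not mandated by the library):

```python
import numpy as np

# Hypothetical stand-in for gen_numpy_array(): n random 28x28 grayscale
# images and n integer labels, the kind of arrays InMemoryDataset is fed.
def gen_numpy_array(n=100, nb_classes=10):
    x = np.random.randint(0, 256, size=(n, 28, 28), dtype=np.uint8)
    y = np.random.randint(0, nb_classes, size=(n,), dtype=np.int64)
    return x, y

x_train, y_train = gen_numpy_array()
print(x_train.shape, y_train.shape)  # → (100, 28, 28) (100,)
```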

PyTorchDataset, for any dataset defined in torchvision:

import torchvision

from clloader.datasets import PyTorchDataset

clloader = CLLoader(
    PyTorchDataset("/my/data/path", dataset_type=torchvision.datasets.CIFAR10),
    increment=10,
)

ImageFolderDataset, for datasets having a tree-like structure, with one folder per class:

from clloader.datasets import ImageFolderDataset

clloader = CLLoader(
    ImageFolderDataset("/my/train/folder", "/my/test/folder"),
    increment=10,
)
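Such a tree-like structure looks like this (illustrative folder and file names):

```
/my/train/folder/
├── class0/
│   ├── img0.png
│   └── img1.png
└── class1/
    ├── img2.png
    └── img3.png
```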

Fellowship, to combine several continual datasets:

from clloader.datasets import CIFAR10, CIFAR100, Fellowship

clloader = CLLoader(
    Fellowship("/my/data/path", dataset_list=[CIFAR10, CIFAR100]),
    increment=10,
)

Some datasets cannot provide an automatic download of the data for miscellaneous reasons. For ImageNet, for example, you'll need to download the data from the official page, then load it as follows:

from clloader.datasets import ImageNet1000

clloader = CLLoader(
    ImageNet1000("/my/train/folder", "/my/test/folder"),
    increment=10,
)

Some papers use a subset, called ImageNet100 or ImageNetSubset. You'll need to get the subset ids, given either as a file in the following format:

my/path/to/image0.JPEG target0
my/path/to/image1.JPEG target1

Or as a list of tuples [("my/path/to/image0.JPEG", target0), ...]. Creating the continual loader is then very simple:

from clloader.datasets import ImageNet100

clloader = CLLoader(
    ImageNet100(
        "/my/train/folder",
        "/my/test/folder",
        train_subset=...,  # my subset ids
        test_subset=...    # my subset ids
    ),
    increment=10,
)
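The text format above is easy to convert into the list-of-tuples form. A minimal sketch (the helper is hypothetical, not part of clloader):

```python
# Parse subset lines of the form "my/path/to/image0.JPEG target0"
# into the [(path, target), ...] form also accepted by ImageNet100.
def parse_subset_file(lines):
    subset = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last space so paths containing spaces still work.
        path, target = line.rsplit(" ", 1)
        subset.append((path, int(target)))
    return subset

lines = [
    "my/path/to/image0.JPEG 0",
    "my/path/to/image1.JPEG 1",
]
print(parse_subset_file(lines))
# → [('my/path/to/image0.JPEG', 0), ('my/path/to/image1.JPEG', 1)]
```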

Continual Loader

The Continual Loader CLLoader loads the data and splits it into several tasks. Here are some example arguments:

from torchvision import transforms

clloader = CLLoader(
    my_continual_dataset,
    increment=10,
    initial_increment=2,
    train_transformations=[transforms.RandomHorizontalFlip()],
    common_transformations=[
        transforms.ToTensor(),
        transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))
    ],
    evaluate_on="seen"
)

Here the first task is made of 2 classes, and each following task of 10 classes. For a more fine-grained split, provide a list, e.g. `increment=[2, 10, 5, 10]`.
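To make the increment semantics concrete, here is a standalone sketch (the helper is hypothetical, not clloader internals) of which classes each task would contain:

```python
# Split nb_classes classes into consecutive tasks of the given sizes.
def task_class_ranges(nb_classes, increments):
    tasks, start = [], 0
    for inc in increments:
        tasks.append(list(range(start, min(start + inc, nb_classes))))
        start += inc
    return tasks

# initial_increment=5, increment=1 on MNIST's 10 classes gives 6 tasks:
print(task_class_ranges(10, [5, 1, 1, 1, 1, 1]))
# → [[0, 1, 2, 3, 4], [5], [6], [7], [8], [9]]
```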

The train_transformations are applied only to the training data, while the common_transformations are applied to both the training and testing data.

By default, the model is evaluated after each task on all seen classes. You can also evaluate only on the current classes, or on all classes.
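As an illustration of those three modes (a standalone sketch of the semantics only; clloader's actual implementation may differ, and the function name is hypothetical):

```python
# Which classes are evaluated after finishing task task_id,
# given the per-task class lists, for each evaluate_on mode.
def eval_classes(mode, task_id, tasks):
    if mode == "current":
        return tasks[task_id]
    if mode == "seen":
        return [c for t in tasks[: task_id + 1] for c in t]
    if mode == "all":
        return [c for t in tasks for c in t]
    raise ValueError(f"Unknown mode: {mode}")

tasks = [[0, 1], [2, 3], [4, 5]]
print(eval_classes("seen", 1, tasks))     # → [0, 1, 2, 3]
print(eval_classes("current", 1, tasks))  # → [2, 3]
print(eval_classes("all", 1, tasks))      # → [0, 1, 2, 3, 4, 5]
```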

Sample Images

MNIST: *(sample images, Tasks 0–4)*

FashionMNIST: *(sample images, Tasks 0–4)*

CIFAR10: *(sample images, Tasks 0–4)*

MNIST Fellowship (MNIST + FashionMNIST + KMNIST): *(sample images, Tasks 0–2)*

PermutedMNIST: *(sample images, Tasks 0–4)*
