Python module to train and test on CIFAR-10.
Description
This repository contains the Python package vitcifar10, a Vision Transformer (ViT) baseline for training and testing on CIFAR-10. The implementation supports CUDA only (there is no CPU training).
The aim of this repository is not to provide a highly flexible implementation, but one that can serve as a research baseline with test results close to the state of the art.
The code in this repository relies on timm, a Python package that provides implementations and pretrained weights for many state-of-the-art models.
Use within a Docker container
If you do not have Docker, here you can find a tutorial to install it.
- Build the vitcifar10 Docker image:

  $ git clone https://github.com/luiscarlosgph/vitcifar10.git
  $ cd vitcifar10/docker
  $ docker build -t vitcifar10 .
- Launch a vitcifar10 container:

  $ docker run --name wild_vitcifar10 --runtime=nvidia -v /dev/shm:/dev/shm vitcifar10:latest &
- Get a terminal inside the container (you can execute this command multiple times to get multiple container terminals):

  $ docker exec -it wild_vitcifar10 /bin/zsh
  $ cd $HOME
- Launch CIFAR-10 training:

  $ python3 -m vitcifar10.train --lr 1e-4 --opt adam --nepochs 200 --bs 16 --cpdir checkpoints --logdir logs --cpint 5 --data ./data
- Launch CIFAR-10 testing:

  $ python3 -m vitcifar10.test --data ./data --resume checkpoints/model_best.pt --bs 1
- To kill the container, run:

  $ docker kill wild_vitcifar10

- To remove it, run:

  $ docker rm wild_vitcifar10
Install with pip
$ pip install vitcifar10 --user
Install from source
$ git clone https://github.com/luiscarlosgph/vitcifar10.git
$ cd vitcifar10
$ python3 setup.py install
Download CIFAR-10 training and testing data
$ python3 -m vitcifar10.download_data --data ./data
Train on CIFAR-10
- Launch training:

  $ python3 -m vitcifar10.train --lr 1e-4 --opt adam --nepochs 200 --bs 16 --cpdir checkpoints --logdir logs --cpint 5 --data ./data
- Resume training from a checkpoint:

  $ python3 -m vitcifar10.train --lr 1e-4 --opt adam --nepochs 200 --bs 16 --cpdir checkpoints --logdir logs --cpint 5 --data ./data --resume checkpoints/epoch_21.pt
- Options:
  - --lr: learning rate.
  - --opt: optimizer, either sgd or adam.
  - --nepochs: number of epochs to execute.
  - --bs: training batch size.
  - --cpdir: checkpoint directory.
  - --logdir: path to the directory where the TensorBoard logs will be saved.
  - --cpint: interval of epochs between training checkpoints.
  - --resume: path to the checkpoint file you want to resume from.
  - --data: path to the directory where the dataset will be stored.
  - --seed: fix the random seed for reproducibility.
- Launch TensorBoard:

  $ python3 -m tensorboard.main --logdir logs --bind_all
Train multiple models from scratch (different random seeds)
$ python3 -m vitcifar10.run --lr 1e-4 --opt adam --nepochs 100 --bs 16 --cpdir checkpoints --logdir logs --cpint 5 --niter 5 --data data
- Options:
  - --niter: number of training cycles to run, e.g. --niter 5 will train five networks.

  The rest of the options are identical to those of vitcifar10.train.
Train with a custom dataloader
This is useful when you want to experiment with the inputs/labels without modifying the model. To do so, vitcifar10.run accepts the parameter --data-loader, e.g.:

$ python3 -m vitcifar10.run --lr 1e-4 --opt adam --nepochs 100 --bs 16 --cpdir checkpoints --logdir logs --cpint 10 --niter 5 --data data --data-loader <python_class_name>
The dataloader class that you pass must meet the following requirements:
- Its constructor __init__ must accept the batch size as its first parameter.
- It must implement the __next__ and __iter__ methods.
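A minimal sketch of a class that meets these requirements is shown below. The class name and the random batches are illustrative only; a real dataloader would yield actual image/label tensors:

```python
import torch

class RandomBatchLoader:
    """Illustrative custom dataloader: yields random CIFAR-10-shaped
    batches. The class name and batch contents are made up; a real
    dataloader would yield actual images and labels."""

    def __init__(self, batch_size, num_batches=10):
        self.batch_size = batch_size
        self.num_batches = num_batches
        self._served = 0

    def __iter__(self):
        self._served = 0
        return self

    def __next__(self):
        if self._served >= self.num_batches:
            raise StopIteration
        self._served += 1
        images = torch.rand(self.batch_size, 3, 32, 32)    # 32x32 RGB images
        labels = torch.randint(0, 10, (self.batch_size,))  # 10 CIFAR-10 classes
        return images, labels

# Usage example
loader = RandomBatchLoader(16, num_batches=2)
for images, labels in loader:
    print(images.shape, labels.shape)
```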
Test on CIFAR-10
$ python3 -m vitcifar10.test --data ./data --resume checkpoints/model_best.pt --bs 1
- Options:
  - --data: path to the directory where the dataset is stored.
  - --resume: path to the checkpoint file you want to test.
Perform inference on a single image
After training, you can classify images such as this dog or this cat as follows:
$ python3 -m vitcifar10.inference --image data/dog.jpg --model checkpoints/model_best.pt
It is a dog!
$ python3 -m vitcifar10.inference --image data/cat.jpg --model checkpoints/model_best.pt
It is a cat!
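Under the hood, single-image inference boils down to mapping the model's 10 logits to a class name. The sketch below shows that final step; the helper function is hypothetical (not part of vitcifar10's API), and the class names follow torchvision's CIFAR-10 label order:

```python
import torch

# CIFAR-10 class names in torchvision's label order
CIFAR10_CLASSES = ['airplane', 'automobile', 'bird', 'cat', 'deer',
                   'dog', 'frog', 'horse', 'ship', 'truck']

def logits_to_label(logits):
    """Map a (10,) logit vector to its CIFAR-10 class name.
    Hypothetical helper, not part of vitcifar10's API."""
    return CIFAR10_CLASSES[int(torch.argmax(logits))]

# Hand-crafted logit vector peaking at index 5 ('dog')
fake_logits = torch.zeros(10)
fake_logits[5] = 1.0
print(logits_to_label(fake_logits))  # dog
```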
Training | validation | testing splits
This code uses torchvision to download and load the CIFAR-10 dataset. The constructor of the class torchvision.datasets.CIFAR10 has a boolean parameter called train.
In our code we set train=True to obtain the images for training and validation, using 90% for training (45K images) and 10% for validation (5K images). The validation set is used to select the best model during training (it could also be used for hyperparameter tuning or early stopping). For testing, we set train=False; the testing set contains 10K images.
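The 90%/10% split can be sketched with torch.utils.data.random_split. The stand-in dataset below uses 1,000 random images instead of the real 50K training images so the example runs without downloading CIFAR-10:

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in for the CIFAR-10 training split: 1,000 random 32x32 RGB images
# (the real split has 50K; random data avoids a dataset download here).
images = torch.rand(1000, 3, 32, 32)
labels = torch.randint(0, 10, (1000,))
trainval = TensorDataset(images, labels)

# 90%/10% training/validation split (45K/5K on the real data)
n_train = int(0.9 * len(trainval))
train_set, val_set = random_split(trainval, [n_train, len(trainval) - n_train])
print(len(train_set), len(val_set))  # 900 100
```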
Author
Luis C. Garcia Peraza Herrera (luiscarlos.gph@gmail.com).
License
This repository is shared under an MIT license.
File details
Details for the file vitcifar10-0.0.6.tar.gz
.
File metadata
- Download URL: vitcifar10-0.0.6.tar.gz
- Upload date:
- Size: 15.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9d8ce2ac9561a117d0db2d13f6e775d17845f2935470ed414673f4736b9ccb19
MD5 | 5f83ca9f55bf14523a2c93830aaf0c40
BLAKE2b-256 | 5e16d3716a98b3d8bac42ca1fa6e73dae4c7d87cb9e803fd0ea1922e68c4505d