Python module to train and test on CIFAR-10.
Description
This repository contains the Python package vitcifar10, a Vision Transformer (ViT) baseline for training and testing on CIFAR-10. The implementation supports CUDA only (there is no CPU training).
The aim of this repository is not to provide a highly flexible implementation, but one that can serve as a research baseline with test results close to the state of the art.
The code in this repository relies on timm, a Python package that provides implementations and pretrained weights for many state-of-the-art models.
Use within a Docker container
If you do not have Docker, here you can find a tutorial to install it.
- Build the vitcifar10 Docker image:

  $ git clone https://github.com/luiscarlosgph/vitcifar10.git
  $ cd vitcifar10/docker
  $ docker build -t vitcifar10 .
- Launch a vitcifar10 container:

  $ docker run --name wild_vitcifar10 --runtime=nvidia -v /dev/shm:/dev/shm vitcifar10:latest &
- Get a terminal inside the container (you can execute this command multiple times to get multiple container terminals):

  $ docker exec -it wild_vitcifar10 /bin/zsh
  $ cd $HOME
- Launch CIFAR-10 training:

  $ python3 -m vitcifar10.train --lr 1e-4 --opt adam --nepochs 200 --bs 16 --cpdir checkpoints --logdir logs --cpint 5 --data ./data
- Launch CIFAR-10 testing:

  $ python3 -m vitcifar10.test --data ./data --resume checkpoints/model_best.pt --bs 1
- To kill the container, run:

  $ docker kill wild_vitcifar10

- To remove it, run:

  $ docker rm wild_vitcifar10
Install with pip
$ pip install vitcifar10 --user
Install from source
$ git clone https://github.com/luiscarlosgph/vitcifar10.git
$ cd vitcifar10
$ python3 setup.py install
Download CIFAR-10 training and testing data
$ python3 -m vitcifar10.download_data --data ./data
Train on CIFAR-10
- Launch training:

  $ python3 -m vitcifar10.train --lr 1e-4 --opt adam --nepochs 200 --bs 16 --cpdir checkpoints --logdir logs --cpint 5 --data ./data
- Resume training from a checkpoint:

  $ python3 -m vitcifar10.train --lr 1e-4 --opt adam --nepochs 200 --bs 16 --cpdir checkpoints --logdir logs --cpint 5 --data ./data --resume checkpoints/epoch_21.pt
- Options:
  - --lr: learning rate.
  - --opt: optimizer, either sgd or adam.
  - --nepochs: number of epochs to execute.
  - --bs: training batch size.
  - --cpdir: checkpoint directory.
  - --logdir: path to the directory where the TensorBoard logs will be saved.
  - --cpint: interval of epochs between training checkpoints.
  - --resume: path to the checkpoint file you want to resume from.
  - --data: path to the directory where the dataset will be stored.
  - --seed: fix the random seed for reproducibility.
- Launch TensorBoard:

  $ python3 -m tensorboard.main --logdir logs --bind_all
Train multiple models from scratch (different random seeds)
$ python3 -m vitcifar10.run --lr 1e-4 --opt adam --nepochs 100 --bs 16 --cpdir checkpoints --logdir logs --cpint 5 --niter 5 --data data
- Options:
  - --niter: number of training cycles to run, e.g. --niter 5 will train five networks.

  The rest of the options are identical to those of vitcifar10.train.
Train with a custom dataloader
This is useful when you want to experiment with the inputs/labels without modifying the model. To do so, vitcifar10.run accepts the parameter --data-loader, e.g.:

$ python3 -m vitcifar10.run --lr 1e-4 --opt adam --nepochs 100 --bs 16 --cpdir checkpoints --logdir logs --cpint 10 --niter 5 --data data --data-loader <python_class_name>
The dataloader class that you pass must meet the following requirements:
- Its constructor __init__ must accept the batch size as its first parameter.
- It must implement the __next__ and __iter__ methods.
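A minimal sketch of a class that meets these requirements is shown below. The class name and the random batches are illustrative only; a real dataloader would yield actual image/label tensors:

```python
import torch

class RandomBatchLoader:
    """Illustrative custom dataloader: yields random CIFAR-10-shaped
    batches. The class name and batch contents are made up; a real
    dataloader would yield actual images and labels."""

    def __init__(self, batch_size, num_batches=10):
        self.batch_size = batch_size
        self.num_batches = num_batches
        self._served = 0

    def __iter__(self):
        self._served = 0
        return self

    def __next__(self):
        if self._served >= self.num_batches:
            raise StopIteration
        self._served += 1
        images = torch.rand(self.batch_size, 3, 32, 32)    # 32x32 RGB images
        labels = torch.randint(0, 10, (self.batch_size,))  # 10 CIFAR-10 classes
        return images, labels

# Usage example
loader = RandomBatchLoader(16, num_batches=2)
for images, labels in loader:
    print(images.shape, labels.shape)
```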
Test on CIFAR-10
$ python3 -m vitcifar10.test --data ./data --resume checkpoints/model_best.pt --bs 1
- Options:
  - --data: path to the directory where the dataset is stored.
  - --resume: path to the checkpoint file you want to test.
Perform inference on a single image
After training, you can classify images such as this dog or this cat as follows:
$ python3 -m vitcifar10.inference --image data/dog.jpg --model checkpoints/model_best.pt
It is a dog!
$ python3 -m vitcifar10.inference --image data/cat.jpg --model checkpoints/model_best.pt
It is a cat!
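Under the hood, single-image inference boils down to mapping the model's 10 logits to a class name. The sketch below shows that final step; the helper function is hypothetical (not part of vitcifar10's API), and the class names follow torchvision's CIFAR-10 label order:

```python
import torch

# CIFAR-10 class names in torchvision's label order
CIFAR10_CLASSES = ['airplane', 'automobile', 'bird', 'cat', 'deer',
                   'dog', 'frog', 'horse', 'ship', 'truck']

def logits_to_label(logits):
    """Map a (10,) logit vector to its CIFAR-10 class name.
    Hypothetical helper, not part of vitcifar10's API."""
    return CIFAR10_CLASSES[int(torch.argmax(logits))]

# Hand-crafted logit vector peaking at index 5 ('dog')
fake_logits = torch.zeros(10)
fake_logits[5] = 1.0
print(logits_to_label(fake_logits))  # dog
```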
Training | validation | testing splits
This code uses torchvision to download and load the CIFAR-10 dataset. The constructor of the class torchvision.datasets.CIFAR10 has a boolean parameter called train.
In our code we set train=True to obtain the images for training and validation, using 90% for training (45K images) and 10% for validation (5K images). The validation set is used to select the best model during training (it could also be used for hyperparameter tuning or early stopping). For testing, we set train=False; the testing set contains 10K images.
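The 90%/10% split can be sketched with torch.utils.data.random_split. The stand-in dataset below uses 1,000 random images instead of the real 50K training images so the example runs without downloading CIFAR-10:

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in for the CIFAR-10 training split: 1,000 random 32x32 RGB images
# (the real split has 50K; random data avoids a dataset download here).
images = torch.rand(1000, 3, 32, 32)
labels = torch.randint(0, 10, (1000,))
trainval = TensorDataset(images, labels)

# 90%/10% training/validation split (45K/5K on the real data)
n_train = int(0.9 * len(trainval))
train_set, val_set = random_split(trainval, [n_train, len(trainval) - n_train])
print(len(train_set), len(val_set))  # 900 100
```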
Author
Luis C. Garcia Peraza Herrera (luiscarlos.gph@gmail.com).
License
This repository is shared under an MIT license.
File details
Details for the file vitcifar10-0.0.6.tar.gz
.
File metadata
- Download URL: vitcifar10-0.0.6.tar.gz
- Upload date:
- Size: 15.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9d8ce2ac9561a117d0db2d13f6e775d17845f2935470ed414673f4736b9ccb19
MD5 | 5f83ca9f55bf14523a2c93830aaf0c40
BLAKE2b-256 | 5e16d3716a98b3d8bac42ca1fa6e73dae4c7d87cb9e803fd0ea1922e68c4505d