
Image Classification with Self-Supervised Regularization


Why SESEMI?

SESEMI is an open source image classification library built on PyTorch and PyTorch Lightning. SESEMI enables various modern supervised classifiers to be robust semi-supervised learners based on the principles of self-supervised regularization.

Highlights and Features

  • Integration with the popular PyTorch Image Models (timm) library for access to contemporary, high-performance supervised architectures, with optional pretrained ImageNet weights; see the list of supported backbones
  • Demonstrated utility on large, realistic image datasets; currently competitive on the FastAI Imagenette benchmarks
  • Easy to use out of the box, requiring little hyper-parameter tuning across many tasks spanning supervised learning, semi-supervised learning, and learning with noisy labels; in most use cases, one only needs to tune the learning rate, batch size, and backbone architecture
  • Simply add unlabeled data for improved image classification without any tricks
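For context, the self-supervised regularization at the heart of SESEMI (e.g. the imagewoof_rotation recipe used below) trains the network on an auxiliary pretext task such as predicting image rotations. A minimal NumPy sketch of that proxy task, independent of SESEMI's actual implementation:

```python
import numpy as np

def rotation_proxy_task(image: np.ndarray):
    """Generate the 4-way rotation pretext task: four rotated
    copies of the input, labeled 0..3 for k * 90 degree turns."""
    inputs = [np.rot90(image, k) for k in range(4)]
    labels = list(range(4))
    return inputs, labels

# The self-supervised branch learns to predict `labels` from
# `inputs`, requiring no human annotations on the unlabeled data.
```

Because the rotation labels come for free, any pool of unlabeled images can contribute this auxiliary loss alongside the standard supervised cross-entropy.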

Our goal is to expand the utility of SESEMI for the ML/CV practitioner by incorporating the latest advances in self-supervised, semi-supervised, and few-shot learning to boost the accuracy of conventional supervised classifiers in the limited labeled data setting. If you find this work useful, please star this repo to let us know. Contributions are also welcome!

Installation

Our preferred installation method is Docker; however, you can use any virtual environment tool to install the necessary Python dependencies. Instructions for both methods are below.

Pip

To use pip, configure a virtual environment of your choice with at least Python 3.6 (e.g., miniconda). Then install the requirements as follows:

$ pip install git+https://github.com/FlyreelAI/sesemi.git

While the above installs the latest version from the main branch, a version from PyPI can be installed instead as follows:

$ pip install sesemi
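Either way, you can confirm the install by querying the package metadata from Python (a generic check, not a SESEMI API):

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package: str):
    """Return the installed version string, or None if the
    distribution is not installed in the current environment."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# After the pip install above succeeds, installed_version("sesemi")
# should return a version string rather than None.
```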

Docker

If you would like to use Docker, ensure you have it installed by following the instructions here. The Dockerfile at the root can be used to build an image containing the code in this repository. To build the image, run the following bash commands:

$ USER_ID=$(id -u) SESEMI_IMAGE=sesemi:latest
$ DOCKER_BUILDKIT=1 docker build \
    --build-arg USER_ID=${USER_ID} \
    -t ${SESEMI_IMAGE} https://github.com/FlyreelAI/sesemi.git

Note that your OS user ID is obtained through the bash command id -u. The build produces an image named sesemi:latest.

Getting Started

More detailed documentation is hosted here; this section will guide you through using SESEMI to train a model on FastAI's imagewoof2 dataset. If you don't have access to a GPU machine, training will still work but will take a very long time.

  1. Create a directory for the experiment, enter it, and create the data, runs, and .cache subdirectories.

    $ mkdir sesemi-experiments
    $ cd sesemi-experiments
    $ mkdir data runs .cache
    
  2. Download and extract the imagewoof2 dataset to the data directory.

    $ curl https://s3.amazonaws.com/fast-ai-imageclas/imagewoof2.tgz | tar -xzv -C ./data
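The archive extracts to ./data/imagewoof2 with the standard FastAI train/val split layout of .JPEG files. A small sanity check over such a layout (a hypothetical helper, not part of SESEMI):

```python
from pathlib import Path

def count_files_per_split(root: str, pattern: str = "*.JPEG"):
    """Count files matching `pattern` under each top-level split
    directory (e.g. train/ and val/) of an ImageFolder-style tree."""
    return {
        split.name: sum(1 for _ in split.rglob(pattern))
        for split in sorted(Path(root).iterdir())
        if split.is_dir()
    }
```

For imagewoof2 you would call count_files_per_split("./data/imagewoof2") and expect non-zero counts for both train and val before starting training.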
    
  3. Run training using SESEMI for 80 epochs. You should get 90-91% accuracy on the imagewoof2 dataset, which is competitive on the FastAI leaderboard using a standard training protocol plus unlabeled data, without fancy tricks.

    If you're not using Docker, this can be done as follows:

    $ open_sesemi -cn imagewoof_rotation
    

    If you use Docker and have nvidia-docker installed, you can instead run:

    $ USER_ID=$(id -u) SESEMI_IMAGE=sesemi:latest GPUS=all
    $ docker run \
        --gpus ${GPUS} \
        -u ${USER_ID} \
        --rm --ipc=host \
        --mount type=bind,src=$(pwd),dst=/home/appuser/sesemi-experiments/ \
        -w /home/appuser/sesemi-experiments \
        ${SESEMI_IMAGE} \
        open_sesemi -cn imagewoof_rotation
    

    The training logs with all relevant training statistics (accuracy, losses, learning rate, etc.) are written to the ./runs directory. You can use TensorBoard to view and monitor them in your browser during training.

    $ tensorboard --logdir ./runs
    
  4. Run evaluation on the trained checkpoint.

    Without docker:

    $ CHECKPOINT_PATH=$(echo ./runs/imagewoof_rotation/*/lightning_logs/version_0/checkpoints/last.ckpt)
    $ open_sesemi -cn imagewoof_rotation \
        run.mode=VALIDATE \
        run.pretrained_checkpoint_path=$CHECKPOINT_PATH
    

    With docker:

    $ USER_ID=$(id -u) SESEMI_IMAGE=sesemi:latest GPUS=all
    $ CHECKPOINT_PATH=$(echo ./runs/imagewoof_rotation/*/lightning_logs/version_0/checkpoints/last.ckpt)
    $ docker run \
        --gpus ${GPUS} \
        -u ${USER_ID} \
        --rm --ipc=host \
        --mount type=bind,src=$(pwd),dst=/home/appuser/sesemi-experiments/ \
        -w /home/appuser/sesemi-experiments \
        ${SESEMI_IMAGE} \
        open_sesemi -cn imagewoof_rotation \
            run.mode=VALIDATE \
            run.pretrained_checkpoint_path=$CHECKPOINT_PATH
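    The CHECKPOINT_PATH lines above resolve the checkpoint with a shell glob; the equivalent lookup can be done in Python (a sketch assuming the same ./runs layout, where the wildcard matches the run directory):

```python
import glob

def find_last_checkpoint(
    pattern: str = "./runs/imagewoof_rotation/*/lightning_logs/"
                   "version_0/checkpoints/last.ckpt",
) -> str:
    """Return the checkpoint path matching `pattern`, failing
    loudly when no run has produced one yet."""
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError(f"no checkpoint matches {pattern!r}")
    # If several runs exist, take the lexicographically last run
    # directory (assumed here to sort newest-last).
    return matches[-1]
```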
    

Citation

If you find this work useful, consider citing the related paper:

@inproceedings{TranSESEMI,
  title={Exploring Self-Supervised Regularization for Supervised and Semi-Supervised Learning},
  author={Phi Vu Tran},
  booktitle={NeurIPS Workshop on Learning with Rich Experience: Integration of Learning Paradigms},
  year={2019}
}
