
AK_SSL: A Self-Supervised Learning Library





📍 Overview

Welcome to the Self-Supervised Learning Library! This repository hosts a collection of tools and implementations for self-supervised learning. Self-supervised learning is a powerful paradigm that leverages unlabeled data to pre-train models, which can then be fine-tuned on specific tasks with smaller labeled datasets. This library aims to provide researchers and practitioners with a comprehensive set of tools to experiment, learn, and apply self-supervised learning techniques effectively. This project was our assignment during the summer apprenticeship in the newly established Intelligent and Learning System (ILS) laboratory at the University of Isfahan.


✍️ Self-Supervised Learning

Self-supervised learning is a subfield of machine learning where models are trained to predict certain aspects of the input data without relying on manual labeling. This approach has gained significant attention due to its ability to leverage large amounts of unlabeled data, which is often easier to obtain than fully annotated datasets. This library provides implementations of various self-supervised techniques, allowing you to experiment with and apply these methods in your own projects.


🔎 Supported Methods

BarlowTwins

Barlow Twins is a self-supervised learning method that aims to learn embeddings invariant to distortions of the input sample. It achieves this by applying two distinct sets of augmentations to the same input sample, resulting in two distorted views of the same image. The objective function measures the cross-correlation matrix between the outputs of two identical networks fed with these distorted sample versions, striving to make it as close to the identity matrix as possible. This causes the embedding vectors of the distorted sample versions to become similar while minimizing redundancy among the components of these vectors. Barlow Twins particularly benefits from utilizing high-dimensional output vectors.

Details of this method
| Loss | Transformation | Transformation Prime | Projection Head | Paper | Original Code |
|------|----------------|----------------------|-----------------|-------|---------------|
| BarlowTwins Loss | SimCLR Transformation | SimCLR Transformation | BarlowTwins Projection Head | Link | Link |

BarlowTwins Loss is inspired by HSIC loss.
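
As a rough illustration of the objective described above, the cross-correlation loss can be sketched in a few lines of PyTorch. This is a minimal sketch for intuition only, not the implementation used in this library; the function name, batch size, embedding dimension, and off-diagonal weight are illustrative assumptions.

```python
import torch

def barlow_twins_loss(z_a: torch.Tensor, z_b: torch.Tensor, lambda_offdiag: float = 5e-3) -> torch.Tensor:
    """Illustrative Barlow Twins objective for two batches of embeddings of shape (N, D)."""
    n, _ = z_a.shape
    # Standardize each embedding dimension across the batch.
    z_a = (z_a - z_a.mean(dim=0)) / (z_a.std(dim=0) + 1e-6)
    z_b = (z_b - z_b.mean(dim=0)) / (z_b.std(dim=0) + 1e-6)
    # Empirical cross-correlation matrix of shape (D, D).
    c = (z_a.T @ z_b) / n
    # Drive the diagonal toward 1 (invariance) and the off-diagonal toward 0 (redundancy reduction).
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()
    return on_diag + lambda_offdiag * off_diag

# Toy usage: embeddings of two distorted views produced by the same backbone and projection head.
z1, z2 = torch.randn(256, 1024), torch.randn(256, 1024)
loss = barlow_twins_loss(z1, z2)
```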

BYOL

BYOL (Bootstrap Your Own Latent) is one of the newer approaches to self-supervised learning. Like other methods, BYOL aims to learn a representation that can be used for downstream tasks. It employs two neural networks: the online network and the target network. The online network is trained to predict the target network's representation of the same image under a different augmented view, while the target network is updated with a slow-moving average of the online network's parameters. Whereas most state-of-the-art methods rely on negative pairs, BYOL reaches a new state of the art without them: it directly maximizes the similarity between the representations of the same image under two different augmented views (a positive pair).

Details of this method
| Loss | Transformation | Transformation Prime | Projection Head | Prediction Head | Paper | Original Code |
|------|----------------|----------------------|-----------------|-----------------|-------|---------------|
| BYOL Loss | SimCLR Transformation | SimCLR Transformation | BarlowTwins Projection Head | BarlowTwins Prediction Head | Link | Link |
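
The two ingredients described above, a loss that pulls the online prediction toward the target projection and a slow EMA update of the target network, can be sketched as follows. This is a hand-written illustration under assumed tensor shapes and momentum, not the library's code.

```python
import copy
import torch
import torch.nn.functional as F

def byol_loss(p_online: torch.Tensor, z_target: torch.Tensor) -> torch.Tensor:
    """Distance between the online prediction and the (gradient-free) target projection."""
    p = F.normalize(p_online, dim=-1)
    z = F.normalize(z_target.detach(), dim=-1)  # no gradient flows through the target branch
    return (2 - 2 * (p * z).sum(dim=-1)).mean()

@torch.no_grad()
def ema_update(target: torch.nn.Module, online: torch.nn.Module, tau: float = 0.996) -> None:
    """Target parameters follow a slow-moving average of the online parameters."""
    for t_param, o_param in zip(target.parameters(), online.parameters()):
        t_param.data.mul_(tau).add_(o_param.data, alpha=1 - tau)

# Toy usage with a linear "encoder": the target starts as a copy of the online network.
online = torch.nn.Linear(128, 64)
target = copy.deepcopy(online)
p, z = torch.randn(32, 64), torch.randn(32, 64)   # prediction-head and target-projection outputs
loss = byol_loss(p, z)
ema_update(target, online)
```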

DINO

DINO (self-distillation with no labels) is a self-supervised learning method developed by Facebook AI in which a student network directly predicts the output of a teacher network, built with a momentum encoder, using a standard cross-entropy loss. By combining self-supervised learning with Transformers, DINO paves the way for machines that can comprehend images and videos at a much deeper level.

Details of this method
| Loss | Transformation Global 1 | Transformation Global 2 | Transformation Local | Projection Head | Paper | Original Code |
|------|-------------------------|-------------------------|----------------------|-----------------|-------|---------------|
| DINO Loss | SimCLR Transformation | SimCLR Transformation | SimCLR Transformation | DINO Projection Head | Link | Link |
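
To make the cross-entropy between student and teacher outputs concrete, here is a minimal sketch of the core computation, with centering and temperature sharpening on the teacher side. The output dimension, temperatures, and running center are assumptions for illustration, not values taken from this library.

```python
import torch
import torch.nn.functional as F

def dino_loss(student_out: torch.Tensor,
              teacher_out: torch.Tensor,
              center: torch.Tensor,
              tau_student: float = 0.1,
              tau_teacher: float = 0.04) -> torch.Tensor:
    """Cross-entropy between the centered, sharpened teacher distribution and the student."""
    teacher_probs = F.softmax((teacher_out.detach() - center) / tau_teacher, dim=-1)
    student_log_probs = F.log_softmax(student_out / tau_student, dim=-1)
    return -(teacher_probs * student_log_probs).sum(dim=-1).mean()

# Toy usage: head outputs of the student and the momentum (teacher) encoder for one pair of views.
out_dim = 4096
student_out = torch.randn(64, out_dim)
teacher_out = torch.randn(64, out_dim)
center = torch.zeros(out_dim)  # in practice this is a running average of teacher outputs
loss = dino_loss(student_out, teacher_out, center)
```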

MoCo (v2 & v3)

MoCo, short for Momentum Contrast, is a self-supervised learning algorithm that employs a contrastive loss. MoCo v2 is an enhanced iteration of the original algorithm: motivated by the findings in the SimCLR paper, its authors introduced several modifications to MoCo v1, replacing the single fully connected layer with a 2-layer MLP head with ReLU activation for the unsupervised training stage, adding blur augmentation, and adopting a cosine learning rate schedule. These adjustments enabled MoCo v2 to outperform the state-of-the-art SimCLR, even while using a smaller batch size and fewer epochs.

MoCo v3, introduced in the paper "An Empirical Study of Training Self-Supervised Vision Transformers," represents another advancement in self-supervised learning. It builds upon the foundation of MoCo v1 / MoCo v2 and addresses the instability issue observed when employing ViT for self-supervised learning.

In contrast to MoCo v2, MoCo v3 lets the keys naturally coexist within the same batch. The memory queue (memory bank) is discarded, resulting in a setting similar to SimCLR. The encoder f_q comprises a backbone (e.g., ResNet or ViT), a projection head, and an additional prediction head.

Details of this method
| Method | Loss | Transformation | Transformation Prime | Projection Head | Prediction Head | Paper | Original Code |
|--------|------|----------------|----------------------|-----------------|-----------------|-------|---------------|
| MoCo v2 | InfoNCE | SimCLR Transformation | None | SimCLR Projection Head | None | Link | Link |
| MoCo v3 | InfoNCE | SimCLR Transformation | SimCLR Transformation | SimCLR Projection Head | BYOL Prediction Head | Link | Link |
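
Both MoCo variants in the table optimize an InfoNCE objective. The sketch below shows the MoCo v2 setting, where negatives come from a memory queue; in MoCo v3 the queue is dropped and the other samples in the batch act as negatives. The temperature, shapes, and queue size are illustrative assumptions only.

```python
import torch
import torch.nn.functional as F

def info_nce(q: torch.Tensor, k_pos: torch.Tensor, queue: torch.Tensor, tau: float = 0.2) -> torch.Tensor:
    """InfoNCE: the positive key comes from the momentum encoder, negatives from the queue."""
    q = F.normalize(q, dim=-1)
    k_pos = F.normalize(k_pos.detach(), dim=-1)
    queue = F.normalize(queue.detach(), dim=-1)
    l_pos = (q * k_pos).sum(dim=-1, keepdim=True)       # (N, 1) similarity with the positive key
    l_neg = q @ queue.T                                  # (N, K) similarities with queued negatives
    logits = torch.cat([l_pos, l_neg], dim=1) / tau
    labels = torch.zeros(q.size(0), dtype=torch.long)    # the positive always sits at index 0
    return F.cross_entropy(logits, labels)

# Toy usage: query/key embeddings for a batch of 128 plus a queue of 4096 negative keys.
q, k = torch.randn(128, 256), torch.randn(128, 256)
queue = torch.randn(4096, 256)
loss = info_nce(q, k, queue)
```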

SimCLR

SimCLR (a Simple Framework for Contrastive Learning of Representations) is a self-supervised technique for learning image representations. The fundamental building blocks of contrastive self-supervised methods such as SimCLR are image transformations: each image is transformed into multiple new images through randomly applied augmentations, and the goal of the self-supervised model is to identify which images originate from the same source among a set of negative examples. SimCLR operates on the principle of maximizing the similarity between positive pairs of augmented images while minimizing the similarity with negative pairs. During training, SimCLR first applies strong data augmentation to generate augmented views of each input image; the views are then encoded, projected, and contrasted with the NT-Xent loss.

Details of this method
| Loss | Transformation | Projection Head | Paper | Original Code |
|------|----------------|-----------------|-------|---------------|
| NT_Xent | SimCLR Transformation | SimCLR Projection Head | Link | Link |
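
The NT_Xent loss from the table can be sketched as a cross-entropy over cosine similarities, where each view's only positive is its counterpart and the remaining 2N - 2 samples act as negatives. This is an illustrative sketch with an assumed temperature and shapes, not the library's implementation.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """NT-Xent over 2N projections: positives are the matching views, everything else is negative."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=-1)   # (2N, D)
    sim = z @ z.T / tau                                    # (2N, 2N) scaled cosine similarities
    sim.fill_diagonal_(float("-inf"))                      # a sample is never its own positive
    # Row i in [0, N) pairs with row i + N; row i in [N, 2N) pairs with row i - N.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Toy usage: projection-head outputs for two augmented views of the same batch.
z1, z2 = torch.randn(256, 128), torch.randn(256, 128)
loss = nt_xent(z1, z2)
```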

SimSiam

SimSiam is a self-supervised representation learning model that was proposed by Facebook AI Research (FAIR). It is a simple Siamese network designed to learn meaningful representations without requiring negative sample pairs, large batches, or momentum encoders.

Details of this method
| Loss | Transformation | Projection Head | Prediction Head | Paper | Original Code |
|------|----------------|-----------------|-----------------|-------|---------------|
| Negative Cosine Similarity | SimCLR Transformation | SimSiam Projection Head | SimSiam Prediction Head | Link | Link |
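
The negative cosine similarity loss listed above, together with the stop-gradient that SimSiam relies on instead of negative pairs or a momentum encoder, can be sketched as follows. The shapes and helper names are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def neg_cosine(p: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    """D(p, z) = -cos(p, stopgrad(z)); the stop-gradient keeps the two branches from collapsing."""
    return -F.cosine_similarity(p, z.detach(), dim=-1).mean()

def simsiam_loss(p1: torch.Tensor, p2: torch.Tensor, z1: torch.Tensor, z2: torch.Tensor) -> torch.Tensor:
    """Symmetrized loss over the two augmented views."""
    return 0.5 * neg_cosine(p1, z2) + 0.5 * neg_cosine(p2, z1)

# Toy usage: p* are prediction-head outputs, z* are projection-head outputs of the two views.
p1, p2 = torch.randn(64, 256), torch.randn(64, 256)
z1, z2 = torch.randn(64, 256), torch.randn(64, 256)
loss = simsiam_loss(p1, p2, z1, z2)
```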

SwAV

SwAV, or Swapping Assignments Between Views, is a self-supervised learning approach that takes advantage of contrastive methods without requiring pairwise comparisons to be computed. Specifically, it simultaneously clusters the data while enforcing consistency between the cluster assignments produced for different augmentations (or views) of the same image, instead of comparing features directly as in contrastive learning. Simply put, SwAV uses a swapped prediction mechanism: the cluster assignment of one view is predicted from the representation of another view.

Details of this method
| Loss | Transformation Global | Transformation Local | Projection Head | Paper | Original Code |
|------|-----------------------|----------------------|-----------------|-------|---------------|
| SwAV Loss | SimCLR Transformation | SimCLR Transformation | SwAV Projection Head | Link | Link |
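
The swapped prediction mechanism can be sketched in two steps: turn each view's prototype scores into soft cluster codes with a few Sinkhorn-Knopp iterations, then predict view A's code from view B's scores and vice versa. The number of prototypes, temperatures, and iteration count below are illustrative assumptions, not this library's settings.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def sinkhorn(scores: torch.Tensor, n_iters: int = 3, eps: float = 0.05) -> torch.Tensor:
    """Sinkhorn-Knopp normalization that turns prototype scores (N, K) into soft cluster codes."""
    q = torch.exp(scores / eps).T     # (K, N)
    q /= q.sum()
    n_protos, n_samples = q.shape
    for _ in range(n_iters):
        q /= q.sum(dim=1, keepdim=True)   # normalize over prototypes
        q /= n_protos
        q /= q.sum(dim=0, keepdim=True)   # normalize over samples
        q /= n_samples
    return (q * n_samples).T          # (N, K); each row is a soft assignment summing to 1

def swav_loss(scores_a: torch.Tensor, scores_b: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Swapped prediction: view A's code is predicted from view B's scores, and vice versa."""
    q_a, q_b = sinkhorn(scores_a), sinkhorn(scores_b)
    log_p_a = F.log_softmax(scores_a / tau, dim=-1)
    log_p_b = F.log_softmax(scores_b / tau, dim=-1)
    return -0.5 * ((q_a * log_p_b).sum(dim=-1).mean() + (q_b * log_p_a).sum(dim=-1).mean())

# Toy usage: prototype scores (features @ prototypes) for two views of the same batch.
scores_a, scores_b = torch.randn(64, 300), torch.randn(64, 300)
loss = swav_loss(scores_a, scores_b)
```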

🚀 Getting Started

✔️ Requirements

Before you begin, ensure that you have the packages in requirements.txt installed.

📦 Installation

1. Clone the AK_SSL repository:

   git clone https://github.com/audrina-ebrahimi/AK_SSL.git

2. Change to the project directory:

   cd AK_SSL

3. Install the dependencies:

   pip install -r ./Codes/requirements.txt

💡 Tutorial


📊 Benchmarks


📜 References Used

In the development of this project, we have drawn inspiration and utilized code, libraries, and resources from various sources. We would like to acknowledge and express our gratitude to the following references and their respective authors:

These references have played a crucial role in enhancing the functionality and quality of our project. We extend our thanks to the authors and contributors of these resources for their valuable work.


💯 License

This project is licensed under the MIT License.


🤝 Collaborators

By:

Thanks to Dr. Peyman Adibi and Dr. Hossein Karshenas, for their invaluable guidance and support throughout this project.

