mmlearn

A modular framework for research on multimodal representation learning.

This project aims to enable the evaluation of existing multimodal representation learning methods, as well as to facilitate experimentation and research on new techniques.
Quick Start
Installation
Prerequisites
The library requires Python 3.9 or later. We recommend using a virtual environment to manage dependencies. You can create and activate a virtual environment using the following commands:
python3 -m venv /path/to/new/virtual/environment
source /path/to/new/virtual/environment/bin/activate
Installing binaries
To install the pre-built binaries, run:
python3 -m pip install mmlearn
Building from source
To install the library from source, run:
git clone https://github.com/VectorInstitute/mmlearn.git
cd mmlearn
python3 -m pip install -e .
Running Experiments
To run an experiment, create a folder with a structure similar to the configs folder. Then, use the mmlearn_run command to run the experiment as defined in a .yaml file under the experiment folder, like so:
mmlearn_run --config-dir /path/to/config/dir +experiment=<name_of_experiment_config> experiment=your_experiment_name
Notice that the config directory refers to the top-level directory containing the experiment folder. The experiment name is the name of the .yaml file under the experiment folder, without the extension.
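For reference, the expected layout of the config directory looks like the following (the file name my_experiment.yaml is illustrative, not part of the library):

```
/path/to/config/dir
└── experiment
    └── my_experiment.yaml
```

With this layout, passing +experiment=my_experiment selects that file.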
We use Hydra to manage configurations, so you can override any configuration parameter from the command line. To see the available options and other information, run:
mmlearn_run --config-dir /path/to/config/dir +experiment=<name_of_experiment> --help
By default, the mmlearn_run command will run the experiment locally. To run the experiment on a SLURM cluster, we use the submitit launcher plugin built into Hydra. The following is an example of how to run an experiment on a SLURM cluster:
mmlearn_run --multirun hydra.launcher.mem_gb=32 hydra.launcher.qos=your_qos hydra.launcher.partition=your_partition hydra.launcher.gres=gpu:4 hydra.launcher.cpus_per_task=8 hydra.launcher.tasks_per_node=4 hydra.launcher.nodes=1 hydra.launcher.stderr_to_stdout=true hydra.launcher.timeout_min=60 '+hydra.launcher.additional_parameters={export: ALL}' --config-dir /path/to/config/dir +experiment=<name_of_experiment_config> experiment=your_experiment_name
This will submit a job to the SLURM cluster with the specified resources.
Note: After the job is submitted, it is okay to cancel the program with Ctrl+C; the job will continue running on the cluster. You can also add & at the end of the command to run it in the background.
Summary of Implemented Methods
Pretraining Methods | Notes |
---|---|
Contrastive Pretraining | Uses the contrastive loss to align the representations from N modalities. Supports sharing of encoders, projection heads, or postprocessing modules (e.g. logit/temperature scaling) across modalities. Also supports multi-task learning with auxiliary unimodal tasks applied to specific modalities. |
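To illustrate the idea behind contrastive pretraining, here is a generic CLIP-style sketch (not mmlearn's actual implementation): matching pairs across two modalities are pushed toward higher similarity than mismatched pairs via a symmetric cross-entropy over scaled similarity logits.

```python
import math

def contrastive_loss(emb_a, emb_b, temperature=0.07):
    """Symmetric InfoNCE-style loss over two lists of (already normalized)
    embedding vectors, where emb_a[i] and emb_b[i] form a positive pair.
    Illustrative sketch only."""
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    # similarity logits scaled by the temperature
    logits = [[dot(a, b) / temperature for b in emb_b] for a in emb_a]

    def cross_entropy(rows):
        # the correct "class" for row i is column i (the matching pair)
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # for numerical stability of log-sum-exp
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    # average the a->b and b->a retrieval directions
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits_t))
```

With perfectly aligned embeddings the loss is near zero; swapping the pairing raises it sharply.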
Evaluation Methods | Notes |
---|---|
Zero-shot Cross-modal Retrieval | Evaluates the quality of the learned representations in retrieving the k most similar examples from a different modality, using the recall@k metric. This is applicable to any number of pairs of modalities at once, depending on memory constraints. |
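The recall@k metric itself is straightforward; a minimal sketch (assuming query i's true match is candidate i, as in paired retrieval):

```python
def recall_at_k(sim, k):
    """Fraction of queries whose true match (index i for query i) appears
    among the top-k most similar candidates. `sim` is a query-by-candidate
    similarity matrix given as a list of rows. Illustrative sketch only."""
    hits = 0
    for i, row in enumerate(sim):
        # indices of the k candidates with the highest similarity
        top_k = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        hits += i in top_k
    return hits / len(sim)
```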
Components
Datasets
Every dataset object must return an instance of Example with one or more keys/attributes corresponding to a modality name as specified in the Modalities registry. The Example object must also include an example_index attribute/key, which is used, in addition to the dataset index, to uniquely identify the example.
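A minimal sketch of what such a dataset looks like. Note that SimpleExample below is a stand-in for mmlearn's Example class (assumed here to behave like a mapping with attribute access), and the "text" modality key is an assumption for illustration:

```python
class SimpleExample(dict):
    """Stand-in for mmlearn's Example: a mapping whose keys are also
    accessible as attributes. Illustrative only."""
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError as exc:
            raise AttributeError(name) from exc

class ToyTextDataset:
    """A map-style dataset whose items carry a 'text' modality key and the
    required example_index."""
    def __init__(self, sentences):
        self.sentences = sentences

    def __len__(self):
        return len(self.sentences)

    def __getitem__(self, idx):
        # example_index uniquely identifies the example within this dataset
        return SimpleExample(text=self.sentences[idx], example_index=idx)
```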
CombinedDataset
The CombinedDataset object is used to combine multiple datasets into one. It accepts an iterable of torch.utils.data.Dataset and/or torch.utils.data.IterableDataset objects and, given an index, returns an Example object from one of the datasets. Conceptually, the CombinedDataset object is a concatenation of the datasets in the input iterable, so the given index can be mapped to a specific dataset based on the sizes of the datasets. As iterable-style datasets do not support random access, the examples from these datasets are returned in order as they are iterated over.
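The index mapping for the map-style case can be sketched as follows (this mirrors the concatenation idea described above; it is not mmlearn's actual code):

```python
from bisect import bisect_right
from itertools import accumulate

def locate(global_index, dataset_sizes):
    """Map a global index into a concatenation of datasets to a
    (dataset_index, example_index) pair, using cumulative dataset sizes.
    Illustrative sketch only."""
    cumulative = list(accumulate(dataset_sizes))
    if global_index < 0 or global_index >= cumulative[-1]:
        raise IndexError(global_index)
    # first dataset whose cumulative size exceeds the global index
    dataset_index = bisect_right(cumulative, global_index)
    offset = cumulative[dataset_index - 1] if dataset_index > 0 else 0
    return dataset_index, global_index - offset
```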
The CombinedDataset object also adds a dataset_index attribute to the Example object, corresponding to the index of the dataset in the input iterable. Every example returned by the CombinedDataset will also have an example_ids attribute: an instance of Example containing the same keys/attributes as the original example, with the exception of the example_index and dataset_index attributes, where each value is a tensor of the dataset_index and example_index.
Dataloading
When dealing with multiple datasets with different modalities, the default collate_fn of torch.utils.data.DataLoader may not work, as it assumes that all examples have the same keys/attributes. In that case, the collate_example_list function can be used as the collate_fn argument of torch.utils.data.DataLoader. This function takes a list of Example objects and returns a dictionary of tensors, with all the keys/attributes of the Example objects.
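The core problem such a collate function must handle can be sketched in a few lines. This is a simplified stand-in, not collate_example_list itself: it groups values per key across examples with heterogeneous keys, while the real function additionally stacks tensor values.

```python
def collate_heterogeneous(examples):
    """Group values per key across a batch of dict-like examples that may
    not share the same set of keys. Illustrative sketch only: the real
    collate function also stacks tensor values into batched tensors."""
    batch = {}
    for example in examples:
        for key, value in example.items():
            # missing keys simply contribute nothing for this example
            batch.setdefault(key, []).append(value)
    return batch
```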
Contributing
If you are interested in contributing to the library, please see CONTRIBUTING.MD. This file contains many details around contributing to the code base, including our development practices, code checks, tests, and more.