
A modular framework for research on multimodal representation learning.

Project description

mmlearn


mmlearn aims to enable the evaluation of existing multimodal representation learning methods, as well as to facilitate experimentation and research on new techniques.

Quick Start

Installation

Prerequisites

The library requires Python 3.10 or later. We recommend using a virtual environment to manage dependencies. You can create and activate one as follows:

python3 -m venv /path/to/new/virtual/environment
source /path/to/new/virtual/environment/bin/activate

Installing binaries

To install the pre-built binaries, run:

python3 -m pip install mmlearn

Installation Options

You can install optional dependencies to enable additional features. Use one or more of the pip extras listed below to install the desired dependencies.

pip extra  Dependencies                       Notes
vision     torchvision, opencv-python, timm   Enables image processing and vision tasks.
audio      torchaudio                         Enables audio processing and tasks.
peft       peft                               Uses the PEFT library to enable parameter-efficient fine-tuning.

For example, to install the library with the vision and audio extras, run:

python3 -m pip install "mmlearn[vision,audio]"

Building from source

To install the library from source, run:

git clone https://github.com/VectorInstitute/mmlearn.git
cd mmlearn
python3 -m pip install -e .

Running Experiments

We use Hydra and hydra-zen to manage configurations in the library.

For new experiments, it is recommended to create a new directory to store the configuration files. The directory should have an __init__.py file to make it a Python package and an experiment folder to store the experiment configuration files. This format allows the use of .yaml configuration files as well as Python modules (using structured configs or hydra-zen) to define the experiment configurations.
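
For illustration, a minimal config package could be laid out as sketched below, with the __init__.py registering custom components via hydra-zen so that they become available when the directory is added to hydra.searchpath as a package. The project, module, and group names here are hypothetical:

# my_project/configs/__init__.py
#
# Hypothetical layout:
#   my_project/
#     configs/
#       __init__.py              <- registers custom configs (this file)
#       experiment/
#         my_experiment.yaml     <- experiment config composed by Hydra
#
# Minimal sketch using hydra-zen's default store; the group and option names
# below are assumptions, not mmlearn's actual config schema.
from hydra_zen import builds, store

from my_project.models import MyEncoder  # hypothetical custom module

store(
    builds(MyEncoder, embed_dim=512, populate_full_signature=True),
    name="my_encoder",
    group="modules/encoders",
)
store.add_to_hydra_store()  # push the registered configs into Hydra's config store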

To run an experiment, use the following command:

mmlearn_run 'hydra.searchpath=[pkg://path.to.config.directory]' +experiment=<name_of_experiment_yaml_file> experiment_name=your_experiment_name

Hydra will compose the experiment configuration from all the configurations in the specified directory as well as those in the mmlearn package. Note the dot-separated path to the directory containing the experiment configuration files. A path can be added to hydra.searchpath either as a package (pkg://path.to.config.directory) or as a file system path (file://path/to/config/directory). However, new configs in mmlearn are added to Hydra's external store inside path/to/config/directory/__init__.py, which is only interpreted when the config directory is added as a package. Hence, please refrain from using the file:// notation.

Hydra also allows for overriding configuration parameters from the command line. To see the available options and other information, run:

mmlearn_run 'hydra.searchpath=[pkg://path.to.config.directory]' +experiment=<name_of_experiment_yaml_file> --help
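
For example, individual fields can be overridden inline; the parameter paths below (task.optimizer.lr, trainer.max_epochs) are hypothetical and depend on how your experiment config is structured:

mmlearn_run 'hydra.searchpath=[pkg://path.to.config.directory]' \
    +experiment=my_experiment \
    experiment_name=my_experiment_name \
    task.optimizer.lr=1e-4 \
    trainer.max_epochs=10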

By default, the mmlearn_run command will run the experiment locally. To run the experiment on a SLURM cluster, we use the submitit launcher plugin built into Hydra. The following is an example of how to run an experiment on a SLURM cluster:

mmlearn_run --multirun \
    hydra.launcher.mem_per_cpu=5G \
    hydra.launcher.qos=your_qos \
    hydra.launcher.partition=your_partition \
    hydra.launcher.gres=gpu:4 \
    hydra.launcher.cpus_per_task=8 \
    hydra.launcher.tasks_per_node=4 \
    hydra.launcher.nodes=1 \
    hydra.launcher.stderr_to_stdout=true \
    hydra.launcher.timeout_min=720 \
    'hydra.searchpath=[pkg://path.to.my_project.configs]' \
    +experiment=my_experiment \
    experiment_name=my_experiment_name

This will submit a job to the SLURM cluster with the specified resources.

Note: After the job is submitted, it is okay to cancel the program with Ctrl+C. The job will continue running on the cluster. You can also add & at the end of the command to run it in the background.

Summary of Implemented Methods

Pretraining Methods

Contrastive Pretraining

Uses the contrastive loss to align the representations from N modalities. Supports sharing of encoders, projection heads or postprocessing modules (e.g. logit/temperature scaling) across modalities. Also supports multi-task learning with auxiliary unimodal tasks applied to specific modalities.
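
As a rough illustration of the objective, here is a minimal two-modality InfoNCE sketch (illustrative only, not mmlearn's actual implementation):

import torch
import torch.nn.functional as F

def contrastive_loss(z_a: torch.Tensor, z_b: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss for a batch of paired embeddings of shape (N, D)."""
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature  # (N, N) cosine-similarity logits
    targets = torch.arange(z_a.size(0), device=z_a.device)  # matched pairs lie on the diagonal
    # Average the two retrieval directions (a -> b and b -> a).
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))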

I-JEPA

The Image-based Joint-Embedding Predictive Architecture (I-JEPA) is a unimodal non-generative self-supervised learning method that predicts the representations of several target blocks of an image given a context block from the same image. This task can be combined with the contrastive pretraining task to learn multimodal representations from paired and unpaired data.
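
A toy sketch of the I-JEPA objective follows; the encoder and predictor callables are stand-ins with assumed signatures, not mmlearn's API:

import torch
import torch.nn.functional as F

def ijepa_loss(context_encoder, target_encoder, predictor, patches, ctx_idx, tgt_idx):
    """patches: (N, num_patches, D); ctx_idx/tgt_idx: 1-D tensors of patch indices."""
    ctx = context_encoder(patches[:, ctx_idx])  # encode the visible context block
    with torch.no_grad():  # the target encoder is typically EMA-updated, so no gradients
        tgt = target_encoder(patches)[:, tgt_idx]  # encode the full image, keep target patches
    pred = predictor(ctx, tgt_idx)  # predict target-block representations from context
    return F.mse_loss(pred, tgt)  # regression loss in representation space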

Evaluation Methods

Zero-shot Cross-modal Retrieval

Evaluates the quality of the learned representations in retrieving the k most similar examples from a different modality, using the recall@k metric. This is applicable to any number of modality pairs at once, depending on memory constraints.
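
A minimal sketch of the metric, assuming queries and gallery items at the same index form a matched pair (illustrative, not mmlearn's evaluation code):

import torch

def recall_at_k(query_emb: torch.Tensor, gallery_emb: torch.Tensor, k: int = 5) -> float:
    """Fraction of queries whose matched item appears among the top-k retrievals."""
    sims = query_emb @ gallery_emb.t()  # (N, N) similarity matrix
    topk = sims.topk(k, dim=-1).indices  # (N, k) indices of the k most similar items
    targets = torch.arange(query_emb.size(0)).unsqueeze(1)  # matched pair = same index
    return (topk == targets).any(dim=-1).float().mean().item()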

Zero-shot Classification

Evaluates the ability of a pre-trained encoder-based multimodal model to predict classes that were not explicitly seen during training. The new classes are given as text prompts, and the query modality can be any of the supported modalities. Binary and multi-class classification tasks are supported.
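
A minimal sketch of prompt-based zero-shot classification; the encode_text callable and the prompt template are assumptions for illustration:

import torch
import torch.nn.functional as F

@torch.no_grad()
def zero_shot_classify(image_emb, class_names, encode_text, template="a photo of a {}."):
    """Return the predicted class index for each query (image) embedding."""
    prompts = [template.format(name) for name in class_names]
    text_emb = F.normalize(encode_text(prompts), dim=-1)  # (C, D) class-prompt embeddings
    image_emb = F.normalize(image_emb, dim=-1)  # (N, D) query embeddings
    return (image_emb @ text_emb.t()).argmax(dim=-1)  # index of most similar prompt per query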

Contributing

If you are interested in contributing to the library, please see CONTRIBUTING.MD. This file contains many details around contributing to the code base, including our development practices, code checks, tests, and more.



Download files

Download the file for your platform.

Source Distribution

mmlearn-0.1.0b0.tar.gz (378.9 kB)

Uploaded Source

Built Distribution


mmlearn-0.1.0b0-py3-none-any.whl (111.9 kB)

Uploaded Python 3

File details

Details for the file mmlearn-0.1.0b0.tar.gz.

File metadata

  • Download URL: mmlearn-0.1.0b0.tar.gz
  • Upload date:
  • Size: 378.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for mmlearn-0.1.0b0.tar.gz
Algorithm    Hash digest
SHA256       8122f96a16a6f3c289c1092741d4bf49bc6886f6d68f4ec9104b90b5164c03da
MD5          704564ed38e5814d68c3eb3cd2b5c398
BLAKE2b-256  eb1b6599ecf986b70e22f33ed88f9fb719f1f1acd55e3227cc02aff1bb9f278b


Provenance

The following attestation bundles were made for mmlearn-0.1.0b0.tar.gz:

Publisher: publish.yml on VectorInstitute/mmlearn

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file mmlearn-0.1.0b0-py3-none-any.whl.

File metadata

  • Download URL: mmlearn-0.1.0b0-py3-none-any.whl
  • Upload date:
  • Size: 111.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for mmlearn-0.1.0b0-py3-none-any.whl
Algorithm    Hash digest
SHA256       a95e3a66f5f5c55b2db07b0442d08417320a51f408af7415ce18d39aae32d81f
MD5          979613acc7f802c8ac5743a716b9ee02
BLAKE2b-256  6482e4329babbcf7fddc6f678f7b8412896e203782129d4ba34ec634f9130c8b


Provenance

The following attestation bundles were made for mmlearn-0.1.0b0-py3-none-any.whl:

Publisher: publish.yml on VectorInstitute/mmlearn

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
