
Extension of the Monge Gap to learn conditional optimal transport maps


Conditional Monge Gap

License: MIT


Overview

Conditional Monge Gap (CMonge) extends the Monge Gap to estimate optimal transport maps conditioned on arbitrary context vectors. It is based on a two-step training procedure that combines an encoder-decoder architecture with an OT estimator. The model is applied to 4i and scRNA-seq datasets.
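
For orientation, the Monge gap that this package extends can be written as below. This is the standard definition from the Monge gap literature rather than something stated in this README; the symbols (reference measure \rho, ground cost c, candidate map T, optimal transport cost W_c, push-forward T_\sharp\rho) are notation assumed here.

```latex
\mathcal{M}^{c}_{\rho}(T) \;=\; \int c\bigl(x, T(x)\bigr)\,\mathrm{d}\rho(x) \;-\; W_{c}\bigl(\rho,\; T_{\sharp}\rho\bigr)
```

The gap is non-negative, and it is zero when T is an optimal map for the cost c between \rho and its push-forward T_\sharp\rho. In the conditional setting, the map additionally takes the context vector as input.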

Systems and software requirements

Software package requirements and version information are defined in pyproject.toml and locked in uv.lock. The package has been tested on Python 3.10 and 3.11. Hardware requirements: enough memory (RAM) to hold the data and batches. A GPU is not required but accelerates computation. The software has been tested on HPC clusters and local machines (macOS).

Installation from PyPI

You can install this package as follows

pip install cmonge

which should take about two minutes on a laptop.
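
To confirm that the installation worked, a small check along these lines may help. It is a minimal sketch that only uses the Python standard library; the helper name `installed_version` is made up for this example and is not part of cmonge.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("cmonge")
    if version is None:
        print("cmonge is not installed in this environment")
    else:
        print(f"cmonge {version} is installed")
```

Run it with the same interpreter you used for `pip install`, otherwise the check may inspect a different environment.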

Development setup & installation

The package environment is managed by uv.

curl -LsSf https://astral.sh/uv/install.sh | sh
git clone https://github.com/AI4SCR/conditional-monge.git
cd conditional-monge
uv sync --frozen

If the installation was successful, you can run the tests using pytest:

uv run pytest

Data

The preprocessed version of the Sciplex3 and 4i datasets can be downloaded here.

Example usage

You can find a demo config in configs/demo_config.yml. To train an autoencoder and CMonge, you can use the following script (also provided in scripts/demo_train.py). Make sure that the paths in scripts/demo_train.py, configs/demo_config.yml, and configs/autoencoder-demo.yml point to the correct locations for reading and saving data, as well as for model checkpoints.

from loguru import logger

from cmonge.datasets.conditional_loader import ConditionalDataModule
from cmonge.trainers.ae_trainer import AETrainerModule
from cmonge.trainers.conditional_monge_trainer import ConditionalMongeTrainer
from cmonge.utils import load_config


logger_path = "logs/demo_logs.yml"
config_path = "configs/demo_config.yml"

config = load_config(config_path)
logger.info(f"Experiment: Training model on {config.condition.conditions}")


# Train an AE model to reduce data dimension
config.data.ae = True
config.data.reduction = None
datamodule = ConditionalDataModule(config.data, config.condition)
ae_trainer = AETrainerModule(config.ae)
ae_trainer.train(datamodule)

# Train conditional monge model
config.data.ae = False
config.ae.model.act_fn = "gelu"
config.data.reduction = "ae"
datamodule = ConditionalDataModule(config.data, config.condition, ae_config=config.ae)
trainer = ConditionalMongeTrainer(
    jobid=1, logger_path=logger_path, config=config.model, datamodule=datamodule
)
trainer.train(datamodule)
trainer.evaluate(datamodule)

This demo trains a CMonge model using an in-distribution data split. First, an autoencoder is trained to reduce the dimensionality of the data in data/dummy_data.h5ad. The autoencoder is checkpointed, and the resulting checkpoint can be found in models/demo. This example uses the RDKit fingerprint embedding, which computes a numerical embedding from the SMILES information provided in the data directory and saves it in models/embed/rdkit. Next, the conditional Monge model is trained and evaluated. The results of training and evaluation are saved in the logger file defined by the logger_path variable, which for this example is logs/demo_logs.yml. On a 2025 MacBook Air with an M4 chip, this demo takes only a few minutes.
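
Since the demo depends on several configured paths, it can be useful to check them before training. The sketch below assumes the config is available as a nested dictionary and uses a simple heuristic (keys containing "path", "dir", or "file") to decide which entries are paths. The helper names and the example keys are hypothetical and do not reflect the actual layout of the cmonge config files.

```python
from pathlib import Path
from typing import Any, Iterator, List, Tuple


def iter_path_entries(node: Any, prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (dotted_key, value) for string values whose key looks like a path."""
    if not isinstance(node, dict):
        return
    for key, value in node.items():
        dotted = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested sections of the config.
            yield from iter_path_entries(value, dotted)
        elif isinstance(value, str) and any(
            hint in str(key).lower() for hint in ("path", "dir", "file")
        ):
            yield dotted, value


def missing_paths(config: dict) -> List[Tuple[str, str]]:
    """Return the path-like entries that do not exist on disk."""
    return [(key, value) for key, value in iter_path_entries(config) if not Path(value).exists()]


if __name__ == "__main__":
    # Hypothetical keys, for illustration only.
    example_config = {
        "data": {"file_path": "data/dummy_data.h5ad"},
        "ae": {"checkpoint_dir": "models/demo"},
    }
    for key, value in missing_paths(example_config):
        print(f"{key}: {value} does not exist")
```

A missing output location is not necessarily an error, because it may be created during training; missing input files are the entries worth fixing first.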

Instructions for running on your own data

To run CMonge on your own data, you will probably need to adapt the dataloader so that your data is handled correctly. At a minimum, you need to implement your own single data loader; examples can be found in cmonge/datasets/single_loader.py. This single loader is passed to the conditional dataloader in cmonge/datasets/conditional_loader.py, where some adaptations might also be necessary. The conditional dataloader interacts with the CMonge model (and, if needed, the autoencoder). The hyperparameters of the CMonge neural network and the autoencoder can be defined in the config files.
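
As a rough illustration of what a single-condition loader has to provide, the sketch below separates control (source) cells from perturbed (target) cells, splits each into train and test sets, and samples paired batches. It is a standalone NumPy toy: the class `ToySingleLoader`, its arguments, and the label names are all hypothetical, and it does not implement the actual interface in cmonge/datasets/single_loader.py, which your own loader should follow.

```python
import numpy as np


class ToySingleLoader:
    """Hypothetical stand-in for a single-condition loader (not the cmonge interface)."""

    def __init__(self, features, labels, control, condition,
                 batch_size=32, train_frac=0.8, seed=0):
        features = np.asarray(features, dtype=np.float32)
        labels = np.asarray(labels)
        self.batch_size = batch_size
        self.rng = np.random.default_rng(seed)
        # Source = control cells, target = cells under the chosen condition.
        self.splits = {
            "source": self._split(features[labels == control], train_frac),
            "target": self._split(features[labels == condition], train_frac),
        }

    def _split(self, cells, train_frac):
        if len(cells) == 0:
            raise ValueError("no cells found for one of the requested labels")
        shuffled = self.rng.permutation(cells)  # shuffles rows, returns a copy
        n_train = int(train_frac * len(shuffled))
        return {"train": shuffled[:n_train], "test": shuffled[n_train:]}

    def sample(self, split="train"):
        """Return one (source_batch, target_batch) pair, sampled with replacement."""
        source = self.splits["source"][split]
        target = self.splits["target"][split]
        src_idx = self.rng.integers(0, len(source), size=self.batch_size)
        tgt_idx = self.rng.integers(0, len(target), size=self.batch_size)
        return source[src_idx], target[tgt_idx]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(200, 5))
    labels = np.array(["control"] * 100 + ["drug_a"] * 100)
    loader = ToySingleLoader(features, labels, control="control",
                             condition="drug_a", batch_size=16)
    source_batch, target_batch = loader.sample("train")
    print(source_batch.shape, target_batch.shape)
```

Source and target batches are sampled independently here because unpaired samples are the usual setting for learning transport maps; whether the real loaders batch the same way should be checked against the repository code.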

Loading older checkpoints

If you want to load model weights of older checkpoints (cmonge-{moa, rdkit}-ood or cmonge-{moa, rdkit}-homogeneous), make sure you are on the tag cmonge_checkpoint_loading.

git checkout cmonge_checkpoint_loading

Citation

If you use the package, please cite:

@article{driessen2025towards,
  title={Towards generalizable single-cell perturbation modeling via the Conditional Monge Gap},
  author={Driessen, Alice and Harsanyi, Benedek and Rapsomaniki, Marianna and Born, Jannis},
  journal={arXiv preprint arXiv:2504.08328},
  note={Preliminary version at ICLR 2024 Workshop on Machine Learning for Genomics Explorations},
  year={2025}
}
