A torch-based integration method for single-cell multi-omic data.

MIDAS: A Deep Generative Model for Mosaic Integration and Knowledge Transfer of Single-Cell Multimodal Data

MIDAS turns raw mosaic data into both imputed, batch-corrected data and disentangled latent representations, powering robust downstream analysis.

MIDAS is a deep probabilistic framework for mosaic integration and knowledge transfer of single-cell multimodal data. It addresses three key challenges in single-cell analysis: modality alignment, batch-effect removal, and data imputation. Using self-supervised modality alignment and information-theoretic latent disentanglement, MIDAS transforms fragmented mosaic data into a complete, harmonized dataset ready for downstream analysis.

Whether you are working with transcriptomics (RNA), proteomics (ADT), or chromatin accessibility (ATAC), MIDAS provides a versatile solution to uncover deeper biological insights from complex, multi-source datasets.

✨ Key Features

  • Mosaic Data Integration: Integrates datasets in which different batches measure different sets of modalities (e.g., some samples have RNA and ATAC, while others have only RNA).
  • Multi-Modal Support: Natively supports RNA, ADT, and ATAC data, and can be configured to incorporate additional modalities.
  • Data Imputation: Imputes missing modalities, turning incomplete data into a complete multi-modal matrix.
  • Batch Correction: Removes technical variation between batches, enabling consistent analysis across datasets.
  • Knowledge Transfer: Uses a pre-trained reference atlas to transfer knowledge to new query datasets.
  • Efficient and Scalable: Built on PyTorch Lightning for efficient model training, with support for strategies such as Distributed Data Parallel (DDP).
  • Advanced Visualization: Integrates with TensorBoard for real-time monitoring of training loss and UMAP visualizations.
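
To make the idea of mosaic data concrete, the sketch below describes which modalities each batch measured and derives a presence mask plus the list of modalities that would need imputing. It is purely illustrative: the batch names, the `MODALITIES` tuple, and both helper functions are hypothetical and are not part of the scmidas API.

```python
# Illustrative only: a hypothetical mosaic layout, not part of the scmidas API.
MODALITIES = ("rna", "adt", "atac")

# Which modalities each batch actually measured (made-up example batches).
batches = {
    "batch_1": {"rna", "atac"},
    "batch_2": {"rna"},
    "batch_3": {"rna", "adt"},
}


def modality_mask(measured):
    """Return a 0/1 tuple marking which of MODALITIES a batch measured."""
    return tuple(int(m in measured) for m in MODALITIES)


def missing_modalities(measured):
    """List the modalities an imputation step would need to fill in."""
    return [m for m in MODALITIES if m not in measured]


for name, measured in batches.items():
    print(name, modality_mask(measured), "missing:", missing_modalities(measured))
```

In this toy layout every batch shares RNA, while ADT and ATAC are each present in only one batch. That is the kind of partial overlap that mosaic integration is meant to handle.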

🚀 Installation

Get started with MIDAS by setting up a conda environment.

# 1. Create and activate a new conda environment
conda create -n scmidas python=3.12
conda activate scmidas

# 2. Install MIDAS from PyPI
pip install scmidas
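
After installing, one way to confirm that the package is visible in the active environment is to query its metadata with the Python standard library. This is a minimal sketch; the helper name `installed_version` is an illustrative choice and does not come from scmidas.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report whether scmidas is installed in the current environment.
found = installed_version("scmidas")
print("scmidas version:", found if found else "not installed")
```

If this prints "not installed", check that the `scmidas` conda environment is activated before running `pip install`.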

⚡ Getting Started: A Quick Example

Here is a minimal example to get you started with a mosaic integration task. For more detailed tutorials, please refer to our documentation.

from scmidas.config import load_config
from scmidas.model import MIDAS
import lightning as L

# 1. Load the configuration
# The configuration allows you to specify modalities, layers, and other parameters.
configs = load_config()

# 2. Load your mosaic dataset and initialize the MIDAS model
# The input should be an AnnData object where modalities are stored.
# Different batches can have different combinations of modalities.
model = MIDAS.configure_data_from_dir(configs, 'path/to/your/data', transform={'atac': 'binarize'})

# 3. Train the model on your data
trainer = L.Trainer(max_epochs=2000)
trainer.fit(model=model)

# 4. Obtain the integrated and imputed results
# The model returns an AnnData object with a unified latent space 
# and imputed values for the missing modalities.
pred = model.predict()

# 5. Visualize the results
model.get_emb_umap()
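
The `transform={'atac':'binarize'}` argument above suggests that ATAC counts are binarized before training. As a rough illustration of what binarization usually means for accessibility data, the sketch below maps every positive count to 1 and every zero to 0. This is an assumption about the general technique, not a copy of the scmidas implementation, which may differ; the `binarize` function and the toy matrix are hypothetical.

```python
def binarize(counts):
    """Map each count to 1 if it is positive, else 0 (illustrative helper)."""
    return [[1 if value > 0 else 0 for value in row] for row in counts]


# Toy cells-by-peaks count matrix (made-up values).
atac_counts = [
    [0, 3, 1, 0],
    [2, 0, 0, 5],
]

binary = binarize(atac_counts)
print(binary)
```

Binarizing discards the count magnitude and keeps only whether a peak was observed as open in a cell, which is a common simplification for sparse ATAC data.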

📈 Reproducibility

To reproduce the results from our publication, see the reproducibility branch of this repository: https://github.com/labomics/midas/tree/reproducibility

📜 Citation

If you use MIDAS in your research, please cite our paper:

He, Z., Hu, S., Chen, Y. et al. Mosaic integration and knowledge transfer of single-cell multimodal data with MIDAS. Nat Biotechnol (2024). https://doi.org/10.1038/s41587-023-02040-y

@article{he2024mosaic,
  title={Mosaic integration and knowledge transfer of single-cell multimodal data with MIDAS},
  author={He, Zhen and Hu, Shuofeng and Chen, Yaowen and An, Sijing and Zhou, Jiahao and Liu, Runyan and Shi, Junfeng and Wang, Jing and Dong, Guohua and Shi, Jinhui and others},
  journal={Nature Biotechnology},
  pages={1--12},
  year={2024},
  publisher={Nature Publishing Group US New York}
}

🙌 Contributing

We welcome contributions from the community! If you have a suggestion, bug report, or want to contribute to the code, please feel free to open an issue or submit a pull request.

📝 License

MIDAS is available under the MIT License.
