A torch-based integration method for single-cell multi-omic data.

MIDAS: A Deep Generative Model for Mosaic Integration and Knowledge Transfer of Single-Cell Multimodal Data

MIDAS turns raw mosaic data into both imputed, batch-corrected data and disentangled latent representations, powering robust downstream analysis.


MIDAS is a deep probabilistic framework for mosaic integration and knowledge transfer of single-cell multimodal data. It addresses key challenges in single-cell analysis, such as modality alignment, batch effect removal, and data imputation. By combining self-supervised modality alignment with information-theoretic latent disentanglement, MIDAS transforms fragmented mosaic data into a complete, harmonized dataset ready for downstream analysis.

Whether you are working with transcriptomics (RNA), proteomics (ADT), or chromatin accessibility (ATAC), MIDAS provides a versatile solution to uncover deeper biological insights from complex, multi-source datasets.

✨ Key Features

  • Mosaic Data Integration: Seamlessly integrates datasets where different batches measure different sets of modalities (e.g., some samples have RNA and ATAC, while others have only RNA).
  • Multi-Modal Support: Natively supports RNA, ADT, and ATAC data, and can be easily configured to incorporate additional modalities.
  • Data Imputation: Accurately imputes missing modalities, turning incomplete data into a complete multi-modal matrix.
  • Batch Correction: Effectively removes technical variations between different batches, enabling consistent and reliable analysis across datasets.
  • Knowledge Transfer: Leverages a pre-trained reference atlas to enable flexible and accurate knowledge transfer to new query datasets.
  • Efficient and Scalable: Built on PyTorch Lightning for highly efficient model training, with support for advanced strategies like Distributed Data Parallel (DDP).
  • Advanced Visualization: Integrates with TensorBoard for real-time monitoring of training loss and UMAP visualizations.
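To make the notion of mosaic data concrete, here is a small plain-Python sketch of a batch-by-modality layout. It does not use the scmidas API; the batch names and the helper functions `modality_mask` and `missing_modalities` are hypothetical and only illustrate which modalities would need to be imputed for each batch.

```python
# Hypothetical mosaic layout: which modalities each batch measured.
# Batch names are illustrative and not taken from any MIDAS dataset.
MODALITIES = ("rna", "adt", "atac")

measured = {
    "batch_1": {"rna", "atac"},
    "batch_2": {"rna"},
    "batch_3": {"rna", "adt"},
}


def modality_mask(measured, modalities=MODALITIES):
    """Return a batch -> {modality: bool} table marking observed modalities."""
    return {
        batch: {m: (m in present) for m in modalities}
        for batch, present in measured.items()
    }


def missing_modalities(measured, modalities=MODALITIES):
    """List, per batch, the modalities that would have to be imputed."""
    return {
        batch: [m for m in modalities if m not in present]
        for batch, present in measured.items()
    }


mask = modality_mask(measured)
to_impute = missing_modalities(measured)

for batch, missing in to_impute.items():
    print(batch, "missing:", missing or "nothing")
```

In this toy layout every batch shares RNA, while ADT and ATAC are each observed in only one batch. The first feature above describes this kind of partial overlap.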

🚀 Installation

Get started with MIDAS by setting up a conda environment.

# 1. Create and activate a new conda environment
conda create -n scmidas python=3.12
conda activate scmidas

# 2. Install MIDAS from PyPI
pip install scmidas
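After installation, you may want to confirm that the package is visible to the active environment. The following sketch uses only the Python standard library (`importlib.metadata`); the helper name `installed_version` is our own and is not part of scmidas.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


scmidas_version = installed_version("scmidas")
print(f"scmidas version: {scmidas_version or 'not installed'}")
```

If this prints "not installed", check that the `scmidas` conda environment is activated before running `pip install`.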

⚡ Getting Started: A Quick Example

Here is a minimal example to get you started with a mosaic integration task. For more detailed tutorials, please refer to our documentation.

from scmidas.config import load_config
from scmidas.model import MIDAS
import lightning as L

# 1. Load the configuration
# The configuration allows you to specify modalities, layers, and other parameters.
configs = load_config()

# 2. Load your mosaic dataset and initialize the MIDAS model
# The input should be an AnnData object where modalities are stored.
# Different batches can have different combinations of modalities.
model = MIDAS.configure_data_from_dir(configs, 'path/to/your/data', transform={'atac':'binarize'})

# 3. Train the model on your data
trainer = L.Trainer(max_epochs=2000)
trainer.fit(model=model)

# 4. Obtain the integrated and imputed results
# The model returns an AnnData object with a unified latent space 
# and imputed values for the missing modalities.
pred = model.predict()

# 5. Visualize the results
model.get_emb_umap()
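The example above passes `transform={'atac':'binarize'}` when loading the data. As a rough illustration of what binarizing an ATAC count matrix usually means, the sketch below maps every positive count to 1 and every zero to 0. This is an assumption about the common meaning of the term, not a description of how scmidas implements the transform, and the `binarize` helper and the toy matrix are hypothetical.

```python
def binarize(counts):
    """Map each count to 1 if it is positive, else 0 (assumed meaning)."""
    return [[1 if value > 0 else 0 for value in row] for row in counts]


# Toy cells x peaks count matrix with made-up values.
atac_counts = [
    [0, 2, 0, 5],
    [1, 0, 0, 0],
]

atac_binary = binarize(atac_counts)
print(atac_binary)
```

On real data one would typically apply the same rule with a vectorized array operation rather than nested lists.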

📈 Reproducibility

To reproduce the results from our publication, please visit the reproducibility branch of this repository: github.com/labomics/midas/tree/reproducibility

📜 Citation

If you use MIDAS in your research, please cite our paper:

He, Z., Hu, S., Chen, Y. et al. Mosaic integration and knowledge transfer of single-cell multimodal data with MIDAS. Nat Biotechnol (2024). https://doi.org/10.1038/s41587-023-02040-y

@article{he2024mosaic,
  title={Mosaic integration and knowledge transfer of single-cell multimodal data with MIDAS},
  author={He, Zhen and Hu, Shuofeng and Chen, Yaowen and An, Sijing and Zhou, Jiahao and Liu, Runyan and Shi, Junfeng and Wang, Jing and Dong, Guohua and Shi, Jinhui and others},
  journal={Nature Biotechnology},
  pages={1--12},
  year={2024},
  publisher={Nature Publishing Group US New York}
}

🙌 Contributing

We welcome contributions from the community! If you have a suggestion, bug report, or want to contribute to the code, please feel free to open an issue or submit a pull request.

📝 License

MIDAS is available under the MIT License.
