A torch-based integration method for single-cell multi-omic data.

MIDAS

MIDAS turns raw mosaic single-cell multimodal data into imputed, batch-corrected matrices and disentangled latent representations.

CI Status PyPI version Documentation Status License Open In Colab GitHub Stars

Documentation: scmidas.readthedocs.io

Key features

  • Mosaic integration — handle datasets where different batches measure different modality combinations (e.g. some batches have RNA + ATAC, others only RNA).
  • Multi-modal support — RNA, ADT, ATAC out of the box; configurable for additional modalities.
  • Imputation — fill in missing modalities with model-derived values.
  • Batch correction — remove technical variation across batches.
  • Knowledge transfer — fine-tune a pre-trained reference model on a query dataset.
  • Multi-GPU training — built on PyTorch Lightning, with DDP support for mosaic data.
  • TensorBoard integration — live training loss and UMAP visualisations.
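To make the mosaic idea concrete, here is a minimal sketch that records which modalities each batch measured and lists what would need to be imputed. The batch names and the dictionary layout are illustrative assumptions, not a scmidas data structure or API.

```python
# Illustrative only: batch names and this layout are hypothetical,
# not part of scmidas.
ALL_MODALITIES = ("rna", "adt", "atac")

# Which modalities each batch actually measured (a "mosaic" layout).
measured = {
    "batch_0": {"rna", "atac"},
    "batch_1": {"rna"},
    "batch_2": {"rna", "adt"},
}


def missing_modalities(measured, all_modalities=ALL_MODALITIES):
    """Return, per batch, the modalities that were not measured."""
    return {
        batch: [m for m in all_modalities if m not in mods]
        for batch, mods in measured.items()
    }


def is_mosaic(measured):
    """A dataset is mosaic when batches differ in their modality sets."""
    return len({frozenset(mods) for mods in measured.values()}) > 1


missing = missing_modalities(measured)
print(missing)
print("mosaic:", is_mosaic(measured))
```

In this toy layout every batch has RNA, so RNA can anchor the integration while ADT and ATAC are the candidates for imputation in the batches that lack them.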

Installation

conda create -n scmidas python=3.12
conda activate scmidas
pip install scmidas
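
One way to confirm the install succeeded is to query the installed version from Python. This sketch uses only the standard library; the helper name installed_version is hypothetical and not part of scmidas.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="scmidas"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# None here means the package is not visible in the active environment.
print(installed_version())
```

If this prints None, check that the scmidas conda environment is activated before running pip install.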

Quick start

A bundled 1600-cell PBMC RNA+ADT mosaic dataset lets you go from pip install to a UMAP in about a minute on a single GPU — no extra downloads, no config files. Click the Colab badge to run it without installing anything, or copy the snippet:

import scmidas

mdata = scmidas.datasets.quickstart()      # bundled toy MuData
model = scmidas.integrate(mdata)           # ~1 min on a mid-range GPU
out   = model.predict(joint_latent=True)   # latent embeddings per batch

This produces lineage-separated clusters that mix freely across batches:

[Figure: quickstart UMAP]
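
"Mixing freely across batches" can be checked roughly with a nearest-neighbour score. The sketch below is not part of scmidas and is not how the authors evaluate the model. It is a toy, pure-Python illustration run on hand-made 2-D points rather than real latent embeddings, and the function name batch_mixing is an assumption.

```python
import math


def batch_mixing(embeddings, batches, k=1):
    """Mean fraction of each cell's k nearest neighbours from another batch.

    0.0 means neighbours always share the cell's batch (no mixing);
    values near 1.0 mean neighbours mostly come from other batches.
    """
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        # Distances from cell i to every other cell, nearest first.
        dists = sorted(
            (math.dist(embeddings[i], embeddings[j]), j)
            for j in range(n)
            if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        total += sum(batches[j] != batches[i] for j in neighbours) / k
    return total / n


# Toy case 1: batches occupy separate regions (poor mixing).
separated = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
separated_batches = ["a", "a", "b", "b"]

# Toy case 2: each cell's closest neighbour is from the other batch.
mixed = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
mixed_batches = ["a", "b", "a", "b"]

print(batch_mixing(separated, separated_batches))
print(batch_mixing(mixed, mixed_batches))
```

On real data this brute-force loop is quadratic in the number of cells, so a k-nearest-neighbour index would be the practical choice.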

⚠️ The scmidas.integrate() defaults are tuned for the bundled toy dataset. For your own data, override max_epochs (1000-2000 is typical) and consider resetting batch_size to 256, for example scmidas.integrate(my_mdata, max_epochs=2000, batch_size=256). See the full demos for end-to-end pipelines on real-sized data, including imputation, batch correction, and cross-modality translation.
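
The overrides mentioned in the warning can be kept in one place so every run on real data uses the same settings. In this sketch the max_epochs and batch_size values and the scmidas.integrate(...) call come from the text above; the helper names build_integrate_kwargs and integrate_real_data are hypothetical conveniences, not part of scmidas.

```python
# Values taken from the warning above; adjust them to your dataset.
REAL_DATA_OVERRIDES = {"max_epochs": 2000, "batch_size": 256}


def build_integrate_kwargs(**extra):
    """Merge the real-data overrides with caller-supplied keyword arguments.

    Caller-supplied values take precedence over the shared overrides.
    """
    return {**REAL_DATA_OVERRIDES, **extra}


def integrate_real_data(mdata, **extra):
    """Hypothetical wrapper: call scmidas.integrate with the merged overrides."""
    import scmidas  # imported lazily so this module loads without scmidas

    return scmidas.integrate(mdata, **build_integrate_kwargs(**extra))


print(build_integrate_kwargs(max_epochs=1000))
```

Calling integrate_real_data(my_mdata) would then be equivalent to the scmidas.integrate(my_mdata, max_epochs=2000, batch_size=256) call shown in the warning.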

Reproducibility

Code and data to reproduce the results in the paper live on the reproducibility branch.

Citation

If you use MIDAS in your research, please cite:

@article{he2024mosaic,
  title   = {Mosaic integration and knowledge transfer of single-cell multimodal data with {MIDAS}},
  author  = {He, Zhen and Hu, Shuofeng and Chen, Yaowen and An, Sijing
             and Zhou, Jiahao and Liu, Runyan and Shi, Junfeng and Wang, Jing
             and Dong, Guohua and Shi, Jinhui and Zhao, Jiaxin and Ou-Yang, Le
             and Zhu, Yuan and Bo, Xiaochen and Ying, Xiaomin},
  journal = {Nature Biotechnology},
  volume  = {42},
  number  = {10},
  pages   = {1594--1605},
  year    = {2024},
  doi     = {10.1038/s41587-023-02040-y},
  publisher = {Nature Publishing Group}
}

Contributing

For bug reports and feature requests, please open a GitHub issue. For code contributions, branch from main, make sure pytest tests/ passes, and open a pull request. For non-trivial changes, please open an issue first to discuss the design.

License

MIT.
