A torch-based integration method for single-cell multi-omic data.
MIDAS
MIDAS turns raw mosaic single-cell multimodal data into imputed, batch-corrected matrices and disentangled latent representations.
Documentation: scmidas.readthedocs.io
Key features
- Mosaic integration — handle datasets where different batches measure different modality combinations (e.g. some batches have RNA + ATAC, others only RNA).
- Multi-modal support — RNA, ADT, ATAC out of the box; configurable for additional modalities.
- Imputation — fill in missing modalities with model-derived values.
- Batch correction — remove technical variation across batches.
- Knowledge transfer — fine-tune a pre-trained reference model on a query dataset.
- Multi-GPU training — built on PyTorch Lightning, with DDP support for mosaic data.
- TensorBoard integration — live training loss and UMAP visualisations.
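To make the mosaic idea concrete, here is a minimal sketch in plain Python that does not use the scmidas API. The batch names and the `measured` layout are hypothetical. It lists, for each batch, the modalities that other batches measured but this one did not, which are the ones imputation would need to fill in.

```python
# Hypothetical mosaic layout: which modalities each batch actually measured.
measured = {
    "batch1": {"rna", "atac"},
    "batch2": {"rna"},
    "batch3": {"rna", "adt"},
}


def missing_modalities(measured):
    """Return, per batch, the modalities seen in other batches but absent here."""
    # Union of every modality observed anywhere in the dataset.
    all_modalities = set().union(*measured.values())
    return {
        batch: sorted(all_modalities - mods)
        for batch, mods in measured.items()
    }


gaps = missing_modalities(measured)
for batch, mods in gaps.items():
    print(batch, "would need imputation for:", mods or "nothing")
```

In this toy layout every batch has at least one gap, which is the situation mosaic integration is meant to handle.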
Installation
conda create -n scmidas python=3.12
conda activate scmidas
pip install scmidas
Quick start
A bundled 1600-cell PBMC RNA+ADT mosaic dataset lets you go from pip install to a UMAP in about a minute on a single GPU — no extra downloads, no config files. Click the Colab badge to run it without installing anything, or copy the snippet:
```python
import scanpy as sc
import scmidas

mdata = scmidas.datasets.quickstart()       # bundled toy MuData
model = scmidas.integrate(mdata)            # ~1 min on a mid-range GPU; writes mdata.obsm['X_midas']
sc.pp.neighbors(mdata, use_rep='X_midas')   # any scanpy downstream works directly
```
This produces lineage-separated clusters that mix freely across batches.
⚠️ scmidas.integrate() defaults are tuned for the bundled toy dataset. For your own data, override max_epochs (1000-2000 is typical) and consider letting batch_size default back to 256, e.g. scmidas.integrate(my_mdata, max_epochs=2000, batch_size=256). See the full demos for end-to-end pipelines on real-sized data, including imputation, batch correction, and cross-modality translation.
Bring your own data
If you already have one AnnData object per modality, the bridge to MIDAS fits in one notebook cell:
```python
import mudata as mu
import scmidas

mdata = mu.MuData({'rna': adata_rna, 'adt': adata_adt})  # share a 'batch' obs column
scmidas.MIDAS.setup_mudata(mdata, batch_key='batch')
model = scmidas.MIDAS(mdata)
model.train(max_epochs=2000)
mdata.obsm['X_midas'] = model.get_latent_representation()
```
For a full scanpy-native walkthrough (download a 10x CITE-seq sample → QC → HVG → MuData → MIDAS), see Preparing your data. For the data contract (what goes where in the MuData), see Data layout.
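Since the snippet above relies on every modality sharing a 'batch' obs column, a small pre-flight check can catch a missing column before training starts. The sketch below is not part of scmidas: `modalities_missing_key` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real AnnData objects so the example runs without extra dependencies. It assumes only that `key in adata.obs` tests for a column, which holds for the pandas DataFrame that AnnData uses for `.obs`.

```python
from types import SimpleNamespace


def modalities_missing_key(modalities, batch_key="batch"):
    """Return the names of modalities whose .obs has no `batch_key` column.

    Hypothetical helper: works on anything exposing an `.obs` that supports
    `key in obs` (a pandas DataFrame, as in AnnData, or a plain dict).
    """
    return [
        name
        for name, adata in modalities.items()
        if batch_key not in adata.obs
    ]


# Stand-ins for AnnData objects; real code would pass adata_rna, adata_adt.
rna = SimpleNamespace(obs={"batch": ["b1", "b2"]})
adt = SimpleNamespace(obs={"sample": ["s1", "s2"]})  # no 'batch' column

problems = modalities_missing_key({"rna": rna, "adt": adt})
print(problems)
```

An empty list means every modality carries the column and the MuData can be built as shown above.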
Reproducibility
Code and data to reproduce the results in the paper live on the reproducibility branch.
Citation
If you use MIDAS in your research, please cite:
```bibtex
@article{he2024mosaic,
  title     = {Mosaic integration and knowledge transfer of single-cell multimodal data with {MIDAS}},
  author    = {He, Zhen and Hu, Shuofeng and Chen, Yaowen and An, Sijing
               and Zhou, Jiahao and Liu, Runyan and Shi, Junfeng and Wang, Jing
               and Dong, Guohua and Shi, Jinhui and Zhao, Jiaxin and Ou-Yang, Le
               and Zhu, Yuan and Bo, Xiaochen and Ying, Xiaomin},
  journal   = {Nature Biotechnology},
  volume    = {42},
  number    = {10},
  pages     = {1594--1605},
  year      = {2024},
  doi       = {10.1038/s41587-023-02040-y},
  publisher = {Nature Publishing Group}
}
```
Contributing
Bug reports and feature requests: please open a GitHub issue. For code contributions, branch from main, make sure pytest tests/ passes, and open a pull request — for non-trivial changes, an issue first to discuss the design is appreciated.
License
MIT.
File details
Details for the file scmidas-0.3.0.tar.gz.
File metadata
- Download URL: scmidas-0.3.0.tar.gz
- Upload date:
- Size: 504.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ef4da333c81dd8535f67148ae0fae30eb603349bbd79e234c64a53e306c9cab4 |
| MD5 | ca112cb0e4720ece917f920308bc8c96 |
| BLAKE2b-256 | 012267a9e8fb44ece3ea8e6eb4a2b2ce83fc26319f59474df2a77cffe2d3a290 |
File details
Details for the file scmidas-0.3.0-py3-none-any.whl.
File metadata
- Download URL: scmidas-0.3.0-py3-none-any.whl
- Upload date:
- Size: 535.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 24df81372decc6d3edf6c2481d7b0c58f4685386d2ac938564946916f1bd9522 |
| MD5 | 175964e11c4c5b8bbeb4b030e86044b1 |
| BLAKE2b-256 | 78cbbf09c17b06f6f05dfa20a497c0d10d85ae0e434ec9cc3f851dfacdfb2b68 |