Cell DISentangled Experts for Covariate counTerfactuals (CellDISECT). Causal generative model designed to disentangle known covariate variations from unknown ones at test time while simultaneously learning to make counterfactual predictions.

ℹ️ Beta Version Available: A beta version with compatibility for Google Colab and newer versions of torch and scvi-tools is available on the beta-colab branch. Install it with pip install celldisect==0.2.0b1.

🧬 Overview

CellDISECT (Cell DISentangled Experts for Covariate counTerfactuals) is a powerful causal generative model that enhances single-cell analysis by:

  • 🔍 Disentangling Variations: Separates known covariate variations from unknown ones at test time
  • 🧪 Counterfactual Predictions: Learns to make accurate counterfactual predictions
  • 🎯 Flexible Fairness: Achieves flexible fairness through expert models for each latent space
  • 🔬 Enhanced Discovery: Captures both covariate-specific information and novel biological insights

📚 Documentation

Visit our comprehensive documentation for:

  • Detailed API reference
  • Step-by-step tutorials
  • Best practices and examples
  • Advanced usage guides

🚀 Quick Start

Prerequisites

We recommend using Anaconda/Miniconda. Create and activate a new environment:

conda create -n CellDISECT python=3.9
conda activate CellDISECT

Installation

  1. Install PyTorch (tested with PyTorch 2.1.2 and CUDA 12):
conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1 -c pytorch -c nvidia
  2. Install CellDISECT:
# Via pip (stable version)
pip install celldisect

# Or via GitHub (latest development version)
pip install git+https://github.com/Lotfollahi-lab/CellDISECT
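
After installing, it can help to confirm which versions actually ended up in the environment. The snippet below is a minimal sketch that uses only the Python standard library. The helper name `installed_versions` is hypothetical (not part of CellDISECT), and the distribution names `torch`, `scvi-tools`, and `celldisect` are assumed to match how the packages register themselves in your environment.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to your environment.
    report = installed_versions(["torch", "scvi-tools", "celldisect"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

A `None` entry means the package metadata was not found in the active environment, which usually points to the wrong conda environment being active.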

Optional Dependencies

RAPIDS/rapids-singlecell:

pip install \
    --extra-index-url=https://pypi.nvidia.com \
    cudf-cu12==24.4.* dask-cudf-cu12==24.4.* cuml-cu12==24.4.* \
    cugraph-cu12==24.4.* cuspatial-cu12==24.4.* cuproj-cu12==24.4.* \
    cuxfilter-cu12==24.4.* cucim-cu12==24.4.* pylibraft-cu12==24.4.* \
    raft-dask-cu12==24.4.* cuvs-cu12==24.4.*

pip install rapids-singlecell

CUDA-enabled JAX:

pip install -U "jax[cuda12_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

📖 Tutorials & Examples

Basic Tutorials

Tutorial | Description | Links
Basic Training | Learn how to train CellDISECT and make counterfactual predictions using the Kang dataset | Open In Colab, Documentation

Perturbation Prediction

Tutorial | Description | Links
Perturbation Prediction | Predict gene expression under seen, unseen, and combinatorial perturbations using predefined embeddings (GenePT, ESM) | Documentation

Advanced Applications

Tutorial | Description | Links
Latent Space Analysis | Explore combinations of CellDISECT latent spaces for erythroid subset inference | Documentation
Double Counterfactual | Advanced tutorial recreating Scenario 2 counterfactual on the Eraslan dataset | Documentation

🧪 Perturbation Prediction

CellDISECT supports perturbation prediction using predefined gene embeddings (e.g. GenePT, ESM). This enables prediction for unseen perturbations and combinatorial perturbations.
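
Because predictions rely on predefined embeddings, it may be worth checking that every gene named in your perturbation labels has an entry before training. The sketch below is a dependency-free illustration, not part of the CellDISECT API. The helper names are hypothetical, and it assumes that combinatorial labels join gene names with `+` (as in `GeneA+GeneB`) and that `ctrl` marks control cells.

```python
def missing_embeddings(perturbation_labels, embeddings, control="ctrl", sep="+"):
    """Return the sorted gene names that appear in labels but have no embedding.

    Assumes combinatorial labels join gene names with `sep` and that
    `control` marks unperturbed cells; both are assumptions.
    """
    genes = set()
    for label in perturbation_labels:
        if label == control:
            continue
        genes.update(part.strip() for part in label.split(sep))
    return sorted(genes - set(embeddings))


def embedding_dims(embeddings):
    """Return the set of vector lengths in the dict (ideally a single value)."""
    return {len(vec) for vec in embeddings.values()}


# Toy data for illustration only; plain lists stand in for 1-D arrays.
labels = ["ctrl", "GeneA", "GeneB", "GeneA+GeneB", "GeneA+GeneC"]
toy_embeddings = {"GeneA": [0.1, 0.2, 0.3], "GeneB": [0.0, 0.5, 0.1]}

print(missing_embeddings(labels, toy_embeddings))  # ['GeneC']
print(embedding_dims(toy_embeddings))              # {3}
```

With real data, the labels would typically come from the perturbation column of `adata.obs`, and more than one value in `embedding_dims` would suggest that embeddings from different sources were mixed.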

import numpy as np
from celldisect import CellDISECT, perturbation_metrics

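# Assumes `adata` (an AnnData object with raw counts in adata.layers['counts'])
# and `gene_embeddings` (dict: gene_name -> np.ndarray) are already defined.
# Store predefined embeddings in adata.uns
adata.uns['pert_embeddings'] = gene_embeddings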

# Setup with perturbation support
CellDISECT.setup_anndata(
    adata,
    layer='counts',
    categorical_covariate_keys=['cell_type', 'perturbation'],
    perturbation_key='perturbation',
    perturbation_embedding_key='pert_embeddings',
)

# Train the model
model = CellDISECT(adata, n_latent_shared=32, n_latent_attribute=32)
model.train(max_epochs=200)

# Predict seen, unseen, or combinatorial perturbations
x_ctrl, x_true, x_pred = model.predict_perturbation(
    adata,
    perturbation='GeneA+GeneB',
    source_perturbation='ctrl',
    cats=['cell_type', 'perturbation'],
    perturbation_key='perturbation',
)

# Evaluate
metrics = perturbation_metrics(x_pred.numpy(), x_true.numpy(), x_ctrl.numpy())

See the perturbation prediction tutorial for a full walkthrough.
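
To give a sense of how predicted, true, and control expression can be compared, the sketch below computes one metric that is common in perturbation benchmarks: the Pearson correlation between the predicted and observed mean expression shifts relative to control. This is an illustration only and is not claimed to be what `perturbation_metrics` computes. The function names are hypothetical, and inputs are assumed to be cells-by-genes matrices (nested lists here, though 2-D arrays iterate the same way).

```python
import math


def mean_expression(cells):
    """Average each gene (column) over all cells (rows)."""
    return [sum(col) / len(col) for col in zip(*cells)]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


def delta_pearson(x_pred, x_true, x_ctrl):
    """Correlate predicted and observed mean shifts from control, per gene."""
    ctrl = mean_expression(x_ctrl)
    pred_delta = [p - c for p, c in zip(mean_expression(x_pred), ctrl)]
    true_delta = [t - c for t, c in zip(mean_expression(x_true), ctrl)]
    return pearson(pred_delta, true_delta)


# Toy matrices for illustration: 2 cells x 3 genes.
toy_ctrl = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
toy_true = [[2.0, 1.0, 3.0], [2.0, 1.0, 3.0]]
print(delta_pearson(toy_true, toy_true, toy_ctrl))  # 1.0 for a perfect prediction
```

Subtracting the control mean first keeps the score from being dominated by baseline expression levels, which would otherwise make almost any prediction look highly correlated with the truth.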

🤝 Contributing

We welcome contributions! Please see our contributing guidelines for details on how to:

  • Report issues
  • Submit bug fixes
  • Propose new features
  • Submit pull requests

📜 License

This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.

📫 Contact

For questions and support:

📝 Citation

If you use CellDISECT in your research, please cite our paper:

Megas, S., Amani, A., Rose, A., Dufva, O., Shamsaie, K., Asadollahzadeh, H., Polanski, K., Haniffa, M., Teichmann, S. A., & Lotfollahi, M. (2025). Integrating multi-covariate disentanglement with counterfactual analysis on synthetic data enables cell type discovery and counterfactual predictions. bioRxiv. https://doi.org/10.1101/2025.06.03.657578

@article{Megas2025CellDISECT,
    title={Integrating multi-covariate disentanglement with counterfactual analysis on synthetic data enables cell type discovery and counterfactual predictions},
    author={Megas, Stathis and Amani, Arian and Rose, Antony and Dufva, Olli and Shamsaie, Kian and Asadollahzadeh, Hesam and Polanski, Krzysztof and Haniffa, Muzlifah and Teichmann, Sarah Amalia and Lotfollahi, Mohammad},
    journal={bioRxiv},
    year={2025},
    doi={10.1101/2025.06.03.657578},
    elocation-id={2025.06.03.657578},
    publisher={Cold Spring Harbor Laboratory},
    URL={https://www.biorxiv.org/content/10.1101/2025.06.03.657578v1}
}
