PRISM: Probabilistic Inference of Subject-level Mixture for contextualized differential expression

PRISM

Phenotype-Resolved Inference in Single-Cell Mixed Models via Latent Disease States and Contextualized Differential Expression

▶ Full model explainer (1080p, ~1 min)

Overview

PRISM extends the NEBULA negative-binomial log-normal mixed model for single-cell differential expression by introducing:

  1. Latent per-cell disease states $d_{ij} \sim \text{Bernoulli}(\rho_i)$ that separate truly affected cells from unaffected ones within disease subjects.
  2. Contextualized covariate effects $x^\top \Gamma_g z$ that model how context modulates all covariate effects (a main effect of context on expression).
  3. Context-dependent disease effects $\Delta_g(z) = \alpha_g + \theta_g^\top z$ that let the DE magnitude vary with continuous cell-level covariates ($\alpha_g$ = constant disease effect, $\theta_g$ = context modulation).
  4. An EM algorithm that alternates between inferring $q_{ij} = P(d_{ij}=1 \mid Y, \Theta)$ (E-step) and optimizing the NEBULA-LN approximate likelihood weighted by $q_{ij}$ (M-step).
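The E-step in item 4 can be illustrated with a minimal, standard-library sketch. It is not PRISM's implementation: the function names, the per-gene baseline `base_log_mu` (standing in for offsets, covariate terms and subject random effects), and the assumption that genes are conditionally independent given $d_{ij}$ are all hypothetical choices made for illustration. It evaluates $\Delta_g(z) = \alpha_g + \theta_g^\top z$ and applies Bayes' rule to the two-component negative-binomial mixture.

```python
import math


def nb_logpmf(y, mu, phi):
    """Log NB pmf with mean mu and dispersion phi (variance = mu + phi * mu**2)."""
    r = 1.0 / phi  # NB "size" parameter
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1.0)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))


def disease_effect(alpha_g, theta_g, z):
    """Delta_g(z) = alpha_g + theta_g^T z for one gene and one cell context z."""
    return alpha_g + sum(t * zk for t, zk in zip(theta_g, z))


def e_step_cell(y, base_log_mu, alpha, theta, z, phi, rho):
    """Posterior q = P(d = 1 | y) for one cell of a disease subject.

    y, base_log_mu, alpha, theta and phi are per-gene lists. Genes are treated
    as conditionally independent given d (an assumption of this sketch).
    """
    log_p1 = math.log(rho)      # log prior of the affected state
    log_p0 = math.log1p(-rho)   # log prior of the unaffected state
    for g in range(len(y)):
        delta = disease_effect(alpha[g], theta[g], z)
        log_p1 += nb_logpmf(y[g], math.exp(base_log_mu[g] + delta), phi[g])
        log_p0 += nb_logpmf(y[g], math.exp(base_log_mu[g]), phi[g])
    # Normalize in log space to avoid underflow.
    m = max(log_p1, log_p0)
    w1, w0 = math.exp(log_p1 - m), math.exp(log_p0 - m)
    return w1 / (w1 + w0)


# With no disease effect the counts carry no information, so q equals rho.
q_null = e_step_cell([3, 0], [0.5, -1.0], [0.0, 0.0], [[0.0], [0.0]],
                     [1.0], [0.5, 0.5], rho=0.6)

# A high count on a gene with a positive disease effect pushes q above rho;
# a zero count on the same gene pushes it below rho.
q_high = e_step_cell([20], [1.0], [1.5], [[0.0]], [0.0], [0.2], rho=0.6)
q_low = e_step_cell([0], [1.0], [1.5], [[0.0]], [0.0], [0.2], rho=0.6)
```

In the M-step, each cell's contribution to the likelihood would then be weighted by its $q_{ij}$, as described above.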

Installation

Option A — conda (recommended)

git clone https://github.com/AndreaRubbi/ContextualizedDifferentialExpression.git && cd ContextualizedDifferentialExpression
conda env create -f environment.yml
conda activate prism
pip install -e .

Option B — pip + venv

python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

# If your NVIDIA driver supports CUDA 12.4, pin a compatible torch build:
pip install torch==2.5.1+cu124 --index-url https://download.pytorch.org/whl/cu124

# Required for HVG selection (seurat_v3 flavor):
pip install scikit-misc

Verify

python -c "
import torch, prism
print(f'PRISM v{prism.__version__}')
print(f'PyTorch {torch.__version__}, CUDA available: {torch.cuda.is_available()}')
"

Quick start

By default, PRISM automatically builds a two-stage prior on the per-cell posteriors $q_{ij}$ (the q-prior) based on disease biology; no configuration is needed. On real cohorts, this prior anchors the EM to the disease axis and avoids the latent-axis identifiability problem (Discussion §latent).

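from prism import PRISMConfig, PRISMTrainer, PrismData
from prism.data.simulation import generate_prism_data

# Simulate a small cohort on CPU: 50 subjects, 200 cells each, 60 genes.
# The second return value is not needed here and is discarded.
data, _ = generate_prism_data(n_subjects=50, n_genes=60, n_cells_per_subject=200,
                              rho=0.6, seed=42, device="cpu")

# Size the model from the data; run_wald=True enables the Wald tests.
trainer = PRISMTrainer(PRISMConfig(
    n_genes=data.n_genes, n_covars=data.n_covars, n_context=data.n_context,
    max_em_iter=30, run_wald=True, device="cpu",
))                                  # auto_q_prior=True is the default
results = trainer.fit(data)

# Count genes passing FDR < 0.05 for general DE.
print("general DE (FDR<0.05):", int((results.q_values_de < 0.05).sum()))
# For context DE, take each gene's smallest q-value along dim 1 before thresholding.
print("context DE (FDR<0.05):", int((results.q_values_context.min(dim=1).values < 0.05).sum()))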
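The two counts above threshold FDR-adjusted values at 0.05. The source does not say which adjustment PRISM applies, so the following is only a sketch under the assumption of a Benjamini-Hochberg correction; `benjamini_hochberg` is a hypothetical helper, not part of the `prism` API.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(p)
n_sig = sum(qi < 0.05 for qi in q)  # genes called at FDR < 0.05
print("significant genes:", n_sig)  # prints "significant genes: 2"
```

Note that two of the raw p-values (0.039 and 0.041) fall below 0.05 but are not called after adjustment, which is the behavior an FDR threshold is meant to provide.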

Anchoring strategies for real data

  • Curated marker prior (best when biology is known):
    from prism import marker_prior, DAM_MICROGLIA_UP, HOMEOSTATIC_MICROGLIA
    import torch
    prior = marker_prior(adata, up_markers=DAM_MICROGLIA_UP,
                         down_markers=HOMEOSTATIC_MICROGLIA, condition_col="disease")
    cfg = PRISMConfig(..., q_prior=torch.from_numpy(prior), q_prior_weight=0.9)
    
  • Two-stage prior (data-adaptive; this is what auto_q_prior=True builds for you).
  • Safe fallback, PRISM-C (fix_q=True): pins q to the donor label. This gives up the per-cell q_ij and the context-DE effects θ_g, but matches bulk DE most closely.
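To make the idea of a marker-based prior concrete, here is a minimal sketch of one way such a prior could be computed. It is not the implementation of `prism.marker_prior`: the function `marker_score_prior`, its arguments, and the scoring rule (mean of up markers minus mean of down markers, standardized and passed through a sigmoid, with control cells set to 0) are all assumptions made for illustration.

```python
import math


def marker_score_prior(expr, up_idx, down_idx, is_disease, temperature=1.0):
    """Hypothetical marker-based prior on q, one value per cell.

    expr:        list of per-cell lists of log-normalized expression
    up_idx:      column indices of genes expected to go up in affected cells
    down_idx:    column indices of genes expected to go down
    is_disease:  per-cell bool, True if the cell's donor is a disease subject
    """
    raw = []
    for cell in expr:
        up = sum(cell[j] for j in up_idx) / len(up_idx)
        down = sum(cell[j] for j in down_idx) / len(down_idx)
        raw.append(up - down)
    # Standardize the scores across cells (guard against zero variance).
    mean = sum(raw) / len(raw)
    sd = math.sqrt(sum((s - mean) ** 2 for s in raw) / len(raw)) or 1.0
    prior = []
    for s, diseased in zip(raw, is_disease):
        if not diseased:
            prior.append(0.0)  # cells from control donors are never "affected"
        else:
            prior.append(1.0 / (1.0 + math.exp(-(s - mean) / (sd * temperature))))
    return prior


# Two genes: column 0 is an "up" marker, column 1 a "down" marker.
expr = [[5.0, 0.5], [0.5, 5.0], [4.0, 1.0], [1.0, 4.0]]
is_disease = [True, True, False, False]
prior = marker_score_prior(expr, up_idx=[0], down_idx=[1], is_disease=is_disease)

# The fix_q=True fallback corresponds to pinning q to the donor label instead.
q_fixed = [1.0 if d else 0.0 for d in is_disease]
```

In this toy example the first disease cell, which expresses the up marker strongly, receives a prior above 0.5, the second receives one below 0.5, and both control cells receive 0.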

Project layout

prism/
├── prism/
│   ├── model/           # PrismModel, encoders, NB/HL likelihoods
│   ├── inference/       # EM (e_step, m_step, em_loop), Wald & score tests
│   ├── data/            # PrismData, simulation, ROSMAP preprocessing
│   ├── baselines/       # NEBULA, context-only, latent-only, stratified
│   ├── evaluation/      # Metrics and plotting
│   └── utils/           # Numerical helpers, logging
├── tests/               # Unit and integration tests
└── tutorials/           # Notebook tutorials (API, synthetic, ROSMAP)

Tutorials

Reproducing the paper

All scripts, pipelines and notebooks that reproduce the paper's results live in the top-level reproducibility/ folder (Snakemake ablations, NEBULA-style benchmark, external baselines, ROSMAP and COVID real-data analyses). See ../reproducibility/README.md.

Citation

@article{prism2026,
  title={PRISM: Phenotype-Resolved Inference in Single-Cell Mixed Models
         via Latent Disease States and Contextualized Differential Expression},
  author={Anonymous Authors},
  journal={Under review},
  year={2026}
}

Regenerating the animations

The animated visuals are built with Manim Community.

pip install manim
cd media

# Standalone logo (1080p MP4)
manim -qh prism_explainer.py PRISMLogo

# Full explainer (1080p MP4)
manim -qh prism_explainer.py PRISMExplainer

# Full explainer (720p GIF — large file)
manim -qm --format=gif prism_explainer.py PRISMExplainer
