PRISM: Probabilistic Inference of Subject-level Mixture for contextualized differential expression
PRISM
Phenotype-Resolved Inference in Single-Cell Mixed Models via Latent Disease States and Contextualized Differential Expression
Overview
PRISM extends the NEBULA negative-binomial log-normal mixed model for single-cell differential expression by introducing:
- Latent per-cell disease states $d_{ij} \sim \text{Bernoulli}(\rho_i)$ that separate truly affected cells from unaffected ones within disease subjects.
- Contextualized covariate effects $x^\top \Gamma_g z$ that model how context modulates all covariate effects (a main effect of context on expression).
- Context-dependent disease effects $\Delta_g(z) = \alpha_g + \theta_g^\top z$ that let the DE magnitude vary with continuous cell-level covariates ($\alpha_g$ = constant disease effect, $\theta_g$ = context modulation).
- An EM algorithm that alternates between inferring $q_{ij} = P(d_{ij}=1 \mid Y, \Theta)$ (E-step) and optimizing the NEBULA-LN approximate likelihood weighted by $q_{ij}$ (M-step).
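The E-step above can be illustrated for a single cell and a single gene. The following is a minimal sketch, not PRISM's implementation: it assumes a plain negative-binomial likelihood with mean `mu` and inverse dispersion `phi`, ignores subject random effects and the $x^\top \Gamma_g z$ term, and uses hypothetical helper names (`nb_logpmf`, `disease_effect`, `e_step_single_gene`). The real model would combine evidence across genes.

```python
import math

def nb_logpmf(y, mu, phi):
    """Log-pmf of a negative binomial with mean mu and variance mu + mu^2 / phi."""
    return (math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
            + phi * math.log(phi / (phi + mu))
            + y * math.log(mu / (phi + mu)))

def disease_effect(alpha_g, theta_g, z):
    """Context-dependent disease effect Delta_g(z) = alpha_g + theta_g^T z."""
    return alpha_g + sum(t * zk for t, zk in zip(theta_g, z))

def e_step_single_gene(y, log_mu0, alpha_g, theta_g, z, rho_i, phi):
    """Posterior q_ij = P(d_ij = 1 | y) for one cell and one gene (illustrative only)."""
    delta = disease_effect(alpha_g, theta_g, z)
    # Affected cells have their log-mean shifted by Delta_g(z); unaffected cells do not.
    log_p1 = math.log(rho_i) + nb_logpmf(y, math.exp(log_mu0 + delta), phi)
    log_p0 = math.log(1.0 - rho_i) + nb_logpmf(y, math.exp(log_mu0), phi)
    # Normalise in log space to avoid underflow.
    m = max(log_p1, log_p0)
    w1, w0 = math.exp(log_p1 - m), math.exp(log_p0 - m)
    return w1 / (w1 + w0)

if __name__ == "__main__":
    for count in (0, 5, 20):
        q = e_step_single_gene(count, math.log(2.0), 1.5, [0.5], [1.0], rho_i=0.6, phi=2.0)
        print(count, round(q, 3))
```

When $\Delta_g(z) = 0$ the two likelihoods coincide and the posterior falls back to the prior $\rho_i$, which is a quick sanity check on any E-step of this form.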
Installation
Option A — conda (recommended)
git clone https://github.com/AndreaRubbi/ContextualizedDifferentialExpression.git && cd ContextualizedDifferentialExpression
conda env create -f environment.yml
conda activate prism
pip install -e .
Option B — pip + venv
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
# With a CUDA 12.4 driver, pin a compatible torch build:
pip install torch==2.5.1+cu124 --index-url https://download.pytorch.org/whl/cu124
# Required for HVG selection (seurat_v3 flavor):
pip install scikit-misc
Verify
python -c "
import torch, prism
print(f'PRISM v{prism.__version__}')
print(f'PyTorch {torch.__version__}, CUDA available: {torch.cuda.is_available()}')
"
Quick start
PRISM builds a two-stage q-prior on disease biology by default, so no configuration is needed. On real cohorts, this anchors the EM to the disease axis and avoids the latent-axis identifiability problem (Discussion §latent).
from prism import PRISMConfig, PRISMTrainer, PrismData
from prism.data.simulation import generate_prism_data
data, _ = generate_prism_data(n_subjects=50, n_genes=60, n_cells_per_subject=200,
rho=0.6, seed=42, device="cpu")
trainer = PRISMTrainer(PRISMConfig(
n_genes=data.n_genes, n_covars=data.n_covars, n_context=data.n_context,
max_em_iter=30, run_wald=True, device="cpu",
)) # auto_q_prior=True is the default
results = trainer.fit(data)
print("general DE (FDR<0.05):", int((results.q_values_de < 0.05).sum()))
print("context DE (FDR<0.05):", int((results.q_values_context.min(dim=1).values < 0.05).sum()))
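The quick start thresholds q-values at 0.05 after running with `run_wald=True`. The source does not spell out how those q-values are computed, so the following is only a generic sketch of the usual pipeline: a two-sided Wald p-value from an estimate and its standard error, followed by Benjamini-Hochberg adjustment. The function names are hypothetical and are not part of the PRISM API.

```python
import math

def wald_pvalue(estimate, std_error):
    """Two-sided Wald p-value for H0: parameter = 0 (normal approximation)."""
    z = estimate / std_error
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2.0))

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        qvalues[i] = running_min
    return qvalues

if __name__ == "__main__":
    estimates = [0.8, 0.1, -0.5, 0.02]   # made-up per-gene effect estimates
    std_errors = [0.2, 0.2, 0.2, 0.2]    # made-up standard errors
    pvals = [wald_pvalue(e, s) for e, s in zip(estimates, std_errors)]
    qvals = benjamini_hochberg(pvals)
    print("genes with FDR < 0.05:", sum(q < 0.05 for q in qvals))
```

For the context test, the quick start takes the minimum q-value across context dimensions per gene, so a gene is counted if any single context dimension passes the threshold.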
Anchoring strategies for real data
- Curated marker prior (best when biology is known):
from prism import marker_prior, DAM_MICROGLIA_UP, HOMEOSTATIC_MICROGLIA
import torch
prior = marker_prior(adata, up_markers=DAM_MICROGLIA_UP,
    down_markers=HOMEOSTATIC_MICROGLIA,
    condition_col="disease")
cfg = PRISMConfig(..., q_prior=torch.from_numpy(prior), q_prior_weight=0.9)
- Two-stage prior (data-adaptive; this is what auto_q_prior=True builds for you).
- Safe fallback PRISM-C (fix_q=True): pins q to the donor label, loses per-cell q_ij and context-DE θ_g(z) but matches bulk DE most closely.
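To make the PRISM-C fallback concrete, here is a small sketch of what "pinning q to the donor label" means for the per-cell weights. It is an illustration under assumed data structures (a list of per-cell subject IDs and a dict of subject labels), and `pinned_q` is a hypothetical helper, not a PRISM function.

```python
def pinned_q(cell_subjects, subject_is_disease):
    """Fix q_ij to the donor label: 1.0 for cells from disease subjects, else 0.0."""
    return [1.0 if subject_is_disease[s] else 0.0 for s in cell_subjects]

# Hypothetical toy cohort: four cells from three donors.
cells = ["s1", "s1", "s2", "s3"]
labels = {"s1": True, "s2": False, "s3": True}
q = pinned_q(cells, labels)
print(q)
```

Because every cell in a disease donor receives the same weight, no cell-level disease state is inferred, which is why this mode gives up the per-cell q_ij.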
Project layout
prism/
├── prism/
│ ├── model/ # PrismModel, encoders, NB/HL likelihoods
│ ├── inference/ # EM (e_step, m_step, em_loop), Wald & score tests
│ ├── data/ # PrismData, simulation, ROSMAP preprocessing
│ ├── baselines/ # NEBULA, context-only, latent-only, stratified
│ ├── evaluation/ # Metrics and plotting
│ └── utils/ # Numerical helpers, logging
├── tests/ # Unit and integration tests
└── tutorials/ # Notebook tutorials (API, synthetic, ROSMAP)
Tutorials
- tutorials/getting_started.ipynb — end-to-end API walkthrough on simulated data (fit, test, interpret).
- tutorials/synthetic_data_tutorial.ipynb — generate data, sweep parameters, plot FPR/power.
- tutorials/rosmap_real_data.ipynb — applying PRISM to ROSMAP microglia.
- docs/TUTORIAL.md — installation + CLI reference.
Reproducing the paper
All scripts, pipelines and notebooks that reproduce the paper's results live
in the top-level reproducibility/ folder (Snakemake ablations, NEBULA-style
benchmark, external baselines, ROSMAP and COVID real-data analyses). See
../reproducibility/README.md.
Citation
@article{prism2026,
title={PRISM: Phenotype-Resolved Inference in Single-Cell Mixed Models
via Latent Disease States and Contextualized Differential Expression},
author={Anonymous Authors},
journal={Under review},
year={2026}
}
Regenerating the animations
The animated visuals are built with Manim Community.
pip install manim
cd media
# Standalone logo (1080p MP4)
manim -qh prism_explainer.py PRISMLogo
# Full explainer (1080p MP4)
manim -qh prism_explainer.py PRISMExplainer
# Full explainer (720p GIF — large file)
manim -qm --format=gif prism_explainer.py PRISMExplainer