Latent Space Dynamics (LSD) for single-cell trajectory inference via neural ODE gradient flow
sclsd: Latent Space Dynamics for Single-Cell Trajectory Inference
sclsd implements Latent Space Dynamics (LSD), a thermodynamic framework for modeling cell differentiation from single-cell RNA sequencing data.
Notebooks for reproducing manuscript figures and analyses are available at csglab/sclsd-manuscript.
Overview
LSD reinterprets Waddington's epigenetic landscape as an energy landscape in a learned latent cell state space. Cell differentiation is modeled as a stochastic dynamical system governed by a gradient flow down this potential surface, combined with noise representing gene expression variability.
The model jointly infers:
- Cell state: A latent representation of each cell's gene expression profile
- Differentiation state: A 2D embedding capturing developmental progression
- Waddington potential: An energy function whose gradient defines differentiation dynamics
- Developmental entropy: A measure of cellular plasticity derived from the uncertainty in differentiation state
Installation
```bash
pip install sclsd
```

Or from source:

```bash
git clone https://github.com/csglab/sclsd.git
cd sclsd
pip install -e .
```
> [!NOTE]
> sclsd is a lightweight Python package, so installation time depends mainly on which dependencies are already present in your environment. Starting from scratch, expect it to take less than 10 minutes.
Dependencies
- Python ≥3.9
- PyTorch ≥2.0.0
- Pyro-PPL ≥1.8.0
- torchdiffeq ≥0.2.0
- scanpy ≥1.9.0
- cellrank ≥2.0.0
Quick Start
```python
import scanpy as sc
import torch
from sclsd import LSD, LSDConfig

# Load preprocessed AnnData (log-normalized, with neighbors computed)
adata = sc.read("data.h5ad")

# Configure model
cfg = LSDConfig()
cfg.model.z_dim = 10        # Cell state dimensionality
cfg.walks.path_len = 50     # Trajectory length for training
cfg.walks.num_walks = 4096  # Number of training trajectories

# Initialize model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
lsd = LSD(adata, cfg, device=device)

# Set prior transition matrix from pseudotime
lsd.set_prior_transition(prior_time_key="dpt_pseudotime")

# Generate training trajectories
lsd.prepare_walks()

# Train
lsd.train(num_epochs=100)

# Get results
result = lsd.get_adata()
```
Output
After training, lsd.get_adata() returns an AnnData object with:
| Key | Location | Description |
|---|---|---|
| `X_cell_state` | `obsm` | Latent cell state representation |
| `X_diff_state` | `obsm` | 2D differentiation state embedding |
| `potential` | `obs` | Waddington potential value |
| `entropy` | `obs` | Developmental entropy (plasticity) |
| `lsd_pseudotime` | `obs` | Pseudotime derived from potential |
| `transitions` | `obsp` | Cell-cell transition probability matrix |
Key Methods
Cell Fate Prediction
Propagate cells through the learned landscape to predict terminal fates:
```python
result = lsd.get_cell_fates(
    adata=result,
    time_range=15.0,
    cluster_key="clusters",
    return_paths=True,
)
# Predicted fates stored in result.obs["fate"]
```
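To make the idea of propagating cells through a landscape concrete, here is a minimal, self-contained sketch. It does not use or reproduce sclsd's implementation: the toy 2D potential, the fixed-step integrator, the nearest-centroid rule, and the names `grad_V`, `propagate`, and `assign_fate` are all illustrative assumptions. Only `time_range=15.0` mirrors the call above.

```python
import math


def grad_V(state):
    """Gradient of a toy potential V(x, y) = (x**2 - 1)**2 + y**2 (assumed, not learned)."""
    x, y = state
    return (4.0 * x * (x * x - 1.0), 2.0 * y)


def propagate(state, time_range, dt=0.01):
    """Follow dz/dt = -grad V(z) with fixed-step explicit Euler for `time_range` time units."""
    x, y = state
    for _ in range(int(round(time_range / dt))):
        gx, gy = grad_V((x, y))
        x -= gx * dt
        y -= gy * dt
    return (x, y)


def assign_fate(state, centroids):
    """Label a propagated cell with the closest cluster centroid (hypothetical rule)."""
    return min(centroids, key=lambda name: math.dist(state, centroids[name]))


# Hypothetical terminal clusters sitting in the two wells of the toy potential.
centroids = {"fate_A": (-1.0, 0.0), "fate_B": (1.0, 0.0)}
cells = [(-0.6, 0.5), (-0.1, -0.8), (0.3, 0.2), (0.9, -0.4)]

fates = [assign_fate(propagate(cell, time_range=15.0), centroids) for cell in cells]
print(fates)
```

In this toy setting each cell simply rolls into the well on its side of the ridge at x = 0. A learned potential over a higher-dimensional latent space would replace `grad_V`, and the package may assign fates differently.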
Velocity Streamlines
Visualize differentiation flow fields:
```python
lsd.stream_lines("X_umap", color="clusters")
```
In Silico Gene Perturbation
Simulate gene knockouts and predict fate changes:
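```python
# Dense float tensor of expression values (.toarray() assumes result.X is a sparse matrix)
X = torch.from_numpy(result.X.toarray()).float()

perturbed_fates, unperturbed_fates = lsd.perturb(
    adata=result,
    x=X,
    gene_name="Noto",
    cluster_key="clusters",
    perturbation_level=0,  # Knockout
)
```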
Configuration
Key parameters in LSDConfig:
```python
cfg = LSDConfig()

# Model architecture
cfg.model.z_dim = 10          # Cell state dimensions
cfg.model.B_dim = 2           # Differentiation state dimensions (fixed at 2)
cfg.model.V_coeff = 0.01      # Potential regularization

# Training trajectories
cfg.walks.path_len = 50       # Steps per trajectory
cfg.walks.num_walks = 4096    # Number of trajectories
cfg.walks.batch_size = 256    # Batch size

# Optimizer
cfg.optimizer.adam.lr = 1e-3  # Learning rate
```
Data Requirements
Input AnnData should contain:
- Log-normalized expression in `adata.X`
- Raw counts in `adata.layers["raw"]`
- Library sizes in `adata.obs["librarysize"]`
- Precomputed neighbor graph in `adata.obsp["connectivities"]`
- Pseudotime values (e.g., from diffusion pseudotime) for prior initialization
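The requirements above can be checked before constructing the model. The helper below is a sketch, not part of sclsd: the function name `missing_lsd_inputs` is hypothetical, and the default pseudotime key `dpt_pseudotime` is assumed from the Quick Start. It relies only on membership tests, so it should work on an AnnData object as well as on the toy stand-in used here.

```python
from types import SimpleNamespace


def missing_lsd_inputs(adata, prior_time_key="dpt_pseudotime"):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if getattr(adata, "X", None) is None:
        problems.append("adata.X (log-normalized expression) is missing")
    if "raw" not in adata.layers:
        problems.append('adata.layers["raw"] (raw counts) is missing')
    if "librarysize" not in adata.obs:
        problems.append('adata.obs["librarysize"] (library sizes) is missing')
    if "connectivities" not in adata.obsp:
        problems.append('adata.obsp["connectivities"] (neighbor graph) is missing')
    if prior_time_key not in adata.obs:
        problems.append(f'adata.obs["{prior_time_key}"] (pseudotime) is missing')
    return problems


# Toy stand-in for an AnnData object, used only to exercise the checks.
toy = SimpleNamespace(
    X=[[0.0, 1.2], [0.7, 0.0]],
    layers={"raw": [[0, 3], [1, 0]]},
    obs={"librarysize": [3, 1]},
    obsp={},
)

problems = missing_lsd_inputs(toy)
for problem in problems:
    print(problem)
```

The toy object deliberately lacks a neighbor graph and pseudotime, so both are reported. The check covers only the presence of each field, not whether its contents are valid.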
Method
LSD models cell state dynamics via the stochastic differential equation:
$$dz = -\nabla V(z) \, dt + \sigma \, dW$$
where $V(z)$ is the Waddington potential, parameterized by a neural network, and $\sigma \, dW$ is the noise term. The drift $-\nabla V(z)$ defines a neural ODE. The model is trained by variational inference, reconstructing gene expression through a zero-inflated negative binomial likelihood.
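An equation of this form can be simulated with the Euler-Maruyama scheme, $z_{t+\Delta t} = z_t - \nabla V(z_t)\,\Delta t + \sigma\sqrt{\Delta t}\,\epsilon$ with $\epsilon \sim \mathcal{N}(0, 1)$. The sketch below illustrates that scheme on a one-dimensional toy problem. The double-well potential, the step size, and the noise scale are assumptions chosen for illustration; sclsd learns $V$ with a neural network, and its own solver may differ.

```python
import math
import random


def grad_V(z):
    """Gradient of the toy double-well potential V(z) = (z**2 - 1)**2."""
    return 4.0 * z * (z * z - 1.0)


def euler_maruyama(z0, sigma, dt, n_steps, rng):
    """Simulate dz = -grad V(z) dt + sigma dW and return the whole path."""
    path = [z0]
    z = z0
    for _ in range(n_steps):
        dW = rng.gauss(0.0, 1.0) * math.sqrt(dt)  # Brownian increment, N(0, dt)
        z = z - grad_V(z) * dt + sigma * dW
        path.append(z)
    return path


rng = random.Random(0)

# With sigma = 0 the SDE reduces to the deterministic gradient flow,
# which settles in the well at z = 1 when started from z = 0.2.
flow = euler_maruyama(z0=0.2, sigma=0.0, dt=0.01, n_steps=2000, rng=rng)

# With sigma > 0 the path fluctuates around the wells instead of settling.
noisy = euler_maruyama(z0=0.2, sigma=0.3, dt=0.01, n_steps=2000, rng=rng)

print(flow[-1], noisy[-1])
```

Setting `sigma` to zero recovers the pure gradient flow, which makes it easy to separate the effect of the landscape from that of the noise.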
Training trajectories are generated by random walks on a k-nearest neighbor graph, biased by pseudotime to follow developmental progression.
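The following sketch shows one way a pseudotime-biased random walk on a neighbor graph could look. It is illustrative only: the exponential weighting, the `bias` parameter, and the function name `biased_walk` are assumptions, and sclsd derives its own transition prior through `set_prior_transition`.

```python
import math
import random


def biased_walk(neighbors, pseudotime, start, path_len, rng, bias=5.0):
    """Walk on a neighbor graph, favoring neighbors with higher pseudotime."""
    path = [start]
    current = start
    for _ in range(path_len - 1):
        candidates = neighbors[current]
        # Weight each neighbor by exp(bias * pseudotime gain): forward steps
        # are likely, backward steps remain possible.
        weights = [
            math.exp(bias * (pseudotime[n] - pseudotime[current]))
            for n in candidates
        ]
        current = rng.choices(candidates, weights=weights, k=1)[0]
        path.append(current)
    return path


# Toy graph: six cells on a chain, each linked to its immediate neighbors.
n_cells = 6
neighbors = {
    i: [j for j in (i - 1, i + 1) if 0 <= j < n_cells] for i in range(n_cells)
}
pseudotime = [i / (n_cells - 1) for i in range(n_cells)]

rng = random.Random(0)
walks = [
    biased_walk(neighbors, pseudotime, start=0, path_len=10, rng=rng)
    for _ in range(4)
]
for walk in walks:
    print(walk)
```

Here `path_len` and the number of walks play the roles of `cfg.walks.path_len` and `cfg.walks.num_walks`, scaled down to keep the output readable.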
Citation
If you use sclsd, please cite:
Poursina A, Hajhashemi S, Mikaeili Namini A, Saberi A, Emad A, Najafabadi HS. A Latent Space Thermodynamic Model of Cell Differentiation. 2026.
License
MIT License