statOT

StationaryOT: Dynamic inference from single-cell snapshots by optimal transport

Schematic

Introduction and overview

Entropy-regularized optimal transport has been used to infer cellular trajectories in time-course data [1]. Stationary optimal transport (OT) extends this approach to snapshots of biological systems in equilibrium. A system is in equilibrium if the proportions of cell populations are expected to be the same in every snapshot, even though individual cells continue to progress along their trajectories.

We model biological processes with a diffusion-drift process subject to branching due to cell birth and death. To maintain equilibrium, we define source and sink regions: cells are created in the source regions and absorbed in the sinks, subject to birth and death rates. At the population level, the effects of growth, entry, and exit can be captured by a spatially dependent flux, R(x), and the population dynamics can be described by a population balance partial differential equation:

[Equation: population balance PDE]

where the time derivative of the population density is zero by our equilibrium assumption.
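
As a rough guide to the shape of such an equation, a generic population balance PDE for a drift-diffusion process with growth can be written as below. The symbols ρ (cell density), v (drift field), and D (diffusivity) are assumed here for illustration, and R(x) is taken to act as a per-cell net rate; the notation and exact form used in the preprint may differ.

```latex
\frac{\partial \rho}{\partial t}
  = -\nabla \cdot (\rho \, \mathbf{v}) + D \, \Delta \rho + R(x) \, \rho ,
\qquad
\frac{\partial \rho}{\partial t} = 0 \quad \text{(equilibrium assumption)}.
```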

This problem has previously been explored by Weinreb et al. [2], who solve a system of linear equations to recover the potential. In contrast, stationary OT solves a convex optimization problem for the transition probabilities. This approach is flexible enough to incorporate additional information such as RNA velocity, which may allow recovery of non-conservative dynamics such as oscillations. Combined with earlier OT approaches, stationary OT provides a framework for approaching both time-series and snapshot data.
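
To give a feel for how an entropy-regularized OT problem yields transition probabilities, here is a minimal, generic Sinkhorn sketch. It is not the statot implementation: the function name sinkhorn_transitions, the uniform marginals, and the toy data are all assumptions made for illustration, and it ignores growth, sources, and sinks.

```python
import numpy as np

def sinkhorn_transitions(C, mu, nu, eps, maxiter=5000, tol=1e-9):
    """Generic entropic OT via Sinkhorn scaling (hypothetical helper)."""
    K = np.exp(-C / eps)                 # Gibbs kernel from the cost matrix
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for _ in range(maxiter):
        u_prev = u
        v = nu / (K.T @ u)               # rescale to match column marginal nu
        u = mu / (K @ v)                 # rescale to match row marginal mu
        if np.max(np.abs(u - u_prev)) < tol:
            break
    gamma = u[:, None] * K * v[None, :]  # coupling between the two marginals
    # Row-normalizing the coupling gives a row-stochastic transition matrix.
    P = gamma / gamma.sum(axis=1, keepdims=True)
    return gamma, P

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))                              # toy coordinates
C = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)  # squared distances
mu = np.full(5, 1 / 5)                                   # assumed uniform marginals
nu = np.full(5, 1 / 5)
gamma, P = sinkhorn_transitions(C, mu, nu, eps=1.0)
```

Smaller values of eps concentrate each row of P on nearby cells, while larger values spread the mass more evenly.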

This package provides the ability to run stationary OT on single-cell expression data to recover transition and fate probabilities. This package also provides a wrapper for use with CellRank.

Installation

Run pip install statot in your working environment. Alternatively, clone this repository and run pip install . in the top level directory.

Usage

Read the full documentation here.

Inputs

  1. Projected Expression Matrix: An n x m matrix of cells by expression data. We recommend using a lower-dimensional embedding such as PCA.
  2. Sinks: An n x 1 boolean vector indicating cells that leave the system.
  3. Sources: An n x 1 boolean vector indicating cells where mass enters the system. Typically, this will be the complement of the sink vector.
  4. Sink Weights: An n x 1 vector indicating the weight of each sink, where sinks with higher weight will absorb more mass.
  5. Cost Matrix: An n x n matrix specifying the cost of transporting each cell to each other cell in the coupling, such as the Euclidean distance between cells in PCA coordinates.
  6. Growth Rates: An n x 1 vector specifying the expected number of descendants of each cell in dt = 1.
  7. dt: The timestep between snapshots.
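
The inputs above could be assembled with NumPy along the following lines. This is a sketch on synthetic data: the random matrix stands in for real PCA coordinates, and the choice of sink cells, the equal sink weights, and the unit growth rates are placeholder assumptions rather than recommended values.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 3

# 1. Projected expression matrix (random stand-in for PCA coordinates).
x = rng.normal(size=(n, m))

# 2. Sinks: here the last two cells are arbitrarily marked as leaving the system.
sink_idx = np.zeros(n, dtype=bool)
sink_idx[-2:] = True

# 3. Sources: the complement of the sink vector.
source_idx = ~sink_idx

# 4. Sink weights: equal weight on every sink, zero elsewhere (assumption).
sink_weights = np.where(sink_idx, 1.0, 0.0)

# 5. Cost matrix: pairwise Euclidean distance between cells.
diff = x[:, None, :] - x[None, :, :]
C = np.sqrt((diff ** 2).sum(axis=-1))

# 6. Growth rates: one expected descendant per cell in dt = 1 (no net growth).
g = np.ones(n)

# 7. Timestep between snapshots.
dt = 1.0
```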

Quick Start

First compute the vector g of growth rates. Then use the statot.statot function to calculate the transition probabilities for cells over a single timestep:

statot(x, C = None, eps = None, method = "ent", g = None,
       flow_rate = None,
       dt = None,
       maxiter = 5000, tol = 1e-9)

To compute fate probabilities by sink, use compute_fate_probs:

compute_fate_probs(P, sink_idx)

Finally, to compute the fate probabilities by lineage, use compute_fate_probs_lineages:

compute_fate_probs_lineages(P, sink_idx, labels)

where labels is an np.array of strings giving the lineage annotation for each cell.
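
Conceptually, fate probabilities can be read as absorption probabilities of a Markov chain whose absorbing states are the sinks. The sketch below illustrates that idea on a toy transition matrix. The helpers fate_probs_sketch and fate_probs_by_lineage_sketch are hypothetical and are not the statot functions; the library's own implementation and output layout may differ.

```python
import numpy as np

def fate_probs_sketch(P, sink_idx):
    """Absorption probabilities of non-sink cells into each sink (hypothetical)."""
    Q = P[np.ix_(~sink_idx, ~sink_idx)]   # transitions among non-sink cells
    R = P[np.ix_(~sink_idx, sink_idx)]    # transitions from non-sink cells to sinks
    # Solve (I - Q) B = R for the absorption probabilities B.
    return np.linalg.solve(np.eye(Q.shape[0]) - Q, R)

def fate_probs_by_lineage_sketch(P, sink_idx, labels):
    """Sum per-sink fate probabilities over sinks sharing a lineage label."""
    B = fate_probs_sketch(P, sink_idx)
    sink_labels = labels[sink_idx]
    lineages = np.unique(sink_labels)
    F = np.stack([B[:, sink_labels == l].sum(axis=1) for l in lineages], axis=1)
    return F, lineages

# Toy chain: cells 0 and 1 are transient, cells 2 and 3 are absorbing sinks.
P = np.array([
    [0.5, 0.3, 0.2, 0.0],
    [0.2, 0.4, 0.1, 0.3],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])
sink_idx = np.array([False, False, True, True])
labels = np.array(["stem", "stem", "A", "B"])

B = fate_probs_sketch(P, sink_idx)
F, lineages = fate_probs_by_lineage_sketch(P, sink_idx, labels)
```

Each row of B sums to one because every non-sink cell in this toy chain is eventually absorbed by some sink.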

Example application to Arabidopsis root tip

An example application to Arabidopsis root tip data is available as a Jupyter notebook in the examples/ directory.

Citing

Read the preprint.

Optimal transport analysis reveals trajectories in steady-state systems

Stephen Zhang, Anton Afanassiev, Laura Greenstreet, Tetsuya Matsumoto, Geoffrey Schiebinger

bioRxiv 2021.03.02.433630; doi: https://doi.org/10.1101/2021.03.02.433630

References

[1] Schiebinger, G et al. Optimal-transport analysis of single-cell gene expression identifies developmental trajectories in reprogramming. Cell. 2019; 176(4): 928–943. https://doi.org/10.1016/j.cell.2019.02.026.

[2] Weinreb et al. Fundamental limits on dynamic inference from single-cell snapshots. Proceedings of the National Academy of Sciences. 2018; 115(10): E2467–E2476. https://doi.org/10.1073/pnas.1714723115.
