An implementation of nonlinear archetypal analysis on single-cell RNA-seq data using an autoencoder

Project description

Single-cell archetypal analysis neural network (scAAnet)

scAAnet is an autoencoder-based neural network that performs nonlinear archetypal analysis on single-cell RNA-seq data. The underlying assumption is that the expression profile of each single cell is a nonlinear combination of several gene expression programs (GEPs) [1], with each program corresponding to an archetype in the archetypal analysis. scAAnet decomposes an expression profile into a usage matrix and a GEP/archetype matrix. The usage matrix shows how much each cell uses the different GEPs, and the GEP matrix quantifies the relative importance of each gene within each GEP. One novelty of the method is that it uses the nonlinearity of the neural network to account for complex interactions among genes. Another is that, because single-cell RNA-seq data are count data, the reconstruction loss is the negative log-likelihood of a discrete distribution (Poisson, zero-inflated Poisson, negative binomial, or zero-inflated negative binomial) rather than the traditional MSE loss.
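To make the contrast between a count-based loss and MSE concrete, here is a minimal numpy sketch of the simplest of the four options, the Poisson negative log-likelihood. It is an illustration only and not scAAnet's own loss code; the function names poisson_nll and mse, the eps constant, and the toy matrix are assumptions made for this example.

```python
from math import lgamma

import numpy as np


def poisson_nll(counts, mu, eps=1e-10):
    """Mean Poisson negative log-likelihood of `counts` given predicted means `mu`."""
    counts = np.asarray(counts, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # log(x!) computed as lgamma(x + 1); this term does not depend on mu.
    log_factorial = np.vectorize(lgamma)(counts + 1.0)
    # eps guards against log(0) when a predicted mean is exactly zero.
    nll = mu - counts * np.log(mu + eps) + log_factorial
    return float(nll.mean())


def mse(counts, mu):
    """Mean squared error between observed counts and predicted means."""
    counts = np.asarray(counts, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return float(np.mean((counts - mu) ** 2))


# Toy cells-by-genes count matrix (2 cells, 3 genes), for illustration only.
counts = np.array([[0, 3, 10], [1, 0, 25]])
good_fit = counts + 0.5  # predicted means close to the observed counts
poor_fit = counts + 5.0  # predicted means far from the observed counts

print("Poisson NLL:", poisson_nll(counts, good_fit), poisson_nll(counts, poor_fit))
print("MSE:        ", mse(counts, good_fit), mse(counts, poor_fit))
```

Both losses penalize the poorer fit, but the Poisson likelihood treats the data as non-negative integer counts whose variance grows with the mean, whereas MSE weights every deviation equally regardless of expression level. The zero-inflated and negative binomial variants add further parameters (dropout probability, dispersion) that are not shown here.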

More details about scAAnet can be found in our manuscript.

Usage

scAAnet is implemented in Python and requires TensorFlow 2, numpy, and pandas. We recommend installing these packages in a virtual environment, which you can create and set up as follows:

  • Create a virtual environment with python3 -m venv ./venv (replace ./venv with your preferred path).
  • Activate the virtual environment with source ./venv/bin/activate.
  • Install the required packages with pip install --upgrade tensorflow (numpy will be installed together with tensorflow) and pip install pandas.

Then you can use scAAnet after activating the virtual environment. Here is an example of running scAAnet:

  • from scAAnet.api import scAAnet
  • re = scAAnet(count, hidden_size=(128, K, 128), ae_type='zinb', epochs=200, batch_size=64, early_stop=100, reduce_lr=10, learning_rate=0.01)
  • recon, usage, spectra = re['recon'], re['usage'], re['spectra']

The input count is raw single-cell expression count data with N cells and G genes, and K is the number of archetypes/GEPs. count can be a pandas dataframe, a numpy array, or an AnnData object. The outputs recon, usage, and spectra are the reconstructed expression count data (N by G), the usage matrix (N by K), and the archetype matrix (K by G) of the input count data, respectively. The argument ae_type can be one of poisson, zipoisson, nb, and zinb.
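The shapes described above can be illustrated with a short sketch of how the outputs might be inspected. It does not call scAAnet: the usage and spectra arrays below are random placeholders with the documented shapes, and the cell and gene labels, the toy sizes, and the top_n parameter are all assumptions for illustration. Real outputs are assumed here to be array-like; convert them with np.asarray first if they are not.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
N, G, K = 6, 8, 3  # cells, genes, archetypes/GEPs (toy sizes)

# Hypothetical raw count input: N cells (rows) by G genes (columns).
count = pd.DataFrame(
    rng.poisson(lam=2.0, size=(N, G)),
    index=[f"cell_{i}" for i in range(N)],
    columns=[f"gene_{j}" for j in range(G)],
)

# Placeholders standing in for re['usage'] (N by K) and re['spectra'] (K by G).
# They are random draws, not the result of running scAAnet.
usage = rng.dirichlet(np.ones(K), size=N)
spectra = rng.gamma(shape=2.0, scale=1.0, size=(K, G))

# For each cell, the GEP with the highest usage.
dominant_gep = pd.Series(usage.argmax(axis=1), index=count.index, name="dominant_gep")

# For each GEP, the genes with the largest values in the archetype matrix.
top_n = 3
top_genes = {
    k: count.columns[np.argsort(spectra[k])[::-1][:top_n]].tolist()
    for k in range(K)
}

print(dominant_gep)
print(top_genes)
```

Ranking genes by their raw value in spectra is only one simple way to read the archetype matrix; other scoring schemes may be more appropriate for a given analysis.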

More details about how to use scAAnet can be found in this tutorial on simulated data based on Splatter's [2] framework. Analysis code for the manuscript and further usage examples of scAAnet can be found in this folder.

References

  • [1]: Kotliar, Dylan, et al. "Identifying gene expression programs of cell-type identity and cellular activity with single-cell RNA-Seq." eLife 8 (2019).
  • [2]: Zappia, Luke, Belinda Phipson, and Alicia Oshlack. "Splatter: simulation of single-cell RNA sequencing data." Genome Biology 18.1 (2017): 174.

Download files

Source Distribution

scAAnet-1.0.0.tar.gz (18.0 kB)

Uploaded Source

Built Distribution

scAAnet-1.0.0-py3-none-any.whl (17.9 kB)

Uploaded Python 3
