De novo spatial reconstruction of single-cell gene expression.

novoSpaRc - de novo Spatial Reconstruction of single-cell gene expression

About

This package was created and is maintained by Nikos Karaiskos and Mor Nitzan. novoSpaRc predicts the locations of cells in space from single-cell RNA sequencing data. An existing reference database of marker genes is not required, but improves the mapping if available.

novoSpaRc accompanies the following preprint:

Charting tissues from single-cell transcriptomes,
bioRxiv (2018)

M. Nitzan#, N. Karaiskos#, N. Friedman& and N. Rajewsky&

# Contributed equally
& Corresponding authors: N. Friedman, N. Rajewsky

Installation and requirements

A working Python 3.5 installation and the following libraries are required: matplotlib, numpy, sklearn (installed as scikit-learn), scipy, ot (installed as POT) and networkx. Once all dependencies are available, novoSpaRc can be used by cloning the repository, modifying the template reconstruct_tissue.py as needed and running it to perform the spatial reconstruction.
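Before running the template, it can help to confirm that the required libraries are importable. The snippet below is a minimal sketch that uses only the standard library; the helper name missing_modules is hypothetical and not part of novoSpaRc.

```python
import importlib.util

# Import names of the libraries listed in the requirements above.
REQUIRED_MODULES = ["matplotlib", "numpy", "sklearn", "scipy", "ot", "networkx"]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

The check only reports whether a module can be found; it does not verify versions. See environments_and_versions.txt for tested configurations.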

The code is partially based on adjustments of the POT (Python Optimal Transport) library (https://github.com/rflamary/POT).

environments_and_versions.txt contains environments in which we successfully tested novoSpaRc.

General usage

To spatially reconstruct gene expression, novoSpaRc performs the following steps:

  1. Read the gene expression matrix.

    1a. Optional: select a random set of cells for the reconstruction.

    1b. Optional: select a small set of genes (e.g. highly variable).

  2. Construct the target space.

  3. Set up the optimal transport reconstruction.

    3a. Optional: use existing information of marker genes, if available.

  4. Perform the spatial reconstruction.

    4a. Assign each cell a probability distribution over the target space.

    4b. Derive a virtual in situ hybridization (vISH) for all genes over the target space.

  5. Write outputs to file for further use, such as the spatial gene expression matrix and the target space coordinates.

  6. Optional: plot spatial gene expression patterns.

  7. Optional: identify and plot spatial archetypes.
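To make steps 1 to 4 concrete, here is a minimal numpy-only sketch on synthetic data. It is not novoSpaRc's implementation: it uses plain entropy-regularized optimal transport (Sinkhorn iterations) with a cost built from a single marker gene, whereas the package builds on the POT library. All names, sizes and parameters below (sinkhorn, epsilon, the one-dimensional tissue, the linear expression gradients) are assumptions made for illustration.

```python
import numpy as np


def sinkhorn(cost, p, q, epsilon=0.05, n_iter=500):
    """Entropy-regularized optimal transport between marginals p and q."""
    kernel = np.exp(-cost / epsilon)
    u = np.ones_like(p)
    for _ in range(n_iter):
        v = q / (kernel.T @ u)
        u = p / (kernel @ v)
    return u[:, None] * kernel * v[None, :]


rng = np.random.default_rng(0)

# Step 1: a synthetic stand-in for the gene expression matrix (cells x genes).
# Each gene follows a linear gradient along a hypothetical 1D tissue axis.
n_cells, n_genes, n_locations = 200, 30, 20
true_position = rng.uniform(0.0, 1.0, size=n_cells)
slopes = rng.normal(size=n_genes)
noise = 0.05 * rng.normal(size=(n_cells, n_genes))
expression = np.outer(true_position, slopes) + noise

# Step 1a: select a random subset of cells.
cell_idx = rng.choice(n_cells, size=100, replace=False)
# Step 1b: select the 10 most variable genes among those cells.
gene_idx = np.argsort(expression[cell_idx].var(axis=0))[-10:]
dge = expression[np.ix_(cell_idx, gene_idx)]

# Step 2: construct the target space (here, evenly spaced 1D locations).
locations = np.linspace(0.0, 1.0, n_locations)

# Steps 3 and 3a: build the transport problem from one marker gene whose
# expression at every location (the reference atlas) is assumed known.
marker = int(np.argmax(np.abs(slopes)))
atlas = locations * slopes[marker]
cost = (expression[cell_idx, marker][:, None] - atlas[None, :]) ** 2
cost /= cost.max()
p = np.full(len(cell_idx), 1.0 / len(cell_idx))  # uniform mass over cells
q = np.full(n_locations, 1.0 / n_locations)      # uniform mass over locations

# Step 4a: each row, once normalized, is a cell's distribution over locations.
transport = sinkhorn(cost, p, q)
cell_location_probs = transport / transport.sum(axis=1, keepdims=True)

# Step 4b: vISH as the transport-weighted mean expression at each location.
vish = (transport.T @ dge) / transport.sum(axis=0)[:, None]
```

Step 5 would then write vish and locations to file, for example with np.savetxt. A de novo reconstruction without markers cannot use this marker-based cost; it has to rely on structural correspondence between expression space and physical space, which this sketch does not cover.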

Demonstration code

We provide scripts that spatially reconstruct two of the tissues presented in the paper: the intestinal epithelium (Moor, A.E., et al., Cell, 2018) and the stage 6 Drosophila embryo (Berkeley Drosophila Transcription Network Project, BDTNP).

The intestinal epithelium

The reconstruct_intestine_denovo.py script reconstructs the crypt-to-villus axis of the mammalian intestinal epithelium, based on data from Moor et al. The reconstruction is performed de novo, without using any marker genes. The script outputs two plots: (a) a histogram showing, for each original villus zone, the distribution of assignment values over the embedded zones, and (b) the average spatial expression of 4 gene groups over the original villus zones and the embedded zones.

Running time on a standard computer is under a minute.

The Drosophila embryo

The reconstruct_bdtnp_with_markers.py script reconstructs the early Drosophila embryo with only a handful of markers, based on the BDTNP dataset. All cells are used and a random set of 1-4 markers is selected. The script outputs plots of gene expression for a list of genes, as well as Pearson correlations between the reconstructed and original expression values for all genes. Note that the results depend on which marker genes are selected. In the manuscript we averaged the results over many different choices of marker genes.
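The evaluation described above can be sketched as follows: draw a random set of 1-4 marker genes, then compute one Pearson correlation per gene between the original and reconstructed expression. This is an illustrative numpy sketch, not the code of the script; the function names pick_markers and per_gene_pearson and the (locations x genes) array layout are assumptions.

```python
import numpy as np


def pick_markers(n_genes, rng):
    """Draw between 1 and 4 distinct gene indices to use as markers."""
    n_markers = int(rng.integers(1, 5))  # upper bound is exclusive
    return rng.choice(n_genes, size=n_markers, replace=False)


def per_gene_pearson(original, reconstructed):
    """Pearson correlation per gene for arrays of shape (n_locations, n_genes).

    Genes that are constant in either array yield nan.
    """
    a = original - original.mean(axis=0)
    b = reconstructed - reconstructed.mean(axis=0)
    denom = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))
    with np.errstate(divide="ignore", invalid="ignore"):
        return (a * b).sum(axis=0) / denom
```

Because the outcome depends on the markers drawn, one would typically repeat the draw with different seeds and average the per-gene correlations, for instance with np.nanmean over the repeats.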

Running time on a standard desktop computer is around 6-7 minutes.

Running novoSpaRc on your data

A template file for running novoSpaRc on custom datasets is provided (reconstruct_tissue.py). To run novoSpaRc on your own data, modify the template file accordingly.
