Topological Identification and Interpretation for High-throughput Single-cell Gene Regulation Elucidation across Multiple Platforms using scMGCA

scMGCA

scMGCA is a Python package containing tools for clustering single-cell data based on a graph-embedding autoencoder that simultaneously learns cell–cell topology representation and cluster assignments.

Overview

Single-cell RNA sequencing (scRNA-seq) provides high-throughput gene expression information for exploring cellular heterogeneity at the level of individual cells. A major challenge in characterizing high-throughput gene expression data arises from the curse of dimensionality and the prevalence of dropout events. To address these concerns, we developed scMGCA, a single-cell clustering method based on a graph-embedding autoencoder that simultaneously learns cell–cell topology representation and cluster assignments. In scMGCA, we propose a graph convolutional autoencoder to preserve the topological information of cells from the embedded space in multinomial distribution, and employ the positive pointwise mutual information (PPMI) matrix for cell graph augmentation. Experiments show that scMGCA segregates cells accurately and effectively, outperforms other state-of-the-art models across multiple platforms, and can correct for batch effects from multiple scRNA-seq protocols. In addition, we perform genomic interpretation on the key compressed transcriptomic space of the graph-embedding autoencoder to demonstrate the underlying gene regulation mechanism. In a pancreatic ductal adenocarcinoma (PDAC) dataset with 57,530 individual pancreatic cells from primary PDAC tumors and control pancreases, scMGCA successfully annotated the specific cell types and, through single-cell trajectory and gene set enrichment analysis, revealed differential gene expression levels across multiple tumor-associated and cell signalling pathways in PDAC progression.
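
To make the PPMI step more concrete, here is a minimal NumPy sketch of how a positive pointwise mutual information matrix can be computed from a cell graph. It assumes the graph is available as a non-negative co-occurrence (or adjacency) matrix. The function name `ppmi_matrix` and the toy matrix are illustrative only and are not taken from the scMGCA source, whose actual construction may differ.

```python
import numpy as np


def ppmi_matrix(counts):
    """Return the PPMI matrix of a non-negative co-occurrence matrix.

    PMI(i, j) = log( counts[i, j] * total / (row_sum[i] * col_sum[j]) )
    PPMI keeps only the positive part: max(PMI, 0).
    """
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    if total == 0:
        return np.zeros_like(counts)

    row_sum = counts.sum(axis=1, keepdims=True)
    col_sum = counts.sum(axis=0, keepdims=True)
    # Expected co-occurrence if rows and columns were independent.
    expected = row_sum @ col_sum / total

    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts / expected)
    # log(0) and 0/0 entries carry no association; treat them as 0.
    pmi[~np.isfinite(pmi)] = 0.0
    return np.maximum(pmi, 0.0)


# Toy example (hypothetical data): a path graph over three cells, 0 - 1 - 2.
toy_adjacency = np.array([[0, 1, 0],
                          [1, 0, 1],
                          [0, 1, 0]])
toy_ppmi = ppmi_matrix(toy_adjacency)
```

In this sketch, pairs of cells that co-occur more often than expected under independence receive a positive weight, and all other pairs are set to zero, so the result stays sparse when the input graph is sparse.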

System Requirements

Hardware requirements

The scMGCA package requires only a standard computer with enough RAM to support the in-memory operations.

Software requirements

OS Requirements

This package is supported on Linux. It has been tested on the following system:

  • Linux: Ubuntu 18.04

Python Dependencies

scMGCA mainly depends on the Python scientific stack.

numpy
scipy
tensorflow
scikit-learn
pandas
sklearn

The full requirements are listed in the repository: https://github.com/Philyzh8/scMGCA

Installation Guide

Install from PyPI

$ conda create -n scMGCA_env python=3.6.8
$ conda activate scMGCA_env
$ pip install -r requirements.txt
$ pip install scMGCA
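
After installation, it can be useful to confirm that the dependencies listed above are importable in the active environment. The following is a small sketch using only the standard library. The helper name `check_modules` is hypothetical, and the module list simply mirrors the dependency list above (scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names for the dependencies listed above.
# The scikit-learn distribution is imported under the name "sklearn".
REQUIRED_MODULES = ["numpy", "scipy", "tensorflow", "sklearn", "pandas"]


def check_modules(names):
    """Map each top-level module name to True if it can be located."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(REQUIRED_MODULES).items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

This only checks that each module can be found; it does not verify that the installed versions match those pinned in requirements.txt.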

Usage

scMGCA is a deep graph embedding learning method for single-cell clustering, which can be used to:

  • Cluster single-cell data. An example is given in demo.py.
  • Correct the batch effect of data from different scRNA-seq protocols. An example is given in demo_batch.py.
  • Analyze the mouse brain data with 1.3 million cells. An example is given in demo_scale.py.
  • Search hyperparameters automatically. An example is given in demo_para.py.

Suggestions for running scMGCA are given in tutorial.md.

Data Availability

The real datasets we used can be downloaded from data.

License

This project is covered under the MIT License.

Download files

Source Distribution

scMGCA-1.0.7.tar.gz (12.3 kB)

Uploaded Source

Built Distribution

scMGCA-1.0.7-py3-none-any.whl (14.5 kB)

Uploaded Python 3

File details

Details for the file scMGCA-1.0.7.tar.gz.

File metadata

  • Download URL: scMGCA-1.0.7.tar.gz
  • Size: 12.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.25.1 requests-toolbelt/0.9.1 urllib3/1.26.4 tqdm/4.60.0 importlib-metadata/3.10.0 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.7.0

File hashes

Hashes for scMGCA-1.0.7.tar.gz

Algorithm    Hash digest
SHA256       9bf0a1973e0a807a8d420bbcb6fc2cde8ba82415070eb7ce39df3233df361d8c
MD5          4bc93a8f97613f7122a37142218cb6dc
BLAKE2b-256  32a4680997785249040e3204320b3fee3a3891e1ed282d1742bce63b3474f475

File details

Details for the file scMGCA-1.0.7-py3-none-any.whl.

File metadata

  • Download URL: scMGCA-1.0.7-py3-none-any.whl
  • Size: 14.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.25.1 requests-toolbelt/0.9.1 urllib3/1.26.4 tqdm/4.60.0 importlib-metadata/3.10.0 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.7.0

File hashes

Hashes for scMGCA-1.0.7-py3-none-any.whl

Algorithm    Hash digest
SHA256       f810aea2cce076ec57d2dcebd57fa8ccab7ff59c974c111ebb358eececfd38df
MD5          6a2a7143d803982ff62f742762e829e7
BLAKE2b-256  194946af3da522937ef14692efc0102896e9ed7d94b6971b5475114e9047ba10
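
The published digests can be used to check a downloaded file before installing it. Below is a minimal sketch using Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the file paths in the usage note assume the files were downloaded into the current directory.

```python
import hashlib

# SHA256 digests copied from the hash listings above.
PUBLISHED_SHA256 = {
    "scMGCA-1.0.7.tar.gz":
        "9bf0a1973e0a807a8d420bbcb6fc2cde8ba82415070eb7ce39df3233df361d8c",
    "scMGCA-1.0.7-py3-none-any.whl":
        "f810aea2cce076ec57d2dcebd57fa8ccab7ff59c974c111ebb358eececfd38df",
}


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_digest("scMGCA-1.0.7.tar.gz", PUBLISHED_SHA256["scMGCA-1.0.7.tar.gz"])` should return `True` for an intact download of the source distribution.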
