cNMF Solution Network Space

cNMF-SNS: powerful factorization-based multi-omics integration toolkit

Authors: Ted Verhey, Heewon Seo, Sorana Morrissy

cNMF-SNS (consensus Non-negative Matrix Factorization Solution Network Space) is a Python package for mosaic integration of bulk, single-cell, and spatial expression data between and within datasets. Datasets may have partially overlapping features (e.g., genes) as well as non-overlapping features. cNMF provides a robust, unsupervised deconvolution of each dataset into gene expression programs (GEPs). Network-based integration of GEPs enables flexible integration of many datasets across assays (e.g., protein, RNA-Seq, scRNA-Seq, spatial expression) and patient cohorts.

Communities with GEPs from multiple datasets can be annotated with dataset-specific annotations to facilitate interpretation.
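
To make the idea of deconvolution into GEPs concrete, here is a minimal sketch of plain NMF at several ranks, written with NumPy. It is not the cNMF-SNS API and omits the consensus step entirely; the function name `nmf`, the toy matrix, and the choice of multiplicative updates are all assumptions made for illustration.

```python
import numpy as np

def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Plain NMF (not consensus NMF): factor X ~ W @ H with W, H >= 0.

    Uses Lee-Seung multiplicative updates for the Frobenius objective.
    Rows of H play the role of gene expression programs (GEPs),
    rows of W hold each sample's usage of those programs.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    W = rng.random((n_samples, k)) + eps
    H = rng.random((k, n_features)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy data: 20 samples x 30 genes built from 2 hidden programs.
rng = np.random.default_rng(1)
X = rng.random((20, 2)) @ rng.random((2, 30))

# Factorize at several ranks ("resolutions") and record the reconstruction error.
errors = {}
for k in (1, 2, 3):
    W, H = nmf(X, k)
    errors[k] = float(np.linalg.norm(X - W @ H))
```

On this toy matrix the error at rank 2 falls well below the error at rank 1, which is the kind of signal used when choosing a rank. The real package adds replicate runs and a consensus step on top of this basic factorization.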

⚡Main Features

Here are just a few of the things that cNMF-SNS does well:

  • Identifies interpretable, non-negative programs at multiple resolutions
  • Mosaic integration does not require subsetting features/genes to a shared or overdispersed subset
  • Ideal for incremental integration (adding datasets one at a time) since deconvolution is performed independently on each dataset
  • Integration performs well even when the datasets have mismatched features (e.g., microarray, RNA-Seq, proteomics) or sparsity (e.g., single-cell vs. bulk RNA-Seq and ATAC-Seq)
  • Two interfaces: command-line interface for rapid data exploration and python interface for extensibility and flexibility
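
One way to picture integration without a global shared feature set is to compare each pair of GEPs only on the features that pair has in common, then keep strongly correlated pairs as network edges. The sketch below illustrates that idea in plain Python. Pearson correlation, the thresholds, the function names, and the gene weights are all assumptions for illustration and are not taken from the cNMF-SNS implementation.

```python
from itertools import product
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def gep_edges(geps_a, geps_b, min_corr=0.5, min_shared=3):
    """Link GEPs from two datasets using only the features each pair shares."""
    edges = []
    for (name_a, gep_a), (name_b, gep_b) in product(geps_a.items(), geps_b.items()):
        shared = sorted(set(gep_a) & set(gep_b))  # pairwise overlap only
        if len(shared) < min_shared:
            continue
        r = pearson([gep_a[g] for g in shared], [gep_b[g] for g in shared])
        if r >= min_corr:
            edges.append((name_a, name_b, round(r, 3)))
    return edges

# Hypothetical GEPs (feature -> weight); SOX2 and AKT1 are not shared.
rna = {
    "rna_GEP1": {"TP53": 0.9, "MYC": 0.1, "EGFR": 0.8, "GAPDH": 0.2, "SOX2": 0.7},
    "rna_GEP2": {"TP53": 0.1, "MYC": 0.9, "EGFR": 0.2, "GAPDH": 0.8, "SOX2": 0.1},
}
protein = {
    "prot_GEP1": {"TP53": 0.8, "MYC": 0.2, "EGFR": 0.9, "GAPDH": 0.1, "AKT1": 0.5},
}
edges = gep_edges(rna, protein)
```

Because the overlap is computed per pair, a third dataset with a different feature set can be added later without recomputing or subsetting the earlier ones.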

🔧 Install

☁️ Public Release

Install the package with conda (in an isolated conda environment)

conda create -n cnmfsns -c conda-forge cnmfsns
conda activate cnmfsns
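
After activating the environment, a quick check from Python can confirm that the package is visible. This sketch uses only the standard library and assumes the installed distribution is named `cnmfsns`, matching the conda package name above. The helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

# Prints a version string if the package is installed in this environment, else None.
print(installed_version("cnmfsns"))
```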

📖 Documentation

🗐 Data guidelines

cNMF-SNS can factorize a wide variety of datasets, but will work optimally in these conditions:

  • Use untransformed (raw) data where possible, and avoid log-transformed data.
  • For single-cell or spatial RNA-Seq data, prefer, in order: feature counts, then TPM-normalized values, then RPKM/FPKM-normalized values.
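
Before factorizing, it can help to sanity-check which kind of values a matrix holds. The following is a rough heuristic sketch, not part of cNMF-SNS: the function name, the category labels, and the cutoff of 30 for spotting log-scale data are assumptions chosen for illustration.

```python
def describe_values(values, log_scale_cutoff=30):
    """Heuristically classify a flat collection of expression values."""
    values = list(values)
    if not values:
        return "empty"
    if any(v < 0 for v in values):
        # Negative entries usually mean centering/scaling; NMF needs non-negative input.
        return "contains negatives"
    if all(float(v).is_integer() for v in values):
        return "count-like"
    if max(values) < log_scale_cutoff:
        # Small, fractional maxima are typical of log-transformed data (assumed cutoff).
        return "possibly log-transformed"
    return "normalized"

print(describe_values([0, 3, 12, 150]))        # raw counts
print(describe_values([0.0, 2.3, 7.9, 11.2]))  # looks log-scaled
print(describe_values([0.0, 15.5, 2048.7]))    # e.g. TPM-like values
```

A result of "possibly log-transformed" or "contains negatives" is a prompt to go back to the untransformed data where it is available.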

📓 Python interface

To get started, sample proteomics datasets and a Jupyter notebook tutorial are available here.

Detailed API reference can be found on ReadTheDocs.

⌨️ Command line interface

See the command line interface documentation.

💭 Getting Help

For errors arising during use of cNMF-SNS, browse existing issues or create a new one in the GitHub "issues" tab.


