
Bayesian NMF methods for mutational signature analysis & transcriptomic profiling on GPUs (Getz Lab).

Project description


Automatic Relevance Determination (ARD) NMF for mutational signature and expression data. Designed for scalability: it is built on PyTorch and runs on GPUs when they are available.

  • See the docs for a more in-depth description of how to use the method.

Requires Python 3.6.0 or higher.



pip3 install signatureanalyzer


Git Clone
  • git clone --recursive
  • cd getzlab-SignatureAnalyzer
  • pip3 install -e .

Note: the --recursive flag is required to clone submodules.



  • docker pull
  • docker run -it --rm

Source Publications

PCAWG Mutational Signatures

  • Alexandrov, L. B., Kim, J., Haradhvala, N. J., Huang, M. N., Ng, A. W. T., Wu, Y., ... & Islam, S. A. (2020). The repertoire of mutational signatures in human cancer. Nature, 578(7793), 94-101.
  • see:
  • see ./PCAWG/

SignatureAnalyzer-GPU source publication

SignatureAnalyzer-CPU source publications

  • Kim, J. et al. Somatic ERCC2 mutations are associated with a distinct genomic signature in urothelial tumors. Nat. Genet. 48, 600–606 (2016).

  • Kasar, S. et al. Whole-genome sequencing reveals activation-induced cytidine deaminase signatures during indolent chronic lymphocytic leukaemia evolution. Nat. Commun. 6, 8866 (2015).

Mathematical details

  • Tan, V. Y. F. & Févotte, C. Automatic Relevance Determination in Nonnegative Matrix Factorization with the β-Divergence. (2012).
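
As a rough illustration of the factorization that the reference above builds on, here is a minimal NumPy sketch of plain NMF with multiplicative updates for the generalized KL divergence (the β = 1 case, which corresponds to a Poisson likelihood). It is not the package's implementation: it omits the ARD prior that prunes unused components, it runs on the CPU only, and the function names, the matrix orientation (features × samples) and the toy data are all assumptions made for the example.

```python
import numpy as np


def kl_divergence(V, WH, eps=1e-10):
    """Generalized KL divergence D(V || WH), i.e. the beta = 1 case."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))


def nmf_kl(V, k, n_iter=200, eps=1e-10, seed=0):
    """Plain NMF (no ARD prior): V ~= W @ H with W, H >= 0.

    Hypothetical helper for illustration; V is features x samples.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = V.shape
    W = rng.random((n_features, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        # Multiplicative update for H, then for W; both keep entries non-negative.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H


# Toy count matrix (assumed shape): 96 mutation contexts x 20 samples,
# drawn from 3 hidden signatures.
rng = np.random.default_rng(1)
V = rng.poisson(rng.random((96, 3)) @ (50 * rng.random((3, 20)))).astype(float)

W, H = nmf_kl(V, k=3)
print("KL divergence after fitting:", kl_divergence(V, W @ H))
```

In this sketch the number of components k is fixed by hand. ARD-NMF instead starts from a larger K0 and lets the prior shrink irrelevant components toward zero, which is the part described in the paper and not reproduced here.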

Command Line Interface

usage: signatureanalyzer [-h] [-t {maf,spectra,matrix}] [-n NRUNS] [-o OUTDIR]
                         [--cosmic {cosmic2,cosmic3,cosmic3_exome,cosmic3_DBS,cosmic3_ID,cosmic3_TSB}]
                         [--hg_build HG_BUILD] [--cuda_int CUDA_INT]
                         [--verbose] [--K0 K0] [--max_iter MAX_ITER]
                         [--del_ DEL_] [--tolerance TOLERANCE] [--phi PHI]
                         [--a A] [--b B] [--objective {poisson,gaussian}]
                         [--prior_on_W {L1,L2}] [--prior_on_H {L1,L2}]
                         [--report_freq REPORT_FREQ]
                         [--active_thresh ACTIVE_THRESH] [--cut_norm CUT_NORM]
                         [--cut_diff CUT_DIFF]


signatureanalyzer input.maf -n 10 --cosmic cosmic2 --objective poisson
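
When the command line is driven from a script, it can help to assemble the arguments in Python and validate the choices before launching a long run. The helper below is a hypothetical sketch, not part of the package; it only uses flags and choices that appear in the usage text above.

```python
import shlex

# Choices copied from the usage text above.
COSMIC_CHOICES = {
    "cosmic2", "cosmic3", "cosmic3_exome",
    "cosmic3_DBS", "cosmic3_ID", "cosmic3_TSB",
}
OBJECTIVES = {"poisson", "gaussian"}


def build_command(input_path, nruns=10, cosmic="cosmic2",
                  objective="poisson", outdir=None):
    """Return the signatureanalyzer invocation as an argument list.

    Hypothetical helper: it only builds the list and does not run anything.
    """
    if cosmic not in COSMIC_CHOICES:
        raise ValueError(f"unknown --cosmic value: {cosmic!r}")
    if objective not in OBJECTIVES:
        raise ValueError(f"unknown --objective value: {objective!r}")
    cmd = ["signatureanalyzer", input_path,
           "-n", str(nruns),
           "--cosmic", cosmic,
           "--objective", objective]
    if outdir is not None:
        cmd += ["-o", outdir]
    return cmd


cmd = build_command("input.maf")
# Shell-quoted form, useful for logging the exact command.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the package is installed.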

Python API

import signatureanalyzer as sa

# ---------------------
# ---------------------

# Run array of decompositions with mutational signature processing
sa.run_maf('input.maf', outdir='./ardnmf_output/', cosmic='cosmic2', hg_build='./ref/hg19.2bit', nruns=10)

# Run ARD-NMF algorithm standalone

# ---------------------
# ---------------------
import pandas as pd

H = pd.read_hdf('nmf_output.h5', 'H')
W = pd.read_hdf('nmf_output.h5', 'W')
Hraw = pd.read_hdf('nmf_output.h5', 'Hraw')
Wraw = pd.read_hdf('nmf_output.h5', 'Wraw')
feature_signatures = pd.read_hdf('nmf_output.h5', 'signatures')
markers = pd.read_hdf('nmf_output.h5', 'markers')
cosine = pd.read_hdf('nmf_output.h5', 'cosine')
log = pd.read_hdf('nmf_output.h5', 'log')

# Output for each run may be found at...
Hrun1 = pd.read_hdf('nmf_output.h5', 'run1/H')
Wrun1 = pd.read_hdf('nmf_output.h5', 'run1/W')
# etc...

# Aggregate output information for each run
aggr = pd.read_hdf('nmf_output.h5', 'aggr')

# ---------------------
# ---------------------
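
The `cosine` table presumably holds cosine similarities between extracted signatures and the chosen COSMIC reference; the exact layout is not documented here, so treat that as an assumption. For readers who want to compute such a comparison themselves, for example against a custom reference, a minimal NumPy sketch follows. The function name and the column-per-signature orientation are illustrative choices, not the package's API.

```python
import numpy as np


def cosine_similarity_matrix(A, B, eps=1e-12):
    """Cosine similarity between every column of A and every column of B.

    Hypothetical helper: A is (n_features, n_a) and B is (n_features, n_b),
    with one signature per column. Returns an (n_a, n_b) matrix.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # Scale each column to unit length; eps guards against all-zero columns.
    A_unit = A / (np.linalg.norm(A, axis=0, keepdims=True) + eps)
    B_unit = B / (np.linalg.norm(B, axis=0, keepdims=True) + eps)
    return A_unit.T @ B_unit


# Toy example with made-up values: two extracted signatures, two references.
extracted = np.array([[0.7, 0.1],
                      [0.2, 0.1],
                      [0.1, 0.8]])
reference = np.array([[0.6, 0.0],
                      [0.3, 0.2],
                      [0.1, 0.8]])

sim = cosine_similarity_matrix(extracted, reference)
best_match = sim.argmax(axis=1)  # index of the closest reference per signature
print(sim.round(3))
print(best_match)
```

If the loaded `W` DataFrame stores one signature per column, `W.values` could be passed as `A`; check the actual layout of the output before relying on this.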

Download files


Source Distribution

signatureanalyzer-0.0.7.tar.gz (169.6 kB)

Uploaded source

Built Distribution

signatureanalyzer-0.0.7-py3-none-any.whl (179.1 kB)

Uploaded py3
