Bi-cross validation of NMF and signature generation and analysis

cvanmf

An implementation of bicrossvalidation for Non-negative Matrix Factorisation (NMF) rank selection, along with methods for analysis and visualisation of NMF decomposition.

For details on the method, please see:

Graphical Abstract

[Figure: cvaNMF graphical abstract]

The left section is a schematic depicting the procedures implemented in cvaNMF; on the right is a summary of results reported in the manuscript (in preparation).

Documentation

Documentation can be found at readthedocs.

Installation

cvanmf is available from bioconda:

conda install --name {envname} -c bioconda -c conda-forge cvanmf

or via pip:

pip install cvanmf

Overview

NMF is an unsupervised machine learning technique which provides a representation of a numeric input matrix $X$ as a mixture of $k$ underlying parts. In this package we refer to each of these parts as a signature. Each signature can be described by how much each feature contributes to it. For example, we can represent the abundance of bacteria in the human gut as a mixture of 5 signatures.
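To make the idea of a mixture of $k$ signatures concrete, the following is a minimal sketch of NMF using the classic multiplicative-update rules, written with NumPy only. It does not use or reproduce the cvanmf API; the function name `nmf`, its parameters, and the toy data are all assumptions made for illustration. The matrix is factorised as $X \approx WH$, where each column of $W$ describes how much each feature contributes to one signature, and each column of $H$ describes how much of each signature is present in one sample.

```python
import numpy as np


def nmf(X, k, n_iter=500, seed=0, eps=1e-10):
    """Illustrative NMF (not the cvanmf implementation).

    Factorises a non-negative matrix X (features x samples) as W @ H
    using multiplicative updates that reduce the Frobenius error.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    W = rng.random((n_features, k)) + eps  # features x signatures
    H = rng.random((k, n_samples)) + eps   # signatures x samples
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy data: 30 features, 12 samples, built from 5 hidden signatures.
rng = np.random.default_rng(1)
X = rng.random((30, 5)) @ rng.random((5, 12))

W, H = nmf(X, k=5)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(W.shape, H.shape, f"relative error: {rel_error:.4f}")
```

Here the rank is known because the data were simulated; with real data $k$ is unknown, which is the problem rank selection addresses.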

The number of signatures (or rank, $k$) has to be specified when performing NMF, and selecting an appropriate value for $k$ is an important step. We implement bicrossvalidation with Gabriel style holdouts. Broadly speaking, this method holds out one block of the matrix ($A$) and makes an estimate of it ($A'$) using the remainder of the matrix. How closely $A'$ resembles $A$ is used to identify an appropriate rank.
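The holdout procedure described above can be sketched in a few lines of NumPy. This is one common way to carry out a Gabriel style holdout and is not taken from the cvanmf source: the matrix is split into blocks $A$ (held-out rows and columns), $B$ (held-out rows, retained columns), $C$ (retained rows, held-out columns) and $D$ (retained rows and columns). NMF is fitted to $D$, the factors are extended to the held-out rows and columns using $B$ and $C$, and their product gives $A'$. The helper names `fit` and `holdout_error`, the one-third holdout size, and the simulated data are assumptions for illustration only.

```python
import numpy as np

EPS = 1e-10


def fit(X, k, rng, W=None, H=None, n_iter=300):
    """Multiplicative-update NMF; a factor that is passed in is held fixed."""
    fix_W, fix_H = W is not None, H is not None
    if not fix_W:
        W = rng.random((X.shape[0], k)) + EPS
    if not fix_H:
        H = rng.random((k, X.shape[1])) + EPS
    for _ in range(n_iter):
        if not fix_H:
            H = H * (W.T @ X) / (W.T @ W @ H + EPS)
        if not fix_W:
            W = W * (X @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def holdout_error(X, k, held_rows, held_cols, rng):
    """Relative error of the estimate A' of the held-out block A."""
    rows = np.zeros(X.shape[0], dtype=bool)
    cols = np.zeros(X.shape[1], dtype=bool)
    rows[held_rows] = True
    cols[held_cols] = True
    A = X[np.ix_(rows, cols)]    # held-out block, never used for fitting
    B = X[np.ix_(rows, ~cols)]   # held-out rows, retained columns
    C = X[np.ix_(~rows, cols)]   # retained rows, held-out columns
    D = X[np.ix_(~rows, ~cols)]  # retained rows and columns
    W_D, H_D = fit(D, k, rng)        # D ~ W_D @ H_D
    W_A, _ = fit(B, k, rng, H=H_D)   # B ~ W_A @ H_D, with H_D fixed
    _, H_A = fit(C, k, rng, W=W_D)   # C ~ W_D @ H_A, with W_D fixed
    A_est = W_A @ H_A                # the estimate A'
    return np.linalg.norm(A - A_est) / np.linalg.norm(A)


# Simulated data with 3 underlying signatures plus a little noise.
rng = np.random.default_rng(0)
X = rng.random((40, 3)) @ rng.random((3, 30)) + 0.05 * rng.random((40, 30))

errors = {}
for k in range(1, 7):
    reps = []
    for _ in range(3):  # a few random holdouts per rank
        held_rows = rng.permutation(X.shape[0])[: X.shape[0] // 3]
        held_cols = rng.permutation(X.shape[1])[: X.shape[1] // 3]
        reps.append(holdout_error(X, k, held_rows, held_cols, rng))
    errors[k] = float(np.mean(reps))

for k, err in errors.items():
    print(f"k={k}: mean held-out relative error {err:.3f}")
```

In this sketch, a rank whose held-out error is low and stops improving as $k$ grows is a candidate for the appropriate rank. The package's own procedure, measures, and defaults may differ; see the documentation for those.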

Input

Any numeric matrix can be used as input, with samples on columns and features on rows. Each row should describe the same kind of quantity, e.g. each is the abundance of a microbe, or the abundance of a transcript. A minimum of 2 samples is required. When the number of samples $n$ is close to the number of signatures $k$, signatures are likely to represent individual samples rather than broad patterns.
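As a rough illustration of the layout described above, the sketch below arranges a small table so that features are on rows and samples on columns, then runs some basic checks. The helper `check_input` is hypothetical and not part of cvanmf. The `min_ratio` threshold used to flag $n$ being close to $k$ is an arbitrary assumption, and the non-negativity check reflects a general property of NMF rather than anything stated here about the package.

```python
import numpy as np


def check_input(X, k, min_ratio=2.0):
    """Hypothetical helper: sanity-check a features x samples matrix.

    Returns a list of advisory notes; raises ValueError on hard problems.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError("input must be a 2-D matrix")
    n_features, n_samples = X.shape
    if n_samples < 2:
        raise ValueError("at least 2 samples (columns) are required")
    if (X < 0).any():
        # Assumption: NMF in general is defined for non-negative values.
        raise ValueError("NMF expects non-negative values")
    notes = []
    if n_samples < min_ratio * k:
        # min_ratio is an illustrative threshold, not a cvanmf rule.
        notes.append(
            f"n={n_samples} samples is close to k={k}; signatures may "
            "represent individual samples rather than broad patterns"
        )
    return notes


# A table stored with samples on rows (4 samples x 3 features) ...
by_sample = np.array([
    [5, 0, 1],
    [4, 1, 0],
    [0, 6, 2],
    [1, 5, 3],
])
# ... is transposed so that features are on rows and samples on columns.
X = by_sample.T

notes_small_k = check_input(X, k=2)
notes_large_k = check_input(X, k=3)
print(X.shape, notes_small_k, notes_large_k)
```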

Container

We provide a container image for linux/amd64 through the GitHub Container Registry (GHCR), with the current version being ghcr.io/apduncan/cvanmf:latest. It is intended either for running the cvanmf command-line tools, or for use as the container when running cvanmf within pipelines. Please see the documentation for more details.

References

If you use this tool please cite:

For details on the method, please see:

For background on NMF see:
