SSNMF
SSNMF contains a class for the (SS)NMF model and several multiplicative update methods to train different models.
Installation
To install SSNMF, run this command in your terminal:
$ pip install -U ssnmf
This is the preferred method to install SSNMF, as it will always install the most recent stable release.
If you don't have pip installed, these installation instructions can guide you through the process.
Usage
First, import the ssnmf package and the relevant class SSNMF. We also import numpy and scipy for experimentation.
>>> import ssnmf
>>> from ssnmf import SSNMF
>>> import numpy as np
>>> import scipy
>>> import scipy.sparse as sparse
>>> import scipy.optimize
Training an unsupervised model
Declare an unsupervised NMF model with data matrix X and number of topics k.
>>> X = np.random.rand(100,100)
>>> k = 10
>>> model = SSNMF(X,k)
You may access the factor matrices initialized in the model, e.g., to check relative reconstruction error ||X-AS||_F/||X||_F.
>>> rel_error = np.linalg.norm(model.X - model.A @ model.S, 'fro')/np.linalg.norm(model.X,'fro')
Run the multiplicative updates method for this unsupervised model for N iterations. This method tries to minimize the objective function ||X-AS||_F.
>>> N = 100
>>> model.mult(numiters = N)
This method updates the factor matrices N times. You can see how much the relative reconstruction error improves.
>>> rel_error = np.linalg.norm(model.X - model.A @ model.S, 'fro')/np.linalg.norm(model.X,'fro')
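The updates applied by mult are multiplicative updates in the style of the standard Lee–Seung rules for the Frobenius objective. As a point of comparison, here is a minimal pure-NumPy sketch of such updates (an illustrative re-implementation, not the package's exact code; the function name frob_mult_updates is ours):

```python
import numpy as np

def frob_mult_updates(X, k, numiters=100, eps=1e-10, seed=0):
    """Multiplicative updates minimizing ||X - AS||_F (Lee-Seung-style rules)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    A = rng.random((m, k))
    S = rng.random((k, n))
    for _ in range(numiters):
        # Update S, then A; eps guards against division by zero.
        S *= (A.T @ X) / (A.T @ A @ S + eps)
        A *= (X @ S.T) / (A @ S @ S.T + eps)
    return A, S

X = np.random.rand(100, 100)
A, S = frob_mult_updates(X, 10, numiters=100)
rel_error = np.linalg.norm(X - A @ S, 'fro') / np.linalg.norm(X, 'fro')
```

Because each update is multiplicative, nonnegativity of A and S is preserved automatically, and the Frobenius error is non-increasing across iterations.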
Training a supervised model
We begin by generating some synthetic data for testing.
>>> labelmat = np.concatenate((np.concatenate((np.ones([1,10]),np.zeros([1,30])),axis=1),np.concatenate((np.zeros([1,10]),np.ones([1,10]),np.zeros([1,20])),axis=1),np.concatenate((np.zeros([1,20]),np.ones([1,10]),np.zeros([1,10])),axis=1),np.concatenate((np.zeros([1,30]),np.ones([1,10])),axis=1)))
>>> B = sparse.random(4,10,density=0.2).toarray()
>>> S = np.zeros([10,40])
>>> for i in range(40):
...     S[:,i] = scipy.optimize.nnls(B,labelmat[:,i])[0]
>>> A = np.random.rand(40,10)
>>> X = A @ S
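The concatenations above build a block-structured one-hot label matrix: four classes, ten samples each, with sample i belonging to exactly one class. An equivalent, more compact construction (for illustration only; the README's explicit concatenation produces the same matrix):

```python
import numpy as np

# 4x40 one-hot label matrix: class c is "on" for samples 10c..10c+9.
labelmat = np.kron(np.eye(4), np.ones((1, 10)))

assert labelmat.shape == (4, 40)
assert (labelmat.sum(axis=0) == 1).all()  # every sample has exactly one label
```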
Declare a supervised NMF model with data matrix X, number of topics k, label matrix Y, and weight parameter lam.
>>> k = 10
>>> model = SSNMF(X,k,Y = labelmat,lam=100*np.linalg.norm(X,'fro'))
You may access the factor matrices initialized in the model, e.g., to check relative reconstruction error ||X-AS||_F/||X||_F and classification accuracy.
>>> rel_error = np.linalg.norm(model.X - model.A @ model.S, 'fro')/np.linalg.norm(model.X,'fro')
>>> acc = model.accuracy()
Run the multiplicative updates method for this supervised model for N iterations. This method tries to minimize the objective function ||X-AS||_F^2 + lam ||Y - BS||_F^2. This also saves the errors and accuracies at each iteration.
>>> N = 100
>>> [errs,reconerrs,classerrs,classaccs] = model.snmfmult(numiters = N,saveerrs = True)
This method updates the factor matrices N times. You can see how much the relative reconstruction error and classification accuracy improve.
>>> rel_error = reconerrs[99]/np.linalg.norm(X,'fro')
>>> acc = classaccs[99]
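In the SSNMF model, a natural classification rule assigns sample i to the class with the largest entry in column i of B S, compared against the one-hot labels Y. Assuming model.accuracy() follows this argmax rule (an assumption about the package; the helper below is ours), a minimal standalone sketch:

```python
import numpy as np

def classification_accuracy(Y, B, S):
    """Fraction of samples whose argmax over B @ S matches the one-hot labels Y."""
    pred = np.argmax(B @ S, axis=0)   # predicted class per sample (column)
    true = np.argmax(Y, axis=0)       # true class per sample
    return np.mean(pred == true)

# Toy check: perfect factors reproduce the labels exactly.
Y = np.kron(np.eye(2), np.ones((1, 3)))   # 2 classes, 3 samples each
B = np.eye(2)
S = Y.copy()
acc = classification_accuracy(Y, B, S)
```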
Training a supervised model with KL-divergence
We begin by generating some synthetic data for testing.
>>> labelmat = np.concatenate((np.concatenate((np.ones([1,10]),np.zeros([1,30])),axis=1),np.concatenate((np.zeros([1,10]),np.ones([1,10]),np.zeros([1,20])),axis=1),np.concatenate((np.zeros([1,20]),np.ones([1,10]),np.zeros([1,10])),axis=1),np.concatenate((np.zeros([1,30]),np.ones([1,10])),axis=1)))
>>> B = sparse.random(4,10,density=0.2).toarray()
>>> S = np.zeros([10,40])
>>> for i in range(40):
...     S[:,i] = scipy.optimize.nnls(B,labelmat[:,i])[0]
>>> A = np.random.rand(40,10)
>>> X = A @ S
Declare a supervised NMF model with data matrix X, number of topics k, label matrix Y, and weight parameter lam.
>>> k = 10
>>> model = SSNMF(X,k,Y = labelmat,lam=100*np.linalg.norm(X,'fro'))
You may access the factor matrices initialized in the model, e.g., to check the relative reconstruction error ||X-AS||_F/||X||_F, classification accuracy, and KL-divergence.
>>> rel_error = np.linalg.norm(model.X - model.A @ model.S, 'fro')/np.linalg.norm(model.X,'fro')
>>> acc = model.accuracy()
>>> div = model.kldiv()
Run the multiplicative updates method for this supervised model for N iterations. This method tries to minimize the objective function ||X-AS||_F^2 + lam D(Y||BS). This also saves the errors and accuracies at each iteration.
>>> N = 100
>>> [errs,reconerrs,classerrs,classaccs] = model.klsnmfmult(numiters = N,saveerrs = True)
This method updates the factor matrices N times. You can see how much the relative reconstruction error, classification accuracy, and KL-divergence improve.
>>> rel_error = reconerrs[99]/np.linalg.norm(X,'fro')
>>> acc = classaccs[99]
>>> div = classerrs[99]
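The divergence D(Y||BS) in this objective is the generalized (I-)divergence D(Y||M) = sum_ij [ Y_ij log(Y_ij / M_ij) - Y_ij + M_ij ], with the convention 0 log 0 = 0. A minimal sketch of this quantity (whether model.kldiv() matches this formula exactly is an assumption; the helper name gen_kl_div is ours):

```python
import numpy as np

def gen_kl_div(Y, M, eps=1e-12):
    """Generalized KL-divergence D(Y || M), treating 0*log(0) as 0."""
    Y = np.asarray(Y, dtype=float)
    M = np.asarray(M, dtype=float)
    # Only accumulate the log term where Y > 0; eps guards log/division at zero.
    logterm = np.where(Y > 0, Y * np.log((Y + eps) / (M + eps)), 0.0)
    return np.sum(logterm - Y + M)

# D(Y || Y) = 0, and the divergence is zero only at a perfect match.
Y = np.array([[1.0, 0.0], [0.0, 2.0]])
div_self = gen_kl_div(Y, Y)
```

Unlike the Frobenius norm, this divergence penalizes multiplicative (relative) errors, which is often a better fit for nonnegative count-like label data.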
Citing
If you use our code in an academic setting, please consider citing our code.
Development
See CONTRIBUTING.md for information related to developing the code.
Suggested Git Branch Strategy
1. master is for the most up-to-date development; you should very rarely commit directly to this branch.
2. Your day-to-day work should exist on branches separate from master.
3. It is recommended to commit to development branches and make pull requests to master.
4. It is recommended to use "Squash and Merge" commits when merging PRs. It makes each set of changes to master atomic and, as a side effect, naturally encourages small, well-defined PRs.
Hashes for ssnmf-0.0.1-py2.py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | e13ac808db0dfde7e8ee8429ae085a22fe293350aa9a6fd75393f8acdb59f080 |
| MD5 | 4dfca63e3e2363ac629b0f9771b80421 |
| BLAKE2b-256 | 505cae54655a795f055ac3bb1dff29e1ee03bf9e49e959b549a5147408a8aca8 |