
Implementation of Topic-Supervised Non-Negative Matrix Factorization



This repository contains a Python implementation of Topic-Supervised Non-Negative Matrix Factorization (TS-NMF) [1] that works with sparse matrices and exposes a scikit-learn-compatible API.

How it Works

From [1]: Suppose that one supervises k << n documents and identifies l << t topics that were contained in a subset of the documents. One can supervise the NMF method using this information, represented by an n×t topic supervision matrix L. The elements of L constrain the importance weights of matrix W and are of the following form:

$$L_{ij} = \begin{cases} 1 & \text{if topic } j \text{ is permitted in document } i, \\ 0 & \text{otherwise.} \end{cases}$$

Then, for a term-document matrix V and supervision matrix L, TS-NMF seeks matrices W and H that minimize

$$D_{TS}(W, H) = \lVert V - (W \circ L) H \rVert_F^2,$$

where ∘ denotes the Hadamard (element-wise) product operator.
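As a rough illustration of the objective described in [1], the squared Frobenius norm of V − (W ∘ L)H, the following NumPy sketch evaluates it on random dense matrices. The function name `ts_nmf_objective` and the toy dimensions are assumptions made for this example; this is not the package's internal code, and it performs no optimization.

```python
import numpy as np

def ts_nmf_objective(V, W, H, L):
    """Squared Frobenius norm of V - (W * L) @ H, with * element-wise.

    Hypothetical helper for illustration only; not part of tsnmf.
    """
    residual = V - (W * L) @ H
    return float(np.sum(residual ** 2))

rng = np.random.default_rng(0)
n_docs, n_terms, n_topics = 5, 8, 3
V = rng.random((n_docs, n_terms))    # documents x terms
W = rng.random((n_docs, n_topics))   # documents x topics
H = rng.random((n_topics, n_terms))  # topics x terms

# L has the same shape as W: 1 where a topic is permitted, 0 where it is not.
L = np.ones((n_docs, n_topics))
L[1] = [1, 0, 1]  # document 1 may only use topics 0 and 2
L[4] = [0, 1, 0]  # document 4 may only use topic 1

masked_W = W * L  # forbidden topic weights are zeroed out
loss = ts_nmf_objective(V, W, H, L)
```

With L set to all ones, the mask has no effect and the expression reduces to the ordinary NMF reconstruction error.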


You can install TS-NMF via pip:

pip install tsnmf

Or by cloning this repository and running, from its root directory:

python setup.py install


TS-NMF is used much like the decomposition.NMF module from scikit-learn. The only extra input is a list of lists containing the labels used to build the matrix L.

Suppose you want to extract 3 topics from 5 documents. The 5 documents must be represented as a matrix V; the most common way is to apply a TF-IDF vectorizer, which weights each word by how important it is to a document.
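One possible way to build V is with scikit-learn's `TfidfVectorizer`, sketched below. The five toy documents are invented for illustration, and the default vectorizer settings are an assumption; real corpora usually need tuning (stop words, minimum document frequency, and so on).

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Five toy documents, made up for this example.
documents = [
    "the cat sat on the mat",
    "stocks fell as markets reacted to interest rates",
    "the dog chased the cat",
    "the team won the football match",
    "central banks raised interest rates again",
]

vectorizer = TfidfVectorizer()
# V is a SciPy sparse matrix with one row per document and one column per term.
V = vectorizer.fit_transform(documents)
```

The result is sparse, with one row per document and one column per vocabulary term, so its rows line up with the entries of the labels list.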

Each element of the list of lists of labels corresponds to a document and contains the list of topics to which that document is constrained. For example:

labels = [[],      # document 0: no constraint
          [0, 2],  # document 1
          [],      # document 2: no constraint
          [],      # document 3: no constraint
          [1]]     # document 4

means that document 1 is constrained to topic 0 or 2 and document 4 to topic 1. For the other documents, whose lists are empty, all topics are permitted.
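To make the relationship between the labels and the matrix L concrete, here is a minimal sketch of how such a list of lists could be turned into a binary mask. The helper `build_supervision_matrix` is hypothetical and written only for this example; the package builds L internally and may do so differently.

```python
import numpy as np

def build_supervision_matrix(labels, n_topics):
    """Return a (n_documents, n_topics) 0/1 mask from a list of topic lists.

    Hypothetical helper for illustration; not part of the tsnmf API.
    An empty list means the document is unconstrained (row of ones).
    """
    L = np.ones((len(labels), n_topics))
    for doc_idx, topics in enumerate(labels):
        if topics:
            L[doc_idx] = 0          # forbid every topic first
            L[doc_idx, topics] = 1  # then re-enable the permitted ones
    return L

labels = [[], [0, 2], [], [], [1]]
L = build_supervision_matrix(labels, n_topics=3)
```

Here rows 0, 2 and 3 stay all ones, row 1 permits topics 0 and 2, and row 4 permits only topic 1.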

Finally, to run TS-NMF:

from tsnmf import TSNMF

tsnmf = TSNMF(n_components=3, random_state=1)
W = tsnmf.fit_transform(V, labels=labels)
H = tsnmf.components_


  • Developed mainly by Victor Navarro (@vokturz), under the guidance of Eduardo Graells-Garrido (@carnby), in the context of CONICYT Fondo de Fomento al Desarrollo Científico y Tecnológico (FONDECYT) Proyecto de Iniciación 11180913.
  • Based on scikit-learn's NMF code and the original ws-nmf.


  1. MacMillan, Kelsey, and James D. Wilson. "Topic supervised non-negative matrix factorization." arXiv preprint arXiv:1706.05084 (2017).


Files for tsnmf, version 1.0.4:

  • tsnmf-1.0.4-py3-none-any.whl (9.0 kB), wheel, Python version py3
  • tsnmf-1.0.4.tar.gz (8.1 kB), source distribution
