Implementation of Topic-Supervised Non-Negative Matrix Factorization
This repository contains a Python implementation of Topic-Supervised Non-Negative Matrix Factorization (TS-NMF) that supports sparse matrices and exposes a Scikit-Learn-compatible API.
How it Works
From MacMillan and Wilson (2017): Suppose that one supervises k << n documents and identifies l << t topics that were contained in a subset of the documents. One can supervise the
NMF method using this information, represented by an n×d topic supervision matrix L. The elements of L constrain the importance weights of matrix W and are of the following form:
L_ij = 1 if topic j is permitted in document i, and L_ij = 0 otherwise.
Then, for a term-document matrix V and supervision matrix L, TS-NMF seeks matrices W and H that minimize
||V − (W ○ L)H||_F^2
where ○ represents the Hadamard (element-wise) product operator and ||·||_F is the Frobenius norm.
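To make the role of L concrete, here is a minimal NumPy sketch of a masked objective of this kind, together with a few generic Lee-Seung style multiplicative updates. This is not the package's solver (which is based on scikit-learn's NMF code); the function names, the toy sizes, and the random data are all assumptions made for illustration, and V is taken to have one row per document.

```python
import numpy as np

def ts_nmf_objective(V, W, H, L):
    """Squared Frobenius norm of V - (W * L) @ H (illustrative helper)."""
    residual = V - (W * L) @ H
    return float(np.sum(residual ** 2))

def multiplicative_step(V, W, H, L, eps=1e-10):
    """One generic multiplicative update applied to the masked factor W * L."""
    WL = W * L
    H = H * (WL.T @ V) / (WL.T @ WL @ H + eps)
    W = WL * (V @ H.T) / (WL @ H @ H.T + eps)
    return W * L, H  # entries masked out by L stay exactly zero

rng = np.random.default_rng(0)
n, t, d = 5, 8, 3              # documents, terms, topics (toy sizes)
V = rng.random((n, t))         # stand-in for a non-negative document-term matrix
L = np.ones((n, d))
L[1] = [1, 0, 1]               # document 1: only topics 0 and 2 permitted
L[4] = [0, 1, 0]               # document 4: only topic 1 permitted
W = rng.random((n, d))
H = rng.random((d, t))

loss_before = ts_nmf_objective(V, W, H, L)
for _ in range(50):
    W, H = multiplicative_step(V, W, H, L)
loss_after = ts_nmf_objective(V, W, H, L)
```

Because the updates only ever multiply existing entries, a zero in W ○ L remains zero, which is how a forbidden topic is kept out of a supervised document in this sketch.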
You can install TS-NMF via pip:
pip install tsnmf
Or by cloning this repository and running:
python setup.py install
TS-NMF is used in much the same way as the
decomposition.NMF class from Scikit-Learn. The only extra input is a
list of lists containing the labels used to build the matrix L.
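For comparison, this is what the unsupervised baseline looks like with scikit-learn's own NMF class. The random matrix is a stand-in for real data, and the parameter values are chosen only for illustration.

```python
import numpy as np
from sklearn.decomposition import NMF

# Stand-in for a non-negative document-term matrix: 5 documents, 8 terms.
rng = np.random.default_rng(0)
V_demo = rng.random((5, 8))

# Plain NMF: no labels, so every topic is allowed in every document.
nmf = NMF(n_components=3, init="random", random_state=1, max_iter=500)
W_demo = nmf.fit_transform(V_demo)   # document-topic weights, shape (5, 3)
H_demo = nmf.components_             # topic-term weights, shape (3, 8)
```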
Suppose you want to extract 3 topics from 5 documents. The 5 documents should be represented as a matrix
V; the most common approach is to apply a TF-IDF vectorizer, which reflects how important a word is to a document.
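One way to build such a matrix is scikit-learn's TfidfVectorizer. The five short documents below are made up for illustration; any list of strings would do.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Five toy documents (illustrative only).
documents = [
    "the cat sat on the mat",
    "stock markets fell as interest rates rose",
    "the dog chased the cat around the garden",
    "the team won the final match of the season",
    "central banks raised interest rates again",
]

vectorizer = TfidfVectorizer()
V = vectorizer.fit_transform(documents)   # SciPy sparse matrix, one row per document
n_terms = len(vectorizer.vocabulary_)     # vocabulary_ maps each term to its column index
```

The result is sparse and non-negative, which is what a non-negative factorization needs as input.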
Each element of the
list of lists of labels corresponds to a document and contains the list of topics to which that document is constrained. For example,
labels = [[],
          [0, 2],  # document 1
          [],
          [],
          [1]]     # document 4
means that document 1 is constrained to topics 0 or 2 and document 4 to topic 1. For the other documents, all topics are permitted.
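As a rough sketch of how such labels could translate into the supervision matrix L, the hypothetical helper below fills a documents-by-topics array with ones and zeros. It is not taken from the package; in particular, it assumes that an empty list means "no constraint", with documents indexed from 0.

```python
import numpy as np

def build_supervision_matrix(labels, n_topics):
    """Hypothetical helper: turn a list of lists of labels into a 0/1 matrix L."""
    L = np.ones((len(labels), n_topics))
    for doc, topics in enumerate(labels):
        if topics:                 # assumption: an empty list leaves every topic permitted
            L[doc] = 0.0
            L[doc, topics] = 1.0   # permit only the listed topics
    return L

labels = [[], [0, 2], [], [], [1]]
L = build_supervision_matrix(labels, n_topics=3)
```

Row 1 of the result permits topics 0 and 2, row 4 permits only topic 1, and the remaining rows are all ones.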
Finally, to run TS-NMF:
from tsnmf import TSNMF
tsnmf = TSNMF(n_components=3, random_state=1)
W = tsnmf.fit_transform(V, labels=labels)
H = tsnmf.components_
- Developed mainly by Victor Navarro (@vokturz), under the guidance of Eduardo Graells-Garrido (@carnby), in the context of CONICYT Fondo de Fomento al Desarrollo Científico y Tecnológico (FONDECYT) Proyecto de Iniciación 11180913.
- Based on scikit-learn's NMF code and the original ws-nmf.
- MacMillan, Kelsey, and James D. Wilson. "Topic supervised non-negative matrix factorization." arXiv preprint arXiv:1706.05084 (2017).