PCMF is a Python package for Positive Collective Matrix Factorization (PCMF), a model that combines the interpretability of NMF with the extensibility of CMF.

Positive Collective Matrix Factorization (PCMF)

We propose Positive Collective Matrix Factorization (PCMF). PCMF is a model that combines the interpretability of NMF and the extensibility of CMF.

Description of PCMF

Problem setting

PCMF targets the case where two relational datasets (matrices X and Y) share one entity set, and you want to predict both matrices and extract a feature representation for each entity set at the same time.

Example

Two relational datasets (matrices X and Y)

X: Patient-disease matrix
Y: Patient-patient attribute matrix

In this example, the patient set is shared by both matrices.

Feature representations

Patient matrix
Disease matrix
Patient attribute matrix

Detailed description of PCMF

PCMF combines the main advantage of NMF (interpretability) with the main advantage of CMF (extensibility). Interpretability is achieved by passing every element of each feature matrix through a softplus function, which makes it positive. The model is trained by backpropagation.

[Figure: illustration of PCMF]

Example

The model is described below using the previous example.

Applying the softplus function to the raw patient matrix gives the positive patient matrix.
Applying the softplus function to the raw disease matrix gives the positive disease matrix.
Applying the softplus function to the raw patient attribute matrix gives the positive patient attribute matrix.
Applying the link function to the product of the positive patient matrix and the positive disease matrix yields the predicted patient-disease matrix.
Applying the link function to the product of the positive patient matrix and the positive patient attribute matrix yields the predicted patient-patient attribute matrix.
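The five steps above can be sketched in NumPy as follows. This is an illustrative sketch, not the package's implementation: the names U, V, and Z for the raw patient, disease, and patient attribute matrices, the matrix sizes, the transposes, the sigmoid link, and the shift value are all assumptions made for the example.

```python
import numpy as np

def softplus(a):
    # Numerically stable log(1 + exp(a)), applied element-wise.
    return np.logaddexp(0.0, a)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
n_patients, n_diseases, n_attributes, d = 6, 4, 3, 2  # hypothetical sizes

# Raw (unconstrained) feature matrices; the names are hypothetical.
U = rng.normal(size=(n_patients, d))    # patients
V = rng.normal(size=(n_diseases, d))    # diseases
Z = rng.normal(size=(n_attributes, d))  # patient attributes

# Steps 1-3: softplus makes every element positive.
U_pos, V_pos, Z_pos = softplus(U), softplus(V), softplus(Z)

# Steps 4-5: apply a link function to the products. A sigmoid link with a
# constant shift is assumed here; the shift value 1.0 is arbitrary.
shift = 1.0
X_hat = sigmoid(U_pos @ V_pos.T - shift)  # predicted patient-disease matrix
Y_hat = sigmoid(U_pos @ Z_pos.T - shift)  # predicted patient-patient attribute matrix
```

Because the patient factors appear in both products, information from one relation can inform predictions for the other.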

Softplus function

The softplus function, softplus(x) = log(1 + exp(x)), is a strictly monotonically increasing function that takes a positive value for every real number x. It is applied to each element of a matrix and outputs a matrix of the same size.
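A minimal NumPy sketch of an element-wise softplus is shown below. The helper name softplus is ours, not the package's, and np.logaddexp is used only as one numerically stable way to evaluate it.

```python
import numpy as np

def softplus(a):
    """Element-wise softplus, log(1 + exp(a)), computed stably."""
    return np.logaddexp(0.0, np.asarray(a, dtype=float))

M = np.array([[-3.0, 0.0], [2.0, 50.0]])
P = softplus(M)

print(P.shape == M.shape)    # True: the output has the same size as the input
print(bool((P > 0).all()))   # True: every element is positive
```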

Link function

Because of the softplus function, the input to a PCMF link function is always positive. Choose the link function according to the range and purpose of the matrix you are predicting.

When the values of the matrix to be predicted lie in (-∞, ∞):
Log function.
When the values of the matrix to be predicted lie in (0, ∞):
Linear function.
When the values of the matrix to be predicted lie in {0, 1}:
Sigmoid function. (Because the sigmoid outputs 0.5 or more for any input of 0 or more, a common positive constant is subtracted from every input so that outputs below 0.5 are reachable.)
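The three choices can be sketched as plain NumPy functions. The function names and the default shift of 1.0 are hypothetical and chosen only for illustration; they are not taken from the package.

```python
import numpy as np

def link_log(a):
    # Positive input -> any real value.
    return np.log(a)

def link_linear(a):
    # Positive input -> positive value (identity).
    return a

def link_sigmoid(a, shift=1.0):
    # Positive input -> value in (0, 1). Subtracting a common positive
    # constant lets the output fall below 0.5.
    return 1.0 / (1.0 + np.exp(-(a - shift)))

a = np.array([0.1, 1.0, 10.0])  # link inputs are always positive in PCMF
print(link_log(a))
print(link_linear(a))
print(link_sigmoid(a))
```

With the shift, inputs smaller than the constant map below 0.5 and inputs larger than it map above 0.5.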

Feature representation analysis

Feature representation analysis is performed on the feature matrices extracted by PCMF. Note that PCMF returns these matrices in their softplus-transformed (positive) form as the final output.
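One simple way to inspect a positive feature matrix is to list, for each latent factor, the rows with the largest weights. The sketch below uses a made-up disease matrix and a hypothetical helper name; it does not call the package.

```python
import numpy as np

def top_items_per_factor(F, names, k=2):
    """For each latent factor (column of F), return the k row names with the largest weights."""
    order = np.argsort(-F, axis=0)[:k]  # shape (k, n_factors)
    return [[names[i] for i in order[:, j]] for j in range(F.shape[1])]

# Hypothetical positive disease matrix: 4 diseases x 2 latent factors.
V_pos = np.array([[0.9, 0.1],
                  [0.2, 1.5],
                  [1.2, 0.3],
                  [0.1, 0.8]])
diseases = ["disease_a", "disease_b", "disease_c", "disease_d"]

print(top_items_per_factor(V_pos, diseases))
# [['disease_c', 'disease_a'], ['disease_b', 'disease_d']]
```

Because every weight is positive, a larger value can be read directly as a stronger association with that factor.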

CMF and NMF (reference)

Non-Negative Matrix Factorization (NMF) and Collective Matrix Factorization (CMF) are existing matrix factorization methods. Their features are as follows.

Non-Negative Matrix Factorization (NMF) [1][2]

NMF approximates the original matrix by the product of two non-negative matrices.

Advantages
Because the factors are non-negative, highly interpretable feature representations can be obtained.
Disadvantages
Extensibility is low because multiple relations cannot be considered.
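For reference, NMF with the multiplicative update rules for the squared-error objective [2] can be sketched as follows. The function name, iteration count, and test data are assumptions made for the example.

```python
import numpy as np

def nmf(X, d, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative X (n x m) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, d)) + eps
    H = rng.random((d, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every element non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

rng = np.random.default_rng(1)
X = rng.random((8, 3)) @ rng.random((3, 6))  # non-negative, rank 3
W, H = nmf(X, d=3)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(rel_err)  # relative reconstruction error
```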

Collective Matrix Factorization (CMF) [3]

CMF factorizes two or more relational matrices simultaneously when an entity set takes part in multiple relations.

Advantages
It can consider multiple relations and, through link functions, produce flexible outputs, so it is highly extensible.
Disadvantages
Interpretability is low because both positive and negative values appear in the elements of the factor matrices.
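A minimal sketch of the idea, assuming linear links, a squared-error loss, and plain gradient descent: X and Y are factored jointly with a shared factor U. The function name, the weighting parameter alpha, and all hyperparameter values are choices made for this example.

```python
import numpy as np

def cmf(X, Y, d, alpha=0.5, lr=0.005, n_iter=1000, seed=0):
    """Jointly factor X ~ U @ V.T and Y ~ U @ Z.T with a shared U."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.normal(size=(X.shape[0], d))
    V = 0.1 * rng.normal(size=(X.shape[1], d))
    Z = 0.1 * rng.normal(size=(Y.shape[1], d))
    history = []
    for _ in range(n_iter):
        Rx = U @ V.T - X  # residual on X
        Ry = U @ Z.T - Y  # residual on Y
        history.append(alpha * np.sum(Rx**2) + (1 - alpha) * np.sum(Ry**2))
        # Gradients of the weighted squared error.
        gU = 2 * (alpha * Rx @ V + (1 - alpha) * Ry @ Z)
        gV = 2 * alpha * Rx.T @ U
        gZ = 2 * (1 - alpha) * Ry.T @ U
        U, V, Z = U - lr * gU, V - lr * gV, Z - lr * gZ
    return U, V, Z, history

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 2))  # shared "true" factor
X = A @ rng.normal(size=(2, 4))
Y = A @ rng.normal(size=(2, 3))
U, V, Z, history = cmf(X, Y, d=2)
print(history[0], history[-1])  # weighted squared error at the first and last step
```

Nothing constrains the signs of U, V, and Z here, which is the interpretability issue noted above.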

Installation

You can install PCMF from PyPI.

pip install pcmf
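To confirm that the install succeeded, you can query the installed distribution version with the standard library. This is a generic sketch (Python 3.8 or later); the helper name is hypothetical.

```python
from importlib import metadata  # Python 3.8+

def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None

print(installed_version("pcmf"))  # version string if installed, otherwise None
```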

Usage

For more details, see examples/How_to_use_PCMF.ipynb in the GitHub repository.

Training

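# X, Y: the two relational matrices; wY: weights passed as weight_Y.
# The import path below is an assumption; see examples/How_to_use_PCMF.ipynb for the exact one.
from pcmf import Positive_Collective_Matrix_Factorization

cmf = Positive_Collective_Matrix_Factorization(X, Y, alpha=0.5, d_hidden=12, lamda=0.1)
cmf.train(link_X='sigmoid', link_Y='sigmoid',
          weight_X=None, weight_Y=wY,
          optim_steps=501, verbose=50, lr=0.05)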
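The argument names suggest an objective that balances the two reconstruction losses and adds a regularizer. The sketch below is one plausible reading, not confirmed by the source: it assumes alpha weights the loss on X against the loss on Y, lamda scales an L2 penalty on the raw feature matrices, and the weight matrices multiply the element-wise losses. All function and variable names are hypothetical.

```python
import numpy as np

def softplus(a):
    return np.logaddexp(0.0, a)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def bce(target, pred, weight=None, eps=1e-12):
    # Element-wise binary cross-entropy, optionally weighted, then summed.
    loss = -(target * np.log(pred + eps) + (1 - target) * np.log(1 - pred + eps))
    return float(np.sum(loss if weight is None else weight * loss))

def objective(X, Y, U, V, Z, alpha=0.5, lamda=0.1, wX=None, wY=None, shift=1.0):
    # Assumed form: alpha * loss(X) + (1 - alpha) * loss(Y) + lamda * L2 penalty.
    Up, Vp, Zp = softplus(U), softplus(V), softplus(Z)
    X_hat = sigmoid(Up @ Vp.T - shift)
    Y_hat = sigmoid(Up @ Zp.T - shift)
    reg = sum(np.sum(M**2) for M in (U, V, Z))
    return alpha * bce(X, X_hat, wX) + (1 - alpha) * bce(Y, Y_hat, wY) + lamda * reg

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(5, 4)).astype(float)  # toy binary matrices
Y = rng.integers(0, 2, size=(5, 3)).astype(float)
U, V, Z = (rng.normal(size=(n, 2)) for n in (5, 4, 3))
value = objective(X, Y, U, V, Z)
print(value)
```

In the package, a quantity of this kind would be minimized by backpropagation with an optimizer such as Adam [6]; the sketch only evaluates it.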

License

MIT License

Citation

You may use our package (PCMF) under the MIT License. If you use this program in your research, please cite:

PCMF Package

@misc{sumiya2021pcmf,
  author = {Sumiya, Yuki and Matsui, Ryo and Kondo, Kensho and Nakata, Kazuhide},
  title = {PCMF},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {https://github.com/N-YS-KK/PCMF}
}

PCMF Paper (in Japanese)

@article{sumiya2021pcmf,
  title={Patient Disease Prediction and Medical Feature Extraction using Matrix Factorization},
  author={Sumiya, Yuki and Matsuda, Atsuyoshi and Araki, Kenji and Nakata, Kazuhide},
  journal={The Japanese Society for Artificial Intelligence},
  year={2021}
}

Reference

References [5], [6], and [7] describe methods and software used in the code.

[1] Daniel D. Lee and H. Sebastian Seung. “Learning the parts of objects by non-negative matrix factorization.” Nature 401.6755 (1999): 788-791.

[2] Daniel D. Lee and H. Sebastian Seung. “Algorithms for non-negative matrix factorization.” Advances in neural information processing systems 13 (2001): 556-562.

[3] Ajit P. Singh and Geoffrey J. Gordon. “Relational learning via collective matrix factorization.” Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2008): 650-658.

[4] Yuki Sumiya, Kazuhide Nakata, Atsuyoshi Matsuda and Kenji Araki. “Patient Disease Prediction and Relational Data Mining using Matrix Factorization.” The 40th Joint Conference on Medical Informatics (2020).

[5] David E. Rumelhart, Geoffrey E. Hinton and Ronald J. Williams. “Learning representations by back-propagating errors.” Nature 323.6088 (1986): 533-536.

[6] Diederik P. Kingma and Jimmy Ba. “Adam: A method for stochastic optimization.” arXiv preprint arXiv:1412.6980 (2014).

[7] Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mane, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viegas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu and Xiaoqiang Zheng. “TensorFlow: Large-scale machine learning on heterogeneous distributed systems.” arXiv preprint arXiv:1603.04467 (2016).

Release history

0.1.5 (this version): Sep 1, 2021
0.1.4: Aug 10, 2021
0.1.3: Aug 9, 2021
0.1.2: Aug 7, 2021
0.1.1: Jul 27, 2021
0.1.0: Jul 27, 2021
