
A Python package for QR based PCA decomposition with CUDA acceleration via torch.

Project description

QRPCA

qrpca works similarly to sklearn.decomposition, but employs a QR-based PCA decomposition and supports CUDA acceleration via torch.
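
As background, the idea behind QR-accelerated PCA can be sketched in a few lines of numpy. This is an illustration of the underlying technique only, not the qrpca API: for a tall matrix X (many more rows than columns), take the economy QR of the centered data and run the SVD on the small triangular factor R instead of on X itself.

```python
import numpy as np

# Illustrative QR-based PCA sketch (not the qrpca package API).
def qr_pca(X, n_components):
    Xc = X - X.mean(axis=0)         # center each column
    Q, R = np.linalg.qr(Xc)         # economy QR: Q is n x p, R is p x p
    _, S, Vt = np.linalg.svd(R)     # SVD of the small p x p factor only
    components = Vt[:n_components]  # principal axes (rows of Vt)
    return Xc @ components.T        # projected data (scores)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
scores = qr_pca(X, 5)
print(scores.shape)  # (500, 5)
```

Since Xc = QR implies Xc.T @ Xc = R.T @ R, the right singular vectors of R coincide (up to sign) with those of Xc, so the projection matches a direct SVD-based PCA.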

How to install qrpca

qrpca can be installed from PyPI with pip:

pip install qrpca

Alternatively, you can clone the repository and install it from the qrpca directory:

git clone https://github.com/xuquanfeng/qrpca
cd qrpca
python setup.py install

The source code is hosted on GitHub at https://github.com/xuquanfeng/qrpca.

Usage

Here is a demo for the use of qrpca.

The following demo performs a principal component analysis and retains the principal components that carry 95% of the variance.

You can set the parameter n_component_ratio to a value between 0 and 1 to keep the corresponding proportion of the total variance, or to an integer n to keep exactly n components.

import torch
import numpy as np
from qrpca.decomposition import qrpca
from qrpca.decomposition import svdpca

# Generate the random data
demo_data = torch.rand(60000,2000)
n_com = 0.95

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# qrpca
pca = qrpca(n_component_ratio=n_com,device=device) # The percentage of information retained.
# pca = qrpca(n_component_ratio=10,device=device) # n principal components are reserved.
demo_qrpca = pca.fit_transform(demo_data)
print(demo_qrpca)

# SVDPCA
pca = svdpca(n_component_ratio=n_com,device=device)
demo_svdpca = pca.fit_transform(demo_data)
print(demo_svdpca)

Comparison with sklearn

The methods and usage of qrpca are almost identical to those of sklearn.decomposition.PCA. To switch from sklearn to qrpca, all you have to do is change the import and, if you have a GPU, declare the device.

Here is an illustration of how small the change is across the different PCA implementations:

  • qrpca.decomposition.qrpca
from qrpca.decomposition import qrpca    
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
pca = qrpca(n_component_ratio=n_com,device=device)
demo_qrpca = pca.fit_transform(demo_data)
  • qrpca.decomposition.svdpca
from qrpca.decomposition import svdpca
pca = svdpca(n_component_ratio=n_com)
demo_svdpca = pca.fit_transform(demo_data)
  • sklearn.decomposition.PCA
from sklearn.decomposition import PCA
pca = PCA(n_components=n_com)
demo_pca = pca.fit_transform(demo_data)
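
For reference, the sklearn variant above runs standalone. A quick check (with randomly generated stand-in data, since demo_data is not defined in the snippet) that the ratio form keeps at least 95% of the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in data, since the snippets above assume demo_data already exists.
rng = np.random.default_rng(0)
demo_data = rng.normal(size=(1000, 50))

pca = PCA(n_components=0.95)  # keep the smallest number of components
demo_pca = pca.fit_transform(demo_data)  # reaching 95% explained variance
print(demo_pca.shape[1], "components retained")
print(pca.explained_variance_ratio_.sum())
```

The qrpca calls differ only in the parameter name (n_component_ratio) and the optional device argument.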

Performance benchmark against sklearn

With GPU acceleration, both the QR decomposition and the singular value decomposition in qrpca run much faster than their counterparts in sklearn.

We ran the different PCA methods on data with varying numbers of rows and columns, compared their decomposition times, and plotted the distributions. Here are the two plots.

Comparison of PCA decomposition time across methods for different numbers of rows, with 1000 columns.

Comparison of PCA decomposition time across methods for different numbers of columns, with 30000 rows.

The two plots above show that qrpca can considerably cut run time through GPU acceleration, while keeping the migration cost from sklearn very low.
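
The numbers above come from the authors' GPU benchmark. Purely as a CPU-only illustration of why the QR route is attractive on tall matrices, one can time a direct SVD of the data against a QR factorization followed by an SVD of the small triangular factor. Absolute times depend entirely on the machine and the LAPACK build (a good LAPACK may already use a similar trick internally), so this is a sketch, not a reproduction of the benchmark:

```python
import time
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(10000, 200))
Xc = X - X.mean(axis=0)

t0 = time.perf_counter()
_, _, Vt_svd = np.linalg.svd(Xc, full_matrices=False)  # SVD of 10000 x 200
t_svd = time.perf_counter() - t0

t0 = time.perf_counter()
Q, R = np.linalg.qr(Xc)          # economy QR: R is 200 x 200
_, _, Vt_qr = np.linalg.svd(R)   # SVD of the small factor only
t_qr = time.perf_counter() - t0

print(f"direct SVD: {t_svd:.3f}s  QR + small SVD: {t_qr:.3f}s")
```

Both routes recover the same principal axes up to sign, which is what makes the substitution safe.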

Requirements

  • numpy>=1.21.1
  • pandas>=1.3.5
  • torch>=1.8.1
  • torchvision>=0.8.0
  • cudatoolkit>=0.7.1
  • scikit-learn>=1.0.2

Copyright & License

2022 Xu Quanfeng (xuquanfeng@shao.ac.cn) & Rafael S. de Souza (drsouza@shao.ac.cn) & Shen Shiyin (ssy@shao.ac.cn) & Peng Chen (pengchzn@gmail.com)

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

References

  • Sharma, Alok and Paliwal, Kuldip K. and Imoto, Seiya and Miyano, Satoru 2013, International Journal of Machine Learning and Cybernetics, 4, 6, doi: 10.1007/s13042-012-0131-7 .

Citing qrpca

If you want to cite qrpca, please use the following citations.

Software Citation: Xu Quanfeng, & Rafael S. de Souza. (2022). PCA algorithm of QR accelerated SVD decomposition (1.5). Zenodo. https://doi.org/10.5281/zenodo.6555926
