
Visualizing and propagating uncertainty in PCA

VIPurPCA

VIPurPCA visualizes uncertainty that has been propagated through Principal Component Analysis (PCA), a dimensionality reduction technique, using automatic differentiation.

Installation

VIPurPCA requires Python 3.7.3 or later and can be installed via:

pip install vipurpca

A website showing results and animations can be found here.

Usage

Propagating uncertainty through PCA and visualizing output uncertainty as an animated scatter plot

To propagate uncertainty through PCA, use the class PCA, which has the following parameters, attributes, and methods:

Parameters
matrix : array_like
Array of size [n, p] containing mean numbers to which VIPurPCA should be applied.
sample_cov : array_like of shape [n, n] or [n], default=None, optional
Input uncertainties in terms of the sample covariance matrix. If sample_cov is one-dimensional its values are assumed to be the diagonal of a diagonal matrix. Used to compute the total covariance matrix over the input using the Kronecker product of sample_cov and feature_cov.
feature_cov : array_like of shape [p, p] or [p], default=None, optional
Input uncertainties in terms of the feature covariance matrix. If feature_cov is one-dimensional its values are assumed to be the diagonal of a diagonal matrix. Used to compute the total covariance matrix over the input using the Kronecker product of sample_cov and feature_cov.
full_cov : array_like of shape [n*p, n*p] or [n*p], default=None, optional
Input uncertainties in terms of the full covariance matrix. If full_cov is one-dimensional its values are assumed to be the diagonal of a diagonal matrix. Used as an alternative to the Kronecker product of sample_cov and feature_cov. Requires more memory.
n_components : int or float, default=None, optional
Number of components to keep.
axis : {0, 1} , default=0, optional
The default expects samples in rows and features in columns.
Attributes
size : [n, p]
Dimension of matrix (n: number of samples, p: number of dimensions)
eigenvalues : ndarray of size [n_components]
Eigenvalues obtained from eigenvalue decomposition of the covariance matrix.
eigenvectors : ndarray of size [n_components*p, n*p]
Eigenvectors obtained from eigenvalue decomposition of the covariance matrix.
jacobian : ndarray of size [n_components*p, n*p]
Jacobian containing derivatives of eigenvectors w.r.t. input matrix.
cov_eigenvectors : ndarray of size [n_components*p, n_components*p]
Propagated uncertainties of eigenvectors.
transformed data : ndarray of size [n, n_components]
Low dimensional representation of data after applying PCA.
Methods
pca_value()
Apply PCA to the matrix.
compute_cov_eigenvectors(save_jacobian=False)
Compute uncertainties of eigenvectors.
animate(pcx=1, pcy=2, n_frames=10, labels=None, outfile='animation.gif')
Generate an animation of the PCA plot of PC pcx vs. PC pcy with n_frames frames. labels (list or 1d array) gives the label of each sample.
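
The parameter descriptions above say that sample_cov and feature_cov are combined through a Kronecker product, and that a one-dimensional input is read as the diagonal of a diagonal matrix. The following is a minimal NumPy sketch of that construction, not VIPurPCA's own code. The helper names as_matrix and total_covariance are hypothetical, and the sketch assumes the [n, p] input matrix is flattened row by row.

```python
import numpy as np

def as_matrix(cov):
    """Read a 1-D input as the diagonal of a diagonal matrix (hypothetical helper)."""
    cov = np.asarray(cov, dtype=float)
    return np.diag(cov) if cov.ndim == 1 else cov

def total_covariance(sample_cov, feature_cov):
    """Kronecker product of the sample and feature covariance matrices.

    Assumes row-major flattening of the [n, p] matrix, so the result
    has shape [n*p, n*p] with one [p, p] block per pair of samples.
    """
    return np.kron(as_matrix(sample_cov), as_matrix(feature_cov))

# n = 3 samples, p = 2 features
sample_cov = np.array([1.0, 2.0, 3.0])            # diagonal given as a 1-D array
feature_cov = np.array([[1.0, 0.5],
                        [0.5, 2.0]])              # full [p, p] matrix
full_cov = total_covariance(sample_cov, feature_cov)
print(full_cov.shape)  # (6, 6)
```

The result has n*p rows and columns, which illustrates why passing full_cov directly needs more memory than passing the two smaller factors.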

Example datasets

After installing VIPurPCA, two example datasets can be loaded. Each provides the mean, the covariance, and the labels.

from vipurpca import load_data
Y, cov_Y, y = load_data.load_studentgrades_dataset()
Y, cov_Y, y = load_data.load_estrogen_dataset()

More information on the datasets can be found here.

Example

from vipurpca import load_data
from vipurpca import PCA

# load mean (Y), uncertainty estimates (cov_Y) and labels (y)
Y, cov_Y, y = load_data.load_estrogen_dataset()
pca = PCA(matrix=Y, sample_cov=None, feature_cov=None,
          full_cov=cov_Y, n_components=3, axis=0)
# compute PCA
pca.pca_value()
# Bayesian inference
pca.compute_cov_eigenvectors(save_jacobian=False)
# Create animation
pca.animate(1, 2, labels=y, outfile='animation.gif')

The resulting animation can be found here.
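
The jacobian and cov_eigenvectors attributes suggest first-order (linear) uncertainty propagation: with J the Jacobian of the flattened eigenvectors with respect to the flattened input, the output covariance is approximated by J @ full_cov @ J.T. The sketch below illustrates that idea under stated assumptions and is not VIPurPCA's implementation. It swaps automatic differentiation for plain forward finite differences so that it only needs NumPy. The function names top_eigenvectors and propagate_covariance, the toy data, and the step size eps are all hypothetical.

```python
import numpy as np

def top_eigenvectors(Y, k):
    """Top-k eigenvectors (as columns) of the feature covariance of Y."""
    Yc = Y - Y.mean(axis=0)                  # center each feature
    C = Yc.T @ Yc / (Y.shape[0] - 1)         # [p, p] covariance matrix
    _, V = np.linalg.eigh(C)                 # eigenvalues in ascending order
    return V[:, ::-1][:, :k]                 # reorder to descending, keep k

def propagate_covariance(Y, full_cov, k, eps=1e-6):
    """Approximate the covariance of the flattened eigenvectors.

    Uses a forward-difference Jacobian as a stand-in for automatic
    differentiation; the input matrix is flattened row by row.
    """
    n, p = Y.shape
    V0 = top_eigenvectors(Y, k)
    J = np.zeros((k * p, n * p))
    for j in range(n * p):
        step = np.zeros(n * p)
        step[j] = eps
        V = top_eigenvectors(Y + step.reshape(n, p), k)
        V = V * np.sign(np.sum(V * V0, axis=0))   # undo arbitrary sign flips
        J[:, j] = (V - V0).T.ravel() / eps        # one column of the Jacobian
    return J, J @ full_cov @ J.T

rng = np.random.default_rng(0)
Y = rng.normal(size=(6, 3)) * np.array([3.0, 2.0, 1.0])   # toy data, n=6, p=3
full_cov = 0.01 * np.eye(Y.size)                           # toy input uncertainty
J, cov_eigenvectors = propagate_covariance(Y, full_cov, k=2)
print(J.shape, cov_eigenvectors.shape)  # (6, 18) (6, 6)
```

The shapes match the attribute table: the Jacobian is [n_components*p, n*p] and the propagated covariance is [n_components*p, n_components*p]. Finite differences cost one eigendecomposition per input entry, which is one reason automatic differentiation is the more practical choice for real datasets.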
