Visualizing and propagating uncertainty in PCA

VIPurPCA

VIPurPCA offers a visualization of uncertainty propagated through the dimensionality reduction technique Principal Component Analysis (PCA) by automatic differentiation.

Installation

VIPurPCA requires Python 3.7.3 or later and can be installed via:

pip install vipurpca

A website showing results and animations can be found here.

Usage

Propagating uncertainty through PCA and visualizing the output uncertainty as an animated scatter plot

To propagate uncertainty through PCA, use the class PCA, which has the following parameters, attributes, and methods:

Parameters
matrix : array_like
Array of size [n, p] containing mean numbers to which VIPurPCA should be applied.
n_components : int or float, default=None, optional
Number of components to keep.
axis : {0, 1}, default=0, optional
With the default (0), samples are expected in rows and features in columns.
cov_data : array_like of shape [n*p] or [n*p, n*p], default=None, optional
Uncertainties attached to the numbers in matrix. If cov_data is one-dimensional, it is assumed to be the diagonal of a diagonal matrix. If None
compute_jacobian : Boolean, default=False, optional
Whether to propagate uncertainty through PCA.
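
The two accepted shapes of cov_data can be illustrated with a minimal NumPy sketch. It only shows how a one-dimensional vector of variances corresponds to a full [n*p, n*p] diagonal covariance matrix; it is not VIPurPCA's own code. The sizes, the variable names, and the row-major ordering of the flattened entries are assumptions made for illustration.

```python
import numpy as np

# Hypothetical sizes: 3 samples, 2 features.
n, p = 3, 2

# One variance per entry of the flattened [n, p] matrix (shape [n*p]).
# Assumption: entries are ordered row-major; the source does not state the ordering.
variances = np.arange(1.0, n * p + 1.0)

# The same information written as a full covariance matrix (shape [n*p, n*p]):
# variances on the diagonal, zero covariance between different entries.
full_cov = np.diag(variances)

print(variances.shape)  # (6,)
print(full_cov.shape)   # (6, 6)
```

The one-dimensional form is the cheaper choice when the input entries are assumed to be uncorrelated, since it stores n*p numbers instead of (n*p)^2.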
Attributes
size : [n, p]
Dimension of matrix (n: number of samples, p: number of dimensions)
covariance : ndarray of size [p, p]
Features' covariance matrix.
eigenvalues : ndarray of size [n_components]
Eigenvalues obtained from eigenvalue decomposition of the covariance matrix.
eigenvectors : ndarray of size [n_components*p, n*p]
Eigenvectors obtained from eigenvalue decomposition of the covariance matrix.
jacobian : ndarray of size [n_components*p, n*p]
Jacobian containing derivatives of eigenvectors w.r.t. input matrix.
jacobian_eigenvalues : ndarray of size [n_components*p, n*p]
Jacobian containing derivatives of eigenvalues w.r.t. input matrix.
cov_eigenvectors : ndarray of size [n_components*p, n_components*p]
Propagated uncertainties of eigenvectors.
cov_eigenvalues : ndarray of size [n_components*n_components]
Propagated uncertainties of eigenvalues.
transformed_data : ndarray of size [n, n_components]
Low dimensional representation of data after applying PCA.
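
As a rough illustration of how the covariance, eigenvalues, eigenvectors, and low-dimensional representation relate to each other, here is a minimal NumPy sketch of PCA via eigendecomposition of the feature covariance matrix. It is not VIPurPCA's implementation: the function name pca_eig is hypothetical, and the eigenvectors are returned as a [p, n_components] matrix, which may differ from the layout the library stores.

```python
import numpy as np

def pca_eig(matrix, n_components):
    """PCA via eigendecomposition of the feature covariance (samples in rows)."""
    centered = matrix - matrix.mean(axis=0)
    # Features' covariance matrix, shape [p, p].
    covariance = centered.T @ centered / (matrix.shape[0] - 1)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    # Keep the n_components largest eigenvalues and their eigenvectors.
    order = np.argsort(eigenvalues)[::-1][:n_components]
    eigenvalues = eigenvalues[order]        # shape [n_components]
    eigenvectors = eigenvectors[:, order]   # shape [p, n_components]
    # Low-dimensional representation, shape [n, n_components].
    transformed = centered @ eigenvectors
    return covariance, eigenvalues, eigenvectors, transformed

# Hypothetical toy data: 5 samples, 3 features.
rng = np.random.default_rng(0)
Y = rng.normal(size=(5, 3))
cov, vals, vecs, Z = pca_eig(Y, n_components=2)
print(cov.shape, vals.shape, vecs.shape, Z.shape)
```

In this sketch the sample variance of each column of Z equals the corresponding eigenvalue, which is a quick sanity check on any PCA implementation.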
Methods
pca_value()
Apply PCA to the matrix.
pca_grad(center=True)
Apply PCA to the matrix and compute jacobian and jacobian_eigenvalues using automatic differentiation.
transform_data()
Transform matrix according to the eigenvectors and reduce its dimensionality to n_components.
compute_cov_eigenvectors()
Compute the uncertainties of the eigenvectors.
compute_cov_eigenvalues()
Compute the uncertainties of the eigenvalues.
animate(n_frames=10, labels=None, outfile='animation.html')
Generate a plotly animation with n_frames frames. labels (list or 1-D array) gives the label of each sample. The animation is saved as HTML at outfile.
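
The idea of pushing an input covariance through a Jacobian can be sketched in a few lines of NumPy. The sketch below assumes standard first-order (linear) error propagation, cov_out = J cov_in J^T, and applies it to the eigenvalues only. It swaps in central finite differences for automatic differentiation so that it needs no extra dependencies, so it is an approximation of the idea and not the library's implementation. All function names, the toy data, and the input variance of 0.01 are hypothetical.

```python
import numpy as np

def top_eigenvalues(flat, n, p, n_components):
    """Largest eigenvalues of the feature covariance of a flattened [n, p] matrix."""
    matrix = flat.reshape(n, p)
    centered = matrix - matrix.mean(axis=0)
    covariance = centered.T @ centered / (n - 1)
    # eigvalsh returns ascending eigenvalues; reverse and keep the largest ones.
    return np.linalg.eigvalsh(covariance)[::-1][:n_components]

def numerical_jacobian(func, x, eps=1e-6):
    """Central finite differences; a stand-in for automatic differentiation."""
    n_out = func(x).size
    jac = np.zeros((n_out, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        jac[:, j] = (func(x + step) - func(x - step)) / (2 * eps)
    return jac

# Hypothetical toy problem: 5 samples, 3 features, 2 components.
n, p, k = 5, 3, 2
rng = np.random.default_rng(0)
Y = rng.normal(size=(n, p))
cov_Y = np.diag(np.full(n * p, 0.01))  # assumed independent input uncertainties

# Jacobian of the k largest eigenvalues w.r.t. the flattened input, shape [k, n*p].
J = numerical_jacobian(lambda v: top_eigenvalues(v, n, p, k), Y.ravel())

# First-order propagation of the input covariance to the eigenvalues, shape [k, k].
cov_eigenvalues = J @ cov_Y @ J.T
print(J.shape, cov_eigenvalues.shape)
```

Finite differences need two function evaluations per input entry, which is one reason automatic differentiation is the more practical choice for larger matrices.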

Example datasets

VIPurPCA ships with three example datasets. Each loader provides the mean, the covariance, and the labels.

from vipurpca import load_data
Y, cov_Y, y = load_data.load_studentgrades_dataset()
Y, cov_Y, y = load_data.load_mice_dataset()
Y, cov_Y, y = load_data.load_estrogen_dataset()

More information on the datasets can be found here.

Example

from vipurpca import load_data
from vipurpca import PCA

# load mean (Y), uncertainty estimates (cov_Y) and labels (y)
Y, cov_Y, y = load_data.load_mice_dataset()
pca_mice = PCA(matrix=Y, cov_data=cov_Y, n_components=2, axis=0, compute_jacobian=True)
# compute PCA and the Jacobians using automatic differentiation
pca_mice.pca_grad()
# propagate the input uncertainty to eigenvectors and eigenvalues
pca_mice.compute_cov_eigenvectors()
pca_mice.compute_cov_eigenvalues()
# transform the data to its low-dimensional representation
pca_mice.transform_data()
# save the animated scatter plot as HTML
pca_mice.animate(n_frames=10, labels=y, outfile='animation.html')

The resulting animation can be found here.
