Weighted Principal Component Analysis using Expectation Maximization

wv

To install: pip install wv

Overview

The wv package implements Weighted Expectation Maximization PCA (EMPCA), an iterative form of Principal Component Analysis (PCA) that accepts weights varying across both observations and variables. This makes it suited to datasets with noisy or incomplete entries, which can be weighted according to how much they should be trusted. The package also includes classic PCA and a lower rank matrix approximation method, both of which can be used for comparison.

Key Features

  • Weighted EMPCA: Iteratively solves PCA with weighted data, ideal for datasets with missing or uncertain values.
  • Classic PCA: A straightforward implementation of PCA using Singular Value Decomposition (SVD), without support for weighted data.
  • Lower Rank Matrix Approximation: An alternative method that iteratively approximates data using a set of model vectors that are not necessarily orthonormal.
  • Model Inspection: After computation, users can inspect eigenvectors, coefficients, and reconstructed models to analyze the principal components and the variance explained by them.

Installation

To install the package, use the following pip command:

pip install wv

Usage

Weighted EMPCA

To run Weighted EMPCA on your data, call the empca function with a data matrix and a weights matrix of the same shape:

import numpy as np
from wv import empca

# Example data and weights
data = np.random.normal(size=(100, 10))
weights = np.ones_like(data)  # Equal weights
weights[data < 0] = 0.5  # Lower weight for negative values

# Perform EMPCA
model = empca(data, weights, niter=10, nvec=3)

# Access the eigenvectors and model data
eigenvectors = model.eigvec
reconstructed_data = model.model
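
To make the idea behind the call above more concrete, here is a minimal NumPy sketch of one way a weighted, EM-style PCA can be iterated: alternate a weighted least-squares fit of the coefficients with a weighted least-squares fit of the vectors, then re-orthonormalize. The function name weighted_em_pca_sketch and its arguments are hypothetical, and this is not the wv implementation, whose internals are not described here.

```python
import numpy as np

def fit_coeff(data, weights, eigvec):
    """Weighted least-squares coefficients for each observation (row)."""
    coeff = np.zeros((data.shape[0], eigvec.shape[0]))
    for i in range(data.shape[0]):
        w = weights[i]
        A = (eigvec * w) @ eigvec.T
        b = (eigvec * w) @ data[i]
        coeff[i] = np.linalg.lstsq(A, b, rcond=None)[0]
    return coeff

def weighted_em_pca_sketch(data, weights, nvec=3, niter=20, seed=0):
    """Illustrative only (hypothetical helper, not part of wv)."""
    rng = np.random.default_rng(seed)
    nvar = data.shape[1]
    # Start from random orthonormal vectors, stored as rows.
    eigvec = np.linalg.qr(rng.normal(size=(nvar, nvec)))[0].T
    for _ in range(niter):
        # E-step: coefficients given the current vectors.
        coeff = fit_coeff(data, weights, eigvec)
        # M-step: vector entries given the coefficients, one variable at a time.
        for j in range(nvar):
            w = weights[:, j]
            A = (coeff.T * w) @ coeff
            b = (coeff.T * w) @ data[:, j]
            eigvec[:, j] = np.linalg.lstsq(A, b, rcond=None)[0]
        # Re-orthonormalize the rows.
        eigvec = np.linalg.qr(eigvec.T)[0].T
    # Final E-step so the coefficients match the orthonormalized vectors.
    return eigvec, fit_coeff(data, weights, eigvec)

# Noise-free rank-2 data with equal weights.
rng = np.random.default_rng(1)
data = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 8))
weights = np.ones_like(data)
eigvec, coeff = weighted_em_pca_sketch(data, weights, nvec=2)
reconstruction = coeff @ eigvec
```

In this sketch every weight multiplies its entry's contribution to both fits, so an entry with weight 0 has no influence on the result.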

Classic PCA

For datasets that do not need weighting, use the classic_pca function:

import numpy as np
from wv import classic_pca

# Example data
data = np.random.normal(size=(100, 10))

# Perform classic PCA
model = classic_pca(data)

# Eigenvectors
eigenvectors = model.eigvec
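
For comparison, the SVD route to PCA can be sketched directly in NumPy. This is an independent illustration, not the package's code. Whether classic_pca centers the data is not stated here, so the sketch leaves the data uncentered, and the variable names are chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 10))

# Rows of vt are orthonormal directions, ordered by decreasing singular value.
u, s, vt = np.linalg.svd(data, full_matrices=False)

nvec = 3
eigvec = vt[:nvec]            # leading nvec directions, shape (3, 10)
coeff = data @ eigvec.T       # projection of each observation, shape (100, 3)
model = coeff @ eigvec        # rank-3 reconstruction of the data

# Fraction of the total sum of squares captured by the leading vectors.
explained = (s[:nvec] ** 2).sum() / (s ** 2).sum()
```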

Lower Rank Matrix Approximation

This method is useful for datasets where the goal is to approximate the data without necessarily obtaining orthonormal vectors:

import numpy as np
from wv import lower_rank

# Example data and weights
data = np.random.normal(size=(100, 10))
weights = np.ones_like(data)

# Perform lower rank approximation
model = lower_rank(data, weights, niter=10, nvec=3)

# Model vectors
model_vectors = model.eigvec
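
The remark about orthonormality can be illustrated without the package. The NumPy sketch below shows that a set of non-orthonormal vectors can reproduce exactly the same approximation as an orthonormal set spanning the same subspace. All names here are illustrative assumptions, not part of wv.

```python
import numpy as np

rng = np.random.default_rng(0)

# An orthonormal pair of vectors (rows) and some coefficients.
ortho = np.linalg.qr(rng.normal(size=(10, 2)))[0].T   # shape (2, 10)
coeff = rng.normal(size=(100, 2))

# Mix the vectors with an invertible matrix: same span, no longer orthonormal.
mix = np.array([[2.0, 1.0],
                [0.0, 1.0]])
skewed = mix @ ortho
skewed_coeff = coeff @ np.linalg.inv(mix)

# Both factorizations give the same approximation of the data.
same = np.allclose(coeff @ ortho, skewed_coeff @ skewed)

# The Gram matrix shows the departure from orthonormality.
gram = skewed @ skewed.T
```

The Gram matrix of an orthonormal set is the identity, so checking the equivalent product for the vectors a model returns is a simple way to see whether they are orthonormal.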

Documentation

Classes and Functions

Model

A class for storing the results of PCA computations. It includes the following attributes:

  • eigvec: Eigenvectors of the model.
  • data: Original data used in the model.
  • weights: Weights applied to the data.
  • coeff: Coefficients to reconstruct the data using the eigenvectors.
  • model: Reconstructed data using the eigenvectors and coefficients.
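
As a rough illustration of how these attributes relate, the NumPy sketch below builds a reconstruction from coefficients and vectors and computes a weighted fraction of explained variation. The array orientation (vectors as rows, one row of coefficients per observation) and the helper weighted_fraction_explained are assumptions made for this example, not the documented behavior of the Model class.

```python
import numpy as np

rng = np.random.default_rng(0)
nobs, nvar, nvec = 6, 4, 2

# Stand-ins for the attributes (assumed shapes).
eigvec = np.linalg.qr(rng.normal(size=(nvar, nvec)))[0].T   # (nvec, nvar)
coeff = rng.normal(size=(nobs, nvec))                       # (nobs, nvec)
model = coeff @ eigvec                                      # (nobs, nvar)

# Data equal to the reconstruction plus a little noise.
data = model + 0.01 * rng.normal(size=(nobs, nvar))
weights = np.ones_like(data)

def weighted_fraction_explained(data, weights, model):
    """1 minus weighted residual sum of squares over weighted total (hypothetical)."""
    resid = np.sum(weights * (data - model) ** 2)
    total = np.sum(weights * data ** 2)
    return 1.0 - resid / total

frac = weighted_fraction_explained(data, weights, model)
```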

empca

Function to perform Weighted EMPCA. Parameters include:

  • data: Data matrix.
  • weights: Corresponding weights matrix.
  • niter: Number of iterations for the EM algorithm.
  • nvec: Number of eigenvectors to compute.
  • smooth: Optional smoothing parameter.
  • randseed: Seed for the random number generator.

classic_pca

Function to perform classic PCA using SVD. It only requires the data matrix and optionally the number of eigenvectors.

lower_rank

Function for lower rank matrix approximation. Similar to empca but does not enforce orthonormality of the resulting vectors.

Additional Tools

  • SavitzkyGolay: A utility class for smoothing signals using the Savitzky-Golay filter. Useful for preprocessing data or smoothing eigenvectors in the context of PCA.
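
The interface of the SavitzkyGolay class is not documented here, so the following is a standalone NumPy sketch of the filter itself rather than a usage example. It fits a polynomial to each sliding window by least squares and keeps the fitted value at the window center. The function names, the odd window length, and the edge padding are choices made for this illustration.

```python
import numpy as np

def savgol_coeffs(window, order):
    """Smoothing weights for an odd window length and a polynomial order."""
    half = window // 2
    x = np.arange(-half, half + 1)
    A = np.vander(x, order + 1, increasing=True)   # shape (window, order + 1)
    # Row 0 of the pseudo-inverse gives the fitted value at x = 0
    # as a linear combination of the samples in the window.
    return np.linalg.pinv(A)[0]

def savgol_smooth(y, window=7, order=2):
    """Smooth a 1-D signal; the ends are padded by repeating the edge values."""
    c = savgol_coeffs(window, order)
    half = window // 2
    ypad = np.pad(y, half, mode="edge")
    # np.convolve flips its kernel, so reverse the weights to apply them in order.
    return np.convolve(ypad, c[::-1], mode="valid")

x = np.arange(30, dtype=float)
y = x ** 2 - 3.0 * x + 1.0
smoothed = savgol_smooth(y, window=7, order=2)
```

Because the signal in this example is itself a quadratic, a filter of order 2 leaves the interior points unchanged; only the padded ends differ.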

Contributing

Contributions to the wv package are welcome. Please ensure that any pull requests or issues are clear and reproducible.

License

This project is licensed under the MIT License - see the LICENSE file for details.
