Weighted Principal Component Analysis using Expectation Maximization

wv

Overview

The wv package implements Weighted Expectation Maximization PCA (EMPCA), an iterative form of Principal Component Analysis (PCA) that accepts weights which can vary across both observations and variables. This makes it well suited to datasets with noisy or incomplete entries. The package also provides classic PCA and a lower rank matrix approximation method, both of which can be used for comparison.

Key Features

  • Weighted EMPCA: Iteratively solves PCA with weighted data, ideal for datasets with missing or uncertain values.
  • Classic PCA: A straightforward implementation of PCA using Singular Value Decomposition (SVD), without support for weighted data.
  • Lower Rank Matrix Approximation: An alternative method that iteratively approximates data using a set of model vectors that are not necessarily orthonormal.
  • Model Inspection: After computation, users can inspect eigenvectors, coefficients, and reconstructed models to analyze the principal components and the variance explained by them.
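
As a rough illustration of the "variance explained" idea, the sketch below compares a data matrix with a reconstruction of it. It uses only NumPy; the helper name explained_fraction is hypothetical and is not part of wv, and a truncated SVD stands in for a fitted model. It assumes the data have already been mean-subtracted, otherwise the result is a fraction of the total sum of squares rather than of the variance.

```python
import numpy as np

def explained_fraction(data, model, weights=None):
    """Fraction of the (weighted) sum of squares in `data` captured by `model`.

    Hypothetical helper for illustration; not part of the wv package.
    """
    data = np.asarray(data, dtype=float)
    model = np.asarray(model, dtype=float)
    if weights is None:
        weights = np.ones_like(data)
    residual = np.sum(weights * (data - model) ** 2)
    total = np.sum(weights * data ** 2)
    return 1.0 - residual / total

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 10))
data -= data.mean(axis=0)  # center each variable

# Stand-in for a fitted model: rank-3 truncated SVD reconstruction.
u, s, vt = np.linalg.svd(data, full_matrices=False)
model = (u[:, :3] * s[:3]) @ vt[:3]

print(explained_fraction(data, model))
```

With a Model object from wv, the same comparison could presumably be made between its data and model attributes.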

Installation

To install the package, use the following pip command:

pip install wv

Usage

Weighted EMPCA

To perform Weighted EMPCA on your data, use the empca function:

import numpy as np
from wv import empca

# Example data and weights
data = np.random.normal(size=(100, 10))
weights = np.ones_like(data)  # Equal weights
weights[data < 0] = 0.5  # Lower weight for negative values

# Perform EMPCA
model = empca(data, weights, niter=10, nvec=3)

# Access the eigenvectors and model data
eigenvectors = model.eigvec
reconstructed_data = model.model
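
For incomplete data, a common convention in weighted PCA is to give missing entries a weight of zero. The sketch below prepares data and weights that way using NumPy only. Whether wv expects exactly this convention is an assumption here, so the empca call is left as a comment.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 10))
data[rng.random(data.shape) < 0.1] = np.nan  # mark roughly 10% of entries as missing

# Assumed convention: weight 0 for missing entries, weight 1 elsewhere.
weights = np.where(np.isnan(data), 0.0, 1.0)

# Replace NaN with a finite placeholder. NaN * 0 is still NaN, so a NaN left
# in the data would propagate through any weighted sum.
filled = np.nan_to_num(data, nan=0.0)

# With wv installed, the call from the example above would then be:
# model = empca(filled, weights, niter=10, nvec=3)
```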

Classic PCA

For datasets that do not need weighting, use the classic_pca function:

import numpy as np
from wv import classic_pca

# Example data
data = np.random.normal(size=(100, 10))

# Perform classic PCA
model = classic_pca(data)

# Eigenvectors
eigenvectors = model.eigvec

Lower Rank Matrix Approximation

This method is useful for datasets where the goal is to approximate the data without necessarily obtaining orthonormal vectors:

import numpy as np
from wv import lower_rank

# Example data and weights
data = np.random.normal(size=(100, 10))
weights = np.ones_like(data)

# Perform lower rank approximation
model = lower_rank(data, weights, niter=10, nvec=3)

# Model vectors
model_vectors = model.eigvec

Documentation

Classes and Functions

Model

A class for storing the results of PCA computations. It includes the following attributes:

  • eigvec: Eigenvectors of the model.
  • data: Original data used in the model.
  • weights: Weights applied to the data.
  • coeff: Coefficients to reconstruct the data using the eigenvectors.
  • model: Reconstructed data using the eigenvectors and coefficients.
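
The relationship between these attributes can be sketched with a small stand-in class. ModelSketch is hypothetical and is not the wv class. The array shapes (observations in rows, one eigenvector per row of eigvec) and the weighted_residual method are assumptions made for illustration.

```python
from dataclasses import dataclass

import numpy as np

@dataclass
class ModelSketch:
    """Hypothetical stand-in for a PCA result container (not the wv Model)."""
    eigvec: np.ndarray   # assumed shape (nvec, nvar)
    data: np.ndarray     # assumed shape (nobs, nvar)
    weights: np.ndarray  # assumed shape (nobs, nvar)
    coeff: np.ndarray    # assumed shape (nobs, nvec)

    @property
    def model(self):
        # Reconstruction: each row is a linear combination of the eigenvectors.
        return self.coeff @ self.eigvec

    def weighted_residual(self):
        # Weighted sum of squared differences between data and reconstruction.
        return float(np.sum(self.weights * (self.data - self.model) ** 2))

eigvec = np.eye(2, 4)  # two orthonormal vectors in a 4-dimensional space
coeff = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
data = coeff @ eigvec  # data that the two vectors describe exactly
m = ModelSketch(eigvec=eigvec, data=data, weights=np.ones_like(data), coeff=coeff)

print(m.model.shape, m.weighted_residual())
```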

empca

Function to perform Weighted EMPCA. Parameters include:

  • data: Data matrix.
  • weights: Corresponding weights matrix.
  • niter: Number of iterations for the EM algorithm.
  • nvec: Number of eigenvectors to compute.
  • smooth: Optional smoothing parameter.
  • randseed: Seed for the random number generator.
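
To make the roles of niter, nvec, and randseed concrete, here is a minimal NumPy sketch of a weighted, EM-style PCA loop. It alternates weighted least-squares solves for the coefficients and for the vectors, then re-orthonormalizes with QR. This is a simplified illustration under those assumptions, not the wv implementation, and it omits the smooth option. The names weighted_lstsq, fit_coeff, and weighted_em_pca_sketch are hypothetical.

```python
import numpy as np

def weighted_lstsq(A, b, w):
    """Solve min_x sum_k w[k] * (b[k] - (A @ x)[k])**2."""
    sw = np.sqrt(w)
    return np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)[0]

def fit_coeff(eigvec, data, weights):
    """Weighted fit of the coefficients of each observation (row of data)."""
    return np.array([
        weighted_lstsq(eigvec.T, data[i], weights[i])
        for i in range(data.shape[0])
    ])

def weighted_em_pca_sketch(data, weights, niter=10, nvec=3, randseed=1):
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)
    nvar = data.shape[1]
    rng = np.random.default_rng(randseed)

    # Random orthonormal starting vectors, shape (nvec, nvar).
    eigvec = np.linalg.qr(rng.normal(size=(nvar, nvec)))[0].T

    for _ in range(niter):
        # E-step: best coefficients given the current vectors.
        coeff = fit_coeff(eigvec, data, weights)
        # M-step: best vectors given the coefficients, one variable at a time.
        for j in range(nvar):
            eigvec[:, j] = weighted_lstsq(coeff, data[:, j], weights[:, j])
        # Restore orthonormality of the vectors.
        eigvec = np.linalg.qr(eigvec.T)[0].T

    # Final coefficients consistent with the orthonormalized vectors.
    coeff = fit_coeff(eigvec, data, weights)
    return eigvec, coeff

# Exactly rank-2 data with unit weights: two vectors should reproduce it.
rng = np.random.default_rng(0)
truth = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 8))
eigvec, coeff = weighted_em_pca_sketch(truth, np.ones_like(truth), niter=10, nvec=2)
print(np.abs(coeff @ eigvec - truth).max())
```

Fixing randseed makes the random starting vectors, and therefore the result, reproducible from run to run.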

classic_pca

Function to perform classic PCA using SVD. It only requires the data matrix and optionally the number of eigenvectors.
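
For comparison, PCA via SVD can be written in a few lines of NumPy. This is a generic sketch and not the wv source; classic_pca_sketch is a hypothetical name. It does not subtract the mean, so center the data first if that is what you need.

```python
import numpy as np

def classic_pca_sketch(data, nvec=None):
    """PCA through a thin SVD; returns (eigvec, coeff)."""
    data = np.asarray(data, dtype=float)
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    if nvec is None:
        nvec = vt.shape[0]
    eigvec = vt[:nvec]              # orthonormal rows, shape (nvec, nvar)
    coeff = u[:, :nvec] * s[:nvec]  # projections, shape (nobs, nvec)
    return eigvec, coeff

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 10))

eigvec, coeff = classic_pca_sketch(data, nvec=3)
approx = coeff @ eigvec  # rank-3 approximation of the data
print(eigvec.shape, coeff.shape)
```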

lower_rank

Function for lower rank matrix approximation. Similar to empca but does not enforce orthonormality of the resulting vectors.
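
Non-orthonormal model vectors still span a subspace, and they can be orthonormalized afterwards without changing the reconstruction. The NumPy sketch below shows this with a QR decomposition. The arrays are random stand-ins for illustration rather than output from lower_rank.

```python
import numpy as np

rng = np.random.default_rng(0)
vectors = rng.normal(size=(3, 10))  # stand-in model vectors, not orthonormal
coeff = rng.normal(size=(100, 3))   # stand-in coefficients
approx = coeff @ vectors            # low rank approximation

# vectors.T = q @ r, with orthonormal columns in q and a 3x3 upper-triangular r.
q, r = np.linalg.qr(vectors.T)
ortho_vectors = q.T                 # orthonormal rows, same row space as vectors
ortho_coeff = coeff @ r.T           # coefficients expressed in the new basis

print(np.abs(ortho_coeff @ ortho_vectors - approx).max())
```

Note that this identity concerns the reconstruction only. With non-uniform weights, coefficients refit against the orthonormal basis would still need a weighted solve.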

Additional Tools

  • SavitzkyGolay: A utility class for smoothing signals using the Savitzky-Golay filter. Useful for preprocessing data or smoothing eigenvectors in the context of PCA.
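
The interface of the SavitzkyGolay class is not described here, so the sketch below shows the underlying technique in plain NumPy instead: fit a low-order polynomial to each sliding window by least squares and keep the fitted value at the window center. The function name savgol_smooth, its parameters, and the edge-padding choice are assumptions made for illustration.

```python
import numpy as np

def savgol_smooth(y, window=7, polyorder=2):
    """Savitzky-Golay smoothing with edge padding (illustrative sketch)."""
    if window % 2 == 0 or window <= polyorder:
        raise ValueError("window must be odd and larger than polyorder")
    half = window // 2
    x = np.arange(-half, half + 1, dtype=float)
    # Columns are x**0, x**1, ..., x**polyorder.
    A = np.vander(x, polyorder + 1, increasing=True)
    # First row of the pseudo-inverse maps a window to the fitted value at x = 0.
    kernel = np.linalg.pinv(A)[0]
    padded = np.pad(np.asarray(y, dtype=float), half, mode="edge")
    # np.convolve flips its second argument, so pass the kernel reversed.
    return np.convolve(padded, kernel[::-1], mode="valid")

t = np.linspace(0.0, 1.0, 50)
quad = 1.0 + 2.0 * t - 3.0 * t ** 2
smoothed = savgol_smooth(quad, window=7, polyorder=2)

# Away from the padded edges, a quadratic is reproduced by a quadratic fit.
print(np.abs(smoothed[3:-3] - quad[3:-3]).max())
```

SciPy provides the same filter as scipy.signal.savgol_filter, which is the more convenient choice when SciPy is available.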

Contributing

Contributions to the wv package are welcome. Please ensure that any pull requests or issues are clear and reproducible.

License

This project is licensed under the MIT License - see the LICENSE file for details.
