Weighted Principal Component Analysis using Expectation Maximization
wv
To install: pip install wv
Overview
The wv package implements Weighted Expectation Maximization PCA (EMPCA), a variant of Principal Component Analysis (PCA) in which the data are accompanied by weights. Because the weights can vary across both observations and variables, the method is well suited to datasets with noisy or incomplete entries. The package also includes classic PCA and a lower rank matrix approximation method, both of which can be used for comparative analysis.
Key Features
- Weighted EMPCA: Iteratively solves PCA with weighted data, ideal for datasets with missing or uncertain values.
- Classic PCA: A straightforward implementation of PCA using Singular Value Decomposition (SVD), without support for weighted data.
- Lower Rank Matrix Approximation: An alternative method that iteratively approximates data using a set of model vectors that are not necessarily orthonormal.
- Model Inspection: After computation, users can inspect eigenvectors, coefficients, and reconstructed models to analyze the principal components and the variance explained by them.
Installation
To install the package, use the following pip command:
pip install wv
Usage
Weighted EMPCA
Use the empca function to perform Weighted EMPCA on your data:
import numpy as np
from wv import empca
# Example data and weights
data = np.random.normal(size=(100, 10))
weights = np.ones_like(data) # Equal weights
weights[data < 0] = 0.5 # Lower weight for negative values
# Perform EMPCA
model = empca(data, weights, niter=10, nvec=3)
# Access the eigenvectors and model data
eigenvectors = model.eigvec
reconstructed_data = model.model
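Missing entries can also be expressed through the weights. The sketch below uses only NumPy and assumes the common convention that a weight of zero marks an entry to be ignored; the text above does not state how wv treats zero weights, so check this before relying on it. The helper name mask_missing is hypothetical and not part of wv.

```python
import numpy as np

def mask_missing(data):
    """Return (filled_data, weights) with zero weight on non-finite entries.

    Hypothetical helper, not part of wv. Assumes zero weight means "ignore".
    """
    data = np.asarray(data, dtype=float)
    finite = np.isfinite(data)
    weights = finite.astype(float)         # 1.0 where observed, 0.0 where missing
    filled = np.where(finite, data, 0.0)   # placeholder value that carries no weight
    return filled, weights

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 10))
data[rng.random(data.shape) < 0.1] = np.nan   # knock out roughly 10% of the entries
filled, weights = mask_missing(data)
```

The resulting filled and weights arrays have the same shape, which is the form the empca example above passes in.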
Classic PCA
For datasets that do not need weighting, use the classic_pca function:
import numpy as np
from wv import classic_pca
# Example data
data = np.random.normal(size=(100, 10))
# Perform classic PCA
model = classic_pca(data)
# Eigenvectors
eigenvectors = model.eigvec
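For readers who want to see what PCA via SVD involves, here is a minimal NumPy sketch. It is not wv's classic_pca: the function name pca_svd, the mean-centering step, and the returned shapes are assumptions made for illustration.

```python
import numpy as np

def pca_svd(data, nvec=None):
    """PCA via SVD of mean-centered data (rows are observations).

    Hypothetical helper, not wv's classic_pca.
    Returns (eigvec, coeff, var_frac).
    """
    centered = data - data.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    if nvec is None:
        nvec = vt.shape[0]
    eigvec = vt[:nvec]                          # (nvec, nvar), orthonormal rows
    coeff = u[:, :nvec] * s[:nvec]              # (nobs, nvec), projection of each row
    var_frac = s[:nvec] ** 2 / np.sum(s ** 2)   # fraction of variance per component
    return eigvec, coeff, var_frac

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 10))
eigvec, coeff, var_frac = pca_svd(data, nvec=3)
```

Because the singular values come back sorted in descending order, the components are already ordered by the variance they explain.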
Lower Rank Matrix Approximation
Use lower_rank when the goal is a low-rank approximation of the data and the model vectors do not need to be orthonormal:
import numpy as np
from wv import lower_rank
# Example data and weights
data = np.random.normal(size=(100, 10))
weights = np.ones_like(data)
# Perform lower rank approximation
model = lower_rank(data, weights, niter=10, nvec=3)
# Model vectors
model_vectors = model.eigvec
Documentation
Classes and Functions
Model
A class for storing the results of PCA computations. It includes the following attributes:
- eigvec: Eigenvectors of the model.
- data: Original data used in the model.
- weights: Weights applied to the data.
- coeff: Coefficients to reconstruct the data using the eigenvectors.
- model: Reconstructed data using the eigenvectors and coefficients.
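The relationship between these attributes can be sketched with plain NumPy arrays. The shapes used here, eigvec as (nvec, nvar) and coeff as (nobs, nvec), are an assumption rather than wv's documented layout, and the weighted variance fraction at the end is one common way to summarize fit quality, not necessarily the one wv reports.

```python
import numpy as np

rng = np.random.default_rng(1)
nobs, nvar, nvec = 50, 8, 2

# Stand-ins for the Model attributes (shapes are assumed, not taken from wv)
eigvec = np.linalg.qr(rng.normal(size=(nvar, nvec)))[0].T   # (nvec, nvar), orthonormal rows
coeff = rng.normal(size=(nobs, nvec))                       # (nobs, nvec)
noise = 0.1 * rng.normal(size=(nobs, nvar))
data = coeff @ eigvec + noise
weights = np.ones_like(data)

# Reconstruction: each observation is a linear combination of the eigenvectors
model = coeff @ eigvec

# Weighted fraction of the data variance captured by the reconstruction
resid = weights * (data - model) ** 2
total = weights * data ** 2
r2 = 1.0 - resid.sum() / total.sum()
```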
empca
Function to perform Weighted EMPCA. Parameters include:
- data: Data matrix.
- weights: Corresponding weights matrix.
- niter: Number of iterations for the EM algorithm.
- nvec: Number of eigenvectors to compute.
- smooth: Optional smoothing parameter.
- randseed: Seed for the random number generator.
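To make the roles of niter, nvec, and randseed concrete, the following is a minimal sketch of a weighted EM iteration for PCA written with NumPy only. It is an illustration under stated assumptions, not wv's implementation: the function names are hypothetical, the smooth parameter is omitted, and the returned basis spans the fitted subspace without ordering the vectors by variance.

```python
import numpy as np

def weighted_lstsq(A, b, w):
    """Solve min_x sum_i w_i * (b_i - (A @ x)_i)**2 for x."""
    sw = np.sqrt(w)
    return np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)[0]

def em_pca_sketch(data, weights, niter=10, nvec=3, randseed=1):
    """Illustrative weighted EM PCA; hypothetical, not wv's empca."""
    nobs, nvar = data.shape
    rng = np.random.default_rng(randseed)
    # Start from a random orthonormal basis, shape (nvec, nvar)
    eigvec = np.linalg.qr(rng.normal(size=(nvar, nvec)))[0].T
    coeff = np.zeros((nobs, nvec))
    for _ in range(niter):
        # E-step: best-fit coefficients for each observation given the basis
        for j in range(nobs):
            coeff[j] = weighted_lstsq(eigvec.T, data[j], weights[j])
        # M-step: best-fit basis values for each variable given the coefficients
        for i in range(nvar):
            eigvec[:, i] = weighted_lstsq(coeff, data[:, i], weights[:, i])
        # Re-orthonormalize the rows of the basis
        eigvec = np.linalg.qr(eigvec.T)[0].T
    # Final coefficients consistent with the orthonormalized basis
    for j in range(nobs):
        coeff[j] = weighted_lstsq(eigvec.T, data[j], weights[j])
    return eigvec, coeff, coeff @ eigvec

# Usage: an exact rank-2 signal with about 10% of entries corrupted and zero-weighted
rng = np.random.default_rng(0)
truth = rng.normal(size=(60, 2)) @ rng.normal(size=(2, 12))
data = truth.copy()
weights = np.ones_like(data)
missing = rng.random(data.shape) < 0.1
data[missing] = 0.0       # corrupt some entries...
weights[missing] = 0.0    # ...and give them zero weight
eigvec, coeff, model = em_pca_sketch(data, weights, niter=30, nvec=2)
```

Each iteration alternates two weighted least-squares fits, so zero-weight entries never influence the solution; the reconstruction fills them in from the fitted basis instead.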
classic_pca
Function to perform classic PCA using SVD. It requires only the data matrix; the number of eigenvectors is optional.
lower_rank
Function for lower rank matrix approximation. Similar to empca but does not enforce orthonormality of the resulting vectors.
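The practical meaning of "not orthonormal" can be seen in a short NumPy sketch: a set of non-orthonormal model vectors can be converted into an orthonormal basis of the same subspace without changing the reconstruction. The array shapes and variable names are assumptions for illustration and do not come from wv.

```python
import numpy as np

rng = np.random.default_rng(2)
vectors = rng.normal(size=(3, 10))    # (nvec, nvar), generally not orthonormal
coeff = rng.normal(size=(100, 3))     # (nobs, nvec)
model = coeff @ vectors

# QR factorization: vectors.T = q @ r, where q has orthonormal columns
q, r = np.linalg.qr(vectors.T)
ortho_vectors = q.T                   # (nvec, nvar), orthonormal rows
ortho_coeff = coeff @ r.T             # coefficients re-expressed in the new basis

# Same reconstruction, different basis
same_model = ortho_coeff @ ortho_vectors
```

The two factorizations describe the same low-rank approximation; they differ only in how the basis vectors and coefficients are expressed.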
Additional Tools
SavitzkyGolay: A utility class for smoothing signals using the Savitzky-Golay filter. Useful for preprocessing data or smoothing eigenvectors in the context of PCA.
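As background, a Savitzky-Golay filter replaces each sample with the value of a low-order polynomial fitted by least squares to the surrounding window. The NumPy sketch below shows the idea; the function name savgol_smooth and its parameters are hypothetical and do not reflect the interface of wv's SavitzkyGolay class.

```python
import numpy as np

def savgol_smooth(y, half_width=5, order=2):
    """Smooth y with a Savitzky-Golay filter (local least-squares polynomial fit).

    Hypothetical helper, not wv's SavitzkyGolay class. Samples within
    half_width of either end are left unchanged.
    """
    y = np.asarray(y, dtype=float)
    x = np.arange(-half_width, half_width + 1)
    # Design matrix of the local polynomial; row 0 of its pseudo-inverse gives
    # the weights that evaluate the fitted polynomial at the window center
    design = np.vander(x, order + 1, increasing=True)
    kernel = np.linalg.pinv(design)[0]
    out = y.copy()
    # The smoothing kernel is symmetric, so correlation and convolution coincide
    out[half_width:-half_width] = np.convolve(y, kernel, mode="valid")
    return out

rng = np.random.default_rng(3)
t = np.linspace(0, 2 * np.pi, 200)
noisy = np.sin(t) + 0.3 * rng.normal(size=t.size)
smoothed = savgol_smooth(noisy, half_width=7, order=2)
```

A polynomial of degree no higher than order passes through the filter unchanged, which is why the method preserves peaks and slopes better than a plain moving average.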
Contributing
Contributions to the wv package are welcome. Please ensure that any pull requests or issues are clear and reproducible.
License
This project is licensed under the MIT License - see the LICENSE file for details.