Comprehensive PyTorch implementation of PCA.
PyTorch PCA
A comprehensive PCA implementation using PyTorch, inspired by the R package pcaMethods.
Overview
This package provides a unified interface to eight PCA algorithms, all accessible via the pca function. The main entry point is pytorch_pca.pca, and the package exposes pca, PCAResult, and AllowedMethod in its public API.
Installation
pip install pytorch_pca
Features
This package provides 8 different PCA algorithms optimized for various scenarios:
- svd: Standard SVD-based PCA (fastest, complete data only)
- nipals: NIPALS algorithm (handles missing values effectively)
- rnipals: Robust NIPALS (resistant to outliers and missing values)
- ppca: Probabilistic PCA (classical probabilistic model)
- bpca: Bayesian PCA (probabilistic approach with uncertainty quantification)
- svd_impute: SVD-based PCA with missing value imputation
- rpca: Robust PCA using iterative outlier detection
- nlpca: Non-linear PCA using autoencoder neural network architecture
Quick Start
import torch
from pytorch_pca import pca
# Generate sample data
X = torch.randn(100, 20)
# Basic PCA using the unified interface
result = pca(X, method="svd", n_components=5)
print(f"Transformed data shape: {result.transformed_data.shape}")
print(f"Components shape: {result.components.shape}")
print(f"Explained variance: {result.explained_variance_ratio}")
# Alternative: use pcaMethods-style naming
print(f"Scores shape: {result.scores.shape}")
print(f"Loadings shape: {result.loadings.shape}")
# Handle missing data with NIPALS
X_missing = X.clone()
X_missing[10:20, 5:10] = float('nan')
result = pca(X_missing, method="nipals", n_components=3)
# Robust PCA for data with outliers
result = pca(X, method="rpca", n_components=3)
# Probabilistic approaches
result = pca(X, method="ppca", n_components=3)
result = pca(X, method="bpca", n_components=3)
# Non-linear PCA with neural networks
result = pca(X, method="nlpca", n_components=3)
# Reconstruct data
X_reconstructed = result.reconstruct(n_components=3)
API Reference
Unified Interface
result = pca(data, method="svd", n_components=2, center=True, scale=False, **kwargs)
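The center and scale flags control preprocessing before the decomposition. The following is a dependency-free sketch of what such flags conventionally mean (column-wise mean removal, then division by the sample standard deviation). It is not taken from the pytorch_pca source: the helper name preprocess and the choice of the n-1 standard deviation are assumptions made for illustration.

```python
import statistics


def preprocess(rows, center=True, scale=False):
    """Hypothetical helper: center and/or scale the columns of a list-of-rows matrix."""
    cols = list(zip(*rows))
    # Column means are subtracted only when centering is requested.
    means = [statistics.fmean(c) if center else 0.0 for c in cols]
    # Assumption: scaling divides by the sample (n-1) standard deviation.
    # A constant column has zero deviation and would raise ZeroDivisionError here.
    stds = [statistics.stdev(c) if scale else 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


data = [[1.0, 10.0], [3.0, 30.0]]
centered = preprocess(data, center=True, scale=False)
scaled = preprocess(data, center=True, scale=True)
```

With scale=True both columns end up on the same footing even though the second column is ten times larger, which is why scaling is commonly enabled when features use different units.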
Method-Specific Parameters
NIPALS Methods
nipals(data, max_iter=1000, tol=1e-6, ...)
rnipals(data, max_iter=1000, tol=1e-6, ...)
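To make the roles of max_iter and tol concrete, here is a minimal pure-Python sketch of the classic NIPALS iteration for the first principal component, skipping NaN entries in every sum. It illustrates the textbook algorithm only. The function name nipals_first_component, the starting vector, and the convergence test are assumptions and may differ from what pytorch_pca does internally.

```python
import math


def nipals_first_component(rows, max_iter=1000, tol=1e-6):
    """Sketch: first principal component (scores, loadings), ignoring NaN entries."""
    n, p = len(rows), len(rows[0])
    # Center each column using only its observed values; NaN entries stay NaN.
    means = []
    for j in range(p):
        observed = [r[j] for r in rows if not math.isnan(r[j])]
        means.append(sum(observed) / len(observed))
    x = [[r[j] - means[j] for j in range(p)] for r in rows]

    # Assumed starting point: the first column, with missing entries set to zero.
    scores = [0.0 if math.isnan(r[0]) else r[0] for r in x]
    loadings = [0.0] * p
    for _ in range(max_iter):
        # Regress every column on the current scores (observed entries only).
        for j in range(p):
            num = sum(x[i][j] * scores[i] for i in range(n) if not math.isnan(x[i][j]))
            den = sum(scores[i] ** 2 for i in range(n) if not math.isnan(x[i][j]))
            loadings[j] = num / den if den > 0 else 0.0
        norm = math.sqrt(sum(v * v for v in loadings))
        if norm == 0.0:
            raise ValueError("degenerate start vector: choose a non-constant column")
        loadings = [v / norm for v in loadings]
        # Regress every row on the normalized loadings (observed entries only).
        new_scores = []
        for i in range(n):
            num = sum(x[i][j] * loadings[j] for j in range(p) if not math.isnan(x[i][j]))
            den = sum(loadings[j] ** 2 for j in range(p) if not math.isnan(x[i][j]))
            new_scores.append(num / den if den > 0 else 0.0)
        # Stop once the scores change by less than tol between iterations.
        change = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_scores, scores)))
        scores = new_scores
        if change < tol:
            break
    return scores, loadings


nan = float("nan")
complete = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0], [5.0, 10.0]]
scores_c, loadings_c = nipals_first_component(complete)

with_gap = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, nan], [5.0, 10.0]]
scores_m, loadings_m = nipals_first_component(with_gap)
```

Further components would be obtained by subtracting the outer product of scores and loadings from the observed entries and repeating the iteration on the residual.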
Probabilistic Methods
ppca(data, max_iter=1000, tol=1e-6, ...)
bpca(data, max_iter=1000, tol=1e-6, ...)
Robust PCA
rpca(data, max_iter=100, tol=1e-6, ...)
Non-linear PCA
nlpca(data, hidden_dims=[10, 5], max_iter=1000, lr=0.01, ...)
Result Object
The PCAResult object provides:
- transformed_data: Data projected onto principal components, shape (n_samples, n_components)
- components: Principal components (eigenvectors), shape (n_components, n_features)
- eigenvalues: Eigenvalues of the covariance matrix
- explained_variance_ratio: Proportion of variance explained by each component
- method: Name of the method used
- scores: Alias for transformed_data (pcaMethods compatibility)
- loadings: Transposed components (pcaMethods compatibility)
- reconstruct(n_components=None): Reconstruct data using selected components
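The relationship between scores, components, and a reconstruction can be sketched without any dependency. The helper below is hypothetical and is not the package's reconstruct method. It assumes the usual linear PCA convention that a reconstruction is the column mean plus the first k scores multiplied by the first k components. Whether PCAResult.reconstruct adds the mean back or undoes scaling is not stated in this README.

```python
def reconstruct_sketch(scores, components, mean, n_components=None):
    """Hypothetical helper: mean + scores[:, :k] @ components[:k] on nested lists."""
    k = len(components) if n_components is None else n_components
    out = []
    for row in scores:
        rec = list(mean)  # start from the column means (assumed convention)
        for c in range(k):
            for j, weight in enumerate(components[c]):
                rec[j] += row[c] * weight
        out.append(rec)
    return out


# One sample, two components, two features; components has shape
# (n_components, n_features), matching the layout listed above.
scores = [[2.0, 1.0]]
components = [[1.0, 0.0], [0.0, 1.0]]
mean = [10.0, 20.0]

full = reconstruct_sketch(scores, components, mean)
rank_one = reconstruct_sketch(scores, components, mean, n_components=1)

# loadings is described as the transpose of components.
loadings = [list(col) for col in zip(*components)]
```

Using fewer components drops the contribution of the later ones, so rank_one differs from full only in the feature that the second component carries.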
Method Selection Guide
- Complete data, speed priority: svd
- Missing values: nipals or svd_impute
- Outliers present: rpca or rnipals
- Uncertainty quantification: bpca
- Probabilistic modeling: ppca
- Non-linear relationships: nlpca
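The guide above can be encoded as a small lookup helper. This is only a sketch: the function suggest_methods is hypothetical and not part of pytorch_pca, and the priority order between criteria (and the choice of rnipals when both outliers and missing values are present, based on its description in the feature list) is an assumption.

```python
def suggest_methods(missing=False, outliers=False, uncertainty=False,
                    probabilistic=False, nonlinear=False):
    """Hypothetical helper: map data characteristics to candidate method names."""
    if nonlinear:
        return ["nlpca"]
    if uncertainty:
        return ["bpca"]
    if probabilistic:
        return ["ppca"]
    if missing and outliers:
        # rnipals is the one method described as handling both.
        return ["rnipals"]
    if outliers:
        return ["rpca", "rnipals"]
    if missing:
        return ["nipals", "svd_impute"]
    # Complete data with speed as the priority.
    return ["svd"]


candidates = suggest_methods(missing=True)
```

The first entry of the returned list could then be passed as the method argument of pca.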
Dependencies
- torch: The only required dependency (>= 2.7.1)
Development & Testing
- black: Code formatting
- flake8: Code linting
- mypy: Type checking
- pytest: Test runner
- scikit-learn: For test comparisons
- setuptools, setuptools-scm: Packaging
Comprehensive tests are provided in the tests/ directory, covering all algorithms, edge cases, and robust/non-linear PCA scenarios.
Testing
# Run all tests from root
pytest ./tests
License
This project is licensed under the MIT License - see the LICENSE file for details.
Author
Ricardo Yanzon - ricayanzon
Acknowledgments
- Inspired by the R package pcaMethods
- Built with PyTorch