Comprehensive PyTorch implementation of PCA.

Project description

PyTorch PCA

A comprehensive PCA implementation using PyTorch, inspired by the R package pcaMethods.

Overview

This package provides a unified interface to eight PCA algorithms, all accessible via the pca function. The main entry point is pytorch_pca.pca, and the package exposes pca, PCAResult, and AllowedMethod in its public API.

Installation

pip install pytorch_pca

Features

Each of the eight algorithms targets a different scenario:

  • svd: Standard SVD-based PCA (fastest, complete data only)
  • nipals: NIPALS algorithm (handles missing values effectively)
  • rnipals: Robust NIPALS (resistant to outliers and missing values)
  • ppca: Probabilistic PCA (classical probabilistic model)
  • bpca: Bayesian PCA (probabilistic approach with uncertainty quantification)
  • svd_impute: SVD-based PCA with missing value imputation
  • rpca: Robust PCA using iterative outlier detection
  • nlpca: Non-linear PCA using autoencoder neural network architecture
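To illustrate how a NIPALS-style method can tolerate missing values, here is a minimal sketch in plain PyTorch. It is not the package's implementation: `nipals_component` is a hypothetical helper, and the sketch assumes missing entries are encoded as NaN and simply left out of each regression step. It extracts only the first component and assumes the data are already centered.

```python
import torch


def nipals_component(X: torch.Tensor, max_iter: int = 1000, tol: float = 1e-6):
    """Hypothetical helper: first NIPALS component of X, skipping NaN entries."""
    mask = ~torch.isnan(X)
    M = mask.to(X.dtype)                            # 1.0 where a value is observed
    X0 = torch.where(mask, X, torch.zeros_like(X))  # NaN replaced by 0
    # Start from the column with the largest sum of squared observed values.
    t = X0[:, X0.pow(2).sum(dim=0).argmax()].clone()
    for _ in range(max_iter):
        # Loadings: regress each column on t using only its observed rows.
        p = (X0.T @ t) / (M.T @ t.pow(2)).clamp_min(1e-12)
        p = p / p.norm()
        # Scores: regress each row on p using only its observed columns.
        t_new = (X0 @ p) / (M @ p.pow(2)).clamp_min(1e-12)
        converged = (t_new - t).pow(2).sum() <= tol * t_new.pow(2).sum()
        t = t_new
        if converged:
            break
    return t, p


torch.manual_seed(0)
# Synthetic data with one dominant direction plus small noise.
X = 3.0 * torch.outer(torch.randn(100), torch.randn(20)) + 0.1 * torch.randn(100, 20)
X_missing = X.clone()
X_missing[10:20, 5:10] = float("nan")

t, p = nipals_component(X_missing)
print(t.shape, p.shape)  # torch.Size([100]) torch.Size([20])
```

Further components would be obtained by subtracting `torch.outer(t, p)` from the observed entries and repeating the loop on the residual.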

Quick Start

import torch
from pytorch_pca import pca

# Generate sample data
X = torch.randn(100, 20)

# Basic PCA using the unified interface
result = pca(X, method="svd", n_components=5)
print(f"Transformed data shape: {result.transformed_data.shape}")
print(f"Components shape: {result.components.shape}")
print(f"Explained variance: {result.explained_variance_ratio}")

# Alternative: use pcaMethods-style naming
print(f"Scores shape: {result.scores.shape}")
print(f"Loadings shape: {result.loadings.shape}")

# Handle missing data with NIPALS
X_missing = X.clone()
X_missing[10:20, 5:10] = float('nan')
result = pca(X_missing, method="nipals", n_components=3)

# Robust PCA for data with outliers
result = pca(X, method="rpca", n_components=3)

# Probabilistic approaches
result = pca(X, method="ppca", n_components=3)
result = pca(X, method="bpca", n_components=3)

# Non-linear PCA with neural networks
result = pca(X, method="nlpca", n_components=3)

# Reconstruct data
X_reconstructed = result.reconstruct(n_components=3)

API Reference

Unified Interface

result = pca(data, method="svd", n_components=2, center=True, scale=False, **kwargs)
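The `center` and `scale` flags are not described further in this README. The sketch below shows what such flags conventionally mean in PCA preprocessing, assuming `center` subtracts each column's mean and `scale` divides each column by its standard deviation. `preprocess` is a hypothetical helper, not part of pytorch_pca, and the package's actual behavior (for example with NaN values) may differ.

```python
import torch


def preprocess(X: torch.Tensor, center: bool = True, scale: bool = False) -> torch.Tensor:
    """Hypothetical helper: column-wise centering and unit-variance scaling."""
    if center:
        X = X - X.mean(dim=0, keepdim=True)
    if scale:
        # Sample standard deviation per column; clamp to avoid division by zero.
        std = X.std(dim=0, keepdim=True)
        X = X / std.clamp_min(1e-12)
    return X


torch.manual_seed(0)
X = torch.randn(100, 20) * 3.0 + 5.0   # columns with mean near 5, std near 3
Z = preprocess(X, center=True, scale=True)
print(Z.shape)  # torch.Size([100, 20])
```

Scaling matters when features are measured in different units, since otherwise the features with the largest variance dominate the leading components.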

Method-Specific Parameters

NIPALS Methods

nipals(data, max_iter=1000, tol=1e-6, ...)
rnipals(data, max_iter=1000, tol=1e-6, ...)

Probabilistic Methods

ppca(data, max_iter=1000, tol=1e-6, ...)
bpca(data, max_iter=1000, tol=1e-6, ...)

Robust PCA

rpca(data, max_iter=100, tol=1e-6, ...)

Non-linear PCA

nlpca(data, hidden_dims=[10, 5], max_iter=1000, lr=0.01, ...)

Result Object

The PCAResult object provides:

  • transformed_data: Data projected onto principal components (n_samples, n_components)
  • components: Principal components (eigenvectors) (n_components, n_features)
  • eigenvalues: Eigenvalues of the covariance matrix
  • explained_variance_ratio: Proportion of variance explained by each component
  • method: Name of the method used
  • scores: Alias for transformed_data (pcaMethods compatibility)
  • loadings: Transposed components (n_features, n_components) (pcaMethods compatibility)
  • reconstruct(n_components=None): Reconstruct data using selected components
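The relationship between these fields can be sketched with a plain SVD in PyTorch. This is an illustration under assumptions, not the package's code: it assumes centered data, eigenvalues computed as squared singular values divided by `n_samples - 1`, and a reconstruction that adds the column means back. The variable names mirror the fields above but are local to the sketch.

```python
import torch

torch.manual_seed(0)
X = torch.randn(100, 20)
n_components = 5

# Center the data, then factor it: Xc = U @ diag(S) @ Vh.
mean = X.mean(dim=0, keepdim=True)
Xc = X - mean
U, S, Vh = torch.linalg.svd(Xc, full_matrices=False)

components = Vh[:n_components]               # (n_components, n_features)
scores = Xc @ components.T                   # (n_samples, n_components)
loadings = components.T                      # (n_features, n_components)
eigenvalues = S.pow(2) / (X.shape[0] - 1)    # variance along every direction
explained_variance_ratio = eigenvalues[:n_components] / eigenvalues.sum()

# Rank-5 approximation of X from the retained components.
X_reconstructed = scores @ components + mean

print(scores.shape, loadings.shape)  # torch.Size([100, 5]) torch.Size([20, 5])
```

Keeping all 20 components would reproduce `X` up to floating-point error; keeping fewer gives the best rank-k approximation in the least-squares sense.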

Method Selection Guide

  • Complete data, speed priority: svd
  • Missing values: nipals or svd_impute
  • Outliers present: rpca or rnipals
  • Uncertainty quantification: bpca
  • Probabilistic modeling: ppca
  • Non-linear relationships: nlpca
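The guide above could be encoded as a small helper that returns a method name to pass to `pca`. `choose_method` and its flags are hypothetical and not part of the package. The priority order when several conditions hold is an assumption of this sketch, except that `rnipals` for outliers combined with missing values follows the feature list.

```python
def choose_method(
    has_missing: bool = False,
    has_outliers: bool = False,
    want_uncertainty: bool = False,
    probabilistic: bool = False,
    nonlinear: bool = False,
) -> str:
    """Hypothetical helper mapping data traits to a method name from the guide."""
    if nonlinear:
        return "nlpca"
    if has_outliers and has_missing:
        return "rnipals"   # listed as resistant to both outliers and missing values
    if has_outliers:
        return "rpca"      # the guide also lists rnipals
    if has_missing:
        return "nipals"    # the guide also lists svd_impute
    if want_uncertainty:
        return "bpca"
    if probabilistic:
        return "ppca"
    return "svd"           # complete data, speed priority


print(choose_method())                                      # svd
print(choose_method(has_missing=True))                      # nipals
print(choose_method(has_missing=True, has_outliers=True))   # rnipals
```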

Dependencies

  • torch: The only required dependency (>= 2.7.1)

Development & Testing

  • black: Code formatting
  • flake8: Code linting
  • mypy: Type checking
  • pytest: Test runner
  • scikit-learn: For test comparisons
  • setuptools, setuptools-scm: Packaging

Comprehensive tests are provided in the tests/ directory, covering all algorithms, edge cases, and robust/non-linear PCA scenarios.

Testing

# Run all tests from root
pytest ./tests

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Ricardo Yanzon - ricayanzon

Download files

Source Distribution

pytorch_pca-0.1.1.dev0.tar.gz (24.0 kB view details)

Uploaded Source

Built Distribution

pytorch_pca-0.1.1.dev0-py3-none-any.whl (14.1 kB view details)

Uploaded Python 3

File details

Details for the file pytorch_pca-0.1.1.dev0.tar.gz.

File metadata

  • Download URL: pytorch_pca-0.1.1.dev0.tar.gz
  • Size: 24.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for pytorch_pca-0.1.1.dev0.tar.gz
Algorithm    Hash digest
SHA256       48de9f85d441a7111cf91b309e07eb2bc953e4f1df7c117ca7795ff8365bcaa1
MD5          acc864230fff702e6ffe3f6e2d0a6dce
BLAKE2b-256  aaf0876bedbd2444c96ce9acb4de162d12ad4b86cade1520ea12ab61a87db6c8

Provenance

The following attestation bundles were made for pytorch_pca-0.1.1.dev0.tar.gz:

Publisher: publish.yml on ricayanzon/pytorch_pca

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file pytorch_pca-0.1.1.dev0-py3-none-any.whl.

File hashes

Hashes for pytorch_pca-0.1.1.dev0-py3-none-any.whl
Algorithm    Hash digest
SHA256       5646d4527131a0520a8f458b24ce5eab3e1ac9da64d235b8f99e8fa2616797ed
MD5          421a749df7ebdb1eaea53e0868611988
BLAKE2b-256  c5e05c9735f7da137a7a0a7e97888d4c6b5aa8f678deadf1802dd1ae828f23b9

Provenance

The following attestation bundles were made for pytorch_pca-0.1.1.dev0-py3-none-any.whl:

Publisher: publish.yml on ricayanzon/pytorch_pca

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
