Comprehensive PyTorch implementation of PCA.

PyTorch PCA

A comprehensive PCA implementation using PyTorch, inspired by the R package pcaMethods.

Overview

This package provides a unified interface to eight PCA algorithms through a single function, pytorch_pca.pca. The public API consists of pca, PCAResult, and AllowedMethod.

Installation

pip install pytorch_pca

Features

The method argument of pca selects one of eight algorithms:

  • svd: Standard SVD-based PCA (fastest, complete data only)
  • nipals: NIPALS algorithm (handles missing values effectively)
  • rnipals: Robust NIPALS (resistant to outliers and missing values)
  • ppca: Probabilistic PCA (classical probabilistic model)
  • bpca: Bayesian PCA (probabilistic approach with uncertainty quantification)
  • svd_impute: SVD-based PCA with missing value imputation
  • rpca: Robust PCA using iterative outlier detection
  • nlpca: Non-linear PCA using autoencoder neural network architecture

Quick Start

import torch
from pytorch_pca import pca

# Generate sample data
X = torch.randn(100, 20)

# Basic PCA using the unified interface
result = pca(X, method="svd", n_components=5)
print(f"Transformed data shape: {result.transformed_data.shape}")
print(f"Components shape: {result.components.shape}")
print(f"Explained variance ratio: {result.explained_variance_ratio}")

# Alternative: use pcaMethods-style naming
print(f"Scores shape: {result.scores.shape}")
print(f"Loadings shape: {result.loadings.shape}")

# Handle missing data with NIPALS
X_missing = X.clone()
X_missing[10:20, 5:10] = float('nan')
result = pca(X_missing, method="nipals", n_components=3)

# Robust PCA for data with outliers
result = pca(X, method="rpca", n_components=3)

# Probabilistic approaches
result = pca(X, method="ppca", n_components=3)
result = pca(X, method="bpca", n_components=3)

# Non-linear PCA with neural networks
result = pca(X, method="nlpca", n_components=3)

# Reconstruct data
X_reconstructed = result.reconstruct(n_components=3)

API Reference

Unified Interface

result = pca(data, method="svd", n_components=2, center=True, scale=False, **kwargs)
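
The center and scale flags are not described further here. A common convention is that centering subtracts each column's mean and scaling divides each column by its standard deviation. The sketch below shows that convention in plain torch; the helper name preprocess is hypothetical, it assumes complete data (no NaN), and it is not taken from the package's source.

```python
import torch


def preprocess(data: torch.Tensor, center: bool = True, scale: bool = False) -> torch.Tensor:
    """Hypothetical helper: column-wise centering and unit-variance scaling."""
    out = data.clone()
    if center:
        # Subtract the mean of each feature (column)
        out = out - out.mean(dim=0, keepdim=True)
    if scale:
        # Divide by the sample standard deviation of each feature
        std = out.std(dim=0, keepdim=True)
        out = out / std.clamp_min(1e-12)
    return out


torch.manual_seed(0)
X = torch.randn(100, 20) * 3.0 + 5.0

X_centered = preprocess(X)                               # center=True, scale=False
X_standardized = preprocess(X, center=True, scale=True)  # zero mean, unit variance
```

With scale=False, features measured in large units dominate the leading components; scale=True is the usual choice when features have different units.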

Method-Specific Parameters

NIPALS Methods

nipals(data, max_iter=1000, tol=1e-6, ...)
rnipals(data, max_iter=1000, tol=1e-6, ...)
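
To make the roles of max_iter and tol concrete, here is a minimal sketch of the textbook NIPALS iteration with missing entries masked out of every regression. It is an illustration only: the function name nipals_sketch, the starting vector, and the convergence test are assumptions, and the package's own implementation may differ.

```python
import torch


def nipals_sketch(data, n_components=2, max_iter=1000, tol=1e-6):
    """Illustrative NIPALS with NaN masking; not the package's implementation."""
    mask = ~torch.isnan(data)
    m = mask.to(data.dtype)
    zeros = torch.zeros_like(data)

    # Center each column on its observed entries, then zero-fill the gaps
    n_obs = mask.sum(dim=0).clamp_min(1)
    col_mean = torch.where(mask, data, zeros).sum(dim=0) / n_obs
    residual = torch.where(mask, data - col_mean, zeros)

    scores, loadings = [], []
    for _ in range(n_components):
        # Start from the column with the largest observed sum of squares
        t = residual[:, residual.pow(2).sum(dim=0).argmax()].clone()
        for _ in range(max_iter):
            # Regress each column on t, using observed entries only
            p = (residual.T @ t) / (m.T @ t.pow(2)).clamp_min(1e-12)
            p = p / p.norm()
            # Regress each row on p, using observed entries only
            t_new = (residual @ p) / (m @ p.pow(2)).clamp_min(1e-12)
            converged = (t_new - t).norm() <= tol * t_new.norm()
            t = t_new
            if converged:
                break
        # Deflate: subtract the fitted component from observed entries
        residual = residual - m * torch.outer(t, p)
        scores.append(t)
        loadings.append(p)
    return torch.stack(scores, dim=1), torch.stack(loadings, dim=1)


torch.manual_seed(0)
# Low-rank signal plus a little noise, in float64 so tol=1e-6 is reachable
X_complete = torch.randn(100, 3, dtype=torch.float64) @ torch.randn(3, 20, dtype=torch.float64)
X_complete = X_complete + 0.01 * torch.randn(100, 20, dtype=torch.float64)

X_missing = X_complete.clone()
X_missing[10:20, 5:10] = float("nan")

scores, loadings = nipals_sketch(X_missing, n_components=3)
```

Each component is extracted one at a time, so the cost grows with n_components, and a tighter tol or a small gap between adjacent eigenvalues increases the number of inner iterations.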

Probabilistic Methods

ppca(data, max_iter=1000, tol=1e-6, ...)
bpca(data, max_iter=1000, tol=1e-6, ...)

Robust PCA

rpca(data, max_iter=100, tol=1e-6, ...)

Non-linear PCA

nlpca(data, hidden_dims=[10, 5], max_iter=1000, lr=0.01, ...)
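
The signatures above list per-method parameters, while the unified interface accepts **kwargs. One plausible reading is that extra keyword arguments are forwarded to the selected method. The sketch below illustrates that dispatch pattern in plain Python; every name in it (pca_dispatch, the stub functions, the registry) is hypothetical and says nothing about how the package is organized internally.

```python
from typing import Any, Callable, Dict


def _svd_stub(data: Any, n_components: int) -> Dict[str, Any]:
    # Stand-in for an algorithm with no extra parameters
    return {"method": "svd", "n_components": n_components}


def _nipals_stub(
    data: Any, n_components: int, max_iter: int = 1000, tol: float = 1e-6
) -> Dict[str, Any]:
    # Stand-in for an iterative algorithm with its own parameters
    return {"method": "nipals", "n_components": n_components,
            "max_iter": max_iter, "tol": tol}


# Hypothetical registry mapping method names to implementations
_REGISTRY: Dict[str, Callable[..., Dict[str, Any]]] = {
    "svd": _svd_stub,
    "nipals": _nipals_stub,
}


def pca_dispatch(data: Any, method: str = "svd", n_components: int = 2, **kwargs: Any):
    """Look up the method by name and forward method-specific keywords."""
    if method not in _REGISTRY:
        raise ValueError(f"Unknown method: {method!r}")
    return _REGISTRY[method](data, n_components=n_components, **kwargs)


# max_iter reaches the NIPALS stub through **kwargs
info = pca_dispatch(None, method="nipals", n_components=3, max_iter=50)
```

Under this pattern, a keyword the chosen method does not accept (for example max_iter with "svd") fails with a TypeError instead of being silently ignored.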

Result Object

The PCAResult object provides:

  • transformed_data: Data projected onto the principal components, shape (n_samples, n_components)
  • components: Principal components (eigenvectors), shape (n_components, n_features)
  • eigenvalues: Eigenvalues of the covariance matrix
  • explained_variance_ratio: Proportion of variance explained by each component
  • method: Name of the method used
  • scores: Alias for transformed_data (pcaMethods compatibility)
  • loadings: Transposed components, shape (n_features, n_components) (pcaMethods compatibility)
  • reconstruct(n_components=None): Reconstruct data using selected components
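
The relationships between these fields can be seen by computing them by hand. The sketch below uses plain torch.linalg.svd on centered data and mirrors the field names listed above as local variables. It is an assumption-laden illustration, not the package's code: in particular, whether PCAResult stores all eigenvalues or only the first n_components, and whether reconstruct adds the mean back, is not stated here.

```python
import torch

torch.manual_seed(0)
X = torch.randn(100, 20)
n_components = 5

# Center the columns, then take a thin SVD: Xc = U @ diag(S) @ Vh
mean = X.mean(dim=0, keepdim=True)
Xc = X - mean
U, S, Vh = torch.linalg.svd(Xc, full_matrices=False)

components = Vh[:n_components]               # (n_components, n_features)
transformed_data = Xc @ components.T         # (n_samples, n_components)

# Eigenvalues of the sample covariance matrix (all 20 are kept in this sketch)
eigenvalues = S**2 / (X.shape[0] - 1)
explained_variance_ratio = eigenvalues[:n_components] / eigenvalues.sum()

# pcaMethods-style names
scores = transformed_data
loadings = components.T                      # (n_features, n_components)

# Rank-k reconstruction: project back and restore the column means
X_reconstructed = scores @ loadings.T + mean
```

The variance lost by the rank-5 reconstruction equals the sum of the discarded eigenvalues, which is why explained_variance_ratio is a direct guide to how many components to keep.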

Method Selection Guide

  • Complete data, speed priority: svd
  • Missing values: nipals or svd_impute
  • Outliers present: rpca or rnipals
  • Uncertainty quantification: bpca
  • Probabilistic modeling: ppca
  • Non-linear relationships: nlpca
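
The guide above can be encoded as a small lookup. The helper below is a hypothetical convenience, not part of the package; it simply returns the candidate method names from the list, in the same order, for the conditions that apply.

```python
from typing import List


def suggest_methods(
    has_missing: bool = False,
    has_outliers: bool = False,
    want_uncertainty: bool = False,
    want_probabilistic: bool = False,
    nonlinear: bool = False,
) -> List[str]:
    """Hypothetical helper mapping data properties to candidate method names."""
    candidates: List[str] = []
    if has_missing:
        candidates += ["nipals", "svd_impute"]
    if has_outliers:
        candidates += ["rpca", "rnipals"]
    if want_uncertainty:
        candidates.append("bpca")
    if want_probabilistic:
        candidates.append("ppca")
    if nonlinear:
        candidates.append("nlpca")
    # Complete, well-behaved data with no special requirement: plain SVD
    return candidates or ["svd"]


print(suggest_methods())                                      # ['svd']
print(suggest_methods(has_missing=True))                      # ['nipals', 'svd_impute']
print(suggest_methods(has_missing=True, has_outliers=True))
# ['nipals', 'svd_impute', 'rpca', 'rnipals']
```

When several conditions hold, the list is only a starting point; the guide does not rank the candidates against each other.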

Dependencies

  • torch: The only required dependency (>= 2.7.1)

Development & Testing

  • black: Code formatting
  • flake8: Code linting
  • mypy: Type checking
  • pytest: Test runner
  • scikit-learn: For test comparisons
  • setuptools, setuptools-scm: Packaging

Comprehensive tests are provided in the tests/ directory, covering all algorithms, edge cases, and robust/non-linear PCA scenarios.

Testing

# Run all tests from root
pytest ./tests

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Ricardo Yanzon - ricayanzon

Acknowledgments

Download files

Source Distribution

pytorch_pca-0.1.2.dev0.tar.gz (25.0 kB)

Uploaded Source

Built Distribution

pytorch_pca-0.1.2.dev0-py3-none-any.whl (19.3 kB)

Uploaded Python 3
