Probabilistic PCA (PPCA) with missing-data support - fast C++ core, clean Python API

ppca-cpp

Overview

ppca-cpp implements Probabilistic Principal Component Analysis (PPCA) as described by Tipping & Bishop (1999), with a focus on speed, usability, and robust handling of missing data. The core is written in C++ (Armadillo) and exposed through a simple Python interface.
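
As a rough illustration of the model behind PPCA, the sketch below samples from the generative model of Tipping & Bishop, x = W z + mu + eps with z ~ N(0, I) and eps ~ N(0, sigma2 * I), and compares the sample covariance with W W^T + sigma2 * I. It uses NumPy only. The function name sample_ppca and all parameter values are assumptions made for this example; none of them are part of the ppca-cpp API.

```python
import numpy as np


def sample_ppca(n_samples, W, mu, sigma2, rng):
    """Draw rows from x = W z + mu + eps (hypothetical helper, NumPy only)."""
    d, q = W.shape
    Z = rng.standard_normal((n_samples, q))                     # latent variables
    noise = np.sqrt(sigma2) * rng.standard_normal((n_samples, d))
    return Z @ W.T + mu + noise, Z


rng = np.random.default_rng(0)
W = rng.standard_normal((5, 2))      # loading matrix (n_features, n_components)
mu = np.arange(5, dtype=float)       # feature means
sigma2 = 0.1                         # isotropic noise variance

X, Z = sample_ppca(100_000, W, mu, sigma2, rng)

# Under the model, the marginal covariance of x is W W^T + sigma2 * I.
C_model = W @ W.T + sigma2 * np.eye(5)
C_empirical = np.cov(X, rowvar=False)
print("max abs covariance error:", np.abs(C_empirical - C_model).max())
```

The printed error shrinks as the number of samples grows, which is the sense in which PPCA describes the data as a low-rank covariance plus isotropic noise.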

Key Features

  • Handles missing values natively: no manual imputation needed; mark missing entries with np.nan.
  • Familiar API: Drop-in replacement for scikit-learn PCA with attributes like components_, explained_variance_, etc.
  • Probabilistic modeling: Compute log-likelihoods, posterior latent variable distributions, multiple imputations, and more.
  • Fast and scalable: Optimized C++ backend for large datasets.
  • Flexible: Supports both batch and online (mini-batch) EM.
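
To make the first and third points more concrete, here is a minimal NumPy sketch of how a PPCA posterior over the latent variables can be computed for one row while skipping missing entries: only the rows of W that correspond to observed features enter the formula. posterior_latent_row is a hypothetical helper written for this illustration; it is not the library's implementation, and the parameter values are made up.

```python
import numpy as np


def posterior_latent_row(x, W, mu, sigma2):
    """Posterior mean and covariance of z for one row; NaN marks missing entries."""
    obs = ~np.isnan(x)
    W_o = W[obs]                                   # loadings of observed features
    q = W.shape[1]
    M = W_o.T @ W_o + sigma2 * np.eye(q)
    mean = np.linalg.solve(M, W_o.T @ (x[obs] - mu[obs]))
    cov = sigma2 * np.linalg.inv(M)
    return mean, cov


rng = np.random.default_rng(0)
W = rng.standard_normal((6, 2))
mu = np.zeros(6)
sigma2 = 0.5

x = W @ np.array([1.0, -2.0]) + mu                 # a noise-free row for illustration
x_missing = x.copy()
x_missing[[1, 4]] = np.nan                         # hide two features

mean_full, cov_full = posterior_latent_row(x, W, mu, sigma2)
mean_miss, cov_miss = posterior_latent_row(x_missing, W, mu, sigma2)

# Fewer observed features means a wider posterior.
print("trace of posterior covariance:", np.trace(cov_full), np.trace(cov_miss))
```

If every entry of a row is missing, the same formula returns the prior (zero mean, identity covariance), so no special-casing is needed in this sketch.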

Quick Start

pip install ppca-py

Note: pre-built wheels are produced only for Linux and macOS (CI builds target ubuntu-latest and macos-latest). On other platforms (e.g. Windows) you will need to build from source (see Installation from Source below).

Usage example:

import numpy as np
from ppca import PPCA

X_train = np.random.randn(600, 10) + 0.1  # (n_samples, n_features)
X_train[::7, 3] = np.nan                  # missing values
X_test = np.random.randn(100, 10) + 0.1
X_test[::7, 2] = np.nan                   # missing values

model = PPCA(n_components=3, batch_size=200)
model.fit(X_train)

mZ, covZ = model.posterior_latent(X_test) # latent representation
mX, covX = model.likelihood(mZ)           # reconstruction
ll = model.score_samples(X_test)          # data log likelihood

# multiple imputation (return shape: (n_draws, n_samples, n_features))
X_imputed = model.sample_missing(X_test, n_draws=5)

# estimate of components, mean and noise variance
print("Components:", model.components_)
print("Mean:", model.mean_)
print("Noise variance:", model.noise_variance_)

For a short PPCA reference, see docs/ppca.md; usage examples are provided in examples/.
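
For intuition about what multiple imputation means under a PPCA model, the following NumPy sketch draws missing entries from the Gaussian conditional on the observed entries of each row. sample_missing_sketch is a hypothetical function written for this example. It mirrors the (n_draws, n_samples, n_features) return shape mentioned above, but it is an assumption about the general approach, not a description of how model.sample_missing is implemented.

```python
import numpy as np


def sample_missing_sketch(X, W, mu, sigma2, n_draws, rng):
    """Impute NaN entries by sampling from the PPCA conditional Gaussian.

    Returns an array of shape (n_draws, n_samples, n_features); observed
    entries are copied unchanged into every draw.
    """
    n, d = X.shape
    C = W @ W.T + sigma2 * np.eye(d)               # marginal covariance of x
    out = np.repeat(X[None, :, :], n_draws, axis=0)
    for i in range(n):
        mis = np.isnan(X[i])
        if not mis.any():
            continue                               # nothing to impute in this row
        obs = ~mis
        C_oo = C[np.ix_(obs, obs)]
        C_mo = C[np.ix_(mis, obs)]
        C_mm = C[np.ix_(mis, mis)]
        gain = np.linalg.solve(C_oo, C_mo.T).T     # C_mo @ inv(C_oo)
        cond_mean = mu[mis] + gain @ (X[i, obs] - mu[obs])
        cond_cov = C_mm - gain @ C_mo.T
        cond_cov = (cond_cov + cond_cov.T) / 2     # remove round-off asymmetry
        out[:, i, mis] = rng.multivariate_normal(cond_mean, cond_cov, size=n_draws)
    return out


rng = np.random.default_rng(0)
W = rng.standard_normal((4, 2))
mu = np.zeros(4)
sigma2 = 0.2

X = rng.standard_normal((3, 4))
X[0, 1] = np.nan
X[2, [0, 3]] = np.nan

draws = sample_missing_sketch(X, W, mu, sigma2, n_draws=5, rng=rng)
print(draws.shape)  # (5, 3, 4)
```

Averaging over the first axis (draws.mean(axis=0)) gives a point estimate for each missing entry, and draws.std(axis=0) gives a rough measure of imputation uncertainty; observed entries have zero spread across draws.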

Installation from Source

For development, install from source:

git clone https://github.com/brdav/ppca-cpp.git
cd ppca-cpp
git submodule update --init --recursive
python -m pip install -e '.[dev]'
pre-commit install

Minimum build dependencies

  • CMake >= 3.18
  • Python >= 3.9 (development headers)
  • C++17-capable compiler (clang, gcc, or MSVC)
  • BLAS/LAPACK implementation (OpenBLAS, MKL, or Accelerate)

Note: Builds on Windows are untested in CI. You can attempt a Windows build but expect manual steps.

The PPCA C++ core can also be built independently:

cmake -S src/cpp -B build/cpp -DCMAKE_BUILD_TYPE=Release
cmake --build build/cpp --target ppca -j

Internals

PPCA uses an Expectation-Maximization (EM) algorithm to learn parameters through maximum likelihood estimation. For details see the reference paper listed below. The equations for the EM algorithm in the presence of missing values are shown in docs/equations.md.
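
For readers who want a feel for the algorithm before opening the equations, below is a minimal NumPy sketch of one EM update for the simpler case of complete (no missing values), mean-centered data, following the standard updates from the PPCA literature. The function names em_step and ppca_log_likelihood are hypothetical and the data are synthetic; this is not the C++ implementation, and the missing-data variant documented in docs/equations.md is not covered here.

```python
import numpy as np


def ppca_log_likelihood(Xc, W, sigma2):
    """Marginal log-likelihood of centered data under x ~ N(0, W W^T + sigma2 I)."""
    n, d = Xc.shape
    C = W @ W.T + sigma2 * np.eye(d)
    _, logdet = np.linalg.slogdet(C)
    S = Xc.T @ Xc / n                               # sample covariance (1/n)
    return -0.5 * n * (d * np.log(2 * np.pi) + logdet
                       + np.trace(np.linalg.solve(C, S)))


def em_step(Xc, W, sigma2):
    """One EM update of (W, sigma2) for complete, centered data."""
    n, d = Xc.shape
    q = W.shape[1]
    # E-step: posterior moments of the latent variables.
    M_inv = np.linalg.inv(W.T @ W + sigma2 * np.eye(q))
    Ez = Xc @ W @ M_inv                             # rows are E[z_n]
    sum_Ezz = n * sigma2 * M_inv + Ez.T @ Ez        # sum over n of E[z_n z_n^T]
    # M-step: closed-form updates for the loadings and the noise variance.
    W_new = np.linalg.solve(sum_Ezz, Ez.T @ Xc).T   # (Xc^T Ez) @ inv(sum_Ezz)
    resid = (np.sum(Xc ** 2)
             - 2.0 * np.sum((Xc @ W_new) * Ez)
             + np.trace(sum_Ezz @ W_new.T @ W_new))
    return W_new, resid / (n * d)


rng = np.random.default_rng(0)
X = (rng.standard_normal((500, 2)) @ rng.standard_normal((2, 6))
     + 0.3 * rng.standard_normal((500, 6)))
Xc = X - X.mean(axis=0)                             # center with the sample mean

W, sigma2 = rng.standard_normal((6, 2)), 1.0        # arbitrary starting point
lls = []
for _ in range(100):
    W, sigma2 = em_step(Xc, W, sigma2)
    lls.append(ppca_log_likelihood(Xc, W, sigma2))

print("final noise variance:", sigma2)
```

Tracking the log-likelihood across iterations is a cheap sanity check: EM should never decrease it, so a drop usually points to a bug in one of the updates.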

Citing

If you use this code academically, cite the original PPCA paper:

  • M. Tipping & C. Bishop. Probabilistic Principal Component Analysis. JRSS B, 1999.

You may also reference the library name or URL.

License

MIT License — see LICENSE.


Questions or requests? Open an issue.


Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

  • ppca_py-1.0.4-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (12.2 MB): CPython 3.14; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
  • ppca_py-1.0.4-cp314-cp314-macosx_11_0_arm64.whl (208.6 kB): CPython 3.14; macOS 11.0+ ARM64
  • ppca_py-1.0.4-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (12.2 MB): CPython 3.13; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
  • ppca_py-1.0.4-cp313-cp313-macosx_11_0_arm64.whl (208.3 kB): CPython 3.13; macOS 11.0+ ARM64
  • ppca_py-1.0.4-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (12.2 MB): CPython 3.12; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
  • ppca_py-1.0.4-cp312-cp312-macosx_11_0_arm64.whl (208.3 kB): CPython 3.12; macOS 11.0+ ARM64
  • ppca_py-1.0.4-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (12.2 MB): CPython 3.11; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
  • ppca_py-1.0.4-cp311-cp311-macosx_11_0_arm64.whl (207.5 kB): CPython 3.11; macOS 11.0+ ARM64
  • ppca_py-1.0.4-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (12.2 MB): CPython 3.10; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
  • ppca_py-1.0.4-cp310-cp310-macosx_11_0_arm64.whl (206.5 kB): CPython 3.10; macOS 11.0+ ARM64
  • ppca_py-1.0.4-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (12.2 MB): CPython 3.9; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
  • ppca_py-1.0.4-cp39-cp39-macosx_11_0_arm64.whl (206.6 kB): CPython 3.9; macOS 11.0+ ARM64

File hashes

ppca_py-1.0.4-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
  • SHA256: 5d53d39a9377d8d498b8652bf8540886ce4a433116d2eaf997c48fed2c047017
  • MD5: 5198524b7c3dd0872f7655824a489235
  • BLAKE2b-256: 25485f25b0f514177a5e6ffd19653c786246d9081ac31fd551c82fbe49b441d0

ppca_py-1.0.4-cp314-cp314-macosx_11_0_arm64.whl
  • SHA256: 188050911afcdc2768a86c693f77992e52dc8ba3dbf3d2773d6bd14476874bd1
  • MD5: 48590aa02928d0f544baa72abe32b141
  • BLAKE2b-256: 37231d453bbe4e28b82a41c7a82aa9b28afb071a02e1db2d44907e3d5f5242ab

ppca_py-1.0.4-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
  • SHA256: 921d60184a3c944d96ab05d635864a3a6de9e01e3da39a28090611b215c5519e
  • MD5: 5e6b324f8da532d1146238fc09017f61
  • BLAKE2b-256: 411a3b8798f91f533e3389332285307e3a10c4e504d3c8b014612abe53c7d87b

ppca_py-1.0.4-cp313-cp313-macosx_11_0_arm64.whl
  • SHA256: 948b2ea48659b949f502051d1365a6784353755c544e9e46003f2423224723b8
  • MD5: fb8a1face2f8c5ceae9ac17dd6d7b3e2
  • BLAKE2b-256: b6632bcc42dfb238a76e5d1ebb71d570ccfe92aa020c99c0fc7a0738219f7980

ppca_py-1.0.4-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
  • SHA256: ae1fdfb75f536d047de6d9281cce3ed6c04a2c573ae11f87cd52a921fc69fba0
  • MD5: be20b5fae2b280255fd0ab7ec9bd69d8
  • BLAKE2b-256: 97b0b13ecb5b4e5c50a7572cb4b089de46267ef1163a3c9344878593b130e78e

ppca_py-1.0.4-cp312-cp312-macosx_11_0_arm64.whl
  • SHA256: 38dc8c5f7595a465c5da584cfbed29b71896e54209dbd53d956ee37938da1b6f
  • MD5: 094af664eae285aff35cda762e217e24
  • BLAKE2b-256: 34766ed0902f263ae8d7a419f3a98bc7667b2cbd379e9ae120c209fdea444b03

ppca_py-1.0.4-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
  • SHA256: 022679ea812196a6353733c6a052a7198ceb0e826e3c0e51f7b484fc762150c3
  • MD5: 1fd561640b0e7b524e3f434d9f7aa486
  • BLAKE2b-256: 2c5d6b58ead5866e3eaf895cc99cbe1b06e38566f27d67531181b7ec1dc81471

ppca_py-1.0.4-cp311-cp311-macosx_11_0_arm64.whl
  • SHA256: afd694bef69249f97f385b9546f88a92f463979334d51ea13c67b3143b3d8e08
  • MD5: 53dbc7e503ff599c2528b52f31c55953
  • BLAKE2b-256: 09981c123ad161172b9c984d4c32805922604d020edb3ce21fca125b17f81564

ppca_py-1.0.4-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
  • SHA256: d9ad18560bda9101b3fd1ddcb7be5aafa320725bdd96b898cfa05c102da79d9b
  • MD5: f4ea836036bc2be06565ddb2e5f1d9fe
  • BLAKE2b-256: e409affd0a7f7033ab157eb3ee54b1fcf5343aee848d4c9686f2fd9f7b546355

ppca_py-1.0.4-cp310-cp310-macosx_11_0_arm64.whl
  • SHA256: f76a1c52834c582efcc98f678dcf42604f2408bc063a16e39ed47547494c95a7
  • MD5: ce1565f5c72abb064515196e0ed84c90
  • BLAKE2b-256: 182e0f40ec543fd76cb6e0b6c88d8e80e714273c4138125f52cfdd6f96321480

ppca_py-1.0.4-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
  • SHA256: 772aeed025d7b72e70a324529b68c56275fe849e5069f7e94c6667e5593ecff5
  • MD5: 057ed4bdccc598bfa3752bb758c8d046
  • BLAKE2b-256: 4fa4b268b4b677f74e5f761d725543bcd24dfd2cfda87bc763e2ed95fec25491

ppca_py-1.0.4-cp39-cp39-macosx_11_0_arm64.whl
  • SHA256: ce46ca0b842172763d468214986fad04500d1e4417e02ab22f65d1e92f092b55
  • MD5: 7f1df362db93df84a7dd12de5762e2ac
  • BLAKE2b-256: d5ab7e630276b8cc67d899bdc5605c6ed397a8b59de5ab5b1623e1a6c889471e
