ppca-cpp
Probabilistic PCA (PPCA) with missing-data support — fast C++ core, clean Python API.
Overview
ppca-cpp implements Probabilistic Principal Component Analysis (PPCA) as described by Tipping & Bishop (1999), with a focus on speed, usability, and robust handling of missing data. The core is written in C++ (Armadillo), exposed via a simple Python interface.
Key Features
- Handles missing values natively: No need for manual imputation; just use np.nan for missing entries.
- Familiar API: Drop-in replacement for scikit-learn PCA with attributes like components_, explained_variance_, etc.
- Probabilistic modeling: Compute log-likelihoods, posterior latent variable distributions, multiple imputations, and more.
- Fast and scalable: Optimized C++ backend for large datasets.
- Flexible: Supports both batch and online (mini-batch) EM.
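To illustrate what handling missing values natively can mean for a Gaussian latent-variable model, here is a minimal NumPy sketch of a per-sample log-likelihood that marginalizes over NaN entries by dropping the corresponding rows of the loading matrix. It is not the library's implementation: the function name observed_log_likelihood and the (n_features, n_components) layout of W are assumptions made for this example.

```python
import numpy as np


def observed_log_likelihood(X, W, mu, sigma2):
    """Per-row Gaussian log-likelihood using only observed (non-NaN) entries.

    Hypothetical helper for illustration. W is assumed to have shape
    (n_features, n_components); mu has shape (n_features,).
    """
    X = np.asarray(X, dtype=float)
    out = np.empty(X.shape[0])
    for i, x in enumerate(X):
        obs = ~np.isnan(x)                 # mask of observed features
        d_obs = int(obs.sum())
        if d_obs == 0:                     # nothing observed: log-likelihood is 0
            out[i] = 0.0
            continue
        W_o = W[obs]                       # rows of W for the observed features
        # Marginal covariance of the observed block: W_o W_o^T + sigma2 * I
        C = W_o @ W_o.T + sigma2 * np.eye(d_obs)
        diff = x[obs] - mu[obs]
        _, logdet = np.linalg.slogdet(C)
        maha = diff @ np.linalg.solve(C, diff)
        out[i] = -0.5 * (d_obs * np.log(2.0 * np.pi) + logdet + maha)
    return out
```

Because the marginal of a multivariate Gaussian is obtained by selecting the relevant sub-vector of the mean and sub-block of the covariance, no imputation is needed to evaluate the likelihood of a partially observed row.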
Quick Start
pip install ppca-py
Note: pre-built wheels are produced only for Linux and macOS (CI builds target ubuntu-latest and macos-latest). On other platforms (e.g. Windows) you will need to build from source (see further below).
Usage example:
import numpy as np
from ppca import PPCA
X_train = np.random.randn(600, 10) + 0.1 # (n_samples, n_features)
X_train[::7, 3] = np.nan # missing values
X_test = np.random.randn(100, 10) + 0.1
X_test[::7, 2] = np.nan # missing values
model = PPCA(n_components=3, batch_size=200)
model.fit(X_train)
mZ, covZ = model.posterior_latent(X_test) # latent representation
mX, covX = model.likelihood(mZ) # reconstruction
ll = model.score_samples(X_test) # data log likelihood
# multiple imputation (return shape: (n_draws, n_samples, n_features))
X_imputed = model.sample_missing(X_test, n_draws=5)
# estimate of components, mean and noise variance
print("Components:", model.components_)
print("Mean:", model.mean_)
print("Noise variance:", model.noise_variance_)
For a short PPCA reference, see docs/ppca.md; usage examples are provided in examples/.
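The multiple-imputation output in the usage example has shape (n_draws, n_samples, n_features). One common way to use such draws is to pool them into a point estimate and a per-entry spread. The sketch below uses plain NumPy and a stand-in array in place of the sample_missing output; pool_imputations is a hypothetical helper, not part of the package, and it assumes that observed entries should be taken from the original data.

```python
import numpy as np


def pool_imputations(X, draws):
    """Pool imputation draws of shape (n_draws, n_samples, n_features).

    Hypothetical helper: returns the mean over draws at missing positions
    (observed values are kept from X) and the between-draw standard
    deviation, which is set to zero for observed entries.
    """
    X = np.asarray(X, dtype=float)
    draws = np.asarray(draws, dtype=float)
    if draws.ndim != 3 or draws.shape[1:] != X.shape:
        raise ValueError("draws must have shape (n_draws, *X.shape)")
    if draws.shape[0] < 2:
        raise ValueError("need at least two draws to estimate a spread")
    missing = np.isnan(X)
    pooled = np.where(missing, draws.mean(axis=0), X)
    spread = np.where(missing, draws.std(axis=0, ddof=1), 0.0)
    return pooled, spread


# Stand-in for the sampler output: copies of X with the gap filled randomly.
rng = np.random.default_rng(0)
X_demo = np.array([[1.0, np.nan], [2.0, 3.0]])
draws_demo = np.repeat(X_demo[None], 4, axis=0)
draws_demo[:, 0, 1] = rng.normal(size=4)
pooled_demo, spread_demo = pool_imputations(X_demo, draws_demo)
```

With the real model, X_imputed from the usage example would be passed as draws and X_test as X. The spread gives a rough per-entry indication of how uncertain each imputed value is.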
Installation from Source
For development, install from source:
git clone https://github.com/brdav/ppca-cpp.git
cd ppca-cpp
git submodule update --init --recursive
python -m pip install -e '.[dev]'
pre-commit install
Minimum build dependencies
- CMake >= 3.18
- Python >= 3.9 (development headers)
- C++17-capable compiler (clang, gcc, or MSVC)
- BLAS/LAPACK implementation (OpenBLAS, MKL, or Accelerate)
Note: Builds on Windows are untested in CI. You can attempt a Windows build but expect manual steps.
The PPCA C++ core can also be built independently:
cmake -S src/cpp -B build/cpp -DCMAKE_BUILD_TYPE=Release
cmake --build build/cpp --target ppca -j
Internals
PPCA uses an Expectation-Maximization (EM) algorithm to learn parameters through maximum likelihood estimation. For details see the reference paper listed below. The equations for the EM algorithm in the presence of missing values are shown in docs/equations.md.
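As a rough illustration of the EM iteration, the following NumPy sketch fits PPCA on complete data (no missing values) using the standard updates from Tipping & Bishop (1999). It is a simplified stand-in, not the C++ core: the function name ppca_em, its parameters, the random initialization, and the fixed iteration count are assumptions of this example, and the missing-data case requires the per-sample equations referenced above.

```python
import numpy as np


def ppca_em(X, n_components, n_iter=200, seed=0):
    """Fit PPCA by EM on complete data. Returns (W, mu, sigma2).

    Illustrative sketch only; W has shape (n_features, n_components).
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    q = n_components
    rng = np.random.default_rng(seed)
    mu = X.mean(axis=0)                  # ML estimate of the mean
    Xc = X - mu                          # centered data
    W = rng.standard_normal((d, q))      # random initial loadings
    sigma2 = 1.0                         # initial noise variance
    for _ in range(n_iter):
        # E-step: posterior moments of the latent variables
        M_inv = np.linalg.inv(W.T @ W + sigma2 * np.eye(q))
        Ez = Xc @ W @ M_inv                        # rows are E[z_n]
        sum_Ezz = n * sigma2 * M_inv + Ez.T @ Ez   # sum over n of E[z_n z_n^T]
        # M-step: update loadings, then noise variance using the new W
        W = (Xc.T @ Ez) @ np.linalg.inv(sum_Ezz)
        sigma2 = (
            np.sum(Xc ** 2)
            - 2.0 * np.sum(Ez * (Xc @ W))
            + np.trace(sum_Ezz @ W.T @ W)
        ) / (n * d)
    return W, mu, sigma2
```

At convergence, the noise variance should approach the average of the discarded eigenvalues of the sample covariance, which is the closed-form maximum likelihood solution for complete data. A production implementation would typically monitor the log-likelihood and stop on a tolerance rather than run a fixed number of iterations.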
Citing
If you use this code academically, cite the original PPCA paper:
- M. Tipping & C. Bishop. Probabilistic Principal Component Analysis. JRSS B, 1999.
You may also reference the library name or URL.
License
MIT License — see LICENSE.
Questions or requests? Open an issue.