Complex-valued mixture of factor analyzers with a scikit-learn-like API.

cplx-mfa

cplx-mfa provides an estimator for fitting mixture of factor analyzers (MFA) models to complex-valued data using expectation-maximization (EM). The model uses circularly symmetric complex Gaussian components with low-rank covariance structure, making it useful for high-dimensional signal processing problems where full covariance Gaussian mixtures can be too expensive or statistically inefficient.

Parts of the implementation are derived from the original mofa package and extended with complex-valued modeling support, modern packaging, improved naming, validation, reproducible initialization, and functional tests.

✨ Highlights

  • Complex-valued mixture of factor analyzers for data in complex vector spaces
  • Low-rank covariance structure via component-wise factor loading matrices
  • Circularly symmetric complex Gaussian components
  • scikit-learn-like estimator API with fit, predict, predict_proba, and sample
  • Optional isotropic PPCA-style noise model
  • Optional shared diagonal noise variances across components
  • Sampling from fitted complex-valued MFA models
  • Fitted parameters exposed with trailing-underscore names
  • Modern Python packaging with pyproject.toml, uv, pytest, and ruff

📌 Citation

If you use cplx-mfa in academic work, please cite the package directly:

@software{fesl_cplx_mfa,
  author = {Fesl, Benedikt},
  title = {{cplx-mfa}: Complex-valued mixture of factor analyzers},
  year = {2026},
  url = {https://github.com/benediktfesl/cplx-mfa},
  version = {0.1.0}
}

Plain-text citation:

B. Fesl, cplx-mfa: Complex-valued mixture of factor analyzers, version 0.1.0. Available: https://github.com/benediktfesl/cplx-mfa

📦 Installation

Install from PyPI:

pip install cplx-mfa

or with uv:

uv add cplx-mfa

For development, clone the repository and install the development environment:

git clone https://github.com/benediktfesl/cplx-mfa.git
cd cplx-mfa
uv sync --group dev

🚀 Quick Start

import numpy as np

from cplx_mfa import ComplexMFA

rng = np.random.default_rng(0)

X = (
    rng.normal(size=(1_000, 8))
    + 1j * rng.normal(size=(1_000, 8))
) / np.sqrt(2.0)

model = ComplexMFA(
    n_components=4,
    latent_dim=2,
    random_state=0,
    max_iter=100,
    verbose=False,
)

model.fit(X)

labels = model.predict(X)
responsibilities = model.predict_proba(X)

samples, component_labels = model.sample(
    n_samples=100,
    rng=np.random.default_rng(1),
)

The estimator follows the usual scikit-learn pattern: model configuration is passed to the constructor, and fit(X) receives the data.
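The synthetic data in the quick start is divided by sqrt(2) so that each complex entry has unit variance, with the power split evenly between real and imaginary parts. The following NumPy-only check is a minimal sketch of that property and does not use cplx-mfa; the sample count is increased here only to make the estimates stable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same construction as the quick start: independent real and imaginary parts,
# each with variance 1/2 after the 1/sqrt(2) scaling.
X = (
    rng.normal(size=(100_000, 8))
    + 1j * rng.normal(size=(100_000, 8))
) / np.sqrt(2.0)

# Average power E[|x|^2] per entry; close to 1 for this construction.
power = np.mean(np.abs(X) ** 2)

# Pseudo-variance E[x^2]; close to 0 for circularly symmetric data.
pseudo_variance = np.mean(X**2)

print(f"power={power:.3f}, |pseudo-variance|={abs(pseudo_variance):.3f}")
```

A pseudo-variance near zero is what distinguishes circularly symmetric data from general complex data with correlated real and imaginary parts.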

🧩 Model Structure

A mixture of factor analyzers represents each mixture component with a low-rank covariance structure:

covariance = loadings @ loadingsᴴ + diagonal_noise

This is useful when the feature dimension is large but the dominant component-wise variation is approximately low-dimensional.
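As a minimal sketch of this structure, the snippet below builds one component covariance from a hypothetical loading matrix and hypothetical noise variances using NumPy only. The arrays are random stand-ins, not fitted values from cplx-mfa.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, latent_dim = 8, 2

# Hypothetical stand-ins for one component's parameters (not fitted values).
loadings = (
    rng.normal(size=(n_features, latent_dim))
    + 1j * rng.normal(size=(n_features, latent_dim))
) / np.sqrt(2.0)
noise_variances = rng.uniform(0.1, 1.0, size=n_features)

# covariance = loadings @ loadings^H + diagonal_noise
covariance = loadings @ loadings.conj().T + np.diag(noise_variances)

# The result is Hermitian and positive definite because the noise is positive.
is_hermitian = np.allclose(covariance, covariance.conj().T)
min_eigenvalue = np.linalg.eigvalsh(covariance).min()

# Subtracting the diagonal noise leaves a matrix of rank latent_dim.
low_rank_part = covariance - np.diag(noise_variances)
rank = np.linalg.matrix_rank(low_rank_part)

print(is_hermitian, rank)
```

Only the loading matrix and the diagonal need to be stored per component, which is far fewer values than a full n_features by n_features covariance when latent_dim is small.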

The fitted attributes are:

| Attribute | Description |
| --- | --- |
| weights_ | Mixture weights of shape (n_components,). |
| means_ | Component means of shape (n_components, n_features). |
| loadings_ | Factor loading matrices of shape (n_components, n_features, latent_dim). |
| covariances_ | Full implied covariance matrices of shape (n_components, n_features, n_features). |
| precisions_ | Inverse covariance matrices of shape (n_components, n_features, n_features). |
| noise_variances_ | Diagonal noise variances of shape (n_components, n_features). |
| lower_bound_history_ | EM lower-bound values collected during fitting. |
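The shapes above can be related to each other with plain NumPy. The sketch below uses hypothetical arrays with the documented shapes to form the implied covariances and their inverses. It obtains the inverses through the Woodbury identity, a standard result for low-rank-plus-diagonal matrices; whether cplx-mfa computes precisions_ this way is not stated here and should be treated as an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n_components, n_features, latent_dim = 3, 6, 2
diag = np.arange(n_features)

# Hypothetical arrays with the documented shapes (not fitted values).
loadings = (
    rng.normal(size=(n_components, n_features, latent_dim))
    + 1j * rng.normal(size=(n_components, n_features, latent_dim))
)
noise_variances = rng.uniform(0.5, 1.5, size=(n_components, n_features))

# Conjugate transposes: (n_components, latent_dim, n_features).
loadings_h = loadings.conj().transpose(0, 2, 1)

# Implied covariances: (n_components, n_features, n_features).
covariances = loadings @ loadings_h
covariances[:, diag, diag] += noise_variances

# Woodbury identity: only a latent_dim x latent_dim system is solved
# per component instead of inverting an n_features x n_features matrix.
inv_noise = 1.0 / noise_variances
scaled = inv_noise[:, :, None] * loadings          # Psi^-1 W
core = np.eye(latent_dim) + loadings_h @ scaled    # I + W^H Psi^-1 W
scaled_h = scaled.conj().transpose(0, 2, 1)        # W^H Psi^-1
precisions = -(scaled @ np.linalg.solve(core, scaled_h))
precisions[:, diag, diag] += inv_noise

# Compare with direct inversion of the full matrices.
max_error = np.abs(precisions - np.linalg.inv(covariances)).max()
print(covariances.shape, precisions.shape)
```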

🧠 Estimator API

The main class is:

from cplx_mfa import ComplexMFA

Core methods:

| Method | Description |
| --- | --- |
| fit(X) | Fit the complex-valued MFA model. |
| predict(X) | Predict the most likely component for each sample. |
| predict_proba(X) | Return posterior component probabilities. |
| sample(n_samples=1, rng=None) | Draw samples from the fitted mixture model. |
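The relationship between hard labels and posterior probabilities can be sketched with a hypothetical responsibility matrix. The snippet assumes that the most likely component is the argmax of the posterior probabilities, which matches the descriptions above but is not verified against the package source.

```python
import numpy as np

# Hypothetical responsibilities for 4 samples and 3 components,
# shaped like the output of predict_proba: (n_samples, n_components).
responsibilities = np.array(
    [
        [0.7, 0.2, 0.1],
        [0.1, 0.8, 0.1],
        [0.3, 0.3, 0.4],
        [0.5, 0.25, 0.25],
    ]
)

# Each row is a probability distribution over components.
row_sums = responsibilities.sum(axis=1)

# Hard assignment: index of the largest posterior probability per sample.
labels = responsibilities.argmax(axis=1)
print(labels)  # [0 1 2 0]
```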

Constructor parameters:

| Parameter | Description |
| --- | --- |
| n_components | Number of mixture components. |
| latent_dim | Latent dimensionality of each factor analyzer. |
| ppca | If True, use one isotropic noise variance per component. |
| lock_psis | If True, use shared diagonal noise variances across components. |
| rs_clip | Lower clipping value for responsibilities during EM. |
| max_condition_number | Scaling factor used for random loading initialization. |
| max_iter | Maximum number of EM iterations. |
| tol | Relative convergence tolerance. |
| random_state | Integer seed or NumPy random generator used for initialization. |
| verbose | If True, print EM progress. |
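To illustrate what a relative convergence tolerance typically means, the sketch below applies one common stopping rule to a hypothetical lower-bound history. The function has_converged and the exact form of the criterion are assumptions for illustration; the rule used inside cplx-mfa may differ.

```python
def has_converged(previous: float, current: float, tol: float) -> bool:
    """Return True when the relative change of the lower bound is below tol."""
    # Floor the denominator to avoid division by zero.
    denominator = max(abs(previous), 1e-300)
    return abs(current - previous) / denominator < tol


# Hypothetical lower-bound values from successive EM iterations.
history = [-1200.0, -1050.0, -1049.0, -1048.9999]
tol = 1e-5

flags = [has_converged(a, b, tol) for a, b in zip(history, history[1:])]
print(flags)  # [False, False, True]
```

With a rule of this kind, a smaller tol lets EM run longer before stopping, up to max_iter.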

🔁 Sampling

Samples are generated component-wise and returned grouped by component. The returned labels follow the same order.

samples, labels = model.sample(
    n_samples=100,
    rng=np.random.default_rng(1),
)

As a result, labels is sorted by component rather than randomly shuffled. This is intentional: it lets generated samples be inspected component by component.
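When a randomly ordered draw is needed, samples and labels can be shuffled together with one shared permutation. The sketch below uses hypothetical stand-in arrays in place of the output of model.sample, so it runs without a fitted model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-ins for the output of model.sample(...):
# labels grouped by component, samples in the same order.
labels = np.array([0, 0, 0, 1, 1, 2, 2, 2])
samples = rng.normal(size=(8, 4)) + 1j * rng.normal(size=(8, 4))

# Grouped order makes per-component selection straightforward.
samples_component_1 = samples[labels == 1]

# Apply one shared permutation to keep samples and labels aligned.
permutation = rng.permutation(len(labels))
shuffled_samples = samples[permutation]
shuffled_labels = labels[permutation]
```

Indexing both arrays with the same permutation keeps every sample paired with its component label.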

🔬 PPCA-Style Components

Set ppca=True to constrain each component to a single isotropic noise variance, so that its noise covariance is a scaled identity matrix:

model = ComplexMFA(
    n_components=4,
    latent_dim=2,
    ppca=True,
    random_state=0,
)

model.fit(X)

This gives a probabilistic PCA-style covariance structure per mixture component.
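One consequence of an isotropic noise term is that the smallest n_features - latent_dim eigenvalues of the component covariance all equal the noise variance. The NumPy-only sketch below checks this with a hypothetical loading matrix and noise value; it does not call cplx-mfa.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, latent_dim = 8, 2
noise_variance = 0.3  # hypothetical isotropic noise variance of one component

# Hypothetical loading matrix (not a fitted value).
loadings = (
    rng.normal(size=(n_features, latent_dim))
    + 1j * rng.normal(size=(n_features, latent_dim))
)

# PPCA-style covariance: low-rank part plus a scaled identity.
covariance = loadings @ loadings.conj().T + noise_variance * np.eye(n_features)

# Eigenvalues in ascending order; the smallest n_features - latent_dim
# of them equal the noise variance.
eigenvalues = np.linalg.eigvalsh(covariance)
noise_floor = eigenvalues[: n_features - latent_dim]
print(np.round(noise_floor, 6))
```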

🔒 Shared Noise Variances

Set lock_psis=True to enforce shared diagonal noise variances across all mixture components:

model = ComplexMFA(
    n_components=4,
    latent_dim=2,
    lock_psis=True,
    random_state=0,
)

model.fit(X)

This can be useful when all mixture components are expected to share the same residual noise floor.
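The effect of the two noise options on model size can be estimated by counting free noise variances. The helper below is a hypothetical illustration derived from the parameter descriptions, not part of the package. How cplx-mfa behaves when both flags are enabled at once is an assumption here.

```python
def noise_parameter_count(
    n_components: int,
    n_features: int,
    ppca: bool = False,
    lock_psis: bool = False,
) -> int:
    """Count free noise variances under the described constraints."""
    # Isotropic noise needs one value instead of one per feature.
    per_group = 1 if ppca else n_features
    # Shared noise uses one group instead of one per component.
    n_groups = 1 if lock_psis else n_components
    return per_group * n_groups


default_count = noise_parameter_count(4, 8)
ppca_count = noise_parameter_count(4, 8, ppca=True)
shared_count = noise_parameter_count(4, 8, lock_psis=True)
print(default_count, ppca_count, shared_count)  # 32 4 8
```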

📚 Research Background

This implementation was developed in the context of complex-valued generative modeling for wireless channel estimation and related signal processing applications.

The results of the following work are based in part on the complex-valued MFA implementation:

  • B. Fesl, N. Turan, and W. Utschick, “Low-Rank Structured MMSE Channel Estimation with Mixtures of Factor Analyzers,” 57th Asilomar Conference on Signals, Systems, and Computers, 2023.

🧪 Development

Install the development environment with uv:

uv sync --group dev

Run tests:

uv run pytest

Run linting:

uv run ruff check .

Format code:

uv run ruff format .

Run the example:

uv run python examples/cplx_mfa_example.py

Build the package:

uv run python -m build

✅ Test Coverage

The test suite covers:

  • package imports
  • fitting and fitted attribute shapes
  • prediction and responsibility normalization
  • sampling behavior
  • grouped sample labels
  • validation behavior
  • unfitted-estimator behavior
  • reproducibility with fixed random_state
  • EM lower-bound history
  • utility functions for complex-valued data handling
  • example execution

📄 License

This project is licensed under the BSD 3-Clause License.

The implementation contains code derived from the original mofa package, which is MIT-licensed. Attribution and provenance are retained in NOTICE.
