
cplx-gmm


Complex-valued Gaussian mixture models with structured covariance matrices.

cplx-gmm provides a scikit-learn-style estimator for fitting Gaussian mixture models (GMMs) to complex-valued data using expectation-maximization (EM). It supports circularly symmetric complex Gaussian components and structured covariance types that are not available in the standard scikit-learn GaussianMixture, including circulant, block-circulant, Toeplitz, and block-Toeplitz covariance matrices.
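To make "circularly symmetric" concrete: a circularly symmetric complex Gaussian vector has covariance E[x x^H] = C but vanishing pseudo-covariance E[x x^T] = 0. A hedged plain-numpy sketch (illustrative only, not part of the cplx-gmm API) that draws such samples and checks both moments empirically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw n samples from a circularly symmetric complex Gaussian CN(0, cov).
n, d = 20_000, 4
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
cov = A @ A.conj().T + d * np.eye(d)          # Hermitian positive definite

# White circular noise: real and imaginary parts each carry half the
# variance, so E[w w^H] = I and the pseudo-covariance E[w w^T] = 0.
w = (rng.normal(size=(n, d)) + 1j * rng.normal(size=(n, d))) / np.sqrt(2)
L = np.linalg.cholesky(cov)
x = w @ L.T                                   # rows are samples, E[x x^H] = cov

emp = x.T @ x.conj() / n                      # empirical covariance, close to cov
pseudo = x.T @ x / n                          # empirical pseudo-covariance, close to 0
```

The 1/sqrt(2) scaling is what makes the noise circular with unit power per dimension; the same normalization appears in the quick-start example below.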

✨ Highlights

  • Complex-valued Gaussian mixture models for data in complex vector spaces
  • scikit-learn-style estimator API
  • Structured covariance models beyond standard scikit-learn GMMs
  • Full, diagonal, spherical, circulant, block-circulant, Toeplitz, and block-Toeplitz covariance types
  • Optional zero-mean component constraint
  • Sampling from fitted complex-valued mixture models
  • Tested covariance-structure correctness for non-trivial dimensions
  • Modern Python packaging with pyproject.toml, uv, pytest, and ruff

📌 Citation

If you use cplx-gmm in academic work, please cite the package directly:

@software{fesl_cplx_gmm,
  author = {Fesl, Benedikt},
  title = {{cplx-gmm}: Complex-valued Gaussian mixture models with structured covariance matrices},
  year = {2026},
  url = {https://github.com/benediktfesl/GMM_cplx},
  version = {0.1.0}
}

Plain-text citation:

B. Fesl, cplx-gmm: Complex-valued Gaussian mixture models with structured covariance matrices, version 0.1.0. Available: https://github.com/benediktfesl/GMM_cplx

📦 Installation

Install from PyPI:

pip install cplx-gmm

or with uv:

uv add cplx-gmm

🧩 Covariance Structures

The covariance models are one of the main reasons to use this package. In addition to the usual full, diag, and spherical covariance types, cplx-gmm supports structured covariance matrices that are common in signal processing and wireless channel modeling.

| `covariance_type` | Description |
| --- | --- |
| `"full"` | Full covariance matrix for each component. |
| `"diag"` | Diagonal covariance for each component. |
| `"spherical"` | One scalar variance per component. |
| `"circulant"` | Circulant covariance matrix for each component. |
| `"block-circulant"` | Block-circulant covariance matrix. Requires `blocks=(n_1, n_2)`. |
| `"toeplitz"` | Toeplitz covariance matrix for each component. |
| `"block-toeplitz"` | Block-Toeplitz covariance matrix. Requires `blocks=(n_1, n_2)`. |

For block-structured covariance types, pass the block dimensions in the constructor:

model = GaussianMixtureCplx(
    n_components=4,
    covariance_type="block-circulant",
    blocks=(4, 8),
    random_state=0,
)

model.fit(X)

Here, blocks=(4, 8) means that the feature dimension must satisfy n_features = 4 * 8 = 32.

🚀 Quick Start

import numpy as np

from cplx_gmm import GaussianMixtureCplx

rng = np.random.default_rng(0)

X = (
    rng.normal(size=(1_000, 8))
    + 1j * rng.normal(size=(1_000, 8))
) / np.sqrt(2)

model = GaussianMixtureCplx(
    n_components=4,
    covariance_type="full",
    random_state=0,
    max_iter=100,
    n_init=1,
)

model.fit(X)

labels = model.predict(X)
responsibilities = model.predict_proba(X)
log_likelihood = model.score(X)

samples, component_labels = model.sample(n_samples=100)

The estimator follows the usual scikit-learn pattern: model configuration is passed to the constructor, and fit(X) receives the data.

📚 Research Background

This implementation was developed in the context of complex-valued Gaussian mixture modeling for wireless channel estimation and related signal processing applications.

The results of the following works are based, in part, on this complex-valued implementation:

  • M. Koller, B. Fesl, N. Turan, and W. Utschick, “An Asymptotically MSE-Optimal Estimator Based on Gaussian Mixture Models,” IEEE Transactions on Signal Processing, vol. 70, pp. 4109–4123, 2022.

  • N. Turan, B. Fesl, M. Grundei, M. Koller, and W. Utschick, “Evaluation of a Gaussian Mixture Model-based Channel Estimator using Measurement Data,” International Symposium on Wireless Communication Systems (ISWCS), 2022.

  • B. Fesl, M. Joham, S. Hu, M. Koller, N. Turan, and W. Utschick, “Channel Estimation based on Gaussian Mixture Models with Structured Covariances,” 56th Asilomar Conference on Signals, Systems, and Computers, 2022, pp. 533–537.

  • B. Fesl, N. Turan, M. Joham, and W. Utschick, “Learning a Gaussian Mixture Model from Imperfect Training Data for Robust Channel Estimation,” IEEE Wireless Communications Letters, 2023.

  • M. Koller, B. Fesl, N. Turan, and W. Utschick, “An Asymptotically Optimal Approximation of the Conditional Mean Channel Estimator Based on Gaussian Mixture Models,” IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022, pp. 5268–5272.

  • B. Fesl, A. Faika, N. Turan, M. Joham, and W. Utschick, “Channel Estimation with Reduced Phase Allocations in RIS-Aided Systems,” IEEE 24th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), 2023, pp. 161–165.

  • N. Turan, B. Fesl, M. Koller, M. Joham, and W. Utschick, “A Versatile Low-Complexity Feedback Scheme for FDD Systems via Generative Modeling,” IEEE Transactions on Wireless Communications, 2023.

  • N. Turan, B. Fesl, and W. Utschick, “Enhanced Low-Complexity FDD System Feedback with Variable Bit Lengths via Generative Modeling,” 57th Asilomar Conference on Signals, Systems, and Computers, 2023.

  • N. Turan, M. Koller, B. Fesl, S. Bazzi, W. Xu, and W. Utschick, “GMM-based Codebook Construction and Feedback Encoding in FDD Systems,” 56th Asilomar Conference on Signals, Systems, and Computers, 2022, pp. 37–42.

  • ... and more

🧠 Estimator API

The main class is:

from cplx_gmm import GaussianMixtureCplx

Core methods:

| Method | Description |
| --- | --- |
| `fit(X, y=None)` | Fit the complex-valued GMM. |
| `fit_predict(X, y=None)` | Fit the model and return component labels. |
| `predict(X)` | Predict the most likely component for each sample. |
| `predict_proba(X)` | Return posterior component probabilities. |
| `score_samples(X)` | Return per-sample log-likelihoods. |
| `score(X, y=None)` | Return the mean log-likelihood. |
| `sample(n_samples=1)` | Draw samples from the fitted mixture model. |

Fitted parameters follow scikit-learn-style trailing-underscore names such as weights_, means_, covariances_, precisions_, precisions_cholesky_, converged_, n_iter_, and lower_bound_.

🔒 Zero-Mean Components

Some signal processing models assume zero-mean Gaussian components. This can be enforced with:

model = GaussianMixtureCplx(
    n_components=4,
    covariance_type="full",
    zero_mean=True,
)

model.fit(X)

When zero_mean=True, all component means are fixed to zero during fitting.
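Conceptually, the zero-mean constraint replaces the centered second moment in the M-step covariance update with an uncentered one. A hedged numpy sketch of that single responsibility-weighted update (illustrative only; the synthetic responsibilities and variable names are assumptions, not the package's internals):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 1000, 4, 3
X = (rng.normal(size=(n, d)) + 1j * rng.normal(size=(n, d))) / np.sqrt(2)

# Synthetic responsibilities for illustration; in EM these come from the E-step.
resp = rng.random(size=(n, k))
resp /= resp.sum(axis=1, keepdims=True)
nk = resp.sum(axis=0)

# Zero-mean update: Sigma_k = (1 / N_k) * sum_i r_ik * x_i x_i^H,
# i.e. no component mean is subtracted before forming the outer products.
covs = np.einsum("ik,id,ie->kde", resp, X, X.conj()) / nk[:, None, None]
```

Because each term is a nonnegatively weighted outer product, every resulting component covariance is Hermitian positive semidefinite by construction.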

🔁 Circulant Covariances

Circulant covariance matrices are diagonalized by the discrete Fourier transform (DFT). For "circulant" and "block-circulant", the estimator fits a diagonal covariance model in the Fourier domain and transforms the fitted parameters back to the original domain.

model = GaussianMixtureCplx(
    n_components=4,
    covariance_type="circulant",
    random_state=0,
)

model.fit(X)

For block-circulant covariances, a two-dimensional FFT representation is used.
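The one-dimensional case is easy to verify with plain numpy: for any Hermitian circulant C with first column c, the unitary DFT matrix F satisfies F C F^H = diag(fft(c)). A hedged sketch, independent of the cplx-gmm internals:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8

# Choose a real, nonnegative spectrum; its inverse DFT is the first
# column of a Hermitian positive semidefinite circulant matrix.
spectrum = np.abs(np.fft.fft(rng.normal(size=d))) ** 2
c = np.fft.ifft(spectrum).real
C = np.stack([np.roll(c, k) for k in range(d)], axis=1)   # circulant from c

F = np.fft.fft(np.eye(d)) / np.sqrt(d)    # unitary DFT matrix
D = F @ C @ F.conj().T                     # diagonal, with entries fft(c)
```

This is why fitting a circulant-covariance GMM reduces to fitting a diagonal model on the DFT of the data, which is both cheaper and better conditioned than estimating a full matrix.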

📐 Toeplitz Covariances

Toeplitz and block-Toeplitz covariance fitting uses an EM-based inverse covariance update inspired by:

T. A. Barton and D. R. Fuhrmann, “Covariance Estimation for Multidimensional Data using the EM Algorithm,” Proceedings of the 27th Asilomar Conference on Signals, Systems and Computers, 1993, pp. 203–207.

Example:

model = GaussianMixtureCplx(
    n_components=4,
    covariance_type="toeplitz",
    random_state=0,
)

model.fit(X)
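The Barton–Fuhrmann EM update itself is beyond a short snippet, but the structural constraint it enforces is easy to illustrate: the closest Toeplitz matrix (in Frobenius norm) to an unstructured covariance estimate is obtained by averaging along diagonals. A simplified sketch of that projection (`project_toeplitz` is an illustrative helper, not the package's internal update):

```python
import numpy as np

def project_toeplitz(S):
    """Orthogonal projection of S onto the Toeplitz subspace:
    replace every diagonal by its mean."""
    d = S.shape[0]
    T = np.zeros_like(S)
    for k in range(-(d - 1), d):
        mean_k = np.mean(np.diagonal(S, offset=k))
        i = np.arange(max(0, -k), min(d, d - k))   # row indices of diagonal k
        T[i, i + k] = mean_k
    return T

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
S = A @ A.conj().T                  # unstructured Hermitian sample covariance
T = project_toeplitz(S)
```

The projection preserves the trace (total power) and Hermitian symmetry of the input, while reducing the free parameters from O(d^2) to O(d).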

🧪 Development

Clone the repository and install the development environment with uv:

git clone https://github.com/benediktfesl/GMM_cplx.git
cd GMM_cplx
uv sync

Run tests:

uv run pytest

✅ Test Coverage

The test suite covers:

  • package imports
  • sklearn-style estimator API
  • validation behavior
  • all supported covariance types
  • structural covariance correctness
  • EM lower-bound monotonicity
  • zero-mean fitting
  • initialization options
  • warm-start behavior
  • reproducibility with fixed random_state
  • sampling behavior
  • real-valued compatibility checks
  • doubled real-valued likelihood equivalence
  • example execution

📄 License

This project is licensed under the BSD 3-Clause License.

The implementation is based on ideas and portions of the original scikit-learn Gaussian mixture implementation, which is also distributed under the BSD 3-Clause License.

See LICENSE for details.
