
Clustering using the mclust algorithm.

Project description


mclustpy

mclustpy is a Python package that provides a single function, mclustpy, for clustering data with the Mclust function from the R package mclust, which fits Gaussian finite mixture models. The function takes a 2D NumPy array and returns a dictionary of the output values computed by Mclust.

Installation

mclustpy depends on the following Python packages:

  • numpy
  • rpy2

Because rpy2 runs R in-process, a working R installation is also required.

Install mclustpy with pip:

pip install mclustpy
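
Before the first run, it can help to confirm that the prerequisites are visible from Python. The following is a minimal sketch using only the standard library; the helper name check_environment is hypothetical and not part of mclustpy. Finding an R executable on PATH is only a hint, since rpy2 can also locate R by other means.

```python
import importlib.util
import shutil


def check_environment(modules=("numpy", "rpy2")):
    """Report which prerequisites appear to be available.

    Hypothetical helper for illustration; not part of mclustpy.
    """
    # find_spec returns None when a top-level module cannot be located.
    status = {name: importlib.util.find_spec(name) is not None for name in modules}
    # An R executable on PATH suggests, but does not prove, that rpy2 can find R.
    status["R executable"] = shutil.which("R") is not None
    return status


for name, found in check_environment().items():
    print(f"{name}: {'found' if found else 'not found'}")
```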

Usage

from mclustpy import mclustpy
import numpy as np

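# 1000 observations of 10 variables, drawn uniformly from [0, 1)
data = np.random.rand(1000, 10)
print(data.shape)  # (1000, 10)

# Cluster the data, passing the documented default settings explicitly
res = mclustpy(data, G=9, modelNames='EEE', random_seed=2020)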

The mclustpy function takes the following parameters:

  • data: a 2D numpy array of data to be clustered.
  • G: an integer specifying the maximum number of mixture components to be considered (default is 9).
  • modelNames: a string naming the mclust covariance model to be considered (default is 'EEE': ellipsoidal components with equal volume, shape, and orientation).
  • random_seed: an integer specifying the random seed for reproducibility (default is 2020).
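
To give a sense of how G and modelNames determine model size, the sketch below counts the free parameters of an 'EEE' mixture (one covariance matrix shared by all components) fitted to d variables. The helper name eee_param_count is hypothetical and not part of mclustpy. It assumes the usual count for this model: G - 1 mixing proportions, G*d mean coordinates, and d(d+1)/2 covariance entries.

```python
def eee_param_count(G: int, d: int) -> int:
    """Free parameters of a G-component 'EEE' Gaussian mixture in d dimensions.

    Hypothetical helper for illustration; not part of mclustpy.
    """
    if G < 1 or d < 1:
        raise ValueError("G and d must be positive integers")
    mixing = G - 1                 # proportions sum to 1, so one is determined
    means = G * d                  # one d-dimensional mean per component
    covariance = d * (d + 1) // 2  # a single symmetric matrix shared by all
    return mixing + means + covariance


# The settings from the usage example: G=9 components, 10 variables.
print(eee_param_count(9, 10))  # 153
```

With 1000 observations, 153 parameters is a sizeable model, which is one reason to compare several values of G rather than fixing one in advance.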

The function returns a dictionary containing the following output values:

  • call: the function call used to run the Mclust algorithm.
  • data: the input data as an R matrix.
  • modelName: the model name(s) selected by the algorithm.
  • n: the number of observations in the data.
  • d: the number of variables in the data.
  • G: the number of mixture components selected by the algorithm.
  • BIC: the Bayesian Information Criterion (BIC) values for all models considered.
  • loglik: the log-likelihood of the selected model.
  • df: the number of estimated parameters in the selected model.
  • bic: the BIC value of the selected model.
  • icl: the Integrated Completed Likelihood (ICL) value of the selected model.
  • hypvol: the hypervolume parameter for the noise component, if one was requested; otherwise null.
  • parameters: the estimated parameters for each component in the selected model.
  • z: the posterior probabilities of assignment to each component for each observation.
  • classification: the classification of each observation under the selected model.
  • uncertainty: a measure of uncertainty in the classification of each observation.
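
Several of these outputs are related to one another, and the sketch below shows how, using plain Python lists and made-up numbers rather than a real mclustpy result (whose values may arrive as rpy2 objects). The helper names bic and classify are hypothetical. The sketch assumes mclust's conventions: the BIC is 2*loglik - df*log(n), so larger is better; each observation is assigned to the component with the highest posterior probability; and the uncertainty is one minus that probability.

```python
import math


def bic(loglik: float, df: int, n: int) -> float:
    """BIC in mclust's sign convention: 2*loglik - df*log(n); larger is better."""
    return 2.0 * loglik - df * math.log(n)


def classify(z):
    """Derive (classification, uncertainty) from rows of posterior probabilities.

    Labels are 1-based to match R. Hypothetical helper; not part of mclustpy.
    """
    labels, uncertainty = [], []
    for row in z:
        best = max(range(len(row)), key=row.__getitem__)  # most probable component
        labels.append(best + 1)
        uncertainty.append(1.0 - row[best])
    return labels, uncertainty


# Made-up posterior probabilities for two observations and three components.
z = [[0.7, 0.2, 0.1],
     [0.1, 0.1, 0.8]]
labels, uncertainty = classify(z)
print(labels)  # [1, 3]
print([round(u, 3) for u in uncertainty])  # [0.3, 0.2]
```

An uncertainty near 0 means an observation sits firmly in one component; for G components it can be at most 1 - 1/G.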

For more information, see the documentation of the original R package mclust.

License Notice:

This package, mclustpy, is licensed under the MIT License. However, it depends on the R package mclust, which is licensed under the GNU General Public License (GPL ≥2). Users must ensure that their use of mclustpy complies with the GPL.
