Clustering using the mclust algorithm.
Project description
mclustpy
mclustpy is a Python function for clustering data using the Mclust algorithm from the R package mclust. The function takes a 2D numpy array of data and returns a dictionary containing various output values computed by the Mclust algorithm.
Installation
mclustpy requires the following dependencies:
- numpy
- rpy2
To install mclustpy, you can use pip:
pip install mclustpy
Usage
from mclustpy import mclustpy
import numpy as np
data = np.random.rand(1000, 10)
print(data.shape)  # (1000, 10)
res = mclustpy(data, G=9, modelNames='EEE', random_seed=2020)
The mclustpy function takes the following parameters:
- data: a 2D numpy array of data to be clustered.
- G: an integer specifying the maximum number of mixture components to be considered (default is 9).
- modelNames: a string specifying the model types to be considered (default is 'EEE').
- random_seed: an integer specifying the random seed for reproducibility (default is 2020).
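To give a feel for what the default modelNames='EEE' implies: in mclust, EEE denotes ellipsoidal components that all share one covariance matrix (equal volume, shape, and orientation). The sketch below counts the free parameters of such a mixture for a given G and number of variables d. The helper name eee_parameter_count is hypothetical and not part of mclustpy, and the count assumes no noise component.

```python
def eee_parameter_count(G: int, d: int) -> int:
    """Count free parameters of a G-component, d-dimensional EEE mixture.

    Hypothetical helper for illustration only; not part of mclustpy.
    """
    if G < 1 or d < 1:
        raise ValueError("G and d must be positive integers")
    mixing = G - 1                 # mixing proportions sum to 1
    means = G * d                  # one mean vector per component
    covariance = d * (d + 1) // 2  # one symmetric covariance shared by all components
    return mixing + means + covariance


# Shapes from the usage example: 10 variables, 9 components
print(eee_parameter_count(G=9, d=10))  # 153
```

Under these assumptions, the result should correspond to the df entry reported for an EEE fit with the same G and d.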
The function returns a dictionary containing the following output values:
- call: the function call used to run the Mclust algorithm.
- data: the input data as an R matrix.
- modelName: the model name(s) selected by the algorithm.
- n: the number of observations in the data.
- d: the number of variables in the data.
- G: the number of mixture components selected by the algorithm.
- BIC: the Bayesian Information Criterion (BIC) values for all models considered.
- loglik: the log-likelihood of the selected model.
- df: the number of estimated parameters in the selected model.
- bic: the BIC value of the selected model.
- icl: the Integrated Completed Likelihood (ICL) value for the selected model.
- hypvol: the hypervolume parameter for the noise component, if one is used; otherwise empty (NULL in R).
- parameters: the estimated parameters for each component in the selected model.
- z: the posterior probabilities of assignment to each component for each observation.
- classification: the classification of each observation under the selected model.
- uncertainty: a measure of uncertainty in the classification of each observation.
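Several of these outputs are related to one another. In mclust, the BIC is defined as 2 * loglik - df * log(n), so larger values are better; classification assigns each observation to the component with the highest posterior probability in z; and uncertainty is one minus that highest probability. The following is a minimal sketch of those relationships in plain Python. The function names mclust_bic and classify are hypothetical (not part of mclustpy), and the z values are made up for illustration.

```python
import math


def mclust_bic(loglik: float, df: int, n: int) -> float:
    """BIC in the mclust convention: 2*loglik - df*log(n), larger is better."""
    return 2.0 * loglik - df * math.log(n)


def classify(z):
    """Return (labels, uncertainty) from rows of posterior probabilities.

    Labels are 1-based, as in R; uncertainty is 1 - max(row).
    Ties go to the first component with the maximum probability.
    """
    labels, uncertainty = [], []
    for row in z:
        probs = list(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        labels.append(best + 1)
        uncertainty.append(1.0 - probs[best])
    return labels, uncertainty


# Made-up posterior probabilities: three observations, two components
z = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
labels, uncertainty = classify(z)
print(labels)  # [1, 2, 1]
```

The third observation is split evenly between the two components, so its uncertainty is 0.5, the largest possible value with two components.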
For more information, see the original mclust page.
License Notice:
This package, mclustpy, is licensed under the MIT License. However, it depends on the R package mclust, which is licensed under the GNU General Public License (GPL ≥2). Users must ensure compliance with the GPL license when using mclustpy.
File details
Details for the file mclustpy-0.0.3.tar.gz.
File metadata
- Download URL: mclustpy-0.0.3.tar.gz
- Upload date:
- Size: 3.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 125c4960283e661c7109dadafc7f2afa95e60b30c8a82dcc2daea8bdeee7ffbc |
| MD5 | 15e5b2157b380975d8b5da585c71fb9d |
| BLAKE2b-256 | 06771bf917cca799e2500e18e1cad81e3f199cd84003505c6cd9e945bd0e73be |
File details
Details for the file mclustpy-0.0.3-py3-none-any.whl.
File metadata
- Download URL: mclustpy-0.0.3-py3-none-any.whl
- Upload date:
- Size: 4.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5b21093bb5731dad451f17caa91c2be910f6dfe6277b77dfd7a6a824deae4147 |
| MD5 | 0bbf92992c1ea787549efd66d51e44ee |
| BLAKE2b-256 | f605fc64dd5fb417d3773e142178ae5d82ec11d47955d84044c05941540f4758 |