Mixture modeling algorithms using the Student's t-distribution
studenttmixture
Mixtures of multivariate Student's t distributions are widely used for clustering data that may contain outliers, but scipy and scikit-learn do not at present offer classes for fitting Student's t mixture models. This package provides classes for:
- Modeling / clustering a dataset using a finite mixture of multivariate Student's t distributions fit via the EM algorithm. You can select the number of components using either prior knowledge or the information criteria calculated by the model (e.g. AIC, BIC).
- Modeling / clustering a dataset using a mixture of multivariate Student's t distributions fit via the variational mean-field approximation. Depending on the hyperparameters you select, the fitting process will automatically "choose" an appropriate number of clusters, so the number of components you specify acts as an upper bound. In many cases this is a significant advantage, but the hyperparameters may require some tuning, and the variational approach makes some subtle assumptions that may affect the quality of the fit, especially for small datasets. Nonetheless, for some problems the ability to select the number of clusters automatically makes this a powerful tool.
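As a rough illustration of how information criteria can guide the choice of the number of components for the EM-fitted model, the sketch below computes AIC and BIC from a log-likelihood and a parameter count. It does not use this package's API: the helper names (`n_free_params`, `aic`, `bic`) are hypothetical, the log-likelihood values are made-up placeholders, and the parameter count assumes full scale matrices and one fitted degrees-of-freedom value per component.

```python
import math


def n_free_params(n_components, n_features, fit_df=True):
    """Count free parameters of a Student's t mixture.

    Assumes a full scale matrix per component and, if fit_df is True,
    one degrees-of-freedom parameter per component.
    """
    means = n_components * n_features
    scales = n_components * n_features * (n_features + 1) // 2
    weights = n_components - 1  # mixture weights sum to one
    dfs = n_components if fit_df else 0
    return means + scales + weights + dfs


def aic(log_likelihood, n_params):
    """Akaike information criterion (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion (lower is better)."""
    return n_params * math.log(n_samples) - 2 * log_likelihood


# Placeholder values for illustration only: total log-likelihoods that a
# fitting routine might report for 1 to 4 components on 500 points in 2-D.
n_samples, n_features = 500, 2
loglik_by_k = {1: -2310.0, 2: -2105.0, 3: -2098.0, 4: -2095.0}

bic_by_k = {
    k: bic(ll, n_free_params(k, n_features), n_samples)
    for k, ll in loglik_by_k.items()
}
best_k = min(bic_by_k, key=bic_by_k.get)
print(best_k)
```

With these placeholder numbers, the gain in log-likelihood beyond two components is too small to offset the BIC penalty for the extra parameters, so the two-component model has the lowest BIC.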
The EM-based model (the first item above) is available in version 0.0.1; the variational model (the second item) will be available in version 0.0.2.
Unit tests for the package are in the tests folder.
Installation
pip install studenttmixture
Usage
Background