Compute the S_Dbw validity index
S_Dbw
### Compute the S_Dbw or SD validity index
#### The S_Dbw validity index is defined by the equation:
S_Dbw = Scatt + Dens_bw
where Scatt is the average scattering of the clusters and Dens_bw is the inter-cluster density.
A lower value indicates better clustering.
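As a rough illustration of the Scatt term, the sketch below follows the definition in [1]: the mean norm of the per-cluster variance vectors, divided by the norm of the variance vector of the whole data set. It is a minimal NumPy sketch under that assumption, not the package's implementation; the function name `scatt` is hypothetical and noise labels (-1) are not handled.

```python
import numpy as np

def scatt(X, labels):
    """Average cluster scattering (hypothetical helper, not part of s_dbw)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Norm of the per-feature variance vector of the whole data set.
    total = np.linalg.norm(X.var(axis=0))
    # Norm of the per-feature variance vector of each cluster.
    per_cluster = [np.linalg.norm(X[labels == c].var(axis=0))
                   for c in np.unique(labels)]
    return float(np.mean(per_cluster) / total)

X = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = scatt(X, [0, 0, 1, 1])  # compact, well-separated clusters
bad = scatt(X, [0, 1, 0, 1])   # each cluster spans the whole data set
```

With these toy points the compact labelling gives a much smaller Scatt than the labelling that mixes the two groups, which matches the "lower is better" reading of the index.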
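#### The SD validity index is defined by the equation:
SD = k*Scatt + distance
where distance is the total separation between cluster centers and k is a weighting coefficient equal to distance(Cmax), that is, distance evaluated for the maximum number of clusters considered.
A lower value indicates better clustering.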
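The distance term can be sketched from its definition in [4]: the ratio of the largest to the smallest distance between cluster centers, multiplied by the sum, over all centers, of the inverse of the total distance to the other centers. The code below is a minimal NumPy sketch under that assumption, with Euclidean distance only; the name `sd_distance` is hypothetical and the package's internals may differ.

```python
import numpy as np

def sd_distance(centers):
    """Total separation between cluster centers (hypothetical helper)."""
    centers = np.asarray(centers, dtype=float)
    # Pairwise Euclidean distances between all cluster centers.
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    # Largest and smallest distance between two different centers.
    off_diag = d[~np.eye(len(centers), dtype=bool)]
    d_max, d_min = off_diag.max(), off_diag.min()
    # Each row sum is the total distance from one center to all others.
    return float(d_max / d_min * np.sum(1.0 / d.sum(axis=1)))

centers = [[0.0], [1.0], [3.0]]
dis = sd_distance(centers)  # 3 * (1/4 + 1/3 + 1/5) = 2.35
```

The sketch needs at least two distinct centers, since it divides by the smallest distance between centers.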
Installation
pip install --upgrade s-dbw
Usage
from s_dbw import S_Dbw
score = S_Dbw(X, labels, centers_id=None, method='Tong', alg_noise='bind',
centr='mean', nearest_centr=True, metric='euclidean')
##### OR
from s_dbw import SD
score = SD(X, labels, k=1.0, centers_id=None, alg_noise='bind',
centr='mean', nearest_centr=True, metric='euclidean')
Parameters:
- X : array-like, shape (n_samples, n_features)
  List of n_features-dimensional data points. Each row corresponds to a single data point.
- labels : array-like, shape (n_samples,)
  Predicted labels for each sample (-1 for noise).
- centers_id : array-like, shape (n_samples,)
  The center_id of each cluster's center. If None, cluster centers are calculated automatically.
- alg_noise : str
  Algorithm for handling noise points.
  - 'comb' - combine all noise points into one cluster (default)
  - 'sep' - treat each noise point as a separate cluster
  - 'bind' - bind each noise point to the cluster nearest to it
  - 'filter' - filter out noise points
- centr : str
  Cluster center calculation method: 'mean' (default) or 'median'.
- nearest_centr : bool
  If True, the centroid is the cluster point closest to the geometric center (default: True).
- metric : str
  The distance metric. Can be 'braycurtis', 'canberra', 'chebyshev', 'cityblock', 'correlation',
  'cosine', 'dice', 'euclidean', 'hamming', 'jaccard', 'kulsinski', 'mahalanobis', 'matching', 'minkowski',
  'rogerstanimoto', 'russellrao', 'seuclidean', 'sokalmichener', 'sokalsneath', 'sqeuclidean', 'wminkowski', 'yule'.
  Default is 'euclidean'.
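To make the four alg_noise options more concrete, here is a minimal NumPy sketch of how noise points (label -1) could be handled before scoring. It is an illustration only: the function name `handle_noise` is hypothetical, and binding a noise point to the cluster with the nearest mean is an assumption about what "nearest" means, not the package's confirmed behaviour.

```python
import numpy as np

def handle_noise(X, labels, alg_noise="bind"):
    """Return (X, labels) with noise points handled (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels).copy()
    noise = labels == -1
    if not noise.any():
        return X, labels
    next_id = labels.max() + 1
    if alg_noise == "filter":
        # Drop noise points entirely.
        return X[~noise], labels[~noise]
    if alg_noise == "comb":
        # All noise points form one extra cluster.
        labels[noise] = next_id
    elif alg_noise == "sep":
        # Every noise point becomes its own cluster.
        labels[noise] = next_id + np.arange(noise.sum())
    elif alg_noise == "bind":
        # Assumption: attach each noise point to the nearest cluster mean.
        ids = np.unique(labels[~noise])
        centers = np.array([X[labels == c].mean(axis=0) for c in ids])
        dists = np.linalg.norm(X[noise][:, None, :] - centers[None, :, :], axis=-1)
        labels[noise] = ids[dists.argmin(axis=1)]
    else:
        raise ValueError(f"unknown alg_noise: {alg_noise!r}")
    return X, labels

X = [[0, 0], [0, 1], [10, 0], [10, 1], [9, 0.5], [1, 0.5]]
labels = [0, 0, 1, 1, -1, -1]
_, bound = handle_noise(X, labels, "bind")
X_filtered, filtered = handle_noise(X, labels, "filter")
_, combined = handle_noise(X, labels, "comb")
_, separate = handle_noise(X, labels, "sep")
```

The choice matters for the score: 'comb' and 'sep' add clusters made of scattered points, while 'bind' and 'filter' keep the number of clusters unchanged.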
##### For S_Dbw:
- method : str
  S_Dbw calculation method:
  - 'Halkidi' - original paper [1]
  - 'Kim' - see [2]
  - 'Tong' - see [3]
##### For SD:
- k : float
  The weighting coefficient, equal to distance(Cmax). It is needed when comparing solutions with varying numbers of clusters, because distance(C) depends on the number of clusters [4].
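The centr and nearest_centr parameters described above can be pictured with a short sketch: compute the geometric center as the mean or median of the cluster, then optionally snap it to the closest actual cluster point. This is a minimal NumPy illustration with Euclidean distance; the helper name `cluster_center` is hypothetical and not part of the package.

```python
import numpy as np

def cluster_center(points, centr="mean", nearest_centr=True):
    """Center of one cluster (hypothetical helper, Euclidean distance only)."""
    points = np.asarray(points, dtype=float)
    if centr == "mean":
        center = points.mean(axis=0)
    elif centr == "median":
        center = np.median(points, axis=0)
    else:
        raise ValueError(f"unknown centr: {centr!r}")
    if nearest_centr:
        # Replace the geometric center by the closest point of the cluster.
        idx = np.linalg.norm(points - center, axis=1).argmin()
        center = points[idx]
    return center

cluster = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]]
geometric = cluster_center(cluster, nearest_centr=False)  # mean: [2., 0.]
snapped = cluster_center(cluster, nearest_centr=True)     # closest point: [1., 0.]
```

Snapping to an existing point keeps the center inside the data, which can matter for non-convex clusters where the mean falls in an empty region.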
Returns
score : float
The resulting S_Dbw or SD score.
References:
- [1] M. Halkidi and M. Vazirgiannis, "Clustering validity assessment: Finding the optimal partitioning of a data set," in ICDM, Washington, DC, USA, 2001, pp. 187–194.
- [2] Y. Kim and S. Lee, "A clustering validity assessment index," in PAKDD 2003, Seoul, Korea, April 30–May 2, 2003, LNAI 2637, pp. 602–608.
- [3] J. Tong and H. Tan, J. Electron. (China), vol. 26, p. 258, 2009. https://doi.org/10.1007/s11767-007-0151-8
- [4] M. Halkidi, M. Vazirgiannis, and Y. Batistakis, "Quality scheme assessment in the clustering process," LNCS (LNAI) 1910, pp. 265–276, 2000. https://doi.org/10.1007/3-540-45372-5_26