Efficient implementation of DBCV with a k-dimensional tree
k-DBCV
k-DBCV is an efficient Python implementation of the density-based cluster validation (DBCV) score proposed by Moulavi et al. (2014).
Getting Started
Dependencies
- SciPy
- NumPy
Installation
k-DBCV can be installed via pip:
pip install kDBCV
Usage
The examples below use the following additional libraries to simulate clustering scenarios:
- scikit-learn
- ClustSim
For visualization:
- matplotlib
DBCV Score
Simple Scenario
The half-moons dataset simulated with scikit-learn is scored as follows:
DBCV_Score(X,labels)
Output: 0.5068928345037831
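The inputs to the call above can be prepared along the following lines. This is a minimal sketch, assuming scikit-learn's make_moons with illustrative parameter values (n_samples, noise, and random_state are assumptions, not the values behind the output above). The true moon membership returned by make_moons stands in for a clustering result.

```python
from sklearn.datasets import make_moons

# Illustrative parameters only; the settings used for the score above are not stated.
X, labels = make_moons(n_samples=500, noise=0.05, random_state=0)

print(X.shape)                       # (500, 2): one row per point, two coordinates
print(sorted(set(labels.tolist())))  # [0, 1]: one integer label per half moon
```

X and labels can then be passed to the scoring function as in the call above.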
Scenario II
A larger dataset of clusters simulated with Clust_Sim-SMLM is shown:
score = DBCV_score(X,labels)
Output: 0.6171526846848352
Extracting Individual Cluster Scores
k-DBCV can also return a score for each individual cluster, computed without consideration for noise:
Individual Cluster Score = (separation - sparseness) / max(separation, sparseness)
To enable this, set ind_clust_scores = True (the default is False).
score, ind_clust_score_array = DBCV_Score(X,labels, ind_clust_scores = True)
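The individual cluster score formula can be worked through with a few numbers. The sketch below is not the library's code: individual_cluster_score is a hypothetical helper, and the separation and sparseness values are made up for illustration.

```python
def individual_cluster_score(separation: float, sparseness: float) -> float:
    """Return (separation - sparseness) / max(separation, sparseness).

    Hypothetical helper for illustration; assumes both inputs are
    non-negative and not both zero.
    """
    return (separation - sparseness) / max(separation, sparseness)


# A cluster that is well separated relative to its internal sparseness scores above 0.
print(individual_cluster_score(2.0, 1.0))  # 0.5
# A cluster that is sparser than it is separated scores below 0.
print(individual_cluster_score(1.0, 2.0))  # -0.5
```

For non-negative inputs the result lies between -1 and 1, with higher values indicating a denser, better separated cluster.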
Individual cluster scores are displayed by color below:
Memory cutoff
A memory cutoff prevents attempts to score clusters that would exceed available memory, and it should be set according to the machine being used. The default is a maximum of 25.0 GB. If the cutoff would be exceeded, the function returns a score of -1 along with an error message. To suppress these error messages, set batch_mode = True (the default is False).
score = DBCV_score(X,labels, memory_cutoff = 25.0)
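When many clustering results are scored in a loop, the -1 return value can be handled explicitly. The following is a minimal sketch of that pattern, not part of k-DBCV: score_batch and stand_in_scorer are hypothetical names, and the stand-in only imitates the behavior described above so that the example runs without the library. The keyword names memory_cutoff and batch_mode are taken from the text above.

```python
SKIPPED = -1  # value described above as returned when the cutoff would be exceeded


def score_batch(score_fn, scenarios, memory_cutoff=25.0):
    """Score each (name, X, labels) scenario and map the -1 sentinel to None."""
    results = []
    for name, X, labels in scenarios:
        # batch_mode=True is meant to suppress the per-call error message.
        score = score_fn(X, labels, memory_cutoff=memory_cutoff, batch_mode=True)
        results.append((name, None if score == SKIPPED else score))
    return results


def stand_in_scorer(X, labels, memory_cutoff=25.0, batch_mode=False):
    """Hypothetical stand-in: pretends inputs with more than 3 points exceed the cutoff."""
    return -1 if len(X) > 3 else 0.5


scenarios = [
    ("small", [[0, 0], [0, 1], [5, 5]], [0, 0, 1]),
    ("large", [[0, 0], [0, 1], [5, 5], [5, 6]], [0, 0, 1, 1]),
]
results = score_batch(stand_in_scorer, scenarios)
print(results)  # [('small', 0.5), ('large', None)]
```

In practice the real scoring function would be passed in place of stand_in_scorer. Note that this sketch cannot distinguish the sentinel from a genuine score of exactly -1.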
Relevant Citations
Density Based Cluster Validation
Moulavi, D., Jaskowiak, P. A., Campello, R. J. G. B., Zimek, A. & Sander, J. Density-based clustering validation. SIAM Int. Conf. Data Min. 2014, SDM 2014 2, 839–847 (2014)
k-DBCV implementation
Hammer, J. L., Devanny, A. J. & Kaufman, L. J. Density-based optimization for unbiased, reproducible clustering applied to single molecule localization microscopy. Preprint at https://www.biorxiv.org/content/10.1101/2024.11.01.621498v1 (2024)
License
k-DBCV is licensed under the MIT License. See the LICENSE file for more information.
Referencing
In addition to citing Moulavi et al., if you use this repository, please cite with the following (currently in preprint):
Hammer, J. L., Devanny, A. J. & Kaufman, L. J. Density-based optimization for unbiased, reproducible clustering applied to single molecule localization microscopy. Preprint at https://www.biorxiv.org/content/10.1101/2024.11.01.621498v1 (2024)
Contact