Parallel Delayed Cluster DP-Means
Introduction
The PDC-DP-Means package provides a highly optimized implementation of the DP-Means algorithm. It introduces a new parallel algorithm, Parallel Delayed Cluster DP-Means (PDC-DP-Means), together with a MiniBatch implementation for additional speed. Both target scalable, efficient clustering when the number of clusters is unknown.
Beyond the speed improvements, PDC-DP-Means supports an optional online mode for real-time data processing. Its scikit-learn-like interface is designed to integrate easily into existing data workflows. In the experiments reported in the paper, PDC-DP-Means outperforms other nonparametric clustering methods in speed and scalability.
See the paper for more details.
Installation
pip install pdc-dp-means
Quick Start
from sklearn.datasets import make_blobs
from pdc_dp_means import DPMeans
# Generate sample data
X, y_true = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)
# Apply DPMeans clustering
dpmeans = DPMeans(n_clusters=1, n_init=10, delta=10)  # n_init and delta parameters
dpmeans.fit(X)
# Predict the cluster for each data point
y_dpmeans = dpmeans.predict(X)
# Plotting clusters and centroids
import matplotlib.pyplot as plt
plt.scatter(X[:, 0], X[:, 1], c=y_dpmeans, s=50, cmap='viridis')
centers = dpmeans.cluster_centers_
plt.scatter(centers[:, 0], centers[:, 1], c='black', s=200, alpha=0.5)
plt.show()
Note that the \lambda parameter from the paper is named delta in the code, because lambda is a reserved word in Python.
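To make the role of delta more concrete, here is a minimal sketch of the classic serial DP-Means procedure, written with NumPy only. It is not the parallel PDC-DP-Means algorithm and not this package's implementation; the function name dp_means_sketch is hypothetical, and the sketch assumes that delta is a threshold on the squared Euclidean distance, as in the original DP-Means formulation. A point that is farther than delta from every existing center opens a new cluster.

```python
import numpy as np


def dp_means_sketch(X, delta, max_iter=100):
    """Textbook serial DP-Means (hypothetical helper, not the package API).

    Assumption: `delta` thresholds the squared Euclidean distance between a
    point and its nearest center.
    """
    X = np.asarray(X, dtype=float)
    # Start with a single cluster located at the global mean.
    centers = X.mean(axis=0, keepdims=True)
    labels = np.zeros(len(X), dtype=int)

    for _ in range(max_iter):
        changed = False
        for i, x in enumerate(X):
            # Squared distance from this point to every current center.
            d2 = ((centers - x) ** 2).sum(axis=1)
            k = int(d2.argmin())
            if d2[k] > delta:
                # Too far from all centers: open a new cluster at this point.
                centers = np.vstack([centers, x])
                k = len(centers) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True

        # Recompute centers as cluster means, dropping clusters left empty,
        # and relabel so that labels run from 0 to n_clusters - 1.
        used = np.unique(labels)
        centers = np.stack([X[labels == k].mean(axis=0) for k in used])
        labels = np.searchsorted(used, labels)

        if not changed:
            break

    return centers, labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two well-separated groups of points.
    X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
    centers, labels = dp_means_sketch(X, delta=4.0)
    print("clusters found:", len(centers))
```

In this sketch a smaller delta yields more clusters and a larger delta yields fewer, which is the trade-off the delta parameter is meant to control.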
Usage
Please refer to the documentation: https://pdc-dp-means.readthedocs.io/en/latest/
Paper Code
Please refer to https://github.com/BGU-CS-VIL/pdc-dp-means/tree/main/paper_code for the code used in the paper.
Citing this work
If you use this code for your work, please cite the following:
@inproceedings{dinari2022revisiting,
title={Revisiting {DP}-Means: Fast Scalable Algorithms via Parallelism and Delayed Cluster Creation},
author={Dinari, Or and Freifeld, Oren},
booktitle={The 38th Conference on Uncertainty in Artificial Intelligence},
year={2022}
}
License
Our code is licensed under the BSD-3-Clause license.