A Python package for optimal 1D k-means clustering
Project description
kmeans1d
A Python library with an implementation of k-means clustering on 1D data, based on the algorithm in (Xiaolin 1991), as presented in section 2.2 of (Gronlund et al., 2017).
Globally optimal k-means clustering is NP-hard for multi-dimensional data. Lloyd's algorithm is a popular approach for finding a locally optimal solution. For 1-dimensional data, there are polynomial time algorithms. The algorithm implemented here is a O(kn + n log n) dynamic programming algorithm for finding the globally optimal k clusters for n 1D data points.
The code is written in C++, and wrapped with Python.
Requirements
kmeans1d supports Python 3.x.
Installation
kmeans1d is available on PyPI, the Python Package Index.
$ pip3 install kmeans1d
Example Usage
import kmeans1d
x = [4.0, 4.1, 4.2, -50, 200.2, 200.4, 200.9, 80, 100, 102]
k = 4
clusters, centroids = kmeans1d.cluster(x, k)
print(clusters) # [1, 1, 1, 0, 3, 3, 3, 2, 2, 2]
print(centroids) # [-50.0, 4.1, 94.0, 200.5]
Tests
Tests are in tests/.
# Run tests
$ python3 -m unittest discover tests -v
License
The code in this repository has an MIT License.
See LICENSE.
References
[1] Wu, Xiaolin. "Optimal Quantization by Matrix Searching." Journal of Algorithms 12, no. 4 (December 1, 1991): 663
[2] Gronlund, Allan, Kasper Green Larsen, Alexander Mathiasen, Jesper Sindahl Nielsen, Stefan Schneider, and Mingzhou Song. "Fast Exact K-Means, k-Medians and Bregman Divergence Clustering in 1D." ArXiv:1701.07204 [Cs], January 25, 2017. http://arxiv.org/abs/1701.07204.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distributions
Hashes for kmeans1d-0.3.0-cp38-cp38-win_amd64.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | c27d5068c8c7d92bb762b68eecef9ca360185bdfade3d8656ac789b119a5442f |
|
MD5 | 7966c17c61515d29de90f533c0dca396 |
|
BLAKE2b-256 | 7130a828552358644127f6e963e16a8efb5b613d2cca46f080be2f73794de330 |
Hashes for kmeans1d-0.3.0-cp38-cp38-manylinux2014_x86_64.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 03538536bec7e48d1d440cf774fcdfe8f8cafe67b3f5ee68a978898473cb6466 |
|
MD5 | 57df7b47a77766efcbc1d7d7a4d7cc7a |
|
BLAKE2b-256 | 3ef4594213857a9d236e9d23f08eea6442b739abd4effca1794d867eacf0894d |
Hashes for kmeans1d-0.3.0-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | e5f9a5651e94abbe795397dd575bf6114e2c8f17182c3c2ea1a2bc949118daae |
|
MD5 | 099ac7775c5b9aa936f97bf9883086d6 |
|
BLAKE2b-256 | 3366f80292499f62f9b6d5845802288c79bbc3f3083c3459504ab1235ecfeca0 |
Hashes for kmeans1d-0.3.0-cp37-cp37m-win_amd64.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | c817bce3b53f0740e23fc20dbe116c26a9023dc7053640563f39553aece44510 |
|
MD5 | 55e078622613fe2dbeaac0ab3d0ac555 |
|
BLAKE2b-256 | e802e1ce080120a95cea1f6fbbe6883914bebcef24740f045c39956838360035 |
Hashes for kmeans1d-0.3.0-cp37-cp37m-manylinux2014_x86_64.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 2e82be1f5fffceafc868aeea1e36686e146f15274d92e9a923234c1c3078d0a6 |
|
MD5 | dcebbaa6a2c507dff1ff17e6833f0b63 |
|
BLAKE2b-256 | 1ed2fe39d906fad9c644bc6b6ee9312614632a5206af7c2bdb5acd1c801d14e2 |
Hashes for kmeans1d-0.3.0-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 619860ec820c96c96639dd42a0376f46c41adbcbc4d1326a2d31a1f6a671966f |
|
MD5 | 1d50e85d28ec6724960103271d9119e2 |
|
BLAKE2b-256 | eea210e0c4e58b856e54cb92e4b2dca7115b1e877a62a82f3aace85d3dcc3977 |
Hashes for kmeans1d-0.3.0-cp36-cp36m-win_amd64.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | c8cec8277eac8951dc8cf265323edbeb54a81b7a9341e35f49581fd0a52b8ac4 |
|
MD5 | 22f4c27ca6f110f5a816f26cbca52ff7 |
|
BLAKE2b-256 | 92b4cd3b12d39aa2992cd2e854c494472bc43966b4541862483b2a131872e365 |
Hashes for kmeans1d-0.3.0-cp36-cp36m-manylinux2014_x86_64.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 0c64ae83830a97c17ae53c4e5f2d168565b6a1bea28379ae786d50fb8ff9f29f |
|
MD5 | 3e5d37e7f280a47350f16eb91c4a2381 |
|
BLAKE2b-256 | d1ca44e050b9b751c0779ef1042b2b5537c7981c785c4e5e5a9c7dbd6643016b |
Hashes for kmeans1d-0.3.0-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 4af5ed1e84e6042af52251b7b7b89eb58b4d98ebadd081ceb4c11edd61e0e981 |
|
MD5 | ce91162c488c4e245ea916cdc27513b2 |
|
BLAKE2b-256 | 06b855e83846af2da119158485d45f2e8bc8b34eceb043350aa1e2fa4ed8e93d |