A Python package for optimal 1D k-means clustering.
fast1dkmeans
A Python library that implements several variations of optimal k-means clustering on 1D data, based on the algorithms presented by Gronlund et al. (2017) [1]. This package is inspired by the kmeans1d package but extends it with additional algorithms, in particular ones that need only O(n) memory instead of O(kn).
There are several different ways to compute the optimal k-means clustering in 1D. Currently the package implements the following methods:
"binary-search-interpolation" (default) [O(n lg(U)), O(n) space, "wilber-interpolation"]
"dynamic-programming-kn" [O(kn), O(kn) space]
"dynamic-programming-space" [O(kn), O(n) space, "dp-linear"]
"binary-search-normal" [O(n lg(U)), O(n) space, section 2.4, "wilber-binary"]
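To make the dynamic-programming idea behind these methods concrete, here is a minimal sketch of optimal 1D k-means in plain Python. It is not the package's implementation: it is a naive O(k n^2) dynamic program without the speedups from Gronlund et al. that bring the running time down to O(kn), and the function name `kmeans_1d_dp` and its return format are hypothetical, chosen only for illustration. The sketch relies on the fact that in 1D every optimal cluster is a contiguous run of the sorted points.

```python
from itertools import accumulate


def kmeans_1d_dp(values, k):
    """Optimal 1D k-means via a naive O(k n^2) dynamic program (illustration only)."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums of x and x^2 give the cost of any contiguous cluster in O(1).
    s1 = [0.0] + list(accumulate(xs))
    s2 = [0.0] + list(accumulate(v * v for v in xs))

    def cost(i, j):
        """Sum of squared deviations from the mean of xs[i:j] (j exclusive)."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[c][j]: minimal cost of splitting the first j sorted points into c clusters.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    # split[c][j]: start index of the last cluster in that optimal split.
    split = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cand = best[c - 1][i] + cost(i, j)
                if cand < best[c][j]:
                    best[c][j] = cand
                    split[c][j] = i

    # Walk the split table backwards to recover the cluster boundaries.
    boundaries = []
    j = n
    for c in range(k, 0, -1):
        i = split[c][j]
        boundaries.append((i, j))
        j = i
    boundaries.reverse()
    return xs, boundaries, best[k][n]


xs, boundaries, total_cost = kmeans_1d_dp(
    [4.0, 4.1, 4.2, -50, 200.2, 200.4, 200.9, 80, 100, 102], 4
)
for start, stop in boundaries:
    print(xs[start:stop])
```

Each `(start, stop)` pair is a half-open index range into the sorted data. The full `best` table takes O(kn) memory, which is the cost that the O(n)-space methods listed above avoid.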
The code is written in Python and relies on the numba compiler for speed.
Requirements
fast1dkmeans relies on numpy and numba, which currently support Python 3.8-3.10.
Installation
fast1dkmeans is available on PyPI, the Python Package Index.
$ pip3 install fast1dkmeans
Example Usage
import fast1dkmeans
x = [4.0, 4.1, 4.2, -50, 200.2, 200.4, 200.9, 80, 100, 102]
k = 4
clusters = fast1dkmeans.cluster(x, k)
print(clusters) # [1, 1, 1, 0, 3, 3, 3, 2, 2, 2]
Important notice: On first use the code is compiled once, which may take about 30s. Subsequent calls skip this step and run much faster.
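The labels returned in the example above can be turned into centroids and a total within-cluster cost with a few lines of plain Python. This is a small sketch; `summarize_clusters` is a hypothetical helper and not part of fast1dkmeans, and the labels are hard-coded from the printed output above so that the snippet runs without the package installed.

```python
from collections import defaultdict


def summarize_clusters(values, labels):
    """Return ({label: centroid}, total sum of squared distances to centroids)."""
    groups = defaultdict(list)
    for v, lab in zip(values, labels):
        groups[lab].append(v)
    centroids = {lab: sum(g) / len(g) for lab, g in groups.items()}
    cost = sum((v - centroids[lab]) ** 2 for v, lab in zip(values, labels))
    return centroids, cost


x = [4.0, 4.1, 4.2, -50, 200.2, 200.4, 200.9, 80, 100, 102]
# Labels copied from the example output; in practice use fast1dkmeans.cluster(x, k).
labels = [1, 1, 1, 0, 3, 3, 3, 2, 2, 2]

centroids, cost = summarize_clusters(x, labels)
for lab in sorted(centroids):
    print(lab, round(centroids[lab], 3))
print("total cost:", round(cost, 2))
```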
Tests
Tests are in tests/.
# Run tests
$ python3 -m pytest .
License
The code in this repository is released under a BSD 2-Clause "Simplified" License.
See LICENSE.
References
[1] Gronlund, Allan, Kasper Green Larsen, Alexander Mathiasen, Jesper Sindahl Nielsen, Stefan Schneider, and Mingzhou Song. "Fast Exact K-Means, k-Medians and Bregman Divergence Clustering in 1D." ArXiv:1701.07204 [Cs], January 25, 2017. http://arxiv.org/abs/1701.07204.