A Python package for Optimal 1D Clustering
Project description
optimal1dclustering
Add k-medians and add a minimum cluster size parameter to https://github.com/dstein64/kmeans1d.
A Python library with an implementation of k-means and k-medians clustering on 1D data, based on the algorithm from Xiaolin (1991), as presented by Gronlund et al. (2017, Section 2.2).
The algorithm implemented in this library can also find the optimal clustering when clusters are required to have a minimum cluster size. It still finds the optimal clustering as the cost function is still monge concave.
Globally optimal clustering is NP-hard for multi-dimensional data. Lloyd's algorithm is a popular approach for finding a locally optimal solution. For 1-dimensional data, there are polynomial time algorithms. The algorithm implemented here is an O(kn + n log n) dynamic programming algorithm for finding the globally optimal k clusters for n 1D data points.
The code is written in C++, and wrapped with Python.
Requirements
optimal1dclustering supports Python 3.x.
Installation
optimal1dclustering is available on PyPI, the Python Package Index.
$ pip3 install optimal1dclustering
Example Usage
import optimal1dclustering
x = [4.0, 4.1, 4.2, -50, 200.2, 200.4, 200.9, 80, 100, 102]
k = 4 # Number of clusters
min_cluster_size = 2 # The minimum number of elements in each cluster
mode = 2 # 2 for k-means; 1 for k-medians
clusters, centroids = optimal1dclustering.cluster(x, k, min_cluster_size = min_cluster_size, mode = mode)
print(clusters) # [0, 1, 1, 0, 3, 3, 3, 2, 2, 2]
print(centroids) # [-23.0, 4.15, 94.0, 200.5]
Tests
Tests are in tests/.
# Run tests
$ python3 -m unittest discover tests -v
Development
The underlying C++ code can be built in-place, outside the context of pip
. This requires Python
development tools for building Python modules (e.g., the python3-dev
package on Ubuntu). gcc
,
clang
, and MSVC
have been tested.
$ python3 setup.py build_ext --inplace
The packages
GitHub action can be manually triggered (Actions
> packages
> Run workflow
) to build wheels
and a source distribution.
License
The code in this repository has an MIT License.
See LICENSE.
References
[1] Wu, Xiaolin. "Optimal Quantization by Matrix Searching." Journal of Algorithms 12, no. 4 (December 1, 1991): 663
[2] Gronlund, Allan, Kasper Green Larsen, Alexander Mathiasen, Jesper Sindahl Nielsen, Stefan Schneider, and Mingzhou Song. "Fast Exact K-Means, k-Medians and Bregman Divergence Clustering in 1D." ArXiv:1701.07204 [Cs], January 25, 2017. http://arxiv.org/abs/1701.07204.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distributions
File details
Details for the file optimal1dclustering-1.0.2.tar.gz
.
File metadata
- Download URL: optimal1dclustering-1.0.2.tar.gz
- Upload date:
- Size: 10.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | bf0f8e57d33d183a6fe1f7ed44a9c1022c9ee598b958d24144ed66f363d58dab |
|
MD5 | 118f1f6796ce66c894cddc7564919b5b |
|
BLAKE2b-256 | 1370f6948bcb96d1085bf305e4107faabbeb1758e400a7f7b5398fa19233d7d6 |
File details
Details for the file optimal1dclustering-1.0.2-cp312-cp312-win_amd64.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp312-cp312-win_amd64.whl
- Upload date:
- Size: 20.0 kB
- Tags: CPython 3.12, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 5aece70e9277dc3091af8037798ea9e709cf3695cf24ed31ae2534fd39220ff7 |
|
MD5 | 6bff0e12ef6ec9a877d32929be338297 |
|
BLAKE2b-256 | d8a897264883ae51ce0492833c24e167110a1726f92eafa7f72663bef3dc63a2 |
File details
Details for the file optimal1dclustering-1.0.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 136.5 kB
- Tags: CPython 3.12, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 7bcefe38eaafdd174564594e2f40225708e7821509d6e63396f7c06ea561e116 |
|
MD5 | 89be713b3117003f9a50b7a0aea4ba06 |
|
BLAKE2b-256 | 9391d189bde19e970e89eb77f6d3e9392c21c91704ac91707eeaa36c4b1ce2e0 |
File details
Details for the file optimal1dclustering-1.0.2-cp312-cp312-macosx_10_9_universal2.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp312-cp312-macosx_10_9_universal2.whl
- Upload date:
- Size: 30.9 kB
- Tags: CPython 3.12, macOS 10.9+ universal2 (ARM64, x86-64)
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 63ce94f120a9f9523b96635c86c8c489e89362ccf3c0fc02452c896776a91da0 |
|
MD5 | 3073d5deb0cbcddec91191c0b56d031e |
|
BLAKE2b-256 | e5e33856dcc603fc08770c36c2314c071fe4f5a928869916469d0fcba2671dfb |
File details
Details for the file optimal1dclustering-1.0.2-cp311-cp311-win_amd64.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp311-cp311-win_amd64.whl
- Upload date:
- Size: 20.0 kB
- Tags: CPython 3.11, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | ae581796ff7543a4166da294c955a013069a204f3ba288bfe93fbd6b63af7dcf |
|
MD5 | a5b5c548a8c9bf45ca173b7e88c054b7 |
|
BLAKE2b-256 | 26c8e842d76cc168d8fb3d8461382368526dacfb7a6b6d017801ab6da6c16404 |
File details
Details for the file optimal1dclustering-1.0.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 136.8 kB
- Tags: CPython 3.11, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | bd35d43d645b3146323821935a6108ecd10f9fd31882e1c2c8f1309a22368e23 |
|
MD5 | f9ac220ce104ef5e3066315fa4cbb0fd |
|
BLAKE2b-256 | e85ff0c4b30a2f9fce94e2e1c6572ad6bedad52ee2464891e56a1f6d9f646047 |
File details
Details for the file optimal1dclustering-1.0.2-cp311-cp311-macosx_10_9_universal2.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp311-cp311-macosx_10_9_universal2.whl
- Upload date:
- Size: 30.9 kB
- Tags: CPython 3.11, macOS 10.9+ universal2 (ARM64, x86-64)
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 14a4e7d676bb9febb3d5002241f3c1a6c3d04e0910a1772cc0b53614451329e2 |
|
MD5 | 921c6642cca4ae86f6ec6f2ae907ee3c |
|
BLAKE2b-256 | 59945970b4b2f7b4b0d6fa1b42080a2f3cad34ae3d9447019741f0cfcd0a2397 |
File details
Details for the file optimal1dclustering-1.0.2-cp310-cp310-win_amd64.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp310-cp310-win_amd64.whl
- Upload date:
- Size: 20.0 kB
- Tags: CPython 3.10, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | d3274841cb1c76091c1ca1b2d417317032de541fc24c26ca3ce0496c9785f2c3 |
|
MD5 | d5789e5b1f5b99d872c75b9fe16606b0 |
|
BLAKE2b-256 | 87cae11d9d3bbae32ccde0903aa0980651256a9563e3e182a9a64b6556bcf280 |
File details
Details for the file optimal1dclustering-1.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 136.1 kB
- Tags: CPython 3.10, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 3df7fc60a2fdd37d4326360008ccbd90b515d8543f054df558f04425007d3d2c |
|
MD5 | f717864e3ed493b2caa1fda6a339911e |
|
BLAKE2b-256 | 95e9e36d436ef367e1b08f2858fe7fc7832ff0d9a5a8b6c163d445fc95ad6ae4 |
File details
Details for the file optimal1dclustering-1.0.2-cp310-cp310-macosx_10_9_universal2.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp310-cp310-macosx_10_9_universal2.whl
- Upload date:
- Size: 30.9 kB
- Tags: CPython 3.10, macOS 10.9+ universal2 (ARM64, x86-64)
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 732e6a3e623e433aab5e4e431327c4303fbf5d29c84512f9d219bf47481d4ced |
|
MD5 | ef892974e3e4e4acdc18fc5c5379dfcc |
|
BLAKE2b-256 | f18a86a09f402c8a45b4667ed8d48d0dd5008c7e5c621e1801e498a27324bf70 |
File details
Details for the file optimal1dclustering-1.0.2-cp39-cp39-win_amd64.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp39-cp39-win_amd64.whl
- Upload date:
- Size: 20.0 kB
- Tags: CPython 3.9, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 867ca7481d74ac3bc7ef84acb9ba73fc54624e8ba23d257a3d8b8a86398db823 |
|
MD5 | fe0035f23bd91e1cc4567e40b218dc24 |
|
BLAKE2b-256 | eff132db9d536a1f734d411d1cd7af8e952dfcb42beb9086b830a6c2ed75db24 |
File details
Details for the file optimal1dclustering-1.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 135.9 kB
- Tags: CPython 3.9, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | de475b17a76f5a9155b5da7bb04a92da07f68e9ccf10008fe797ea6b74cb092e |
|
MD5 | 1f0fbe75820ea0c9b82ef98510e085d1 |
|
BLAKE2b-256 | f0ee02175fc043cc037754d86b7d2bbfaff5e06fe6bc191b685d6895f22f2e7a |
File details
Details for the file optimal1dclustering-1.0.2-cp39-cp39-macosx_10_9_universal2.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp39-cp39-macosx_10_9_universal2.whl
- Upload date:
- Size: 30.9 kB
- Tags: CPython 3.9, macOS 10.9+ universal2 (ARM64, x86-64)
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | b8c7a7ad9df52f4d540b9fb2627a9b380c03aab534b04f59664d46cfc4b328b3 |
|
MD5 | db5d4c4a5106c5118c5379c02293ab2f |
|
BLAKE2b-256 | ae992fcad1d3d25b897067dc3173c7f8af1e772db7ec6b998cdc7c362a4375d3 |
File details
Details for the file optimal1dclustering-1.0.2-cp38-cp38-win_amd64.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp38-cp38-win_amd64.whl
- Upload date:
- Size: 20.0 kB
- Tags: CPython 3.8, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 327aca87b2d12c5dd40906dbf225c7735cd16e45f5bc85fd12cf154eb47a2796 |
|
MD5 | b21f4f2a55e40ccfad695113bb7c63dd |
|
BLAKE2b-256 | e1220e3cfacd13b6b260bfe0db05719b394b55f297c074c9d83cac7c6586581c |
File details
Details for the file optimal1dclustering-1.0.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 136.1 kB
- Tags: CPython 3.8, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 5495e83a71a3abf68c202afcdbd9c6ac8d55ca9a9e42d9f9baa4bc47bdc3c7f3 |
|
MD5 | aba02eda1fb30bd6dac99259dd416d4f |
|
BLAKE2b-256 | 5b6a0e9c7acf308834a85921565bbde39abca2f8ab6e72225c926fe30525f320 |
File details
Details for the file optimal1dclustering-1.0.2-cp38-cp38-macosx_11_0_universal2.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp38-cp38-macosx_11_0_universal2.whl
- Upload date:
- Size: 30.9 kB
- Tags: CPython 3.8, macOS 11.0+ universal2 (ARM64, x86-64)
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | cd9024d0f390a0f4a94ea24c1c3ad7ce0d8e2ad339a4e8dc675804c8350765e5 |
|
MD5 | 8e1e47421aff841310a63f9afbdb67ed |
|
BLAKE2b-256 | f3e695acb3a68f3531aea8f511d9be039d62f81ebc281e539400b9e87252df62 |
File details
Details for the file optimal1dclustering-1.0.2-cp37-cp37m-win_amd64.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp37-cp37m-win_amd64.whl
- Upload date:
- Size: 20.0 kB
- Tags: CPython 3.7m, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 26455a743380d180439b0f215fcc2127f988cb57215212d5150b096351461eab |
|
MD5 | 139af352c376e1f5a780342983705d49 |
|
BLAKE2b-256 | 7bfe992d01428f2779997343be68c5b057c918b056185d5df4063948b29ef6e2 |
File details
Details for the file optimal1dclustering-1.0.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 137.1 kB
- Tags: CPython 3.7m, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 63b5a04c330ed18a70d9fbc8cab3333f985d5faf6e25e028db331e83a6bf60c0 |
|
MD5 | 5b84a7d30e17684e6a34ba36a1d5e54e |
|
BLAKE2b-256 | 8d6ee6475a524649cbdee4c09f82f966c62ed766fc83391b3cc4a0efee27b693 |
File details
Details for the file optimal1dclustering-1.0.2-cp36-cp36m-win_amd64.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp36-cp36m-win_amd64.whl
- Upload date:
- Size: 20.1 kB
- Tags: CPython 3.6m, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 26aaaaed03f403323c09631348df500bebb68f7ee19d293dbbbcbf8b68abfe7c |
|
MD5 | f3acf854d35336d0a4e812131bcf80db |
|
BLAKE2b-256 | 6912de862d1587b07f8853688913eda629053fa8e0f23d6d7a20d065dd18596e |
File details
Details for the file optimal1dclustering-1.0.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
.
File metadata
- Download URL: optimal1dclustering-1.0.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 136.2 kB
- Tags: CPython 3.6m, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 3bba541a504a235f6559e602831b3126f81997f3679e186c4353a787e5ac29b5 |
|
MD5 | f3ab93d330fe64976a8200bd9b4a5416 |
|
BLAKE2b-256 | c9fea0189218e36c3118d5e066afe4eb93bbfe3e242b982fd30acd55a8475bde |