A Python package for optimal 1D k-means clustering

kmeans1d

A Python library that implements k-means clustering on 1D data, based on the algorithm from Wu (1991) [1], as presented by Gronlund et al. (2017, Section 2.2) [2].

Globally optimal k-means clustering is NP-hard for multi-dimensional data. Lloyd's algorithm is a popular approach for finding a locally optimal solution. For 1-dimensional data, however, polynomial-time algorithms exist. The algorithm implemented here is an O(kn + n log n) dynamic programming algorithm that finds the globally optimal k clusters for n 1D data points.
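To illustrate why the 1D case is tractable, here is a minimal reference sketch in pure Python. It is not the O(kn + n log n) algorithm this library implements; it is a plain O(kn^2) dynamic program that relies on the same underlying fact, namely that an optimal 1D clustering consists of contiguous runs of the sorted data. The function name `kmeans_1d_reference` and its return shape are assumptions made for this illustration.

```python
from itertools import accumulate


def kmeans_1d_reference(values, k):
    """Globally optimal 1D k-means via a plain O(k * n^2) dynamic program.

    Illustrative only: hypothetical helper, not part of kmeans1d.
    """
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(values)")

    # Sort once; optimal 1D clusters are contiguous runs of the sorted data.
    order = sorted(range(n), key=lambda i: values[i])
    xs = [float(values[i]) for i in order]

    # Prefix sums give the cost of any run xs[i:j] in O(1).
    s1 = [0.0] + list(accumulate(xs))
    s2 = [0.0] + list(accumulate(x * x for x in xs))

    def cost(i, j):
        """Sum of squared distances of xs[i:j] to its own mean (i < j)."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[m][j]: minimal cost of splitting xs[:j] into m non-empty clusters.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    split = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                candidate = best[m - 1][i] + cost(i, j)
                if candidate < best[m][j]:
                    best[m][j] = candidate
                    split[m][j] = i

    # Walk the split table backwards to recover labels and centroids.
    labels = [0] * n
    centroids = [0.0] * k
    j = n
    for m in range(k, 0, -1):
        i = split[m][j]
        centroids[m - 1] = (s1[j] - s1[i]) / (j - i)
        for pos in range(i, j):
            labels[order[pos]] = m - 1
        j = i
    return labels, centroids
```

On small inputs, a sketch like this can serve as a cross-check: both approaches minimize the same objective, so the total within-cluster cost should agree, although labels may differ when several clusterings tie.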

The code is written in C++, and wrapped with Python.

Requirements

kmeans1d supports Python 3.x.

Installation

kmeans1d is available on PyPI, the Python Package Index.

$ pip3 install kmeans1d

Example Usage

import kmeans1d

x = [4.0, 4.1, 4.2, -50, 200.2, 200.4, 200.9, 80, 100, 102]
k = 4

clusters, centroids = kmeans1d.cluster(x, k)

print(clusters)   # [1, 1, 1, 0, 3, 3, 3, 2, 2, 2]
print(centroids)  # [-50.0, 4.1, 94.0, 200.5]
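As a follow-up, the sketch below shows one way to work with the two returned lists: grouping the input values by cluster label and computing the within-cluster sum of squared errors. To stay self-contained it hard-codes the labels and centroids printed in the example above rather than calling kmeans1d; the helper names `group_by_cluster` and `within_cluster_sse` are hypothetical and not part of the library.

```python
from collections import defaultdict

# Input and output values copied from the example above.
x = [4.0, 4.1, 4.2, -50, 200.2, 200.4, 200.9, 80, 100, 102]
clusters = [1, 1, 1, 0, 3, 3, 3, 2, 2, 2]
centroids = [-50.0, 4.1, 94.0, 200.5]


def group_by_cluster(values, labels):
    """Map each cluster label to the list of values assigned to it."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    return dict(sorted(groups.items()))


def within_cluster_sse(values, labels, centers):
    """Sum of squared distances from each value to its cluster centroid."""
    return sum((v - centers[label]) ** 2 for v, label in zip(values, labels))


print(group_by_cluster(x, clusters))
print(within_cluster_sse(x, clusters, centroids))
```

The labels index into the centroid list, so `centroids[clusters[i]]` is the centroid assigned to `x[i]`.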

Tests

Tests are in tests/.

# Run tests
$ python3 -m unittest discover tests -v

Development

The underlying C++ code can be built in-place, outside the context of pip. This requires Python development tools for building Python modules (e.g., the python3-dev package on Ubuntu). gcc, clang, and MSVC have been tested.

$ python3 setup.py build_ext --inplace

The packages GitHub Actions workflow can be triggered manually (Actions > packages > Run workflow) to build wheels and a source distribution.

License

The code in this repository has an MIT License.

See LICENSE.

References

[1] Wu, Xiaolin. "Optimal Quantization by Matrix Searching." Journal of Algorithms 12, no. 4 (December 1, 1991): 663

[2] Gronlund, Allan, Kasper Green Larsen, Alexander Mathiasen, Jesper Sindahl Nielsen, Stefan Schneider, and Mingzhou Song. "Fast Exact K-Means, k-Medians and Bregman Divergence Clustering in 1D." ArXiv:1701.07204 [Cs], January 25, 2017. http://arxiv.org/abs/1701.07204.

Download files

Download the file for your platform.

Source Distribution

  • kmeans1d-0.5.0.tar.gz (7.8 kB): Source

Built Distributions

  • kmeans1d-0.5.0-cp32-abi3-win_arm64.whl (15.9 kB): CPython 3.2+, Windows ARM64
  • kmeans1d-0.5.0-cp32-abi3-win_amd64.whl (18.3 kB): CPython 3.2+, Windows x86-64
  • kmeans1d-0.5.0-cp32-abi3-manylinux_2_34_x86_64.whl (119.0 kB): CPython 3.2+, manylinux: glibc 2.34+ x86-64
  • kmeans1d-0.5.0-cp32-abi3-manylinux_2_34_aarch64.whl (117.0 kB): CPython 3.2+, manylinux: glibc 2.34+ ARM64
  • kmeans1d-0.5.0-cp32-abi3-macosx_11_0_universal2.whl (28.4 kB): CPython 3.2+, macOS 11.0+ universal2 (ARM64, x86-64)

File details

Details for the file kmeans1d-0.5.0.tar.gz.

File metadata

  • Download URL: kmeans1d-0.5.0.tar.gz
  • Upload date:
  • Size: 7.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for kmeans1d-0.5.0.tar.gz
Algorithm Hash digest
SHA256 a8fd0cda0f3d7a563d232f53c1f752850832ad920f4162dbc71040d647ba4091
MD5 6af0da8109ad85181b96342249d576f1
BLAKE2b-256 2395dc8374732aae9f0bd90c5167fdb05248be12840022307106a954376ffbfc
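The published digests can be used to check a downloaded file before installing it. Below is a minimal sketch using Python's standard `hashlib`; the helper name `sha256_of_file` and the assumption that the archive sits in the current directory are illustrative, and the expected digest is the SHA256 value listed above for the source distribution.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest copied from the SHA256 row for kmeans1d-0.5.0.tar.gz.
EXPECTED_SHA256 = "a8fd0cda0f3d7a563d232f53c1f752850832ad920f4162dbc71040d647ba4091"
ARCHIVE = "kmeans1d-0.5.0.tar.gz"  # assumed location: current directory

if os.path.exists(ARCHIVE):
    matches = sha256_of_file(ARCHIVE) == EXPECTED_SHA256
    print("hash matches" if matches else "hash MISMATCH")
else:
    print(f"{ARCHIVE} not found; download it first")
```

The same function works for any of the wheel files if `EXPECTED_SHA256` and `ARCHIVE` are swapped for the corresponding values.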

File details

Details for the file kmeans1d-0.5.0-cp32-abi3-win_arm64.whl.

File metadata

  • Download URL: kmeans1d-0.5.0-cp32-abi3-win_arm64.whl
  • Upload date:
  • Size: 15.9 kB
  • Tags: CPython 3.2+, Windows ARM64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for kmeans1d-0.5.0-cp32-abi3-win_arm64.whl
Algorithm Hash digest
SHA256 b0cd805e8d755aed47ec35b82a16a7b27a708e8ce90ee8e54a93352fd1fe97f4
MD5 896d4e75f16073b43e80399a9054de02
BLAKE2b-256 482742ce366c599c52f49699836e33fba5ae01076931b36ef14586a6c2b7e4f7

File details

Details for the file kmeans1d-0.5.0-cp32-abi3-win_amd64.whl.

File metadata

  • Download URL: kmeans1d-0.5.0-cp32-abi3-win_amd64.whl
  • Upload date:
  • Size: 18.3 kB
  • Tags: CPython 3.2+, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for kmeans1d-0.5.0-cp32-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 eb26be9596b9074cfca29354a86d17957f71c15a4dd0ac9d92687c3c3b6b5107
MD5 1afd5a7b73a6026c25a2e7f0531dcdb7
BLAKE2b-256 d6af96f753a2e0dfba6d3e4a7a0626e9437d2c35104f45790d08455077b6995b

File details

Details for the file kmeans1d-0.5.0-cp32-abi3-manylinux_2_34_x86_64.whl.

File hashes

Hashes for kmeans1d-0.5.0-cp32-abi3-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 66dbb2794bc2afaec0286aaf92021b4f5557e84762b31f37dc73ae821386ea60
MD5 353ec0743aeec71ec69d71877f5fd0bd
BLAKE2b-256 6d50ca8bbf6453c1fff2f804b992f4eff00c292113bb2d8a18cb8765a05ccd92

File details

Details for the file kmeans1d-0.5.0-cp32-abi3-manylinux_2_34_aarch64.whl.

File hashes

Hashes for kmeans1d-0.5.0-cp32-abi3-manylinux_2_34_aarch64.whl
Algorithm Hash digest
SHA256 256a86c5ca7bd9bb29d26eaa4375f8a9705ae0739cc4123494ffdc98ee918e1e
MD5 04faa55cba0ac0b0ab47cf2d0c282e8c
BLAKE2b-256 113c2541fe9d8817b3aac12fc4d4bc1adf1c80c0b66c1b65a16ee0c47ac1dd83

File details

Details for the file kmeans1d-0.5.0-cp32-abi3-macosx_11_0_universal2.whl.

File hashes

Hashes for kmeans1d-0.5.0-cp32-abi3-macosx_11_0_universal2.whl
Algorithm Hash digest
SHA256 3a31ed6874f1be9d2add6a18980859d3423374c471cc624098891b40c779dbf4
MD5 ab506909b15954fd324a6cf55e293dfc
BLAKE2b-256 eb23d39d999fc8b49e33c3de437cfcd9012ae180eb2e66eb207eacea0626c5f6