Skip to main content

A Python package for optimal 1D k-means clustering

Project description

Build Status

kmeans1d

A Python library with an implementation of k-means clustering on 1D data, based on the algorithm from Xiaolin (1991), as presented by Gronlund et al. (2017, Section 2.2).

Globally optimal k-means clustering is NP-hard for multi-dimensional data. Lloyd's algorithm is a popular approach for finding a locally optimal solution. For 1-dimensional data, there are polynomial time algorithms. The algorithm implemented here is an O(kn + n log n) dynamic programming algorithm for finding the globally optimal k clusters for n 1D data points.

The code is written in C++, and wrapped with Python.

Requirements

kmeans1d supports Python 3.x.

Installation

kmeans1d is available on PyPI, the Python Package Index.

$ pip3 install kmeans1d

Example Usage

import kmeans1d

x = [4.0, 4.1, 4.2, -50, 200.2, 200.4, 200.9, 80, 100, 102]
k = 4

clusters, centroids = kmeans1d.cluster(x, k)

print(clusters)   # [1, 1, 1, 0, 3, 3, 3, 2, 2, 2]
print(centroids)  # [-50.0, 4.1, 94.0, 200.5]

Tests

Tests are in tests/.

# Run tests
$ python3 -m unittest discover tests -v

Development

The underlying C++ code can be built in-place, outside the context of pip. This requires Python development tools for building Python modules (e.g., the python3-dev package on Ubuntu). gcc, clang, and MSVC have been tested.

$ python3 setup.py build_ext --inplace

The packages GitHub action can be manually triggered (Actions > packages > Run workflow) to build wheels and a source distribution.

License

The code in this repository has an MIT License.

See LICENSE.

References

[1] Wu, Xiaolin. "Optimal Quantization by Matrix Searching." Journal of Algorithms 12, no. 4 (December 1, 1991): 663

[2] Gronlund, Allan, Kasper Green Larsen, Alexander Mathiasen, Jesper Sindahl Nielsen, Stefan Schneider, and Mingzhou Song. "Fast Exact K-Means, k-Medians and Bregman Divergence Clustering in 1D." ArXiv:1701.07204 [Cs], January 25, 2017. http://arxiv.org/abs/1701.07204.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kmeans1d-0.4.0.tar.gz (7.2 kB view details)

Uploaded Source

Built Distributions

kmeans1d-0.4.0-cp32-abi3-win_amd64.whl (18.1 kB view details)

Uploaded CPython 3.2+ Windows x86-64

kmeans1d-0.4.0-cp32-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (122.7 kB view details)

Uploaded CPython 3.2+ manylinux: glibc 2.17+ x86-64

kmeans1d-0.4.0-cp32-abi3-macosx_11_0_universal2.whl (28.6 kB view details)

Uploaded CPython 3.2+ macOS 11.0+ universal2 (ARM64, x86-64)

File details

Details for the file kmeans1d-0.4.0.tar.gz.

File metadata

  • Download URL: kmeans1d-0.4.0.tar.gz
  • Upload date:
  • Size: 7.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for kmeans1d-0.4.0.tar.gz
Algorithm Hash digest
SHA256 80f66178d4e66f75e076c4877b52f9b4d0eb2d3b4294c056016a1384f75fe5de
MD5 2b7eb7a6a44a2717e4717a07bd2cb6c6
BLAKE2b-256 185b0ba1f71363e62fb2bd5370ec9b26360c482bd9ad06082b5dcfb03e148928

See more details on using hashes here.

File details

Details for the file kmeans1d-0.4.0-cp32-abi3-win_amd64.whl.

File metadata

  • Download URL: kmeans1d-0.4.0-cp32-abi3-win_amd64.whl
  • Upload date:
  • Size: 18.1 kB
  • Tags: CPython 3.2+, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for kmeans1d-0.4.0-cp32-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 5943d68f6514309023ace23d8ad3f43703871e87a021e8024cdf4db79ed2f2e1
MD5 a327d8ece92b60bb366631897a8346d4
BLAKE2b-256 d55adb6e1f27d31db470e53a6b70009216891050a7396a70584c9600a5b98bd4

See more details on using hashes here.

File details

Details for the file kmeans1d-0.4.0-cp32-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for kmeans1d-0.4.0-cp32-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 ad8017fdb82746c718215695b4290e9bd6ded0305740e2fe2b1ede697a14e947
MD5 2ad01bfbbc32032aebc10b6c03f7501d
BLAKE2b-256 ba0c63c31443d391f5452aa21e7d5e10106de06cd6ea42bf0a0689b01900139f

See more details on using hashes here.

File details

Details for the file kmeans1d-0.4.0-cp32-abi3-macosx_11_0_universal2.whl.

File metadata

File hashes

Hashes for kmeans1d-0.4.0-cp32-abi3-macosx_11_0_universal2.whl
Algorithm Hash digest
SHA256 9a052bca5bbacbd70a7f4846896439ebed29e96a2b04115c1fec480555cad70d
MD5 27cc446ada593465911564900b00f7cf
BLAKE2b-256 191957056e4566611f5da2166c52a9f88185c177a224d0ecab1dd632cffa1c88

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page