
A Python package for optimal 1D k-means clustering.

fast1dkmeans

A Python library that implements several variants of optimal k-means clustering on 1D data, based on the algorithms presented by Gronlund et al. (2017) [1]. The package is inspired by the kmeans1d package and extends it with additional algorithms, in particular ones that reduce the memory requirement from O(kn) to O(n).

There are several ways to compute the optimal k-means clustering in 1D. The package currently implements the following methods:

  • "binary-search-interpolation" (default) [O(n lg(U)) time, O(n) space, "wilber-interpolation"]
  • "dynamic-programming-kn" [O(kn) time, O(kn) space]
  • "dynamic-programming-space" [O(kn) time, O(n) space, "dp-linear"]
  • "binary-search-normal" [O(n lg(U)) time, O(n) space, section 2.4, "wilber-binary"]
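
To make concrete what "optimal" means for the methods listed above, here is a minimal reference sketch in pure Python. It is not the package's code and not one of the algorithms listed: it is a plain O(k n^2) dynamic program, far slower than any of them, and the function name `kmeans_1d_reference` is hypothetical. It relies only on the fact that an optimal 1D clustering consists of contiguous runs of the sorted data.

```python
from typing import List, Sequence


def kmeans_1d_reference(x: Sequence[float], k: int) -> List[int]:
    """Optimal 1D k-means via a plain O(k * n^2) dynamic program.

    Illustrative only (hypothetical helper, not part of fast1dkmeans).
    Returns one label per input value; labels increase with the value.
    """
    n = len(x)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(x)")

    # Work on sorted values, remembering where each one came from.
    order = sorted(range(n), key=lambda i: x[i])
    xs = [x[i] for i in order]

    # Prefix sums give the squared error of any run xs[i:j] in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(xs):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def cost(i: int, j: int) -> float:
        """Sum of squared deviations from the mean of xs[i:j], with j > i."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[c][j]: minimal cost of splitting xs[:j] into c contiguous clusters.
    # split[c][j]: start index of the last cluster in that optimal split.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    split = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cand = best[c - 1][i] + cost(i, j)
                if cand < best[c][j]:
                    best[c][j] = cand
                    split[c][j] = i

    # Walk the split points backwards to recover the labels.
    labels = [0] * n
    j = n
    for c in range(k, 0, -1):
        i = split[c][j]
        for pos in range(i, j):
            labels[order[pos]] = c - 1
        j = i
    return labels


x = [4.0, 4.1, 4.2, -50, 200.2, 200.4, 200.9, 80, 100, 102]
print(kmeans_1d_reference(x, 4))
```

The `best` table here uses O(kn) memory, which is the cost that the O(n)-space methods above avoid.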

The code is written in Python and relies on the numba compiler for speed.

Requirements

fast1dkmeans relies on numpy and numba, which currently support Python 3.8-3.10.

Installation

fast1dkmeans is available on PyPI, the Python Package Index.

$ pip3 install fast1dkmeans

Example Usage

import fast1dkmeans

x = [4.0, 4.1, 4.2, -50, 200.2, 200.4, 200.9, 80, 100, 102]
k = 4

clusters = fast1dkmeans.cluster(x, k)

print(clusters)   # [1, 1, 1, 0, 3, 3, 3, 2, 2, 2]
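
The example above returns only one label per value. If the cluster centers are needed as well, they can be recovered from the labels as per-cluster means. The sketch below is self-contained and does not call the package: `cluster_means` is a hypothetical helper, and the label list is copied from the output comment in the example.

```python
from collections import defaultdict
from typing import Dict, Sequence


def cluster_means(x: Sequence[float], labels: Sequence[int]) -> Dict[int, float]:
    """Return the mean of the values assigned to each label (hypothetical helper)."""
    sums: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for value, label in zip(x, labels):
        sums[label] += value
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sorted(sums)}


x = [4.0, 4.1, 4.2, -50, 200.2, 200.4, 200.9, 80, 100, 102]
labels = [1, 1, 1, 0, 3, 3, 3, 2, 2, 2]  # copied from the example output

means = cluster_means(x, labels)
for label, mean in means.items():
    print(label, round(mean, 3))
```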

Important notice: On first use, the code is compiled once, which may take about 30 s. On subsequent uses, compilation is no longer necessary and execution is much faster.
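
One way to observe the compilation overhead is to time the same call several times in a row. The following is a rough sketch: `time_calls` is a hypothetical helper, and the actual durations depend on the machine and on the installed numba version. If fast1dkmeans is not installed, the script only prints a message.

```python
import time
from typing import Any, Callable, List


def time_calls(fn: Callable[..., Any], *args: Any, repeats: int = 3) -> List[float]:
    """Call fn(*args) `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        durations.append(time.perf_counter() - start)
    return durations


try:
    import fast1dkmeans
except ImportError:
    print("fast1dkmeans is not installed; nothing to time")
else:
    x = [4.0, 4.1, 4.2, -50, 200.2, 200.4, 200.9, 80, 100, 102]
    # The first duration is expected to include the one-time compilation.
    for run, seconds in enumerate(time_calls(fast1dkmeans.cluster, x, 4), start=1):
        print(f"call {run}: {seconds:.4f} s")
```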

Tests

Tests are in tests/.

# Run tests
$ python3 -m pytest .

License

The code in this repository is licensed under the BSD 2-Clause "Simplified" License.

See LICENSE.

References

[1] Gronlund, Allan, Kasper Green Larsen, Alexander Mathiasen, Jesper Sindahl Nielsen, Stefan Schneider, and Mingzhou Song. "Fast Exact K-Means, k-Medians and Bregman Divergence Clustering in 1D." ArXiv:1701.07204 [Cs], January 25, 2017. http://arxiv.org/abs/1701.07204.
