
https://github.com/bauxn/kernel-kmeans. OpenMP is not enabled when installing via PyPI.

Project description

Kernel K-Means

Implementation of a Kernel K-Means clustering framework in Python. Utilizes Cython to generate efficient C code.
https://github.com/bauxn/kernel-kmeans

Installation

Installing via pip does not enable OpenMP. To use OpenMP, follow the GitHub installation instructions.

Pip

Ensure pip is updated, then run:

pip install KKMeans

GitHub

Install via:

git clone https://github.com/bauxn/kernel-kmeans

Then open the project folder and run:

pip install .

Enabling OpenMP: setup.py contains a clearly marked line that holds the compiler arguments. Before installing, edit this line so that it contains whichever flag your compiler uses to enable OpenMP. setup.py also includes commented-out lines with the correct arguments (plus some additional ones for efficiency) for the MSVC and GCC compilers.
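As a rough illustration of the kind of edit described above, the sketch below selects OpenMP flags per compiler. The helper name openmp_args is hypothetical and not part of KKMeans, and the marked line in setup.py may be structured differently. Only the flags themselves (/openmp for MSVC, -fopenmp for GCC) are the standard ones for those compilers.

```python
def openmp_args(compiler):
    """Return (compile_args, link_args) that enable OpenMP.

    Hypothetical helper for illustration; not part of KKMeans.
    """
    if compiler == "msvc":
        # MSVC enables OpenMP through a compile flag only
        return ["/openmp"], []
    if compiler == "gcc":
        # GCC needs the flag at both compile time and link time
        return ["-fopenmp"], ["-fopenmp"]
    raise ValueError(f"no OpenMP flags known for compiler {compiler!r}")


compile_args, link_args = openmp_args("gcc")
print(compile_args, link_args)
```

In a setuptools build, lists like these are typically passed as extra_compile_args and extra_link_args of an Extension.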

Basic Usage

from KKMeans import KKMeans

kkm = KKMeans(n_clusters=3, kernel="rbf")
kkm.fit(data)
print(kkm.labels_) # shows label for each datapoint
print(kkm.quality_) # print quality (default is inertia) of clustering

predictions = kkm.predict(data_to_predict) # returns labels of points in data_to_predict

KKMeans also contains the following modules: kernels, which provides functionality to build kernel matrices and calculate kernels; elkan and lloyd, which calculate single iterations of the respective algorithms; and quality, which calculates the silhouette coefficient. For more elaborate usage, consult the thesis on GitHub or the docstrings.
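To make the idea of a kernel matrix and a single Lloyd iteration concrete, here is a minimal pure-Python sketch of the textbook kernel k-means assignment step. It does not use or mirror the KKMeans modules; the function names rbf_kernel_matrix and lloyd_step are assumptions made for this illustration, and the package's Cython code is organised differently.

```python
import math


def rbf_kernel_matrix(points, gamma=1.0):
    """Pairwise RBF kernel: K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [[math.exp(-gamma * sq_dist(a, b)) for b in points] for a in points]


def lloyd_step(K, labels, n_clusters):
    """One assignment step of kernel k-means on a precomputed kernel matrix.

    Squared feature-space distance of point i to the centre of cluster C:
    K[i][i] - 2/|C| * sum_j K[i][j] + 1/|C|^2 * sum_{j,l} K[j][l]
    with j and l ranging over the members of C.
    """
    n = len(K)
    members = [[i for i in range(n) if labels[i] == c] for c in range(n_clusters)]
    # The last term depends only on the cluster, so compute it once per cluster
    within = [
        sum(K[j][l] for j in m for l in m) / len(m) ** 2 if m else 0.0
        for m in members
    ]
    new_labels = []
    for i in range(n):
        best, best_dist = labels[i], math.inf
        for c, m in enumerate(members):
            if not m:
                continue  # an empty cluster has no centre to compare against
            cross = sum(K[i][j] for j in m) / len(m)
            dist = K[i][i] - 2.0 * cross + within[c]
            if dist < best_dist:
                best, best_dist = c, dist
        new_labels.append(best)
    return new_labels


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
K = rbf_kernel_matrix(points, gamma=0.5)
labels = lloyd_step(K, [0, 0, 0, 1], n_clusters=2)
print(labels)  # [0, 0, 1, 1]
```

Starting from a labelling that wrongly places the third point in cluster 0, one step moves it to the cluster of its close neighbour.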

Limitations

As the computations happen in C, overflows and other datatype errors may occur in extreme cases. The critical values are:

Value          Datatype
kernel_matrix  double
n_clusters     long
cluster_sizes  long
labels         long

  1. The kernel matrix consists of the results of the kernel function, which is usually applied pairwise to the dataset, so every result must fit in a double.
  2. Ensure the number of clusters fits in a long.
  3. Ensure the number of points fits in a long.
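
The three checks above can be sketched as a small pre-flight validation. The helper check_inputs and the constant C_LONG_MAX are hypothetical names introduced for this example and are not provided by KKMeans. The size of a C long is read from the running platform, since it is 32 bits on 64-bit Windows and 64 bits on most 64-bit Unix systems.

```python
import ctypes
import math

# Largest value a C long can hold on the current platform
C_LONG_MAX = 2 ** (8 * ctypes.sizeof(ctypes.c_long) - 1) - 1


def check_inputs(kernel_matrix, n_clusters):
    """Raise ValueError if the inputs could overflow the C types listed above.

    Hypothetical helper for illustration; not part of KKMeans.
    """
    n_points = len(kernel_matrix)
    if n_points > C_LONG_MAX:
        raise ValueError("number of points does not fit in a C long")
    if not 0 < n_clusters <= C_LONG_MAX:
        raise ValueError("n_clusters does not fit in a C long")
    for row in kernel_matrix:
        for value in row:
            # Python floats are C doubles; an overflow shows up as inf or nan
            if not math.isfinite(value):
                raise ValueError("kernel matrix contains non-finite values")


check_inputs([[1.0, 0.5], [0.5, 1.0]], n_clusters=2)  # passes silently
```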
