Knee Finder

A simple tool to find the knee point of a 2-d curve.

This is useful for tuning the parameters of several algorithms (clustering, etc.).

Installation

You can install this package with pip:

pip install kneefinder

Definition of "Knee" point

The knee point is defined as the point at which the “relative costs to increase [or decrease, ed.] some tunable parameter is no longer worth the corresponding performance benefit” (Satopää, Albrecht, Irwin, and Raghavan, 2011, p. 1).

Example

import numpy as np
from KneeFinder import KneeFinder

# Noisy exponential decay sampled at 15 points
data_x = np.linspace(1, 10, 15)
data_y = 10 * (np.exp(-data_x) + 0.15 * np.random.rand(len(data_x)))

kf = KneeFinder(data_x=data_x, data_y=data_y)

knee_x, knee_y = kf.find_knee()

# plotting to check the results
kf.plot()

[Figure: clustering_data]

Methodology

KneeFinder defines the knee as the point with the maximum distance from the line passing through the first and last points.

As an example, take the following image: the data is shown in blue, the segment connecting the first and last data points in orange, and the distances between the data points and that segment in red. The thick solid red line marks the knee point.

[Figure: clustering_data]

This methodology is simpler than other methods: no parameters are required, so it is easier to use in automated processes.
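The rule described above can be sketched in a few lines of NumPy. This is only an illustration of the idea, not the package's actual implementation: the function name `farthest_from_chord` is hypothetical, and it assumes that "distance" means the perpendicular distance to the line through the first and last points.

```python
import numpy as np


def farthest_from_chord(x, y):
    """Return the index of the point farthest from the first-to-last line.

    Hypothetical helper for illustration; not part of the kneefinder API.
    Assumes the first and last points are distinct.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Direction vector of the chord joining the first and last points.
    dx, dy = x[-1] - x[0], y[-1] - y[0]

    # Perpendicular distance of every point to that line:
    # |2-D cross product| divided by the chord length.
    cross = dx * (y - y[0]) - dy * (x - x[0])
    distances = np.abs(cross) / np.hypot(dx, dy)

    return int(np.argmax(distances))


# Noise-free exponential decay, similar in shape to the example data.
x = np.linspace(1, 10, 15)
y = 10 * np.exp(-x)

i = farthest_from_chord(x, y)
print(f"knee candidate at x={x[i]:.3f}, y={y[i]:.3f}")
```

Because the distance is taken in absolute value, the sketch needs no hint about whether the curve is convex or concave, increasing or decreasing.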

Robustness

Since this tool does not rely on any assumption about the curve shape, it is more robust than other, more complicated tools.

As an example, consider Kneed with the following data, simulating a common misconfiguration of its parameters:

# Finding the knee with the Kneed tool (not with ours)
from kneed import KneeLocator

x = [0.1       , 0.23571429, 0.37142857, 0.50714286, 0.64285714,
       0.77857143, 0.91428571, 1.05      , 1.18571429, 1.32142857,
       1.45714286, 1.59285714, 1.72857143, 1.86428571, 2.        ]
y = [ 1.17585897,  1.35051375,  1.836304  ,  2.20409812,  2.37060316,
        2.46157837,  3.28991099,  2.9927505 ,  3.44015722,  6.33212422,
        6.92051422,  5.28718862,  6.69129098,  6.67477275, 10.00921042]

kneedle = KneeLocator(x, y, curve="concave", direction="increasing")
kneedle.plot_knee()

Note that the curve is convex-like, while we configured Kneed as if the curve were concave-like. With this configuration, the package reports the knee/elbow point as the very first point, which is obviously wrong.

[Figure: kneed_wrong]

With our tool, by contrast, you get:

[Figure: kneed_right]
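To see why no `curve` or `direction` setting is needed, here is a small self-contained sketch that applies the chord-distance idea to the same data. It is not the package's code: the helper `knee_index` is hypothetical, and it measures the vertical gap between the data and the chord, which peaks at the same index as the perpendicular distance because the two differ only by a constant factor.

```python
import numpy as np

x = [0.1, 0.23571429, 0.37142857, 0.50714286, 0.64285714,
     0.77857143, 0.91428571, 1.05, 1.18571429, 1.32142857,
     1.45714286, 1.59285714, 1.72857143, 1.86428571, 2.0]
y = [1.17585897, 1.35051375, 1.836304, 2.20409812, 2.37060316,
     2.46157837, 3.28991099, 2.9927505, 3.44015722, 6.33212422,
     6.92051422, 5.28718862, 6.69129098, 6.67477275, 10.00921042]


def knee_index(x, y):
    """Index of the largest vertical gap between the data and the chord.

    Hypothetical helper for illustration; assumes x is strictly increasing.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Height of the first-to-last chord at every x position.
    chord = np.interp(x, [x[0], x[-1]], [y[0], y[-1]])
    # The absolute value makes the result independent of convexity.
    return int(np.argmax(np.abs(y - chord)))


convex_like = knee_index(x, y)
# Flipping the sign of y turns the convex-like curve into a concave-like one.
concave_like = knee_index(x, [-v for v in y])

print(convex_like == concave_like)
```

The same point is selected for the curve and for its mirrored version, so there is no shape parameter to misconfigure.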

Moreover, our tool is also a bit faster:

%%timeit
kf = KneeFinder(data_x=x, data_y=y)
kf.find_knee()
# 24 µs ± 268 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%%timeit
kneedle = KneeLocator(x, y, curve="concave", direction="increasing")
kneedle.find_knee()
# 91.8 µs ± 1.32 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kneefinder-0.0.2.tar.gz (8.0 kB view details)

Uploaded Source

Built Distribution

kneefinder-0.0.2-py3-none-any.whl (8.2 kB view details)

Uploaded Python 3

File details

Details for the file kneefinder-0.0.2.tar.gz.

File metadata

  • Download URL: kneefinder-0.0.2.tar.gz
  • Upload date:
  • Size: 8.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for kneefinder-0.0.2.tar.gz
Algorithm Hash digest
SHA256 cb3052111c295f184353bda9f5f1e35274728157d628a56961e333c1659fa348
MD5 31beb7c5f4d92f8e0c2a0474e4ffb06e
BLAKE2b-256 516de7d827edb19c2af54f4da6ba0372914fabc620396142fb49dc7e18f7d545

See more details on using hashes here.

File details

Details for the file kneefinder-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: kneefinder-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 8.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for kneefinder-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 0bcdc895afc6877cbb1e83ddad5a3ccdff3d581848afc4f55459e596f70ca246
MD5 3fa0dd154dcb842e72e45997aca066b3
BLAKE2b-256 648d3472d60311741dad9524575e7734bc46636fbdd8621704c264acfb43e79e
