# kneed

## Knee-point detection in Python

[![Downloads](https://pepy.tech/badge/kneed)](https://pepy.tech/project/kneed) [![Downloads](https://pepy.tech/badge/kneed/week)](https://pepy.tech/project/kneed) [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/arvkevi/kneed/master) [![Build Status](https://travis-ci.com/arvkevi/kneed.svg?branch=master)](https://travis-ci.com/arvkevi/kneed) [![Codacy Badge](https://api.codacy.com/project/badge/Grade/0438592c8c0949fa902b2665a8b73ff1)](https://www.codacy.com/app/arvkevi/kneed?utm_source=github.com&utm_medium=referral&utm_content=arvkevi/kneed&utm_campaign=Badge_Grade)

This repository is an attempt to implement the Kneedle algorithm, published [here](https://www1.icsi.berkeley.edu/~barath/papers/kneedle-simplex11.pdf). Given a set of `x` and `y` values, `kneed` returns the `x` value of the knee point of the function. The knee point is the point of maximum curvature.
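
As a rough illustration of the idea (not `kneed`'s actual implementation), the sketch below handles only a concave, increasing curve. It rescales both axes to [0, 1] and picks the point that lies farthest above the diagonal. The function name `simple_knee` is hypothetical, and the smoothing and sensitivity (`S`) steps of the full algorithm are left out.

```python
def simple_knee(x, y):
    """Estimate the knee of a concave, increasing curve.

    Simplified sketch: no smoothing and no sensitivity threshold.
    """
    if len(x) != len(y):
        raise ValueError("x and y must be the same length")
    x_lo, x_hi = min(x), max(x)
    y_lo, y_hi = min(y), max(y)
    # Rescale both axes to the unit interval so they are comparable.
    x_n = [(v - x_lo) / (x_hi - x_lo) for v in x]
    y_n = [(v - y_lo) / (y_hi - y_lo) for v in y]
    # Height of the normalized curve above the diagonal y = x.
    diff = [yn - xn for xn, yn in zip(x_n, y_n)]
    # The knee estimate is the x value where that height is largest.
    return x[diff.index(max(diff))]


# Rounded Figure 2 values, as printed in the Usage section.
x = [0.0, 0.111, 0.222, 0.333, 0.444, 0.556, 0.667, 0.778, 0.889, 1.0]
y = [-5.0, 0.263, 1.897, 2.692, 3.163, 3.475, 3.696, 3.861, 3.989, 4.091]

print(simple_knee(x, y))  # 0.222
```

On this small, noise-free dataset the sketch lands on the same point that `KneeLocator` reports in the Usage section. On noisy data the omitted steps matter, so the two may disagree.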

![](images/functions_args_summary.png)

## Installation

To install, use pip:

    $ pip install kneed

Or clone the repo and install from source:

    $ git clone https://github.com/arvkevi/kneed.git
    $ cd kneed
    $ python setup.py install

**Tested with Python 3.5 and 3.6**

## Usage
*This reproduces Figure 2 from the manuscript.*

`x` and `y` must be arrays of equal length.
`DataGenerator` provides functions that generate sample datasets.
```python
from kneed import DataGenerator, KneeLocator

x, y = DataGenerator.figure2()

# Print the sample data rounded to 3 decimal places.
print([round(i, 3) for i in x])
print([round(i, 3) for i in y])

# Output:
# [0.0, 0.111, 0.222, 0.333, 0.444, 0.556, 0.667, 0.778, 0.889, 1.0]
# [-5.0, 0.263, 1.897, 2.692, 3.163, 3.475, 3.696, 3.861, 3.989, 4.091]
```
Instantiating `KneeLocator` with `x`, `y` and the appropriate `curve` and `direction` will find the knee (or elbow) point.
Here, `kneedle.knee` stores the knee point of the curve.

```python
kneedle = KneeLocator(x, y, S=1.0, curve='concave', direction='increasing')

print(round(kneedle.knee, 3))
# 0.222

# .elbow can also be used to access the point of maximum curvature
print(round(kneedle.elbow, 3))
# 0.222
```
The `KneeLocator` class also has some plotting functions for quick visualization of the curve (blue), the distance curve (red), and the knee (dashed line, if present).
```python
kneedle.plot_knee_normalized()
```

![](images/figure2.knee.png)

#### Average Knee from 5000 NoisyGaussians when mu=50 and sigma=10

```python
import numpy as np
from kneed import DataGenerator, KneeLocator

# Collect the knee found in each of 5000 noisy Gaussian samples.
knees = []
for i in range(5000):
    x, y = DataGenerator.noisy_gaussian(mu=50, sigma=10, N=1000)
    kneedle = KneeLocator(x, y, curve='concave', direction='increasing')
    knees.append(kneedle.knee)

np.mean(knees)
# 60.921051806064931
```

## Application
Find the optimal number of clusters (k) to use in k-means clustering.
See the tutorial in the notebooks folder. This can be achieved with the `curve` and `direction` keyword arguments:

```python
KneeLocator(x, y, curve='convex', direction='decreasing')
```

![](images/knee.png)
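
To make the convex, decreasing case concrete, here is a minimal sketch of the same normalize-and-compare idea applied to an elbow plot. It is not `kneed`'s implementation: the function name `simple_elbow` is hypothetical, the inertia values are made up for illustration, and smoothing and sensitivity handling are omitted.

```python
def simple_elbow(x, y):
    """Estimate the elbow of a convex, decreasing curve (simplified sketch)."""
    if len(x) != len(y):
        raise ValueError("x and y must be the same length")
    x_lo, x_hi = min(x), max(x)
    y_lo, y_hi = min(y), max(y)
    # Rescale both axes to the unit interval.
    x_n = [(v - x_lo) / (x_hi - x_lo) for v in x]
    y_n = [(v - y_lo) / (y_hi - y_lo) for v in y]
    # Depth of the normalized curve below the diagonal from (0, 1) to (1, 0).
    diff = [(1 - xn) - yn for xn, yn in zip(x_n, y_n)]
    # The elbow estimate is the x value where that depth is largest.
    return x[diff.index(max(diff))]


# Hypothetical k-means results: number of clusters vs. inertia (made-up values).
ks = [1, 2, 3, 4, 5, 6, 7, 8]
inertia = [1000, 520, 260, 180, 150, 130, 115, 105]

print(simple_elbow(ks, inertia))  # 3
```

In this toy data the inertia drops sharply up to k = 3 and flattens afterwards, which is why the sketch picks 3.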

## Contributing

Contributions are welcome. If you have suggestions or would like to make improvements, please submit an issue or pull request.

## Citation

Finding a “Kneedle” in a Haystack: Detecting Knee Points in System Behavior
Ville Satopää†, Jeannie Albrecht†, David Irwin‡, and Barath Raghavan§
†Williams College, Williamstown, MA
‡University of Massachusetts Amherst, Amherst, MA
§International Computer Science Institute, Berkeley, CA

