# kneed
## Knee-point detection in Python
This repository is an attempt to implement the Kneedle algorithm, published [here](https://www1.icsi.berkeley.edu/~barath/papers/kneedle-simplex11.pdf). Given a set of `x` and `y` values, `kneed` returns the knee point of the function, that is, the point of maximum curvature.

## Installation
To install, use pip:
$ pip install kneed
Or clone the repo and install from source:
$ git clone https://github.com/arvkevi/kneed.git
$ cd kneed
$ python setup.py install
**Tested with Python 3.5 and 3.6**
## Usage
*This reproduces Figure 2 from the manuscript.*
`x` and `y` must be arrays of equal length.
`DataGenerator` provides functions that generate sample datasets.
```python
from kneed import DataGenerator, KneeLocator

# Sample data matching Figure 2 of the Kneedle paper
x, y = DataGenerator.figure2()

print([round(i, 3) for i in x])
# [0.0, 0.111, 0.222, 0.333, 0.444, 0.556, 0.667, 0.778, 0.889, 1.0]
print([round(i, 3) for i in y])
# [-5.0, 0.263, 1.897, 2.692, 3.163, 3.475, 3.696, 3.861, 3.989, 4.091]
```
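The printed values are consistent with the curve `y = -1 / (x + 0.1) + 5` sampled at 10 evenly spaced points on [0, 1]. The sketch below assumes that form (it is inferred from the output above, not taken from the `DataGenerator` source) and rebuilds the same data without importing `kneed`.

```python
# Assumed closed form of the Figure 2 curve, inferred from the printed values
x = [i / 9 for i in range(10)]           # 10 evenly spaced points on [0, 1]
y = [-1 / (xi + 0.1) + 5 for xi in x]    # concave, increasing curve

x_rounded = [round(v, 3) for v in x]
y_rounded = [round(v, 3) for v in y]
print(x_rounded)
print(y_rounded)
```

This is handy for checking a result by hand when `kneed` is not installed.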
Instantiating `KneeLocator` with `x`, `y` and the appropriate `curve` and `direction` will find the knee (or elbow) point.
Here, `kneedle.knee` stores the knee point of the curve.
```python
kneedle = KneeLocator(x, y, S=1.0, curve='concave', direction='increasing')
print(round(kneedle.knee, 3))
# 0.222

# .elbow can also be used to access the point of maximum curvature
print(round(kneedle.elbow, 3))
# 0.222
```
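To see roughly where 0.222 comes from, here is a simplified sketch of the core Kneedle idea for a concave, increasing curve: normalize both axes to [0, 1], subtract the diagonal, and take the point where the difference is largest. It is not `kneed`'s implementation; it skips smoothing and the sensitivity threshold `S`, and it assumes the Figure 2 curve has the form `y = -1 / (x + 0.1) + 5`. The helper name `normalize` is illustrative only.

```python
# Simplified sketch of the Kneedle idea for a concave, increasing curve.
# Not kneed's implementation: no smoothing and no sensitivity (S) threshold.

def normalize(values):
    """Rescale values linearly to the unit interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Assumed form of the Figure 2 curve, consistent with the values printed earlier
x = [i / 9 for i in range(10)]
y = [-1 / (xi + 0.1) + 5 for xi in x]

x_n = normalize(x)
y_n = normalize(y)

# Height of each normalized point above the diagonal y = x
difference = [yi - xi for xi, yi in zip(x_n, y_n)]

# The knee candidate is the point where that height is largest
knee_index = max(range(len(difference)), key=difference.__getitem__)
knee = x[knee_index]
print(round(knee, 3))
# 0.222
```

On this noise-free curve the difference has a single clear maximum, so the simplified version agrees with the value reported above. Noisy data is where the smoothing and threshold steps of the full algorithm matter.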
The `KneeLocator` class also has plotting functions for quick visualization of the curve (blue), the distance curve (red), and the knee (dashed line, if present).
```python
kneedle.plot_knee_normalized()
```

#### Average Knee from 5000 NoisyGaussians when mu=50 and sigma=10
```python
import numpy as np
from kneed import DataGenerator, KneeLocator

knees = []
for i in range(5000):
    # Generate a fresh noisy sample on each iteration
    x, y = DataGenerator.noisy_gaussian(mu=50, sigma=10, N=1000)
    kneedle = KneeLocator(x, y, curve='concave', direction='increasing')
    knees.append(kneedle.knee)

np.mean(knees)
# 60.921051806064931
```
## Application
Find the optimal number of clusters (k) to use in k-means clustering.
See the tutorial in the notebooks folder. An elbow plot of this kind is convex and decreasing, which is selected with the `curve` and `direction` keyword arguments:
```python
KneeLocator(x, y, curve='convex', direction='decreasing')
```
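For intuition about the convex, decreasing case, the sketch below locates the elbow by hand: after normalizing both axes, it looks for the point lying farthest below the line `y = 1 - x`. The curve `1 / k` is a synthetic stand-in for a k-means inertia curve, not real clustering output, and the code is a simplified illustration rather than `kneed`'s implementation (no smoothing, no sensitivity threshold).

```python
# Simplified elbow search for a convex, decreasing curve.
# The data is synthetic: 1/k stands in for k-means inertia values.

def normalize(values):
    """Rescale values linearly to the unit interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

k_values = list(range(1, 11))
inertia = [1 / k for k in k_values]

k_n = normalize(k_values)
inertia_n = normalize(inertia)

# Depth of each normalized point below the descending diagonal y = 1 - x
difference = [(1 - kn) - yn for kn, yn in zip(k_n, inertia_n)]

# The elbow candidate is the point where that depth is largest
elbow_index = max(range(len(difference)), key=difference.__getitem__)
elbow_k = k_values[elbow_index]
print(elbow_k)
# 3
```

With real inertia values the curve is rarely this smooth, which is the situation `KneeLocator` is meant to handle.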

## Contributing
Contributions are welcome. If you have suggestions or would like to make improvements, please submit an issue or pull request.
## Citation
Finding a “Kneedle” in a Haystack: Detecting Knee Points in System Behavior
Ville Satopa†, Jeannie Albrecht†, David Irwin‡, and Barath Raghavan§
†Williams College, Williamstown, MA
‡University of Massachusetts Amherst, Amherst, MA
§International Computer Science Institute, Berkeley, CA