# kneed
## Knee-point detection in Python
This repository is an attempt to implement the Kneedle algorithm, described in [this paper](https://www1.icsi.berkeley.edu/~barath/papers/kneedle-simplex11.pdf). Given a set of `x` and `y` values, `kneed` returns the knee point of the curve they describe. The knee point is the point of maximum curvature.
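
To give a feel for the idea, here is a minimal pure-Python sketch of the core step of Kneedle for a concave, increasing curve: normalize both axes to [0, 1], then pick the point where the curve rises farthest above the diagonal `y = x`. This is a simplification and not the code in `kneed`: it skips the smoothing and the sensitivity threshold `S` described in the paper, and the helper names `normalize` and `simple_knee` are hypothetical. The `y` values are the Figure 2 values printed in the Usage section.

```python
def normalize(values):
    """Rescale a sequence linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def simple_knee(x, y):
    """Return the x value where the normalized curve is farthest above y = x.

    Simplified illustration only: assumes a concave, increasing curve
    and at least two distinct values on each axis.
    """
    x_n, y_n = normalize(x), normalize(y)
    # Difference curve: how far each normalized point lies above the diagonal.
    differences = [yn - xn for xn, yn in zip(x_n, y_n)]
    knee_index = max(range(len(differences)), key=differences.__getitem__)
    return x[knee_index]


# Ten evenly spaced x values on [0, 1] and the Figure 2 y values.
x = [i / 9 for i in range(10)]
y = [-5.0, 0.26315789, 1.89655172, 2.69230769, 3.16326531,
     3.47457627, 3.69565217, 3.86075949, 3.98876404, 4.09090909]

print(simple_knee(x, y))
```

On this data the sketch selects the third point (x ≈ 0.2222), which agrees with the knee reported by `KneeLocator` in the Usage section.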

## Installation
To install, use pip:

    $ pip install kneed

Or clone the repo and install from source:

    $ git clone https://github.com/arvkevi/kneed.git
    $ cd kneed
    $ python setup.py install

**Tested with Python 3.5 and 3.6**

## Usage
### Reproduce Figure 2 from the paper.
```python
from kneed import DataGenerator, KneeLocator

# Generate the sample curve used in Figure 2 of the paper.
DG = DataGenerator()
x, y = DG.figure2()
print(x, y)
# x: array([0.        , 0.11111111, 0.22222222, 0.33333333, 0.44444444,
#           0.55555556, 0.66666667, 0.77777778, 0.88888889, 1.        ])
# y: array([-5.        , 0.26315789, 1.89655172, 2.69230769, 3.16326531,
#           3.47457627, 3.69565217, 3.86075949, 3.98876404, 4.09090909])

# Locate the knee with sensitivity S=1.0.
kneedle = KneeLocator(x, y, S=1.0, invert=False)
print(kneedle.knee)
# 0.2222... (the x value of the knee)

# Plot the normalized curve with the knee marked.
kneedle.plot_knee_normalized()
```

#### Average Knee from 5000 NoisyGaussians when mu=50 and sigma=10
```python
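import numpy as np
from kneed import DataGenerator, KneeLocator

DG = DataGenerator()
knees = []
for i in range(5000):
    # Draw a fresh noisy Gaussian sample and locate its knee.
    x, y = DG.noisy_gaussian(mu=50, sigma=10, N=1000)
    kneedle = KneeLocator(x, y)
    knees.append(kneedle.knee)

print(np.mean(knees))
# 60.921051806064931 in the original run; the exact value varies with the random samples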
```
## Application
### Find the optimal number of clusters (k) to use in k-means clustering
See the tutorial in the notebooks folder. Because the curve used to choose k decreases as k grows, pass the `direction` keyword argument:
```python
KneeLocator(x, y, direction='decreasing')
```
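
The sketch below illustrates why the direction matters for a decreasing curve. It is a simplified pure-Python stand-in, not the `kneed` implementation: after normalizing both axes, it picks the point that falls farthest below the straight line from (0, 1) to (1, 0). The helper name `simple_elbow` and the synthetic curve `1 / k` are assumptions chosen for illustration; real k-means scores would come from your own clustering runs.

```python
def normalize(values):
    """Rescale a sequence linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def simple_elbow(x, y):
    """Return the x value where a decreasing, convex curve bends most sharply.

    Simplified illustration only: no smoothing and no sensitivity threshold.
    """
    x_n, y_n = normalize(x), normalize(y)
    # Gap between the descending diagonal (1 - x) and the normalized curve.
    gaps = [(1.0 - xn) - yn for xn, yn in zip(x_n, y_n)]
    best = max(range(len(gaps)), key=gaps.__getitem__)
    return x[best]


# Synthetic stand-in for a within-cluster error curve: it drops quickly, then flattens.
ks = list(range(1, 11))
scores = [1.0 / k for k in ks]

print(simple_elbow(ks, scores))
```

For this synthetic curve the largest gap occurs at k = 3, the point after which adding more clusters yields only small improvements.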

## Contributing
I welcome contributions. If you have suggestions or would like to make improvements, please submit an issue or pull request.

## Citation
Finding a “Kneedle” in a Haystack: Detecting Knee Points in System Behavior

Ville Satopa†, Jeannie Albrecht†, David Irwin‡, and Barath Raghavan§

†Williams College, Williamstown, MA

‡University of Massachusetts Amherst, Amherst, MA

§International Computer Science Institute, Berkeley, CA