Kolmogorov-Smirnov metric for machine learning

Project description

The Kolmogorov-Smirnov metric (KS metric) is derived from the K-S test, which measures the maximum distance between two cumulative distribution functions (CDFs). To use it as a metric for a classification problem, we compare the CDFs of the predicted scores for the target and non-target classes. The model that produces the greatest separation between the target and non-target distributions is considered the better model.
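The idea above can be sketched in a few lines of plain Python. This is only an illustration of the statistic, not the package's implementation; the function name ks_statistic is hypothetical, and labels are assumed to be 0 (non-target) and 1 (target).

```python
from bisect import bisect_right


def ks_statistic(y_true, y_score):
    """Largest gap between the empirical CDFs of the two classes' scores.

    Illustrative sketch only; assumes binary labels 0 and 1.
    """
    pos = sorted(s for y, s in zip(y_true, y_score) if y == 1)
    neg = sorted(s for y, s in zip(y_true, y_score) if y == 0)
    if not pos or not neg:
        raise ValueError("both classes must be present")

    best = 0.0
    # The empirical CDFs only change at observed scores, so it is
    # enough to evaluate the gap at each distinct score value.
    for t in sorted(set(pos) | set(neg)):
        cdf_pos = bisect_right(pos, t) / len(pos)  # share of targets with score <= t
        cdf_neg = bisect_right(neg, t) / len(neg)  # share of non-targets with score <= t
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Perfectly separated scores give the maximum value of 1.0.
perfect = ks_statistic([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
# Interleaved scores separate the classes only partially.
partial = ks_statistic([0, 1, 0, 1], [0.1, 0.2, 0.3, 0.4])
```

A value of 0 means the two score distributions coincide, and a value of 1 means some threshold separates the classes completely.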

Installation

The package requires pandas and numpy.

To install the package, execute:

$ python setup.py install

or

$ pip install ks_metric

Usage

To get the KS score:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

from ks_metric import ks_score

# Load a binary classification dataset and hold out a test split
data = load_breast_cancer()
X, y = data['data'], data['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# Fit a model, then score the predicted probabilities of the positive class
clf = LogisticRegression(random_state=0, max_iter=10000).fit(X_train, y_train)
ks_score(y_train, clf.predict_proba(X_train)[:, 1])

KS table:

from ks_metric import ks_table

ks_table(y_train, clf.predict_proba(X_train)[:,1])
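The columns returned by ks_table are not described here. As a rough guide to what a KS table conventionally contains, the sketch below sorts observations by score, splits them into equal-sized bins, and tracks the cumulative share of targets and non-targets captured so far. The function name ks_table_sketch, the n_bins parameter, and the column names are all hypothetical and may differ from the package's output.

```python
def ks_table_sketch(y_true, y_score, n_bins=10):
    """Build a conventional KS table as a list of row dicts.

    Illustrative sketch only; assumes binary labels 0 and 1.
    """
    # Highest scores first, as is usual for gains-style tables
    pairs = sorted(zip(y_score, y_true), key=lambda p: p[0], reverse=True)
    n = len(pairs)
    total_pos = sum(1 for _, y in pairs if y == 1)
    total_neg = n - total_pos
    if total_pos == 0 or total_neg == 0:
        raise ValueError("both classes must be present")

    rows = []
    cum_pos = cum_neg = 0
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # fewer observations than bins
        pos = sum(1 for _, y in chunk if y == 1)
        cum_pos += pos
        cum_neg += len(chunk) - pos
        cum_target_rate = cum_pos / total_pos
        cum_non_target_rate = cum_neg / total_neg
        rows.append({
            "bin": b + 1,
            "min_score": chunk[-1][0],
            "max_score": chunk[0][0],
            "cum_target_rate": cum_target_rate,
            "cum_non_target_rate": cum_non_target_rate,
            "ks": abs(cum_target_rate - cum_non_target_rate),
        })
    return rows


rows = ks_table_sketch(
    [1, 1, 1, 0, 1, 0, 0, 0],
    [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2],
    n_bins=4,
)
binned_ks = max(row["ks"] for row in rows)
```

Because the gap is only evaluated at bin boundaries, the largest value in such a table can be somewhat smaller than the exact KS statistic computed over every score.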

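KS scorer (for hyperparameter search):

from sklearn.model_selection import GridSearchCV
from ks_metric import ks_scorer

# max_iter is raised so that every candidate converges on this dataset
clf = GridSearchCV(estimator=LogisticRegression(max_iter=10000), param_grid={'C': [0.01, 0.1, 1]}, scoring=ks_scorer)
clf.fit(X_train, y_train)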
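How ks_scorer is built internally is not shown here. scikit-learn accepts any callable with the signature scorer(estimator, X, y) as the scoring argument, so a scorer of this kind could be sketched as follows. The names ks_scorer_sketch, _ks and ThresholdModel are hypothetical, and the stand-in model exists only so the example runs without scikit-learn.

```python
def _ks(y_true, y_score):
    """Maximum gap between the two classes' empirical score CDFs (labels 0/1)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    gaps = []
    for t in set(pos) | set(neg):
        cdf_pos = sum(1 for s in pos if s <= t) / len(pos)
        cdf_neg = sum(1 for s in neg if s <= t) / len(neg)
        gaps.append(abs(cdf_pos - cdf_neg))
    return max(gaps)


def ks_scorer_sketch(estimator, X, y):
    """Callable scorer: higher is better, as scikit-learn expects."""
    proba = estimator.predict_proba(X)
    # Second column holds the probability of the positive class
    scores = [row[1] for row in proba]
    return _ks(y, scores)


class ThresholdModel:
    """Stand-in estimator that treats the first feature as a probability."""

    def predict_proba(self, X):
        return [[1.0 - x[0], x[0]] for x in X]


score = ks_scorer_sketch(ThresholdModel(), [[0.1], [0.2], [0.8], [0.9]], [0, 0, 1, 1])
```

A callable like this could be passed as scoring=ks_scorer_sketch to GridSearchCV with any estimator that implements predict_proba.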

See the example notebook for detailed usage.

Download files

Source Distribution

ks_metric-0.2.0.tar.gz (10.7 kB)

Built Distribution

ks_metric-0.2.0-py2.py3-none-any.whl (4.5 kB, Python 2 and Python 3)
