Confidence intervals and p-values for scikit-learn.

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Statkit

Supplement your scikit-learn models with 95 % confidence intervals, p-values, and decision curves.

Description

Estimate 95 % confidence intervals for your test scores.

For example, to compute a 95 % confidence interval of the area under the receiver operating characteristic curve (ROC AUC):

from sklearn.metrics import roc_auc_score
from statkit.non_parametric import bootstrap_score

# Assumes `model` is a fitted classifier and `X_test`, `y_test` are held-out data.
y_prob = model.predict_proba(X_test)[:, 1]
auc_95ci = bootstrap_score(y_test, y_prob, metric=roc_auc_score)
print('Area under the ROC curve:', auc_95ci)
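
To make the idea behind a bootstrapped interval concrete, here is a minimal standard-library sketch of a percentile bootstrap. It is not statkit's implementation: the helper names `roc_auc` and `percentile_bootstrap`, the parameters `n_iterations`, `alpha`, and `seed`, and the choice of the percentile method are all assumptions made for illustration.

```python
import random


def roc_auc(y_true, y_score):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("Need at least one positive and one negative label.")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def percentile_bootstrap(y_true, y_score, metric, n_iterations=1000, alpha=0.05, seed=0):
    """Return (point estimate, lower bound, upper bound) of `metric`."""
    # Computing the point estimate first also rejects single-class data early.
    point = metric(y_true, y_score)
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    while len(scores) < n_iterations:
        # Resample the test set with replacement, keeping labels and scores paired.
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            scores.append(metric([y_true[i] for i in idx], [y_score[i] for i in idx]))
        except ValueError:
            continue  # the resample contained a single class; draw again
    scores.sort()
    lower = scores[int((alpha / 2) * n_iterations)]
    upper = scores[int((1 - alpha / 2) * n_iterations) - 1]
    return point, lower, upper


y_true = [0, 0, 1, 1, 0, 1, 0, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7, 0.55, 0.3]
point, lower, upper = percentile_bootstrap(y_true, y_score, roc_auc)
print(f"AUC {point:.2f} (95 % CI: {lower:.2f} to {upper:.2f})")
```

With only ten test samples the interval is wide, which is exactly the uncertainty that a single point estimate hides.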

Compute a p-value to test whether one model is significantly better than another.

For example, to test whether the area under the receiver operating characteristic curve (ROC AUC) of model 1 is significantly larger than that of model 2:

from sklearn.metrics import roc_auc_score
from statkit.non_parametric import paired_permutation_test

# Assumes `model_1` and `model_2` are fitted classifiers evaluated on the same test set.
y_pred_1 = model_1.predict_proba(X_test)[:, 1]
y_pred_2 = model_2.predict_proba(X_test)[:, 1]
p_value = paired_permutation_test(y_test, y_pred_1, y_pred_2, metric=roc_auc_score)
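
The logic of a paired permutation test can be sketched with the standard library alone. The sketch below is an illustration under stated assumptions, not statkit's code: the function `paired_permutation_p_value`, its parameters, the one-sided alternative, and the add-one correction are choices made here, and `accuracy` is only a simple stand-in for whichever metric you pass.

```python
import random


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


def paired_permutation_p_value(y_true, y_pred_1, y_pred_2, metric, n_iterations=2000, seed=0):
    """One-sided p-value for the alternative metric(model 1) > metric(model 2)."""
    rng = random.Random(seed)
    observed = metric(y_true, y_pred_1) - metric(y_true, y_pred_2)
    n_extreme = 0
    for _ in range(n_iterations):
        perm_1, perm_2 = [], []
        for a, b in zip(y_pred_1, y_pred_2):
            # Under the null hypothesis the two models are exchangeable,
            # so each pair of predictions is swapped with probability 1/2.
            if rng.random() < 0.5:
                a, b = b, a
            perm_1.append(a)
            perm_2.append(b)
        if metric(y_true, perm_1) - metric(y_true, perm_2) >= observed:
            n_extreme += 1
    # The add-one correction keeps the estimated p-value strictly positive.
    return (n_extreme + 1) / (n_iterations + 1)


y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
good = [float(y) for y in y_true]      # predictions that match every label
bad = [1.0 - float(y) for y in y_true]  # predictions that miss every label
print(paired_permutation_p_value(y_true, good, bad, accuracy))
```

Because the swap is applied per sample, the pairing between the two models' predictions on the same test case is preserved, which is what makes the test paired.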

Perform decision curve analysis by making net benefit plots of your scikit-learn models. Compare the utility of different models with each other and with the reference policies of always or never taking an action/intervention.

Net benefit curve

from matplotlib import pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from statkit.decision import NetBenefitDisplay

centers = [[0, 0], [1, 1]]
X_train, y_train = make_blobs(
    centers=centers, cluster_std=1, n_samples=20, random_state=5
)
X_test, y_test = make_blobs(
    centers=centers, cluster_std=1, n_samples=20, random_state=1005
)

baseline_model = LogisticRegression(random_state=5).fit(X_train, y_train)
y_pred_base = baseline_model.predict_proba(X_test)[:, 1]

tree_model = GradientBoostingClassifier(random_state=5).fit(X_train, y_train)
y_pred_tree = tree_model.predict_proba(X_test)[:, 1]

NetBenefitDisplay.from_predictions(y_test, y_pred_base, name='Baseline model')
NetBenefitDisplay.from_predictions(y_test, y_pred_tree, name='Gradient boosted trees', show_references=False, ax=plt.gca())
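
For readers new to decision curve analysis, the quantity on the vertical axis of such a plot is usually defined as net benefit = TP/N - (FP/N) * p_t / (1 - p_t), where p_t is the probability threshold above which you act. The sketch below computes it by hand. The function names are hypothetical, and it is an assumption here that `NetBenefitDisplay` follows this standard definition.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on every sample with predicted probability >= threshold."""
    if not 0 <= threshold < 1:
        raise ValueError("threshold must lie in [0, 1).")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # The threshold odds express how many false positives one true positive is worth.
    odds = threshold / (1 - threshold)
    return tp / n - odds * fp / n


def net_benefit_treat_all(y_true, threshold):
    """Reference policy: act on everyone, regardless of the model."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_prob = [0.9, 0.7, 0.6, 0.2, 0.4, 0.1, 0.3, 0.8]
for threshold in (0.2, 0.5):
    print(threshold, net_benefit(y_true, y_prob, threshold), net_benefit_treat_all(y_true, threshold))
```

The "never act" policy has a net benefit of zero at every threshold, so a model is only useful at thresholds where its curve lies above both zero and the treat-all curve.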

Detailed documentation can be found on the Statkit API documentation pages.

Installation

pip3 install statkit

Support

You can open a ticket in the Issue tracker.

Contributing

We welcome contributions. If you open a pull request, make sure that your code is:

- well documented,
- formatted with black, and
- accompanied by a unit test.

Authors and acknowledgment

Hylke C. Donker

License

This code is licensed under the MIT license.

Release history

- 1.1.0: Nov 27, 2025
- 1.0.3: Aug 2, 2025
- 1.0.1: Jul 3, 2025
- 1.0.0: Jun 24, 2024
- 0.2.7: Jun 15, 2024
- 0.2.6: Jun 6, 2024
- 0.2.5: Feb 22, 2024
- 0.2.4: Jan 29, 2024
- 0.2.3: Oct 26, 2023
- 0.2.2: Oct 26, 2023 (this version)
- 0.2.1: Mar 15, 2023
- 0.1.2: Feb 4, 2023
- 0.1.1: Oct 30, 2022
- 0.1.0: Jul 25, 2022
- 0.0.4: Jul 18, 2022
- 0.0.3: Jun 1, 2022
- 0.0.2: May 14, 2022
- 0.0.1: May 1, 2022

Download files

Source Distribution

statkit-0.2.2.tar.gz (40.0 kB)

Uploaded Oct 26, 2023 Source

Built Distribution

statkit-0.2.2-py3-none-any.whl (17.1 kB)

Uploaded Oct 26, 2023 Python 3

File details

Details for the file statkit-0.2.2.tar.gz.

File metadata

Download URL: statkit-0.2.2.tar.gz
Upload date: Oct 26, 2023
Size: 40.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.11.4

File hashes

Hashes for statkit-0.2.2.tar.gz
Algorithm	Hash digest
SHA256	`ef6271d1303071e9e7ee704304a761c1ba97c8f103f6c367b65d56087f4b3461`
MD5	`0596a9e849d9dcfcfbfda589f6b6149e`
BLAKE2b-256	`9347a927b730edb235241978c69f1dc43b30723a6e3b5805f6d8937f5b0f2e96`

File details

Details for the file statkit-0.2.2-py3-none-any.whl.

File metadata

Download URL: statkit-0.2.2-py3-none-any.whl
Upload date: Oct 26, 2023
Size: 17.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.11.4

File hashes

Hashes for statkit-0.2.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`20582acdd8165dd5447a919543b01e665574c375e12a0902782e9ace351acd1b`
MD5	`d8f06f829cf0795f2889909df874953f`
BLAKE2b-256	`64291dd85d69ec8f60d77b6865512bb2cb6179c5b0c932005b56046cae7454ea`
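
The digests listed for each file can be used to check a download before installing it. A minimal sketch using Python's standard `hashlib` module follows. The helper names `sha256_of_file` and `matches_expected` are hypothetical, and the local file path is an assumption about where you saved the download.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hexadecimal SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with a published one, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_expected("statkit-0.2.2.tar.gz", "<SHA256 digest from the table>")` should return `True` for an intact copy of the source distribution.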
