A Python library for estimating confidence intervals around accuracy and sample sizes for classification experiments.
confidence-planner
The confidence-planner package provides Python implementations of estimation procedures for confidence intervals around classification accuracy. The package currently features approximations for holdout, bootstrap, cross-validation, and progressive validation experiments. For information on how to install and use the package, read on or take a look at our demonstration video below. To experiment with different estimation procedures, go to the accompanying web application at https://prediction-confidence-planner.herokuapp.com/.
Installing confidence-planner
To install confidence-planner, just execute:
pip install confidence-planner
Afterwards, you can import confidence_planner and use all its functions.
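To confirm that the installation succeeded, one option is to query the installed distribution's version with the standard library. The sketch below assumes only that the distribution is named confidence-planner, as in the pip command above; the helper name installed_version is hypothetical and not part of the package.

```python
from importlib import metadata


def installed_version(dist_name="confidence-planner"):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("confidence-planner is not installed in this environment")
    else:
        print(f"confidence-planner {version} is installed")
```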
Quickstart
from sklearn import datasets, svm, metrics
from sklearn.model_selection import train_test_split
import confidence_planner as cp
# example dataset
X, y = datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=23
)
# training the classifier and calculating accuracy
clf = svm.SVC(gamma=0.001)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
acc = metrics.accuracy_score(y_test, y_pred)
# confidence interval and sample size estimation
ci = cp.estimate_confidence_interval(y_test.shape[0], acc, confidence_level=0.90)
sample = cp.estimate_sample_size(interval_radius=0.05, confidence_level=0.90)
print(f"90% CI: {ci}")
print(f"Samples needed for a 0.05 radius 90% CI: {sample}")
More code examples (including cross-validation and bootstrapping) can be found in the examples folder.
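To give a feel for the kind of calculation behind a holdout estimate, here is a minimal sketch based on Hoeffding's inequality. This is an illustration under stated assumptions, not the package's implementation: confidence-planner may use a different or tighter bound, and the function names below (hoeffding_radius, hoeffding_interval, hoeffding_sample_size) are hypothetical. The sketch assumes the test examples are independent and that accuracy is the mean of 0/1 outcomes.

```python
import math


def hoeffding_radius(n_samples, confidence_level=0.90):
    """Half-width of a two-sided Hoeffding interval for accuracy on n_samples test points."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    delta = 1.0 - confidence_level
    # Hoeffding: P(|acc_hat - acc| >= r) <= 2 * exp(-2 * n * r^2) = delta
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n_samples))


def hoeffding_interval(n_samples, accuracy, confidence_level=0.90):
    """Interval around the observed accuracy, clipped to the valid range [0, 1]."""
    radius = hoeffding_radius(n_samples, confidence_level)
    return max(0.0, accuracy - radius), min(1.0, accuracy + radius)


def hoeffding_sample_size(interval_radius, confidence_level=0.90):
    """Smallest test-set size whose Hoeffding radius does not exceed interval_radius."""
    if interval_radius <= 0.0:
        raise ValueError("interval_radius must be positive")
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    delta = 1.0 - confidence_level
    return math.ceil(math.log(2.0 / delta) / (2.0 * interval_radius ** 2))


if __name__ == "__main__":
    print(hoeffding_interval(171, 0.92, confidence_level=0.90))
    print(hoeffding_sample_size(0.05, confidence_level=0.90))  # 600
```

Under this bound, the radius depends only on the number of test samples and the confidence level, not on the observed accuracy, which is why a sample size can be planned before any experiment is run.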
References
Confidence-planner methods belong to the field of frequentist statistics.
[1] Langford, J.: Tutorial on practical prediction theory for classification. Journal of Machine Learning Research 6, 273–306 (2005).
[2] Blum, A., Kalai, A., Langford, J.: Beating the hold-out: Bounds for k-fold and progressive cross-validation. Proceedings of the Twelfth Annual Conference on Computational Learning Theory, COLT (1999).
[3] Puth, M.T., Neuhäuser, M., Ruxton, G.: On the variety of methods for calculating confidence intervals by bootstrapping. Journal of Animal Ecology 84 (2015).
License
Confidence-planner is free and open-source software licensed under the MIT license.
Contact
The best way to ask questions is via the GitHub Discussions channel. If you encounter bugs while using the package, please don't hesitate to report them directly in the GitHub issue tracker.
File details
Details for the file confidence-planner-0.1.3.tar.gz.
File metadata
- Download URL: confidence-planner-0.1.3.tar.gz
- Upload date:
- Size: 9.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.9.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | c37b9eb9d5c688e6615fa59a1d663b206ba969ce170e2674202f542fc7533d46
MD5 | bc5f5b693db378ecb6ceda87d10cb181
BLAKE2b-256 | 0f05d77fcaf1b6b2f03f0982b65c4a5e1593dd942c26134c983eb974faee2cea
File details
Details for the file confidence_planner-0.1.3-py3-none-any.whl.
File metadata
- Download URL: confidence_planner-0.1.3-py3-none-any.whl
- Upload date:
- Size: 8.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.9.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 7820c9d6e97e467611e41529a479630742b7d38f691f4aec9531a549236a71ca
MD5 | 06ff34ab5b5ff11205ff17077b3f3c56
BLAKE2b-256 | edba303c8c9e5b3c7598e77ef2d0434664347b45b6eb05f3aaac299c23ef8702
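A downloaded file can be checked against the SHA256 digests listed above using only the standard library. The following is a minimal sketch; the helper names (sha256_of, matches_published_hash) are hypothetical, and the local file path in the usage section is an assumption about where the download was saved.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed local path; the digest is the SHA256 value listed for the source distribution.
    local_file = "confidence-planner-0.1.3.tar.gz"
    published = "c37b9eb9d5c688e6615fa59a1d663b206ba969ce170e2674202f542fc7533d46"
    try:
        print("hash matches:", matches_published_hash(local_file, published))
    except FileNotFoundError:
        print(f"{local_file} not found; download it first")
```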