Probabilistic Scoring List classifier
Probabilistic Scoring Lists
Probabilistic scoring lists (PSLs) are incremental models that evaluate one feature of the dataset at a time. PSLs can be seen as an extension of scoring systems in two ways:
- they can be evaluated at any stage, which allows trading off model complexity against prediction speed.
- they provide a probability distribution over scores instead of hard thresholds.
Scoring systems are used as decision support for human experts in domains such as medicine and law.
The implementation adheres to the scikit-learn API.
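To make the stage-wise evaluation concrete, here is a minimal sketch of how a scoring list could be read by hand. It is not the skpsl implementation. The scores and probabilities are copied from stages 0 to 2 of the `inspect` output in the Usage section; the names `STAGE_SCORES`, `STAGE_PROBAS` and `predict_proba_at_stage` are hypothetical, and the sketch assumes that a feature's score is added to the total when the binary feature equals 1 and that features are already ordered by stage.

```python
# Illustrative only: a hand-written scoring list, not the skpsl internals.
# Assumption: a stage's score is added when its binary feature equals 1.
STAGE_SCORES = [2, -1]  # scores of the features used at stages 1 and 2

# One lookup table per stage: total score -> estimated positive-class probability.
STAGE_PROBAS = [
    {0: 0.54},                               # stage 0: no feature evaluated yet
    {0: 0.18, 2: 0.97},                      # stage 1: first feature evaluated
    {-1: 0.00, 0: 0.28, 1: 0.91, 2: 1.00},   # stage 2: first two features evaluated
]


def predict_proba_at_stage(features, stage):
    """Return the probability after evaluating the first `stage` features."""
    total = sum(
        score * value
        for score, value in zip(STAGE_SCORES[:stage], features[:stage])
    )
    return STAGE_PROBAS[stage][total]


sample = [1, 1]  # both features present
for stage in range(3):
    print(stage, predict_proba_at_stage(sample, stage))
# 0 0.54
# 1 0.97
# 2 0.91
```

Stopping at an earlier stage uses fewer features and gives a coarser estimate; each further stage refines the probability with one more feature.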
Install
pip install scikit-psl
Usage
from sklearn.datasets import make_classification
from sklearn.model_selection import ShuffleSplit
from skpsl import ProbabilisticScoringList
# Generate synthetic data with continuous features and a binary target variable
X, y = make_classification(random_state=42)
# Binarize the features: 1 if the value exceeds 0.5, otherwise 0
X = (X > .5).astype(int)
# The list contains the candidate scores that can be assigned to a feature
psl = ProbabilisticScoringList([-1, 1, 2])
# A single 80/20 train/test split
for train, test in ShuffleSplit(1, test_size=.2, random_state=42).split(X):
    psl.fit(X[train], y[train])
    print(f"Brier score: {psl.score(X[test], y[test]):.4f}")
    #> Brier score: 0.1924 (lower is better)
# Tabular view of the fitted model up to stage 5
df = psl.inspect(5)
print(df.to_string(index=False, na_rep="-", justify="center", float_format=lambda x: f"{x:.2f}"))
#> Stage  Score  T = -3  T = -2  T = -1  T = 0  T = 1  T = 2  T = 3
#>     0      -       -       -       -   0.54      -      -      -
#>     1   2.00       -       -       -   0.18      -   0.97      -
#>     2  -1.00       -       -    0.00   0.28   0.91   1.00      -
#>     3  -1.00       -    0.00    0.07   0.86   0.91   1.00      -
#>     4   1.00       -    0.00    0.00   0.29   0.92   1.00   1.00
#>     5  -1.00    0.00    0.00    0.00   0.40   1.00   1.00   1.00
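The example above reports a Brier score. As a reminder of what that number measures, the following is a small sketch of the standard definition for binary outcomes: the mean squared difference between the predicted probability and the 0/1 label. The helper name `brier_score` and the toy data are made up for illustration; this is not a claim about how `psl.score` is implemented internally.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(p_pred):
        raise ValueError("y_true and p_pred must have the same length")
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


# Toy data (made up): true labels and predicted positive-class probabilities.
y_true = [1, 0, 1, 0]
p_pred = [0.9, 0.2, 0.6, 0.5]
print(f"{brier_score(y_true, p_pred):.4f}")
# 0.1150
```

A model that always predicts 0.5 scores 0.25 under this definition, which gives a rough baseline for reading the value reported above.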
Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
0.2.0 - 2023-08-10
Added
- PSL classifier
- introduced parallelization
- implemented l-step lookahead
- simple inspect(·) method that creates a tabular representation of the model
0.1.0 - 2023-08-08
Added
- Initial implementation of the PSL algorithm