Classifier two-sample tests and related tests.
samesame
Same, same but different ...
samesame implements classifier two-sample tests (CTSTs) and, as a bonus
extension, a noninferiority test (NIT).
In existing libraries, these were either missing or implemented with some
tradeoffs (looking at you, sample-splitting). And so, samesame fills in
the gaps :)
Motivation
What is samesame good for? It is for data (model) validation, performance
monitoring, drift detection (dataset shift), statistical process control,
covariate balance, and so on.
For instance, this motivating example comes from the related R package dsos.
Installation
To install, run the following command:
python -m pip install samesame
Quick Start
Simulate outlier scores to test for no adverse shift when the null (no shift) holds.
from samesame.ctst import CTST
from samesame.nit import DSOS
from sklearn.metrics import roc_auc_score
import numpy as np
n_size = 600
rng = np.random.default_rng(123_456)
os_train = rng.normal(size=n_size)
os_test = rng.normal(size=n_size)
null_ctst = CTST.from_samples(os_train, os_test, metric=roc_auc_score)
null_dsos = DSOS.from_samples(os_train, os_test)
In this example, we reject the null of equal distribution (i.e. CTST) at the
5% level, even though both samples are drawn from the same distribution:
print(f"{null_ctst.pvalue=:.4f}")
# null_ctst.pvalue=0.0358
However, we fail to reject the null of no adverse shift (i.e. DSOS), meaning
that the test sample (os_test) does not seem to contain disproportionately
more outliers than the training sample (os_train):
print(f"{null_dsos.pvalue=:.4f}")
# null_dsos.pvalue=0.9500
This is the type of false alarm that samesame can highlight by comparing
tests of equal distribution to noninferiority tests.
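The contrast between the two kinds of test can be sketched without samesame. The snippet below is a minimal illustration, not the library's implementation: it assumes that higher scores mean more outlying, uses a plain AUC as the statistic for both tests, and calibrates p-values by permutation. The helpers `auc` and `permutation_pvalues` are hypothetical names introduced here. The two-sided p-value plays the role of a test of equal distribution; the one-sided p-value only counts a test sample with larger scores as evidence against the null, which is the spirit of a noninferiority test.

```python
import random


def auc(reference, test):
    """Mann-Whitney estimate of P(test score > reference score); ties count 1/2."""
    pooled = sorted([(x, 0) for x in reference] + [(x, 1) for x in test])
    rank_sum, i = 0.0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2  # average of the ranks i+1 .. j in a tie group
        rank_sum += mid_rank * sum(label for _, label in pooled[i:j])
        i = j
    n_ref, n_test = len(reference), len(test)
    return (rank_sum - n_test * (n_test + 1) / 2) / (n_ref * n_test)


def permutation_pvalues(reference, test, n_permutations=500, seed=0):
    """Return (observed AUC, two-sided p-value, one-sided p-value)."""
    rng = random.Random(seed)
    observed = auc(reference, test)
    pooled = list(reference) + list(test)
    n_ref = len(reference)
    null = []
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # break any link between score and sample label
        null.append(auc(pooled[:n_ref], pooled[n_ref:]))
    # Two-sided: any departure from AUC = 0.5 counts as evidence.
    extreme = sum(abs(a - 0.5) >= abs(observed - 0.5) for a in null)
    # One-sided: only larger test scores (assumed: more outliers) count.
    greater = sum(a >= observed for a in null)
    two_sided = (1 + extreme) / (1 + n_permutations)
    one_sided = (1 + greater) / (1 + n_permutations)
    return observed, two_sided, one_sided


# Assumed scenario: the test sample has *lower* scores, i.e. fewer outliers.
rng = random.Random(42)
reference = [rng.gauss(0.0, 1.0) for _ in range(200)]
fewer_outliers = [rng.gauss(-0.5, 1.0) for _ in range(200)]

observed, two_sided, one_sided = permutation_pvalues(reference, fewer_outliers)
print(f"{observed=:.3f} {two_sided=:.4f} {one_sided=:.4f}")
```

In this setup the distributions do differ, so the two-sided test should flag a difference, but the shift is toward fewer outliers, so the one-sided test should not raise an alarm.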
Usage
Functionality
Below, you will find an overview of common modules in samesame.
| Function | Module |
|---|---|
| Bayesian inference | samesame.bayes |
| Classifier two-sample tests (CTSTs) | samesame.ctst |
| Noninferiority tests (NITs) | samesame.nit |
Attributes
When the method is a statistical test, samesame stores the results of
potentially expensive computations in attributes. These attributes, when
available, can be accessed as follows.
| Attribute | Description |
|---|---|
| .statistic | The test statistic for the hypothesis. |
| .null | The null distribution for the hypothesis. |
| .pvalue | The p-value for the hypothesis. |
| .posterior | The posterior distribution for the hypothesis. |
| .bayes_factor | The Bayes factor for the hypothesis. |
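To illustrate the idea of storing expensive results in attributes, here is a minimal sketch using `functools.cached_property` from the standard library. How samesame caches its results internally is not stated here, so treat this as an assumption: the class `LazyPermutationTest` is hypothetical, and the difference in means is only a stand-in statistic. The attribute names mirror those in the table above.

```python
import random
from functools import cached_property
from statistics import mean


class LazyPermutationTest:
    """Hypothetical test object: each result is computed once, on first access."""

    def __init__(self, reference, test, n_permutations=200, seed=0):
        self.reference = list(reference)
        self.test = list(test)
        self.n_permutations = n_permutations
        self.seed = seed

    @cached_property
    def statistic(self):
        # Stand-in statistic: difference in means between the two samples.
        return mean(self.test) - mean(self.reference)

    @cached_property
    def null(self):
        # The expensive step. It runs on first access only; afterwards the
        # resulting list is stored on the instance and returned directly.
        rng = random.Random(self.seed)
        pooled = self.reference + self.test  # new list, inputs stay untouched
        n_ref = len(self.reference)
        draws = []
        for _ in range(self.n_permutations):
            rng.shuffle(pooled)
            draws.append(mean(pooled[n_ref:]) - mean(pooled[:n_ref]))
        return draws

    @cached_property
    def pvalue(self):
        # One-sided permutation p-value, reusing the cached null distribution.
        exceed = sum(d >= self.statistic for d in self.null)
        return (1 + exceed) / (1 + self.n_permutations)


result = LazyPermutationTest([0.1, 0.4, 0.2, 0.3], [0.2, 0.5, 0.1, 0.6])
print(result.statistic, result.pvalue)
print(result.null is result.null)  # same object both times: no recomputation
```

With this pattern, reading `.pvalue` triggers the permutation loop at most once, and later reads of `.null` or `.pvalue` reuse the stored values.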
Examples
To get started, please see the examples in the docs.
Dependencies
samesame has few dependencies beyond the standard library. It will
probably work with some older Python versions. It is, in short, a lightweight
dependency for most machine learning projects. samesame is built on top of,
and is compatible with, scikit-learn and numpy.