Statistical tests for model monitoring, data validation, and drift detection.

samesame

Python Downloads Static Badge UAI 2022 uv Ruff

Same, same but different ...

samesame helps you compare a reference sample with a new sample.

It answers two practical questions:

  • Did anything change? Use test_shift(...).
  • Did things get worse? Use test_adverse_shift(...).

Use it for model monitoring, data validation, drift assessment, or any workflow where you need to compare two groups and determine whether the difference is practically important.

Who is this for?

samesame is useful whenever you need to compare a reference group and a new group, for example:

  • Model monitoring — Does production data still look like training data?
  • Data validation — Does this new batch look like the data I expect?
  • Drift detection — Did something change between last month and this month?
  • Group comparison — Do two customer groups, regions, or experiments look meaningfully different?

Installation

python -m pip install samesame

Quick Start

Suppose you already have one score per row for a reference sample and a new sample. Larger scores should correspond either to more adverse outcomes or to greater unusualness. That score might come from predicted risk, anomaly detection, model confidence, or another monitoring step.

import numpy as np
from samesame import test_adverse_shift, test_shift

# Simulate one score per row; both samples come from the same distribution.
rng = np.random.default_rng(123_456)
reference_values = rng.normal(size=600)
candidate_values = rng.normal(size=600)

# Two-sample check: do the two score distributions differ at all?
shift = test_shift(reference=reference_values, candidate=candidate_values)
print(f"Did anything change? p-value = {shift.pvalue:.4f}")

# Directional check: has the candidate sample moved toward higher (worse) scores?
harm = test_adverse_shift(
    reference=reference_values,
    candidate=candidate_values,
    direction="higher-is-worse",
)
print(f"Did things get worse? p-value = {harm.pvalue:.4f}")

How to read this: a small p-value from test_shift(...) indicates evidence that the new sample differs from the reference sample. A small p-value from test_adverse_shift(...) indicates evidence that it has also shifted in a worse direction. If the first is small and the second is large, the data changed but not in a clearly harmful way.
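The reading rule above can be written down as a small helper. This is only a sketch: the function name interpret and the 0.05 threshold are illustrative assumptions, not part of samesame, and the right threshold depends on your monitoring setup.

```python
def interpret(shift_pvalue: float, harm_pvalue: float, alpha: float = 0.05) -> str:
    """Map the two p-values to a plain-language reading.

    alpha is an illustrative significance threshold, not a samesame default.
    """
    if harm_pvalue < alpha:
        # Evidence that the candidate sample moved in the worse direction.
        return "shifted in a worse direction"
    if shift_pvalue < alpha:
        # The samples differ, but not in a clearly harmful way.
        return "changed, not clearly worse"
    return "no evidence of change"


# Example with made-up p-values: the data changed, but not for the worse.
print(interpret(shift_pvalue=0.003, harm_pvalue=0.62))
```

In practice you would pass shift.pvalue and harm.pvalue from the Quick Start into such a helper.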

How it works

samesame does not compare raw tables directly. The usual workflow is:

  1. Turn each row into one score.
  2. Compare those scores between the reference and candidate groups.

If your data has many columns, that score usually comes from a model: predicted risk, anomaly level, model confidence, or the output of a classifier trained to distinguish the two groups. You can think of it as a scalar summary of how each row should be monitored.

In short, once each row has been reduced to one score, the package runs two checks: test_shift(...) asks whether the groups differ overall, and test_adverse_shift(...) asks whether the candidate group is more concentrated in the adverse tail of the score distribution.
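To make step 2 concrete, here is a minimal, standard-library sketch of comparing scores between two groups with a permutation test on the ROC AUC. It is an illustration of the general idea only, not samesame's implementation: the helper names, the number of resamples, and the use of a plain (unweighted) AUC for the one-sided check are all assumptions made for this example.

```python
import random


def roc_auc(reference, candidate):
    """Probability that a random candidate score exceeds a random reference score.

    Ties count as one half; 0.5 means the groups are indistinguishable by score.
    """
    wins = 0.0
    for c in candidate:
        for r in reference:
            if c > r:
                wins += 1.0
            elif c == r:
                wins += 0.5
    return wins / (len(candidate) * len(reference))


def permutation_pvalues(reference, candidate, n_resamples=200, seed=0):
    """Return (observed AUC, two-sided p-value, one-sided higher-is-worse p-value)."""
    rng = random.Random(seed)
    observed = roc_auc(reference, candidate)
    pooled = list(reference) + list(candidate)
    n_ref = len(reference)
    two_sided = 0
    one_sided = 0
    for _ in range(n_resamples):
        # Shuffling the pooled scores simulates "no difference between groups".
        rng.shuffle(pooled)
        auc = roc_auc(pooled[:n_ref], pooled[n_ref:])
        two_sided += abs(auc - 0.5) >= abs(observed - 0.5)
        one_sided += auc >= observed
    # The add-one correction keeps both p-values strictly positive.
    return (
        observed,
        (two_sided + 1) / (n_resamples + 1),
        (one_sided + 1) / (n_resamples + 1),
    )


# Simulated scores: the candidate group is shifted toward higher values.
rng = random.Random(42)
reference_scores = [rng.gauss(0.0, 1.0) for _ in range(80)]
shifted_scores = [rng.gauss(1.0, 1.0) for _ in range(80)]

auc, p_change, p_worse = permutation_pvalues(reference_scores, shifted_scores)
print(f"AUC = {auc:.3f}, change p = {p_change:.4f}, worse p = {p_worse:.4f}")
```

The two-sided p-value plays the role of "did anything change?", while the one-sided p-value only reacts when candidate scores are higher than reference scores.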

What you get back

Function            Result type         Fields
test_shift          ShiftResult         .statistic, .pvalue, .statistic_name
test_adverse_shift  AdverseShiftResult  .statistic, .pvalue, .direction

If you need the resampling distribution or optional Bayesian output, use samesame.advanced.

Where to go next

Step-by-step examples are available in the documentation:

  • Tutorials
  • How-to guides

API at a glance

  • test_shift(*, reference, candidate, statistic="roc_auc")
  • test_adverse_shift(*, reference, candidate, direction=...)
  • samesame.advanced for sample weights, more resamples, and optional Bayesian output
  • samesame.logit_scores for turning classifier outputs into a confidence score
  • samesame.importance_weights for adjusting for known group differences
  • samesame.bayes_factors for converting between p-values and Bayes factors

Most users can keep the default settings. If your inputs are already binary 0/1 values, test_shift(...) also supports balanced_accuracy and matthews_corrcoef.
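For intuition about the two binary statistics, the sketch below computes balanced accuracy and the Matthews correlation coefficient by hand. It assumes that group membership acts as the label (reference = 0, candidate = 1) and that each 0/1 input acts as the prediction; that framing and the helper names are assumptions for illustration, not a description of samesame's internals.

```python
import math


def confusion_counts(reference, candidate):
    """Treat group membership as the label and each 0/1 value as the prediction."""
    tn = sum(1 for v in reference if v == 0)
    fp = sum(1 for v in reference if v == 1)
    fn = sum(1 for v in candidate if v == 0)
    tp = sum(1 for v in candidate if v == 1)
    return tn, fp, fn, tp


def balanced_acc(reference, candidate):
    """Average of the per-group hit rates; 0.5 means no separation."""
    tn, fp, fn, tp = confusion_counts(reference, candidate)
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def mcc(reference, candidate):
    """Matthews correlation coefficient; 0.0 means no association."""
    tn, fp, fn, tp = confusion_counts(reference, candidate)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Made-up binary flags: ones are rarer in the reference group.
reference_flags = [0, 0, 0, 1]
candidate_flags = [1, 1, 1, 0]

print(balanced_acc(reference_flags, candidate_flags))
print(mcc(reference_flags, candidate_flags))
```

Both statistics sit at their "no difference" value (0.5 and 0.0 respectively) when the flag is equally common in the two groups.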

Dependencies

samesame has minimal dependencies. It is built on top of, and fully compatible with, scikit-learn and numpy.
