
Model drift detection and monitoring for insurance pricing models. PSI, CSI, Gini drift, A/E ratios, calibration checks.


insurance-monitoring


Deployed insurance pricing models go stale. The portfolio ages, the claims environment shifts, regulators change the rules. Without systematic monitoring you find out about it when the loss ratio deteriorates — typically 12 to 18 months after the model started misfiring.

This library gives UK pricing teams the specific tools to catch that drift early: exposure-weighted PSI for feature distribution, A/E ratios with Poisson confidence intervals for calibration, and the Gini drift z-test from arXiv 2510.04556, which to our knowledge is currently the only statistically rigorous actuarial monitoring framework in the literature.

It produces traffic-light outputs (green/amber/red) that match how a Head of Pricing actually reads a monitoring pack, and a decision recommendation based on the Murphy score decomposition: recalibrate (update the intercept, one hour of work) or refit (rebuild the model, weeks of work).

No scikit-learn. No pandas. Polars-native throughout.

Installation

uv add insurance-monitoring

Quick example

import numpy as np
from insurance_monitoring import MonitoringReport, psi, ae_ratio, gini_coefficient

rng = np.random.default_rng(42)

# Reference period (model training window)
pred_ref = rng.uniform(0.05, 0.20, 50_000)
act_ref = rng.poisson(pred_ref).astype(float)

# Current monitoring period (18 months later)
# Portfolio has aged, young drivers more numerous, claim rate up
pred_cur = rng.uniform(0.05, 0.20, 15_000)
act_cur = rng.poisson(pred_cur * 1.08).astype(float)  # model is 8% optimistic

# Quick check: feature drift on model score
score_psi = psi(pred_ref, pred_cur)
print(f"Score PSI: {score_psi:.3f}")  # < 0.10 = stable, > 0.25 = investigate

# A/E ratio (aggregate)
from insurance_monitoring import ae_ratio_ci
ae_result = ae_ratio_ci(act_cur, pred_cur)
print(f"A/E: {ae_result['ae']:.3f}  (95% CI: {ae_result['lower']:.3f} to {ae_result['upper']:.3f})")

# Gini coefficient (discrimination)
gini = gini_coefficient(act_cur, pred_cur)
print(f"Gini: {gini:.3f}")

# Combined monitoring report with traffic lights
report = MonitoringReport(
    reference_actual=act_ref,
    reference_predicted=pred_ref,
    current_actual=act_cur,
    current_predicted=pred_cur,
)
print(report.recommendation)  # 'NO_ACTION' | 'MONITOR_CLOSELY' | 'RECALIBRATE' | 'REFIT' | 'INVESTIGATE'
print(report.to_polars())     # flat DataFrame with metric / value / band columns

Modules

drift - Feature distribution monitoring

from insurance_monitoring.drift import psi, csi, ks_test, wasserstein_distance
import polars as pl

# PSI with exposure weighting (insurance-correct)
score_psi = psi(
    reference=score_train,
    current=score_q1_2025,
    n_bins=10,
    exposure_weights=earned_exposure,  # car-years, not policy count
)

# CSI heatmap across all rating factors
feature_ref = pl.DataFrame({"driver_age": [...], "vehicle_age": [...], "ncd_years": [...]})
feature_cur = pl.DataFrame({"driver_age": [...], "vehicle_age": [...], "ncd_years": [...]})
csi_table = csi(feature_ref, feature_cur, features=["driver_age", "vehicle_age", "ncd_years"])
# Returns: feature | csi | band

# Wasserstein: report drift in original units
d = wasserstein_distance(driver_ages_train, driver_ages_q1_2025)
print(f"Average driver age shifted by {d:.1f} years")

On exposure-weighted PSI: standard PSI treats every policy equally regardless of how long it was on risk. If your book renews quarterly and mixes 1-month and 12-month policies, unweighted PSI is wrong. The exposure_weights parameter weights bin proportions by earned exposure — correct for insurance.
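A minimal sketch of the idea, not the library's implementation. Bin edges come from unweighted reference quantiles, each bin's share is its exposure divided by total exposure, and PSI is the usual sum of (current - reference) × ln(current / reference). The function name `weighted_psi`, the separate `ref_weights` and `cur_weights` arguments, the `eps` floor for empty bins and the one-month exposure for under-25s are all assumptions made for illustration.

```python
import bisect
import math
import random


def weighted_psi(reference, current, ref_weights=None, cur_weights=None,
                 n_bins=10, eps=1e-6):
    """PSI with bin proportions measured in exposure rather than policy count."""
    if ref_weights is None:
        ref_weights = [1.0] * len(reference)
    if cur_weights is None:
        cur_weights = [1.0] * len(current)

    # Interior bin edges: unweighted quantiles of the reference sample (assumption).
    ordered = sorted(reference)
    edges = [ordered[len(ordered) * i // n_bins] for i in range(1, n_bins)]

    def bin_shares(values, weights):
        totals = [0.0] * n_bins
        for value, weight in zip(values, weights):
            totals[bisect.bisect_left(edges, value)] += weight
        grand_total = sum(totals)
        # Floor each share at eps so an empty bin cannot produce log(0).
        return [max(t / grand_total, eps) for t in totals]

    p_ref = bin_shares(reference, ref_weights)
    p_cur = bin_shares(current, cur_weights)
    return sum((c - r) * math.log(c / r) for r, c in zip(p_ref, p_cur))


rng = random.Random(0)
ages_ref = [rng.uniform(17, 80) for _ in range(5000)]
ages_cur = [rng.uniform(17, 80) for _ in range(5000)]

# Hypothetical book: under-25s hold one-month policies, everyone else annual.
exposure_cur = [1 / 12 if age < 25 else 1.0 for age in ages_cur]

psi_by_count = weighted_psi(ages_ref, ages_cur)
psi_by_exposure = weighted_psi(ages_ref, ages_cur, cur_weights=exposure_cur)
print(f"PSI by policy count: {psi_by_count:.3f}")
print(f"PSI by exposure:     {psi_by_exposure:.3f}")
```

By policy count the two periods are drawn from the same age distribution, so the unweighted figure stays small. Measured in car-years, the youngest band has almost vanished from the current period, and the weighted figure picks that up.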

calibration - A/E ratio and calibration checks

from insurance_monitoring.calibration import ae_ratio, ae_ratio_ci

# Aggregate A/E with Poisson CI (exact Garwood intervals)
result = ae_ratio_ci(actual, predicted, exposure=exposure)
# {'ae': 1.08, 'lower': 0.97, 'upper': 1.20, 'n_claims': 342, 'n_expected': 317}

# Segmented A/E: where is the model misfiring?
seg_ae = ae_ratio(
    actual, predicted, exposure=exposure,
    segments=driver_age_bands,   # np.array(['17-24', '25-39', ...])
)
# Returns Polars DataFrame: segment | actual | expected | ae_ratio | n_policies
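For readers who want to see what an exact Poisson interval on A/E involves, here is a standard-library sketch. It treats the total claim count as Poisson and finds the Garwood limits by bisection on the Poisson CDF instead of using chi-square quantiles. The names `poisson_cdf`, `bisect_root`, `garwood_interval` and `ae_with_interval`, and the 120-versus-100 example, are illustrative assumptions and not the library's API.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu); terms are built in log space to avoid overflow."""
    if k < 0:
        return 0.0
    log_mu = math.log(mu)
    total = sum(math.exp(i * log_mu - mu - math.lgamma(i + 1)) for i in range(k + 1))
    return min(total, 1.0)


def bisect_root(f, lo, hi, iterations=100):
    """Root of a function that is positive at lo and negative at hi."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def garwood_interval(n_claims, alpha=0.05):
    """Exact two-sided interval for a Poisson mean, given an observed count."""
    upper_bracket = n_claims + 10 * math.sqrt(n_claims + 1) + 20
    # Lower limit: the mean at which P(X >= n_claims) equals alpha / 2.
    if n_claims == 0:
        lower = 0.0
    else:
        lower = bisect_root(
            lambda mu: alpha / 2 - (1 - poisson_cdf(n_claims - 1, mu)),
            0.0, upper_bracket)
    # Upper limit: the mean at which P(X <= n_claims) equals alpha / 2.
    upper = bisect_root(lambda mu: poisson_cdf(n_claims, mu) - alpha / 2,
                        0.0, upper_bracket)
    return lower, upper


def ae_with_interval(claims, expected, alpha=0.05):
    """A/E ratio with the claim-count interval rescaled by expected claims."""
    lower, upper = garwood_interval(claims, alpha)
    return {"ae": claims / expected,
            "lower": lower / expected,
            "upper": upper / expected}


result = ae_with_interval(claims=120, expected=100.0)
print(f"A/E {result['ae']:.3f}  (95% CI {result['lower']:.3f} to {result['upper']:.3f})")
```

The interval width depends only on the number of observed claims, which is why thinly populated segments produce A/E ratios that look alarming but are statistically uninformative.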

On the IBNR problem: the A/E ratio is only reliable on mature accident periods. For motor, that means at least 12 months of claims development. For liability, 24+ months. If you run monthly monitoring on recent accident months, apply chain-ladder development factors first — otherwise you will see artificially low A/E ratios that recover as claims develop.
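The adjustment described above can be sketched as follows. The development factors, claim counts and the helper name `ae_ratio_developed` are hypothetical; real factors would come from a chain-ladder fit to your own claims triangle, which is not shown here.

```python
# Hypothetical cumulative development factors (reported count -> ultimate count),
# keyed by months of development. Real factors come from your own claims triangle.
FACTOR_TO_ULTIMATE = {1: 2.00, 2: 1.60, 3: 1.40}


def ae_ratio_developed(reported, expected, months_developed, factors):
    """A/E after grossing each accident month up to its estimated ultimate."""
    ultimate = sum(r * factors[m] for r, m in zip(reported, months_developed))
    return ultimate / sum(expected)


reported = [62, 55, 40]          # claims reported so far, oldest accident month first
expected = [80.0, 80.0, 80.0]    # model-expected claims for the same months
months_developed = [3, 2, 1]

raw_ae = sum(reported) / sum(expected)
developed_ae = ae_ratio_developed(reported, expected, months_developed,
                                  FACTOR_TO_ULTIMATE)
print(f"Raw A/E:       {raw_ae:.3f}")
print(f"Developed A/E: {developed_ae:.3f}")
```

On these made-up numbers the raw A/E is about 0.65, which would read as a model that badly over-predicts, while the developed figure is about 1.06.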

discrimination - Gini drift test

from insurance_monitoring.discrimination import gini_coefficient, gini_drift_test

gini_ref = gini_coefficient(act_ref, pred_ref, exposure=exp_ref)
gini_cur = gini_coefficient(act_cur, pred_cur, exposure=exp_cur)

# Statistical test: has Gini degraded significantly?
# Implements arXiv 2510.04556 Theorem 1
result = gini_drift_test(
    reference_gini=gini_ref,
    current_gini=gini_cur,
    n_reference=50_000,
    n_current=15_000,
    reference_actual=act_ref, reference_predicted=pred_ref,
    current_actual=act_cur, current_predicted=pred_cur,
)
# {'z_statistic': -1.93, 'p_value': 0.054, 'gini_change': -0.03, 'significant': False}

The Gini drift test is the distinguishing feature of this library. Most monitoring tools will tell you whether A/E has moved. This tells you whether the model's ranking has degraded — the difference between a cheap recalibration and a full refit.
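To make the mechanics concrete, the sketch below computes a Lorenz-curve Gini and a two-sample z-test on its change. It does not reproduce the variance estimator from arXiv 2510.04556; it swaps in a bootstrap standard error as a stand-in, and it assumes equal exposure per policy. The function names, the synthetic data and the bootstrap settings are all assumptions for illustration.

```python
import math
import random
import statistics


def gini(actual, predicted):
    """Gini from the Lorenz curve of actual claims, risks ordered by prediction."""
    order = sorted(range(len(actual)), key=lambda i: predicted[i])
    total = sum(actual)
    step = 1.0 / len(actual)            # equal exposure per policy (assumption)
    cumulative = area = 0.0
    for i in order:
        previous = cumulative
        cumulative += actual[i] / total
        area += step * (previous + cumulative) / 2   # trapezoid under the curve
    return 1.0 - 2.0 * area


def bootstrap_se(actual, predicted, n_boot, rng):
    """Standard error of the Gini from resampling policies with replacement."""
    n = len(actual)
    draws = []
    for _ in range(n_boot):
        idx = rng.choices(range(n), k=n)
        draws.append(gini([actual[i] for i in idx], [predicted[i] for i in idx]))
    return statistics.stdev(draws)


def gini_drift_z(reference, current, n_boot=100, seed=0):
    """Two-sample z-test on the change in Gini; each input is (actual, predicted)."""
    rng = random.Random(seed)
    g_ref, g_cur = gini(*reference), gini(*current)
    se = math.hypot(bootstrap_se(*reference, n_boot, rng),
                    bootstrap_se(*current, n_boot, rng))
    z = (g_cur - g_ref) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
    return {"gini_change": g_cur - g_ref, "z_statistic": z, "p_value": p_value}


def simulate(n, informative, rng):
    """Synthetic book: true claim rates, 0/1 claims, and model predictions."""
    rates = [rng.uniform(0.02, 0.30) for _ in range(n)]
    claims = [1.0 if rng.random() < r else 0.0 for r in rates]
    predictions = rates if informative else [rng.random() for _ in range(n)]
    return claims, predictions


rng = random.Random(42)
reference = simulate(4000, informative=True, rng=rng)    # model ranks risks well
current = simulate(4000, informative=False, rng=rng)     # ranking has collapsed
result = gini_drift_z(reference, current)
print(result)
```

The current period here is an extreme case in which predictions carry no information at all, so the test rejects decisively. Real degradation is usually gradual, which is where a formal standard error matters most.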

report - Combined monitoring in one call

from insurance_monitoring import MonitoringReport

report = MonitoringReport(
    reference_actual=act_ref,
    reference_predicted=pred_ref,
    current_actual=act_cur,
    current_predicted=pred_cur,
    exposure=exposure_cur,
    reference_exposure=exposure_ref,
    feature_df_reference=feat_ref,  # Polars DataFrame
    feature_df_current=feat_cur,
    features=["driver_age", "vehicle_age", "ncd_years"],
)

print(report.recommendation)
# 'REFIT' | 'RECALIBRATE' | 'NO_ACTION' | 'INVESTIGATE' | 'MONITOR_CLOSELY'

df = report.to_polars()
# metric              | value  | band
# ae_ratio            | 1.08   | amber
# gini_current        | 0.39   | amber
# gini_p_value        | 0.054  | amber
# csi_driver_age      | 0.14   | amber
# csi_vehicle_age     | 0.03   | green
# recommendation      | nan    | MONITOR_CLOSELY

thresholds - Configurable traffic lights

from insurance_monitoring import MonitoringReport
from insurance_monitoring.thresholds import MonitoringThresholds, PSIThresholds

# Tighten PSI thresholds for a large motor book with monthly monitoring
custom = MonitoringThresholds(
    psi=PSIThresholds(green_max=0.05, amber_max=0.15),
)
report = MonitoringReport(..., thresholds=custom)

Default thresholds follow industry convention (PSI: 0.1/0.25 from FICO/credit scoring; A/E: 0.95–1.05 green, 0.90–1.10 amber; Gini: p < 0.10 amber, p < 0.05 red).
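The default bands above can be written out as three small functions. This is a sketch of the stated thresholds only: the function names and the choice to treat boundary values as inclusive are assumptions, and the library's own handling of edge cases may differ.

```python
def band_upper(value, green_max=0.10, amber_max=0.25):
    """Band a metric where larger is worse, such as PSI or CSI."""
    if value <= green_max:
        return "green"
    return "amber" if value <= amber_max else "red"


def band_ae(ae, green=(0.95, 1.05), amber=(0.90, 1.10)):
    """Band an A/E ratio by how far it sits from 1.0 in either direction."""
    if green[0] <= ae <= green[1]:
        return "green"
    return "amber" if amber[0] <= ae <= amber[1] else "red"


def band_gini_p(p_value, amber_below=0.10, red_below=0.05):
    """Band the Gini drift test by its p-value; smaller is worse."""
    if p_value < red_below:
        return "red"
    return "amber" if p_value < amber_below else "green"


print(band_upper(0.14), band_ae(1.08), band_gini_p(0.054))
```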

Decision framework

The recommendation property implements the three-stage decision tree from arXiv 2510.04556, mapped to actuarial practice:

| Signal | Recommendation | Action |
|---|---|---|
| No drift in any test | NO_ACTION | Continue, schedule next review |
| A/E red, Gini stable | RECALIBRATE | Update intercept/offset (hours of work) |
| Gini red | REFIT | Rebuild model on recent data (weeks of work) |
| Both red | INVESTIGATE | Manual review; check data quality first |
| Any amber | MONITOR_CLOSELY | Increase monitoring frequency |
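Read as code, the table becomes a short chain of checks. The table does not say which row wins when several apply, so the precedence below (both red first, then Gini red, then A/E red, then any amber) is an assumption, as are the function name `recommend` and the treatment of any non-green feature band as a reason to monitor closely.

```python
def recommend(ae_band, gini_band, other_bands=()):
    """Map traffic-light bands to a recommendation string."""
    if ae_band == "red" and gini_band == "red":
        return "INVESTIGATE"          # both signals firing: check data quality first
    if gini_band == "red":
        return "REFIT"                # ranking has degraded
    if ae_band == "red":
        return "RECALIBRATE"          # level is off but ranking holds
    # Assumption: any remaining non-green band only raises monitoring frequency.
    if any(band != "green" for band in (ae_band, gini_band, *other_bands)):
        return "MONITOR_CLOSELY"
    return "NO_ACTION"


print(recommend("red", "green"))
print(recommend("amber", "amber", ["amber", "green"]))
```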

Databricks integration

The demo notebook at notebooks/demo_monitoring.py shows the full workflow on synthetic motor data and runs on Databricks serverless. Upload it to your workspace and schedule it as a monthly job against your MLflow inference table.

Background

The Gini drift test implements the framework from:

"Model Monitoring: A General Framework with an Application to Non-life Insurance Pricing", arXiv 2510.04556 (December 2025)

Read more

Your Pricing Model is Drifting (and You Probably Can't Tell) — why PSI alone is insufficient, and what it means when A/E is stable but the Gini is falling.

Related libraries

| Library | Why it's relevant |
|---|---|
| shap-relativities | Extract rating relativities from GBMs. When monitoring flags REFIT, use SHAP to diagnose which factors have drifted most |
| insurance-interactions | GLM interaction detection. A refit triggered by Gini degradation may need new interactions added |
| insurance-causal-policy | SDID causal evaluation. If monitoring shows deterioration after a rate change, use this to isolate cause |
| insurance-cv | Walk-forward cross-validation. Use monitoring outputs to decide when to retrain and validate the retrained model |
| rate-optimiser | Constrained rate change optimisation. Monitoring informs when a rate adjustment is needed; rate-optimiser determines the right one |

All Burning Cost libraries →


Licence

MIT
