Distribution-free prediction intervals for insurance pricing models: conformal coverage guarantees, Tweedie non-conformity scores, SCR bounds, and anytime-valid sequential monitoring
insurance-conformal
Distribution-free prediction intervals for insurance pricing models — 13% narrower than parametric Tweedie, with a finite-sample coverage guarantee.
Blog post: Conformal Prediction Intervals for Insurance Pricing Models
The problem
Your pricing model gives point estimates. Your parametric prediction intervals assume variance scales as mu^p across the whole book — an assumption that breaks exactly where the stakes are highest: large, unusual risks.
On a heterogeneous UK motor portfolio, parametric Tweedie intervals over-cover low-risk policies (unnecessary width) and under-cover the top risk decile — which is what drives reinsurance attachment, reserving, and SCR calculations.
Conformal prediction fixes this. The guarantee is P(y in interval) >= 1 - alpha for any data distribution, as long as calibration and test data are exchangeable. No parametric family required.
The non-obvious implementation detail: most conformal libraries use raw absolute residuals |y - yhat|. For insurance data that is wrong — a £1 error on a £100 risk is not the same as a £1 error on a £10,000 risk. The correct score for Tweedie models is |y - yhat| / yhat^(p/2), which normalises by the Tweedie standard deviation and produces exchangeable scores across risk levels. That is what this library implements.
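To make the scaling concrete, here is a standalone numpy sketch of the score (an illustration, not the library's implementation):

```python
import numpy as np

def pearson_weighted_score(y, y_hat, p=1.5):
    # |y - yhat| / yhat^(p/2): the absolute residual divided by the
    # Tweedie standard deviation (up to the dispersion constant), so
    # scores from £100 risks and £10,000 risks live on the same scale.
    return np.abs(y - y_hat) / np.power(y_hat, p / 2)

# A £1 raw error at two very different risk levels:
small_risk = pearson_weighted_score(np.array([101.0]), np.array([100.0]))
large_risk = pearson_weighted_score(np.array([10001.0]), np.array([10000.0]))
# The raw residuals are identical, but the scaled score on the £100 risk
# is ~32x larger: the same £1 error is far more surprising there.
```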
Quick start
```python
from insurance_conformal import InsuranceConformalPredictor

# Wrap any fitted sklearn-compatible model
cp = InsuranceConformalPredictor(
    model=fitted_gbm,
    nonconformity="pearson_weighted",  # correct default for Tweedie
    tweedie_power=1.5,
)

# Calibrate on held-out data (must not overlap training)
cp.calibrate(X_cal, y_cal)

# 90% prediction intervals — polars DataFrame: lower, point, upper
intervals = cp.predict_interval(X_test, alpha=0.10)

# Always check per-decile coverage (marginal != conditional)
print(cp.coverage_by_decile(X_test, y_test, alpha=0.10))
```
For locally-adaptive intervals (narrower on low-variance risks, wider on high-variance risks):
```python
from insurance_conformal import LocallyWeightedConformal

lw = LocallyWeightedConformal(model=fitted_gbm, tweedie_power=1.5)
lw.fit(X_train, y_train)
lw.calibrate(X_cal, y_cal)
intervals = lw.predict_interval(X_test, alpha=0.10)
```
Why a pricing actuary should care
Accuracy where it matters. Parametric Tweedie intervals produce 93% aggregate coverage at a 90% target — fine in aggregate, but that surplus width sits on low-risk policies. The top-risk decile that drives reinsurance and reserving gets marginal coverage at best, and on books with more pronounced tail heteroscedasticity it will miss the target.
Regulatory defensibility. The distribution-free guarantee does not rely on model fit. You can write "P(claim in interval) >= 90%, finite-sample valid, no parametric assumptions" in a PRA SS1/23 validation pack. You cannot write that for a parametric bootstrap interval.
SCR calculations. SCRReport produces per-risk 99.5% upper bounds with a coverage validation table — exactly the format needed for internal model stress-testing documentation.
Premium sufficiency control. PremiumSufficiencyController finds the smallest loading factor such that expected underpricing shortfall is bounded at alpha. A direct regulatory argument, not a statistical artefact.
Performance on a realistic motor book
CatBoost Tweedie(p=1.5), 50,000 synthetic UK motor policies, heteroscedastic Gamma DGP, temporal 60/20/20 split.
| | Parametric Tweedie | Conformal (pearson_weighted) | Locally-weighted conformal |
|---|---|---|---|
| Distribution assumption | Tweedie Var ~ mu^p | None | None |
| Aggregate coverage @ 90% target | 93.1% (over-covers) | 90.2% | 90.3% |
| Top-decile coverage @ 90% target | 90.4% | 87.9% | 90.6% |
| Mean interval width | £4,393 | £3,806 (−13.4%) | £3,881 (−11.7%) |
| Width adapts per risk segment | No | Partial | Yes |
| Finite-sample valid guarantee | No | Yes | Yes |
The locally-weighted variant meets the 90% target in the top decile by construction — the parametric baseline only coincidentally passes it on this dataset. Run the validation: import notebooks/databricks_validation.py into Databricks.
Installation
```shell
pip install insurance-conformal

# With CatBoost support:
pip install "insurance-conformal[catboost]"

# With LightGBM support:
pip install "insurance-conformal[lightgbm]"

# With everything (CatBoost, LightGBM, plotting):
pip install "insurance-conformal[all]"
```
Or with uv:
```shell
uv add insurance-conformal
```
Dependencies: polars and pandas are both required. Polars is the primary output format — all prediction and diagnostic methods return pl.DataFrame. Pandas is required for binning utilities and for accepting pandas DataFrame inputs. Both install automatically.
Worked examples
1. Motor frequency-severity model with per-decile coverage audit
```python
from sklearn.linear_model import PoissonRegressor, GammaRegressor
from insurance_conformal.claims import FrequencySeverityConformal
from insurance_conformal import subgroup_coverage

fs = FrequencySeverityConformal(
    freq_model=PoissonRegressor(),
    sev_model=GammaRegressor(),
)
fs.fit(X_train, d_train, y_train)  # d_train = observed claim counts
fs.calibrate(X_cal, d_cal, y_cal)
intervals = fs.predict_interval(X_test, alpha=0.10)

# Coverage by vehicle group
sg = subgroup_coverage(
    predictor=fs,
    X_test=X_test,
    y_test=y_test,
    alpha=0.10,
    groups=vehicle_group_band,
    group_name="vehicle_group_band",
)
print(sg)
```
The calibration subtlety here: using the observed claim count in the severity model at calibration time creates a distributional mismatch that breaks the coverage guarantee. FrequencySeverityConformal feeds the predicted frequency (not the observed count) into the severity model at both calibration and test time. See Graziadei et al. (2023) for the proof.
2. Premium sufficiency control — bound expected underpricing
Useful when a pricing review requires a documented guarantee that expected shortfall from underpriced policies stays below a threshold.
```python
from insurance_conformal.risk import PremiumSufficiencyController

psc = PremiumSufficiencyController(alpha=0.05, B=5.0)
psc.calibrate(y_cal, premium_cal)   # calibrate on held-out year
result = psc.predict(premium_new)   # apply to next year's book

# result["lambda_hat"]: the loading factor such that E[shortfall] <= 5%
# result["upper_bound"]: risk-controlled loaded premium per policy
print(f"Required loading: {result['lambda_hat']:.3f}")
```
3. SCR bounds for internal model documentation
```python
from insurance_conformal import InsuranceConformalPredictor, SCRReport

cp = InsuranceConformalPredictor(model=fitted_model)
cp.calibrate(X_cal, y_cal)

scr = SCRReport(predictor=cp)
scr_bounds = scr.solvency_capital_requirement(X_test, alpha=0.005)
val_table = scr.coverage_validation_table(X_test, y_test)
print(scr.to_markdown())
```
Disclaimer: `SCRReport` is an internal stress-testing tool. Solvency II SCR calculations for regulatory purposes require sign-off under an approved internal model or the standard formula. Do not use this output in regulatory returns without appropriate actuarial review, governance sign-off, and alignment with your firm's approved methodology.
4. Recovering from mid-year claims inflation (Ogden rate change, CAT event)
Standard conformal with a static calibration set breaks when the book shifts mid-year. RetroAdj recovers in a fraction of the steps standard ACI needs by retroactively correcting all leave-one-out residuals in the sliding window simultaneously.
```python
from insurance_conformal import RetroAdj

# Residual-only mode: wrap an existing GLM or GBM
resid_train = y_train - glm.predict(X_train)
resid_test = y_test - glm.predict(X_test)

model = RetroAdj(window_size=250, gamma=0.005)
model.fit(resid_train)
lower_r, upper_r = model.predict_interval(resid_test, alpha=0.10)

lower_claims = lower_r + glm.predict(X_test)
upper_claims = upper_r + glm.predict(X_test)
```
| Metric | RetroAdj | Standard ACI |
|---|---|---|
| Steps to recover 90% coverage after +30% inflation shock | ~15–30 | ~80–150 |
| Post-shift coverage (full window) | ~88–91% | ~80–87% |
Features
- `InsuranceConformalPredictor` — split conformal prediction wrapping any sklearn-compatible model. Non-conformity scores: `pearson_weighted`, `pearson`, `deviance`, `anscombe`, `raw`.
- `LocallyWeightedConformal` — two-stage conformal with a secondary spread model. Meets per-decile coverage targets that standard conformal misses.
- `ConformalisedQuantileRegression` — split CQR (Romano et al., 2019). Wraps pre-fitted quantile models. Works with CatBoost `Quantile:alpha=`, LightGBM `objective=quantile`.
- `FrequencySeverityConformal` — correct conformity scoring for two-stage frequency-severity models (Graziadei et al., 2023).
- `SCRReport` — per-risk 99.5% upper bounds with coverage validation table. For PRA SS1/23 model documentation.
- `solvency_capital_range()` — functional API for SCR bounds inside pipelines.
- `insurance_conformal.risk` — Conformal Risk Control (Angelopoulos et al., ICLR 2024). `PremiumSufficiencyController`, `IntervalWidthController`, `SelectiveRiskController`.
- `RetroAdj` — online conformal with retrospective adjustment (Jun & Ohn, 2025). Recovers from abrupt distribution shifts far faster than standard ACI.
- `CoverageDiagnostics` — coverage-by-decile plots, interval width distributions, subgroup coverage by arbitrary segment.
- `insurance_conformal.multivariate` — joint multi-output conformal for simultaneous frequency/severity intervals.
Non-conformity scores
| Score | Formula | When to use |
|---|---|---|
| `pearson_weighted` | \|y - yhat\| / yhat^(p/2) | Default. Tweedie/Poisson pricing models. |
| `pearson` | \|y - yhat\| / sqrt(yhat) | Pure Poisson frequency models (p=1). |
| `deviance` | Deviance residual | When you want exact statistical optimality; slower. |
| `anscombe` | Anscombe transform | Variance-stabilising alternative to deviance. |
| `raw` | \|y - yhat\| | Baseline only. Not appropriate for insurance data. |
Width hierarchy (narrowest first, coverage identical): pearson_weighted <= deviance <= anscombe < pearson < raw.
Temporal calibration
Calibrate on recent data to capture current loss trends:
```python
from insurance_conformal.utils import temporal_split

X_train, X_cal, y_train, y_cal, _, _ = temporal_split(
    X, y,
    calibration_frac=0.20,
    date_col="accident_year",
)
model.fit(X_train, y_train)
cp.calibrate(X_cal, y_cal)
```
Target n_cal >= 2,000 for stable production use. The guarantee holds for any n_cal >= 1, but below 500 interval widths are materially wider and more variable.
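The width inflation at small n_cal is visible in the quantile level split conformal actually uses: ceil((n+1)(1-alpha))/n rather than 1 - alpha. A quick standalone arithmetic sketch (not a library call):

```python
import math

def effective_level(n_cal, alpha=0.10):
    # Split conformal takes the ceil((n+1)(1-alpha))-th smallest
    # calibration score, i.e. an empirical quantile slightly above
    # 1 - alpha. The overshoot is what guarantees >= 1 - alpha
    # coverage, and it shrinks as n_cal grows.
    return math.ceil((n_cal + 1) * (1 - alpha)) / n_cal

for n in (100, 500, 2000):
    print(n, effective_level(n))
# n=100 uses the 0.91 empirical quantile; n=2000 is already down to
# 0.9005, so interval widths stabilise near the nominal 90% level.
```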
Coverage guarantee
Split conformal provides:
P(y_test in [lower, upper]) >= 1 - alpha
Distribution-free — holds regardless of the true data distribution or model misspecification. The assumption is exchangeability: calibration and test observations drawn from the same distribution. Temporal covariate shift violates this — use temporal calibration splits and monitor coverage via RetroAdj if abrupt shifts are expected.
Design choices
Split conformal, not cross-conformal. Cross-conformal is more statistically efficient but requires refitting the model on each calibration fold. For GBMs that take hours to train, this is not practical. Split conformal trains once, calibrates once.
No MAPIE dependency. MAPIE is excellent but does not expose the insurance-specific scores implemented here. The split conformal algorithm is simple enough to own: 20 lines of code for conformal_quantile() plus the score functions.
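Those few lines look roughly like this (an illustrative sketch of split conformal, not the library's source):

```python
import math
import numpy as np

def conformal_quantile(scores, alpha):
    # Finite-sample-valid quantile: the ceil((n+1)(1-alpha))-th smallest
    # calibration score. With exchangeable scores, intervals built from
    # it cover with probability >= 1 - alpha.
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Calibration set too small for this alpha: only the trivial
        # (infinite) interval is valid.
        return float("inf")
    return float(np.sort(scores)[k - 1])

# With a pearson_weighted score, the interval for a new prediction is
# then yhat +/- q * yhat**(p / 2), clipped below at zero.
```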
Polars-native output. All prediction and diagnostic methods return pl.DataFrame. Pandas inputs are accepted.
Lower bound clipped at zero. Insurance losses are non-negative. Intervals with negative lower bounds are nonsensical. We clip at zero unconditionally.
Auto-detection of Tweedie power. For CatBoost, read from the loss function string. For sklearn TweedieRegressor, from model.power. Pass tweedie_power= explicitly to override.
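A detection helper along those lines might look like this (a hypothetical sketch — `detect_tweedie_power` is made up for illustration and is not this library's function; it assumes sklearn's `TweedieRegressor.power` attribute and CatBoost's `Tweedie:variance_power=` loss string):

```python
def detect_tweedie_power(model, default=1.5):
    # sklearn's TweedieRegressor exposes its variance power directly.
    if hasattr(model, "power"):
        return float(model.power)
    # CatBoost encodes it in the loss string, e.g. "Tweedie:variance_power=1.5".
    params = getattr(model, "get_params", lambda: {})()
    loss = str(params.get("loss_function", ""))
    if "variance_power=" in loss:
        return float(loss.split("variance_power=")[1].split(";")[0])
    return default
```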
Limitations
- Coverage is marginal, not conditional. The guarantee holds on average. High-risk subgroups can be systematically under-covered even when aggregate coverage meets the target. Always run `coverage_by_decile()` after calibration.
- Exchangeability is violated by portfolio drift. Mid-year claims inflation, Ogden rate changes, or significant portfolio mix shifts break the exchangeability assumption. Use temporal calibration splits and monitor via `RetroAdj`.
- IBNR on recent accident years produces intervals that are too narrow. Calibrating on development-year 0 or 1 data means non-conformity scores are computed on understated claim totals. Use only accident years with at least 3 years of development, or apply IBNR chain-ladder factors to `y_cal` before calibration.
- The `RetroAdj` full method requires kernel ridge regression as the base model. Use residual-only mode for existing GLMs or GBMs.
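The chain-ladder grossing-up of `y_cal` mentioned in the IBNR limitation can be sketched like this (the development factors are illustrative placeholders and `ibnr_adjust` is a hypothetical helper, not part of this library — real factors come from your reserving triangles):

```python
import numpy as np

# Illustrative cumulative development factors by accident-year age
# (years of development since the accident year).
CDF_BY_AGE = {0: 1.60, 1: 1.15, 2: 1.04, 3: 1.00}

def ibnr_adjust(y_cal, dev_ages):
    # Gross observed claim totals up to estimated ultimate before
    # computing non-conformity scores, so recent accident years don't
    # produce understated residuals and over-narrow intervals.
    factors = np.array([CDF_BY_AGE[age] for age in dev_ages])
    return y_cal * factors

y_cal = np.array([1000.0, 1000.0, 1000.0])
print(ibnr_adjust(y_cal, dev_ages=[0, 1, 3]))  # -> [1600. 1150. 1000.]
```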
Part of the Burning Cost stack
Takes any fitted model — Tweedie GBM, GAM, GLM, or the output of insurance-gam or insurance-frequency-severity. Feeds distribution-free prediction intervals into insurance-optimise (uncertainty-aware pricing) and insurance-governance (PRA SS1/23 validation packs). → See the full stack
References
- Hong, L. (2025). "Conformal prediction of future insurance claims in the regression problem." arXiv:2503.03659.
- Hong, L. (2026). "A new strategy for finite-sample valid prediction of future insurance claims in the regression setting." arXiv:2601.21153.
- Graziadei, H., Janett, C., Embrechts, P. & Bucher, A. (2023). "Conformal Prediction for Insurance Data." arXiv:2307.13124.
- Manna, S. et al. (2025). "Conformal Prediction Inference in Regularized Insurance Models." Wiley ASMB; arXiv:2507.06921.
- Angelopoulos, A. N., Bates, S. et al. (2024). "Conformal Risk Control." ICLR 2024. arXiv:2208.02814.
- Jun, J. & Ohn, I. (2025). "Online Conformal Inference with Retrospective Adjustment." arXiv:2511.04275.
- Romano, Y., Patterson, E. & Candes, E. (2019). "Conformalized Quantile Regression." NeurIPS 2019. arXiv:1905.03222.
Related libraries
| Library | Description |
|---|---|
| insurance-monitoring | Model drift detection — track coverage stability over time |
| insurance-conformal-ts | Conformal prediction for non-exchangeable claims time series |
| insurance-causal | Double Machine Learning for causal pricing inference |
| insurance-gam | GAM pricing models that feed directly into this library |
Other Burning Cost libraries
Model building
| Library | Description |
|---|---|
| shap-relativities | Extract rating relativities from GBMs using SHAP |
| insurance-cv | Walk-forward cross-validation respecting IBNR structure |
Uncertainty quantification
| Library | Description |
|---|---|
| bayesian-pricing | Hierarchical Bayesian models for thin-data segments |
| insurance-distributional | Full conditional distribution per risk: mean, variance, CoV |
Deployment and optimisation
| Library | Description |
|---|---|
| insurance-optimise | Constrained rate change optimisation with FCA PS21/5 compliance |
Governance
| Library | Description |
|---|---|
| insurance-fairness | Proxy discrimination auditing for UK insurance models |
| insurance-monitoring | Model monitoring: PSI, A/E ratios, Gini drift test |
Training Course
Want structured learning? Insurance Pricing in Python is a 12-module course covering the full pricing workflow. Module 11 covers conformal prediction — split conformal, CQR, and coverage guarantees for pricing models. £97 one-time.
Community
- Questions? Start a Discussion
- Found a bug? Open an Issue
- Blog & tutorials: burning-cost.github.io
Licence
MIT. See LICENSE.
Contributing
Issues and pull requests welcome at github.com/burning-cost/insurance-conformal.
Need help implementing this? See our consulting services.