Insurance severity modelling: composite (spliced) models, Distributional Refinement Networks, EVT for censored and truncated claims, and AIPW IBNR reserving
insurance-severity
Comprehensive severity modelling for UK insurance pricing. Two complementary approaches in one package.
Blog post: Spliced Severity Distributions: When One Distribution Isn't Enough
The problem
Claim severity distributions don't behave like textbook Gamma distributions. You have a body of attritional losses and a heavy tail of large losses, and these two populations have different drivers. Standard GLMs smooth over this structure. This package gives you two principled ways to deal with it.
Quick Start
```shell
uv add insurance-severity
```
💬 Questions or feedback? Start a Discussion. Found it useful? A ⭐ helps others find it.
```python
import numpy as np
from insurance_severity import LognormalBurrComposite

rng = np.random.default_rng(42)

# Synthetic severity: lognormal attritional body + heavy Pareto-like tail
attritional = rng.lognormal(mean=7.5, sigma=1.0, size=850)   # ~85% of claims
large_loss = rng.pareto(a=2.5, size=150) * 40_000 + 8_000    # ~15% large losses
claims = np.concatenate([attritional, large_loss])
rng.shuffle(claims)

# Fit the composite model — mode matching selects the threshold automatically
model = LognormalBurrComposite(threshold_method="mode_matching")
model.fit(claims)
print(f"Threshold: £{model.threshold_:,.0f}")

# body_params_ = [mu, sigma] for the lognormal; tail_params_ = [alpha, delta, beta] for Burr XII
mu, sigma = model.body_params_
alpha, delta, beta = model.tail_params_
print(f"Body lognormal: mu={mu:.3f}, sigma={sigma:.3f} (log-scale)")
print(f"Tail Burr XII: alpha={alpha:.3f}, delta={delta:.3f}, beta={beta:,.0f}")
print(f"Body weight pi: {model.pi_:.3f} ({model.pi_*100:.1f}% of claims are attritional)")

# ILF: expected loss in layer (250k xs 250k) relative to basic limit
ilf = model.ilf(limit=500_000, basic_limit=250_000)
print(f"ILF at £500k limit / £250k basic: {ilf:.4f}")

# Tail Value at Risk at 99.5th percentile (Solvency II capital proxy)
tvar = model.tvar(alpha=0.995)
print(f"TVaR 99.5%: £{tvar:,.0f}")
```
What's in the package
insurance_severity.composite — spliced severity models
Composite (spliced) models divide the claim distribution at a threshold into a body distribution (moderate claims) and a tail distribution (large losses). Each component is fitted separately and joined at the threshold.
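The splicing mechanics are easy to check numerically. The following sketch builds a spliced density by hand with scipy: a lognormal body renormalised to [0, u] carrying weight pi, and a GPD tail above u carrying the rest. The threshold, weight, and distribution parameters here are assumed illustration values, not anything the package fits.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Illustrative spliced density (assumed parameters, not the package's internals)
u, pi = 10_000.0, 0.85
body = stats.lognorm(s=1.0, scale=np.exp(7.5))       # lognormal body
tail = stats.genpareto(c=0.4, loc=u, scale=5_000.0)  # GPD tail starting at u

def spliced_pdf(x):
    x = np.asarray(x, dtype=float)
    below = pi * body.pdf(x) / body.cdf(u)   # body renormalised to [0, u]
    above = (1.0 - pi) * tail.pdf(x)         # tail carries the remaining mass
    return np.where(x <= u, below, above)

# Total probability mass across both pieces is 1 by construction
mass_body, _ = quad(spliced_pdf, 0.0, u)
mass_tail = (1.0 - pi) * tail.cdf(1e9)       # analytic tail mass, cdf(1e9) ~ 1
print(f"total mass: {mass_body + mass_tail:.3f}")
```

Because each piece is renormalised to its own region before weighting, the result is a valid density for any threshold; fitting then reduces to choosing u, pi, and the component parameters.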
What this package adds that R doesn't have: covariate-dependent thresholds. For a motor book, the threshold between attritional and large loss isn't the same for a HGV fleet and a private motor policy. With mode-matching regression, the threshold varies by policyholder.
Supported combinations:
- Lognormal body + Burr XII tail (mode-matching supported)
- Lognormal body + GPD tail
- Gamma body + GPD tail
Features:
- Fixed threshold, profile likelihood, and mode-matching threshold methods
- Covariate-dependent tail scale regression (CompositeSeverityRegressor)
- ILF computation per policyholder
- TVaR (Tail Value at Risk)
- Quantile residuals, mean excess plots, Q-Q plots
```python
import numpy as np
from insurance_severity import LognormalBurrComposite, CompositeSeverityRegressor

rng = np.random.default_rng(42)
n = 1000

# Synthetic severity data: lognormal attritional body + Pareto-like large losses
attritional = rng.lognormal(mean=7.5, sigma=1.2, size=int(n * 0.85))  # ~£1,800 median
large_losses = rng.pareto(a=2.5, size=int(n * 0.15)) * 50_000 + 10_000
claim_amounts = np.concatenate([attritional, large_losses])
rng.shuffle(claim_amounts)

# Rating factors for regression example
vehicle_age = rng.integers(0, 15, n).astype(float)
driver_age = rng.integers(18, 75, n).astype(float)
ncd_years = rng.integers(0, 5, n).astype(float)
X = np.column_stack([vehicle_age, driver_age, ncd_years])

n_train = int(0.8 * n)
X_train, X_test = X[:n_train], X[n_train:]
y_train = claim_amounts[:n_train]

# No covariates -- mode-matching threshold
model = LognormalBurrComposite(threshold_method="mode_matching")
model.fit(claim_amounts)
print(model.threshold_)
print(model.ilf(limit=500_000, basic_limit=250_000))

# With covariates -- threshold varies by policyholder
reg = CompositeSeverityRegressor(
    composite=LognormalBurrComposite(threshold_method="mode_matching"),
)
reg.fit(X_train, y_train)
thresholds = reg.predict_thresholds(X_test)  # per-policyholder thresholds
ilf_matrix = reg.compute_ilf(X_test, limits=[100_000, 250_000, 500_000, 1_000_000])
```
insurance_severity.drn — Distributional Refinement Network
The DRN (Avanzi, Dong, Laub, Wong 2024, arXiv:2406.00998) starts from a frozen GLM or GBM baseline and refines it into a full predictive distribution using a neural network. The network outputs bin-probability adjustments to a histogram representation of the distribution, not the mean.
The practical payoff: you keep the actuarial calibration of your existing GLM pricing model, and the neural network fills in the distributional shape that the GLM can't capture — skewness, heteroskedastic dispersion, tail behaviour by segment.
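The bin-adjustment idea can be sketched without a neural network at all: discretise a frozen baseline into histogram bins, apply per-bin log-adjustments, and renormalise. Everything below (the cutpoints, the Gamma stand-in for the GLM, the hand-picked adjustment vector) is an assumption for illustration, not the DRN's trained output.

```python
import numpy as np
from scipy import stats

# Sketch of the refinement idea only; the real DRN learns the adjustments
cutpoints = np.linspace(0.0, 50_000.0, 33)    # 32 bins spanning the support
baseline = stats.gamma(a=2.0, scale=4_000.0)  # stand-in for the frozen GLM

baseline_probs = np.diff(baseline.cdf(cutpoints))
baseline_probs /= baseline_probs.sum()        # renormalise the truncated grid

log_adjust = np.zeros_like(baseline_probs)
log_adjust[-8:] = 0.5                         # hypothetical: fatten the tail bins

refined = baseline_probs * np.exp(log_adjust)
refined /= refined.sum()                      # bin masses sum to 1 again

mids = 0.5 * (cutpoints[:-1] + cutpoints[1:])
print(f"baseline mean {mids @ baseline_probs:,.0f} -> refined mean {mids @ refined:,.0f}")
```

The point is that the baseline's mean-level calibration is preserved up to the adjustments the network chooses to make, while the distributional shape (here, the tail weight) is free to move.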
Note: the DRN requires PyTorch. Install it before using this subpackage:
```shell
uv add torch --index-url https://download.pytorch.org/whl/cpu
uv add "insurance-severity[glm]"
```
```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from insurance_severity import GLMBaseline, DRN

rng = np.random.default_rng(42)
n = 1000

# Synthetic severity data with covariates
vehicle_age = rng.integers(0, 15, n).astype(float)
driver_age = rng.integers(18, 75, n).astype(float)
ncd_years = rng.integers(0, 5, n).astype(float)
region = rng.choice(["London", "SE", "NW", "Midlands", "Scotland"], n)

# Lognormal claim amounts with some large losses
log_mu = 7.5 + 0.03 * vehicle_age - 0.005 * driver_age - 0.05 * ncd_years
claim_amounts = rng.lognormal(mean=log_mu, sigma=1.1 + 0.02 * vehicle_age)

n_train = int(0.8 * n)
df = pd.DataFrame({
    "claims": claim_amounts,
    "age": driver_age,
    "vehicle_age": vehicle_age,
    "region": region,
})
df_train = df.iloc[:n_train]
df_test = df.iloc[n_train:]
X_train = np.column_stack([vehicle_age[:n_train], driver_age[:n_train], ncd_years[:n_train]])
X_test = np.column_stack([vehicle_age[n_train:], driver_age[n_train:], ncd_years[n_train:]])
y_train = claim_amounts[:n_train]
y_test = claim_amounts[n_train:]

# Fit your existing GLM
glm = smf.glm(
    "claims ~ age + C(region) + vehicle_age",
    data=df_train,
    family=sm.families.Gamma(sm.families.links.Log()),
).fit()

# Wrap it as a baseline
baseline = GLMBaseline(glm)

# Refine with DRN (scr_aware=True enables SCR-aware bin cutpoints at the 99.5th percentile)
drn = DRN(baseline, hidden_size=64, max_epochs=300, scr_aware=True)
drn.fit(X_train, y_train, verbose=True)

# Full predictive distribution per policyholder
dist = drn.predict_distribution(X_test)
print(dist.mean())           # (n,) expected claim
print(dist.quantile(0.995))  # 99.5th percentile -- Solvency II SCR
print(dist.crps(y_test))     # CRPS scoring
```
insurance_severity.evt — Extreme Value Theory for censored and truncated claims
Standard composite models assume you have clean, ground-up severity data. In practice you don't: policy limits truncate the right tail, IBNR means large claims are systematically under-reported at valuation, and the data you observe is not the data you need to model. The EVT module handles these distortions explicitly.
The three classes are based on Albrecher et al. (2025) — tail estimation under policy limits, IBNR censoring, and physically bounded tails.
TruncatedGPD — Generalised Pareto Distribution fitted to exceedances above a threshold, with optional right-truncation at a policy limit. If you naively fit a GPD to capped claims without accounting for the cap, you underestimate the tail index. This corrects that via MLE over the truncated likelihood.
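The bias and its correction can be reproduced with scipy alone: fit a GPD naively to limit-capped exceedances, then maximise the likelihood with the density renormalised to [threshold, limit]. The ground-truth parameters and optimiser setup below are assumptions for illustration, not the package's implementation.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(0)

# Assumed ground truth: GPD exceedances above a £10k threshold,
# observed only up to a £100k policy limit
xi_true, sigma_true, u, limit = 0.5, 8_000.0, 10_000.0, 100_000.0
exc = stats.genpareto(c=xi_true, loc=u, scale=sigma_true).rvs(20_000, random_state=rng)
exc = exc[exc <= limit]                  # the cap hides the largest losses

# Naive MLE ignores the cap and biases the tail index downward
xi_naive, _, _ = stats.genpareto.fit(exc, floc=u)

# Truncated MLE: renormalise the GPD density to [u, limit]
def nll(params):
    xi, log_sigma = params
    d = stats.genpareto(c=xi, loc=u, scale=np.exp(log_sigma))
    return -(d.logpdf(exc) - np.log(d.cdf(limit))).sum()

res = optimize.minimize(nll, x0=[0.1, np.log(np.std(exc))], method="Nelder-Mead")
xi_trunc = res.x[0]
print(f"naive xi={xi_naive:.2f}  truncated xi={xi_trunc:.2f}  true xi={xi_true}")
```

Optimising over log-sigma keeps the two parameters on comparable scales, which matters for a derivative-free optimiser like Nelder-Mead.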
CensoredHillEstimator — Hill estimator corrected for right-censoring. The standard Hill estimator applied to censored order statistics is biased; this recovers the true Pareto tail index from the top-k observations via the normalised sum-of-logs form (1/k) * (sum(log x_j) - k * log(boundary)).
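For intuition, the classical (uncensored) Hill estimator already has exactly this sum-of-logs-minus-log-boundary shape, with the (k+1)-th order statistic playing the role of the boundary. A minimal sketch on simulated Pareto data (not the package's censoring-corrected version):

```python
import numpy as np

rng = np.random.default_rng(1)
# Exact Pareto sample with tail index alpha = 2, i.e. xi = 1/alpha = 0.5
x = (rng.pareto(a=2.0, size=50_000) + 1.0) * 1_000.0

# Classical Hill estimator on the top-k order statistics:
#   xi_hat = (1/k) * sum_j log(x_(j)) - log(x_(k+1))
k = 2_000
x_desc = np.sort(x)[::-1]
xi_hat = np.log(x_desc[:k]).mean() - np.log(x_desc[k])
print(f"Hill xi_hat = {xi_hat:.3f} (true 0.5)")
```

The censored correction replaces the order-statistic boundary with the censoring point and adjusts for the observations it hides.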
WeibullTemperedPareto — Models raw severity (x > 0) with a Pareto body and exponential tempering in the extreme tail. Useful when physical constraints bound the maximum possible loss — a nuclear power plant has a finite replacement cost, even if an unconstrained Pareto fit would extrapolate beyond it. Fitted via MLE.
```python
from insurance_severity.evt import TruncatedGPD, CensoredHillEstimator, WeibullTemperedPareto

# GPD fitted to exceedances above £10k, truncated at £100k policy limit
gpd = TruncatedGPD()
gpd.fit(large_claims, threshold=10_000, upper=100_000)
print(gpd.summary())

# Hill estimator corrected for censoring at £50k
hill = CensoredHillEstimator()
hill.fit(claims, boundary=50_000)
print(f"Tail index: {hill.xi:.3f}")

# Tempered Pareto — Pareto body with bounded extreme tail
wtp = WeibullTemperedPareto()
wtp.fit(claims)
print(wtp.summary())
```
All three classes expose pdf, cdf, ppf, and rvs where appropriate, an xi property for the tail index, and a summary() method. TruncatedGPD additionally provides mean_excess() for mean excess plot diagnostics.
When to use: XL reinsurance where the cession data is limit-censored; reserving triangles where large claims are IBNR-censored; any context where naively fitting a GPD to the observed data would produce a biased tail index because not all large losses are visible in the data at valuation.
insurance_severity.reserving — doubly-robust IBNR reserving
PopulationSamplingReserve reframes IBNR reserving as a population sampling problem (Calcetero, Badescu, Lin 2025, arXiv:2502.15598). Each reported claim is a Bernoulli trial with inclusion probability π̂ᵢ — the probability that the claim was reported by valuation time given its accident date. The AIPW estimator combines a micro-level severity model with IPW-weighted residuals, giving double robustness: the reserve is consistent if either the inclusion model or the severity model is correctly specified.
The chain-ladder emerges as a special case when inclusion probabilities equal 1/f_k (inverse development factors).
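The IPW core of this idea can be sketched on simulated data where the reporting-delay distribution, and hence each inclusion probability, is known exactly. In the package the inclusion model is estimated; this sketch shows only the Horvitz-Thompson piece, without the AIPW augmentation term.

```python
import numpy as np

rng = np.random.default_rng(7)
n, tau, lam = 100_000, 2.0, 2.0   # claims, valuation time, reporting rate

# Simulated ground truth: accidents over 1.5 years, exponential reporting delay
accident = rng.uniform(0.0, 1.5, n)
delay = rng.exponential(1.0 / lam, n)
severity = rng.lognormal(8.0, 1.2, n)
reported = accident + delay <= tau

# Inclusion probability of each reported claim: P(delay <= tau - accident_time)
pi_hat = 1.0 - np.exp(-lam * (tau - accident[reported]))

# Horvitz-Thompson (IPW) estimate of ultimate losses from reported claims only
ultimate_ipw = (severity[reported] / pi_hat).sum()
ibnr_ipw = ultimate_ipw - severity[reported].sum()
true_ibnr = severity[~reported].sum()
print(f"IPW IBNR: {ibnr_ipw:,.0f}   true (simulated) IBNR: {true_ibnr:,.0f}")
```

Each reported claim is upweighted by the inverse of its reporting probability, so claims from recent accident dates, which are the least likely to have been reported, carry the most weight in the reserve.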
```python
import numpy as np
import pandas as pd
from insurance_severity.reserving import PopulationSamplingReserve

rng = np.random.default_rng(42)
n = 500

# Reported UK motor claims at valuation date
claims = pd.DataFrame({
    "accident_time": rng.uniform(0.0, 1.5, n),
    "severity": rng.lognormal(mean=8.0, sigma=1.2, size=n),
})
# Exponential reporting delay guarantees report_time >= accident_time
claims["report_time"] = claims["accident_time"] + rng.exponential(0.3, n)

reserve = PopulationSamplingReserve(method="aipw")
reserve.fit(claims, valuation_time=2.0)
print(f"IBNR estimate: £{reserve.ibnr_:,.0f}")
print(f"Ultimate: £{reserve.ultimate_:,.0f}")
print(reserve.diagnostics())
```
Installation

```shell
uv add insurance-severity
```

With GLM support (statsmodels):

```shell
uv add "insurance-severity[glm]"
```
Subpackages
Access subpackages directly if you only need one approach:
```python
from insurance_severity.composite import LognormalBurrComposite
from insurance_severity.drn import DRN
from insurance_severity.evt import TruncatedGPD, CensoredHillEstimator, WeibullTemperedPareto
```
Databricks Notebook
A ready-to-run Databricks notebook benchmarking this library against standard approaches is available in burning-cost-examples.
Benchmark results
Benchmarked on Databricks serverless (Free Edition), 2026-03-22. Full runnable script: benchmarks/run_benchmark_databricks.py.
Setup: 5,000 synthetic UK motor bodily injury claims (4,000 train / 1,000 test). DGP: Lognormal body (85%, attritional claims, log-median ~£4,900) + Pareto tail (15%, large BI, alpha=1.8 — infinite variance, representative of real BI loss triangles). Two composite models compared against Gamma and Lognormal single-distribution baselines. True DGP quantiles derived from 500,000 simulated observations.
DGP true quantiles: 90th=£14,900 | 95th=£32,800 | 99th=£116,000 | 99.5th=£208,000
| Model | 95th pct error | 99th pct error | 99.5th pct error | Total tail error | Log-lik | Fit time |
|---|---|---|---|---|---|---|
| Single Gamma | large (underestimates tail) | large | large | largest | lowest | <0.1s |
| Single Lognormal | moderate | moderate | moderate | moderate | moderate | <0.1s |
| LognormalGPDComposite | smallest | smallest | smallest | smallest | highest | 5–25s |
| GammaGPDComposite | small | small | small | small | high | 5–25s |
Honest note on magnitude: With 4,000 claims and Pareto alpha=1.8, the composite models have roughly 600 tail observations above the splice threshold. Profile-likelihood threshold selection is reliable in this regime. At alpha=1.8 (infinite variance), single distributions significantly underestimate high quantiles — the ILF error at the £1m limit can exceed 30% for a single Gamma, translating directly to mis-priced XL reinsurance layers.
The ILF comparison is the most practically significant result. Increased Limits Factors drive XL reinsurance pricing. A Gamma GLM priced against a £1m limit will systematically underprice the layer relative to the true heavy-tailed DGP. The composite model, by fitting the GPD tail, captures this layer pricing more accurately.
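The identity behind an ILF is ILF(L; B) = E[min(X, L)] / E[min(X, B)], and the limited expected value can be computed for any severity distribution as the integral of the survival function, E[min(X, L)] = integral of S(x) from 0 to L. A numerical check (the lognormal parameters are placeholders, not fitted values):

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Placeholder severity distribution standing in for a fitted model
sev = stats.lognorm(s=1.5, scale=np.exp(8.0))

def limited_expected_value(dist, cap):
    # E[min(X, L)] = integral of the survival function from 0 to L
    value, _ = quad(lambda x: dist.sf(x), 0.0, cap, limit=200)
    return value

basic = limited_expected_value(sev, 250_000)
for L in (250_000, 500_000, 1_000_000):
    print(f"ILF at £{L:,}: {limited_expected_value(sev, L) / basic:.4f}")
```

Because the survival function of a heavy-tailed distribution decays slowly, the increments between successive limits stay material at £1m, which is exactly what a thin-tailed Gamma fit misses.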
Composite models are superior when:
- The DGP has a genuine structural break between attritional and large losses (test with mean excess plot — a rising MEF confirms heavy tail)
- Sample has 3,000+ claims with at least 200 observations above the splice threshold
- The tail index is heavier than alpha ≈ 2.5 (GPD xi > 0.4), meaning a single distribution will materially underestimate high quantiles
Single distributions are competitive when:
- Fewer than 2,000 claims — profile-likelihood threshold selection is unstable; use TruncatedGPD with a fixed threshold from domain knowledge instead
- Data is policy-limit censored — use insurance_severity.evt.TruncatedGPD, which corrects for truncation bias
- Tail is mild (alpha > 3.0) — a Lognormal fits well and a composite adds complexity without benefit
Performance
Fit times on Databricks serverless (single-node, no GPU): Gamma/Lognormal <0.1s each, composite models 5–25s (profile-likelihood grid over 40 threshold candidates). For an offline actuarial workflow fitting quarterly, this is irrelevant. For real-time underwriting APIs, fit the composite offline and use a fixed threshold for scoring.
See benchmarks/run_benchmark_databricks.py for the full benchmark with ILF tables and quantile comparisons.
Source repos
- insurance-composite — archived, merged into this package
- insurance-drn — archived, merged into this package
Related Libraries
| Library | Description |
|---|---|
| insurance-quantile | Quantile GBM for tail risk — non-parametric alternative when parametric severity assumptions are not tenable |
| insurance-conformal | Distribution-free prediction intervals — conformal wrappers for severity point forecasts |
| insurance-distributional | Distributional GBM — models the full conditional distribution including variance, not just the mean |
| insurance-frequency-severity | Joint frequency-severity models — combines this library's severity component with frequency in a Sarmanov copula |
| insurance-dynamics | Loss development models — severity projections are a key input to dynamic reserve models |
File details
Details for the file insurance_severity-0.4.0.tar.gz.
File metadata
- Download URL: insurance_severity-0.4.0.tar.gz
- Upload date:
- Size: 391.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `44f1ce5434dbde446457006147ce75e48a0a059bec7fb919c471e14614aa269b` |
| MD5 | `f51f3d210081daf009e1f07323a0df0f` |
| BLAKE2b-256 | `1754092e0ecb8751c981000a71251c5ec5929d86028572b76e87e8a097c5b1d2` |
File details
Details for the file insurance_severity-0.4.0-py3-none-any.whl.
File metadata
- Download URL: insurance_severity-0.4.0-py3-none-any.whl
- Upload date:
- Size: 143.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `142ac36fd12eea8cb5c9d10c9c794d59c4a3608b1be4342855bb19b96815fa37` |
| MD5 | `34b9006836e6c520a20205f72615e1f3` |
| BLAKE2b-256 | `190f9227abb2ddcc23983edf3a6b9b6e1aa2065b2893b81c0077b727956d02aa` |