Extreme Value Theory for catastrophic insurance claim severity: GPD/GEV fitting, profile likelihood CIs, censored MLE, excess layer pricing, Solvency II reporting
insurance-evt
Extreme Value Theory for catastrophic insurance claim severity. GPD and GEV fitting with profile likelihood confidence intervals, censored MLE, excess layer pricing, and Solvency II reporting.
The problem
UK insurers pricing flood, third-party bodily injury (TPBI), subsidence, and large fire need to estimate the 1-in-200 year claim, the level to which the Solvency II SCR is calibrated (99.5% one-year VaR). The standard tools each fall short:
- scipy: provides genpareto and genextreme, but just the distributions. No threshold selection diagnostics, no profile likelihood CIs, no reinsurance layer pricing.
- pyextremes: excellent for time series extremes (river gauges, wind speed), but requires a DatetimeIndex. Insurance claim data is cross-sectional; there is no datetime. It also has no excess layer pricing.
- R's ReIns package: has ExcessGPD(), profile likelihood, truncation/censoring support. But it's R.
This library fills that gap in Python. It wraps scipy's GPD/GEV distributions with the workflow that insurance pricing teams actually need.
What's different
Profile likelihood CIs, not Wald intervals. For the shape parameter xi, Wald CIs (xi ± z * se) are symmetric by construction, so they misstate the uncertainty whenever the likelihood is skewed, as it typically is for xi and for return levels at small sample sizes. Profile likelihood CIs are asymmetric and capture the positive skew in return level uncertainty. At n=50 exceedances, the difference between upper bounds is typically 30-50%.
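The idea can be sketched with scipy alone. This is a minimal illustration, not the library's implementation: the helper names (neg_loglik, profile_loglik, profile_ci_xi), the grid search, and the simulated data are all assumptions made for the example. For each fixed xi the log-likelihood is maximised over sigma, and the interval is the set of xi whose profile log-likelihood lies within half the chi-squared(1) quantile of the maximum.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2, genpareto


def neg_loglik(log_sigma, xi, y):
    """Negative GPD log-likelihood; a large penalty stands in for +inf."""
    ll = genpareto.logpdf(y, c=xi, scale=np.exp(log_sigma)).sum()
    return -ll if np.isfinite(ll) else 1e12


def profile_loglik(xi, y):
    """Maximise the log-likelihood over sigma with xi held fixed."""
    centre = np.log(y.mean())
    res = minimize_scalar(neg_loglik, args=(xi, y), method="bounded",
                          bounds=(centre - 4.0, centre + 4.0))
    return -res.fun


def profile_ci_xi(y, level=0.95, grid=None):
    """Grid-based profile likelihood interval for the shape xi."""
    grid = np.linspace(-0.5, 1.5, 201) if grid is None else grid
    prof = np.array([profile_loglik(xi, y) for xi in grid])
    # Keep every xi whose profile log-likelihood is close enough to the maximum
    cutoff = prof.max() - 0.5 * chi2.ppf(level, df=1)
    inside = grid[prof >= cutoff]
    return grid[prof.argmax()], inside.min(), inside.max()


# Simulated exceedances, purely for illustration (not real claims)
rng = np.random.default_rng(0)
y = genpareto.rvs(c=0.3, scale=1.0, size=50, random_state=rng)
xi_hat, lo, hi = profile_ci_xi(y)
print(f"xi_hat={xi_hat:.2f}, 95% profile CI=({lo:.2f}, {hi:.2f})")
```

The grid resolution (0.01 here) limits the precision of the endpoints; a root finder on the profile curve would be the natural refinement.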
Censored MLE — open/unsettled TPBI claims are right-censored: the ultimate is unknown, but the reported amount is a lower bound. Standard MLE on censored data overstates xi by ~15% (Poudyal & Brazauskas 2023). The right_censoring parameter handles this.
Left-truncated MLE — reinsurers only see claims above their attachment point. The conditional likelihood f(x | x > attachment) is different from the unconditional. Set left_truncation and the likelihood is adjusted automatically.
ExcessGPD formula: the annual expected cost of an XL layer, closed-form from GPD parameters. This is the output reinsurance pricing teams need. Previously available only in R's ReIns package; this is the first Python implementation.
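As a rough sketch of what such a closed form looks like: the expected annual cost of a layer is the exceedance frequency times the integral of the GPD survival function across the layer. The function names below are hypothetical and the formula is the textbook one, not necessarily term-for-term what the library or ReIns computes. It assumes retention >= threshold, xi > 0 and xi != 1, and reuses the illustrative fit printed in the quick start.

```python
def gpd_survival(x, u, xi, sigma):
    """P(claim > x | claim > u) under a GPD fitted above threshold u."""
    return (1.0 + xi * (x - u) / sigma) ** (-1.0 / xi)


def layer_pure_premium(retention, limit, u, xi, sigma, lambda_annual):
    """Expected annual cost of the layer `limit xs retention`.

    lambda_annual is the expected number of claims per year above u.
    Assumes retention >= u, xi > 0 and xi != 1.
    """
    def g(x):
        # Antiderivative of the survival function, up to the factor sigma/(xi-1)
        return (1.0 + xi * (x - u) / sigma) ** (1.0 - 1.0 / xi)

    expected_layer_loss = sigma / (1.0 - xi) * (g(retention) - g(retention + limit))
    return lambda_annual * expected_layer_loss


# Illustrative parameters only: 87 exceedances over 30 years
premium = layer_pure_premium(500_000, 500_000, u=500_000, xi=0.3241,
                             sigma=187_000, lambda_annual=87 / 30)
rate_on_line = premium / 500_000
print(f"pure premium: {premium:,.0f}, rate on line: {rate_on_line:.3f}")
```

A quick sanity check is to integrate gpd_survival numerically over the layer and compare with the closed form.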
Install
pip install insurance-evt
Quick start
import numpy as np
from insurance_evt import ThresholdSelector, GPDFitter, ReturnLevelCalculator
from insurance_evt import SolvencyIIReport, ExcessLayerCalculator
# claims: 1-D array of individual claim amounts (not defined here; load your own data)
# Step 1: choose a threshold
sel = ThresholdSelector(claims)
mrl = sel.mrl_plot() # look for linearity
stability = sel.parameter_stability_plot() # look for stability
u = sel.auto_threshold() # algorithmic suggestion — examine the plot
# Step 2: fit GPD
fitter = GPDFitter(claims, threshold=u)
fit = fitter.fit()
print(fit)
# GPDFitResult(xi=0.3241 ± 0.0812, sigma=187000 ± 23000, n=87, threshold=500000)
# Step 3: profile likelihood CI for xi (asymmetric)
lo, hi = fitter.profile_likelihood_ci('xi')
print(f"xi: {fit.xi:.3f} ({lo:.3f}, {hi:.3f})")
# xi: 0.324 (0.183, 0.521) <-- upper wider than lower for heavy tail
# Step 4: return levels
n_years = 30 # years of data
calc = ReturnLevelCalculator(fit, lambda_annual=fit.n_exceedances / n_years)
calc.attach_data(fitter._exceedances)
print(f"1-in-200 year: £{calc.return_level(200):,.0f}")
lo, hi = calc.return_level_ci(200, method='profile')
print(f"95% CI: £{lo:,.0f} to £{hi:,.0f}")
# Step 5: Solvency II report
report = SolvencyIIReport(calc).generate()
print(report['interpretation'])
# "The 1-in-200 year claim severity is estimated at £4.2m (95% CI: £2.8m to £7.6m)..."
# Step 6: XL layer pricing
xl = ExcessLayerCalculator(fit, lambda_annual=fit.n_exceedances / n_years)
df = xl.layer_table(
    retentions=[500_000, 750_000, 1_000_000, 2_000_000],
    limits=[500_000, 500_000, 1_000_000, 1_000_000],
)
print(df[['retention', 'limit', 'pure_premium', 'rate_on_line']])
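The return level in step 4 can be checked by hand. The sketch below assumes the usual peaks-over-threshold definition from Coles (2001): the T-year return level is the claim size exceeded on average once every T years. The function names and parameter values are hypothetical, chosen only for illustration, and the library may handle edge cases (xi = 0, for instance) differently.

```python
def gpd_return_level(T, u, xi, sigma, lambda_annual):
    """Claim size exceeded on average once every T years (xi != 0)."""
    return u + sigma / xi * ((lambda_annual * T) ** xi - 1.0)


def annual_exceedance_rate(x, u, xi, sigma, lambda_annual):
    """Expected number of claims per year larger than x (x >= u)."""
    return lambda_annual * (1.0 + xi * (x - u) / sigma) ** (-1.0 / xi)


# Hypothetical parameters, not a fitted model
u, xi, sigma, lambda_annual = 500_000.0, 0.3, 200_000.0, 3.0
x200 = gpd_return_level(200, u, xi, sigma, lambda_annual)
rate = annual_exceedance_rate(x200, u, xi, sigma, lambda_annual)
print(f"1-in-200 year claim: {x200:,.0f}")
print(f"implied years between exceedances: {1.0 / rate:.1f}")
```

If exceedances arrive as a Poisson process, this level is also close to the 1 - 1/T quantile of the annual maximum claim, because exp(-1/T) is approximately 1 - 1/T for large T.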
With censored claims
# open_flag: True for open/unsettled claims (lower bound on ultimate)
fitter = GPDFitter(claims, threshold=500_000, right_censoring=open_flag)
fit = fitter.fit()
# Censored MLE: open claims contribute log(1-F(x)) instead of log f(x)
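A minimal sketch of that censored likelihood, written against scipy rather than the library's internals. The function name censored_nll, the simulated data, and the optimiser settings are assumptions for the example. Closed claims contribute the log density, open claims the log survival function.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genpareto


def censored_nll(params, y, is_open):
    """Negative log-likelihood for GPD exceedances with right-censoring.

    y       : exceedances over the threshold (reported amount minus u)
    is_open : boolean array, True where the claim is still open
    """
    xi, log_sigma = params
    sigma = np.exp(log_sigma)  # optimise on the log scale so sigma stays positive
    ll_closed = genpareto.logpdf(y[~is_open], c=xi, scale=sigma).sum()
    ll_open = genpareto.logsf(y[is_open], c=xi, scale=sigma).sum()
    total = ll_closed + ll_open
    return -total if np.isfinite(total) else 1e12


# Simulated data for illustration: 30% of claims are open, and their
# reported amount is only a fraction of the (unknown) ultimate.
rng = np.random.default_rng(1)
ultimate = genpareto.rvs(c=0.3, scale=1.0, size=300, random_state=rng)
is_open = rng.random(300) < 0.3
reported = np.where(is_open, ultimate * rng.uniform(0.3, 1.0, 300), ultimate)

start = np.array([0.1, np.log(reported.mean())])
res = minimize(censored_nll, start, args=(reported, is_open), method="Nelder-Mead")
xi_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print(f"censored MLE: xi={xi_hat:.3f}, sigma={sigma_hat:.3f}")
```

When no claim is flagged as open, the function reduces to the ordinary GPD negative log-likelihood.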
With reinsurer data (left truncation)
# Reinsurer only sees claims > 750k; threshold set at 500k for model
fitter = GPDFitter(
    claims,
    threshold=500_000,
    left_truncation=750_000
)
fit = fitter.fit()
# Conditional likelihood: f(x | x > 750k) corrects for data selection
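The conditional likelihood can be sketched directly with scipy. The function name truncated_loglik and the numbers below are hypothetical. Each observed claim contributes its log density minus the log probability of clearing the truncation point. By the threshold stability property of the GPD, this equals a GPD on excesses over the truncation point with the same xi and scale sigma + xi * (t - u), which the last lines verify numerically.

```python
import numpy as np
from scipy.stats import genpareto


def truncated_loglik(x, u, t, xi, sigma):
    """Log-likelihood of claims x that are observed only when x > t.

    The GPD (shape xi, scale sigma) is defined on exceedances over u <= t.
    """
    log_f = genpareto.logpdf(x - u, c=xi, scale=sigma)
    log_surv_t = genpareto.logsf(t - u, c=xi, scale=sigma)
    return np.sum(log_f - log_surv_t)


# Hypothetical values: model threshold 500k, reinsurer sees claims above 750k
u, t, xi, sigma = 500_000.0, 750_000.0, 0.3, 200_000.0
x = np.array([800_000.0, 1_200_000.0, 3_500_000.0])

ll_conditional = truncated_loglik(x, u, t, xi, sigma)
# Same likelihood via threshold stability: GPD over t with rescaled sigma
ll_rescaled = genpareto.logpdf(x - t, c=xi, scale=sigma + xi * (t - u)).sum()
print(ll_conditional, ll_rescaled)
```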
Peril presets
from insurance_evt import FLOOD, TPBI, SUBSIDENCE, LARGE_FIRE, get_preset
print(TPBI.typical_xi_range) # (0.3, 0.55)
print(TPBI.threshold_percentile) # 95.0
print(TPBI.notes[:200]) # right-censoring guidance...
# Use as sanity check on fitted xi
fit = GPDFitter(claims, threshold=u).fit()
lo, hi = TPBI.typical_xi_range
if not (lo <= fit.xi <= hi):
    print(f"Warning: xi={fit.xi:.3f} outside typical TPBI range {TPBI.typical_xi_range}")
GEV for annual maxima (subsidence, portfolio-level)
from insurance_evt import GEVFitter
fitter = GEVFitter(annual_maxima) # annual maximum claim per year
fit = fitter.fit()
print(fitter.domain_of_attraction()) # 'Frechet' for heavy-tailed
print(f"1-in-200: £{fitter.return_level(200):,.0f}")
lo, hi = fitter.profile_likelihood_ci(200)
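For anyone cross-checking against scipy directly, note that scipy.stats.genextreme parameterises the shape as c = -xi, so a heavy-tailed (Frechet) fit has negative c. The sketch below uses hypothetical helper names and made-up parameter values. It computes the T-year return level as the 1 - 1/T quantile of the annual maximum and compares it with the closed form.

```python
import math
from scipy.stats import genextreme


def gev_return_level(T, mu, sigma, xi):
    """T-year return level: the (1 - 1/T) quantile of the annual maximum."""
    # scipy's shape parameter has the opposite sign to xi
    return genextreme.ppf(1.0 - 1.0 / T, c=-xi, loc=mu, scale=sigma)


def gev_return_level_closed_form(T, mu, sigma, xi):
    """Same quantity from the GEV quantile formula (xi != 0)."""
    y = -math.log(1.0 - 1.0 / T)
    return mu + sigma / xi * (y ** (-xi) - 1.0)


def domain_of_attraction(xi, tol=1e-8):
    """Classify the tail by the sign of xi."""
    if xi > tol:
        return "Frechet"
    return "Weibull" if xi < -tol else "Gumbel"


# Hypothetical GEV parameters for annual maximum claims
mu, sigma, xi = 900_000.0, 350_000.0, 0.25
rl_scipy = gev_return_level(200, mu, sigma, xi)
rl_formula = gev_return_level_closed_form(200, mu, sigma, xi)
print(domain_of_attraction(xi), f"{rl_scipy:,.0f}", f"{rl_formula:,.0f}")
```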
Diagnostic plots
import matplotlib.pyplot as plt
from insurance_evt import plots
sel = ThresholdSelector(claims)
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
mrl_result = sel.mrl_plot()
plots.plot_mrl(mrl_result, ax=axes[0])
stability = sel.parameter_stability_plot()
plots.plot_parameter_stability(stability, axes=[axes[1]])
# Hill plot
hill = sel.hill_plot(bootstrap_ci=True, n_boot=200)
plots.plot_hill(hill)
# Return level curve
curve = calc.return_level_curve([10, 20, 50, 100, 200, 500], method='profile')
plots.plot_return_level_curve(curve)
# GPD PP/QQ plots
plots.plot_gpd_diagnostic(fit, fitter._exceedances)
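For reference, the quantity behind a Hill plot can be computed in a few lines of numpy. This is a sketch with a hypothetical function name, without the bootstrap intervals, and it assumes strictly positive claims. For each k, the Hill estimate of xi is the mean of the k largest log-claims minus the log of the (k+1)-th largest.

```python
import numpy as np


def hill_estimates(claims, k_max=None):
    """Hill estimates of xi for k = 1..k_max upper order statistics."""
    x = np.sort(np.asarray(claims, dtype=float))[::-1]  # descending order
    k_max = x.size - 1 if k_max is None else min(k_max, x.size - 1)
    log_x = np.log(x)
    k = np.arange(1, k_max + 1)
    # Mean of the k largest log-claims minus the log of the (k+1)-th largest
    xi_hat = np.cumsum(log_x[:k_max]) / k - log_x[1:k_max + 1]
    return k, xi_hat


# Simulated classical Pareto claims with tail index 3, so xi = 1/3
rng = np.random.default_rng(2)
claims_sim = rng.pareto(3.0, size=2000) + 1.0
k, xi_hat = hill_estimates(claims_sim, k_max=500)
print(f"Hill estimate at k=200: {xi_hat[199]:.3f}")
```

Plotting xi_hat against k and looking for a stable stretch is the usual way to read the result.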
A note on threshold selection
There is no automated threshold selection method that reliably outperforms expert visual inspection. auto_threshold() uses a KS-test-based selection. Treat its output as a starting point, not a verdict.
The mean residual life plot and parameter stability plot are the diagnostic tools. Two analysts looking at the same plots may pick thresholds that differ by 10-15%. That's normal, and the profile likelihood CI on the return level captures most of the resulting uncertainty.
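What the mean residual life plot shows can be reproduced in a few lines of numpy. The function name and the simulated data below are assumptions for illustration, not the library's code. For each candidate threshold u the function returns the mean excess over u. Above a threshold where the GPD holds with xi < 1, the mean excess is linear in u with slope xi / (1 - xi), which is why linearity is the thing to look for.

```python
import numpy as np


def mean_residual_life(claims, thresholds):
    """Rows of (threshold, mean excess, standard error, n exceedances)."""
    claims = np.asarray(claims, dtype=float)
    rows = []
    for u in thresholds:
        excess = claims[claims > u] - u
        if excess.size < 2:
            continue  # too few exceedances to estimate a standard error
        se = excess.std(ddof=1) / np.sqrt(excess.size)
        rows.append((u, excess.mean(), se, excess.size))
    return np.array(rows)


# Simulated Lomax claims (a GPD with xi = 1/3), for illustration only
rng = np.random.default_rng(3)
claims_sim = rng.pareto(3.0, size=5000)
thresholds = np.quantile(claims_sim, np.linspace(0.50, 0.98, 25))
mrl = mean_residual_life(claims_sim, thresholds)
print(mrl[:3])
```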
A note on non-stationarity
Standard EVT assumes independent and identically distributed claims. For UK flood and subsidence, this assumption is increasingly questionable: the frequency of extreme rainfall events and hot/dry summers is changing. This library does not model non-stationarity (planned for v0.2). All reports include a stationarity warning. Document this assumption when submitting to your Internal Model Approval Process.
Background
The asymptotic foundations are in Coles (2001), An Introduction to Statistical Modeling of Extreme Values (Springer). The censored/truncated MLE follows Poudyal & Brazauskas (2023) and Albrecher & Beirlant (2024, arXiv:2511.22272). The ExcessGPD formula is from Reynkens et al. (2017), Insurance: Mathematics and Economics 77, 65–77.
License
MIT. Built by Burning Cost.