insurance-spatial
Spatial tools for UK insurance pricing in one install: BYM2 territory ratemaking and spatially weighted conformal prediction intervals.
The Problems
Territory pricing in UK personal lines is broken in predictable ways.
The standard approach — a GLM with postcode sector as a categorical predictor — creates 11,200 separate territory parameters for motor, most of them estimated from a handful of claims. Adjacent sectors can differ by 30–40% on sparse data not because the underlying risk differs but because the estimates are noisy. Standard practice is to band sectors into 6–20 groups using k-means on historical loss ratios. This is ad hoc, discards information, and creates artificial discontinuities at band boundaries.
The academically-grounded alternative is the BYM2 model (Besag-York-Mollié, Riebler et al. 2016): a Bayesian hierarchical model that borrows strength across neighbouring postcode sectors, quantifies how much geographic variation is genuinely spatial vs. idiosyncratic noise, and produces territory relativities with proper uncertainty estimates.
Prediction intervals from a pricing model are nationally calibrated but geographically broken. Standard conformal prediction gives you a guarantee that 90% of risks are covered nationwide — but the coverage can be 70% in inner London and 98% in rural Somerset. That geographic miscalibration is a conduct risk under Consumer Duty.
The fix is spatially weighted conformal prediction: calibration non-conformity scores are weighted by geographic proximity to each test point, so intervals in Taunton reflect error behaviour in the South West, not the national average.
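The core mechanism can be sketched in a few lines. This is a minimal illustration of the idea, not this library's implementation (which also handles the finite-sample correction and test-point weighting described in Tibshirani et al. 2019): calibration non-conformity scores are kernel-weighted by distance to the test point, and the interval half-width is a weighted quantile of those scores.

```python
import numpy as np

def weighted_conformal_quantile(scores, dist_km, bandwidth_km=20.0, alpha=0.10):
    """Weighted (1 - alpha) quantile of calibration non-conformity scores.

    Calibration points near the test location (small dist_km) get more
    weight, so the interval width reflects local error behaviour rather
    than the national average.
    """
    w = np.exp(-0.5 * (dist_km / bandwidth_km) ** 2)  # Gaussian spatial kernel
    w = w / w.sum()
    order = np.argsort(scores)
    cdf = np.cumsum(w[order])
    # smallest score whose weighted CDF reaches the target level
    idx = np.searchsorted(cdf, 1.0 - alpha)
    return scores[order][min(idx, len(scores) - 1)]

# sanity check: with all distances equal this reduces to the plain quantile
scores = np.arange(100, dtype=float)
q = weighted_conformal_quantile(scores, dist_km=np.zeros(100), alpha=0.10)
```

With uniform weights this recovers ordinary split conformal; the bandwidth controls how sharply the calibration set is localised around each test point.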
What's in this package
Two sub-systems, one install:
BYM2 territory ratemaking (`insurance_spatial` top-level):
- Build adjacency matrices from GeoJSON polygon files or grids
- Fit BYM2 Poisson models via PyMC v5
- Test spatial autocorrelation with Moran's I
- Extract territory relativities with credibility intervals, ready as GLM offsets
- MCMC convergence diagnostics
Spatially weighted conformal prediction (`insurance_spatial.conformal`):
- Geographically calibrated prediction intervals for any sklearn-compatible model
- Gaussian, Epanechnikov, and uniform spatial kernels
- Cross-validated bandwidth selection using spatial blocking
- Tweedie Pearson non-conformity scores (recommended for GLM/GBM pricing models)
- MACG (Mean Absolute Coverage Gap) spatial diagnostic
- FCA Consumer Duty table: coverage by geographic region
Installation

```shell
uv add insurance-spatial
```

With optional geo dependencies (shapefiles and spatial weights):

```shell
uv add "insurance-spatial[geo]"
```

With the faster nutpie MCMC sampler:

```shell
uv add "insurance-spatial[nutpie]"
```
BYM2 Quick Start
```python
import numpy as np

from insurance_spatial import build_grid_adjacency, BYM2Model
from insurance_spatial.diagnostics import moran_i

# 1. Build adjacency for a synthetic 10x10 grid of territories.
#    In production, use build_grid_adjacency() from a real postcode-sector grid
#    or AdjacencyMatrix.from_geojson() with ONSPD boundary files.
adj = build_grid_adjacency(10, 10, connectivity="queen")
N = len(adj.areas)  # 100 territories
print(f"Territories: {N}, scaling factor: {adj.scaling_factor:.3f}")

# 2. Synthetic territory data — 100 sectors, true claim rate ~7%.
#    Replace with your actual observed claims and earned exposure per sector.
rng = np.random.default_rng(42)
exposure = rng.uniform(200, 2_000, size=N)       # policy-years per sector
true_log_rate = rng.normal(-2.66, 0.35, size=N)  # spatial variation around 7%
claims = rng.poisson(np.exp(true_log_rate) * exposure)

# 3. Test for spatial autocorrelation before fitting.
#    Run BYM2 only when Moran's I is significant — otherwise simpler
#    credibility weighting suffices and MCMC runtime is wasted.
log_oe = np.log((claims / exposure) / np.mean(claims / exposure) + 1e-8)
test = moran_i(log_oe, adj, n_permutations=999)
print(test.interpretation)

# 4. Fit the BYM2 model (requires PyMC — uv add pymc)
model = BYM2Model(adjacency=adj, draws=1000, chains=4)
result = model.fit(
    claims=claims,      # np.ndarray, shape (N,)
    exposure=exposure,  # np.ndarray, shape (N,) — earned policy-years
)

# 5. Check convergence
diag = result.diagnostics()
print(f"Max R-hat: {diag.convergence.max_rhat:.3f}")    # want < 1.01
print(f"Min ESS: {diag.convergence.min_ess_bulk:.0f}")  # want > 400

# 6. Extract territory relativities ready for use as GLM offsets
rels = result.territory_relativities(credibility_interval=0.95)
# columns: area | relativity | lower | upper | ln_offset
# Pass ln_offset as a fixed offset in your downstream Poisson GLM
```
Spatial Conformal Prediction
Standard conformal prediction guarantees 90% coverage nationally. It does not guarantee 90% coverage in every postcode district. This matters because the FCA expects you to demonstrate fair outcomes across geography, and systematic under-coverage in deprived areas or rural postcodes creates conduct risk.
The spatially weighted conformal predictor wraps your existing fitted model and produces geographically calibrated intervals:
```python
import numpy as np
import polars as pl
from sklearn.dummy import DummyRegressor

from insurance_spatial.conformal import SpatialConformalPredictor, SpatialCoverageReport

# Synthetic UK motor data: 2,000 calibration + 500 test policies with lat/lon
# (UK mainland bounding box: lat 50 to 58, lon -5 to 2)
rng = np.random.default_rng(0)
n_cal, n_test = 2_000, 500
lat_cal = rng.uniform(50.5, 57.5, size=n_cal)
lon_cal = rng.uniform(-4.5, 1.5, size=n_cal)
lat_test = rng.uniform(50.5, 57.5, size=n_test)
lon_test = rng.uniform(-4.5, 1.5, size=n_test)
X_cal = rng.standard_normal((n_cal, 5))   # five rating factors
X_test = rng.standard_normal((n_test, 5))
y_cal = rng.gamma(1.5, scale=800, size=n_cal)  # claim severity

# Fit a simple placeholder model (replace with your real CatBoost/LightGBM)
fitted_lgbm = DummyRegressor(strategy='mean').fit(X_cal, y_cal)

# Wrap your fitted GBM or GLM
scp = SpatialConformalPredictor(
    model=fitted_lgbm,
    nonconformity='pearson_tweedie',  # recommended for burning cost models
    tweedie_power=1.5,
    bandwidth_km=20.0,                # or None to select automatically via CV
)

# Calibrate on a held-out set
cal_result = scp.calibrate(
    X_cal, y_cal,
    lat=lat_cal, lon=lon_cal,  # or postcodes=['SW1A 2AA', ...]
)
print(f"Bandwidth: {cal_result.bandwidth_km} km")

# Generate prediction intervals
intervals = scp.predict_interval(
    X_test, lat=lat_test, lon=lon_test, alpha=0.10,  # 90% intervals
)
print(intervals.lower[:5], intervals.upper[:5])

# Diagnose spatial coverage quality
report = SpatialCoverageReport(scp)

# Reuse the test set as validation here; use a separate validation split in production.
y_test = rng.gamma(1.5, scale=800, size=n_test)  # observed values for the coverage check
X_val, y_val = X_test, y_test
lat_val, lon_val = lat_test, lon_test
result = report.evaluate(X_val, y_val, lat=lat_val, lon=lon_val)
print(f"MACG: {result.macg:.4f}")  # lower = more spatially uniform coverage

# FCA Consumer Duty table
table = report.fca_consumer_duty_table(region_labels=None)  # pass region names if available
print(table.filter(pl.col('flag') == 'REVIEW'))
```
Using UK postcodes instead of coordinates:
```python
from insurance_spatial.conformal import PostcodeGeocoder

gc = PostcodeGeocoder()
lat_cal, lon_cal = gc.geocode(postcodes_cal)
scp.calibrate(X_cal, y_cal, lat=lat_cal, lon=lon_cal)

# Or pass postcodes directly
scp.calibrate(X_cal, y_cal, postcodes=postcodes_cal)
```
Non-conformity score choice
The score is the key design decision. For pricing models:
| Score | When to use |
|---|---|
| `pearson_tweedie` (default) | GLM/GBM with Tweedie objective (burning cost, severity) |
| `pearson` | Poisson frequency models |
| `absolute` | Baseline only — ignores heteroscedasticity |
| `scaled_absolute` | Two-model approach with a separate spread model |
The Tweedie Pearson score `|y - yhat| / yhat^(p/2)` variance-stabilises the residuals before weighting, so the spatial kernel is not confounded by the model's own heteroscedasticity.
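A minimal sketch of that score (illustrative only; the library computes it internally when you pass nonconformity='pearson_tweedie'):

```python
import numpy as np

def tweedie_pearson_score(y, y_hat, power=1.5):
    """Tweedie Pearson non-conformity: |y - y_hat| / y_hat**(power / 2).

    Dividing by the square root of the Tweedie variance function
    V(mu) = mu**power puts residuals from low- and high-prediction
    policies on a common scale.
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return np.abs(y - y_hat) / np.power(y_hat, power / 2.0)

# the same £400 miss scores far higher against a £100 prediction
# than against a £10,000 one
s = tweedie_pearson_score([500.0, 10_400.0], [100.0, 10_000.0], power=1.5)
```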
Bandwidth selection
If you do not supply bandwidth_km, it is selected via spatial blocking cross-validation. The CV minimises MACG (Mean Absolute Coverage Gap) across a spatial grid, which directly measures what matters — geographic coverage consistency.
```python
# Auto bandwidth selection
scp = SpatialConformalPredictor(model=fitted_model, bandwidth_km=None)
result = scp.calibrate(
    X_cal, y_cal, lat=lat_cal, lon=lon_cal,
    cv_candidates_km=[5.0, 10.0, 20.0, 30.0, 50.0],
    cv_folds=5,
)
print(f"CV-selected bandwidth: {result.bandwidth_km} km")
```
The BYM2 model
The model for area i:
```
y_i       ~ Poisson(mu_i)
log(mu_i) = log(E_i) + alpha + X_i @ beta + b_i
b_i       = sigma * (sqrt(rho / s) * phi_i + sqrt(1 - rho) * theta_i)
phi       ~ ICAR(W)         # structured spatial component
theta     ~ Normal(0, 1)    # unstructured IID component
sigma     ~ HalfNormal(1)   # total territory SD
rho       ~ Beta(0.5, 0.5)  # proportion attributable to spatial structure
```
s is the BYM2 scaling factor: the geometric mean of the marginal variances implied by the ICAR precision matrix (the diagonal of its generalised inverse under the sum-to-zero constraint). Dividing by sqrt(s) gives phi unit variance on the geometric-mean scale, so rho and sigma are interpretable regardless of the graph topology.
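For intuition, here is a dense-matrix sketch of that computation. This is illustrative only; `adj.scaling_factor` uses a sparse solve for graphs of realistic size (see Computational notes).

```python
import numpy as np

def bym2_scaling_factor(W):
    """Scaling factor s for a connected 0/1 symmetric adjacency matrix W.

    Q = D - W is the ICAR precision (singular, rank N-1). The marginal
    variances are the diagonal of its generalised inverse, which encodes
    the sum-to-zero constraint; s is their geometric mean.
    """
    W = np.asarray(W, dtype=float)
    Q = np.diag(W.sum(axis=1)) - W          # ICAR precision matrix
    Q_ginv = np.linalg.pinv(Q)              # generalised (Moore-Penrose) inverse
    marg_var = np.diag(Q_ginv)              # marginal variance per area
    return float(np.exp(np.mean(np.log(marg_var))))

# 4-cycle graph: every node is symmetric, so all marginal variances are
# equal and s is just that common value
W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
s = bym2_scaling_factor(W)
```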
Why the rho parameter matters. After fitting, the posterior of rho tells you directly how much of the residual geographic variation is spatially smooth. If rho → 1, nearby sectors genuinely tend to have similar risk; BYM2 smoothing is adding real information. If rho → 0, territory variation is area-specific noise; the data do not support spatial smoothing and you are better off with simpler credibility weighting.
Recommended pipeline
The library supports two use patterns:
Integrated: pass raw claims and exposure per sector. The model captures all geographic variation.
Two-stage (recommended for production): fit your main GLM or GBM without territory, compute sector-level O/E ratios, then pass those to BYM2. This keeps the spatial model decoupled and easier to explain:
```python
# Stage 1: base GLM without territory
# ...compute expected claims per sector from base model...

# Stage 2: spatial model on residual O/E
result = model.fit(
    claims=sector_observed_claims,
    exposure=sector_expected_claims,  # <-- E_i is the base model's fitted value
)
```
The two-stage approach also means the territory factor is auditable independently of the main rating model — useful for regulatory filings.
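Stage 1 might look like the following with a scikit-learn Poisson GLM. All data and variable names here are illustrative, not part of this package: the point is that fitted claims are rolled up to sector level to form E_i.

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor

# Illustrative policy-level data: rating factors WITHOUT territory
rng = np.random.default_rng(1)
n_policies, n_sectors = 5_000, 50
X = rng.standard_normal((n_policies, 4))            # rating factors
exposure = rng.uniform(0.2, 1.0, size=n_policies)   # earned policy-years
sector = rng.integers(0, n_sectors, size=n_policies)
claims = rng.poisson(0.08 * exposure)               # observed claim counts

# Stage 1: frequency GLM on rate = claims / exposure, exposure-weighted
glm = PoissonRegressor(alpha=1e-4)
glm.fit(X, claims / exposure, sample_weight=exposure)

# Expected claims per policy, then summed per sector: this is E_i
expected = glm.predict(X) * exposure
sector_expected_claims = np.bincount(sector, weights=expected, minlength=n_sectors)
sector_observed_claims = np.bincount(sector, weights=claims, minlength=n_sectors)
```

Because the base model already explains the non-geographic variation, the BYM2 stage then smooths only the residual observed-over-expected signal.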
UK data sources
To get started with real UK territory data:
| Data | Source | Use |
|---|---|---|
| Postcode sector boundaries | Derived from OS CodePoint Open (free) via Voronoi | Adjacency construction |
| ONSPD | ONS Open Geography Portal | Postcode → sector/LSOA lookup |
| Index of Multiple Deprivation | MHCLG (gov.uk) | Covariates |
| Vehicle crime by LSOA | data.police.uk | Covariates |
| Flood risk by postcode | Environment Agency (data.gov.uk) | Home insurance covariates |
See the demo notebook for a full synthetic example and comments on each data source.
Computational notes
For N=11,200 UK postcode sectors, the ICAR model is feasible — the pairwise difference formulation is O(N·K) where K≈6 mean neighbours. Published benchmarks suggest ~20–30 minutes for 4 chains × 1,000 draws on modern hardware. The scaling factor computation (adj.scaling_factor) is a one-off sparse linear solve per graph topology; cache it between runs.
nutpie is recommended for production (`uv add nutpie`): it uses a Rust NUTS implementation and is typically 2–5x faster than PyMC's default sampler for models of this type.
Performance
Benchmarked against flat territory banding (5 quintile bands by raw observed frequency) and the grand mean on a synthetic 12×12 grid of territories (144 areas) with known DGP and genuine spatial autocorrelation. MSE evaluated against true DGP rates. Full notebook: notebooks/benchmark.py.
| Metric | Grand mean | Flat bands (5 quintiles) | BYM2 |
|---|---|---|---|
| Overall MSE vs true rates | highest | moderate | lowest |
| Thin territory MSE | moderate | high (noisy raw rates banded) | lowest |
| Thick territory MSE | highest | low | matches thick raw |
| Moran's I on residuals | high (signal remains) | moderate (artefact edges) | near zero (spatial signal absorbed) |
| Fit time | instant | instant | 3–8 min (MCMC, 500 draws × 2 chains) |
The Moran's I comparison is the diagnostic that matters here. Flat banding leaves detectable spatial autocorrelation in the residuals — the band boundaries are artefacts of sampling variation, not genuine rate cliffs. BYM2's posterior residuals show Moran's I near zero, meaning the spatial signal has been captured. The rho parameter from the fitted model also tells you directly how much of the residual territory variation is genuinely spatial vs. area-specific noise.
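For intuition, Moran's I on a graph with 0/1 adjacency W is I = (N / sum(W)) * (z'Wz) / (z'z), with z the mean-centred values. A from-scratch version (illustrative only; the library's moran_i additionally runs permutation inference):

```python
import numpy as np

def morans_i(x, W):
    """Moran's I for values x on a graph with 0/1 symmetric adjacency W.

    Near zero: no spatial autocorrelation. Positive: neighbours tend to
    be alike. Negative: neighbours tend to alternate.
    """
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return (len(x) / W.sum()) * (z @ (W @ z)) / (z @ z)

# chain graph 0-1-2-3
W = np.zeros((4, 4))
for i in range(3):
    W[i, i + 1] = W[i + 1, i] = 1.0

i_smooth = morans_i(np.array([1.0, 2.0, 3.0, 4.0]), W)       # smooth gradient
i_alternating = morans_i(np.array([1.0, -1.0, 1.0, -1.0]), W)  # checkerboard
```

A smooth gradient scores clearly positive while the checkerboard scores negative, which is the pattern the residual diagnostic above is looking for.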
When to use: UK personal lines territory pricing where postcode sectors have heterogeneous exposure depths, genuine spatial gradients in risk (urban/rural, deprivation, theft patterns), and where band discontinuities at district boundaries create conduct risk under Consumer Duty. The two-stage approach (main GLM without territory, then BYM2 on O/E residuals) keeps the spatial model auditable independently.
When NOT to use: When spatial autocorrelation is not present (test with Moran's I before fitting — the library includes moran_i()), or when the rho posterior is near zero (meaning the data do not support spatial smoothing and simpler credibility weighting suffices). The 3–8 minute MCMC runtime per territory refresh is acceptable for monthly or quarterly batch cycles but not for real-time use.
Databricks Notebook
A ready-to-run Databricks notebook benchmarking this library against standard approaches is available in burning-cost-examples.
Related libraries
| Library | Why it's relevant |
|---|---|
| insurance-multilevel | Broker and scheme random effects — the same credibility-weighting logic applied to group factors instead of territory |
| insurance-credibility | Bühlmann-Straub closed-form credibility — simpler alternative when spatial correlation is not the primary concern |
| shap-relativities | Extract the base model's implicit territory effect before passing O/E ratios to BYM2 |
| insurance-causal | Test whether a postcode factor is a genuine risk driver or a proxy for a protected characteristic |
| insurance-demand | Conversion and retention modelling — territory is a key feature in demand models |
References
- Riebler, A., Sørbye, S.H., Simpson, D., & Rue, H. (2016). An intuitive Bayesian spatial model for disease mapping that accounts for scaling. Statistical Methods in Medical Research, 25(4), 1145–1165.
- Gschlössl, S., Schelldorfer, J., & Schnaus, M. (2019). Spatial statistical modelling of insurance risk. Scandinavian Actuarial Journal.
- Besag, J., York, J., & Mollié, A. (1991). Bayesian image restoration, with two applications in spatial statistics. Annals of the Institute of Statistical Mathematics, 43(1), 1–40.
- Hjort, N. L., Jullum, M., & Løland, A. (2025). Uncertainty quantification in automated valuation models with spatially weighted conformal prediction. International Journal of Data Science and Analytics (Springer). doi:10.1007/s41060-025-00862-4
- Tibshirani, R. J., Barber, R. F., Candès, E. J., & Ramdas, A. (2019). Conformal prediction under covariate shift. NeurIPS 2019.
More related libraries
| Library | What it does |
|---|---|
| bayesian-pricing | Hierarchical Bayesian models for thin rating cells — the same partial-pooling logic applied to non-geographic grouping factors |
| insurance-glm-tools | GLM tooling including R2VF factor merging — use to band territory factors after BYM2 produces the relativities |
Licence
MIT