
insurance-elasticity


Causal price elasticity estimation and FCA PS21/5-compliant renewal pricing optimisation for UK personal lines — because naively regressing renewal flag on price in a formula-rated book measures confounding, not elasticity.


Why bother

Benchmarked against naive OLS elasticity (logistic regression with confounders) on 50,000 synthetic UK motor renewal records with known DGP. Results from Databricks run, 2026-03-16.

| Metric | OLS naive | DML (HistGBM nuisance) |
|---|---|---|
| ATE relative bias (prob scale) | 24.5% | 21.8% |
| NCD GATE RMSE | 0.0855 | 0.0448 (-47.6%) |
| 95% CI covers true ATE | No | Yes |
| Fit time (35k train) | 2.5 s | 6.4 s |

OLS in a formula-rated book measures the correlation between risk level and renewal propensity, not the causal price effect. DML residualises both outcome and price on the same confounder set, recovering a credible causal semi-elasticity. The key advantage is in segment-level heterogeneous effects: DML recovers NCD-band elasticity gradients 47.6% more accurately.




The problem

UK motor and home insurance pricing teams want to know one thing: if we increase this customer's renewal price by 10%, how much does their probability of renewing fall?

The naive answer — run a logistic regression of renewal flag on price, read off the coefficient — is wrong. Risk factors drive both the price (because we re-rate them into the premium) and the renewal decision (because higher-risk customers may also have fewer alternatives). Ordinary regression conflates the two.

Double Machine Learning (DML) separates them. It residualises both the outcome and the treatment on the same set of observable confounders, then estimates the causal effect from what's left. Applied to renewal data, it gives a semi-elasticity: the expected change in renewal probability per unit change in log price, controlling for everything in your rating factors.
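The residualisation step can be sketched with plain scikit-learn on a simulated book (a toy sketch, not this library's implementation; the true effect theta = -0.2 is known by construction, so both estimators can be checked against it):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 3))                        # observable rating factors
risk = X @ np.array([0.5, -0.3, 0.2])              # underlying risk score
D = 0.5 * risk + 0.5 * rng.normal(size=n)          # log price change: formula plus test noise
theta = -0.2                                       # true causal semi-elasticity
Y = theta * D - 0.6 * risk + 0.1 * rng.normal(size=n)  # renewal propensity

# Naive regression of outcome on treatment picks up the risk channel too.
naive = LinearRegression().fit(D.reshape(-1, 1), Y).coef_[0]

# DML: cross-fitted residualisation of BOTH outcome and treatment on X,
# then a final regression of residual on residual.
D_res = D - cross_val_predict(GradientBoostingRegressor(random_state=0), X, D, cv=5)
Y_res = Y - cross_val_predict(GradientBoostingRegressor(random_state=0), X, Y, cv=5)
theta_hat = LinearRegression(fit_intercept=False).fit(
    D_res.reshape(-1, 1), Y_res
).coef_[0]

print(f"naive: {naive:.2f}  DML: {theta_hat:.2f}  truth: {theta}")
```

The naive coefficient absorbs the risk-driven correlation between price and renewal and lands far from the truth; the residual-on-residual regression recovers it.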

This library wraps EconML's CausalForestDML and LinearDML to do exactly that, with insurance-specific defaults and an FCA-compliant pricing optimiser built in.

Blog post: Your Renewal Pricing Is Flying Blind — why OLS gives the wrong answer, the near-deterministic price problem in practice, and how to interpret CATE heterogeneity for UK personal lines.


What you get

  • Heterogeneous elasticity estimates: per-customer CATE and segment-level GATE (group average treatment effects by NCD band, age, channel, etc.)
  • Treatment variation diagnostics: flags the near-deterministic price problem before you fit — if your pricing grid leaves no residual variation, the results are meaningless
  • Elasticity surface: heatmap and bar chart of elasticity across two dimensions simultaneously
  • FCA PS21/5-compliant optimiser: maximises profit subject to the ENBP constraint (offer price <= equivalent new business price)
  • ENBP audit: per-policy FCA ICOBS 6B.2 compliance flag
  • Portfolio demand curve: renewal rate and expected profit across a sweep of price changes

Install

uv add "insurance-elasticity[all]"
# or
pip install "insurance-elasticity[all]"

Core dependencies: polars, numpy, scipy, scikit-learn. Optional (for fitting): econml>=0.15, catboost>=1.2. Optional (for plotting): matplotlib>=3.7.


Quick start

from insurance_elasticity.data import make_renewal_data
from insurance_elasticity.fit import RenewalElasticityEstimator
from insurance_elasticity.surface import ElasticitySurface
from insurance_elasticity.optimise import RenewalPricingOptimiser
from insurance_elasticity.diagnostics import ElasticityDiagnostics
from insurance_elasticity.demand import demand_curve

# 1. Load data (or use the synthetic generator for testing)
df = make_renewal_data(n=50_000)

# 2. Check treatment variation before fitting
diag = ElasticityDiagnostics()
report = diag.treatment_variation_report(
    df,
    treatment="log_price_change",
    confounders=["age", "ncd_years", "vehicle_group", "region", "channel"],
)
print(report.summary())
# If report.weak_treatment is True, read the suggestions before proceeding.

# 3. Fit the elasticity model
confounders = ["age", "ncd_years", "vehicle_group", "region", "channel"]
est = RenewalElasticityEstimator(
    cate_model="causal_forest",   # non-parametric CATE surface
    n_estimators=200,
    catboost_iterations=500,
    n_folds=5,
)
est.fit(df, outcome="renewed", treatment="log_price_change", confounders=confounders)

# 4. Average treatment effect
ate, lb, ub = est.ate()
print(f"ATE: {ate:.3f}  95% CI: [{lb:.3f}, {ub:.3f}]")
# A 1-unit increase in log price change shifts renewal probability by ATE
# (probability scale), so a 10% price increase (log change approx 0.095)
# shifts it by approx ATE * 0.095.

# 5. Segment-level elasticity
gate = est.gate(df, by="ncd_years")
print(gate)

# 6. Elasticity surface and plots
surface = ElasticitySurface(est)
fig = surface.plot_surface(df, dims=["ncd_years", "age_band"])
fig.savefig("elasticity_surface.png", dpi=150, bbox_inches="tight")

fig2 = surface.plot_gate(df, by="channel")
fig2.savefig("gate_by_channel.png", dpi=150, bbox_inches="tight")

# 7. FCA-compliant pricing optimisation
opt = RenewalPricingOptimiser(
    est,
    technical_premium_col="tech_prem",
    enbp_col="enbp",
    floor_loading=1.0,
)
priced_df = opt.optimise(df, objective="profit")

# 8. Compliance audit
audit = opt.enbp_audit(priced_df)
print(f"Breaches: {(audit['compliant'] == False).sum()} / {len(audit)}")

# 9. Portfolio demand curve
demand_df = demand_curve(est, df, price_range=(-0.25, 0.25, 50))
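Step 9's demand curve is conceptually a sweep of counterfactual price changes through the fitted effect. Under a constant semi-elasticity it reduces to a few lines (a hand-rolled sketch with hypothetical portfolio numbers, not the library's `demand_curve`):

```python
import numpy as np

theta = -0.18                 # hypothetical portfolio semi-elasticity, probability scale
p0 = 0.80                     # baseline renewal rate at zero price change
premium, cost = 420.0, 350.0  # average premium and expected claims plus expenses

d = np.linspace(-0.25, 0.25, 11)                  # sweep of log price changes
renewal = np.clip(p0 + theta * d, 0.0, 1.0)       # first-order counterfactual renewal rate
profit = renewal * (premium * np.exp(d) - cost)   # expected profit per policy

best = d[np.argmax(profit)]
# Corner solution at the top of the sweep: single-period profit keeps rising
# with price at plausible elasticities, which is exactly why the ENBP cap
# (and lifetime-value considerations) must bind in a real optimiser.
print(f"profit-maximising log price change in sweep: {best:+.2f}")
```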

Worked Example

price_elasticity_optimisation.py covers the complete DML workflow: elasticity estimation on a synthetic 50,000-policy motor book, heterogeneous CATE broken down by NCD band, channel, and age, an ENBP-constrained profit-maximising optimiser, and an efficient frontier showing the renewal rate versus expected profit trade-off across price change scenarios. Run it before fitting on your own data to understand how each component behaves.


The near-deterministic price problem

Insurance re-rating makes the offered price nearly a deterministic function of the observable risk factors. When Var(D - E[D|X]) / Var(D) < 10%, i.e. less than 10% of the price variation survives conditioning on X, DML has almost nothing to work with: the confidence intervals blow up and the point estimate is noise.

Always run ElasticityDiagnostics.treatment_variation_report() first. If weak_treatment is True, do not proceed to fitting without addressing it.

The report's suggestions cover the main remedies: A/B price tests, panel data with within-customer variation, quasi-experiments from bulk re-rates, and the PS21/5 regression discontinuity.
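The Var(D - E[D|X]) / Var(D) ratio itself is easy to approximate with any cross-fitted regressor (a hedged sketch, not the library's diagnostic; the 10% threshold follows the text above):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_predict

def residual_variation_ratio(X, d, cv=5):
    """Share of treatment variance surviving conditioning on confounders:
    Var(d - E[d|X]) / Var(d). Below roughly 0.10, the DML final stage is starved."""
    d_hat = cross_val_predict(GradientBoostingRegressor(random_state=0), X, d, cv=cv)
    return np.var(d - d_hat) / np.var(d)

rng = np.random.default_rng(1)
X = rng.normal(size=(4000, 4))
grid = X @ np.array([0.4, 0.3, -0.2, 0.1])       # purely formula-rated price change
tested = grid + 0.4 * rng.normal(size=4000)      # same grid plus A/B test noise

r_grid = residual_variation_ratio(X, grid)       # close to 0: weak treatment
r_test = residual_variation_ratio(X, tested)     # healthy residual share
print(f"grid only: {r_grid:.2f}  with test: {r_test:.2f}")
```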


FCA PS21/5 and ENBP

Since January 2022, UK GI firms must not quote a renewing customer a price above the equivalent new business price (ENBP). The RenewalPricingOptimiser enforces this as a hard per-policy constraint. The enbp_audit() method returns a per-row compliance flag for reporting to the compliance function.
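The hard constraint itself is a per-policy clip (a plain-numpy sketch with hypothetical values; the library's optimiser additionally trades price against the estimated elasticity):

```python
import numpy as np

def apply_enbp_cap(offer, enbp, floor=None):
    """Cap the renewal offer at the equivalent new business price (ICOBS 6B.2).
    An optional floor (e.g. technical premium) is applied after the cap; note
    that if floor > enbp the two constraints conflict and the cap is breached."""
    capped = np.minimum(offer, enbp)
    if floor is not None:
        capped = np.maximum(capped, floor)
    return capped

offer = np.array([520.0, 480.0, 610.0])   # proposed renewal offers
enbp = np.array([500.0, 500.0, 650.0])    # equivalent new business prices
tech = np.array([450.0, 490.0, 600.0])    # technical premium floor

priced = apply_enbp_cap(offer, enbp, floor=tech)
compliant = priced <= enbp
print(priced)      # [500. 490. 610.]
print(compliant)   # [ True  True  True]
```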


Treatment variable

The standard treatment is log(offer_price / last_year_price). This gives a semi-elasticity directly: a 1-unit change in D (price multiplied by e, roughly 2.7x) changes renewal probability by theta on the probability scale. For the typical 5-20% renewal re-rates in UK personal lines, interpret as: a 10% increase changes renewal probability by approximately theta * log(1.1) approx theta * 0.095.
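Worked through with an illustrative theta of -0.18 (a hypothetical estimate, purely to show the arithmetic):

```python
import math

theta = -0.18             # hypothetical semi-elasticity, probability scale
uplift = 0.10             # a 10% renewal price increase

d = math.log1p(uplift)    # log price change
delta_p = theta * d       # first-order change in P(renew)
print(f"log change: {d:.4f}")                  # 0.0953
print(f"renewal prob change: {delta_p:.4f}")   # -0.0172, i.e. about -1.7 pp
```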


Model choices

CausalForestDML (default): non-parametric, requires no pre-specified feature interactions, provides valid pointwise confidence intervals via honest splitting. Right for the elasticity surface. Computationally heavier.

LinearDML: assumes constant elasticity (or heterogeneity only through explicitly interacted features). Much faster. Right for quick portfolio-level ATE estimation.

CatBoost nuisance models: UK insurance data is full of categoricals (region, vehicle group, occupation, payment method). CatBoost is the default nuisance model choice. Note: the library currently one-hot encodes categorical columns before fitting (via _extract_arrays), so the native CatBoost categorical handling is not active. Passing pre-encoded features or a custom outcome_model / treatment_model will get you there faster than the default path.


Performance

Benchmarked against naive OLS elasticity (logistic regression with confounders) on 50,000 synthetic UK motor renewal records with known DGP (70/15/15 train/cal/test split). Run on Databricks serverless compute, 2026-03-16. See benchmarks/run_benchmark.py for full methodology.

DGP design: High-risk customers (low NCD, young, group D-F vehicle) face larger systematic price increases AND have lower base renewal probability. This creates positive confounding bias in OLS — the naive regression conflates the risk effect with the price effect.

Metrics are on the probability scale (average marginal effect: expected change in P(renew) per unit of log price change). True ATEs by NCD band range from -0.28 (NCD 0, most elastic) to -0.10 (NCD 5, least elastic).

| Metric | OLS logistic AME | DML (HistGBM nuisance) |
|---|---|---|
| Portfolio ATE estimate | -0.194 | -0.122 |
| True portfolio ATE | -0.156 | -0.156 |
| ATE relative bias | 24.5% | 21.8% |
| 95% CI covers true ATE | N/A | Yes |
| NCD GATE RMSE | 0.0855 | 0.0448 (-47.6%) |
| Fit time (35k train) | 2.5 s | 6.4 s |

Where DML wins: The 47.6% reduction in NCD GATE RMSE is the key result. OLS misranks the NCD bands because the confounding is unevenly distributed (low-NCD customers face both the largest systematic price increases and the lowest base renewal rates). DML's cross-fitting removes this. The segment-level heterogeneity — who is most price-sensitive — is what actually feeds into pricing decisions, not the portfolio average.

The ATE comparison: Both methods have meaningful bias in this partially-observable setting. The important difference is that DML provides a valid 95% confidence interval (covers the true value) while OLS has no interval at all. OLS is a point estimate from a mis-specified model; DML is an estimate with honest uncertainty quantification.

When to expect larger OLS bias: The benchmark uses price_variation_sd=0.08 — enough exogenous variation to identify the effect. In a tighter pricing grid (less A/B testing, more formula-driven re-rating), OLS bias will increase substantially while DML remains consistent.


References

  • Chernozhukov et al. (2018). Double/debiased machine learning for treatment and structural parameters. Econometrics Journal, 21(1).
  • Athey & Wager (2019). Estimating treatment effects with causal forests. Annals of Statistics, 47(2).
  • Guelman & Guillén (2014). A causal inference approach to measure price elasticity in automobile insurance. Expert Systems with Applications, 41(2).
  • FCA PS21/5 (2021). General Insurance Pricing Practices Policy Statement.

Worked Example

price_elasticity_optimisation.py — DML elasticity estimation from renewal data, ENBP-constrained optimiser, efficient frontier visualisation.

A Databricks-importable version is also available: Databricks notebook.


Related Libraries

| Library | What it does |
|---|---|
| insurance-demand | Conversion, retention, and demand curve modelling — elasticity estimates feed directly into demand curve construction |
| insurance-optimise | Constrained rate change optimisation — consumes elasticity estimates to find profit-maximising factor adjustments |
| insurance-causal | Double Machine Learning for causal treatment effects — the methodological foundation for causal elasticity estimation |

Other Burning Cost libraries

Model building

| Library | Description |
|---|---|
| shap-relativities | Extract rating relativities from GBMs using SHAP |
| insurance-interactions | Automated GLM interaction detection via CANN and NID scores |
| insurance-cv | Walk-forward cross-validation respecting IBNR structure |

Uncertainty quantification

| Library | Description |
|---|---|
| insurance-conformal | Distribution-free prediction intervals for Tweedie models |
| bayesian-pricing | Hierarchical Bayesian models for thin-data segments |
| insurance-credibility | Bühlmann-Straub credibility weighting |

Deployment and optimisation

| Library | Description |
|---|---|
| insurance-deploy | Champion/challenger framework with ENBP audit logging |
| insurance-optimise | Constrained rate change optimisation with FCA PS21/5 compliance |

Governance

| Library | Description |
|---|---|
| insurance-fairness | Proxy discrimination auditing for UK insurance models |
| insurance-governance | PRA SS1/23 model governance and validation reports |
| insurance-monitoring | Model monitoring: PSI, A/E ratios, Gini drift test |

All libraries and blog posts


Licence

MIT. Built by Burning Cost.


Need help implementing this in production? Talk to us.
