Global sensitivity analysis for insurance pricing models — variance decomposition via Shapley effects
The problem
You have a fitted pricing model — a GLM, gradient boosted tree, or anything else with a predict method. You want to know: which rating factors drive the most variance in your premiums?
The naive answer is SHAP. But SHAP decomposes individual predictions, not portfolio-level variance. For a regulatory submission or a fair value assessment, you need a statement like "vehicle group explains 34% of the variance in fitted premiums across our portfolio". That is a different question, and it needs a different tool.
The standard tool for this is Sobol indices — but Sobol first-order indices are only valid under independent inputs. UK motor rating factors are not independent. Driver age correlates with NCD level. Postcode correlates with vehicle type. Sobol S1 indices will over-count the contribution of factors that are correlated with high-importance factors.
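The over-counting is easy to see numerically. The sketch below (plain numpy, not part of the library) builds a purely additive model Y = X1 + X2 with correlated Gaussian inputs and estimates each first-order index by conditional-mean binning; the two S1 values sum to roughly 1 + rho instead of 1:

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.8
cov = [[1.0, rho], [rho, 1.0]]
x = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)
y = x[:, 0] + x[:, 1]  # additive model: no interactions at all

def s1(xi, y, bins=50):
    """First-order Sobol index, estimated as Var(E[Y | bin of xi]) / Var(Y)
    using quantile bins of the input."""
    edges = np.quantile(xi, np.linspace(0, 1, bins + 1))
    idx = np.clip(np.digitize(xi, edges[1:-1]), 0, bins - 1)
    cond_means = (np.bincount(idx, weights=y, minlength=bins)
                  / np.bincount(idx, minlength=bins))
    return cond_means[idx].var() / y.var()

total = s1(x[:, 0], y) + s1(x[:, 1], y)
print(total)  # ~1.8 (= 1 + rho), not 1.0: S1 over-counts under correlation
```

Analytically, E[Y | X1] = (1 + rho) X1 here, so each S1 equals (1 + rho) / 2 and the pair sums to 1 + rho.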
Shapley effects (Owen 2014; Song et al. 2016) solve this. They apply the same Shapley formula from cooperative game theory, but to the portfolio-level variance decomposition rather than to individual predictions. The resulting effects are non-negative and always sum to V[Y], regardless of correlations among the rating factors.
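Concretely, the Shapley effect of factor $j$ is the Shapley value of the "explained variance" game $c(u) = \operatorname{Var}\,\mathbb{E}[Y \mid X_u]$:

```latex
\phi_j \;=\; \frac{1}{d} \sum_{u \subseteq \{1,\dots,d\} \setminus \{j\}}
\binom{d-1}{|u|}^{-1} \bigl( c(u \cup \{j\}) - c(u) \bigr),
\qquad c(u) = \operatorname{Var}\,\mathbb{E}[Y \mid X_u],
```

and the efficiency axiom gives $\sum_{j=1}^{d} \phi_j = c(\{1,\dots,d\}) = \operatorname{Var}[Y]$ for any dependence structure among the $X_j$.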
This library implements Shapley effects with insurance-specific extensions:
- Exposure-weighted variance (mid-term policies, partial-year risks)
- Categorical rating factors via empirical sampling (no encoding)
- CLH subsampling for large portfolios (Rabitti & Tzougas 2025, EAJ)
- Fitted-model interface — pass your model, not parameter distributions
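The exposure weighting in the first bullet amounts to a weighted variance of the fitted values; a minimal sketch (the library's exact estimator may differ):

```python
import numpy as np

def exposure_weighted_variance(y, exposure):
    """Variance of fitted values with each policy weighted by its earned
    exposure (year fraction), so part-year risks count proportionally less."""
    w = np.asarray(exposure, dtype=float)
    y = np.asarray(y, dtype=float)
    mu = np.average(y, weights=w)
    return np.average((y - mu) ** 2, weights=w)

y = np.array([100.0, 200.0, 300.0])
print(exposure_weighted_variance(y, [1.0, 1.0, 1.0]))   # equals np.var(y)
print(exposure_weighted_variance(y, [1.0, 1.0, 0.25]))  # short policy downweighted
```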
Installation
```shell
pip install insurance-sensitivity
pip install "insurance-sensitivity[plots]"   # matplotlib for charts
pip install "insurance-sensitivity[polars]"  # polars DataFrame input
```
Quick start
```python
import pandas as pd

from insurance_sensitivity import SensitivityAnalysis

# fitted_glm: any model with a .predict(X) method
# training_df: the data the model was fitted on, with an 'exposure' column
sa = SensitivityAnalysis(
    model=fitted_glm,
    X=training_df,
    exposure_col='exposure',  # year fractions for each policy
    log_scale=True,           # decompose Var[log(fitted)]: the right choice
                              # for a multiplicative GLM
    random_state=42,
)

# Shapley effects: correct under correlated inputs
result = sa.shapley(
    n_perms=256,  # more permutations -> lower Monte Carlo error
)

print(result)
# ShapleyResult(total_variance=0.1847)
#   vehicle_group: 34.2%
#   ncd_band:      22.1%
#   driver_age:    18.4%
#   area:          11.3%
#   ...

result.plot_bar()  # horizontal bar chart with 95% CIs
result.plot_pie()  # pie chart of % contributions

# Sobol indices: faster, but warns if inputs are correlated
sobol = sa.sobol(n_samples=1024)
sobol.plot_bar()  # S1 and ST side by side
```
Large portfolios: CLH subsampling
For portfolios with more than ~10k rows, the k-nearest-neighbour step in the Song et al. estimator becomes the bottleneck. Rabitti & Tzougas (2025) showed that selecting roughly 2,000 representative observations via conditional Latin hypercube sampling reproduces the full-sample estimate closely, at a fraction of the cost.
```python
result = sa.shapley(
    n_perms=256,
    n_subsample=2500,  # subsample size (default: use the full dataset)
)
```
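The library's cLH implementation is not reproduced here, but the idea, covering every quantile stratum of every rating factor with few rows, can be sketched with a crude, deterministic greedy stand-in (not the full cLH optimisation):

```python
import numpy as np

def stratified_subsample(X, n):
    """Greedily pick n rows that jointly cover the per-column quantile
    strata. A crude stand-in for conditional Latin hypercube sampling;
    the library's actual cLH routine may differ."""
    X = np.asarray(X, dtype=float)
    N = len(X)
    ranks = X.argsort(axis=0).argsort(axis=0)  # per-column ranks 0..N-1
    strata = ranks * n // N                    # map ranks to n strata
    chosen, remaining = [], set(range(N))
    for s in range(n):
        cands = np.array(sorted(remaining))
        # pick the remaining row whose strata, summed over all columns,
        # sit closest to the target stratum s
        score = np.abs(strata[cands] - s).sum(axis=1)
        pick = int(cands[score.argmin()])
        chosen.append(pick)
        remaining.discard(pick)
    return np.array(chosen)

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(1000, 3))
idx = stratified_subsample(X_demo, 50)
print(len(idx), len(set(idx.tolist())))  # 50 50: distinct representative rows
```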
Group attributions
If you want attribution at the level of rating factor groups (e.g. all vehicle-related factors as one group, all driver-related factors as another):
```python
groups = {
    'vehicle': ['vehicle_group', 'vehicle_age', 'cc_band'],
    'driver': ['driver_age', 'ncd_band', 'licence_years'],
    'area': ['postcode_area', 'garage_type'],
}
result = sa.shapley(n_perms=256, groups=groups)
# the effects DataFrame now has rows: vehicle, driver, area
```
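Conceptually, a group is just a single player in the Shapley permutation game. The brute-force sketch below (not the library's Monte Carlo / k-NN estimator) makes that explicit on a toy Gaussian linear model, where c(u) = Var(E[Y | X_u]) has a closed form, and shows the group effects summing exactly to Var[Y]:

```python
import itertools
import numpy as np

# Toy model: Y = b @ X with X ~ N(0, Sigma); factors 0 and 1 are correlated
b = np.array([1.0, 0.5, 2.0])
Sigma = np.array([[1.0, 0.6, 0.0],
                  [0.6, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])

def cond_var_explained(u):
    """c(u) = Var(E[Y | X_u]) for the Gaussian linear model above."""
    u = sorted(u)
    if not u:
        return 0.0
    v = [i for i in range(len(b)) if i not in u]
    total = b @ Sigma @ b
    if not v:
        return total
    # Var(Y | X_u) is the Schur complement quadratic form in b_v
    S = Sigma[np.ix_(v, v)] - Sigma[np.ix_(v, u)] @ np.linalg.solve(
        Sigma[np.ix_(u, u)], Sigma[np.ix_(u, v)])
    return total - b[v] @ S @ b[v]

def shapley_effects(players):
    """Exact Shapley effects: average marginal gain in c(.) over all
    orderings, with each player a tuple of factor indices (a group)."""
    phi = {p: 0.0 for p in players}
    perms = list(itertools.permutations(players))
    for perm in perms:
        seen = []
        for p in perm:
            before = cond_var_explained([i for g in seen for i in g])
            seen.append(p)
            after = cond_var_explained([i for g in seen for i in g])
            phi[p] += (after - before) / len(perms)
    return phi

# Grouped attribution: 'vehicle' = factors {0, 1}, 'driver' = factor {2}
phi = shapley_effects([(0, 1), (2,)])
print(phi, sum(phi.values()))  # group effects sum to Var[Y] = 5.85
```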
Interaction effects
```python
interactions = sa.interaction_effects()
# Returns a DataFrame comparing phi_j with S1_j * V[Y].
# A large gap phi_j - S1_j * V[Y] means factor j acts mostly through
# interactions, not in isolation.
print(interactions[['factor', 'phi', 'S1_abs', 'interaction_pct']])
```
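A useful benchmark when reading this table: for independent inputs, Owen (2014) showed that the Shapley effect is bracketed by the first-order and total Sobol contributions,

```latex
S_1^{(j)} \, \operatorname{Var}[Y] \;\le\; \phi_j \;\le\; S_T^{(j)} \, \operatorname{Var}[Y],
```

so a large gap $\phi_j - S_1^{(j)} \operatorname{Var}[Y]$ signals interactions (and, under dependence, variance shared with correlated factors).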
When to use Shapley effects vs Sobol
Use Shapley effects (.shapley()) when:
- Your rating factors are correlated (almost always true)
- You need the effects to sum to total variance (required for regulatory use)
- You want a defensible decomposition for fair value / FCA reporting
Use Sobol indices (.sobol()) when:
- You know your inputs are approximately independent
- You want a faster, rougher estimate for exploration
- You need second-order interaction indices S2(i,j)
The library warns you if you run Sobol on correlated inputs.
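The library's check is not shown here, but one plausible implementation (a hypothetical sketch, not the library's code) flags any pair of inputs whose Spearman rank correlation exceeds a threshold:

```python
import numpy as np

def correlation_warning(X, threshold=0.3):
    """Hypothetical sketch of a correlated-inputs check: flag column pairs
    whose Spearman (rank) correlation exceeds the threshold."""
    X = np.asarray(X, dtype=float)
    ranks = X.argsort(axis=0).argsort(axis=0)  # rank-transform each column
    rho = np.corrcoef(ranks, rowvar=False)     # Pearson on ranks = Spearman
    return [(i, j, rho[i, j])
            for i in range(rho.shape[0])
            for j in range(i + 1, rho.shape[0])
            if abs(rho[i, j]) > threshold]

rng = np.random.default_rng(1)
age = rng.uniform(18, 80, 5000)
ncd = np.clip((age - 18) / 8 + rng.normal(0, 2, 5000), 0, 9)  # tied to age
area = rng.integers(0, 8, 5000)                               # independent
flags = correlation_warning(np.column_stack([age, ncd, area]))
print(flags)  # only the (age, ncd) pair exceeds the threshold
```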
Supported model types
The wrapper handles these automatically:
- sklearn: any estimator with .predict() or .predict_proba()
- statsmodels: GLM results with a .predict(exog=X) signature
- glum: GeneralizedLinearRegressor with .predict(X)
- LightGBM: Booster and the sklearn API
- XGBoost: Booster and the sklearn API
- CatBoost: CatBoostRegressor, CatBoostClassifier
For anything else, pass predict_fn='my_method_name'.
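A custom model only needs a predict-style method. The class below is purely illustrative (TariffTable and its score method are invented names, not part of the library): a base premium multiplied by per-factor relativities, with unknown levels defaulting to 1.0:

```python
import pandas as pd

class TariffTable:
    """Illustrative rating-table model: base premium times per-factor
    multipliers looked up from dicts."""
    def __init__(self, base, factors):
        self.base = base
        self.factors = factors  # {column: {level: multiplier}}

    def score(self, X):
        prem = pd.Series(self.base, index=X.index, dtype=float)
        for col, table in self.factors.items():
            prem *= X[col].map(table).fillna(1.0)  # unknown level -> 1.0
        return prem.to_numpy()

model = TariffTable(200.0, {'vehicle_group': {'A': 0.9, 'B': 1.3}})
X = pd.DataFrame({'vehicle_group': ['A', 'B', 'C']})
print(model.score(X))  # [180. 260. 200.]
```

Such a model would then be passed as model=model with predict_fn='score'.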
References
Owen, A.B. (2014). Sobol' indices and Shapley value. SIAM/ASA Journal on Uncertainty Quantification, 2(1), 245–251.
Song, E., Nelson, B.L. & Staum, J.C. (2016). Shapley effects for global sensitivity analysis: Theory and computation. SIAM/ASA Journal on Uncertainty Quantification, 4(1), 1060–1083.
Biessy, G. (2024). Construction of Rating Systems Using Global Sensitivity Analysis: A Numerical Investigation. ASTIN Bulletin, 54(1), 25–45. DOI: 10.1017/asb.2023.34
Saltelli, A. et al. (2010). Variance based sensitivity analysis of model output. Computer Physics Communications, 181(2), 259–270.
Rabitti, G. & Tzougas, G. (2025). Accelerating the computation of Shapley effects for datasets with many observations. European Actuarial Journal, 15, 885–898. DOI: 10.1007/s13385-025-00412-z