Weighted conformal prediction sets for individual insurance counterfactuals (Lei & Candès 2021) with sensitivity analysis (Jin et al. 2023) and FCA harm reporting
insurance-counterfactual-sets
Finite-sample valid prediction sets for individual insurance counterfactuals.
The problem this solves: you want to know what a specific policyholder would have paid under a different pricing treatment — say, new-business pricing instead of renewal pricing. A point estimate of the counterfactual is not enough; you need a prediction set with a rigorous coverage guarantee.
This library implements Lei & Candès (2021) weighted conformal inference for counterfactuals, with Jin et al. (2023) sensitivity analysis and FCA Consumer Duty harm reporting.
Why conformal inference for counterfactuals?
Standard causal inference (DML, TMLE) gives you average treatment effects with asymptotic confidence intervals. If you want individual-level prediction sets — valid for each policyholder, not just on average — you need conformal methods.
The key properties:
- Finite-sample marginal coverage: no asymptotic approximations, no distributional assumptions on outcomes
- Handles covariate shift: importance weighting from propensity scores reweights the calibration distribution to match each test point
- Heteroskedasticity: conformalized quantile regression (CQR) adapts interval width to local variability in claims
- Sensitivity analysis: Gamma-values tell you how large unmeasured confounding would need to be to invalidate your conclusion
What it is not
- The coverage guarantee is marginal (averaged across test policyholders), not conditional (for each individual). Conditional coverage is much harder and not provided here.
- This is a screening tool for Consumer Duty / ICOBS 6B review. It is not a legal determination of harm.
- The Gamma-value quantifies sensitivity under the marginal sensitivity model (Tan 2006), not all possible confounding structures.
Installation
pip install insurance-counterfactual-sets
With optional CatBoost support:
pip install "insurance-counterfactual-sets[catboost]"
Quick start
import numpy as np
from sklearn.linear_model import Ridge, LogisticRegression
from insurance_counterfactual_sets import (
WeightedConformalITE,
PropensityWeighter,
SensitivityAnalyzer,
FCAHarmReport,
)
# Assume X, Y, T are your training data arrays
# T=0: new business, T=1: renewal (or whichever treatment is relevant)
# 1. Split into train / calibration / test
n = len(X)
train_idx = np.arange(n // 2)
cal_idx = np.arange(n // 2, 3 * n // 4)
test_idx = np.arange(3 * n // 4, n)
# 2. Fit
pw = PropensityWeighter(LogisticRegression(max_iter=1000))
model = WeightedConformalITE(
outcome_model=Ridge(),
propensity_model=pw,
alpha=0.05, # 95% coverage
nonconformity="cqr", # CQR default: adaptive width
)
model.fit(X[train_idx], Y[train_idx], T[train_idx])
model.calibrate(X[cal_idx], Y[cal_idx], T[cal_idx])
# 3. Predict: what would renewal policyholders have paid as new customers?
sets = model.predict_counterfactual(X[test_idx], treatment_arm=0)
# Returns a Polars DataFrame: lower, upper, point_estimate, half_width
# 4. ITE prediction sets
ite = model.predict_ite(X[test_idx])
# ite_lower, ite_upper, ite_point — Minkowski sum of Y(1) and Y(0) sets
# 5. Sensitivity analysis
sa = SensitivityAnalyzer(model)
gamma_df = sa.gamma_report(X[test_idx], treatment_arm=0)
# gamma_value: smallest Gamma that would invalidate the conclusion
# robust: True if gamma_value > 1.5
# 6. FCA harm report
renewal_premiums = ... # actual premiums charged
report = FCAHarmReport(conformal_ite=model, sensitivity_analyzer=sa)
individual = report.individual_harm_assessment(
X[test_idx], Y[test_idx], renewal_premiums
)
summary = report.portfolio_summary(individual)
report.fca_attestation_pack(individual, output_dir="./evidence")
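The ITE set in step 4 is, per the comment, the Minkowski sum of the Y(1) and Y(0) sets. For intervals this reduces to endpoint arithmetic; a sketch with hypothetical numbers:

```python
# Minkowski-sum interval for ITE = Y(1) - Y(0): combine worst-case
# endpoints of the two counterfactual prediction sets (made-up values).
y1_lo, y1_hi = 400.0, 520.0   # prediction set for Y(1)
y0_lo, y0_hi = 350.0, 430.0   # prediction set for Y(0)

ite_lo = y1_lo - y0_hi        # lowest plausible effect: -30.0
ite_hi = y1_hi - y0_lo        # highest plausible effect: 170.0
```

The resulting interval is conservative by construction, which is why ITE sets are typically wider than either counterfactual set alone.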
Core API
WeightedConformalITE
The main class.
WeightedConformalITE(
outcome_model=None, # sklearn regressor, default Ridge
propensity_model=None, # PropensityWeighter, default LogisticRegression
alpha=0.05, # miscoverage level
method='split', # split conformal (only option currently)
nonconformity='cqr', # 'cqr' or 'abs'
)
- fit(X, Y, T): fit outcome models on the training fold
- calibrate(X_cal, Y_cal, T_cal): compute nonconformity scores on the calibration fold
- predict_counterfactual(X, treatment_arm=0): returns a Polars DataFrame with lower, upper, point_estimate, half_width
- predict_ite(X): ITE = Y(1) - Y(0) via Minkowski sum
Nonconformity scores:
abs: residual score R_i = |Y_i - mu_t(X_i)|. Simple, interpretable.
cqr: conformalized quantile regression score R_i = max(q_lo(X_i) - Y_i, Y_i - q_hi(X_i)). Gives tighter intervals when the outcome variance depends on covariates (typical for insurance claims). This is the default.
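The two scores can be computed side by side on a toy calibration set (all numbers below are hypothetical, purely to show the arithmetic):

```python
import numpy as np

# Hypothetical calibration data: observed outcomes, a point prediction
# mu, and fitted quantile bounds q_lo / q_hi from a quantile regressor.
y = np.array([100.0, 250.0, 180.0])
mu = np.array([120.0, 200.0, 170.0])
q_lo = np.array([90.0, 150.0, 140.0])
q_hi = np.array([150.0, 260.0, 210.0])

# 'abs' score: absolute residual around the point prediction
r_abs = np.abs(y - mu)                  # [20., 50., 10.]

# 'cqr' score: signed distance outside the quantile band; negative
# values mean the outcome fell strictly inside the band
r_cqr = np.maximum(q_lo - y, y - q_hi)  # [-10., -10., -30.]
```

Because the CQR score is measured relative to fitted quantiles rather than a single point prediction, its quantile adapts to covariate-dependent spread, which is what yields the tighter intervals.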
PropensityWeighter
PropensityWeighter(
estimator=None, # sklearn classifier, default LogisticRegression
calibrate=True, # Platt scaling via CalibratedClassifierCV
calibration_method='sigmoid',
clip_quantile=0.99, # clip weights at 99th percentile
min_propensity=0.01, # hard floor on propensity scores
)
- fit(X, T): fit the propensity model
- predict_propensity(X): returns P(T=1|X)
- predict_weights(X, T): returns clipped importance weights
- check_overlap(X, T): returns a dict with ESS, weight diagnostics, and an overlap warning
Weight clipping: without clipping, a single calibration unit with a tiny propensity score can dominate the weighted quantile. The 99th percentile clip is a practical robustness measure. You can disable it with clip_quantile=1.0.
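A stylized illustration of why clipping matters, with made-up numbers (and a 0.75 quantile cap rather than the library's 0.99, only so the effect is visible with four units):

```python
import numpy as np

# One small propensity score produces one huge inverse-propensity weight
# that dominates everything else.
e = np.array([0.50, 0.20, 0.02, 0.40])   # propensity scores, arm T=1
w = 1.0 / e                              # raw weights: [2., 5., 50., 2.5]

dominance = w.max() / w.sum()            # share held by the largest weight (~0.84)

# Cap at an empirical quantile of the weights, then clip
cap = np.quantile(w, 0.75)
w_clipped = np.minimum(w, cap)
```

After clipping, no single calibration unit can carry most of the mass in the weighted quantile, at the cost of a small bias in the weights.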
SensitivityAnalyzer
SensitivityAnalyzer(
conformal_ite, # fitted WeightedConformalITE
gamma_grid=None, # coarse grid for search, default arange(1, 5.1, 0.25)
)
- robust_prediction_set(X_test, gamma, treatment_arm=0): prediction set valid under confounding up to Gamma
- gamma_value(X_test, treatment_arm=0, null_value=0.0, gamma_max=10.0): minimum Gamma that invalidates the conclusion
- gamma_report(X_test, treatment_arm=0): Gamma-value per test unit as a Polars DataFrame
- ite_gamma_value(X_test, null_ite=0.0): Gamma-value for the ITE interval
Interpreting Gamma-values:
- Gamma = 1.0: no unmeasured confounding needed to shift the conclusion (fragile)
- Gamma = 1.5: odds of treatment could be 1.5x larger/smaller than estimated
- Gamma = 3.0: very robust — confounders would need to triple the treatment odds
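Concretely, the marginal sensitivity model bounds the odds of treatment, which in turn bounds the plausible inverse-propensity weights. The helper below is a sketch of that arithmetic, not part of the library's API:

```python
# Under the marginal sensitivity model, the true propensity e* satisfies
# 1/gamma <= odds(e*)/odds(e_hat) <= gamma. That brackets the implied
# weight 1/e* for a treated unit.
def weight_bounds(e_hat, gamma):
    odds = e_hat / (1.0 - e_hat)
    e_lo = (odds / gamma) / (1.0 + odds / gamma)   # smallest plausible e*
    e_hi = (odds * gamma) / (1.0 + odds * gamma)   # largest plausible e*
    return 1.0 / e_hi, 1.0 / e_lo                  # weight falls as e* rises

lo, hi = weight_bounds(0.5, 1.5)   # gamma = 1.5 around e_hat = 0.5
# the nominal weight 1/0.5 = 2.0 could plausibly be anywhere in [1.67, 2.5]
```

The robust prediction set takes the worst case over all weight configurations inside these brackets, which is why it widens as Gamma grows.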
FCAHarmReport
FCAHarmReport(
conformal_ite, # fitted WeightedConformalITE
sensitivity_analyzer=None # optional SensitivityAnalyzer
)
- individual_harm_assessment(X_renewals, Y_actual, renewal_premium): per-policyholder harm flags
- portfolio_summary(individual_results): aggregate statistics
- fca_attestation_pack(individual_results, output_dir): write CSV + HTML evidence pack
Harm definition: a policyholder is flagged if their renewal premium exceeds the upper bound of the (1-alpha) prediction set for what they would have paid as a new customer.
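The flagging rule is a simple vectorized comparison; a sketch with hypothetical premiums:

```python
import numpy as np

# Upper bounds of each policyholder's counterfactual new-business
# prediction set, versus the renewal premium actually charged
# (made-up numbers).
cf_upper = np.array([520.0, 480.0, 610.0])
renewal_premium = np.array([500.0, 495.0, 600.0])

# Flag when the charged premium exceeds the counterfactual upper bound
harm_flag = renewal_premium > cf_upper   # [False, True, False]
```

Note what the rule buys you: because the set has (1-alpha) coverage, a flagged policyholder's premium exceeds anything they would plausibly have paid as a new customer, not merely a point estimate.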
The algorithm
For test point x and target arm t:
- Compute nonconformity scores R_i for calibration units in arm t
- Compute importance weights: w_i = e(x)^[t=1] * (1-e(x))^[t=0] / (e(X_i)^[t=1] * (1-e(X_i))^[t=0])
- Compute the augmented weighted quantile Q_w at level (n+1)(1-alpha)/n
- Prediction set: [mu_t(x) - Q_w, mu_t(x) + Q_w] (abs) or [q_lo(x) - Q_w, q_hi(x) + Q_w] (CQR)
The augmented quantile (Tibshirani et al. 2019) appends an infinite score with weight 1 before the quantile computation. This is the finite-sample correction that guarantees coverage.
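A minimal sketch of the augmented weighted quantile, assuming the appended infinite score carries the test point's weight; this is the generic construction from Tibshirani et al. (2019), not necessarily the library's exact implementation:

```python
import numpy as np

def augmented_weighted_quantile(scores, weights, test_weight, alpha):
    s = np.append(scores, np.inf)        # append the infinite score
    w = np.append(weights, test_weight)
    p = w / w.sum()                      # normalise to a distribution
    order = np.argsort(s)
    cdf = np.cumsum(p[order])
    # smallest score whose cumulative weight reaches 1 - alpha
    idx = np.searchsorted(cdf, 1.0 - alpha)
    return s[order][idx]

# With uniform weights this reduces to the standard split-conformal
# quantile: 4 equal masses of 0.25, so the 0.75 level lands on the
# third-smallest score.
q = augmented_weighted_quantile([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], 1.0, 0.25)
# q == 3.0
```

The infinite score is what makes the guarantee finite-sample: if 1 - alpha exceeds the cumulative weight of all finite scores, the quantile (and hence the set) becomes infinite rather than silently undercovering.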
Overlap and diagnostics
Before trusting the prediction sets, check that your propensity model has adequate overlap:
diag = pw.check_overlap(X_cal, T_cal)
print(diag)
# {'ess': 0.72, 'weight_max': 4.2, 'overlap_warning': False, ...}
A low effective sample size (ESS < 0.3) is a warning sign: the weighted quantile is being driven by a small number of calibration units. Consider using a more flexible propensity model or restricting the analysis to a population with better overlap.
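One common definition behind a normalised ESS diagnostic is the Kish effective sample size; the library may normalise differently, but the sketch below shows the idea:

```python
import numpy as np

def ess(weights):
    """Kish effective sample size, normalised to (0, 1]."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / (len(w) * (w ** 2).sum())

uniform = ess([1, 1, 1, 1])    # 1.0: every unit contributes equally
skewed = ess([100, 1, 1, 1])   # ~0.27: one unit dominates the sample
```

An ESS of 0.27 on four units means the weighted calibration set behaves roughly like a single effective observation, which is exactly the failure mode the overlap warning guards against.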
References
- Lei, L. & Candès, E.J. (2021). Conformal Inference of Counterfactuals and Individual Treatment Effects. JRSS-B 83(5):911-938. arXiv:2006.06138.
- Jin, Y., Ren, Z. & Candès, E.J. (2023). Sensitivity Analysis of Individual Treatment Effects: A Robust Conformal Inference Approach. PNAS 120(6). arXiv:2111.12161.
- Romano, Y., Patterson, E. & Candès, E.J. (2019). Conformalized Quantile Regression. NeurIPS 2019. arXiv:1905.03222.
- Tibshirani, R.J., Barber, R.F., Candès, E.J. & Ramdas, A. (2019). Conformal Prediction Under Covariate Shift. NeurIPS 2019. arXiv:1904.06019.
- Tan, Z. (2006). A distributional approach for causal inference using propensity scores. JASA 101(476):1619-1637.
License
MIT. See LICENSE.