Insurance workflow wrapper around interpretML's ExplainableBoostingMachine

insurance-ebm

An insurance pricing workflow wrapper around interpretML's ExplainableBoostingMachine.

The problem

Gradient boosted models and neural networks can outperform GLMs on pure predictive accuracy, but they're hard to use in a regulated pricing environment. You need to explain your rating factors to an actuary, show that they're sensible, enforce business constraints (e.g. older car = lower comprehensive premium), and produce relativity tables that a pricing committee can review.

interpretML's ExplainableBoostingMachine solves the interpretability problem — it's an additive model that produces shape functions you can inspect feature by feature, just like a GLM's factor table. It also handles Poisson/Tweedie/Gamma loss, exposure offsets, monotonicity constraints, and interaction detection natively.

What it doesn't do is wrap those capabilities in the workflow a UK pricing team actually uses: exposure-aware predict(), relativity table extraction, actuarial validation metrics (Gini, double-lift, deviance), post-fit monotonicity editing, and GLM comparison tools.

That's what this library provides.

What you get

InsuranceEBM        — fit/predict with Poisson, Tweedie, Gamma. Exposure as offset.
RelativitiesTable   — extract relativity tables in the format a pricing committee expects.
Diagnostics         — Gini, Lorenz curve, double-lift, deviance, residual plots, A/E by segment.
MonotonicityEditor  — post-fit enforcement of monotone shape functions via isotonic regression.
GLMComparison       — compare EBM shape functions against your existing GLM factor tables.

Installation

pip install insurance-ebm
# With Excel export (quoted so shells such as zsh do not expand the brackets):
pip install "insurance-ebm[excel]"
# With statsmodels GLM integration:
pip install "insurance-ebm[glm]"

Quick start

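import polars as pl
from insurance_ebm import InsuranceEBM, RelativitiesTable
from insurance_ebm import gini, double_lift

# X_train, X_test: polars DataFrames of rating factors, prepared beforehand.
# y_train, y_test: polars DataFrames with 'claim_count' and 'exposure' columns.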

# Fit a Poisson frequency model
model = InsuranceEBM(loss='poisson', interactions='3x')
model.fit(X_train, y_train['claim_count'], exposure=y_train['exposure'])

# Predict expected claim counts on test data
preds = model.predict(X_test, exposure=y_test['exposure'])

# Evaluate
print(f"Gini: {gini(y_test['claim_count'], preds, exposure=y_test['exposure']):.3f}")
print(double_lift(y_test['claim_count'], preds, exposure=y_test['exposure']))

# Extract relativity tables
rt = RelativitiesTable(model)
print(rt.table('driver_age'))    # per-bin relativities
print(rt.summary())              # all features ranked by leverage
rt.export_excel('relativities.xlsx')  # needs the [excel] extra
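To make the Gini figure less of a black box, here is a minimal standard-library sketch of one common exposure-weighted definition: rank policies by predicted rate, trace the Lorenz curve of cumulative claims against cumulative exposure, and take one minus twice the area under it. The function name `lorenz_gini` and the data are invented for illustration, and the library's own `gini()` may use a different convention or normalisation.

```python
def lorenz_gini(actual, predicted, exposure):
    """Exposure-weighted Gini from the Lorenz curve (one common convention)."""
    # Rank policies from lowest to highest predicted rate per unit of exposure
    order = sorted(range(len(actual)), key=lambda i: predicted[i] / exposure[i])
    total_exposure = sum(exposure)
    total_actual = sum(actual)

    cum_x, cum_y, area = 0.0, 0.0, 0.0
    for i in order:
        next_x = cum_x + exposure[i] / total_exposure
        next_y = cum_y + actual[i] / total_actual
        # Trapezoid between consecutive points on the Lorenz curve
        area += (next_x - cum_x) * (cum_y + next_y) / 2.0
        cum_x, cum_y = next_x, next_y
    return 1.0 - 2.0 * area


# Invented claim counts, predicted counts and exposures for six policies
actual = [0, 0, 1, 0, 2, 1]
predicted = [0.05, 0.10, 0.30, 0.08, 0.60, 0.20]
exposure = [1.0, 1.0, 1.0, 0.5, 1.0, 0.5]

model_gini = lorenz_gini(actual, predicted, exposure)

# Ranking the same policies in the opposite order gives a worse (negative) Gini
reversed_gini = lorenz_gini(actual, [1.0 / p for p in predicted], exposure)
```

A model that ranks risks no better than chance scores close to zero under this definition, which is why the Gini is read as a measure of rank ordering rather than calibration.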

Exposure handling

For log-link families (Poisson, Tweedie, Gamma), exposure enters as a log offset:

model.fit(X, y, exposure=exposure)
# Internally: init_score = log(exposure) passed to interpretML

When predicting:

preds = model.predict(X, exposure=exposure)
# Returns: exp(log_score + log(exposure)) = rate * exposure

predict_log_score() returns the additive log score without the exposure scaling — useful for combining separate frequency and severity models.
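The offset arithmetic can be checked by hand. The sketch below uses only the standard library; `predict_with_exposure` is a hypothetical helper rather than part of insurance-ebm, and all scores and exposures are invented numbers.

```python
import math


def predict_with_exposure(log_scores, exposures):
    """Response-scale prediction: exp(log_score + log(exposure))."""
    return [math.exp(s + math.log(e)) for s, e in zip(log_scores, exposures)]


# Invented additive log scores (log of annual claim frequency) for three policies
freq_log_scores = [-2.3, -1.9, -2.6]
# Exposure in policy-years; the last two policies were on risk for part of a year
exposures = [1.0, 0.5, 0.25]

expected_counts = predict_with_exposure(freq_log_scores, exposures)

# The offset is equivalent to multiplying the annual rate by exposure
rates = [math.exp(s) for s in freq_log_scores]
assert all(
    math.isclose(count, rate * e)
    for count, rate, e in zip(expected_counts, rates, exposures)
)

# Log scores from separate frequency and severity models add, so the
# expected cost per unit of exposure is exp(frequency score + severity score)
sev_log_scores = [7.1, 7.4, 6.9]  # invented log average claim cost
pure_premiums = [math.exp(f + s) for f, s in zip(freq_log_scores, sev_log_scores)]
```

Working on the log scale keeps the combined frequency and severity model additive, which is what makes the unscaled log score convenient here.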

Monotonicity

You can set constraints at fit time:

model = InsuranceEBM(
    loss='poisson',
    monotone_constraints={'ncd': -1, 'vehicle_age': -1}  # more NCD / older car = lower rate
)

Or enforce monotonicity post-fit using isotonic regression:

from insurance_ebm import MonotonicityEditor

me = MonotonicityEditor(model)
scores_before = me.get_scores('ncd')
me.enforce('ncd', direction='decrease')
me.plot_before_after('ncd', scores_before=scores_before)

Post-fit enforcement modifies the stored shape function in place. It is a soft constraint in the sense that the model is not re-fitted: the stored scores for that feature are replaced by their isotonic regression fit. Use it to clean up noise at the tails, not to override systematic model signals.
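To show what isotonic regression does to a shape function, the sketch below hand-rolls a weighted pool-adjacent-violators pass in plain Python. This is a stand-in for illustration: the library itself relies on scikit-learn for isotonic regression, and whether MonotonicityEditor weights bins by training weight is an assumption here. The NCD scores and weights are invented.

```python
def isotonic_decreasing(values, weights):
    """Weighted least-squares fit of a non-increasing sequence (PAVA)."""
    # Each block holds [weighted mean, total weight, number of bins pooled]
    blocks = []
    for value, weight in zip(values, weights):
        blocks.append([value, weight, 1])
        # Pool neighbours while the newest block breaks the non-increasing order
        while len(blocks) > 1 and blocks[-2][0] < blocks[-1][0]:
            v2, w2, n2 = blocks.pop()
            v1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(v1 * w1 + v2 * w2) / total, total, n1 + n2])

    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Invented log-scale scores for NCD years 0..5, with two small reversals
ncd_scores = [0.40, 0.15, 0.18, -0.10, -0.30, -0.25]
# Invented training weight per bin
ncd_weights = [100, 80, 60, 120, 200, 50]

ncd_fitted = isotonic_decreasing(ncd_scores, ncd_weights)
```

Only the bins involved in a reversal move: each violating pair is replaced by its weighted average, and bins that already respect the ordering keep their original scores.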

GLM comparison

When migrating from a GLM or running models in parallel:

from insurance_ebm import GLMComparison
import polars as pl

# Supply pre-computed GLM relativities as a polars DataFrame
glm_rel = pl.DataFrame({
    'level': ['G1', 'G2', 'G3', 'G4', 'G5'],
    'relativity': [1.0, 1.05, 1.12, 1.22, 1.35]
})

cmp = GLMComparison(model)
cmp.plot_comparison('vehicle_group', glm_relativities=glm_rel)

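# Which features diverge most?
# Illustration only: this reuses the vehicle_group table for every feature.
# In practice, supply each feature's own GLM relativities.
by_feature = {feat: glm_rel for feat in model.feature_names}
print(cmp.divergence_summary(glm_relativities_by_feature=by_feature))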
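For intuition about what "divergence" between two sets of relativities can mean, the sketch below computes one plausible measure, the largest absolute log ratio across levels. This is an assumption made for illustration and not necessarily the metric that `divergence_summary` reports; `max_abs_log_ratio` is a hypothetical helper and the EBM relativities are invented.

```python
import math


def max_abs_log_ratio(ebm_relativities, glm_relativities):
    """Largest absolute log ratio between two relativity vectors, level by level."""
    return max(
        abs(math.log(e / g)) for e, g in zip(ebm_relativities, glm_relativities)
    )


# GLM relativities for vehicle_group levels G1..G5 from the example above
glm = [1.0, 1.05, 1.12, 1.22, 1.35]
# Invented EBM relativities for the same five levels
ebm = [1.0, 1.03, 1.15, 1.30, 1.31]

divergence = max_abs_log_ratio(ebm, glm)
```

Comparing on the log scale treats a relativity that is 10% too high and one that is 10% too low roughly symmetrically, which suits multiplicative rating structures.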

Design decisions

Polars as primary DataFrame library. interpretML requires pandas internally, so we convert at the boundary. The public API accepts polars and returns polars — pandas is an implementation detail.

predict() returns response scale, not log scale. A pricing actuary expects predict() to return expected claim frequency or severity, not log scores. Use predict_log_score() if you need the additive representation.

Deviance as the score metric. score() returns negative mean deviance (so higher = better, consistent with sklearn). We use the family-appropriate deviance rather than R², which is not meaningful for count or severity models.
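As a rough illustration of this scoring convention for the Poisson family, the sketch below computes the mean unit Poisson deviance in plain Python and negates it. It is unweighted for simplicity; whether the library's score() weights by exposure is not stated here, and the function names and data are invented.

```python
import math


def mean_poisson_deviance(y_true, y_pred):
    """Average of 2 * (y * log(y / mu) - (y - mu)) over all observations."""
    total = 0.0
    for y, mu in zip(y_true, y_pred):
        # y * log(y / mu) is taken as 0 when y == 0 (its limiting value)
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - (y - mu))
    return total / len(y_true)


def negative_mean_deviance(y_true, y_pred):
    """Higher is better, matching the sign convention described above."""
    return -mean_poisson_deviance(y_true, y_pred)


# Invented claim counts and two sets of predicted means
claims = [0, 1, 0, 2]
informed_preds = [0.1, 0.8, 0.2, 1.5]
flat_preds = [0.75, 0.75, 0.75, 0.75]  # every policy gets the portfolio mean

informed_score = negative_mean_deviance(claims, informed_preds)
flat_score = negative_mean_deviance(claims, flat_preds)
```

The informed predictions score higher than the flat portfolio mean, which is the behaviour a "higher = better" metric should show.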

Base level = modal bin. Relativities are normalised to the bin with the highest training weight. This matches GLM convention (where you'd typically nominate the most common level as the reference) and produces relativities that read naturally — the most common risk profile has relativity 1.0.
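The normalisation can be sketched in a few lines, assuming a log-link model so that a relativity is the exponential of a score difference. The helper name `relativities_to_modal_bin`, the bin labels, scores and weights are all invented for illustration and are not taken from the library.

```python
import math


def relativities_to_modal_bin(bin_scores, bin_weights):
    """Return (relativities, base index), rebased to the highest-weight bin."""
    base = max(range(len(bin_weights)), key=lambda i: bin_weights[i])
    base_score = bin_scores[base]
    # On a log link, relativity = exp(score - score of the base bin)
    return [math.exp(score - base_score) for score in bin_scores], base


# Invented driver_age bins with log-scale scores and training weights
age_bins = ['17-24', '25-39', '40-59', '60+']
age_scores = [0.55, 0.10, -0.15, -0.05]
age_weights = [800, 3200, 4100, 1900]

age_relativities, base_index = relativities_to_modal_bin(age_scores, age_weights)

for label, rel in zip(age_bins, age_relativities):
    print(f"{label:>6}  {rel:.3f}")
```

Rebasing only rescales the table: ratios between any two bins are unchanged, and the constant that was divided out would move into the model's base rate.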

Post-fit monotonicity via isotonic regression. The MonotonicityEditor modifies stored term scores, not the boosting trees. This is sufficient for production predictions, but the adjusted model has not been re-validated against the training data, so document the adjustment whenever you use it.

Dependencies

interpret >= 0.7.0 — the EBM engine
polars >= 0.20 — primary DataFrame library
numpy >= 1.21
matplotlib >= 3.4
scikit-learn >= 1.0 — isotonic regression for MonotonicityEditor

Optional: openpyxl >= 3.0 for Excel export, statsmodels >= 0.13 for GLM object integration.

Databricks demo

A full workflow notebook is available in notebooks/insurance_ebm_demo.py and in the Databricks workspace at /Workspace/insurance-ebm/notebooks/.

Release history

0.2.0 (Mar 14, 2026)
0.1.0 (Mar 9, 2026, this version)
