
Normalizing flows for insurance severity distribution modelling


insurance-nflow


The problem

The standard approach to severity modelling — lognormal GLM, gamma GLM — makes a strong bet on the distribution family. The family governs the tail, the shape, the response to rating factors. Most of that bet is untestable on training data and wrong in the upper tail, where the money is.

For UK motor bodily injury, this matters acutely. BI severity is bimodal: soft-tissue claims cluster around GBP 5,000; catastrophic injury claims (spinal, brain injury) follow a power law from GBP 100,000 upward. A lognormal fits neither mode well and dramatically misrepresents the tail. A pricing team using lognormal TVaR(0.99) to price their catastrophic injury layer is using the wrong number.

Normalizing flows drop the family assumption entirely. A Neural Spline Flow (NSF) learns the full conditional distribution p(severity | rating factors) from data. The Hickling & Prangle (ICML 2025) Tail Transform Flow (TTF) adds GPD-like heavy tails on top, with tail weight parameters estimated from the data via the Hill estimator.

The result is a model that can represent bimodality, shape changes by rating factor, and calibrated heavy tails — without choosing a parametric family.
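To make the lognormal failure concrete, here is a toy illustration (not the library's synthetic generator, and all numbers are made up): a bimodal sample with a Pareto upper tail, where a single lognormal fitted by MLE badly understates the empirical 99% TVaR.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy bimodal severity: a soft-tissue body around GBP 5,000 plus a
# Pareto(alpha=1.5) catastrophic tail starting at GBP 100,000.
body = rng.lognormal(mean=np.log(5_000), sigma=0.4, size=9_500)
tail = 100_000 * (1 + rng.pareto(1.5, size=500))
claims = np.concatenate([body, tail])

# Empirical TVaR(99%): mean of claims at or above the 99% quantile.
q99 = np.quantile(claims, 0.99)
empirical_tvar = claims[claims >= q99].mean()

# Single-lognormal MLE fit, then its TVaR(99%) by simulation.
mu, sigma = np.log(claims).mean(), np.log(claims).std()
ln = rng.lognormal(mean=mu, sigma=sigma, size=200_000)
lognormal_tvar = ln[ln >= np.quantile(ln, 0.99)].mean()

print(empirical_tvar, lognormal_tvar)  # lognormal understates the tail severely
```

On this toy data the lognormal TVaR comes out several times too low, which is exactly the layer-pricing failure described above.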

What this library provides

  • SeverityFlow — unconditional severity distribution
  • ConditionalSeverityFlow — p(severity | rating factors)
  • TTF tail layer (Hickling & Prangle 2025), optional
  • TVaR, ILF curves, LEV, reinsurance layer pricing from flow samples
  • Diagnostics: PIT histogram, QQ plot, tail index comparison, AIC/BIC table vs parametric benchmarks
  • Synthetic UK motor BI data generator (bimodal DGP, known tail index) for testing

Install

pip install insurance-nflow

PyTorch is required. For CPU-only (recommended for most pricing teams):

pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install insurance-nflow

This is roughly 900MB of PyTorch plus the library. The API hides all PyTorch from the user; you interact with numpy arrays throughout.

Quickstart

Unconditional severity

import numpy as np
from insurance_nflow import SeverityFlow

# Load your claims data
claims = np.array([...])  # positive GBP amounts

# Fit a flow with heavy-tail extension
flow = SeverityFlow(
    n_transforms=6,
    tail_transform=True,   # Hickling 2025 TTF
    tail_mode='fix',       # pre-estimate tail params via Hill, then fix
    max_epochs=200,
    patience=20,
)
result = flow.fit(claims)

print(result)
# SeverityFlowResult(val_logL=-9.2341, AIC=1234567.8, n_params=97412, lambda+=0.612, epochs=143)

# Actuarial outputs
print(flow.tvar(0.99))          # TVaR(99%) in GBP
print(flow.quantile(0.995))     # VaR(99.5%) in GBP

# ILF curve
ilf = flow.ilf(
    limits=[50_000, 100_000, 250_000, 500_000, 1_000_000],
    basic_limit=50_000,
)
# {50000.0: 1.0, 100000.0: 1.31, 250000.0: 1.72, ...}

Conditional severity (with rating factors)

from insurance_nflow import ConditionalSeverityFlow
import numpy as np

claims = np.array([...])          # shape (N,)
context = np.array([...])         # shape (N, n_factors) — e.g. age_band, vehicle_group, region
exposure = np.array([...])        # shape (N,), optional exposure weights

flow = ConditionalSeverityFlow(
    context_features=5,
    n_transforms=6,
    tail_transform=True,
)
result = flow.fit(claims, context=context, exposure_weights=exposure)

# Price a specific risk
young_london = np.array([[1, 8, 1, 3, 0]])  # one row = one risk profile
print(flow.conditional_tvar(young_london, 0.99))

# Compare against a base risk
mid_north = np.array([[3, 5, 3, 5, 5]])
print(flow.conditional_tvar(mid_north, 0.99))

Reinsurance layer pricing

# Expected cost to a 200k xs 50k per-occurrence XL layer
portfolio_factors = np.array([...])  # rating-factor rows, shape (M, n_factors)

cost = flow.reinsurance_layer(
    attachment=50_000,
    limit=200_000,
    context=portfolio_factors,
    n_samples=500_000,
)

Model comparison vs parametric families

from insurance_nflow.diagnostics import fit_parametric_benchmarks, model_comparison_table

# Fit lognormal, gamma, Pareto as baselines
ll_benchmarks, k_benchmarks = fit_parametric_benchmarks(claims)

# Add flow
ll_benchmarks["flow"] = float(result.train_log_likelihood * len(claims))
k_benchmarks["flow"] = result.n_parameters

table = model_comparison_table(claims, ll_benchmarks, k_benchmarks)
# Sorted by AIC. Note: AIC heavily penalises the flow's ~100k params.
# Use test log-likelihood per observation as the primary metric.
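The criteria behind the table are the standard definitions. The log-likelihood totals below are made-up numbers, chosen only to show how the flow's parameter count dominates AIC even when it fits better per observation:

```python
import numpy as np

def aic(ll_total: float, k: int) -> float:
    """Akaike information criterion from total log-likelihood and parameter count."""
    return 2 * k - 2 * ll_total

def bic(ll_total: float, k: int, n: int) -> float:
    """Bayesian information criterion; the penalty grows with sample size."""
    return k * np.log(n) - 2 * ll_total

# Hypothetical totals for 10,000 claims: the flow has the better fit,
# but ~100k parameters swamp the likelihood gain under AIC.
print(aic(-92_000.0, 2))        # lognormal -> 184004.0
print(aic(-90_500.0, 100_000))  # flow      -> 381000.0
```

This is why the comments above recommend test log-likelihood per observation, which compares fit on held-out data without an explicit parameter-count penalty.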

Architecture

log(claim) ─→ [NSF: n coupling layers] ─→ [TTF tail layer] ─→ N(0,1)
                  ↑ context encoder
              rating factors feed into
              each coupling layer's
              parameter network (zuko API)

The log-transform maps positive claims to the real line. The NSF (Neural Spline Flow, Durkan et al. 2019) learns the residual non-Gaussianity — including the bimodal body. The TTF layer (Hickling & Prangle 2025) corrects the tail to GPD-like behaviour, with tail weight parameters lambda+, lambda- estimated from the upper and lower tails of the training data via Hill double-bootstrap.

Why NSF over MAF? Coupling architecture means sampling parallelises. Drawing 1M scenario claims takes seconds, not minutes.

Why TTF (fix) over (joint)? Hickling's experiments show TTF (fix) — pre-estimating tail parameters and fixing them — outperforms joint training on heavy-tailed targets. Less flexibility, more stability.
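The Hill estimator at the heart of the fix mode is simple. Here is a plain fixed-k sketch, not the library's implementation: the double-bootstrap variant chooses k from the data, while this toy fixes it by hand.

```python
import numpy as np

def hill_tail_index(x: np.ndarray, k: int) -> float:
    """Hill estimator of the tail index alpha from the k largest order statistics.

    Fixed-k toy version; the tail weight parameter is lambda = 1/alpha.
    """
    top = np.sort(x)[-(k + 1):]                  # threshold plus the k largest values
    log_excesses = np.log(top[1:]) - np.log(top[0])
    return 1.0 / log_excesses.mean()             # alpha-hat

rng = np.random.default_rng(1)
pareto_sample = 1 + rng.pareto(1.5, size=50_000)  # true alpha = 1.5
alpha_hat = hill_tail_index(pareto_sample, k=2_000)
print(alpha_hat)  # approximately 1.5
```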

Why zuko? It's the only actively-maintained Python normalizing flow library (v1.6.0, March 2026). The flow(context).log_prob(x) API is clean. nflows is abandoned; normflows is slow to update.

Actuarial functions

These work on any numpy array of positive values — not just flow samples:

from insurance_nflow import tvar, ilf, limited_expected_value, reinsurance_layer_cost

samples = np.array([...])

tvar(samples, 0.99)
ilf(samples, limits=[100_000, 250_000], basic_limit=50_000)
limited_expected_value(samples, 100_000)
reinsurance_layer_cost(samples, attachment=50_000, limit=200_000)
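Each of these reduces to a one-liner on samples. A minimal numpy sketch of the definitions (illustrative only; the library's implementations may differ in quantile conventions):

```python
import numpy as np

def tvar_np(samples: np.ndarray, p: float) -> float:
    """Tail value-at-risk: mean loss at or beyond the p-quantile."""
    q = np.quantile(samples, p)
    return samples[samples >= q].mean()

def lev_np(samples: np.ndarray, limit: float) -> float:
    """Limited expected value: E[min(X, limit)]."""
    return np.minimum(samples, limit).mean()

def layer_cost_np(samples: np.ndarray, attachment: float, limit: float) -> float:
    """Per-occurrence XL layer: E[min(max(X - attachment, 0), limit)]."""
    return np.clip(samples - attachment, 0.0, limit).mean()

samples = np.array([1_000.0, 40_000.0, 120_000.0, 600_000.0])
print(lev_np(samples, 100_000))                 # (1000 + 40000 + 100000 + 100000) / 4 = 60250
print(layer_cost_np(samples, 50_000, 200_000))  # (0 + 0 + 70000 + 200000) / 4 = 67500
print(lev_np(samples, 250_000) / lev_np(samples, 50_000))  # ILF at 250k on a 50k basic limit
```

The ILF at limit l relative to a basic limit b is simply lev(l) / lev(b), which is how the last line reads.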

Testing

pip install -e ".[dev]"
pytest tests/ -v

Tests that require torch/zuko use pytest.importorskip and skip gracefully if the packages aren't installed. The data, actuarial, and diagnostic tests have no heavy dependencies.

Limitations and roadmap

v0.1.0 (this release):

  • No truncation/censoring support. Policies with deductibles require the CDF for each observation, which zuko does not compute analytically. Feasible but doubles training time. Planned for v0.2.0.
  • CPU training by default (GPU opt-in via device='cuda'). For 100k claims, 200 epochs takes ~50 minutes on a modern CPU. Subsample larger portfolios to ~100k claims for the initial model search.
  • Univariate only (one severity dimension). Joint modelling of claim count and severity is not in scope.
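The subsampling step suggested above is a one-liner; `claims` here is a placeholder standing in for your full dataset:

```python
import numpy as np

rng = np.random.default_rng(42)
claims = rng.lognormal(8.5, 1.2, size=500_000)  # placeholder for a full portfolio

# Draw 100k claims without replacement for the initial model search;
# refit the chosen configuration on the full data afterwards.
idx = rng.choice(len(claims), size=100_000, replace=False)
subsample = claims[idx]
print(subsample.shape)  # (100000,)
```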

v0.2.0 planned:

  • Truncated/censored likelihood
  • Aggregate loss simulation (Panjer + flow severity)
  • Model persistence (save/load)

References

  • Durkan, Bekasov, Murray, Papamakarios. Neural Spline Flows. NeurIPS 2019.
  • Hickling & Prangle. Tail Transform Flow (TTF). ICML 2025.
