
Neural Poisson mixture model separating structural and stochastic zero-claimers for UK insurance pricing


insurance-poisson-mixture-nn

Neural Poisson mixture model that separates structural zero-claimers from stochastic zero-claimers in UK personal lines insurance.

The problem

Your telematics motor book has a lot of zero-claim policies. Some of those zeros are structural: the driver installed the black box for the discount but barely drives. These policyholders will almost never claim, however long you cover them. Others are stochastic: active drivers who happened not to have an accident this year. With a longer policy period or worse luck, they would have claimed.

Standard frequency models (Poisson GLM, Poisson GBM) treat all zeros the same. Zero-inflated Poisson (ZIP) separates them, but a ZIP with a single inflation parameter shared across the book cannot tell you which zero-claim policyholders are structural and which are stochastic at the individual level.

This matters for pricing. Charging a structural zero the same frequency load as a stochastic zero means you are systematically overcharging low-risk policyholders. Under FCA Consumer Duty, that is a problem.

The solution

A two-component Poisson mixture estimated end-to-end with gradient descent:

P(Y=k | x) = (1 - pi(x)) * Poisson(k; lambda_0(x) * t)
           +     pi(x)   * Poisson(k; lambda_1(x) * t)
  • pi(x): probability the policyholder is in the risky (at-risk) group — estimated by a neural network
  • lambda_0(x): claim rate for the safe/structural-zero group — kept near zero by the data
  • lambda_1(x): claim rate for the risky group — always constrained above lambda_0
  • t: exposure in policy years
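The mixture likelihood above can be evaluated directly. The following is a minimal sketch in plain Python, not the package's implementation: the function names and the illustrative parameter values are assumptions chosen for this example.

```python
import math


def poisson_pmf(k: int, mu: float) -> float:
    """Poisson probability of k events with mean mu, computed in log space."""
    if mu == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def mixture_pmf(k: int, pi: float, lambda_0: float, lambda_1: float, t: float) -> float:
    """P(Y=k | x) for the two-component mixture with exposure t."""
    safe = (1.0 - pi) * poisson_pmf(k, lambda_0 * t)
    risky = pi * poisson_pmf(k, lambda_1 * t)
    return safe + risky


def mixture_nll(k: int, pi: float, lambda_0: float, lambda_1: float, t: float) -> float:
    """Negative log-likelihood of one observation (the quantity a trainer would minimise)."""
    return -math.log(mixture_pmf(k, pi, lambda_0, lambda_1, t))


# Illustrative (hypothetical) values for a single policy.
pi, lambda_0, lambda_1, t = 0.3, 0.01, 0.25, 1.0

p_zero = mixture_pmf(0, pi, lambda_0, lambda_1, t)
total = sum(mixture_pmf(k, pi, lambda_0, lambda_1, t) for k in range(50))
nll_zero = mixture_nll(0, pi, lambda_0, lambda_1, t)

print(f"P(Y=0) = {p_zero:.4f}, sum over k = {total:.6f}, NLL(Y=0) = {nll_zero:.4f}")
```

Summing the probabilities over k is a quick sanity check that the two weights and the two Poisson components combine into a proper distribution.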

The ordering constraint lambda_1 > lambda_0 is enforced via reparameterisation:

lambda_0 = softplus(a)
lambda_1 = lambda_0 + softplus(b)

This eliminates label-switching without any hard clipping.
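To see why the reparameterisation guarantees the ordering, here is a small plain-Python sketch. The raw values a and b stand in for unconstrained network outputs; the helper names are assumptions for illustration, and the package itself presumably uses the PyTorch equivalent.

```python
import math


def softplus(z: float) -> float:
    """Numerically stable softplus: log(1 + exp(z)), strictly positive for moderate z."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def ordered_rates(a: float, b: float) -> tuple[float, float]:
    """Map two unconstrained reals to rates with lambda_1 > lambda_0 > 0."""
    lambda_0 = softplus(a)
    lambda_1 = lambda_0 + softplus(b)  # the gap softplus(b) is positive
    return lambda_0, lambda_1


# Whatever signs the raw outputs take, the ordering holds.
# Caveat: for extremely negative b the gap can underflow to 0.0 in floating point.
pairs = [
    ordered_rates(a, b)
    for a in (-8.0, -2.0, 0.0, 3.0)
    for b in (-8.0, 0.0, 5.0)
]

for lambda_0, lambda_1 in pairs:
    print(f"lambda_0 = {lambda_0:.5f}, lambda_1 = {lambda_1:.5f}")
```

Because the constraint is built into the mapping, the optimiser works on unconstrained a and b, and the two components cannot swap roles during training.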

The output pi(x) is a continuous at-risk score, and 1 - pi(x) is the corresponding structural zero score. A telematics driver with pi = 0.05 and zero claims is almost certainly a structural zero. A driver with pi = 0.8 and zero claims is most likely a stochastic zero who was lucky this year.
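Once a zero-claim year has been observed, the score can be sharpened with Bayes' rule. The sketch below is an illustration under assumed rates, not necessarily what the package's classify_zero does; the function name and the values of lambda_0, lambda_1 and t are hypothetical.

```python
import math


def posterior_structural_zero(pi: float, lambda_0: float, lambda_1: float, t: float) -> float:
    """P(safe group | Y = 0, x) for at-risk probability pi and exposure t."""
    safe = (1.0 - pi) * math.exp(-lambda_0 * t)   # safe group and zero claims
    risky = pi * math.exp(-lambda_1 * t)          # risky group and zero claims
    return safe / (safe + risky)


# Hypothetical component rates and one policy year of exposure.
lambda_0, lambda_1, t = 0.01, 0.25, 1.0

low = posterior_structural_zero(0.05, lambda_0, lambda_1, t)
high = posterior_structural_zero(0.80, lambda_0, lambda_1, t)

print(f"pi = 0.05: P(structural | zero claims) = {low:.3f}")
print(f"pi = 0.80: P(structural | zero claims) = {high:.3f}")
```

Since a zero is more likely under the lower rate, observing no claims always moves the posterior above the prior 1 - pi, and the shift grows with exposure t.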

Based on: Poisson Mixture Deep Learning Neural Network Models for the Prediction of Drivers' Claims with Excessive Zero Claims Using Telematics Data, North American Actuarial Journal (NAAJ), 2025.

Installation

pip install insurance-poisson-mixture-nn

With optional comparison baselines (requires statsmodels):

pip install "insurance-poisson-mixture-nn[comparison]"

Quick start

from insurance_poisson_mixture_nn import PoissonMixtureNN, PoissonMixtureTrainer, PoissonMixturePredictor
from insurance_poisson_mixture_nn.synthetic import SyntheticMixtureData

# Generate synthetic telematics data with known mixture structure
data = SyntheticMixtureData(n_policies=10_000, seed=42)
X_train, y_train, exp_train = data.training_split()
X_val, y_val, exp_val = data.validation_split()
X_test, y_test, exp_test = data.test_split()

# Build the model
model = PoissonMixtureNN(
    n_features=X_train.shape[1],
    hidden_sizes=[64, 64, 32, 32, 16],  # paper architecture
    dropout=0.1,
    batch_norm=True,
    activation='elu',
)

# Train
trainer = PoissonMixtureTrainer(
    model,
    lr=1e-3,
    batch_size=512,
    max_epochs=200,
    patience=15,
)
history = trainer.fit(X_train, y_train, exp_train, X_val, y_val, exp_val)
print(f"Best val NLL: {history.best_val_nll:.4f} at epoch {history.best_epoch}")

# Predict
predictor = PoissonMixturePredictor(model)

# Expected claim frequency per policy (the pricing output)
expected_freq = predictor.predict_expected(X_test, exp_test)

# At-risk probability (pi score — high = risky group)
pi_scores = predictor.predict_pi(X_test)

# Structural zero score (1 - pi — high = likely never-claimer)
sz_scores = predictor.predict_structural_zero_score(X_test)

# Hard classification: structural vs stochastic
labels = predictor.classify_zero(X_test, threshold=0.5)
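For intuition about the pricing output, the mean of the two-component mixture has a closed form. The sketch below assumes, without confirmation from the package source, that predict_expected returns this mixture mean; the function names and parameter values are hypothetical.

```python
def mixture_mean(pi: float, lambda_0: float, lambda_1: float, t: float) -> float:
    """E[Y | x] = t * ((1 - pi) * lambda_0 + pi * lambda_1)."""
    return t * ((1.0 - pi) * lambda_0 + pi * lambda_1)


def mixture_variance(pi: float, lambda_0: float, lambda_1: float, t: float) -> float:
    """Var[Y | x]: the Poisson part (equal to the mean) plus the between-group spread."""
    mean = mixture_mean(pi, lambda_0, lambda_1, t)
    between = pi * (1.0 - pi) * ((lambda_1 - lambda_0) * t) ** 2
    return mean + between


# Hypothetical policy: 30% at-risk probability, half a policy year of exposure.
pi, lambda_0, lambda_1, t = 0.3, 0.01, 0.25, 0.5

mean = mixture_mean(pi, lambda_0, lambda_1, t)
var = mixture_variance(pi, lambda_0, lambda_1, t)

print(f"expected claims = {mean:.4f}, variance = {var:.4f}")
```

The variance exceeds the mean whenever 0 < pi < 1 and the two rates differ, which is how the mixture accommodates more zeros than a single Poisson with the same mean.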

Diagnostics

from insurance_poisson_mixture_nn.diagnostics import MixtureDiagnostics

diag = MixtureDiagnostics(predictor)

# Component separation: distributions of lambda_0, lambda_1, pi
fig = diag.component_separation(X_test, exp_test)

# Pi calibration by decile
fig = diag.pi_calibration(X_test, y_test, exp_test)

# For zero-claim policies: structural vs stochastic attribution
fig = diag.zero_decomposition(X_test, y_test, exp_test)

# Training curves
fig = diag.training_curves(history)

Model comparison

from insurance_poisson_mixture_nn.comparison import ModelComparison

comp = ModelComparison(model, verbose=True)
results = comp.compare(X_train, y_train, exp_train, X_test, y_test, exp_test)
df = comp.results_dataframe()
print(df)
# Compares: Poisson GLM, ZIP GLM, Poisson DNN, PM-DNN

Architecture

The model uses one shared trunk feeding three output heads, for pi, lambda_0 and lambda_1. This is a deliberate design choice. Three separate sub-networks would have roughly three times the parameters and would not share the feature representations learned from the common training signal. The shared trunk learns a single latent representation; the three heads then specialise it.

Default architecture:

  • 5 hidden layers: [64, 64, 32, 32, 16]
  • ELU activation (paper default — avoids dead neurons better than ReLU)
  • BatchNorm + Dropout (0.1)
  • Adam with ReduceLROnPlateau
  • Gradient clipping (max_norm=1.0) for stability in early epochs
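The parameter saving from the shared trunk can be checked with simple arithmetic on the default layer sizes. This is a rough sketch: the input width of 10 features and the single linear unit per head are assumptions for illustration, and BatchNorm parameters are left out of the count.

```python
def mlp_param_count(n_in: int, hidden_sizes: list[int]) -> int:
    """Weights plus biases of a fully connected stack (BatchNorm excluded)."""
    total = 0
    prev = n_in
    for width in hidden_sizes:
        total += prev * width + width
        prev = width
    return total


hidden_sizes = [64, 64, 32, 32, 16]   # default architecture from the list above
n_features = 10                       # assumption: depends on your rating factors
head = hidden_sizes[-1] + 1           # assumption: one linear output unit per head

trunk = mlp_param_count(n_features, hidden_sizes)
shared_total = trunk + 3 * head           # one trunk, three heads
separate_total = 3 * (trunk + head)       # three independent networks
ratio = separate_total / shared_total

print(f"shared: {shared_total}, separate: {separate_total}, ratio: {ratio:.2f}")
```

Under these assumptions the heads are tiny compared with the trunk, so three separate networks cost just under three times as many parameters as the shared design.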

When to use this

Use it when:

  • You have telematics data and believe some policies are near-zero-exposure structural zeros
  • Your zero-claim fraction is high and you suspect a genuine mixture (not just overdispersion)
  • You want a per-policy structural zero score for pricing or NCD ladder adjustment

Do not use it when:

  • You have no telematics or occupancy data — the model cannot identify structural zeros without informative covariates
  • Your excess zeros are driven by overdispersion rather than a genuine two-group structure (use Negative Binomial instead)
  • You want an interpretable GLM-style model — this is a black-box neural network
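A quick first screen before fitting anything is to compare the observed zero fraction with what a single Poisson rate would imply. The sketch below uses a made-up toy portfolio and a hypothetical helper name. It only flags excess zeros; it cannot by itself tell a genuine two-group mixture from plain overdispersion.

```python
import math


def zero_excess_summary(claims: list[int], exposure: list[float]) -> dict[str, float]:
    """Compare observed zeros with zeros expected under one portfolio-wide Poisson rate."""
    n = len(claims)
    rate = sum(claims) / sum(exposure)  # claims per policy year
    observed_zero = sum(1 for y in claims if y == 0) / n
    expected_zero = sum(math.exp(-rate * t) for t in exposure) / n
    return {
        "rate": rate,
        "observed_zero": observed_zero,
        "expected_zero": expected_zero,
        "excess_zero": observed_zero - expected_zero,
    }


# Toy data (fabricated for illustration): 100 policies, one year of exposure each.
claims = [0] * 90 + [1] * 4 + [2] * 4 + [3] * 2
exposure = [1.0] * len(claims)

summary = zero_excess_summary(claims, exposure)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

A clearly positive excess suggests a plain Poisson is too restrictive. Comparing a Negative Binomial fit against the mixture on held-out data is then a reasonable next step for deciding between the two explanations.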

Requirements

  • Python >= 3.10
  • PyTorch >= 2.0
  • Polars >= 0.20
  • NumPy >= 1.24
  • scikit-learn >= 1.3

Licence

Apache 2.0
