Extreme-loss estimation and tail risk analytics in Python

extremeloss

Extreme value theory, tail-risk estimation, and rare-event diagnostics.


Overview

extremeloss focuses on the part of a loss distribution that is hardest to estimate well: the far tail. It provides peaks-over-threshold and block-maxima workflows, generalized-Pareto and generalized-extreme-value fits, tail-index estimators, tail risk measures (VaR/TVaR) with uncertainty, importance-sampling and bootstrap machinery, and threshold diagnostics — all returning rich result objects you can summarize, plot, and feed back into the rest of a modeling pipeline.

It is built to sit alongside lossmodels (loss distributions and aggregate models) and risksim (portfolio and contract simulation): it can fit a tail directly from a lossmodels object, splice a fitted tail onto a body distribution, or analyze a risksim simulation result. Its only hard dependencies are numpy and scipy.

Highlights

  • Peaks-over-threshold (POT) / GPD — extract exceedances, choose a threshold, fit a generalized Pareto distribution, and read off tail probabilities, VaR, TVaR, and return levels.
  • Block maxima / GEV — block a series, fit a generalized extreme value distribution, and compute return levels.
  • Tail-index estimators — Hill and Pickands estimators, with a Hill curve.
  • Threshold diagnostics — mean-excess analysis and a threshold-stability scan.
  • Tail risk with uncertainty — empirical and model-based VaR/TVaR/tail probabilities returned as estimates with confidence intervals.
  • Variance reduction — importance-sampling estimators with weight diagnostics and effective sample size.
  • Bootstrap — resampling uncertainty for tail probabilities, VaR, TVaR, and arbitrary scalar statistics.
  • Ecosystem integration — fit tails from lossmodels objects, splice GPD tails onto bodies, and summarize risksim simulations.

Installation

pip install extremeloss

From source:

pip install -e .

Requires Python >=3.10 with numpy and scipy. The integration helpers use lossmodels / risksim objects when present, and the optional plotting helpers require matplotlib.

Package structure

extremeloss/
├── evt/             # GPD, POT, block-maxima/GEV, tail-index, thresholds
├── estimation/      # empirical, conditional-MC, and importance-sampling estimators
├── analytics/       # return periods/levels, summaries, diagnostics
├── utils/           # bootstrap and validation helpers
├── integration.py   # lossmodels / risksim interop and splicing
├── results.py       # GPDFit, GEVFit, GPDTail, TailEstimateResult, BootstrapResult, ThresholdScan
├── protocols.py     # SupportsSample / SupportsLosses / SupportsSimulationResult
└── plotting.py      # optional matplotlib diagnostics

Quick start

import numpy as np
from extremeloss import fit_pot

losses = np.random.default_rng(0).pareto(2.5, 50_000) * 1000  # heavy-tailed sample

fit = fit_pot(losses, threshold=np.quantile(losses, 0.95))   # -> GPDFit
print("shape xi      :", fit.xi)
print("scale beta    :", fit.beta)
print("99.5% VaR     :", fit.var(0.995))
print("99.5% TVaR    :", fit.tvar(0.995))
print("P(loss > 50k) :", fit.tail_probability(50_000))
print("100-obs return level:", fit.return_level(100))
print(fit.summary())

Peaks-over-threshold and the GPD

The POT workflow: pick a threshold (guided by diagnostics below), keep the exceedances, and fit a generalized Pareto distribution to them.

from extremeloss import extract_exceedances, fit_gpd, fit_pot, gpd_var, gpd_tvar

excesses = extract_exceedances(losses, threshold=10_000)   # losses above the threshold
fit = fit_gpd(excesses, threshold=10_000)                  # fit to excesses, or:
fit = fit_pot(losses, threshold=10_000)                    # fit directly from the data

# functional forms are available when you already have the parameters
v = gpd_var(0.995, threshold=10_000, xi=fit.xi, beta=fit.beta,
            exceedance_fraction=fit.exceedance_fraction)

A GPDFit carries threshold, xi, beta, exceedance_fraction, and n_exceedances, and exposes var(p), tvar(p), tail_probability(x), return_level(period), and summary().
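
The tail quantities listed above follow from the standard GPD formulas. The sketch below reproduces them with scipy.stats.genpareto rather than with extremeloss itself, so it is an illustration of the method and not necessarily how the library computes them. The helper names (pot_var, pot_tvar, pot_tail_probability) are hypothetical, and zeta stands for the exceedance fraction.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(0)
losses = rng.pareto(2.5, 50_000) * 1000      # Lomax sample, true xi = 1/2.5 = 0.4

u = np.quantile(losses, 0.95)                # threshold
excesses = losses[losses > u] - u            # amounts above the threshold
zeta = excesses.size / losses.size           # exceedance fraction, estimates P(X > u)

# MLE with the location pinned at 0, because excesses start at 0
xi, _, beta = genpareto.fit(excesses, floc=0)

def pot_var(p):
    """VaR_p = u + beta/xi * (((1 - p)/zeta)**(-xi) - 1); needs p > 1 - zeta, xi != 0."""
    return u + beta / xi * (((1 - p) / zeta) ** (-xi) - 1)

def pot_tvar(p):
    """TVaR_p = (VaR_p + beta - xi*u) / (1 - xi); needs xi < 1."""
    return (pot_var(p) + beta - xi * u) / (1 - xi)

def pot_tail_probability(x):
    """P(X > x) = zeta * (1 + xi*(x - u)/beta)**(-1/xi), for x >= u."""
    return zeta * (1 + xi * (x - u) / beta) ** (-1 / xi)

print("xi, beta   :", xi, beta)
print("99.5% VaR  :", pot_var(0.995))
print("99.5% TVaR :", pot_tvar(0.995))
```

The VaR and tail-probability formulas are inverses of each other, so pot_tail_probability(pot_var(p)) returns 1 - p up to rounding.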

Block maxima and the GEV

from extremeloss import make_blocks, fit_block_maxima, fit_gev, block_return_level

maxima = make_blocks(losses, block_size=250)        # block maxima
gev = fit_block_maxima(losses, block_size=250)      # or fit_gev(maxima)
print(gev.xi, gev.loc, gev.scale)
print("100-block return level:", gev.return_level(100))

A GEVFit carries xi, loc, scale, and n_blocks, and exposes return_level(period), cdf(x), and summary().
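
For readers who want to see the block-maxima steps spelled out, here is a minimal sketch using scipy.stats.genextreme instead of extremeloss; it is not taken from the library's source. Note that scipy parameterizes the GEV shape as c = -xi, so the sign is flipped to match the xi convention used here. The function name gev_return_level is hypothetical.

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(1)
losses = rng.pareto(2.5, 50_000) * 1000

block_size = 250
n_blocks = losses.size // block_size
# drop any incomplete trailing block, then take the maximum of each block
maxima = losses[: n_blocks * block_size].reshape(n_blocks, block_size).max(axis=1)

# scipy's shape parameter is c = -xi
c, loc, scale = genextreme.fit(maxima)
xi = -c

def gev_return_level(period):
    """Level that a block maximum exceeds with probability 1/period."""
    return genextreme.ppf(1 - 1 / period, c, loc=loc, scale=scale)

print("xi, loc, scale        :", xi, loc, scale)
print("100-block return level:", gev_return_level(100))
```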

Threshold diagnostics

Choosing the POT threshold is the crux of a good tail fit. Mean-excess and threshold-stability scans help:

from extremeloss import mean_excess, threshold_diagnostic_table

grid = np.quantile(losses, [0.90, 0.95, 0.975, 0.99])
me = mean_excess(losses, grid)                       # mean excess at each threshold
scan = threshold_diagnostic_table(losses, grid)      # -> ThresholdScan
print(scan.thresholds, scan.xi, scan.beta, scan.n_exceedances)

A ThresholdScan exposes thresholds, mean_excess, xi, beta, n_exceedances, and to_dict(). Look for the lowest threshold above which the shape xi is roughly stable and the mean excess is roughly linear in the threshold, as a GPD tail implies.
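
The mean-excess diagnostic is simple enough to sketch in plain numpy. The function below (mean_excess_sketch, a hypothetical name, not the library's mean_excess) computes the empirical mean excess e(u) = E[X - u | X > u] on a grid. For a GPD tail, e(u) = (beta + xi * u) / (1 - xi), which is linear in u with slope xi / (1 - xi), so a fitted slope gives a rough cross-check on xi.

```python
import numpy as np

def mean_excess_sketch(losses, thresholds):
    """Empirical mean excess mean(X - u | X > u) for each threshold u."""
    losses = np.asarray(losses, dtype=float)
    out = []
    for u in np.atleast_1d(thresholds):
        exceed = losses[losses > u]
        # NaN marks thresholds with no exceedances
        out.append(exceed.mean() - u if exceed.size else np.nan)
    return np.array(out)

rng = np.random.default_rng(2)
losses = rng.pareto(2.5, 200_000) * 1000     # Lomax: xi = 0.4, so the slope should be near 0.4/0.6
grid = np.quantile(losses, [0.90, 0.95, 0.975, 0.99])
me = mean_excess_sketch(losses, grid)

# least-squares slope of mean excess against threshold
slope = np.polyfit(grid, me, 1)[0]
print(me, slope)
```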

Tail-index estimators

from extremeloss import hill_estimator, hill_curve, pickands_estimator

hill_estimator(losses, k=500)     # Hill tail-index estimate using the top k order statistics
pickands_estimator(losses, k=500)
curve = hill_curve(losses)        # Hill estimate across a grid of k (for a Hill plot)
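
As a reference point for what a Hill estimate measures, here is a minimal numpy sketch. It assumes the estimate is reported as xi = 1/alpha (the source does not say which convention hill_estimator uses), and hill_sketch is a hypothetical name.

```python
import numpy as np

def hill_sketch(losses, k):
    """Hill estimate of xi = 1/alpha from the k largest observations."""
    x = np.sort(np.asarray(losses, dtype=float))
    if not 1 <= k < x.size or x[-k - 1] <= 0:
        raise ValueError("need 1 <= k < n and a positive (k+1)-th largest value")
    # mean log-ratio of the top k values to the (k+1)-th largest value
    return np.mean(np.log(x[-k:]) - np.log(x[-k - 1]))

rng = np.random.default_rng(3)
losses = (rng.pareto(2.5, 50_000) + 1) * 1000    # classical Pareto, alpha = 2.5, xi = 0.4

print(hill_sketch(losses, k=500))
# estimates across a grid of k, the raw material for a Hill plot
hill_points = [(k, hill_sketch(losses, k)) for k in range(50, 2001, 50)]
```

The sample here is an exact Pareto, for which the Hill estimator is well behaved at any k; on data that are only Pareto-like in the tail, the estimate drifts as k grows, which is why the curve across k matters.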

Return periods and levels

from extremeloss import return_period, return_level, block_return_level

return_period(0.01)             # expected waiting time for a 1% exceedance
return_level(100, fit)          # the level exceeded once per 100 observations (GPDFit)
block_return_level(100, gev)    # the level exceeded once per 100 blocks (GEVFit)
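
The arithmetic behind these helpers is short. The sketch below assumes a return period is the reciprocal of the per-observation exceedance probability and that the GPD return level inverts the POT tail formula; the function names and the parameter values are hypothetical, not read from the library.

```python
def return_period_sketch(p):
    """Expected number of observations per exceedance when each exceeds with probability p."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return 1.0 / p

def gpd_return_level_sketch(m, u, xi, beta, zeta):
    """Level exceeded on average once per m observations under a POT/GPD tail.

    u is the threshold, zeta the exceedance fraction; needs xi != 0 and m * zeta >= 1.
    """
    return u + beta / xi * ((m * zeta) ** xi - 1)

print(return_period_sketch(0.01))
# illustrative parameters only
print(gpd_return_level_sketch(100, u=10_000, xi=0.4, beta=2_000, zeta=0.05))
```

With m = 1 / (1 - p) the return level coincides with the POT VaR at level p, and with m * zeta = 1 it collapses to the threshold u.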

Tail risk estimation with uncertainty

Empirical estimators are one-liners; the model-based estimators return a TailEstimateResult with a confidence interval and standard error.

from extremeloss import empirical_var, empirical_tvar, estimate_var, estimate_tvar, estimate_tail_probability

empirical_var(losses, 0.99)
empirical_tvar(losses, 0.99)

res = estimate_var(losses, 0.99)          # -> TailEstimateResult
print(res.estimate, res.ci, res.stderr)
print(res.summary())

estimate_tvar(losses, 0.99)
estimate_tail_probability(losses, threshold=100_000)

Conditional Monte Carlo summaries are available when you already have per-scenario conditional probabilities or tail expectations (estimate_tail_probability_cmc, estimate_tvar_cmc).
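
To make the distinction between a bare point estimate and an estimate with uncertainty concrete, here is a small numpy sketch. It uses the sample quantile for VaR, the mean of losses at or above it for TVaR, and a normal-approximation binomial interval for a tail probability. These are common textbook choices and an assumption here; the library's estimators may use different conventions. All function names are hypothetical.

```python
import numpy as np

def empirical_var_sketch(losses, p):
    """Sample quantile at level p (numpy's default linear interpolation)."""
    return float(np.quantile(losses, p))

def empirical_tvar_sketch(losses, p):
    """Mean of the losses at or above the empirical VaR."""
    losses = np.asarray(losses, dtype=float)
    return float(losses[losses >= np.quantile(losses, p)].mean())

def tail_probability_with_ci(losses, threshold, z=1.96):
    """Estimate P(X > threshold) with a normal-approximation confidence interval."""
    losses = np.asarray(losses, dtype=float)
    p_hat = float(np.mean(losses > threshold))
    stderr = float(np.sqrt(p_hat * (1 - p_hat) / losses.size))
    ci = (max(p_hat - z * stderr, 0.0), min(p_hat + z * stderr, 1.0))
    return p_hat, ci, stderr

data = np.arange(1, 101)                      # 1, 2, ..., 100
print(empirical_var_sketch(data, 0.9))
print(empirical_tvar_sketch(data, 0.9))
print(tail_probability_with_ci(data, 90))
```

The normal approximation degrades when only a handful of observations exceed the threshold, which is the situation where the model-based and bootstrap estimators are more useful.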

Importance sampling

When the event of interest is rare, importance sampling with a well-chosen proposal can estimate it with far lower variance than crude Monte Carlo at the same sample size. Supply the losses drawn from a proposal distribution and their importance weights:

from extremeloss import (
    estimate_var_is, estimate_tvar_is, estimate_tail_probability_is,
    importance_sampling_diagnostics, effective_sample_size, stabilize_weights,
)

# losses and weights come from your proposal sampler
res = estimate_tail_probability_is(losses, weights, threshold=1_000_000)
print(res.estimate, res.ci, res.effective_n)

estimate_var_is(losses, weights, 0.999)
estimate_tvar_is(losses, weights, 0.999)

print(importance_sampling_diagnostics(weights))   # weight quality metrics
print(effective_sample_size(weights))
w = stabilize_weights(weights, clip_quantile=0.999)  # tame extreme weights

log_importance_weights(log_target_density, log_proposal_density) builds normalized weights directly from log-densities, and estimate_mean_is / estimate_exceedance_curve_is / estimate_var_tvar_is cover means, full exceedance curves, and joint VaR/TVaR.
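
For orientation, this is roughly what producing losses and weights from a proposal sampler looks like, written in plain numpy and independent of the extremeloss functions. The target and proposal are illustrative assumptions: the target is a unit-mean exponential, for which P(X > 20) = exp(-20) is known exactly, and the proposal is an exponential with mean 20.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20_000
threshold = 20.0                      # target X ~ Exponential(mean 1), so P(X > 20) = exp(-20)

# proposal: Exponential(mean 20), which puts far more mass beyond the threshold
proposal_mean = 20.0
x = rng.exponential(proposal_mean, n)

# log-weight = log target density - log proposal density
log_target = -x
log_proposal = -np.log(proposal_mean) - x / proposal_mean
w = np.exp(log_target - log_proposal)

# importance-sampling estimator: average of weight * indicator
hits = w * (x > threshold)
p_hat = hits.mean()
stderr = hits.std(ddof=1) / np.sqrt(n)

# Kish effective sample size of the weights
ess = w.sum() ** 2 / np.sum(w ** 2)

print("IS estimate:", p_hat, "+/-", stderr)
print("exact      :", np.exp(-threshold))
print("ESS        :", ess, "of", n)
```

Crude Monte Carlo with the same 20,000 draws from the target would almost surely record zero exceedances for an event of probability about 2e-9, while the weighted estimate lands within a few percent of the exact value.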

Bootstrap uncertainty

from extremeloss import bootstrap_var, bootstrap_tvar, bootstrap_tail_probability, bootstrap_statistic

bv = bootstrap_var(losses, 0.99, n_resamples=1000)   # -> BootstrapResult
print(bv.estimate, bv.ci, bv.stderr)

bootstrap_tvar(losses, 0.99)
bootstrap_tail_probability(losses, threshold=100_000)

# wrap any scalar statistic
bootstrap_statistic(losses, np.median, n_resamples=1000)

A BootstrapResult carries estimate, bootstrap_estimates, ci, stderr, and summary().
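
The resampling idea behind these helpers can be sketched in a few lines of numpy. This version uses a percentile interval, which is an assumption; the source does not state which interval type the library reports. The name bootstrap_statistic_sketch is hypothetical.

```python
import numpy as np

def bootstrap_statistic_sketch(data, statistic, n_resamples=1000, level=0.95, seed=0):
    """Point estimate, percentile interval, and bootstrap standard error of a scalar statistic."""
    data = np.asarray(data)
    rng = np.random.default_rng(seed)
    n = data.size
    reps = np.empty(n_resamples)
    for b in range(n_resamples):
        # resample n observations with replacement and recompute the statistic
        reps[b] = statistic(data[rng.integers(0, n, n)])
    alpha = 1 - level
    lo, hi = np.quantile(reps, [alpha / 2, 1 - alpha / 2])
    return float(statistic(data)), (float(lo), float(hi)), float(reps.std(ddof=1))

rng = np.random.default_rng(5)
losses = rng.pareto(2.5, 5_000) * 1000

est, ci, se = bootstrap_statistic_sketch(
    losses, lambda x: np.quantile(x, 0.99), n_resamples=500
)
print(est, ci, se)
```

Resampling cannot produce values beyond the sample maximum, so bootstrap intervals for very high quantiles tend to be too narrow; that is one reason to pair them with a fitted tail.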

The GPD tail object and splicing

GPDTail turns a fitted tail into a standalone, sampleable severity supported above the threshold (sample, cdf, quantile, mean, variance). Because its cdf(threshold) = 0, it satisfies the splicing contract used by lossmodels.SplicedSeverity, letting you attach an EVT tail to a body distribution:

from extremeloss import fit_pot, GPDTail, splice_gpd_tail, fit_spliced_gpd
from lossmodels import Gamma, SplicedSeverity

fit = fit_pot(losses, threshold=10_000)
tail = GPDTail.from_fit(fit)             # a severity for the tail above 10k

# splice an EVT tail onto a lossmodels body, two equivalent ways:
spliced = splice_gpd_tail(Gamma(2.0, 4_000), fit)            # via extremeloss helper
spliced = fit_spliced_gpd(Gamma(2.0, 4_000), losses, threshold=10_000)

# or assemble it yourself with lossmodels:
body = Gamma(2.0, 4_000)
u = tail.threshold
spliced = SplicedSeverity(body=body, tail=tail, threshold=u, weight=body.cdf(u))

The spliced object is a full lossmodels severity, so it drops straight back into a collective-risk model or a risksim portfolio.
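
To show what the splice does numerically, here is a sketch built on scipy distributions instead of lossmodels objects. It assumes that weight is the probability mass at or below the threshold, that the body is used unchanged below the threshold, and that Gamma(2.0, 4_000) means shape 2 and scale 4,000; the actual SplicedSeverity conventions may differ. The tail parameters are made up for illustration.

```python
import numpy as np
from scipy.stats import gamma, genpareto

u = 10_000.0
body = gamma(a=2.0, scale=4_000)                  # assumed shape/scale parameterization
tail = genpareto(c=0.4, loc=u, scale=3_000)       # illustrative GPD tail with cdf(u) = 0
w = body.cdf(u)                                   # mass at or below the threshold

def spliced_cdf(x):
    """Body cdf up to u, then the remaining mass 1 - w spread by the tail cdf."""
    x = np.asarray(x, dtype=float)
    return np.where(x <= u, body.cdf(x), w + (1 - w) * tail.cdf(x))

def spliced_sample(size, rng):
    """Inverse-transform sampling through the spliced quantile function."""
    q = rng.uniform(size=size)
    below = q <= w
    out = np.empty(size)
    out[below] = body.ppf(q[below])
    out[~below] = tail.ppf((q[~below] - w) / (1 - w))
    return out

sample = spliced_sample(20_000, np.random.default_rng(6))
print("weight below threshold:", w)
print("sample share below u  :", np.mean(sample <= u))
```

Because the tail cdf is zero at the threshold, the spliced cdf is continuous at u, which is the property the splicing contract relies on.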

Ecosystem integration

extremeloss reads directly from lossmodels and risksim objects:

Helper                                            Purpose
sample_lossmodel(model, size)                     draw a sample from any lossmodels model
fit_pot_from_lossmodel(model, size, threshold)    sample a model and fit a GPD tail in one step
losses_from_risksim(result, view)                 pull a loss vector (gross / ceded / retained) from a risksim SimulationResult
tail_summary_from_risksim(result, ...)            a tail summary of a risksim simulation
component_tail_metrics(result, q, ...)            per-component tail metrics from a simulation
layer_tail_metrics(result, q, ...)                per-layer tail metrics from a simulation

from extremeloss import fit_pot_from_lossmodel, losses_from_risksim, tail_summary_from_risksim
from lossmodels import ParetoII

fit = fit_pot_from_lossmodel(ParetoII(2.5, 1000), size=40_000, threshold=1_500)

# after running a risksim Portfolio.simulate(...) -> result:
# net_losses = losses_from_risksim(result, view="retained")
# summary = tail_summary_from_risksim(result, quantiles=(0.95, 0.99, 0.995))

Plotting (optional)

With matplotlib installed, the extremeloss.plotting module offers diagnostic plots — plot_mean_excess(losses, thresholds), plot_hill_curve(losses), and plot_exceedance_curve(losses, thresholds) (each accepts an optional ax).

The ActuarialPy ecosystem

extremeloss is the tail-modeling layer of a small family of actuarial packages that interoperate through the .sample() / .mean() interface:

  • lossmodels — frequency / severity distributions, aggregate (collective-risk) loss models, coverage modifications, and model fitting; the source of body distributions for splicing and of models to fit tails from.
  • risksim — portfolio loss simulation and aggregate reinsurance; its simulation results feed the integration helpers above.
  • actuarialpy — deterministic, experience-and-data analysis (summaries, triangles, trend, credibility) on tabular data.

Testing

pytest -q

License

MIT License
