Loss ratio analytics for long-term health insurance.
lossratio (Python)
Python sibling of the R lossratio package: loss ratio analytics
for long-term health insurance — cohort development analysis,
stage-adaptive projection, regime detection, and backtest validation
on long-format experience data. Stage-adaptive (SA) projection uses
an exposure-driven (ED) model before the maturity point and chain
ladder (CL) after.
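As a rough illustration of the stage-adaptive idea (not the package's actual implementation), the sketch below extends one cohort's cumulative loss path. Steps that start before the maturity point add an intensity times exposure; later steps multiply the previous cumulative loss by an age-to-age factor. The helper name project_stage_adaptive, the use of cumulative premium as the exposure, and all numbers are assumptions made for this example.

```python
def project_stage_adaptive(loss, premium, intensity, factor, k_star):
    """Extend an observed cumulative loss path to the final development period.

    Hypothetical helper, not part of lossratio.
    loss      -- observed cumulative loss for dev 1..n
    premium   -- cumulative premium for dev 1..N (N >= n), used as exposure
    intensity -- {k: additive loss per unit of exposure for step k -> k+1}
    factor    -- {k: multiplicative age-to-age factor for step k -> k+1}
    k_star    -- maturity point; steps starting at dev < k_star use ED
    """
    path = list(loss)
    for k in range(len(loss), len(premium)):  # k = last known dev (1-based)
        prev = path[-1]
        if k < k_star:
            # ED stage: additive, driven by exposure rather than by prior loss
            nxt = prev + intensity[k] * premium[k - 1]
        else:
            # CL stage: multiplicative development of the prior cumulative loss
            nxt = prev * factor[k]
        path.append(nxt)
    return path


path = project_stage_adaptive(
    loss=[10.0],                            # only dev 1 observed
    premium=[100.0, 200.0, 300.0, 400.0],   # cumulative premium, dev 1..4
    intensity={1: 0.5, 2: 0.25, 3: 0.125},
    factor={1: 6.0, 2: 1.8, 3: 1.25},
    k_star=3,
)
print(path)  # [10.0, 60.0, 110.0, 137.5]
```

Setting k_star=1 in this sketch reduces it to a pure chain ladder path, which is one way to see why the early, volatile factors are replaced by an exposure-based step.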
This Python implementation is in active development (0.0.1.devN
release line on PyPI).
Install
pip install lossratio # polars only
pip install "lossratio[pandas]" # add pandas / pyarrow support (quotes keep shells such as zsh from expanding the brackets)
Current status
Working components:
- Triangle: cohort × dev aggregation. Accepts a long-format experience frame (cym, uym, loss_incr, premium_incr), validates the schema, and adds derived period columns inline. Cumulative is the unmarked default (loss, premium, lr); per-period values carry an _incr suffix.
- CL, ED, LR: sklearn-style estimators for chain ladder, exposure-driven, and stage-adaptive loss-ratio projection. fit(triangle) returns a CLFit / EDFit / LRFit with summary(), a df projection frame, and per-cohort SE / CV.
- Triangle.link(): builds the long-format Link table (one row per cohort × adjacent dev pair). The method chains tri.link().ata() and tri.link().intensity() return paired factor-level diagnostics (multiplicative ATA factors, additive ED intensities). Add .maturity(...) after .ata() to detect the development period at which age-to-age factors stabilise.
- Triangle.detect_regime(): detects structural shifts across the cohort sequence via E-Divisive or Ward hierarchical clustering (returns a Regime result).
- Backtest: calendar-diagonal hold-out backtest of any of the above estimators (returns a BacktestFit with per-cell, by-dev, and by-diagonal A/E error summaries, where ae_err = actual / predicted - 1).
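The link table and its two factor views can be pictured with plain Python. The sketch below builds one row per cohort × adjacent dev pair from a tiny cumulative triangle, then derives a volume-weighted multiplicative ATA factor and an additive intensity per dev. The exact estimators used by lossratio may differ; the formulas (sum of later losses over sum of earlier losses, and sum of loss increments over sum of cumulative premium), the field names, and the numbers are assumptions for illustration.

```python
from collections import defaultdict

# Cumulative loss and premium per cohort for dev 1, 2, 3 (made-up numbers).
triangle = {
    "2024-01": {"loss": [10.0, 30.0, 45.0], "premium": [100.0, 200.0, 300.0]},
    "2024-02": {"loss": [20.0, 60.0], "premium": [100.0, 200.0]},
    "2024-03": {"loss": [10.0], "premium": [100.0]},
}

# Link table: one row per cohort x adjacent dev pair (dev -> dev + 1).
links = []
for cohort, cells in triangle.items():
    loss, premium = cells["loss"], cells["premium"]
    for i in range(len(loss) - 1):
        links.append({
            "cohort": cohort,
            "dev": i + 1,
            "loss_from": loss[i],
            "loss_to": loss[i + 1],
            "exposure": premium[i],  # assumed exposure: cumulative premium
        })

# Aggregate the links by dev.
sums = defaultdict(lambda: {"from": 0.0, "to": 0.0, "exposure": 0.0})
for row in links:
    s = sums[row["dev"]]
    s["from"] += row["loss_from"]
    s["to"] += row["loss_to"]
    s["exposure"] += row["exposure"]

# Multiplicative ATA factor and additive intensity per dev.
ata = {dev: s["to"] / s["from"] for dev, s in sums.items()}
intensity = {dev: (s["to"] - s["from"]) / s["exposure"] for dev, s in sums.items()}

print(len(links))  # 3
print(ata)         # {1: 3.0, 2: 1.5}
print(intensity)   # {1: 0.3, 2: 0.075}
```

The youngest cohort contributes no link because it has only one observed dev, which is why the number of observations per dev shrinks as dev grows.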
Not yet ported from the R sibling: Calendar / Total
aggregations and the Convergence diagnostic.
Quick Start
import polars as pl
import lossratio as lr
# Built-in synthetic experience: four coverages (CI / CAN / HOS / SUR),
# 36 monthly cohorts each, up to 36 dev months. SUR carries one
# regime shift at 2025-07. We focus on SUR for this walk-through.
df = lr.load_experience()
df.head(3)
#> shape: (3, 5)
#> ┌──────────┬────────────┬────────────┬───────────┬──────────────┐
#> │ coverage ┆ cym ┆ uym ┆ loss_incr ┆ premium_incr │
#> ╞══════════╪════════════╪════════════╪═══════════╪══════════════╡
#> │ CI ┆ 2024-01-01 ┆ 2024-01-01 ┆ 12.675578 ┆ 100.0 │
#> │ CI ┆ 2024-02-01 ┆ 2024-01-01 ┆ 63.639327 ┆ 100.0 │
#> │ CI ┆ 2024-03-01 ┆ 2024-01-01 ┆ 73.363608 ┆ 100.0 │
#> └──────────┴────────────┴────────────┴───────────┴──────────────┘
# 1. Subset to SUR (the coverage with the planted regime shift), then
# build the cohort x dev triangle. Triangle's constructor validates
# schema and adds derived period columns inline.
df_sur = df.filter(pl.col("coverage") == "SUR")
tri = lr.Triangle(df_sur, group_var="coverage")
# 2. Factor-level diagnostics via the link chain. Build the link table
# once, derive both ATA factors and ED intensities from it.
link = tri.link()
link
#> <Link: 1 groups, 630 total links, dual-mode>
ata = link.ata()
ata.df.head(3)
#> shape: (3, 7)
#> ┌──────────┬─────┬──────────┬───────────┬──────────┬──────────┬───────┐
#> │ coverage ┆ dev ┆ f ┆ sigma2 ┆ cv ┆ rse ┆ n_obs │
#> ╞══════════╪═════╪══════════╪═══════════╪══════════╪══════════╪═══════╡
#> │ SUR ┆ 1 ┆ 6.001549 ┆ 12.841988 ┆ 0.133092 ┆ 0.02052 ┆ 35 │
#> │ SUR ┆ 2 ┆ 1.851539 ┆ 1.431424 ┆ 0.056084 ┆ 0.009168 ┆ 34 │
#> │ SUR ┆ 3 ┆ 1.459929 ┆ 0.616219 ┆ 0.033029 ┆ 0.005667 ┆ 33 │
#> └──────────┴─────┴──────────┴───────────┴──────────┴──────────┴───────┘
ata.maturity(max_cv=0.15, max_rse=0.05, min_run=2).k_star
#> {'SUR': 1}
# 3. Project loss ratios with the stage-adaptive method (default).
fit = lr.LR().fit(tri)
fit.summary().select(["coverage", "cohort", "lr_ult", "se_lr", "cv_lr"]).head(3)
#> shape: (3, 5)
#> ┌──────────┬────────────┬──────────┬──────────┬──────────┐
#> │ coverage ┆ cohort ┆ lr_ult ┆ se_lr ┆ cv_lr │
#> ╞══════════╪════════════╪══════════╪══════════╪══════════╡
#> │ SUR ┆ 2024-01-01 ┆ 1.432623 ┆ null ┆ null │
#> │ SUR ┆ 2024-02-01 ┆ 1.428767 ┆ 0.00083 ┆ 0.000581 │
#> │ SUR ┆ 2024-03-01 ┆ 1.407394 ┆ 0.001983 ┆ 0.001409 │
#> └──────────┴────────────┴──────────┴──────────┴──────────┘
# 4. Detect cohort regime shifts.
reg = tri.detect_regime(loss_var="lr", K=12)
reg.breakpoints
#> [datetime.date(2025, 7, 1)]
# 5. Calendar-diagonal hold-out backtest. The last 6 diagonals are
# masked, the estimator is refitted on the remaining cells, and
# the projection is compared with actual loss.
# ae_err = actual / predicted - 1 (signed relative error).
bt = lr.Backtest(estimator=lr.LR(), holdout=6).fit(tri)
bt.diag_summary.head(3)
#> shape: (3, 6)
#> ┌──────────┬──────────────┬─────┬─────────────┬────────────┬───────────┐
#> │ coverage ┆ calendar_idx ┆ n ┆ ae_err_mean ┆ ae_err_med ┆ ae_err_wt │
#> ╞══════════╪══════════════╪═════╪═════════════╪════════════╪═══════════╡
#> │ SUR ┆ 30 ┆ 30 ┆ 0.0111 ┆ 0.001512 ┆ 0.003015 │
#> │ SUR ┆ 31 ┆ 30 ┆ 0.011723 ┆ 0.001789 ┆ 0.008004 │
#> │ SUR ┆ 32 ┆ 30 ┆ 0.012893 ┆ 0.000294 ┆ 0.012847 │
#> └──────────┴──────────────┴─────┴─────────────┴────────────┴───────────┘
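The A/E error columns can be worked through by hand on a few cells. The sketch below uses made-up actual and predicted values for one diagonal. Reading ae_err_wt as total actual over total predicted, minus one, is an assumption about that column rather than something this README states.

```python
from statistics import mean, median

# (actual, predicted) cumulative loss for held-out cells on one diagonal.
# Made-up numbers for illustration.
cells = [(105.0, 100.0), (196.0, 200.0), (330.0, 300.0)]

# Per-cell signed relative error: positive means actual exceeded predicted.
ae_err = [actual / predicted - 1 for actual, predicted in cells]

summary = {
    "n": len(cells),
    "ae_err_mean": mean(ae_err),
    "ae_err_med": median(ae_err),
    # Assumed definition: total actual over total predicted, minus one.
    "ae_err_wt": sum(a for a, _ in cells) / sum(p for _, p in cells) - 1,
}

for name, value in summary.items():
    print(name, round(value, 4))
# n 3
# ae_err_mean 0.0433
# ae_err_med 0.05
# ae_err_wt 0.0517
```

A mean well above the median, as in the SUR output above, suggests a few cells with large positive errors rather than a uniform bias.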
To analyse multiple coverages jointly, drop the upfront filter; every
estimator and detector then fits per group, with coverage already
labelling each output row.
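Per-group fitting produces one result per group label, as in the k_star dictionary from the Quick Start. The sketch below applies a simple run-based stability rule to each coverage. The rule itself (the first dev that starts min_run consecutive devs with cv <= max_cv and rse <= max_rse) is an assumption about what maturity() does, and find_maturity is a hypothetical helper. The SUR rows are copied from the ATA output above; the HOS rows are invented.

```python
def find_maturity(rows, max_cv, max_rse, min_run):
    """Return the first dev that starts `min_run` consecutive stable devs.

    Hypothetical rule, not lossratio's implementation. `rows` is a list of
    (dev, cv, rse) tuples sorted by dev; returns None if no run qualifies.
    """
    stable = [cv <= max_cv and rse <= max_rse for _, cv, rse in rows]
    for i in range(len(rows) - min_run + 1):
        if all(stable[i:i + min_run]):
            return rows[i][0]
    return None


factors = {
    # First three SUR rows of the ATA table shown in the Quick Start.
    "SUR": [(1, 0.133092, 0.02052), (2, 0.056084, 0.009168), (3, 0.033029, 0.005667)],
    # Invented rows for a second coverage, for illustration only.
    "HOS": [(1, 0.40, 0.08), (2, 0.12, 0.03), (3, 0.09, 0.02)],
}

# Fit per group: one result per coverage, keyed by the group label.
k_star = {
    coverage: find_maturity(rows, max_cv=0.15, max_rse=0.05, min_run=2)
    for coverage, rows in factors.items()
}
print(k_star)  # {'SUR': 1, 'HOS': 2}
```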
To plug in your own data, build a long-format frame with these
columns and pass it to lr.Triangle(df, group_var=...):
- cym (date): calendar year-month
- uym (date): underwriting year-month (cohort)
- loss_incr (numeric): per-period claim amount
- premium_incr (numeric): per-period premium
Triangle also accepts an optional group_var (coverage, product,
age band, ...) — each estimator and detector then fits per group.
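To make the expected input shape concrete, the sketch below builds a few long-format rows with the four required columns, derives a development index from cym and uym, and accumulates the per-period values into cumulative loss, premium, and lr. It uses only the standard library; in practice the rows would live in a polars or pandas frame passed to lr.Triangle. The 1-based dev index, the helper months_between, and the values are assumptions for illustration.

```python
from datetime import date
from itertools import groupby

# Long-format experience rows with the four required columns (made-up values).
rows = [
    {"cym": date(2024, 1, 1), "uym": date(2024, 1, 1), "loss_incr": 10.0, "premium_incr": 100.0},
    {"cym": date(2024, 2, 1), "uym": date(2024, 1, 1), "loss_incr": 50.0, "premium_incr": 100.0},
    {"cym": date(2024, 3, 1), "uym": date(2024, 1, 1), "loss_incr": 60.0, "premium_incr": 100.0},
    {"cym": date(2024, 2, 1), "uym": date(2024, 2, 1), "loss_incr": 20.0, "premium_incr": 100.0},
    {"cym": date(2024, 3, 1), "uym": date(2024, 2, 1), "loss_incr": 40.0, "premium_incr": 100.0},
]


def months_between(start, end):
    """Whole months from `start` to `end` (both first-of-month dates)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


# Accumulate per cohort in calendar order to get the cumulative cells.
cells = []
rows.sort(key=lambda r: (r["uym"], r["cym"]))
for cohort, group in groupby(rows, key=lambda r: r["uym"]):
    loss = premium = 0.0
    for r in group:
        loss += r["loss_incr"]
        premium += r["premium_incr"]
        cells.append({
            "cohort": cohort,
            "dev": months_between(cohort, r["cym"]) + 1,  # assumed 1-based
            "loss": loss,
            "premium": premium,
            "lr": loss / premium,
        })

for c in cells:
    print(c["cohort"], c["dev"], c["loss"], c["premium"], round(c["lr"], 3))
```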
Pandas inputs are accepted too; outputs mirror the input type
(pandas in → pandas out, polars in → polars out). Use the
[pandas] install extra (see above) to pull in pandas and
pyarrow.
R package
- Source: https://github.com/seokhoonj/lossratio
- Documentation: https://seokhoonj.github.io/lossratio/
- Korean documentation: https://seokhoonj.github.io/lossratio/ko/
remotes::install_github("seokhoonj/lossratio")
library(lossratio)
Author
Seokhoon Joo (@seokhoonj, seokhoonj@gmail.com), who also maintains the R lossratio package.
License
MPL-2.0 (Mozilla Public License 2.0).