Interpretable Boosted Linear Model (IBLM): A transparent machine learning approach combining generalized linear models with gradient boosting

PyIBLM: Interpretable Boosted Linear Model

License: MIT | Python 3.12+

PyIBLM is a Python package implementing the Interpretable Boosted Linear Model (IBLM), a transparent machine learning approach that combines the interpretability of Generalized Linear Models (GLMs) with the predictive power of gradient boosting.

Features

  • 🎯 Interpretable by design: Combines GLM transparency with boosting performance
  • 📊 Multiple model families: Poisson, Tweedie, Gaussian, and more (via statsmodels)
  • 🚀 Gradient boosting integration: Uses scikit-learn's HistGradientBoostingRegressor and XGBoost
  • 📈 SHAP explanations: Built-in feature importance and contribution analysis
  • 🔍 Comprehensive diagnostics: Pinball scores, deviance metrics, and model comparisons
  • 📉 Visualization tools: Beta corrections, density plots, and correction corridors
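To make the "GLM plus boosting" idea concrete, here is a minimal sketch under a log link. It assumes the booster output is added to the GLM linear predictor on the link scale, so it acts as a multiplicative correction to the GLM mean. The coefficients, the observation, and the function names are hypothetical and are not part of the PyIBLM API.

```python
import math


def glm_linear_predictor(intercept, betas, x):
    """Linear predictor eta = b0 + sum_j b_j * x_j of a GLM."""
    return intercept + sum(b * v for b, v in zip(betas, x))


def combined_prediction(intercept, betas, x, booster_margin):
    """Log-link prediction with an additive booster margin on the link scale.

    Assumption: the booster corrects the GLM on the link scale, so the
    final mean is exp(eta) * exp(booster_margin).
    """
    eta = glm_linear_predictor(intercept, betas, x)
    return math.exp(eta + booster_margin)


# Hypothetical coefficients and a single observation, for illustration only.
intercept, betas = -2.0, [0.3, -0.1]
x = [1.0, 2.0]

glm_only = combined_prediction(intercept, betas, x, booster_margin=0.0)
corrected = combined_prediction(intercept, betas, x, booster_margin=0.05)

# The ratio shows the multiplicative effect of the booster correction.
print(glm_only, corrected, corrected / glm_only)
```

With a zero margin the prediction is the plain GLM mean, which is why the GLM coefficients stay readable as the baseline of the combined model.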

Installation

Basic Installation

pip install pyiblm

With Visualization Support

pip install "pyiblm[visualization]"

With Explainability Features

pip install "pyiblm[explainability]"

Full Installation

pip install "pyiblm[all]"

Quick Start

from pyBLM import (
    IBLMModel,
    BoosterConfig,
    GLMConfig,
    TrainingConfig,
    load_freMTPL2freq,
)

# Load example data
data = load_freMTPL2freq("data/freMTPL2freq.csv")
train, validate, test = data.split_into_train_validate_test(seed=123)

# Configure the model
config = TrainingConfig(
    response="ClaimRate",
    glm=GLMConfig(family="poisson"),
    booster=BoosterConfig(
        nrounds=500,
        early_stopping_rounds=20,
        params={"max_depth": 3, "eta": 0.025},
    ),
)

# Train the model
model = IBLMModel(config).fit(train, validate)

# Make predictions
predictions = model.predict(test)

# Get GLM parameters
glm_params = model.get_glm_params()
print(glm_params)
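The early_stopping_rounds=20 setting in the configuration presumably follows the usual boosting convention: training stops once the validation loss has not improved for that many consecutive rounds. The sketch below illustrates that rule on a made-up loss curve. The function best_round is a hypothetical helper, not a PyIBLM function, and the exact stopping logic used by the package is an assumption.

```python
def best_round(val_losses, patience):
    """Return (index, loss) of the best round under a patience rule.

    Scanning stops as soon as `patience` rounds have passed without a new
    minimum, so later rounds are never looked at.
    """
    best_idx, best_loss = 0, float("inf")
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            best_idx, best_loss = i, loss
        elif i - best_idx >= patience:
            break  # no improvement for `patience` rounds
    return best_idx, best_loss


# Made-up validation losses, for illustration only.
losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.6]
idx, loss = best_round(losses, patience=3)
print(idx, loss)
```

Note that the final value 0.6 is never reached here: the rule gives up after three rounds without improvement, which is the trade-off a small patience value makes.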

Core Components

Model Classes

  • IBLMModel: Main model class combining GLM and gradient boosting
  • BoosterConfig: Configuration for the gradient boosting component
  • GLMConfig: Configuration for the GLM component
  • TrainingConfig: Overall training configuration

Data Handling

  • load_freMTPL2freq(): Load example insurance dataset
  • FeaturePreprocessor: Automatic feature encoding and preprocessing
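The Quick Start splits the data with a fixed seed. As a rough illustration of what a reproducible three-way split involves, here is a standard-library sketch. The function name and the 70/15/15 proportions are assumptions for the example; the proportions PyIBLM actually uses are not stated here.

```python
import random


def split_three_way(rows, seed, fractions=(0.7, 0.15, 0.15)):
    """Shuffle rows with a seeded RNG and cut them into three parts.

    The fractions are hypothetical defaults; the last part takes whatever
    remains so that no row is dropped.
    """
    rng = random.Random(seed)  # local RNG keeps the global state untouched
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = round(n * fractions[0])
    n_val = round(n * fractions[1])
    train = shuffled[:n_train]
    validate = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, validate, test


rows = list(range(100))
train, validate, test = split_three_way(rows, seed=123)
print(len(train), len(validate), len(test))
```

Reusing the same seed gives the same partition, which keeps model comparisons on identical validation and test rows.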

Evaluation

  • poisson_deviance(): Compute Poisson deviance
  • get_pinball_scores(): Multi-model pinball loss comparison
  • calculate_deviance(): Family-based deviance calculation
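For orientation, the sketch below writes out the textbook mean Poisson deviance and the textbook pinball (quantile) loss in plain Python. These are the standard formulas, not the package's implementations: the function names are hypothetical, and get_pinball_scores() may define its score differently (for example, relative to a baseline model).

```python
import math


def mean_poisson_deviance_sketch(y_true, y_pred):
    """Mean of 2 * (y * log(y / mu) - (y - mu)), with y * log(y / mu) = 0 at y = 0."""
    total = 0.0
    for y, mu in zip(y_true, y_pred):
        if mu <= 0:
            raise ValueError("predicted means must be positive")
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - (y - mu))
    return total / len(y_true)


def pinball_loss_sketch(y_true, y_pred, tau):
    """Mean pinball loss at quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += max(tau * diff, (tau - 1.0) * diff)
    return total / len(y_true)


# Made-up counts and predictions, for illustration only.
y = [0, 1, 2]
mu = [0.5, 1.0, 1.5]
print(mean_poisson_deviance_sketch(y, mu))
print(pinball_loss_sketch([10.0, 10.0], [8.0, 12.0], tau=0.9))
```

Both losses are zero for a perfect prediction and grow as predictions move away from the observed values, so lower is better in each case.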

Explanation & Visualization

  • explain(): Generate explanation object with SHAP values
  • IBLMPlotter: Visualization utilities for model interpretation
  • correction_corridor(): Visualize model correction patterns
  • extract_booster_shap(): Extract SHAP values from booster
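One way booster SHAP values could be turned into "beta corrections" is to fold each feature's contribution back into its GLM coefficient, observation by observation. The sketch below is an assumption about that idea, not a description of PyIBLM internals: it divides each contribution by the feature value, moves contributions of zero-valued features to the intercept, and ignores the SHAP base value. All names and numbers are hypothetical.

```python
def corrected_betas(intercept, betas, x, shap_values):
    """Per-observation coefficients that absorb the booster contributions.

    For x_j != 0 the corrected coefficient is beta_j + shap_j / x_j.
    For x_j == 0 the contribution cannot be expressed as a slope, so it is
    added to the intercept instead (an assumption of this sketch).
    """
    new_intercept = intercept
    new_betas = []
    for beta, value, contribution in zip(betas, x, shap_values):
        if value != 0:
            new_betas.append(beta + contribution / value)
        else:
            new_betas.append(beta)
            new_intercept += contribution
    return new_intercept, new_betas


# Hypothetical GLM coefficients, one observation, and booster contributions.
intercept, betas = -2.0, [0.3, -0.1, 0.5]
x = [1.0, 2.0, 0.0]
shap_values = [0.04, -0.02, 0.01]

b0, b = corrected_betas(intercept, betas, x, shap_values)
eta_corrected = b0 + sum(bj * xj for bj, xj in zip(b, x))
print(b0, b, eta_corrected)
```

By construction, the corrected linear predictor equals the original GLM linear predictor plus the sum of the contributions, so the corrected coefficients can be read like ordinary GLM coefficients for that observation.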

Documentation

For detailed documentation and tutorials, see:

  • examples/ - Example scripts and use cases
  • dev.ipynb - Development notebook with a comprehensive example

Development

This package is actively developed. Contributions are welcome!

Development Setup

git clone https://github.com/ZZhouGit/pyBLM.git
cd pyBLM
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -e ".[all]"
poetry install --with dev

Running Tests

pytest tests/

Development Notebook

Open dev.ipynb in Jupyter to see comprehensive examples:

jupyter notebook dev.ipynb

Requirements

  • Python 3.12+
  • pandas >= 2.0.0
  • numpy >= 2.0.0
  • scikit-learn >= 1.5.0
  • xgboost >= 2.1.0
  • pydantic >= 2.10.0
  • statsmodels >= 0.14.0

Optional dependencies:

  • plotnine >= 0.15.0 (for visualization)
  • altair >= 5.4.0 (for interactive plots)
  • shap >= 0.45.0 (for SHAP explanations)

Citation

If you use PyIBLM in your research, please cite:

@software{pyiblm2025,
  title={PyIBLM: Interpretable Boosted Linear Models},
  author={Your Name},
  year={2025},
  url={https://github.com/ZZhouGit/pyBLM},
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Authors

  • Your Name

Acknowledgments

Built with scikit-learn, XGBoost, SHAP, and statsmodels.
