Python Wrapper of the 'midr' R package to interpret black-box models.

midlearn

midlearn is an rpy2-based Python wrapper for the midr R package. It provides a model-agnostic framework for interpreting black-box models through a scikit-learn compatible API.

The core objective of midlearn is to create a globally interpretable surrogate model through Maximum Interpretation Decomposition (MID). This technique finds an optimal additive approximation of any black-box model (e.g., GBMs, Neural Networks) by minimizing the squared error between the black-box predictions and the sum of the surrogate's components.

Main Features

  • Scikit-learn Compatible API: Fits seamlessly into existing workflows with familiar .fit() and .predict() methods.
  • Functional Decomposition: Deconstructs model predictions into an intercept, main effects, and second-order interaction effects, minimizing the squared residuals.
  • Model Fidelity: Quantifies the quality of the explanation and the complexity of the model using the Uninterpreted Variation Ratio.
  • Seamless Visualization: Built-in support for plotnine-based interfaces to generate feature importance, dependence plots, and additive breakdowns.
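
As a rough illustration of the Model Fidelity idea, the sketch below computes a ratio of residual variation to total variation between black-box predictions and surrogate predictions. The function name `uninterpreted_variation_ratio` and the exact formula (residual sum of squares divided by total sum of squares) are assumptions made for this example. It is not the midlearn or midr API.

```python
def uninterpreted_variation_ratio(black_box, surrogate):
    """Residual sum of squares divided by total sum of squares.

    Assumed definition for illustration only: 0.0 means the surrogate
    reproduces the black-box predictions exactly; larger values mean
    more variation is left unexplained.
    """
    if len(black_box) != len(surrogate):
        raise ValueError("prediction lists must have the same length")
    mean = sum(black_box) / len(black_box)
    total = sum((y - mean) ** 2 for y in black_box)
    if total == 0:
        raise ValueError("black-box predictions are constant")
    residual = sum((y - s) ** 2 for y, s in zip(black_box, surrogate))
    return residual / total


# Hypothetical predictions: the surrogate misses the last point by 1.0.
black_box_preds = [1.0, 2.0, 3.0, 4.0]
surrogate_preds = [1.0, 2.0, 3.0, 5.0]

ratio = uninterpreted_variation_ratio(black_box_preds, surrogate_preds)
print(ratio)
```

A ratio close to zero suggests that the additive surrogate captures most of the black-box model's behavior on the data it was evaluated on.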

Installation

midlearn requires an R installation on your system with the midr package.

1. Install R Package

From CRAN:

install.packages('midr')

Or from GitHub:

pak::pak("ryo-asashi/midr")

2. Install Python package

From PyPI:

pip install midlearn

Or from GitHub:

pip install git+https://github.com/ryo-asashi/midlearn.git
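
After both steps, a quick sanity check of the environment can save some debugging. The sketch below uses only the Python standard library. It checks whether an `Rscript` executable is on the PATH and whether the `rpy2` and `midlearn` packages can be located, without importing them. The helper name `check_environment` is hypothetical, and the check does not confirm that the midr R package itself is installed.

```python
import importlib.util
import shutil


def check_environment():
    """Report which prerequisites can be located (hypothetical helper)."""
    return {
        "Rscript on PATH": shutil.which("Rscript") is not None,
        "rpy2 importable": importlib.util.find_spec("rpy2") is not None,
        "midlearn importable": importlib.util.find_spec("midlearn") is not None,
    }


status = check_environment()
for name, ok in status.items():
    print(f"{name}: {'yes' if ok else 'no'}")
```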

Theoretical Foundation

MID is a functional decomposition method that deconstructs a black-box prediction function $f(\mathbf{X})$ into several interpretable components: an intercept $g_\emptyset$, main effects $g_j(X_j)$, and second-order interactions $g_{jk}(X_j, X_k)$. The components are chosen to minimize the expected squared residual $\mathbf{E}\left[g_D(\mathbf{X})^2\right]$, where $g_D(\mathbf{X})$ is the part of $f$ left unexplained:

$$ f(\mathbf{X}) = g_\emptyset + \sum_{j} g_j(X_j) + \sum_{j < k} g_{jk}(X_j, X_k) + g_D(\mathbf{X}) $$

To ensure the uniqueness and identifiability of each component, MID imposes centering and probability-weighted minimum-norm constraints on the decomposition.
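
To make the decomposition and the centering constraint concrete, here is a minimal worked sketch on a toy function of two categorical features, evaluated on a full 2 x 3 grid with equal weights. In this balanced setting, the least-squares additive fit with centered components reduces to row and column means. The grid values and variable names are made up for illustration, only main effects are fitted, and this is not how midr performs the fit in general.

```python
# Hypothetical black-box predictions f(x1, x2) on a full, equally weighted grid.
f = [
    [1.0, 2.0, 6.0],   # x1 = first level
    [3.0, 4.0, 11.0],  # x1 = second level
]
n_rows, n_cols = len(f), len(f[0])

# Intercept: the overall mean prediction.
intercept = sum(sum(row) for row in f) / (n_rows * n_cols)

# Main effects: marginal means minus the intercept, so each one averages to zero.
g1 = [sum(row) / n_cols - intercept for row in f]
g2 = [
    sum(f[i][j] for i in range(n_rows)) / n_rows - intercept
    for j in range(n_cols)
]

# Residual: whatever the intercept and main effects leave unexplained.
residual = [
    [f[i][j] - intercept - g1[i] - g2[j] for j in range(n_cols)]
    for i in range(n_rows)
]

print("intercept:", intercept)
print("g1:", g1)
print("g2:", g2)
print("residual:", residual)
```

With only two features, the leftover term coincides with the second-order interaction $g_{12}$, so the residual $g_D$ would vanish once that interaction is included in the surrogate.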

By approximating a black-box model with this surrogate structure, we can derive a representation that retains the superior predictive power of machine learning models without sacrificing actuarial transparency. Furthermore, it allows us to quantify the "uninterpreted" variance, i.e., the portion of the model's logic that can't be captured by low-order effects, via the residual term $g_D(\mathbf{X})$.

The theoretical foundations of MID are described in Iwasawa & Matsumori (2026) [Forthcoming], and the software implementation is detailed in Asashiba et al. (2025).

License

midlearn is licensed under the MIT License.
