
A simple Python package for Functionally-Identical Pruning of Ensemble models

Project description

FIPE: Functionally Identical Pruning of Ensembles


This repository provides methods for Functionally-Identical Pruning of Tree Ensembles (FIPE). Given a trained scikit-learn model, FIPE provides a pruned model that is certified to be equivalent to the original model on the entire feature space. The algorithm is described in detail in the paper: https://arxiv.org/abs/2408.16167.
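Concretely, "functionally identical" means that the pruned ensemble's weighted majority vote agrees with the original ensemble on every input, not merely on a held-out set. The toy NumPy sketch below (with made-up votes and weights; it is an illustration of the property, not the FIPE algorithm) shows how removing an estimator and reweighting the rest can leave every prediction unchanged. FIPE searches for such a pruning and certifies the agreement over the entire feature space.

```python
import numpy as np

# Binary votes (classes 0/1) of 4 estimators on 3 inputs, plus
# per-estimator weights. Purely illustrative numbers.
votes = np.array([[0, 1, 1],
                  [0, 1, 1],
                  [1, 0, 1],
                  [0, 1, 0]])
w_full = np.array([1.0, 1.0, 1.0, 0.5])
w_pruned = np.array([2.0, 0.0, 1.0, 0.5])  # estimator 1 dropped, estimator 0 reweighted

def weighted_vote(votes, w):
    # Weighted class scores: total weight of estimators voting for class 1
    # versus the remaining weight, which votes for class 0.
    score1 = w @ votes
    score0 = w.sum() - score1
    return (score1 > score0).astype(int)

print(weighted_vote(votes, w_full))    # predictions of the full ensemble
print(weighted_vote(votes, w_pruned))  # identical predictions with one estimator removed
```

Here both weight vectors induce the same predictions on all three inputs, even though the pruned ensemble uses only three of the four estimators.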

Installation

This project requires the Gurobi solver. Free academic licenses are available; see the Gurobi website for details.

Run the following commands from the project root to install the package. You may need to install Python and virtualenv first.

virtualenv -p python3.12 env
source env/bin/activate
pip install fipepy

The installation can be checked by running the test suite:

pip install tox
tox

The integration tests require a working Gurobi license. If a license is not available, the tests will pass and print a warning.

Getting started

A minimal working example to prune an AdaBoost ensemble is presented below.

import gurobipy as gp
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

from fipe import FIPE, FeatureEncoder

# Load data and encode features
data = load_iris(as_frame=True)
X = pd.DataFrame(data.data)
y = data.target

encoder = FeatureEncoder(X)
X = encoder.X.to_numpy()

# Train tree ensemble
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
base = AdaBoostClassifier(n_estimators=100, random_state=42)
base.fit(X_train, y_train)

# Read and normalize weights
w = base.estimator_weights_
w = (w / w.max()) * 1e5

# Prune using FIPE
norm = 1
print(f"Pruning model by minimizing l_{norm} norm.")
env = gp.Env()
env.setParam("OutputFlag", 0)
pruner = FIPE(
    base=base,
    encoder=encoder,
    weights=w,
    norm=norm,
    env=env,
    eps=1e-6,
    tol=1e-4,
)
print("Building pruner...")
pruner.build()
pruner.add_samples(X_train)
print("Pruning...")
pruner.prune()
print("Finished pruning.")

# Read pruned model
n_active_estimators = pruner.n_active_estimators
print(
    f"The pruned ensemble has {n_active_estimators}"
    f"/{base.n_estimators} active estimators."
)

# Verify that the models are functionally identical on test data
y_pred = base.predict(X_test)
y_pruned = pruner.predict(X_test)
fidelity = np.mean(y_pred == y_pruned)
print(f"Fidelity to initial ensemble is {fidelity * 100:.2f}%.")
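Test-set fidelity is only a spot check; the certificate FIPE computes holds over the whole feature space. As an additional heuristic sanity check, one can also sample random points inside the bounding box of the data and compare predictions there. The self-contained sketch below uses two AdaBoost models of different sizes purely as stand-ins for an original and a pruned ensemble (it does not require Gurobi or fipepy).

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier

X, y = load_iris(return_X_y=True)
full = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
small = AdaBoostClassifier(n_estimators=10, random_state=0).fit(X, y)

# Sample uniformly inside the bounding box of the training data,
# probing agreement away from the observed samples.
rng = np.random.default_rng(0)
lo, hi = X.min(axis=0), X.max(axis=0)
samples = rng.uniform(lo, hi, size=(1000, X.shape[1]))

agreement = np.mean(full.predict(samples) == small.predict(samples))
print(f"Agreement on random feature-space samples: {agreement * 100:.2f}%")
```

A naively truncated ensemble will typically disagree with the original somewhere in this box, whereas a FIPE-pruned ensemble is guaranteed to agree everywhere.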

Project details


Download files

Download the file for your platform.

Source Distribution

fipepy-1.0.7.tar.gz (218.2 kB)

Uploaded Source

Built Distribution


fipepy-1.0.7-py3-none-any.whl (34.5 kB)

Uploaded Python 3

File details

Details for the file fipepy-1.0.7.tar.gz.

File metadata

  • Download URL: fipepy-1.0.7.tar.gz
  • Upload date:
  • Size: 218.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.9.21

File hashes

Hashes for fipepy-1.0.7.tar.gz

  • SHA256: c8497d1d3b292500f309abc8002c96cfe8a54255704cd8b956e9153eabb57160
  • MD5: f8b042f888c272b9b89d7b5d692582db
  • BLAKE2b-256: 4dcb0eeea4ba956756c2e1e4822d00786f7ad17d710810df8cdb0aa47e9ba9fe

See more details on using hashes here.

File details

Details for the file fipepy-1.0.7-py3-none-any.whl.

File metadata

  • Download URL: fipepy-1.0.7-py3-none-any.whl
  • Upload date:
  • Size: 34.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.9.21

File hashes

Hashes for fipepy-1.0.7-py3-none-any.whl

  • SHA256: 3231ec38bba1afe1004d988c28d8f56bfe7913979f66eed0c0a31b5b2ec48e86
  • MD5: 76e3d756bed25503f8618c3c0d3b5c7e
  • BLAKE2b-256: 2151b86eab3c0347ba7b1c6a3f857f37d186e37a2ff2888271e65b1215fb8aa5

