A simple Python package for Functionally-Identical Pruning of Ensemble models
Project description
FIPE: Functionally Identical Pruning of Ensembles
This repository provides methods for Functionally-Identical Pruning of Tree Ensembles (FIPE). Given a trained scikit-learn model, FIPE produces a pruned model that is certified to be equivalent to the original over the entire feature space. The algorithm is described in detail in the paper: https://arxiv.org/abs/2408.16167 .
Installation
This project requires the Gurobi solver. Free academic licenses are available; please consult the Gurobi documentation for details.
Run the following commands from the project root to install the package. You may need to install Python and virtualenv first.
virtualenv -p python3.12 env
source env/bin/activate
pip install fipepy
The installation can be checked by running the test suite:
pip install tox
tox
The integration tests require a working Gurobi license. If a license is not available, the tests will pass and print a warning.
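Before running the integration tests, you can check from Python whether a Gurobi license can be acquired. The helper below is a minimal sketch, not part of fipepy; the status strings are our own labels. Constructing a `gurobipy.Env` is what triggers the license check.

```python
def gurobi_license_status() -> str:
    """Report whether gurobipy is installed and licensed.

    Returns one of "missing", "licensed", or "unlicensed".
    """
    try:
        import gurobipy as gp
    except ImportError:
        return "missing"  # gurobipy is not installed
    try:
        # Creating an environment triggers the license check.
        with gp.Env():
            return "licensed"
    except gp.GurobiError:
        return "unlicensed"

print(gurobi_license_status())
```

If this prints anything other than `licensed`, expect the integration tests to be skipped with a warning.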
Getting started
A minimal working example to prune an AdaBoost ensemble is presented below.
import gurobipy as gp
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from fipe import FIPE, FeatureEncoder
# Load data and encode features
data = load_iris(as_frame=True)
X = pd.DataFrame(data.data)
y = data.target
encoder = FeatureEncoder(X)
X = encoder.X.to_numpy()
# Train tree ensemble
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
base = AdaBoostClassifier(n_estimators=100, random_state=42)
base.fit(X_train, y_train)
# Read and normalize weights
w = base.estimator_weights_
w = (w / w.max()) * 1e5
# Prune using FIPE
norm = 1
print(f"Pruning model by minimizing l_{norm} norm.")
env = gp.Env()
env.setParam("OutputFlag", 0)
pruner = FIPE(
base=base,
encoder=encoder,
weights=w,
norm=norm,
env=env,
eps=1e-6,
tol=1e-4,
)
print("Building pruner...")
pruner.build()
pruner.add_samples(X_train)
print("Pruning...")
pruner.prune()
print("Finished pruning.")
# Read pruned model
n_active_estimators = pruner.n_active_estimators
print(
f"The pruned ensemble has {n_active_estimators}"
f"/{base.n_estimators} active estimators."
)
# Verify functionally-identical on test data
y_pred = base.predict(X_test)
y_pruned = pruner.predict(X_test)
fidelity = np.mean(y_pred == y_pruned)
print(f"Fidelity to initial ensemble is {fidelity * 100:.2f}%.")
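The fidelity computation above can be factored into a small helper that works for any pair of prediction arrays. This is a sketch using NumPy only; the `fidelity` function name is ours, not part of fipepy.

```python
import numpy as np

def fidelity(y_a, y_b) -> float:
    """Fraction of samples on which two sets of predictions agree."""
    y_a = np.asarray(y_a)
    y_b = np.asarray(y_b)
    if y_a.shape != y_b.shape:
        raise ValueError("Prediction arrays must have the same shape.")
    return float(np.mean(y_a == y_b))

# Two classifiers agreeing on 3 of 4 samples:
print(fidelity([0, 1, 2, 1], [0, 1, 2, 0]))  # → 0.75
```

Note that fidelity measured on a finite test set is only a sanity check; FIPE's certificate of equivalence holds over the entire feature space.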
Project details
Download files
Download the file for your platform.
Source Distribution
Built Distribution
File details
Details for the file fipepy-1.0.7.tar.gz.
File metadata
- Download URL: fipepy-1.0.7.tar.gz
- Upload date:
- Size: 218.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.9.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c8497d1d3b292500f309abc8002c96cfe8a54255704cd8b956e9153eabb57160 |
| MD5 | f8b042f888c272b9b89d7b5d692582db |
| BLAKE2b-256 | 4dcb0eeea4ba956756c2e1e4822d00786f7ad17d710810df8cdb0aa47e9ba9fe |
File details
Details for the file fipepy-1.0.7-py3-none-any.whl.
File metadata
- Download URL: fipepy-1.0.7-py3-none-any.whl
- Upload date:
- Size: 34.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.9.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3231ec38bba1afe1004d988c28d8f56bfe7913979f66eed0c0a31b5b2ec48e86 |
| MD5 | 76e3d756bed25503f8618c3c0d3b5c7e |
| BLAKE2b-256 | 2151b86eab3c0347ba7b1c6a3f857f37d186e37a2ff2888271e65b1215fb8aa5 |