Explainable Boosted Scoring
xBooster 🚀
xBooster is a Python package designed to enhance the interpretability and explainability of XGBoost models.
It provides tools for constructing gradient boosted scorecards, generating local interpretations, and visualizing model explanations.
Features ✨
1️⃣ Construct (credit) scorecards for XGBoost models and run inference with them.
2️⃣ Visualize feature importances using several metrics and two methods.
3️⃣ Generate local explanations for model predictions.
4️⃣ Generate SQL queries for boosted scorecards for easy deployment (e.g., with DuckDB).
The explainer methodology leverages the concepts of Weight-of-Evidence (WOE) and Fisher's likelihood to calculate feature importances and local explanations. 🎲 For instance, the booster's margins are treated as likelihoods and are conceptually similar to WOE. 📈 A scorecard can be constructed from WOE (the natural logarithm of likelihood) using the booster's split information.
The explainer's results are highly consistent with SHAP values but do not require significant computational resources, since all the information is taken from the booster model itself. 💡 This means you can gain valuable insights into your model's behavior without the heavy computational overhead typically associated with SHAP. 🚀
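As a rough illustration of the idea (a hand-written sketch, not the package's internal code), the WOE of a bin or leaf region can be computed from its event and non-event counts:

import math

def weight_of_evidence(events, non_events, total_events, total_non_events):
    # WOE = ln( share of non-events in the region / share of events in the region ),
    # i.e. the natural logarithm of a likelihood ratio (sign conventions vary).
    pct_non_events = non_events / total_non_events
    pct_events = events / total_events
    return math.log(pct_non_events / pct_events)

# Example: a split region holding 30 of 100 events and 270 of 900 non-events
print(weight_of_evidence(30, 270, 100, 900))  # 0.0 -> no separation in this region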
Installation 🛠️
You can install xBooster via pip:
pip install xbooster
Usage 📝
Here's a quick example of how to use xBooster to construct a scorecard for an XGBoost model:
import pandas as pd
import xgboost as xgb
from xbooster.constructor import XGBScorecardConstructor
from sklearn.model_selection import train_test_split
# Load data and train XGBoost model
data = pd.read_csv("data.csv")
X = data.drop(columns=["target"])
y = data["target"]
model = xgb.XGBClassifier()
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42
)
model.fit(X_train, y_train)
# Initialize XGBScorecardConstructor
scorecard_constructor = XGBScorecardConstructor(model, X_train, y_train)
scorecard_constructor.construct_scorecard()
# Print the scorecard
print(scorecard_constructor.scorecard)
Next, we can add scoring points to the scorecard and test its discriminatory power (Gini score):
from sklearn.metrics import roc_auc_score

# Create scoring points
xgb_scorecard_with_points = scorecard_constructor.create_points(
    pdo=50, target_points=600, target_odds=50
)
# Make predictions using the scorecard
credit_scores = scorecard_constructor.predict_score(X_test)
# Gini = 2 * AUC - 1; scores are negated because higher scores mean lower risk
gini = roc_auc_score(y_test, -credit_scores) * 2 - 1
print(f"Test Gini score: {gini:.2%}")
We can also visualize the score distributions of events and non-events:
from xbooster import explainer
explainer.plot_score_distribution(
y_test,
credit_scores,
num_bins=30,
figsize=(8, 3),
dpi=100
)
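Conceptually, this overlays the score histograms of the two classes. A rough hand-rolled equivalent with matplotlib (assuming credit_scores is a numpy array or pandas Series aligned with y_test) would be:

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 3), dpi=100)
ax.hist(credit_scores[y_test == 0], bins=30, alpha=0.5, label="non-events")
ax.hist(credit_scores[y_test == 1], bins=30, alpha=0.5, label="events")
ax.set_xlabel("Credit score")
ax.set_ylabel("Count")
ax.legend()
plt.show()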
We can further examine feature importances. Below, we visualize the global feature importances using Points as the metric:
from xbooster import explainer
explainer.plot_importance(
scorecard_constructor,
metric='Points',
method='global',
normalize=True,
figsize=(3, 3)
)
Alternatively, we can calculate local feature importances, which matter for boosters with a depth greater than 1.
from xbooster import explainer
explainer.plot_importance(
scorecard_constructor,
metric='Likelihood',
method='local',
normalize=True,
color='#ffd43b',
edgecolor='#1e1e1e',
figsize=(3, 3)
)
Finally, we can generate a scorecard in SQL format.
sql_query = scorecard_constructor.generate_sql_query(table_name='my_table')
print(sql_query)
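For example, the generated query can be run directly over a pandas DataFrame with DuckDB. A minimal sketch, assuming the query selects from the my_table name passed above and that X_test contains the raw feature columns it references:

import duckdb

con = duckdb.connect()
con.register("my_table", X_test)   # expose the test features under the query's table name
scored = con.execute(sql_query).df()
print(scored.head())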
For more detailed examples and documentation, please refer to the documentation and check out the notebooks/ directory.
Contributing 🤝
Contributions are welcome! For bug reports or feature requests, please open an issue.
For code contributions, please open a pull request.
License 📄
This project is licensed under the MIT License - see the LICENSE file for details.
Download files
Source Distribution
Built Distribution
File details

Details for the file xbooster-0.1.0.tar.gz.

File metadata

- Download URL: xbooster-0.1.0.tar.gz
- Upload date:
- Size: 21.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.6.1 CPython/3.10.12 Darwin/21.6.0

File hashes

Algorithm | Hash digest
---|---
SHA256 | 6122aa44418727f9b8ef99ef26d96205de2be1bb67685fe481ad97b9f89c9c22
MD5 | 42d351455f9ae9f99702a943040b9a45
BLAKE2b-256 | c6257860c05f568e1d8a08f0d61be6a93538b9110253afa5ebeda97c82e0487e
File details

Details for the file xbooster-0.1.0-py3-none-any.whl.

File metadata

- Download URL: xbooster-0.1.0-py3-none-any.whl
- Upload date:
- Size: 22.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.6.1 CPython/3.10.12 Darwin/21.6.0

File hashes

Algorithm | Hash digest
---|---
SHA256 | c8ea6f5a15af4662e6abd6bf5ed319357a842af1244f13a87d29ff1da9de0bf2
MD5 | 0bd35c669e701413fa91b9b5336249fc
BLAKE2b-256 | 4299fc5e6f4ba33cd7b5b1044d0417d1a6edc23e1696ecf833db971b291452eb