
Exact computation of Shapley R-squared for tree ensembles in polynomial time


Q-SHAP: Feature-Specific $R^2$ Values for Tree Ensembles


This package computes feature-specific $R^2$ values for tree ensembles in polynomial time. The values follow from a Shapley decomposition of the total $R^2$, as described in the paper by Jiang, Zhang, and Zhang (2024).

This version supports only models from XGBoost, LightGBM, scikit-learn Decision Tree, and scikit-learn GBDT. Support for random forests is planned for the next version. See the Q-SHAP Tutorial for more details on using Q-SHAP.
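To make the idea of a Shapley decomposition of $R^2$ concrete, here is a minimal brute-force sketch. It is not the polynomial-time algorithm that this package implements, and it does not use qshap at all. As an assumption chosen purely for illustration, the value of a feature subset is taken to be the $R^2$ of an ordinary least-squares fit on that subset; the value function used for tree ensembles in the paper is different. The function names `r_squared` and `shapley_r_squared` are hypothetical.

```python
from itertools import combinations
from math import factorial

import numpy as np


def r_squared(x, y, subset):
    """R^2 of an ordinary least-squares fit of y on the columns in `subset`."""
    if not subset:
        return 0.0  # an intercept-only model explains no variance
    design = np.column_stack([np.ones(len(y)), x[:, list(subset)]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)


def shapley_r_squared(x, y):
    """Brute-force Shapley share of R^2 per feature (exponential in p)."""
    p = x.shape[1]
    phi = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for size in range(p):
            # Shapley weight for coalitions of this size that exclude j
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                gain = r_squared(x, y, subset + (j,)) - r_squared(x, y, subset)
                phi[j] += weight * gain
    return phi


# Synthetic data: feature 0 matters most, feature 2 is pure noise
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
y = 2.0 * x[:, 0] + 0.5 * x[:, 1] + rng.normal(size=200)

phi = shapley_r_squared(x, y)
total = r_squared(x, y, (0, 1, 2))
print(phi, phi.sum(), total)
```

The per-feature values sum to the total $R^2$ of the full model, which is the property that makes them readable as feature-specific $R^2$ values. The enumeration above visits every feature subset, so its cost grows exponentially with the number of features; that cost is what a polynomial-time method for tree ensembles avoids.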

Installation

qshap can be installed through PyPI:

pip install qshap

Quick Start

# Import necessary libraries
from sklearn.datasets import fetch_california_housing
from qshap import gazer, vis
import xgboost as xgb
import numpy as np

# Load the California Housing dataset and fit an XGBoost regressor
housing = fetch_california_housing()
x, y, feature_names = housing.data, housing.target, housing.feature_names
model = xgb.XGBRegressor(max_depth=2, n_estimators=50, random_state=42).fit(x, y)

# Obtain feature-specific R^2 using qshap, based on 1024 randomly sampled observations
gazer_rsq = gazer(model)
phi_rsq = gazer.rsq(gazer_rsq, x, y, nsample=1024, random_state=42)

# Visualize top values of feature-specific R^2
vis.rsq(phi_rsq, label=np.array(feature_names), rotation=30, save_name="cal_housing", color_map_name="Pastel2")

Citation

@article{jiang2024feature,
  title={Feature-Specific Coefficients of Determination in Tree Ensembles},
  author={Jiang, Zhongli and Zhang, Dabao and Zhang, Min},
  journal={arXiv preprint arXiv:2407.03515},
  year={2024}
}

References

  • Jiang, Zhongli, Dabao Zhang, and Min Zhang. "Feature-specific coefficients of determination in tree ensembles." arXiv preprint arXiv:2407.03515 (2024).
  • Lundberg, Scott M., et al. "From local explanations to global understanding with explainable AI for trees." Nature Machine Intelligence 2.1 (2020): 56-67.
  • Karczmarz, Adam, et al. "Improved feature importance computation for tree models based on the Banzhaf value." Uncertainty in Artificial Intelligence. PMLR, 2022.
  • Bifet, Albert, Jesse Read, and Chao Xu. "Linear tree shap." Advances in Neural Information Processing Systems 35 (2022): 25818-25828.
  • Chen, Tianqi, and Carlos Guestrin. "Xgboost: A scalable tree boosting system." Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016.

Container Images

We provide pre-built images, available for both Docker and Singularity, with all necessary packages for Q-SHAP in Python 3.12:

  • Docker:
    You can pull the Docker image using the following command:
    docker pull catstat/xai
    
  • Singularity:
    You can pull the same image with Singularity using the following command:
    singularity pull docker://catstat/xai:0.1
    

Task List

  • Task 1: CatBoost version
