
CLE-SH: Comprehensive Literal Explanation package for SHapley values by statistical validity (arXiv:2409.12578)


A Python library for statistically rigorous interpretation of SHAP values. CLE-SH identifies statistically significant features, discovers univariate patterns, and detects feature interactions, backing each step with hypothesis tests rather than visual inspection alone.

Installation

# Using uv (recommended)
uv add "cle-sh[all]"

# Or using pip
pip install "cle-sh[all]"

Quick Start

from clesh.clesh import CLESH
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load data and train model
data = load_breast_cancer()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = LogisticRegression(random_state=42, max_iter=1000)
model.fit(X_train_scaled, y_train)

# Generate SHAP values
explainer = shap.LinearExplainer(model, X_train_scaled)
shap_values = explainer.shap_values(X_test_scaled)

# Run CLE-SH analysis
clesh = CLESH()
results = clesh.comprehensive_analysis(X_test_scaled, shap_values, data.feature_names)

# Get explanations
explanations = clesh.generate_literal_explanations()
for exp in explanations:
    print(exp)

# Create visualization
fig = clesh.visualize_results()
fig.savefig('results.png')

Key Features

  • Statistical Significance Testing: Identifies genuinely important features using paired tests
  • Pattern Discovery: Detects linear, quadratic, and sigmoid relationships automatically
  • Interaction Analysis: Finds synergistic and antagonistic feature interactions
  • Human-Readable Output: Generates natural language explanations
  • Professional Visualizations: Creates publication-ready charts
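
To make the idea of a paired test on SHAP values concrete, here is a minimal standard-library sketch. It is not CLE-SH's implementation: an exact two-sided sign test is swapped in for simplicity, the function name `sign_test_p_value` is hypothetical, and the sample values are made up for illustration. The package may well use a different paired test.

```python
from math import comb


def sign_test_p_value(a, b):
    """Two-sided exact sign test on paired samples (ties are dropped)."""
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        return 1.0  # no informative pairs, so no evidence of a difference
    positives = sum(d > 0 for d in diffs)
    k = min(positives, n - positives)
    # Probability of a split at least this lopsided under a fair coin,
    # doubled for the two-sided alternative and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up per-sample |SHAP| values for two features (illustration only).
abs_shap_a = [0.9, 0.8, 1.1, 0.7, 1.0, 0.95, 0.85, 1.2]
abs_shap_b = [0.1, 0.2, 0.15, 0.05, 0.3, 0.1, 0.2, 0.25]

p_value = sign_test_p_value(abs_shap_a, abs_shap_b)
print(p_value < 0.05)  # feature a is larger in all 8 pairs
```

Pairing matters because both features are measured on the same samples, so the test looks at per-sample differences instead of comparing two independent groups.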

Example Results

CLE-SH generates comprehensive visualizations that provide insights into your model's behavior:

Linear Explainer Results

[Figure: CLE-SH Analysis Results]

Tree Explainer Results

[Figure: CLE-SH Tree Explainer Results]

These visualizations show:

  • Top Significant Features: Ranked by statistical importance with confidence intervals
  • Univariate Pattern Types: Distribution of linear, quadratic, and sigmoid relationships
  • Feature Interactions: Network of synergistic and antagonistic feature relationships
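
As a rough illustration of how a univariate pattern type could be assigned, the sketch below fits a line and a parabola to (feature value, SHAP value) pairs and prefers the parabola only when it explains a clearly larger share of the variance. This is an assumption-laden toy, not the package's method: the helpers `fit_polynomial`, `sse`, and `pick_pattern` and the `min_gain` threshold are hypothetical, a plain threshold stands in for a statistical test, and sigmoid patterns are not covered.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest power first.

    Assumes at least degree + 1 distinct x values.
    """
    size = degree + 1
    # Normal equations A c = b with A[i][j] = sum x^(i+j), b[i] = sum y * x^i.
    a = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    # Gaussian elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, size):
            factor = a[row][col] / a[col][col]
            for k in range(col, size):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * size
    for i in reversed(range(size)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, size))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def sse(xs, ys, coeffs):
    """Sum of squared residuals of the polynomial given by coeffs."""
    return sum(
        (y - sum(c * x ** p for p, c in enumerate(coeffs))) ** 2
        for x, y in zip(xs, ys)
    )


def pick_pattern(xs, ys, min_gain=0.5):
    """Label the relationship 'linear' or 'quadratic' (hypothetical heuristic)."""
    mean_y = sum(ys) / len(ys)
    total_ss = sum((y - mean_y) ** 2 for y in ys)
    if total_ss == 0.0:
        return "linear"  # constant SHAP values: nothing for a curve to explain
    linear_sse = sse(xs, ys, fit_polynomial(xs, ys, 1))
    quadratic_sse = sse(xs, ys, fit_polynomial(xs, ys, 2))
    # Share of total variance that the extra quadratic term explains.
    gain = (linear_sse - quadratic_sse) / total_ss
    return "quadratic" if gain > min_gain else "linear"


feature_values = [-2, -1, 0, 1, 2]
print(pick_pattern(feature_values, [2 * x + 1 for x in feature_values]))
print(pick_pattern(feature_values, [x * x for x in feature_values]))
```

Because a quadratic always fits at least as well as a line, some penalty or test is needed before choosing it; the threshold here is the simplest stand-in for that step.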

Configuration

# Configure analysis parameters
clesh = CLESH(
    alpha=0.05,            # Significance threshold
    p_univariate=0.05,     # Pattern significance
    p_interaction=0.05,    # Interaction significance
    candidate_num_min=10,  # Min features to consider
    candidate_num_max=20,  # Max features to consider
    log_level="INFO",      # Logging verbosity
)

# Custom visualization colors
fig = clesh.visualize_results(custom_colors={
    'primary': '#1B365D',
    'highlight': '#E6A800'
})
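
The parameter comments do not spell out how `candidate_num_min` and `candidate_num_max` interact. One plausible reading, shown below purely as an assumption, is that features are ranked by mean absolute SHAP value and the number kept is clamped between the two bounds. The function `select_candidates` and its `num_significant` argument are hypothetical and are not part of the CLE-SH API.

```python
def select_candidates(mean_abs_shap, num_significant,
                      candidate_num_min=10, candidate_num_max=20):
    """Keep the top-ranked features, clamping how many are kept.

    mean_abs_shap: dict mapping feature name -> mean |SHAP| value.
    num_significant: how many features some earlier test flagged (assumed input).
    """
    ranked = sorted(mean_abs_shap, key=mean_abs_shap.get, reverse=True)
    # Clamp the count into [candidate_num_min, candidate_num_max].
    count = max(candidate_num_min, min(num_significant, candidate_num_max))
    return ranked[:count]


# Thirty made-up features whose importance grows with their index.
scores = {f"f{i}": float(i) for i in range(30)}
print(len(select_candidates(scores, num_significant=3)))   # raised to the minimum
print(len(select_candidates(scores, num_significant=25)))  # capped at the maximum
```

Under this reading, the bounds guard against both a near-empty report and one too long to read, regardless of how many features pass the significance test.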

Requirements

  • Python 3.9+
  • Core: numpy, pandas, scipy, statsmodels, shap, loguru
  • Visualization: matplotlib, seaborn (included with [all])
  • Examples: scikit-learn (included with [all])
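
A quick way to confirm that an environment meets these requirements is to check the interpreter version and look up each dependency's import name. The sketch below uses only the standard library; the helper name `missing_modules` is illustrative, and note that scikit-learn is imported as `sklearn`.

```python
import sys
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if find_spec(name) is None]


CORE = ["numpy", "pandas", "scipy", "statsmodels", "shap", "loguru"]
EXTRAS = ["matplotlib", "seaborn", "sklearn"]  # installed with the [all] extra

print("Python 3.9+:", sys.version_info >= (3, 9))
print("Missing core modules:", missing_modules(CORE))
print("Missing optional modules:", missing_modules(EXTRAS))
```

`find_spec` locates a top-level module without importing it, so the check stays fast even when heavy packages such as shap are installed.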

License

MIT License - see LICENSE for details.

Reference

Based on the methodology described in arXiv:2409.12578.

