CLE-SH: Comprehensive Literal Explanation package for SHapley values by statistical validity (arXiv:2409.12578)
A Python library for statistically rigorous SHAP value interpretation. CLE-SH identifies significant features, discovers univariate patterns, and detects feature interactions using statistical hypothesis tests.
Installation
# Using uv (recommended)
uv add "cle-sh[all]"
# Or using pip
pip install "cle-sh[all]"
Quick Start
from clesh.clesh import CLESH
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# Load data and train model
data = load_breast_cancer()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
model = LogisticRegression(random_state=42, max_iter=1000)
model.fit(X_train_scaled, y_train)
# Generate SHAP values
explainer = shap.LinearExplainer(model, X_train_scaled)
shap_values = explainer.shap_values(X_test_scaled)
# Run CLE-SH analysis
clesh = CLESH()
results = clesh.comprehensive_analysis(X_test_scaled, shap_values, data.feature_names)
# Get explanations
explanations = clesh.generate_literal_explanations()
for exp in explanations:
    print(exp)
# Create visualization
fig = clesh.visualize_results()
fig.savefig('results.png')
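The Quick Start passes a 2D array of SHAP values (one row per sample, one column per feature) to `comprehensive_analysis`. Other explainers, such as `shap.TreeExplainer` on a classifier, can instead return a list with one array per class or a 3D array with a trailing class axis, depending on the shap version. The sketch below shows one way to reduce those outputs to a single 2D array first. `to_2d_shap` is a hypothetical helper, not part of CLE-SH or shap, and it assumes CLE-SH expects the 2D layout used in the Quick Start.

```python
import numpy as np


def to_2d_shap(shap_values, class_index=1):
    """Return a (n_samples, n_features) array for one class.

    Hypothetical helper for illustration; not part of CLE-SH or shap.
    """
    # Some shap versions return a list with one 2D array per class.
    if isinstance(shap_values, list):
        return np.asarray(shap_values[class_index])

    values = np.asarray(shap_values)
    # Other versions return (n_samples, n_features, n_classes).
    if values.ndim == 3:
        return values[:, :, class_index]
    # Already in the layout used by the Quick Start.
    if values.ndim == 2:
        return values
    raise ValueError(f"unexpected SHAP shape: {values.shape}")
```

With a tree model, the result of `to_2d_shap(explainer.shap_values(X_test))` could then be passed to `comprehensive_analysis` in place of `shap_values`.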
Key Features
- Statistical Significance Testing: Identifies genuinely important features using paired tests
- Pattern Discovery: Detects linear, quadratic, and sigmoid relationships automatically
- Interaction Analysis: Finds synergistic and antagonistic feature interactions
- Human-Readable Output: Generates natural language explanations
- Professional Visualizations: Creates publication-ready charts
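To make the idea of paired significance testing concrete, here is a minimal sketch that ranks features by mean absolute SHAP value and compares each adjacent pair with a Wilcoxon signed-rank test. The test is paired because both columns are measured on the same samples. The function name `rank_and_compare`, the choice of the Wilcoxon test, and the synthetic data are all assumptions made for illustration; the procedure CLE-SH actually runs may differ.

```python
import numpy as np
from scipy.stats import wilcoxon


def rank_and_compare(shap_values, alpha=0.05):
    """Rank features by mean |SHAP| and test each adjacent pair.

    Illustrative only; not CLE-SH's own implementation.
    """
    abs_shap = np.abs(np.asarray(shap_values))
    order = np.argsort(-abs_shap.mean(axis=0))  # most important first
    results = []
    for upper, lower in zip(order[:-1], order[1:]):
        # Paired test: both columns come from the same samples.
        _, p_value = wilcoxon(abs_shap[:, upper], abs_shap[:, lower])
        results.append((int(upper), int(lower), float(p_value), bool(p_value < alpha)))
    return order, results


# Synthetic SHAP matrix: three features with clearly different magnitudes.
rng = np.random.default_rng(0)
demo = rng.normal(size=(200, 3)) * np.array([1.0, 0.3, 0.01])
order, results = rank_and_compare(demo)
for upper, lower, p_value, significant in results:
    print(f"feature {upper} vs {lower}: p={p_value:.3g}, significant={significant}")
```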
Example Results
CLE-SH generates comprehensive visualizations that provide insights into your model's behavior:
[Figure: Linear Explainer Results]
[Figure: Tree Explainer Results]
These visualizations show:
- Top Significant Features: Ranked by statistical importance with confidence intervals
- Univariate Pattern Types: Distribution of linear, quadratic, and sigmoid relationships
- Feature Interactions: Network of synergistic and antagonistic feature relationships
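The univariate pattern types above can be illustrated by fitting each candidate shape to a feature's values and its SHAP values, then keeping the best fit. The sketch below does this with `numpy.polyfit` for the linear and quadratic cases and `scipy.optimize.curve_fit` for a four-parameter sigmoid, and it selects by BIC. The BIC criterion, the sigmoid parameterization, and the name `classify_pattern` are assumptions chosen for the example; CLE-SH exposes a `p_univariate` threshold, so its own selection is presumably based on significance tests rather than BIC.

```python
import numpy as np
from scipy.optimize import curve_fit


def _sigmoid(x, lower, height, slope, midpoint):
    return lower + height / (1.0 + np.exp(-slope * (x - midpoint)))


def _bic(y, y_hat, n_params):
    # Gaussian BIC up to an additive constant; lower is better.
    n = len(y)
    rss = max(float(np.sum((y - y_hat) ** 2)), 1e-12)
    return n * np.log(rss / n) + n_params * np.log(n)


def classify_pattern(x, y):
    """Label the x -> y relationship as linear, quadratic, or sigmoid."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    scores = {}
    for name, degree in (("linear", 1), ("quadratic", 2)):
        coeffs = np.polyfit(x, y, degree)
        scores[name] = _bic(y, np.polyval(coeffs, x), degree + 1)
    try:
        p0 = [y.min(), y.max() - y.min(), 1.0, float(np.median(x))]
        params, _ = curve_fit(_sigmoid, x, y, p0=p0, maxfev=5000)
        scores["sigmoid"] = _bic(y, _sigmoid(x, *params), 4)
    except RuntimeError:
        pass  # sigmoid fit did not converge; leave it out of the comparison
    return min(scores, key=scores.get), scores


# Synthetic example: a noisy parabola.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200)
label, scores = classify_pattern(x, x**2 + rng.normal(scale=0.1, size=x.size))
print(label)
```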
Configuration
# Configure analysis parameters
clesh = CLESH(
    alpha=0.05,              # Significance threshold
    p_univariate=0.05,       # Pattern significance
    p_interaction=0.05,      # Interaction significance
    candidate_num_min=10,    # Min features to consider
    candidate_num_max=20,    # Max features to consider
    log_level="INFO",        # Logging verbosity
)
# Custom visualization colors
fig = clesh.visualize_results(custom_colors={
    'primary': '#1B365D',
    'highlight': '#E6A800'
})
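The `p_interaction` parameter above is described as the interaction significance threshold. As a rough illustration of how such a cutoff could be applied, the sketch below splits samples at the median of a second feature and uses a two-sided Mann-Whitney U test to check whether the first feature's SHAP values differ between the two groups. The function `interaction_p_value`, the median split, and the choice of test are assumptions for this example and are not taken from CLE-SH.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def interaction_p_value(shap_i, feature_j):
    """Test whether feature i's SHAP values shift with feature j's level.

    Illustrative only; not CLE-SH's own interaction test.
    """
    shap_i, feature_j = np.asarray(shap_i), np.asarray(feature_j)
    high = feature_j > np.median(feature_j)
    # Compare SHAP values in the high-j group against the low-j group.
    _, p_value = mannwhitneyu(shap_i[high], shap_i[~high], alternative="two-sided")
    return float(p_value)


# Synthetic data: SHAP values of feature i jump when feature j is positive.
rng = np.random.default_rng(0)
feature_j = rng.normal(size=300)
shap_dependent = np.where(feature_j > 0, 1.0, -1.0) + rng.normal(scale=0.1, size=300)

p_interaction = 0.05  # same role as the CLESH(p_interaction=...) threshold
flag_dependent = interaction_p_value(shap_dependent, feature_j) < p_interaction
print(flag_dependent)
```

A lower `p_interaction` would flag fewer feature pairs, at the cost of missing weaker dependencies.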
Requirements
- Python 3.9+
- Core: numpy, pandas, scipy, statsmodels, shap, loguru
- Visualization: matplotlib, seaborn (included with [all])
- Examples: scikit-learn (included with [all])
License
MIT License - see LICENSE for details.
Reference
Based on methodology from arXiv:2409.12578
File details
Details for the file cle_sh-0.1.0.tar.gz.
File metadata
- Download URL: cle_sh-0.1.0.tar.gz
- Upload date:
- Size: 368.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ac833ad27b8e7dd33af073f601458846b9fa4eeaa593b00a40254d7806c72818 |
| MD5 | 9c806c469a5096f75bc417e18e51478b |
| BLAKE2b-256 | 7aac75c0b96f8bd383a1335c85450c85ba296f0927ab01b9e15ddc265d2eb704 |
File details
Details for the file cle_sh-0.1.0-py3-none-any.whl.
File metadata
- Download URL: cle_sh-0.1.0-py3-none-any.whl
- Upload date:
- Size: 13.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6f4eeffdc8285534115c4091d32d0199d9fa5317f96083576d7a22218dec2902 |
| MD5 | c8f703b2cb3f5dfda3051ad76d771da1 |
| BLAKE2b-256 | af1895496291a05585b47106ffd6ac98523cd9edb0639b9ce788d889303b1e2b |