A collection of metrics for comparing saliency maps
SaliencyTools
Comparing saliency maps produced by XAI methods.
This module is a work in progress; contributions are welcome!
SaliencyTools is an open-source Python package providing thirteen curated, image-native metrics for comparing saliency maps generated by explainability methods (SHAP, LIME, GradCAM, Integrated Gradients, …). Unlike general-purpose distance libraries, all metrics operate directly on 2-D NumPy arrays, preserve spatial structure, and handle signed attribution maps natively.
Installation
pip install saliencytools
Usage
import numpy as np
from saliencytools.maskcompare import (
sign_agreement_ratio,
ssim,
mean_absolute_error,
euclidean_distance,
cosine_distance,
mean_squared_error,
psnr,
emd,
correlation_distance,
jaccard_distance,
czenakowski_distance,
kl_divergence,
auc_judd,
# preprocessing utilities
normalize_mask_0_1,
clip_mask,
)
# Two signed attribution maps (e.g. from SHAP and LIME on the same input)
map_a = np.random.randn(28, 28)
map_b = np.random.randn(28, 28)
print(sign_agreement_ratio(map_a, map_b))
print(ssim(map_a, map_b))
# Set-theoretic metrics require non-negative inputs
map_a_nn = normalize_mask_0_1(map_a)
map_b_nn = normalize_mask_0_1(map_b)
print(jaccard_distance(map_a_nn, map_b_nn))
# AUC-Judd is directed: first argument is the prediction, second is the reference
# e.g. compare a LIME explanation against a prototype
print(auc_judd(map_a, map_b))
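For intuition about what a sign-based comparison measures on signed attribution maps, here is a minimal NumPy sketch. The function `sign_agreement_fraction` is a hypothetical stand-in written for this example: it returns the share of pixels whose signs match, which is only an assumption about how `sign_agreement_ratio` behaves; the package function may use a different definition or return a distance instead.

```python
import numpy as np

def sign_agreement_fraction(a, b):
    """Hypothetical sketch: share of pixels whose attributions have the same sign."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("maps must have the same shape")
    return float(np.mean(np.sign(a) == np.sign(b)))

rng = np.random.default_rng(0)
m = rng.standard_normal((28, 28))
same = sign_agreement_fraction(m, m)       # every pixel agrees with itself
flipped = sign_agreement_fraction(m, -m)   # every non-zero pixel disagrees
print(same, flipped)
```

Because only signs are compared, rescaling either map by a positive constant leaves the result unchanged.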
Metrics implemented
| Category | Metric | Function |
|---|---|---|
| Geometric | Euclidean distance | euclidean_distance |
| Geometric | Cosine distance | cosine_distance |
| Geometric | Mean Absolute Error | mean_absolute_error |
| Geometric | Mean Squared Error | mean_squared_error |
| Statistical | Earth Mover's Distance | emd |
| Statistical | Peak Signal-to-Noise Ratio | psnr |
| Statistical | Correlation distance | correlation_distance |
| Set-theoretic | Jaccard distance | jaccard_distance |
| Set-theoretic | Czekanowski distance | czenakowski_distance |
| Binary | Sign Agreement Ratio | sign_agreement_ratio |
| Structural | SSIM | ssim |
| Information-theoretic | KL Divergence (symmetric) | kl_divergence |
| Information-theoretic | AUC-Judd (directed) | auc_judd |
Preprocessing utilities: normalize_mask_0_1, clip_mask, normalize_mask
Set-theoretic metrics (jaccard_distance, czenakowski_distance) require non-negative inputs; apply normalize_mask_0_1 first.
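To see why non-negativity matters, consider a weighted (min/max) form of the Jaccard distance: with negative values the ratio of sums can leave the [0, 1] range. The sketch below is illustrative only. `minmax_01` and `weighted_jaccard_distance` are hypothetical stand-ins written in plain NumPy; the package's `normalize_mask_0_1` and `jaccard_distance` may be defined differently.

```python
import numpy as np

def minmax_01(mask):
    """Hypothetical stand-in: rescale a map linearly to [0, 1]."""
    mask = np.asarray(mask, dtype=float)
    lo, hi = mask.min(), mask.max()
    if hi == lo:  # constant map: avoid division by zero
        return np.zeros_like(mask)
    return (mask - lo) / (hi - lo)

def weighted_jaccard_distance(a, b):
    """Hypothetical stand-in: 1 - sum(min(a, b)) / sum(max(a, b)) for a, b >= 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if (a < 0).any() or (b < 0).any():
        raise ValueError("inputs must be non-negative")
    denom = np.maximum(a, b).sum()
    if denom == 0:  # both maps are all zeros: treat them as identical
        return 0.0
    return 1.0 - np.minimum(a, b).sum() / denom

rng = np.random.default_rng(0)
signed_a = rng.standard_normal((28, 28))
signed_b = rng.standard_normal((28, 28))
d = weighted_jaccard_distance(minmax_01(signed_a), minmax_01(signed_b))
print(d)
```

Without the rescaling step, a single-pixel pair such as -1 and 1 would give a min/max ratio of -1 and hence a "distance" of 2, which is why the sketch rejects negative inputs.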
auc_judd is intentionally asymmetric: auc_judd(prediction, reference) scores how well the prediction recovers the above-mean regions of the reference map.
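The asymmetry can be made concrete with a small sketch. Assuming the reference is binarized at its mean and the prediction is used as a ranking score, the result is an ordinary ROC AUC, computed here via the Mann-Whitney formulation. `auc_above_mean` is a hypothetical function for illustration, not the package's `auc_judd`, whose thresholding details may differ.

```python
import numpy as np

def auc_above_mean(prediction, reference):
    """Hypothetical sketch: ROC AUC of `prediction` scores against the
    binary target `reference > reference.mean()`."""
    scores = np.asarray(prediction, dtype=float).ravel()
    ref = np.asarray(reference, dtype=float).ravel()
    target = ref > ref.mean()
    n_pos = int(target.sum())
    n_neg = target.size - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")  # AUC is undefined without both classes
    pos = scores[target]
    neg = scores[~target]
    # Mann-Whitney: P(score_pos > score_neg) + 0.5 * P(tie)
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (n_pos * n_neg))

rng = np.random.default_rng(0)
reference = rng.standard_normal((28, 28))
noisy = reference + 0.1 * rng.standard_normal((28, 28))
print(auc_above_mean(reference, reference))  # prediction ranks the target perfectly
print(auc_above_mean(noisy, reference))
print(auc_above_mean(reference, noisy))      # swapped roles: a different target mask
```

Swapping the arguments changes which map defines the binary target, so the two directions generally give different scores.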
Proxy benchmark
Because real saliency maps have no ground truth, we validate metric discriminability with a controlled proxy benchmark: a k-nearest-neighbour classifier on MNIST, evaluated with macro-F1 across 8 preprocessing configurations and 10 independent prototype draws.
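The idea behind the proxy benchmark can be sketched on synthetic data: draw a few labelled prototypes per class, classify held-out maps by their closest prototype under a given metric, and report macro-F1 over several draws. Everything below is an assumption made for illustration: the toy data replaces MNIST, a 1-nearest-prototype rule replaces the classifier, and the plain-NumPy `mean_absolute_error` stands in for a package metric. It is not the code in `run_benchmark.py`.

```python
import numpy as np

def mean_absolute_error(a, b):
    """Stand-in metric for this sketch (plain NumPy, not the package function)."""
    return float(np.abs(a - b).mean())

def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))

def nearest_prototype_predict(queries, prototypes, proto_labels, metric):
    """Assign each query the label of its closest prototype under `metric`."""
    preds = []
    for q in queries:
        dists = [metric(q, p) for p in prototypes]
        preds.append(proto_labels[int(np.argmin(dists))])
    return np.array(preds)

def make_toy_maps(rng, n_per_class, n_classes=3, size=8, noise=0.2):
    """Synthetic signed maps: each class has one bright row plus Gaussian noise."""
    maps, labels = [], []
    for c in range(n_classes):
        for _ in range(n_per_class):
            m = noise * rng.standard_normal((size, size))
            m[2 * c, :] += 1.0
            maps.append(m)
            labels.append(c)
    return np.array(maps), np.array(labels)

f1_per_seed = []
for seed in range(3):  # three prototype draws here, to keep the toy run short
    rng = np.random.default_rng(seed)
    protos, proto_labels = make_toy_maps(rng, n_per_class=5)
    queries, query_labels = make_toy_maps(rng, n_per_class=20)
    preds = nearest_prototype_predict(queries, protos, proto_labels, mean_absolute_error)
    f1_per_seed.append(macro_f1(query_labels, preds, n_classes=3))

print(f"macro-F1: {np.mean(f1_per_seed):.3f} ± {np.std(f1_per_seed):.3f}")
```

A metric that separates the classes well yields a high macro-F1 with low spread across draws; repeating the loop for each metric and preprocessing configuration gives a comparison in the same spirit.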
Key findings (k=20 prototypes/class, 10 seeds):
| Metric | Mean F1 | Best config | Time (s) |
|---|---|---|---|
| Sign Agreement Ratio | 0.746 ± 0.017 | [---] | ~11 |
| SSIM | 0.726 ± 0.013 | [---] | ~228 |
| MAE | 0.707 ± 0.016 | [CN-] | ~9 |
| … | | | |
| Earth Mover's Distance | 0.380 ± 0.009 | [---] | ~283 |
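Each configuration code has three positional flags: C = clip to [-1,1], N = normalize to [0,1], S = Sobel filter. A dash marks a step that is switched off, so [---] applies no preprocessing and [CN-] applies clipping and normalization without the Sobel filter.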
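Read as three positional on/off flags (an interpretation based on codes such as [CN-] in the table), the preprocessing steps combine into exactly eight configurations. The short sketch below enumerates them; `STEPS` and `config_codes` are hypothetical names for this example and are not taken from the benchmark script.

```python
from itertools import product

# Assumed flag order: clip, normalize, Sobel
STEPS = ("C", "N", "S")

def config_codes():
    """Yield every on/off combination as a code such as '[CN-]'."""
    for flags in product((False, True), repeat=len(STEPS)):
        body = "".join(step if on else "-" for step, on in zip(STEPS, flags))
        yield f"[{body}]"

codes = list(config_codes())
print(len(codes), codes)
```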
Reproducing the benchmark
# Full multi-seed run (k=20 prototypes/class, 10 seeds; resumes safely if interrupted)
python run_benchmark.py --out results_seeds.json
# k-sensitivity run
python run_benchmark.py --k 5 --out results_k5.json
Estimated runtime: 3–4 hours on CPU for the default run. Use --resume to continue an interrupted run.
Reproducing paper tables and figures
# LaTeX results table (auto-detects multi-seed format)
python paper/generate_tables.py --results results_seeds.json
# All figures (heatmap, F1-vs-time, stability, joyplot)
# Add --results-k5 to overlay the k=5 KDE on the joyplot
python paper/generate_figures.py --results results_seeds.json --multi-seed
python paper/generate_figures.py --results results_seeds.json --results-k5 results_k5.json --multi-seed
Tables are written to paper/tables/ and figures to paper/figures/.
Why SaliencyTools?
Existing alternatives fall short for saliency map comparison:
- distancia (GitHub, docs) — broad coverage of mathematical distances, but images must be converted to flat lists (spatial structure lost), several metrics contain implementation errors, and there is no image-native preprocessing pipeline.
- saliency-metrics (GitHub, docs) — targets saliency evaluation specifically, but abandoned since 2022 with incomplete documentation.
- Quantus (GitHub) — evaluates explanation quality relative to a model (faithfulness, robustness, localisation); requires the model and input data. SaliencyTools only needs two maps, making it suitable for lightweight, model-agnostic comparison. The two tools are complementary.
SaliencyTools is image-native, actively maintained, and formally tested for symmetry, non-negativity, and identity axioms (test/test_metrics.py).
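A property check of this kind might look like the following sketch, which exercises the three axioms on random signed maps. It is an illustration under stated assumptions, not the contents of test/test_metrics.py: `check_metric_axioms` is a hypothetical helper, and the plain-NumPy `mean_absolute_error` stands in for a package metric.

```python
import numpy as np

def mean_absolute_error(a, b):
    """Stand-in metric (plain NumPy); swap in any symmetric metric to test it."""
    return float(np.abs(np.asarray(a) - np.asarray(b)).mean())

def check_metric_axioms(metric, shape=(28, 28), trials=20, seed=0, tol=1e-12):
    """Check symmetry, non-negativity and d(x, x) = 0 on random signed maps."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        a = rng.standard_normal(shape)
        b = rng.standard_normal(shape)
        assert abs(metric(a, b) - metric(b, a)) <= tol, "symmetry violated"
        assert metric(a, b) >= -tol, "non-negativity violated"
        assert abs(metric(a, a)) <= tol, "identity violated"
    return True

print(check_metric_axioms(mean_absolute_error))
```

A directed score such as `auc_judd` is asymmetric by design, so the symmetry assertion would not apply to it.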
Further reading
- Interpretable Machine Learning Book — https://christophm.github.io/interpretable-ml-book/pixel-attribution.html
- Bylinskii et al. (2019) — What Do Different Evaluation Metrics Tell Us About Saliency Models? — IEEE TPAMI
- Samek et al. (2017) — Evaluating the Visualization of What a Deep Neural Network Has Learned — IEEE TNNLS
- Wörheide et al. (2021) — Multilevel Correspondence Analysis — https://doi.org/10.1364/JOSAA.31.000532
- Google saliency library — https://pypi.org/project/saliency/