A Python package for benchmarking interpretability approaches.
ferret is a Python library that streamlines the use and benchmarking of interpretability techniques on Transformer models.
- Documentation: https://ferret.readthedocs.io
- Paper: https://arxiv.org/abs/2208.01575
- Demo: https://huggingface.co/spaces/g8a9/ferret
ferret integrates seamlessly with 🤗 transformers models; at the moment, it supports text models only. We provide:
- 🔍 Four established interpretability techniques based on Token-level Feature Attribution. Use them to quickly find the words most relevant to your model's output.
- ⚖️ Six Faithfulness and Plausibility evaluation protocols. Benchmark any token-level explanation against these tests to guide your choice toward the most reliable explainer.
📝 Examples
All-around tutorial (covering all explainers, evaluation metrics, and the interface with XAI datasets): Colab
Text Classification
- Intent Detection with Multilingual XLM RoBERTa: Colab
Getting Started
Installation
pip install -U ferret-xai
Our main dependencies are 🤗 transformers and datasets.
Important: some of our dependencies might depend on the deprecated sklearn package name instead of scikit-learn, which breaks the ferret installation.
If your pip install command fails, try:
SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True pip install -U ferret-xai
This is hopefully a temporary situation!
Explain & Benchmark
The code below provides a minimal example to run all the feature-attribution explainers supported by ferret and benchmark them on faithfulness metrics.
We start from a common text classification pipeline:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from ferret import Benchmark
name = "cardiffnlp/twitter-xlm-roberta-base-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)
Using ferret is as simple as:
bench = Benchmark(model, tokenizer)
explanations = bench.explain("You look stunning!", target=1)
evaluations = bench.evaluate_explanations(explanations, target=1)
bench.show_evaluation_table(evaluations)
Be sure to run the code in a Jupyter Notebook or Colab: the cell above will produce a nicely formatted table to analyze the saliency maps.
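The target argument is the index of the output class to explain. If you are unsure which index corresponds to which label, the mapping is available through the standard transformers config (the labels in the comment below are those published for this checkpoint):

# Class index -> label mapping from the standard transformers config
print(model.config.id2label)
# {0: 'negative', 1: 'neutral', 2: 'positive'} for this sentiment model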
Features
ferret offers a painless integration with Hugging Face models and naming conventions. If you are already using the transformers library, you immediately get access to our Explanation and Evaluation API.
Post-Hoc Explainers
- Gradient (plain gradients or multiplied by input token embeddings) (Simonyan et al., 2014); a minimal sketch of this technique follows the list
- Integrated Gradients (plain or multiplied by input token embeddings) (Sundararajan et al., 2017)
- SHAP (via Partition SHAP approximation of Shapley values) (Lundberg and Lee, 2017)
- LIME (Ribeiro et al., 2016)
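For intuition, here is a minimal, self-contained sketch of the simplest technique above, plain gradient saliency, written in raw PyTorch. It illustrates the idea rather than ferret's internal implementation; the function name and the L2-norm aggregation over the hidden dimension are our own choices.

import torch

def gradient_saliency(model, tokenizer, text, target, multiply_by_inputs=False):
    """Gradient (or gradient x input) token saliency; an illustrative sketch."""
    model.eval()
    enc = tokenizer(text, return_tensors="pt")
    # Embed the tokens ourselves so we can take gradients w.r.t. the embeddings
    embeds = model.get_input_embeddings()(enc["input_ids"])
    embeds.retain_grad()
    logits = model(inputs_embeds=embeds, attention_mask=enc["attention_mask"]).logits
    # Gradient of the target-class logit w.r.t. the input embeddings
    logits[0, target].backward()
    attribution = embeds.grad.detach()
    if multiply_by_inputs:
        attribution = attribution * embeds.detach()
    # Collapse the hidden dimension into one score per token
    scores = attribution.norm(p=2, dim=-1).squeeze(0)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    return list(zip(tokens, scores.tolist()))

Calling gradient_saliency(model, tokenizer, "You look stunning!", target=1) returns one (token, score) pair per input token; multiply_by_inputs=True gives the gradient-times-input variant.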
Evaluation Metrics
Faithfulness measures:
- AOPC Comprehensiveness (DeYoung et al., 2020); see the sketch at the end of this section
- AOPC Sufficiency (DeYoung et al., 2020)
- Kendall's Tau correlation with Leave-One-Out token removal (Jain and Wallace, 2019)
Plausibility measures:
- Area Under the Precision-Recall Curve (soft score) (DeYoung et al., 2020)
- Token F1 (hard score) (DeYoung et al., 2020)
- Token Intersection Over Union (hard score) (DeYoung et al., 2020)
See our paper for details.
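To make one of the faithfulness tests concrete: AOPC Comprehensiveness removes the k highest-attributed tokens, measures the drop in the target-class probability, and averages that drop over several values of k; the larger the average drop, the more faithful the explanation. A minimal sketch, where predict_proba is a hypothetical helper that maps a token list to the target-class probability:

import numpy as np

def aopc_comprehensiveness(predict_proba, tokens, scores, ks=(1, 3, 5)):
    # Probability of the target class on the full, unmodified input
    full_prob = predict_proba(tokens)
    order = np.argsort(scores)[::-1]  # token indices, most important first
    drops = []
    for k in ks:
        removed = set(order[:k].tolist())
        kept = [tok for i, tok in enumerate(tokens) if i not in removed]
        drops.append(full_prob - predict_proba(kept))
    return float(np.mean(drops))

AOPC Sufficiency is the mirror image: keep only the top-k tokens and measure how little the probability changes.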
Visualization
The Benchmark class exposes easy-to-use table visualization methods (e.g., within Jupyter Notebooks).
bench = Benchmark(model, tokenizer)
# Pretty-print feature attribution scores by all supported explainers
explanations = bench.explain("You look stunning!")
bench.show_table(explanations)
# Pretty-print all the supported evaluation metrics
evaluations = bench.evaluate_explanations(explanations)
bench.show_evaluation_table(evaluations)
Dataset Evaluations
The Benchmark class has a handy method to compute and average our evaluation metrics across multiple samples from a dataset.
import numpy as np
bench = Benchmark(model, tokenizer)
# Compute and average evaluation scores on one of the supported datasets
samples = np.arange(20)
hatexdata = bench.load_dataset("hatexplain")
sample_evaluations = bench.evaluate_samples(hatexdata, samples)
# Pretty-print the results
bench.show_samples_evaluation_table(sample_evaluations)
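The samples argument is a sequence of dataset indices, so you can just as well evaluate a random subset instead of the first 20 rows. A sketch, assuming the dataset object supports len():

# Evaluate 20 randomly chosen samples instead of the first 20
rng = np.random.default_rng(seed=42)
random_samples = rng.choice(len(hatexdata), size=20, replace=False)
random_evaluations = bench.evaluate_samples(hatexdata, random_samples)
bench.show_samples_evaluation_table(random_evaluations)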
Planned Development
See the changelog file for further details.
- ✅ GPU acceleration support for inference (v0.4.0)
- ✅ Batched inference for internal methods' approximation steps (e.g., LIME or SHAP) (v0.4.0)
- ⚙️ Simplified Task API to support NLI, Zero-Shot Text Classification, Language Modeling (branch).
- ⚙️ Multi-sample explanation generation and evaluation
- ⚙️ Support for explainers for seq2seq and autoregressive generation through inseq.
- ⚙️ New evaluation measures: Sensitivity and Stability (Yin et al.)
- ⚙️ New evaluation measure: Area Under the Threshold-Performance Curve (AUC-TP) (Atanasova et al.)
- ⚙️ New explainer: Sampling and Occlusion (SOC) (Jin et al., 2020)
- ⚙️ New explainer: Discretized Integrated Gradient (DIG) (Sanyal and Ren, 2021)
- ⚙️ New explainer: Value Zeroing (Mohebbi et al., 2023)
- ⚙️ Support for additional forms of aggregation over the embeddings' hidden dimension.
Authors
- Giuseppe Attanasio
- Eliana Pastor
- Debora Nozza
- Chiara Di Bonaventura
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
Logo and graphical assets made by Luca Attanasio.
If you are using ferret for your work, please consider citing us!
@inproceedings{attanasio-etal-2023-ferret,
title = "ferret: a Framework for Benchmarking Explainers on Transformers",
author = "Attanasio, Giuseppe and Pastor, Eliana and Di Bonaventura, Chiara and Nozza, Debora",
booktitle = "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations",
month = may,
year = "2023",
publisher = "Association for Computational Linguistics",
}