Explore/examine/explain/expose your model with the explabox!
"{Explore | Examine | Expose | Explain} your model with the explabox!"
Developed to meet the practical machine learning (ML) auditing requirements of the Netherlands National Police, explabox is an open-source Python toolkit for the complete ML auditing lifecycle. It implements a standardized four-step workflow—Explore, Examine, Explain, and Expose—to produce reproducible and holistic evaluations of text-based models.
The framework turns opaque models and data (ingestibles) into interpretable reports and visualizations (digestibles) tailored for diverse stakeholders, from developers and auditors to legal and ethical oversight bodies. It aids in explaining, testing and documenting AI/ML models, developed in-house or acquired externally.
explabox operationalizes the audit process through its standardized four-step workflow:
- Explore: describe aspects of the model and data.
- Examine: calculate quantitative metrics on how the model performs.
- Expose: see model sensitivity to random inputs (safety), test model generalizability (e.g. sensitivity to typos; robustness), and see the effect of adjustments of attributes in the inputs (e.g. swapping male pronouns for female pronouns; fairness), for the dataset as a whole (global) as well as for individual instances (local).
- Explain: use XAI methods for explaining the whole dataset (global), model behavior on the dataset (global), and specific predictions/decisions (local).
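To make the Expose step more concrete, here is a minimal sketch of a robustness check: score a model on clean text, perturb the text with typos, and score it again. Everything in it (`toy_model`, `add_typos`, `accuracy`, the four example reviews) is a hypothetical stand-in written for illustration. It is not the explabox implementation, and the explabox's own typo perturbation may work differently.

```python
import random
from typing import Callable, List


def toy_model(text: str) -> str:
    """Hypothetical keyword classifier standing in for a real text model."""
    negative_words = {"hate", "awful", "worse"}
    tokens = [token.strip("!.,").lower() for token in text.split()]
    return "negative" if any(t in negative_words for t in tokens) else "positive"


def add_typos(text: str, rng: random.Random, rate: float = 0.3) -> str:
    """Swap two adjacent characters in roughly `rate` of the longer words."""
    perturbed = []
    for word in text.split():
        if len(word) > 3 and rng.random() < rate:
            i = rng.randrange(len(word) - 1)
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        perturbed.append(word)
    return " ".join(perturbed)


def accuracy(texts: List[str], labels: List[str], model: Callable[[str], str]) -> float:
    """Fraction of texts for which the model predicts the gold label."""
    correct = sum(model(text) == label for text, label in zip(texts, labels))
    return correct / len(labels)


# Illustrative data only; not taken from the Drug Reviews dataset.
texts = [
    "Hate this medicine so much!",
    "This works great for me",
    "Awful side effects",
    "Helped my headaches a lot",
]
labels = ["negative", "positive", "negative", "positive"]

rng = random.Random(0)  # fixed seed so the perturbation is reproducible
typo_texts = [add_typos(text, rng, rate=1.0) for text in texts]

clean_acc = accuracy(texts, labels, toy_model)
typo_acc = accuracy(typo_texts, labels, toy_model)
print(f"accuracy on clean text: {clean_acc:.2f}")
print(f"accuracy with typos:    {typo_acc:.2f}")
```

A drop between the two numbers indicates that the model depends on exact spellings. The same before/after pattern applies to fairness checks, with the typo function replaced by an attribute swap such as exchanging pronouns.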
A number of analyses in the explabox can also be used to provide transparency and explanations to stakeholders, such as end-users or clients.
:information_source: The explabox currently only supports natural language text as a modality. In the future, we intend to extend to other modalities.
© National Police Lab AI (NPAI), 2022
Quick tour
The explabox is distributed on PyPI. To use the package with Python, install it (pip install explabox), import your data and model, and wrap them in the Explabox. The example dataset and model shown here can be easily imported using the demo package explabox-demo-drugreview.
:information_source: To easily follow along without a need for installation, run the Notebook in
First, import the pre-provided model, and import the data from the dataset_file. All we need to know is in which column(s) your data is, and where we can find the corresponding labels:
from explabox_demo_drugreview import model, dataset_file
from explabox import import_data
data = import_data(dataset_file,
                   data_cols='review',
                   label_cols='rating')
Second, we provide the data and model to the Explabox, and it does the rest! Rename the splits from your file names for easy access:
from explabox import Explabox
box = Explabox(data=data,
               model=model,
               splits={'train': 'drugsComTrain.tsv', 'test': 'drugsComTest.tsv'})
Then .explore, .examine, .expose and .explain your model:
# Explore the descriptive statistics for each split
box.explore()
# Show wrongly classified instances
box.examine.wrongly_classified()
# Compare the performance on the test split before and after adding typos to the text
box.expose.compare_metric(split='test', perturbation='add_typos')
# Get a local explanation (uses LIME by default)
box.explain.explain_prediction('Hate this medicine so much!')
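For intuition about what a local explanation contains, the sketch below scores each word by how much the model output changes when that word is removed. This is a simple leave-one-out (occlusion) technique, not LIME, and it is not how the explabox computes explanations. The scorer `toy_negative_score`, its cue weights, and `leave_one_out` are hypothetical names made up for this illustration.

```python
from typing import Callable, List, Tuple


def toy_negative_score(text: str) -> float:
    """Hypothetical scorer: summed weight of negative cue words, capped at 1.0."""
    cue_weights = {"hate": 0.8, "awful": 0.7, "worse": 0.5}
    tokens = [token.strip("!.,").lower() for token in text.split()]
    return min(1.0, sum(cue_weights.get(token, 0.0) for token in tokens))


def leave_one_out(text: str, score: Callable[[str], float]) -> List[Tuple[str, float]]:
    """Importance of each token = score of full text minus score without that token."""
    tokens = text.split()
    base = score(text)
    importances = []
    for i, token in enumerate(tokens):
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        importances.append((token, base - score(reduced)))
    return importances


explanation = leave_one_out("Hate this medicine so much!", toy_negative_score)
top_token = max(explanation, key=lambda pair: pair[1])[0]

for token, importance in explanation:
    print(f"{token:>10}  {importance:+.2f}")
print("most influential token:", top_token)
```

Methods such as LIME refine this idea by sampling many perturbed versions of the input and fitting a small interpretable model to the resulting predictions, rather than removing one word at a time.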
For more information, visit the explabox documentation.
Installation
The easiest way to install the latest release of the explabox is through pip:
user@terminal:~$ pip install explabox
Collecting explabox
...
Installing collected packages: explabox
Successfully installed explabox
:information_source: The explabox requires Python 3.8 or above.
See the full installation guide for troubleshooting the installation and other installation methods.
Documentation
Documentation for the explabox is hosted externally on explabox.rtfd.io.
The explabox consists of three layers:
- Ingestibles: A unified interface for importing models (any Python Callable, scikit-learn, onnx) and data (Pandas, huggingface, raw files), abstracting away access for optimized processing.
- Analyses: The core four-step engine that transforms ingestibles into insights using the explore, examine, explain and expose methods.
- Digestibles: The stakeholder-centric output system. Digestibles are interactive objects and static reports that present audit results in formats tailored to developers, auditors, and decision-makers. They provide insights into model behavior and data, assisting stakeholders in increasing the explainability, fairness, auditability and safety of their AI systems. Depending on their needs, these can be accessed interactively (e.g. via the Jupyter Notebook UI or embedded via the API) or through static reporting.
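The three layers can be pictured with a small, self-contained sketch: inputs are bundled, an analysis consumes them, and the result is an object that can be rendered in more than one form. The classes `Ingestibles` and `Digestible` and the function `accuracy_analysis` below are hypothetical names chosen for this illustration. They are not the explabox classes, and the real interfaces are considerably richer.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List, Tuple

Split = List[Tuple[str, str]]  # (text, gold label) pairs


@dataclass
class Ingestibles:
    """Hypothetical bundle of a model (any callable) and named data splits."""
    model: Callable[[str], str]
    data: Dict[str, Split]


@dataclass
class Digestible:
    """Hypothetical analysis result that can be rendered for different readers."""
    analysis: str
    results: Dict[str, float]

    def to_json(self) -> str:
        # Machine-readable form, e.g. for embedding in another tool.
        return json.dumps(asdict(self), sort_keys=True)

    def to_text(self) -> str:
        # Human-readable form, e.g. for a static report.
        lines = [f"{self.analysis}:"]
        lines += [f"  {split}: {value:.2f}" for split, value in sorted(self.results.items())]
        return "\n".join(lines)


def accuracy_analysis(ingestibles: Ingestibles) -> Digestible:
    """Turn ingestibles into a digestible holding the accuracy per split."""
    results = {}
    for name, split in ingestibles.data.items():
        correct = sum(ingestibles.model(text) == label for text, label in split)
        results[name] = correct / len(split)
    return Digestible(analysis="accuracy", results=results)


ingestibles = Ingestibles(
    model=lambda text: "negative" if "hate" in text.lower() else "positive",
    data={
        "train": [("Hate it", "negative"), ("Love it", "positive")],
        "test": [("Works well", "positive"), ("Made it worse", "negative")],
    },
)
report = accuracy_analysis(ingestibles)
print(report.to_text())
print(report.to_json())
```

Keeping the result as an object, rather than printing inside the analysis, is what allows the same outcome to be shown interactively to a developer and exported as a static report for an auditor.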
Example usage
The example usage guide showcases the explabox for a black-box model performing multi-class classification of the UCI Drug Reviews dataset.
Without requiring any local installations, the notebook is provided on .
If you want to follow along on your own device, simply pip install explabox-demo-drugreview and run the lines in the Jupyter notebook we have prepared for you!
Advanced set-up
When importing your own model and data, you can refer to a single file or an archive of files, either on disk or at an online URL. The explabox does all the importing for you. Consult the ingestibles documentation for an up-to-date list of the supported file formats.
from explabox import import_data, import_model
data = import_data('./drugsCom.zip',
                   data_cols='review',
                   label_cols='rating')
model = import_model('model.onnx',
                     label_map={0: 'negative', 1: 'neutral', 2: 'positive'})
In this example, the data in the archive drugsCom.zip contains two .tsv (tab-separated values) files with the data in the review column and the gold labels in the rating column. The two files in drugsCom.zip are drugsComTrain.tsv and drugsComTest.tsv, containing the training data and test data, respectively.
The model is provided as an onnx file, where output 0 corresponds to a negative review, 1 to a neutral review, and 2 to a positive review.
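As a rough illustration of what such a label map expresses, the sketch below picks the highest-scoring output index and translates it into a readable label. It assumes the model emits one score per class, which the text above does not state explicitly. The helper `predict_label` is hypothetical and says nothing about how the explabox applies `label_map` internally.

```python
from typing import Dict, Sequence


def predict_label(scores: Sequence[float], label_map: Dict[int, str]) -> str:
    """Map the index of the highest score to its human-readable label."""
    if len(scores) != len(label_map):
        raise ValueError("expected one score per entry in label_map")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return label_map[best_index]


label_map = {0: 'negative', 1: 'neutral', 2: 'positive'}

# Made-up scores for three hypothetical reviews.
print(predict_label([0.7, 0.2, 0.1], label_map))
print(predict_label([0.1, 0.6, 0.3], label_map))
print(predict_label([0.1, 0.2, 0.7], label_map))
```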
You can map the files in drugsCom.zip to your train/test/validation splits, renaming them for easy access:
from explabox import Explabox
box = Explabox(data=data,
               model=model,
               splits={'train': 'drugsComTrain.tsv', 'test': 'drugsComTest.tsv'})
Now you can .explore, .examine, .expose and .explain your data and model as usual.
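Conceptually, the `splits` argument re-keys the imported data from file names to split names. A minimal sketch of that idea, using a plain dictionary and a hypothetical helper `rename_splits` (not part of the explabox API, and with made-up example rows), could look like this:

```python
from typing import Dict, List


def rename_splits(data_by_file: Dict[str, List[str]],
                  splits: Dict[str, str]) -> Dict[str, List[str]]:
    """Return the data keyed by split name instead of by file name."""
    missing = [filename for filename in splits.values() if filename not in data_by_file]
    if missing:
        raise KeyError(f"files not found in archive: {missing}")
    return {split: data_by_file[filename] for split, filename in splits.items()}


# Made-up rows standing in for the contents of the two .tsv files.
data_by_file = {
    'drugsComTrain.tsv': ["Hate this medicine so much!", "Works well for me"],
    'drugsComTest.tsv': ["No side effects so far"],
}
by_split = rename_splits(
    data_by_file,
    splits={'train': 'drugsComTrain.tsv', 'test': 'drugsComTest.tsv'},
)
print(sorted(by_split))
print(len(by_split['train']), len(by_split['test']))
```

Failing early on an unknown file name, as this sketch does, is one way to surface a misspelled file name before any analysis runs.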
Releases
The explabox is officially released through PyPI. The changelog includes a full overview of the changes for each version.
Contributing
The explabox is an open-source project developed and maintained primarily by the Netherlands National Police Lab AI (NPAI). Your contributions and improvements are nonetheless very welcome! See contributing for a full contribution guide.
Citation
If you use the Explabox in your work, please read the corresponding paper at doi:10.48550/arXiv.2411.15257, and cite the paper as follows:
@article{Robeer2025,
title = {{Explabox: A Python Toolkit for Standardized Auditing and Explanation of Text Models}},
author = {Robeer, Marcel and Bron, Michiel and Herrewijnen, Elize and Hoeseni, Riwish and Bex, Floris},
journal = {Journal of Open Source Software},
year = {2025},
}