Extension of text_explainability for sensitivity testing (robustness, fairness)
Sensitivity testing (fairness & robustness) for text machine learning models
Extension of text_explainability
Uses the generic architecture of text_explainability to also include tests of safety (how safe the model is in production, i.e. which types of inputs it can handle), robustness (how generalizable the model is in production, e.g. stability when adding typos, or the effect of adding random unrelated data) and fairness (whether equal individuals are treated equally by the model, e.g. subgroup fairness on sex and nationality).
© Marcel Robeer, 2021
Quick tour
Safety: test if your model is able to handle different data types.
from text_sensitivity import RandomAscii, RandomEmojis, combine_generators
# Generate 10 strings with random ASCII characters
RandomAscii().generate_list(n=10)
# Generate 5 strings with random ASCII characters and emojis
combine_generators(RandomAscii(), RandomEmojis()).generate_list(n=5)
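To make the idea of a safety test concrete, the sketch below feeds random strings to a model and records every input that makes it raise an error. It is a minimal illustration using only the standard library, not the package's implementation: `random_ascii_strings`, `toy_model` and `safety_check` are hypothetical names introduced here.

```python
import random
import string


def random_ascii_strings(n, min_length=0, max_length=20, seed=0):
    """Generate n strings of random printable ASCII characters (hypothetical stand-in generator)."""
    rng = random.Random(seed)
    return [
        ''.join(rng.choice(string.printable) for _ in range(rng.randint(min_length, max_length)))
        for _ in range(n)
    ]


def toy_model(text):
    """Hypothetical classifier: 'long' if the text has more than 10 characters, else 'short'."""
    return 'long' if len(text) > 10 else 'short'


def safety_check(predict, inputs):
    """Return the (input, exception) pairs for which predict raised an error."""
    failures = []
    for text in inputs:
        try:
            predict(text)
        except Exception as exc:  # a safe model should not raise on unusual input
            failures.append((text, exc))
    return failures


samples = random_ascii_strings(n=10)
failures = safety_check(toy_model, samples)
print(f'{len(failures)} of {len(samples)} inputs caused an error')
```

A model that indexes into its input without checking for empty strings, for example, would show up in `failures` as soon as the generator produces an empty string.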
Robustness: test if your model performs equally for different entities ...
from text_sensitivity import RandomAddress, RandomEmail
# Random address of your current locale (default = 'nl')
RandomAddress(sep=', ').generate_list(n=5)
# Random e-mail addresses in Spanish ('es') and Portuguese ('pt'), including the country each e-mail address is from
RandomEmail(languages=['es', 'pt']).generate_list(n=10, attributes=True)
... and if it is robust under simple perturbations.
from text_sensitivity import compare_accuracy
from text_sensitivity.perturbation import to_upper, add_typos
# Is model accuracy equal when we change all sentences to uppercase?
compare_accuracy(env, model, to_upper)
# Is model accuracy equal when we add typos in words?
compare_accuracy(env, model, add_typos)
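The comparison behind such a robustness test can be sketched in plain Python: compute accuracy on the original texts, apply a perturbation, and compute accuracy again against the same labels. This is an illustrative sketch under assumptions, not the package's `compare_accuracy`; the helper `compare_accuracy_sketch`, the toy model and the example data are all hypothetical.

```python
def accuracy(predict, texts, labels):
    """Fraction of texts for which the prediction matches the label."""
    return sum(predict(t) == y for t, y in zip(texts, labels)) / len(texts)


def compare_accuracy_sketch(predict, texts, labels, perturb):
    """Compare accuracy before and after applying a perturbation to every text."""
    original = accuracy(predict, texts, labels)
    perturbed = accuracy(predict, [perturb(t) for t in texts], labels)
    return {'original': original, 'perturbed': perturbed, 'difference': perturbed - original}


def case_sensitive_model(text):
    """Hypothetical sentiment model that only recognizes the lowercase word 'good'."""
    return 'positive' if 'good' in text else 'negative'


# Hypothetical labeled examples
texts = ['a good movie', 'a bad movie', 'good food', 'terrible service']
labels = ['positive', 'negative', 'positive', 'negative']

# Perturbation: change every sentence to uppercase
result = compare_accuracy_sketch(case_sensitive_model, texts, labels, str.upper)
print(result)
```

A robust model would show a difference close to zero; the toy model here loses accuracy because it relies on letter case.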
Fairness: see if performance is equal among subgroups.
from text_sensitivity import RandomName
# Generate random Dutch ('nl') and Russian ('ru') names, both 'male' and 'female' (+ return attributes)
RandomName(languages=['nl', 'ru'], sex=['male', 'female']).generate_list(n=10, attributes=True)
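One way generated names and their attributes might be used for a fairness check is to fill them into a fixed template and compare the model's predictions per subgroup. The sketch below assumes a simple (name, group) pair format and a hand-written name list; the template, the names, `biased_model` and `positive_rate_by_group` are hypothetical and do not reflect the package's own data structures.

```python
from collections import defaultdict

TEMPLATE = '{name} is a great colleague.'
# Hypothetical hand-written names with a group attribute (here: language)
NAMES = [('Daan', 'nl'), ('Sanne', 'nl'), ('Ivan', 'ru'), ('Olga', 'ru')]


def biased_model(text):
    """Hypothetical model that (unfairly) reacts to one specific name."""
    return 'negative' if 'Ivan' in text else 'positive'


def positive_rate_by_group(predict, template, names):
    """Fraction of 'positive' predictions per group for the filled-in template."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for name, group in names:
        prediction = predict(template.format(name=name))
        counts[group][0] += prediction == 'positive'
        counts[group][1] += 1
    return {group: positives / total for group, (positives, total) in counts.items()}


rates = positive_rate_by_group(biased_model, TEMPLATE, NAMES)
print(rates)
```

Because the sentence is identical apart from the name, equal individuals should be treated equally; a gap between the per-group rates points to a potential fairness issue.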
Installation
See the installation instructions for an extended installation guide.
Method | Instructions |
---|---|
pip | Install from PyPI via pip3 install text_sensitivity. |
Local | Clone this repository and install via pip3 install -e . or locally run python3 setup.py install. |
Documentation
Full documentation of the latest version is provided at https://text-sensitivity.readthedocs.io/.
Example usage
See example_usage.md for an example of how the package can be used, or run the lines in example_usage.py to explore it interactively.
Releases
text_sensitivity is officially released through PyPI.
See CHANGELOG.md for a full overview of the changes for each version.
Citation
@misc{text_sensitivity,
title = {Python package text\_sensitivity},
author = {Marcel Robeer},
howpublished = {\url{https://git.science.uu.nl/m.j.robeer/text_sensitivity}},
year = {2021}
}
Maintenance
Contributors
- Marcel Robeer (@m.j.robeer)
- Elize Herrewijnen (@e.herrewijnen)
Todo
Tasks yet to be done:
- Word-level perturbations
- Add fairness-specific metrics:
  - Counterfactual fairness
- Add expected behavior
  - Robustness: prediction should equal the prior prediction, though in some cases a deviation may be expected
  - Fairness: prediction may deviate from the original prediction
- Tests
  - Add tests for perturbations
  - Add tests for sensitivity testing schemes
- Add visualization ability
Credits
- Edward Ma. NLP Augmentation. 2019.
- Daniele Faraglia and other contributors. Faker. 2012.
- Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin and Sameer Singh. Beyond Accuracy: Behavioral Testing of NLP models with CheckList. Association for Computational Linguistics (ACL). 2020.