Extension of text_explainability for sensitivity testing (robustness, fairness)

Sensitivity testing (fairness & robustness) for text machine learning models

Extension of text_explainability

Uses the generic architecture of text_explainability to also include tests of safety (how safe the model is in production, i.e. which types of inputs it can handle), robustness (how generalizable the model is in production, e.g. its stability when typos or random unrelated data are added) and fairness (whether equal individuals are treated equally by the model, e.g. subgroup fairness on sex and nationality).

© Marcel Robeer, 2021

Quick tour

Safety: test if your model is able to handle different data types.

from text_sensitivity import RandomAscii, RandomEmojis, combine_generators

# Generate 10 strings with random ASCII characters
RandomAscii().generate_list(n=10)

# Generate 5 strings with random ASCII characters and emojis
combine_generators(RandomAscii(), RandomEmojis()).generate_list(n=5)
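
To illustrate what such a safety test checks, here is a minimal standard-library sketch. It does not use or reproduce the text_sensitivity implementation: `random_ascii_strings`, `find_crashing_inputs` and `toy_predict` are hypothetical names introduced only for this example. The idea is to feed random strings to a prediction function and record which inputs make it raise.

```python
import random
import string


def random_ascii_strings(n, min_length=1, max_length=20, seed=0):
    """Return n strings of random printable ASCII characters (hypothetical helper)."""
    rng = random.Random(seed)
    return [
        "".join(
            rng.choice(string.printable)
            for _ in range(rng.randint(min_length, max_length))
        )
        for _ in range(n)
    ]


def find_crashing_inputs(predict, inputs):
    """Return (input, exception) pairs for every input on which predict raises."""
    failures = []
    for text in inputs:
        try:
            predict(text)
        except Exception as exc:  # collect every failure instead of stopping
            failures.append((text, exc))
    return failures


def toy_predict(text):
    """Hypothetical stand-in model that cannot handle inputs without letters."""
    if not any(ch.isalpha() for ch in text):
        raise ValueError("no alphabetic characters in input")
    return "positive" if "good" in text.lower() else "negative"


samples = random_ascii_strings(n=10)
failures = find_crashing_inputs(toy_predict, samples + ["1234", "good film"])
print(f"{len(failures)} of {len(samples) + 2} inputs raised an exception")
```

In practice `toy_predict` would be replaced by the prediction function of the model under test.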

Robustness: test if your model performs equally for different entities ...

from text_sensitivity import RandomAddress, RandomEmail

# Random address of your current locale (default = 'nl')
RandomAddress(sep=', ').generate_list(n=5)

# Random e-mail addresses in Spanish ('es') and Portuguese ('pt'), and include from which country the e-mail is
RandomEmail(languages=['es', 'pt']).generate_list(n=10, attributes=True)

... and if it is robust under simple perturbations.

from text_sensitivity import compare_accuracy
from text_sensitivity.perturbation import to_upper, add_typos

# `env` and `model` are assumed to have been created beforehand
# Is model accuracy equal when we change all sentences to uppercase?
compare_accuracy(env, model, to_upper)

# Is model accuracy equal when we add typos in words?
compare_accuracy(env, model, add_typos)
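
The comparison above can be pictured with a self-contained sketch that uses only the standard library. It is not how `compare_accuracy`, `to_upper` or `add_typos` are implemented in text_sensitivity: the `*_sketch` functions, the toy classifier and the tiny dataset are assumptions made for illustration.

```python
import random


def to_upper_sketch(text):
    """Perturbation: convert the whole text to uppercase."""
    return text.upper()


def add_typos_sketch(text, rate=0.1, seed=0):
    """Perturbation: swap adjacent letters with probability `rate` per position."""
    rng = random.Random(seed)
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair
        else:
            i += 1
    return "".join(chars)


def accuracy(predict, texts, labels):
    """Fraction of texts for which the predicted label equals the true label."""
    correct = sum(predict(text) == label for text, label in zip(texts, labels))
    return correct / len(texts)


def compare_accuracy_sketch(predict, texts, labels, perturbation):
    """Return (accuracy on original texts, accuracy on perturbed texts)."""
    original = accuracy(predict, texts, labels)
    perturbed = accuracy(predict, [perturbation(t) for t in texts], labels)
    return original, perturbed


def toy_predict(text):
    """Hypothetical case-sensitive keyword classifier."""
    return "positive" if "good" in text else "negative"


texts = ["a good film", "a bad film", "good acting", "dull plot"]
labels = ["positive", "negative", "positive", "negative"]

print(compare_accuracy_sketch(toy_predict, texts, labels, to_upper_sketch))
# prints (1.0, 0.5): the toy model misses "GOOD" after uppercasing
```

A gap between the two numbers signals that the model is sensitive to the perturbation; `add_typos_sketch` can be passed in the same way.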

Fairness: see if performance is equal among subgroups.

from text_sensitivity import RandomName

# Generate random Dutch ('nl') and Russian ('ru') names, both 'male' and 'female' (+ return attributes)
RandomName(languages=['nl', 'ru'], sex=['male', 'female']).generate_list(n=10, attributes=True)
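
As a rough sketch of how names with attributes can feed a subgroup comparison, the example below fills a template with each name and reports the rate of positive predictions per group. The (name, attributes) pair format, the helper `positive_rate_by_group` and the deliberately biased `toy_predict` are assumptions for illustration and may differ from what `generate_list(..., attributes=True)` returns.

```python
from collections import defaultdict

# Hypothetical (name, attributes) pairs; the real output format may differ.
names = [
    ("Daan", {"language": "nl", "sex": "male"}),
    ("Emma", {"language": "nl", "sex": "female"}),
    ("Ivan", {"language": "ru", "sex": "male"}),
    ("Olga", {"language": "ru", "sex": "female"}),
]


def positive_rate_by_group(predict, template, names, attribute):
    """Return the fraction of 'positive' predictions for each value of `attribute`."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for name, attrs in names:
        group = attrs[attribute]
        counts[group][0] += predict(template.format(name=name)) == "positive"
        counts[group][1] += 1
    return {group: pos / total for group, (pos, total) in counts.items()}


def toy_predict(text):
    """Hypothetical model that is biased towards the Dutch names."""
    return "positive" if any(n in text for n in ("Daan", "Emma")) else "negative"


template = "My name is {name}."
print(positive_rate_by_group(toy_predict, template, names, "language"))
# prints {'nl': 1.0, 'ru': 0.0}
print(positive_rate_by_group(toy_predict, template, names, "sex"))
# prints {'male': 0.5, 'female': 0.5}
```

Here the toy model treats both sexes alike but not both languages, which is the kind of subgroup difference a fairness test is meant to surface.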

Installation

See the installation instructions for an extended installation guide.

| Method | Instructions |
|--------|--------------|
| pip | Install from PyPI via pip3 install text_sensitivity. |
| Local | Clone this repository and install via pip3 install -e . or locally run python3 setup.py install. |

Documentation

Full documentation of the latest version is provided at https://text-sensitivity.readthedocs.io/.

Example usage

See example_usage.md for an example of how the package can be used, or run the lines in example_usage.py to explore it interactively.

Releases

text_sensitivity is officially released through PyPI.

See CHANGELOG.md for a full overview of the changes for each version.

Citation

@misc{text_sensitivity,
  title = {Python package text\_sensitivity},
  author = {Marcel Robeer},
  howpublished = {\url{https://git.science.uu.nl/m.j.robeer/text_sensitivity}},
  year = {2021}
}

Maintenance

Contributors

Todo

Tasks yet to be done:

  • Word-level perturbations
  • Add fairness-specific metrics:
    • Counterfactual fairness
  • Add expected behavior
    • Robustness: predictions should equal the prior prediction, although in some cases a deviation may be expected
    • Fairness: may deviate from original prediction
  • Tests
    • Add tests for perturbations
    • Add tests for sensitivity testing schemes
  • Add visualization ability
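
One way the "expected behavior" item could be expressed is as a label-free invariance check: the fraction of instances whose prediction stays equal to the prior prediction after a perturbation. This is only a sketch of the idea and not a planned or existing text_sensitivity API; `prediction_invariance` and `toy_predict` are hypothetical.

```python
def prediction_invariance(predict, texts, perturbation):
    """Fraction of texts whose predicted label is unchanged by the perturbation."""
    unchanged = sum(predict(t) == predict(perturbation(t)) for t in texts)
    return unchanged / len(texts)


def toy_predict(text):
    """Hypothetical case-sensitive keyword classifier."""
    return "positive" if "good" in text else "negative"


texts = ["a good film", "a bad film", "good acting", "dull plot"]

# Robustness expectation: predictions should stay equal under uppercasing.
print(prediction_invariance(toy_predict, texts, str.upper))  # prints 0.5
```

A robustness test would typically expect this value to be 1.0, whereas a test that expects deviation would assert that it is lower.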

Credits
