CafGa is a library that facilitates creating and evaluating grouped-attribution explanations.

GALE

About

GALE is a Python module built to facilitate the development and deployment of grouping strategies for attribution-based explanations of natural language processing models.

Installation

GALE can be installed from PyPI (not yet available) using:

pip install gale

When running GALE from the repository, install the dependencies with:

pip install -r requirements.txt

Note that some of the extra functionality requires further installations:

  1. The syntax-parse method requires spaCy and the en_core_web_trf model, which can be installed with the following commands:
pip install spacy
python -m spacy download en_core_web_trf

Important Notes:

  1. spaCy may fail to build on Python >= 3.13. If you run into a build failure, try downgrading Python to 3.12.

  2. en_core_web_trf requires a version of torch that does not run on NumPy 2. You may therefore need to downgrade NumPy with the following command:

pip install numpy==1.26.4

  2. GALE also provides two Jupyter widgets. The edit widget allows one to visually edit assignments, and the display widget displays the attributions generated by the explanation. To use these, please follow the instructions in the 'Demo Instructions.md' file.
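
The installation notes above can be checked before running the syntax-parse method. The following is a minimal sketch, not part of GALE: the function name check_syntax_parse_environment and the keys of the returned dictionary are hypothetical, and the checks only mirror the notes above (Python below 3.13, spaCy and en_core_web_trf importable, NumPy below version 2).

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from importlib.util import find_spec


def check_syntax_parse_environment():
    """Hypothetical helper: report whether the notes above are satisfied."""
    report = {
        # spaCy may fail to build on Python >= 3.13.
        "python_below_3_13": sys.version_info < (3, 13),
        # spaCy and its models are installed as regular Python packages.
        "spacy_installed": find_spec("spacy") is not None,
        "model_installed": find_spec("en_core_web_trf") is not None,
    }
    try:
        numpy_major = int(version("numpy").split(".")[0])
        report["numpy_below_2"] = numpy_major < 2
    except PackageNotFoundError:
        # NumPy is not installed at all.
        report["numpy_below_2"] = False
    return report


if __name__ == "__main__":
    for check, passed in check_syntax_parse_environment().items():
        print(f"{check}: {'ok' if passed else 'needs attention'}")
```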

Using GALE

The following explains the main functions of GALE. For an example of how to use it, please look at the demo.

To begin using gale, start by creating a gale object:

gale = GALE(model = 'your_model')

The model parameter is where you pass the model you want to explain. To allow for parallelization in how your model generates predictions (e.g. by batching), gale sends lists of inputs to your model instead of single inputs. Thus, the function that implements your model should take a list of strings as input and return either a list of strings or a list of floats (i.e. a list containing one output for every input).
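
As an illustration of the expected interface, here is a minimal sketch of a model function that takes a list of strings and returns one float per input. The function toy_sentiment_model and its word list are hypothetical stand-ins for a real model and are not part of GALE.

```python
from typing import List


def toy_sentiment_model(inputs: List[str]) -> List[float]:
    """Hypothetical stand-in model: fraction of positive words per input."""
    positive_words = {"good", "great", "excellent"}
    scores = []
    for text in inputs:
        tokens = text.lower().split()
        hits = sum(1 for token in tokens if token.strip(".,!?") in positive_words)
        # Guard against empty inputs, which can occur when segments are removed.
        scores.append(hits / max(len(tokens), 1))
    return scores


# One output per input, in the same order as the inputs.
print(toy_sentiment_model(["A good movie.", "A dull movie."]))
```

A function of this shape is what would be passed as the model parameter. A real model would typically batch the list internally before running inference.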

Once gale is instantiated, typical usage proceeds in three steps: Explanation, Evaluation, and Visualisation.

1. Explanation

To generate an explanation run the explain function on the instantiated gale object:

explanation = gale.explain(params)

There are two ways of using the explain function.

Firstly, you can pass the string you want an explanation for without segmenting it into the individual parts that you want attributions for. In this case, you need to provide the name of the predefined attribution method ('word', 'sentence', or 'syntax-parse') that you want to use.

Secondly, you can provide your own segmentation of the input by using the segmented_input parameter. In this case, you will also need to provide the assignment of each input segment to a group with the input_assignments parameter. Specifically, input_assignments[i] = g_i should be the index of the group that the i-th input segment belongs to.
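
To make the relationship between segments and assignments concrete, the sketch below builds the groups implied by a list of assignments. The helper group_segments and the example data are hypothetical and only illustrate the indexing convention described above. They are not GALE's internal representation.

```python
from collections import defaultdict
from typing import Dict, List


def group_segments(segments: List[str], assignments: List[int]) -> Dict[int, List[str]]:
    """Hypothetical helper: collect the segments that share a group index."""
    if len(segments) != len(assignments):
        raise ValueError("each segment needs exactly one group index")
    groups = defaultdict(list)
    for segment, group_index in zip(segments, assignments):
        groups[group_index].append(segment)
    return dict(groups)


segments = ["The", "movie", "was", "really", "good"]
assignments = [0, 0, 1, 2, 2]  # assignments[i] is the group of segments[i]
print(group_segments(segments, assignments))
# {0: ['The', 'movie'], 1: ['was'], 2: ['really', 'good']}
```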

2. Evaluation

Once an explanation object has been generated you can pass it on to the evaluation function:

evaluation = gale.evaluate(explanation, params)

The two forms of evaluation currently supported are deletion (going from all features present to no features present) and insertion (going from no features present to all features present), which can be selected with the direction parameter. The resulting evaluation accordingly contains the array of difference values computed as part of the perturbation curve.
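
The idea behind deletion and insertion can be sketched as follows. This is a simplified illustration under stated assumptions, not GALE's implementation: the function perturbation_curve is hypothetical, groups are perturbed from highest to lowest attribution, absent segments are simply dropped from the text, and each difference is taken relative to the model output on the full input.

```python
from typing import Callable, List, Sequence


def perturbation_curve(
    segments: Sequence[str],
    assignments: Sequence[int],
    attributions: Sequence[float],
    model: Callable[[List[str]], List[float]],
    direction: str = "deletion",
) -> List[float]:
    """Hypothetical sketch: differences to the full-input output per step."""
    if direction not in ("deletion", "insertion"):
        raise ValueError("direction must be 'deletion' or 'insertion'")
    # Rank group indices from most to least important (an assumption).
    ranked = sorted(range(len(attributions)), key=lambda g: attributions[g], reverse=True)
    present = set(ranked) if direction == "deletion" else set()

    def render() -> str:
        # Keep only segments whose group is currently present.
        return " ".join(s for s, g in zip(segments, assignments) if g in present)

    inputs = [render()]
    for group in ranked:
        if direction == "deletion":
            present.discard(group)
        else:
            present.add(group)
        inputs.append(render())
    outputs = model(inputs)  # a single batched call to the model
    # The full input is the first step for deletion and the last for insertion.
    reference = outputs[0] if direction == "deletion" else outputs[-1]
    return [reference - output for output in outputs]
```

With a model that is sensitive to a single group, the deletion curve jumps as soon as that group is removed, while the insertion curve drops to zero as soon as it is added.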

3. Visualisation

Finally, the perturbation curve generated by the evaluation can be visualised using the visualisation function:

gale.visualize_evaluation(evaluated_explanations, params)

Since you may want to plot the aggregate over many evaluations, the visualisation function takes a list of evaluations as input. The two forms of aggregation currently supported are equal-width binning and linear interpolation.
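
Aggregation is needed because perturbation curves from different inputs generally have different lengths. The sketch below shows one plausible reading of the two aggregation forms, with each curve placed on the interval [0, 1]. The functions interpolate_curve, bin_curve, and aggregate_curves are hypothetical and are not GALE's API.

```python
from typing import List, Sequence


def interpolate_curve(curve: Sequence[float], num_points: int) -> List[float]:
    """Linearly resample a curve onto num_points evenly spaced positions."""
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    if len(curve) == 1:
        return [float(curve[0])] * num_points
    resampled = []
    for k in range(num_points):
        position = k / (num_points - 1) * (len(curve) - 1)
        left = min(int(position), len(curve) - 2)
        weight = position - left
        resampled.append((1 - weight) * curve[left] + weight * curve[left + 1])
    return resampled


def bin_curve(curve: Sequence[float], num_bins: int) -> List[float]:
    """Average the points of a curve within equal-width bins over [0, 1]."""
    sums = [0.0] * num_bins
    counts = [0] * num_bins
    for i, value in enumerate(curve):
        x = i / (len(curve) - 1) if len(curve) > 1 else 0.0
        b = min(int(x * num_bins), num_bins - 1)
        sums[b] += value
        counts[b] += 1
    # Empty bins are reported as NaN rather than guessed.
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


def aggregate_curves(curves: Sequence[Sequence[float]], num_points: int) -> List[float]:
    """Mean of several curves after resampling them to a common length."""
    resampled = [interpolate_curve(curve, num_points) for curve in curves]
    return [sum(column) / len(column) for column in zip(*resampled)]
```

Interpolation keeps the shape of each curve, whereas binning smooths over neighbouring points. Which one suits a given plot depends on how many perturbation steps the individual curves contain.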
