

Project description

TripleX

Explaining models, with triples

TripleX is a local explainability method for transformer models: it explains a prediction by building a small knowledge graph in the form of triples. This implementation focuses on NLI (natural language inference) tasks. Explanations are provided as dfas.DFAH objects (Deterministic Finite-state Automata of Hypernyms).

import pathlib
import copy
import json

from dfas import DFAH

# base path
BASE_PATH = str(pathlib.Path().absolute()) + '/'
# Load a sample DFAH
dfah = DFAH.from_json(BASE_PATH + 'data/dummies/dfah.json')
# Show a DFAH visually
print(dfah)
# access the perturbations it went through
perturbations = dfah.perturbations

# DFAHs are copyable and serializable
copy_dfah = copy.copy(dfah)
with open(BASE_PATH + 'data/dummies/my_dfah.json', 'w') as log:
    json.dump(dfah.to_json(), log)
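
A saved DFAH can then be reloaded with the same from_json constructor used above; a minimal round-trip sketch (file path as in the previous snippet):

# Reload the serialized DFAH and inspect it
reloaded_dfah = DFAH.from_json(BASE_PATH + 'data/dummies/my_dfah.json')
print(reloaded_dfah)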

Getting started

Install dependencies:

pip install triplex

python -m spacy download en_core_web_sm
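
en_core_web_sm is a spaCy model, so a quick sanity check that the download succeeded (a sketch, assuming spaCy is installed as a dependency):

import spacy

# Raises OSError if en_core_web_sm was not downloaded
nlp = spacy.load('en_core_web_sm')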

Run

from transformers import AutoModel
import logzero

from triplex.triplex import TripleX

# Logging level; set to logzero.logging.DEBUG for verbose output
logzero.loglevel(logzero.logging.INFO)

model = AutoModel.from_pretrained('microsoft/deberta-base', output_attentions=True)
# create explainer
explainer = TripleX(model)

premise = 'Dana Reeve, the widow of the actor Christopher Reeve, has died of lung cancer at age 44, according to the Christopher Reeve Foundation.'
hypothesis = 'Christopher Reeve had an accident.'
dfas, counterfactual_dfas = explainer.extract(premise, hypothesis,
                                              depth=2,
                                              max_perturbations_per_token=3)
print('--- Explanations')
for d in dfas[:3]:
    print(d)
print('--- Counterfactual explanations')
for d in counterfactual_dfas[:3]:
    print(d)
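
The returned objects are dfas.DFAH instances, so they can be persisted with the to_json method shown earlier; a sketch, assuming to_json returns a JSON-serializable object (the file name is illustrative):

import json

# Persist the factual explanations for later inspection
with open('explanations.json', 'w') as log:
    json.dump([d.to_json() for d in dfas], log)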

To run on a local JSONL dataset:

from transformers import AutoModel
import pandas as pd

from scripts.extract_from_dataset import to_standard_labels
from triplex.triplex import TripleX

# Load the dataset and keep only the columns TripleX needs
dataset = 'path/to/dataset.jsonl'
data = pd.read_json(dataset, lines=True)
data = data.drop('idx', axis='columns')
data['label'] = to_standard_labels(data['label'].values, dataset)
data = data[['premise', 'hypothesis', 'label']]

model = AutoModel.from_pretrained('microsoft/deberta-base', output_attentions=True)
explainer = TripleX(model)
# Explain each instance, collecting factual and counterfactual DFAHs
explanations = []
for idx, row in data.iterrows():
    premise, hypothesis, label = row.premise, row.hypothesis, row.label
    dfas, counterfactual_dfas = explainer.extract(premise, hypothesis,
                                                  depth=2,
                                                  max_perturbations_per_token=3)
    explanations.append((premise, hypothesis, label, dfas, counterfactual_dfas))
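
The collected explanations can be written back out as JSONL, mirroring the input format; a sketch under the same to_json assumption as above:

import json

with open('explanations.jsonl', 'w') as log:
    for premise, hypothesis, label, ds, cds in explanations:
        record = {'premise': premise, 'hypothesis': hypothesis, 'label': label,
                  'dfas': [d.to_json() for d in ds],
                  'counterfactual_dfas': [d.to_json() for d in cds]}
        log.write(json.dumps(record) + '\n')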


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

triplex-0.0.18.tar.gz (38.7 kB)

Built Distribution

triplex-0.0.18-py3-none-any.whl (40.2 kB)

File details

Details for the file triplex-0.0.18.tar.gz.

File metadata

  • Download URL: triplex-0.0.18.tar.gz
  • Size: 38.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.5

File hashes

Hashes for triplex-0.0.18.tar.gz
  • SHA256: 142d77c5f637f56665ec28d8d47ac8cba62320bf9a45054403a0e1f397ce88f9
  • MD5: a29539ae2009f9927ea211e4f2d49c3a
  • BLAKE2b-256: ca105b841e8c01bfb3d78cba067d49d05b94351641eeb7e9a2337a75cc493af2

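A downloaded archive can be verified against the SHA256 digest above with the standard library alone (the local file path is illustrative):

import hashlib

# Compute the SHA256 of the downloaded source distribution
with open('triplex-0.0.18.tar.gz', 'rb') as f:
    digest = hashlib.sha256(f.read()).hexdigest()
# Should match the published digest
assert digest == '142d77c5f637f56665ec28d8d47ac8cba62320bf9a45054403a0e1f397ce88f9'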

File details

Details for the file triplex-0.0.18-py3-none-any.whl.

File metadata

  • Download URL: triplex-0.0.18-py3-none-any.whl
  • Size: 40.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.5

File hashes

Hashes for triplex-0.0.18-py3-none-any.whl
  • SHA256: c508e2408befd7d3488aeb983ccaffb035c5a7538feb6d3f47fe594098e41315
  • MD5: c73b916beff798a829ea41ed90e1fc9c
  • BLAKE2b-256: 04a67514d02911e817f5de96f9b544c932462d80a775e989a4ef8a65e8a7fa67

