Veritas Diagnosis tool for fairness & transparency assessment.

Veritas Toolkit

Supported Python versions: 3.8, 3.9, 3.10.

The purpose of this toolkit is to facilitate the adoption of the Veritas Methodology on Fairness & Transparency Assessment and to spur industry development. It is also intended to benefit customers by improving the fairness and transparency of financial services delivered by AIDA (Artificial Intelligence and Data Analytics) systems.

Installation

The easiest way to install veritastool is from PyPI. This installs the library itself together with its prerequisites. We suggest creating a virtual environment from the requirements.txt file first.

pip install veritastool
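
To confirm that the package landed in the active environment, a small check along the following lines can help. This is a minimal sketch that uses only the standard library; the helper name installed_version is an assumption made for illustration and is not part of veritastool.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical helper, used here to report on the veritastool distribution.
version = installed_version("veritastool")
if version is None:
    print("veritastool is not installed in this environment")
else:
    print(f"veritastool {version} is installed")
```

If the first message is printed, the virtual environment that ran pip is probably not the one running Python.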

You can then import the library and use its functionality. Before doing so, you can run a test function on the sample datasets to check that the code performs as expected.

from veritastool.util.utility import test_function_cs
test_function_cs()

Initialization

You can now import the custom library that you would like to use for diagnosis. In this example we use the Credit Scoring custom library.

from veritastool.model.modelwrapper import ModelWrapper
from veritastool.model.model_container import ModelContainer
from veritastool.usecases.credit_scoring import CreditScoring

Once the relevant use case object (CreditScoring) and model container (ModelContainer) have been imported, you can load your data into the container and initialize the object for diagnosis.

import pickle
import numpy as np
from sklearn.linear_model import LogisticRegression

# Load Credit Scoring test data
# NOTE: Assumes the current working directory is the root folder of the cloned veritastool repository
file = "./veritastool/examples/data/credit_score_dict.pickle"
with open(file, "rb") as input_file:
    cs = pickle.load(input_file)

# Model Container parameters
y_true = np.array(cs["y_test"])
y_pred = np.array(cs["y_pred"])
y_train = np.array(cs["y_train"])
p_grp = {'SEX': [1], 'MARRIAGE': [1]}    # privileged groups
up_grp = {'SEX': [2], 'MARRIAGE': [2]}   # unprivileged groups
x_train = cs["X_train"]
x_test = cs["X_test"]
model_name = "credit_scoring"
model_type = "classification"
y_prob = cs["y_prob"]
model_obj = LogisticRegression(C=0.1)
model_obj.fit(x_train, y_train)  # fit the model, as required for the transparency analysis

# Create Model Container
container = ModelContainer(y_true, p_grp, model_type, model_name, y_pred, y_prob, y_train,
                           x_train=x_train, x_test=x_test, model_object=model_obj, up_grp=up_grp)

# Create Use Case Object
cre_sco_obj = CreditScoring(model_params=[container], fair_threshold=80, fair_concern="eligible",
                            fair_priority="benefit", fair_impact="normal", perf_metric_name="accuracy",
                            tran_row_num=[20, 40], tran_max_sample=1000, tran_pdp_feature=['LIMIT_BAL'],
                            tran_max_display=10)

API functions

Below are the API functions that the user can execute to obtain the fairness and transparency diagnosis of their use cases.

Evaluate

The evaluate API function computes all performance and fairness metrics and renders them in a table (the default). It also highlights the primary performance and fairness metrics, which are selected automatically if the user does not specify them.

cre_sco_obj.evaluate()

You can also toggle the widget to view your results as an interactive visualization.

cre_sco_obj.evaluate(visualize = True)
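
To make the notion of a fairness metric concrete, here is a minimal sketch of two widely used group metrics computed by hand on made-up data. It is not the toolkit's implementation: the helper selection_rate and the toy arrays are assumptions for illustration, and the sign and ratio conventions shown are only one common choice.

```python
from typing import Sequence


def selection_rate(y_pred: Sequence[int], mask: Sequence[bool]) -> float:
    """Share of favourable (1) predictions among the rows selected by `mask`."""
    selected = [p for p, m in zip(y_pred, mask) if m]
    if not selected:
        raise ValueError("group is empty")
    return sum(selected) / len(selected)


# Toy data (made up): 1 = favourable prediction; group codes mimic a SEX column.
y_pred = [1, 1, 1, 0, 1, 0, 0, 1]
sex = [1, 1, 1, 1, 2, 2, 2, 2]

privileged = [s == 1 for s in sex]    # analogous to p_grp = {'SEX': [1]}
unprivileged = [s == 2 for s in sex]  # analogous to up_grp = {'SEX': [2]}

rate_p = selection_rate(y_pred, privileged)    # 3 of 4 rows
rate_u = selection_rate(y_pred, unprivileged)  # 2 of 4 rows

parity_difference = rate_p - rate_u  # gap in selection rates
disparate_impact = rate_u / rate_p   # ratio of selection rates

print(f"parity difference: {parity_difference:.2f}")
print(f"disparate impact:  {disparate_impact:.2f}")
```

A parity difference near 0 (or a ratio near 1) means both groups receive favourable predictions at similar rates.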

Tradeoff

Computes the trade-off between performance and fairness.

cre_sco_obj.tradeoff()
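
One way to picture such a trade-off is a grid search over separate decision thresholds for the privileged and unprivileged groups, recording a performance metric and a fairness metric at each grid point. The sketch below illustrates that general idea on made-up data; the function evaluate, the grid, and the selection rule are assumptions, not the toolkit's algorithm.

```python
from itertools import product

# Toy data (made up): predicted probabilities, true labels, and a group flag.
y_prob = [0.9, 0.8, 0.4, 0.3, 0.7, 0.6, 0.35, 0.2]
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
is_priv = [True, True, True, True, False, False, False, False]


def evaluate(th_priv: float, th_unpriv: float):
    """Return (accuracy, parity difference) for one pair of per-group thresholds."""
    y_pred = [int(p >= (th_priv if g else th_unpriv)) for p, g in zip(y_prob, is_priv)]
    accuracy = sum(int(a == b) for a, b in zip(y_pred, y_true)) / len(y_true)
    n_priv = sum(is_priv)
    rate_p = sum(p for p, g in zip(y_pred, is_priv) if g) / n_priv
    rate_u = sum(p for p, g in zip(y_pred, is_priv) if not g) / (len(is_priv) - n_priv)
    return accuracy, rate_p - rate_u


# Evaluate every combination of thresholds on a coarse grid.
grid = [0.25, 0.5, 0.75]
results = {(tp, tu): evaluate(tp, tu) for tp, tu in product(grid, grid)}

# Among the pairs with the best accuracy, prefer the smallest absolute parity gap.
best_acc = max(acc for acc, _ in results.values())
best_pair = min(
    (pair for pair, (acc, _) in results.items() if acc == best_acc),
    key=lambda pair: abs(results[pair][1]),
)
print(best_pair, results[best_pair])
```

A finer grid gives a smoother picture at the cost of more evaluations.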

Note: In the output, replace {Balanced Accuracy} with the respective metric.

Feature Importance

Computes the feature importance of the protected features using leave-one-out analysis.

cre_sco_obj.feature_importance()
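
Leave-one-out analysis drops one feature at a time, refits the model, and compares the resulting metric with the baseline. The sketch below shows the idea with a tiny nearest-centroid classifier written in plain Python. The classifier, helper names, and toy data are assumptions for illustration, and only accuracy is tracked, scored on the training rows for brevity.

```python
from typing import Dict, List, Sequence

Row = Dict[str, float]


def fit_centroids(rows: Sequence[Row], labels: Sequence[int], features: Sequence[str]):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    centroids = {}
    for cls in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(r[f] for r in members) / len(members) for f in features]
    return centroids


def accuracy(centroids, rows, labels, features) -> float:
    """Assign each row to its closest centroid and return the share of correct labels."""
    correct = 0
    for r, y in zip(rows, labels):
        point = [r[f] for f in features]
        pred = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
        )
        correct += int(pred == y)
    return correct / len(rows)


def leave_one_out(rows, labels, features: List[str]):
    """Return the baseline accuracy and the change observed when each feature is dropped."""
    baseline = accuracy(fit_centroids(rows, labels, features), rows, labels, features)
    deltas = {}
    for dropped in features:
        kept = [f for f in features if f != dropped]
        score = accuracy(fit_centroids(rows, labels, kept), rows, labels, kept)
        deltas[dropped] = score - baseline
    return baseline, deltas


# Toy data (made up): LIMIT_BAL separates the classes, SEX carries no signal.
rows = [
    {"LIMIT_BAL": 1.0, "SEX": 1},
    {"LIMIT_BAL": 2.0, "SEX": 2},
    {"LIMIT_BAL": 8.0, "SEX": 1},
    {"LIMIT_BAL": 9.0, "SEX": 2},
]
labels = [0, 0, 1, 1]

baseline, deltas = leave_one_out(rows, labels, ["LIMIT_BAL", "SEX"])
print(baseline, deltas)
```

A delta near zero suggests the model barely relies on that feature; a fuller analysis would track the change in a fairness metric alongside performance.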

Root Cause

Computes the importance of variables contributing to the bias.

cre_sco_obj.root_cause()

Mitigate

The user can choose which methods to apply to mitigate the bias.

mitigated = cre_sco_obj.mitigate(p_var=[], method=['reweigh', 'correlation', 'threshold'])
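
As background for the 'reweigh' option, the classic reweighing technique gives each training row the weight P(group) * P(label) / P(group, label), so that group and label become statistically independent in the weighted data. The sketch below implements that textbook formula on made-up data; whether the toolkit computes its weights in exactly this way is an assumption, and the function name is hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def reweighing_weights(groups: Sequence[int], labels: Sequence[int]) -> List[float]:
    """Per-row weights n_group * n_label / (n * n_joint), i.e. P(g) * P(y) / P(g, y)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


# Toy data (made up): group 1 receives the favourable label more often than group 2.
groups = [1, 1, 1, 1, 2, 2, 2, 2]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
weights = reweighing_weights(groups, labels)


def weighted_positive_rate(group: int) -> float:
    """Weighted share of favourable labels within one group."""
    total = sum(w for w, g in zip(weights, groups) if g == group)
    positive = sum(w for w, g, y in zip(weights, groups, labels) if g == group and y == 1)
    return positive / total


print(weights)
print(weighted_positive_rate(1), weighted_positive_rate(2))
```

Rare combinations, such as a favourable label in the disadvantaged group, are weighted up, and the weights can then be passed as sample weights to any model that accepts them.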

Explain

Runs the transparency analysis: global and local interpretability, partial dependence analysis, and permutation importance.

#run the entire transparency analysis
cre_sco_obj.explain()

#get the local interpretability plot for specific row index and model
cre_sco_obj.explain(local_row_num = 20)
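
Of the techniques listed, permutation importance is the simplest to show in a few lines: shuffle one feature column, re-score the model, and treat the average drop in the metric as that feature's importance. The sketch below is a plain-Python illustration with a stand-in predict function and made-up data; it is not the toolkit's implementation, and all names are assumptions.

```python
import random
from typing import Callable, List, Sequence


def permutation_importance(
    predict: Callable[[List[float]], int],
    rows: Sequence[List[float]],
    labels: Sequence[int],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)

    def score(data) -> float:
        return sum(int(predict(r) == y) for r, y in zip(data, labels)) / len(labels)

    baseline = score(rows)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only column j replaced by its shuffled value.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - score(shuffled))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data (made up): the stand-in model looks at feature 0 and ignores feature 1.
rows = [[1.0, 1], [2.0, 2], [3.0, 1], [4.0, 2], [6.0, 1], [7.0, 2], [8.0, 1], [9.0, 2]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]


def predict(row: List[float]) -> int:
    return int(row[0] > 5)


importances = permutation_importance(predict, rows, labels)
print(importances)
```

A feature the model never reads gets an importance of exactly zero, because shuffling it cannot change any prediction.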

Compile

Generates the model artifact file in JSON format. This function also runs any of the API functions that have not already been run.

cre_sco_obj.compile()
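
Because the artifact is plain JSON, it can be inspected with the standard library. The sketch below summarizes the top-level keys of a JSON file. The artifact's real file name and schema are not given here, so the code writes a small stand-in file first; the stand-in keys and the helper summarize_artifact are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def summarize_artifact(path) -> dict:
    """Load a JSON file and map each top-level key to the type name of its value."""
    with Path(path).open("r", encoding="utf-8") as handle:
        artifact = json.load(handle)
    if not isinstance(artifact, dict):
        raise ValueError("expected a JSON object at the top level")
    return {key: type(value).__name__ for key, value in artifact.items()}


# Stand-in content (hypothetical keys); point the helper at the real file instead.
stand_in = {"fairness": {"fair_threshold": 80}, "features": ["SEX", "MARRIAGE"]}

with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "model_artifact_demo.json"
    demo_path.write_text(json.dumps(stand_in), encoding="utf-8")
    summary = summarize_artifact(demo_path)

print(summary)
```

A summary like this is a quick way to see which sections the artifact contains before reading the individual results.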

Model Artifact

A JSON file that stores all the results from all the APIs.

Examples

You may refer to our example notebooks below to see how the toolkit can be applied:

Filename                                  Description
CS_Demo.ipynb                             Tutorial notebook to diagnose a credit scoring model for predicting customers' loan repayment.
CM_Demo.ipynb                             Tutorial notebook to diagnose a customer marketing uplift model for selecting existing customers for a marketing call to increase sales of a loan product.
BaseClassification_demo.ipynb             Tutorial notebook for a multi-class propensity model.
BaseRegression_demo.ipynb                 Tutorial notebook for predicting a continuous target variable.
PUW_demo.ipynb                            Tutorial notebook for a binary classification model that predicts whether to award an insurance policy by assessing risk.
NewUseCaseCreation_demo.ipynb             Tutorial notebook to create a new use case notebook and add custom metrics.
nonPythonModel_customMetric_demo.ipynb    Tutorial notebook to diagnose a credit scoring model built with LibSVM (non-Python) with a custom metric.

License

Veritas Toolkit is licensed under the Apache License, Version 2.0 - see LICENSE for more details.
