A Python library for building causal models and mitigating bias.
FLAI: Fairness Learning in Artificial Intelligence
A Python library developed by Rubén González during his PhD research. Its mission: to mitigate bias and discrimination through the application of causal algorithms.
Overview
FLAI is a Python library designed with two key functionalities: building a causal algorithm and mitigating biases within it.

- Causal Algorithm Creation: the library facilitates the development of a reliable causal algorithm, setting the stage for impartial data analysis.
- Bias Mitigation: fairness is pursued in two significant areas, In-Training and Pre-Training.
In-Training Mitigation
The library includes features that allow the user to adjust the causal algorithm in two essential ways (a toy sketch of the second idea follows this list):

- Graph Relationship Modification: relationships within the graph can be modified to establish a more balanced structure.
- Probability Table Modification: the probability tables can be adjusted so that existing biases are not propagated or amplified.
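As a rough illustration of the second operation (this is not FLAI's API, and the numbers are made up for the example; the library's own calls appear under Causal Mitigation below), a conditional probability table that depends on a sensitive attribute can be flattened so that both values of the attribute share the same distribution:

```python
import numpy as np

# Toy CPD P(label | sex): columns are sex=0 and sex=1 (illustrative numbers only).
cpd = np.array([[0.35, 0.21],   # P(label=0 | sex)
                [0.65, 0.79]])  # P(label=1 | sex)

# One simple mitigation: replace both columns with their average, so the
# distribution of label no longer depends on sex.
mitigated_cpd = np.repeat(cpd.mean(axis=1, keepdims=True), cpd.shape[1], axis=1)
print(mitigated_cpd)  # both columns are now [0.28, 0.72]
```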
Pre-Training Mitigation
Once the causal algorithm has been mitigated, a bias-free dataset can be generated from it. This dataset can then be used to train other algorithms, extending the bias-mitigation process to the earliest stages of new model development (see the Fair Data example below).
Installation
FLAI can be easily installed using pip, Python's package installer. Open your terminal or command prompt and run:

```
pip install flai-causal
```
Features
Causal Creation
```python
from FLAI import data
from FLAI import causal_graph
import pandas as pd

# Load the Adult dataset and wrap it in a FLAI Data object
df = pd.read_pickle('../Data/adult.pickle')
flai_dataset = data.Data(df, transform=True)

# Build a causal graph over the dataset, with 'label' as the target node
flai_graph = causal_graph.CausalGraph(flai_dataset, target='label')
flai_graph.plot(directed=True)
```
Causal Mitigation
Relations Mitigation
```python
flai_graph.mitigate_edge_relation(sensible_feature=['sex', 'age'])
```
Table Probabilities Mitigation
```python
flai_graph.mitigate_calculation_cpd(sensible_feature=['age', 'sex'])
```
Inference
Assess the impact of the sensitive features before mitigation. For sex, age, and label, 0 is the unfavorable value.
```python
flai_graph.inference(variables=['sex', 'label'], evidence={})
flai_graph.inference(variables=['age', 'label'], evidence={})
```
sex | label | p |
---|---|---|
0 | 0 | 0.1047 |
0 | 1 | 0.2053 |
1 | 0 | 0.1925 |
1 | 1 | 0.4975 |

age | label | p |
---|---|---|
0 | 0 | 0.0641 |
0 | 1 | 0.1259 |
1 | 0 | 0.2331 |
1 | 1 | 0.5769 |
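From the first table, P(label = 1 | sex = 0) = 0.2053 / (0.1047 + 0.2053) ≈ 0.66, while P(label = 1 | sex = 1) = 0.4975 / (0.1925 + 0.4975) ≈ 0.72, so the favorable outcome is noticeably more likely for the privileged group; the age table shows a similar gap (≈ 0.66 vs ≈ 0.71).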
Assess the impact of the sensitive features after mitigation. Changes in sex or age no longer affect the output.
```python
# mitigated_graph denotes the graph after the mitigation calls above
mitigated_graph.inference(variables=['sex', 'label'], evidence={})
mitigated_graph.inference(variables=['age', 'label'], evidence={})
```
sex | label | p |
---|---|---|
0 | 0 | 0.1498 |
0 | 1 | 0.3502 |
1 | 0 | 0.1498 |
1 | 1 | 0.3502 |

age | label | p |
---|---|---|
0 | 0 | 0.1498 |
0 | 1 | 0.3502 |
1 | 0 | 0.1498 |
1 | 1 | 0.3502 |
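After mitigation the groups share the same joint distribution, so P(label = 1 | sex) = 0.3502 / (0.1498 + 0.3502) ≈ 0.70 for both values of sex, and likewise for age.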
Fair Data
```python
fair_data = flai_graph.generate_dataset(n_samples=1000, methodtype='bayes')
```
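A quick sanity check of the generated dataset (`fair_data.data` is assumed to be a pandas DataFrame, as in the training example below):

```python
print(fair_data.data.shape)    # expect 1000 rows, one per generated sample
print(fair_data.data.columns)  # the columns used for training below
```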
Train Algorithm With Fair Data
```python
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

# Features and target taken from the fair (bias-free) dataset
mitigated_X = fair_data.data[['age', 'sex', 'credit_history', 'savings', 'employment']]
mitigated_y = fair_data.data[['label']]

mitigated_X_train, mitigated_X_test, mitigated_y_train, mitigated_y_test = train_test_split(
    mitigated_X, mitigated_y, test_size=0.7, random_state=54)

model_mitigated = XGBClassifier()
model_mitigated.fit(mitigated_X_train, mitigated_y_train)
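```

To sanity-check the trained model, its held-out accuracy can be computed with scikit-learn (an illustrative addition, not part of the library's API):

```python
from sklearn.metrics import accuracy_score

# Accuracy on the held-out split; compare with the ACC column reported below
acc = accuracy_score(mitigated_y_test.values.ravel(),
                     model_mitigated.predict(mitigated_X_test))
print(f'held-out accuracy: {acc:.4f}')
```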
Attach the model's predictions to the dataset, then compute per-group metrics (here `fair_data` is assumed to be the dataset object exposing `fairness_metrics`, with predictions computed over the full fair dataset):

```python
# Add the model's predictions as a column so they can be compared with the target
fair_data.data['Predicted'] = model_mitigated.predict(mitigated_X)

metrics = fair_data.fairness_metrics(
    target_column='label', predicted_column='Predicted',
    columns_fair={'sex': {'privileged': 1, 'unprivileged': 0},
                  'age': {'privileged': 1, 'unprivileged': 0}})
```
Performance Metrics

| | ACC | TPR | FPR | FNR | PPP |
|---|---|---|---|---|---|
| model | 0.7034 | 0.97995 | 0.94494 | 0.02005 | 0.96948 |
| sex_privileged | 0.7024 | 0.97902 | 0.94363 | 0.02098 | 0.96841 |
| sex_unprivileged | 0.7044 | 0.98087 | 0.94626 | 0.01913 | 0.97055 |
| age_privileged | 0.7042 | 0.97881 | 0.94118 | 0.02119 | 0.96758 |
| age_unprivileged | 0.7026 | 0.98109 | 0.94872 | 0.01891 | 0.97139 |
Fairness Metrics

| | EOD | DI | SPD | OD |
|---|---|---|---|---|
| sex_fair_metrics | 0.00185 | 1.00221 | 0.00214 | 0.00448 |
| age_fair_metrics | 0.00228 | 1.00394 | 0.00382 | 0.00981 |
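The reported values are consistent with the standard group-fairness definitions applied to the performance table above; the following re-derivation is an illustration, not FLAI's code:

```python
# Re-deriving the sex fairness metrics from the performance table above.
# EOD = TPR gap, DI = PPP ratio, SPD = PPP gap, and OD here matches the
# sum of the FPR and TPR gaps (an un-averaged odds difference).
priv = {'TPR': 0.97902, 'FPR': 0.94363, 'PPP': 0.96841}    # sex_privileged
unpriv = {'TPR': 0.98087, 'FPR': 0.94626, 'PPP': 0.97055}  # sex_unprivileged

eod = unpriv['TPR'] - priv['TPR']                                   # ≈ 0.00185
di = unpriv['PPP'] / priv['PPP']                                    # ≈ 1.00221
spd = unpriv['PPP'] - priv['PPP']                                   # ≈ 0.00214
od = (unpriv['FPR'] - priv['FPR']) + (unpriv['TPR'] - priv['TPR'])  # ≈ 0.00448
```

Values this close to zero (or to one, for DI) indicate that the model trained on the fair data treats the privileged and unprivileged groups almost identically.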
SHAP Results
```python
import shap

# model_original / original_dataset are assumed to come from the same pipeline
# applied to the original (unmitigated) data, which is not shown here
explainer_original = shap.Explainer(model_original)
explainer_mitigated = shap.Explainer(model_mitigated)

shap_values_original = explainer_original(
    original_dataset.data[['sex', 'race', 'age', 'education']])
shap_values_mitigated = explainer_mitigated(
    original_dataset.data[['sex', 'race', 'age', 'education']])

shap.plots.beeswarm(shap_values_original)
shap.plots.bar(shap_values_original)
shap.plots.beeswarm(shap_values_mitigated)
shap.plots.bar(shap_values_mitigated)
```
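In these plots, the sensitive features would be expected to contribute far less to the mitigated model's predictions than to the original model's, which is the visual signature of successful mitigation.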
Citation