Various attributes inferences attacks tested against fairness enforcing mechanisms
Project description
AIA_fairness
Code repository of the article. The repositries segments the pipeline as to isolate key componants of the experimental analysis.
Installation and dependencies
By using pypi :
pip install aia_fairness
Or clone the repository then run
pip install --editable .
in the directory containing pyproject.toml
All dependencies are specified in pyproject.toml and installed automatically by pip.
Configuration
Default configuration is loaded automatically and can be found at src/aia_fairness/config/defaulf.py
.
To set custom configuration, first run
python -m aia_fairness.config
It will create a file config.py
in your current directory containing the default configuration.
You can then edit this at you liking.
If the file config.py
exist in your current directory it is loaded.
If not, the default configuration is loaded.
How to use
Dataset automatic download and processing
Part of the dataset uses the kaggle API to download the data.
Hence you need to include your API key in ~/.kaggle/kaggle.json
.
aia_fairness provides automatic download, formatic and saving of the dataset used in the article.
To use this feature and thus saving data in the data_format
directory simply run once
python -m aia_fairness.dataset_processing.fetch
Then you can load any dataset easily form anywhere in your code with
import aia_fairness.dataset_processing as dp
data = dp.load_format(dset, attrib)
Dataset evaluation
Each dataset can be evaluated along different axes:
- Sizes
from aia_fairness.dataset_processing import metric metric.counting(<dataset>)
- Fairness
from aia_fairness.dataset_processing import metric metric.dp_lvl(<dataset>, <attribute>)
To run all the evalution simply execute
python -m aia_fairness.dataset_processing.evaluation
You can refer to the full implementation in test/target_training.py
.
Running all the expriments form the paper
** Heavy computing power required **
One you have downloaded all the data you can run all the expirement of the article by running
python -m aia_fairness.experimental_stack
or import aia_fairness.experimental_stack
in the python interpreter.
Plotting all the experiments of the paper
Run the same shell command as for running the experiment adding the plot
argument:
python -m aia_fairness.experimental_stack plot
Training a target model
aia_fairness.models.target
contains various target model type.
The target models available are :
RandomForest
RandomForest_EGD
fairlearn is used to impose EGDNeuralNetwork
NeuralNetwork_EGD
NeuralNetwork_Fairgrad
Original implementation of the fairgrad paperNeuralNetwork_AdversarialDebiasing
Uses the fairlearn implementation of Adversarial Debisaing
For instance to train a random forest (based on sklearn) you can
import aia_fairness.models.target as targets
T = dp.split(data,0)
target.fit(T["train"]["x"], T["train"]["y"])
yhat = target.predict(T["test"]["x"])
Evaluation of a target model
aia_fairness.evaluation
provides the metrics used in the article.
import aia_fairness.evaluation as evaluations
utility = evaluations.utility()
fairness = evaluations.fairness()
utility.add_result(T["test"]["y"], yhat, T["test"]["z"])
fairness.add_result(T["test"]["y"], yhat, T["test"]["z"])
utility.save(type(target).__name__,dset,attrib)
fairness.save(type(target).__name__,dset,attrib)
The save method, as called in the example, creates a directory structure of the form :
#|result/target/<target type>
#| |<name of the metric class>
#| | |<dset>
#| | | |<attrib>
#| | | | <Name of metric 1>.pickle
#| | | | <Name of metric 2>.pickle
#| | | | <Name of metric ..>.pickle
#| |<name of another metric class>
#| | |<dset>
#| | | |<attrib>
#| | | | <Name of another metric 1>.pickle
#| | | | <Name of another metric 2>.pickle
#| | | | <Name of another metric ..>.pickle
Training an attack
aia_fairness.models
provides the two types of AIA attack described in the paper:
Classification
for hard labelsRegression
for soft labels
from aia_fairness.models import attack as attacks
aux = {"y":T["test"]["y"],
"z":T["test"]["z"],
"yhat":yhat}
aux_split = dp.split(aux,0)
classif = attacks.Classification()
classif.fit(aux_split["train"]["yhat"], aux_split["train"]["z"])
Evaluation of an attack
Similarly with evaluating the target model aia_fairness.evaluation.attack
can be used to save the accuracy and the balanced accuracy of the attack.
Plots and graphs
TODO
Tests scripts
Various tests are provided in the test
directory.
download_data.py
Fetches all the data form the diferent sources (don't forget to set your kaggle API key)target_training.py
Loads a dataset, split it with 5-folding (cross validation), train a target model on the data, and computes the metrics for this target model
Directories structure
- data_processing contains code that downloads, preprocess and saves the dataset in a uniformed pickle format exploitable by the rest of the pipeline using
load_format(dataset,attribute)
function of the utils.py file.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for aia_fairness-0.2.8-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 8749615b1cdac0216d8c3e97f6e4b697b10e1389730b80b46762cb03f8ee484c |
|
MD5 | 1ef3942ed0ab82da82d97153af7bde48 |
|
BLAKE2b-256 | f2b9b194ce54df69425e7341047bf4b23ced3fdc1168288d1e698bac93141f42 |