Measure bias from data and machine learning models.
# Parity
### Overview
This repository contains code that demonstrates the use of fairness metrics, bias mitigation techniques, and explainability tools.
# Bias and Fairness
A common problem with machine learning models is bias that originates in the data. This notebook shows how to measure such bias and how to mitigate it. The Python package [aif360](https://github.com/IBM/AIF360) provides metrics and algorithms for bias measurement and mitigation.
### Metrics
* Statistical Parity Difference
* Equal Opportunity Difference
* Average Absolute Odds Difference
* Disparate Impact
* Theil Index
### Statistical Parity Difference
This measure is based on the following formula:
![Formula for Statistical Parity Difference.](images/spd.png)
Statistical parity difference is the difference between the probability that a random individual drawn from the unprivileged group is labeled 1 (here, that their income is more than 50K) and the probability that a random individual drawn from the privileged group is labeled 1.
Fairer scores are close to 0.
For more background, see [One definition of algorithmic fairness: statistical parity](https://jeremykun.com/2015/10/19/one-definition-of-algorithmic-fairness-statistical-parity/).
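As an illustration of the definition above, here is a minimal sketch in plain Python that computes the difference in favorable-label rates between two groups. The function names, the group values `"u"` and `"p"`, and the toy data are assumptions made for this example; they are not part of this package or of aif360.

```python
def selection_rate(labels, groups, group_value, favorable=1):
    """Fraction of individuals in one group that carry the favorable label."""
    members = [y for y, g in zip(labels, groups) if g == group_value]
    if not members:
        raise ValueError(f"no individuals in group {group_value!r}")
    return sum(1 for y in members if y == favorable) / len(members)


def statistical_parity_difference(labels, groups, unprivileged, privileged):
    """P(label = 1 | unprivileged) - P(label = 1 | privileged)."""
    return (selection_rate(labels, groups, unprivileged)
            - selection_rate(labels, groups, privileged))


# Hypothetical toy data: 1 means income above 50K.
labels = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["u", "u", "u", "u", "p", "p", "p", "p"]

spd = statistical_parity_difference(labels, groups, "u", "p")
print(spd)  # 0.25 - 0.75 = -0.5
```

A negative value means the unprivileged group receives the favorable label less often than the privileged group.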
### Equal Opportunity Difference
This metric is the difference between the true positive rate of the unprivileged group and the true positive rate of the privileged group.
![Formula for Equal Opportunity Difference.](images/eod.png)
Fairer scores are close to 0.
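The following sketch shows one way to compute this difference of true positive rates by hand. It assumes binary labels where 1 is the favorable outcome; the helper names and the toy data are hypothetical and only serve the example.

```python
def true_positive_rate(y_true, y_pred, groups, group_value):
    """TPR within one group: correctly predicted positives / actual positives."""
    tp = fn = 0
    for t, p, g in zip(y_true, y_pred, groups):
        if g != group_value or t != 1:
            continue  # only actual positives of this group count
        if p == 1:
            tp += 1
        else:
            fn += 1
    if tp + fn == 0:
        raise ValueError(f"no actual positives in group {group_value!r}")
    return tp / (tp + fn)


def equal_opportunity_difference(y_true, y_pred, groups, unprivileged, privileged):
    """TPR(unprivileged) - TPR(privileged)."""
    return (true_positive_rate(y_true, y_pred, groups, unprivileged)
            - true_positive_rate(y_true, y_pred, groups, privileged))


# Hypothetical toy data.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["u", "u", "u", "u", "p", "p", "p", "p"]

eod = equal_opportunity_difference(y_true, y_pred, groups, "u", "p")
print(eod)  # 0.5 - 1.0 = -0.5
```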
### Average Absolute Odds Difference
This measure uses both the false positive rate and the true positive rate to calculate the bias.
![Formula for Average Absolute Odds Difference.](images/aaod.png)
Fairer scores are close to 0.
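A minimal sketch of this metric follows. It assumes the usual definition, namely the mean of the absolute difference in false positive rates and the absolute difference in true positive rates between the two groups, and it assumes labels and predictions are encoded as 0 or 1. All names and data are hypothetical.

```python
def positive_prediction_rate(y_true, y_pred, groups, group_value, actual):
    """Share of a group's rows with true label `actual` that were predicted 1.

    actual=1 gives the true positive rate, actual=0 the false positive rate.
    """
    preds = [p for t, p, g in zip(y_true, y_pred, groups)
             if g == group_value and t == actual]
    if not preds:
        raise ValueError(f"group {group_value!r} has no rows with label {actual}")
    return sum(preds) / len(preds)


def average_abs_odds_difference(y_true, y_pred, groups, unprivileged, privileged):
    """0.5 * (|FPR_u - FPR_p| + |TPR_u - TPR_p|)."""
    fpr_gap = (positive_prediction_rate(y_true, y_pred, groups, unprivileged, 0)
               - positive_prediction_rate(y_true, y_pred, groups, privileged, 0))
    tpr_gap = (positive_prediction_rate(y_true, y_pred, groups, unprivileged, 1)
               - positive_prediction_rate(y_true, y_pred, groups, privileged, 1))
    return 0.5 * (abs(fpr_gap) + abs(tpr_gap))


# Hypothetical toy data.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["u", "u", "u", "u", "p", "p", "p", "p"]

aaod = average_abs_odds_difference(y_true, y_pred, groups, "u", "p")
print(aaod)  # 0.5 * (|0.0 - 0.5| + |0.5 - 1.0|) = 0.5
```

Because both gaps are taken in absolute value, the result is never negative and does not show which group is favored.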
### Disparate Impact
For this metric we use the following formula:
![Formula for Disparate Impact.](images/di.png)
Like the first metric, it uses the probabilities that a random individual drawn from the unprivileged or the privileged group is labeled 1, but here they are combined as a ratio rather than a difference.
Fairer scores are close to 1.
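To make the ratio concrete, the sketch below divides the favorable-label rate of the unprivileged group by that of the privileged group. The function names and the toy data are assumptions made for illustration only.

```python
def selection_rate(labels, groups, group_value, favorable=1):
    """Fraction of individuals in one group that carry the favorable label."""
    members = [y for y, g in zip(labels, groups) if g == group_value]
    if not members:
        raise ValueError(f"no individuals in group {group_value!r}")
    return sum(1 for y in members if y == favorable) / len(members)


def disparate_impact(labels, groups, unprivileged, privileged):
    """P(label = 1 | unprivileged) / P(label = 1 | privileged)."""
    privileged_rate = selection_rate(labels, groups, privileged)
    if privileged_rate == 0:
        raise ZeroDivisionError("privileged group has no favorable labels")
    return selection_rate(labels, groups, unprivileged) / privileged_rate


# Hypothetical toy data: 1 means income above 50K.
labels = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["u", "u", "u", "u", "p", "p", "p", "p"]

di = disparate_impact(labels, groups, "u", "p")
print(di)  # 0.25 / 0.75, roughly 0.333
```

A value below 1 means the unprivileged group receives the favorable label less often; a value above 1 means the opposite.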
### Theil Index
This measure is the generalized entropy index with $\alpha$ equal to 1. For more information, see [Generalized Entropy Index](https://en.wikipedia.org/wiki/Generalized_entropy_index).
![Formula for Theil Index.](images/ti.png)
Fairer scores are close to 0.
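The sketch below computes a Theil index over per-individual "benefits". It assumes the benefit definition described in the aif360 documentation, `b_i = y_pred_i - y_true_i + 1`, and the standard Theil formula `(1/n) * sum((b_i / mean) * ln(b_i / mean))`, with terms where `b_i = 0` contributing 0. The function name and the data are hypothetical.

```python
import math


def theil_index(y_true, y_pred):
    """Generalized entropy index with alpha = 1 over benefits b = pred - true + 1."""
    benefits = [p - t + 1 for t, p in zip(y_true, y_pred)]
    if not benefits:
        raise ValueError("empty input")
    mean = sum(benefits) / len(benefits)
    if mean == 0:
        raise ValueError("all benefits are zero; the index is undefined")
    total = 0.0
    for b in benefits:
        if b > 0:  # by convention, 0 * ln(0) is treated as 0
            ratio = b / mean
            total += ratio * math.log(ratio)
    return total / len(benefits)


# Perfect predictions give every individual the same benefit, so the index is 0.
perfect = theil_index([1, 0, 1, 0], [1, 0, 1, 0])

# One false negative (benefit 0) and one false positive (benefit 2).
unequal = theil_index([1, 0], [0, 1])
print(perfect, unequal)  # 0.0 and ln(2)
```

Unlike the group metrics above, this index measures inequality between individuals and does not use the privileged and unprivileged group definitions.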
Some metrics need predictions, while others need only the original dataset. This is why we use two classes from the aif360 package: `ClassificationMetric` and `BinaryLabelDatasetMetric`.
### For metrics that require predictions:
* [Equal Opportunity Difference](https://aif360.readthedocs.io/en/latest/modules/metrics.html#aif360.metrics.ClassificationMetric.equal_opportunity_difference): `equal_opportunity_difference()`
* [Average Absolute Odds Difference](https://aif360.readthedocs.io/en/latest/modules/metrics.html#aif360.metrics.ClassificationMetric.average_abs_odds_difference): `average_abs_odds_difference()`
* [Theil Index](https://aif360.readthedocs.io/en/latest/modules/metrics.html#aif360.metrics.ClassificationMetric.theil_index): `theil_index()`
### For metrics that don’t require predictions:
* [Statistical Parity Difference](https://aif360.readthedocs.io/en/latest/modules/metrics.html#aif360.metrics.BinaryLabelDatasetMetric.statistical_parity_difference): `statistical_parity_difference()`
* [Disparate Impact](https://aif360.readthedocs.io/en/latest/modules/metrics.html#aif360.metrics.ClassificationMetric.disparate_impact): `disparate_impact()`
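The two lists above can be combined into one helper, sketched here under several assumptions: aif360, pandas, and numpy are installed; `df` is an all-numeric pandas DataFrame; the column names `income` and `sex` are placeholders; and the protected attribute is encoded as 1 for the privileged group and 0 for the unprivileged group. The helper name `compute_fairness_metrics` is hypothetical and is not claimed to be part of this package.

```python
def compute_fairness_metrics(df, predictions, label="income", protected="sex"):
    """Return the five metrics for a numeric DataFrame and 0/1 predictions."""
    # Imported here so the sketch can be loaded even where aif360 is missing.
    import numpy as np
    from aif360.datasets import BinaryLabelDataset
    from aif360.metrics import BinaryLabelDatasetMetric, ClassificationMetric

    # Assumed encoding of the protected attribute: 1 = privileged.
    unprivileged = [{protected: 0}]
    privileged = [{protected: 1}]

    dataset = BinaryLabelDataset(
        df=df,
        label_names=[label],
        protected_attribute_names=[protected],
        favorable_label=1,
        unfavorable_label=0,
    )

    # Same rows as the original dataset, with labels replaced by predictions.
    predicted = dataset.copy(deepcopy=True)
    predicted.labels = np.asarray(predictions, dtype=float).reshape(-1, 1)

    data_metric = BinaryLabelDatasetMetric(
        dataset, unprivileged_groups=unprivileged, privileged_groups=privileged
    )
    clf_metric = ClassificationMetric(
        dataset, predicted,
        unprivileged_groups=unprivileged, privileged_groups=privileged,
    )

    return {
        # Metrics that need only the original dataset.
        "statistical_parity_difference": data_metric.statistical_parity_difference(),
        "disparate_impact": data_metric.disparate_impact(),
        # Metrics that compare true labels with predictions.
        "equal_opportunity_difference": clf_metric.equal_opportunity_difference(),
        "average_abs_odds_difference": clf_metric.average_abs_odds_difference(),
        "theil_index": clf_metric.theil_index(),
    }
```

Calling `compute_fairness_metrics(df, model.predict(X))`, where `model` and `X` stand for your own classifier and feature matrix, would return a dictionary keyed by metric name.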
### Sample Display
The fairness metrics should display like this:
![Sample image of the fairness metrics.](images/bias_metrics.png)