# Parity

Measure bias from data and machine learning models.

### Overview

This repository contains code that demonstrates the use of fairness metrics, bias mitigation, and explainability tools.

# Bias and Fairness

A common problem with most machine learning models is bias that comes from the data. This notebook shows how to measure those biases and perform bias mitigation. A Python package called [aif360](https://github.com/IBM/AIF360) provides metrics and algorithms for bias measurement and mitigation.
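The sketches below show one way these metrics could be computed with aif360. The toy data and column names (`sex` as the protected attribute, `label` as the outcome) are illustrative assumptions, not data from this repository:

```python
import pandas as pd
from aif360.datasets import BinaryLabelDataset

# Toy data: 'sex' is the protected attribute (1 = privileged group),
# 'label' is the favorable outcome (1 = income above 50K).
df = pd.DataFrame({
    'sex':   [1, 1, 1, 1, 0, 0, 0, 0],
    'age':   [39, 50, 38, 53, 28, 37, 49, 52],
    'label': [1, 1, 0, 1, 0, 1, 0, 0],
})

dataset = BinaryLabelDataset(
    favorable_label=1.0,
    unfavorable_label=0.0,
    df=df,
    label_names=['label'],
    protected_attribute_names=['sex'],
)

# Group definitions passed to every metric class below.
privileged_groups = [{'sex': 1}]
unprivileged_groups = [{'sex': 0}]
```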

### Metrics

  • Statistical Parity Difference

  • Equal Opportunity Difference

  • Average Absolute Odds Difference

  • Disparate Impact

  • Theil Index

### Statistical Parity Difference

This measure is based on the following formula:

![Formula for Statistical Parity Difference.](images/spd.png)

Statistical parity difference is the difference between the probability that a random individual drawn from the unprivileged group is labeled 1 (here, that their income is above 50K) and the probability that a random individual drawn from the privileged group is labeled 1.

Fairer scores are close to 0.

More documentation: [One definition of algorithmic fairness: statistical parity](https://jeremykun.com/2015/10/19/one-definition-of-algorithmic-fairness-statistical-parity/).
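A minimal sketch, continuing the toy setup above: this metric needs only the labeled dataset, so BinaryLabelDatasetMetric is enough.

```python
from aif360.metrics import BinaryLabelDatasetMetric

dataset_metric = BinaryLabelDatasetMetric(
    dataset,
    unprivileged_groups=unprivileged_groups,
    privileged_groups=privileged_groups,
)

# P(label = 1 | unprivileged) - P(label = 1 | privileged)
# For the toy data: 1/4 - 3/4 = -0.5, i.e. biased against the
# unprivileged group.
print(dataset_metric.statistical_parity_difference())
```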

### Equal Opportunity Difference

This metric is the difference between the true positive rate of the unprivileged group and the true positive rate of the privileged group.

![Formula for Equal Opportunity Difference.](images/eod.png)

Fairer scores are close to 0.
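Because this metric compares true positive rates, it needs model predictions alongside the true labels. A sketch using the toy dataset above with hypothetical, made-up predictions:

```python
import numpy as np
from aif360.metrics import ClassificationMetric

# Hypothetical predictions: this model favors the privileged group.
dataset_pred = dataset.copy(deepcopy=True)
dataset_pred.labels = np.array([[1], [1], [1], [1], [0], [0], [0], [0]], dtype=float)

clf_metric = ClassificationMetric(
    dataset, dataset_pred,
    unprivileged_groups=unprivileged_groups,
    privileged_groups=privileged_groups,
)

# TPR_unprivileged - TPR_privileged
# For these predictions: 0 - 1 = -1.
print(clf_metric.equal_opportunity_difference())
```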

### Average Absolute Odds Difference

This measure uses both the false positive rate and the true positive rate to calculate bias: it averages the absolute FPR and TPR differences between the two groups.

![Formula for Average Absolute Odds Difference.](images/aaod.png)

Fairer scores are close to 0.
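Using the ClassificationMetric object from the previous sketch, this is a one-liner:

```python
# 0.5 * (|FPR_unpriv - FPR_priv| + |TPR_unpriv - TPR_priv|)
# For the toy predictions: 0.5 * (|0 - 1| + |0 - 1|) = 1.0.
print(clf_metric.average_abs_odds_difference())
```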

### Disparate Impact

For this metric we use the following formula:

![Formula for Disparate Impact.](images/di.png)

Like statistical parity difference, this metric uses the probabilities that a random individual drawn from the unprivileged or the privileged group is labeled 1, but here as a ratio rather than a difference.

Fairer scores are close to 1.
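A sketch with the toy dataset above (like statistical parity difference, this metric needs no predictions):

```python
# Ratio of selection rates:
# P(label = 1 | unprivileged) / P(label = 1 | privileged)
# For the toy data: (1/4) / (3/4) = 0.333...
print(dataset_metric.disparate_impact())
```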

### Theil Index

This measure is the generalized entropy index with $\alpha$ equal to 1. More information: [Generalized Entropy Index](https://en.wikipedia.org/wiki/Generalized_entropy_index).

![Formula for Theil Index.](images/ti.png)

Fairer scores are close to 0.
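For intuition, here is a hedged sketch of the computation the aif360 documentation describes for this metric, with each individual's benefit taken as $b_i = \hat{y}_i - y_i + 1$ (an assumption drawn from the aif360 docs, not from this repository's code):

```python
import numpy as np

def theil_index(y_true, y_pred):
    """Generalized entropy index with alpha = 1, computed over
    per-individual benefits b_i = y_pred_i - y_true_i + 1."""
    b = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float) + 1.0
    mu = b.mean()
    # 0 * log(0) is taken as 0 for individuals with b_i = 0.
    with np.errstate(divide='ignore', invalid='ignore'):
        terms = np.where(b > 0, (b / mu) * np.log(b / mu), 0.0)
    return terms.mean()

# Same toy labels and predictions as above: the benefits are
# [1, 1, 2, 1, 1, 0, 1, 1], so the index is (2 * ln 2) / 8, about 0.173.
print(theil_index([1, 1, 0, 1, 0, 1, 0, 0], [1, 1, 1, 1, 0, 0, 0, 0]))
```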

Some metrics need model predictions, while others only need the original dataset. This is why we use two classes from the aif360 package, ClassificationMetric and BinaryLabelDatasetMetric; a combined sketch follows the lists below.

### For metrics that require predictions

  • [Equal Opportunity Difference](https://aif360.readthedocs.io/en/latest/modules/metrics.html#aif360.metrics.ClassificationMetric.equal_opportunity_difference): equal_opportunity_difference()

  • [Average Absolute Odds Difference](https://aif360.readthedocs.io/en/latest/modules/metrics.html#aif360.metrics.ClassificationMetric.average_abs_odds_difference): average_abs_odds_difference()

  • [Theil Index](https://aif360.readthedocs.io/en/latest/modules/metrics.html#aif360.metrics.ClassificationMetric.theil_index): theil_index()

### For metrics that don’t require predictions

  • [Statistical Parity Difference](https://aif360.readthedocs.io/en/latest/modules/metrics.html#aif360.metrics.BinaryLabelDatasetMetric.statistical_parity_difference): statistical_parity_difference()

  • [Disparate Impact](https://aif360.readthedocs.io/en/latest/modules/metrics.html#aif360.metrics.BinaryLabelDatasetMetric.disparate_impact): disparate_impact()
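Putting the pieces together, a sketch that prints all five metrics in one pass, reusing the objects built in the sketches above:

```python
for name, value in [
    ('Statistical Parity Difference', dataset_metric.statistical_parity_difference()),
    ('Equal Opportunity Difference', clf_metric.equal_opportunity_difference()),
    ('Average Absolute Odds Difference', clf_metric.average_abs_odds_difference()),
    ('Disparate Impact', dataset_metric.disparate_impact()),
    ('Theil Index', clf_metric.theil_index()),
]:
    print(f'{name}: {value:.3f}')
```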

### Sample Display

The fairness metrics should display like this:

![Sample image of the fairness metrics.](images/bias_metrics.png)
