
Measure bias from data and machine learning models.


Parity

Overview

This repository contains code that demonstrates the use of fairness metrics, bias mitigation, and explainability tools.

Installation

Install using:

foo@bar:~$ pip install parity-fairness

Bias Measurement Usage

Set up the data so that the target column is a binary string target. Then determine which features are the privileged categories and which of their values are the privileged values. Afterwards, pass them to the show_bias function like this:

from parity.fairness_metrics import show_bias

priv_category = 'Race-White'
priv_value = 'True'
target_label = 'high pay'
unencoded_target_label = 'True'
cols_to_drop = ''

# `data` is the dataset prepared as described above.
show_bias(data, priv_category, priv_value, target_label, unencoded_target_label, cols_to_drop)

Bias and Fairness

A common problem with most machine learning models is bias inherited from the data. This notebook shows how to measure those biases and perform bias mitigation. The Python package aif360 provides metrics and algorithms for bias measurement and mitigation.

Metrics

  • Statistical Parity Difference
  • Equal Opportunity Difference
  • Average Absolute Odds Difference
  • Disparate Impact
  • Theil Index

Statistical Parity Difference

This measure is based on the following formula:

$$Pr(Y = 1 \mid D = \text{unprivileged}) - Pr(Y = 1 \mid D = \text{privileged})$$

Statistical parity difference is the difference between the probability that a random individual drawn from the unprivileged group is labeled 1 (here, that their income is more than 50K) and the probability that a random individual drawn from the privileged group is labeled 1.

Fairer scores are close to 0.

For more documentation, see "One definition of algorithmic fairness: statistical parity".
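The definition above can be sketched in a few lines of plain Python. This is only an illustration with made-up labels; the helper names `positive_rate` and `statistical_parity_difference` are hypothetical and are not part of the parity package.

```python
def positive_rate(labels):
    """Share of individuals in a group whose label is 1."""
    return sum(labels) / len(labels)


def statistical_parity_difference(unprivileged_labels, privileged_labels):
    """P(label = 1 | unprivileged) - P(label = 1 | privileged)."""
    return positive_rate(unprivileged_labels) - positive_rate(privileged_labels)


# Toy labels, invented for illustration: 1 means income above 50K.
unprivileged = [1, 0, 0, 0]  # 1 of 4 labeled 1
privileged = [1, 1, 0, 0]    # 2 of 4 labeled 1

spd = statistical_parity_difference(unprivileged, privileged)
print(spd)  # -0.25
```

A negative value means the unprivileged group receives the favorable label less often than the privileged group.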

Equal Opportunity Difference

This metric is the difference between the true positive rate of the unprivileged group and the true positive rate of the privileged group.

$$TPR_{D = \text{unprivileged}} - TPR_{D = \text{privileged}}$$

Fairer scores are close to 0.
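As a minimal sketch of the equal opportunity difference, assuming binary labels and predictions stored in plain lists, the calculation could look like this. The function names and the toy data are hypothetical and only serve the illustration.

```python
def true_positive_rate(y_true, y_pred):
    """Fraction of actual positives that were predicted as positive."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(predictions_for_positives) / len(predictions_for_positives)


def equal_opportunity_difference(unpriv_true, unpriv_pred, priv_true, priv_pred):
    """TPR of the unprivileged group minus TPR of the privileged group."""
    return (true_positive_rate(unpriv_true, unpriv_pred)
            - true_positive_rate(priv_true, priv_pred))


# Toy data, invented for illustration.
unpriv_true, unpriv_pred = [1, 1, 0, 0], [1, 0, 0, 0]  # TPR = 1/2
priv_true, priv_pred = [1, 1, 0, 0], [1, 1, 1, 0]      # TPR = 2/2

eod = equal_opportunity_difference(unpriv_true, unpriv_pred, priv_true, priv_pred)
print(eod)  # -0.5
```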

Average Absolute Odds Difference

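This measure uses both the false positive rate and the true positive rate to calculate the bias.

$$\frac{1}{2}\left[\,|FPR_{D = \text{unprivileged}} - FPR_{D = \text{privileged}}| + |TPR_{D = \text{unprivileged}} - TPR_{D = \text{privileged}}|\,\right]$$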

Fairer scores are close to 0.
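The following sketch averages the absolute gaps in false positive rate and true positive rate between the two groups, which is how aif360 defines this metric. The helper names and the toy data are hypothetical.

```python
def positive_prediction_rate(y_true, y_pred, actual):
    """Share of positive predictions among rows whose true label equals `actual`.

    actual=1 gives the true positive rate, actual=0 the false positive rate.
    """
    selected = [p for t, p in zip(y_true, y_pred) if t == actual]
    return sum(selected) / len(selected)


def average_abs_odds_difference(unpriv_true, unpriv_pred, priv_true, priv_pred):
    """Mean of the absolute FPR gap and the absolute TPR gap between groups."""
    fpr_gap = abs(positive_prediction_rate(unpriv_true, unpriv_pred, 0)
                  - positive_prediction_rate(priv_true, priv_pred, 0))
    tpr_gap = abs(positive_prediction_rate(unpriv_true, unpriv_pred, 1)
                  - positive_prediction_rate(priv_true, priv_pred, 1))
    return 0.5 * (fpr_gap + tpr_gap)


# Toy data, invented for illustration.
unpriv_true, unpriv_pred = [1, 1, 0, 0], [1, 0, 0, 0]  # TPR 0.5, FPR 0.0
priv_true, priv_pred = [1, 1, 0, 0], [1, 1, 1, 0]      # TPR 1.0, FPR 0.5

aaod = average_abs_odds_difference(unpriv_true, unpriv_pred, priv_true, priv_pred)
print(aaod)  # 0.5
```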

Disparate Impact

For this metric we use the following formula:

$$\frac{Pr(Y = 1 \mid D = \text{unprivileged})}{Pr(Y = 1 \mid D = \text{privileged})}$$

Like the first metric, this one uses the probabilities that a random individual drawn from the unprivileged group or from the privileged group is labeled 1, but here they are combined as a ratio rather than a difference.

Fairer scores are close to 1.
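A minimal sketch of the ratio, using made-up labels and hypothetical helper names, could look like this:

```python
def positive_rate(labels):
    """Share of individuals in a group whose label is 1."""
    return sum(labels) / len(labels)


def disparate_impact(unprivileged_labels, privileged_labels):
    """P(label = 1 | unprivileged) / P(label = 1 | privileged).

    Undefined (division by zero) when no privileged individual is labeled 1.
    """
    return positive_rate(unprivileged_labels) / positive_rate(privileged_labels)


# Toy labels, invented for illustration.
unprivileged = [1, 0, 0, 0]  # positive rate 0.25
privileged = [1, 1, 0, 0]    # positive rate 0.5

di = disparate_impact(unprivileged, privileged)
print(di)  # 0.5
```

A value below 1 means the unprivileged group receives the favorable label less often; a value above 1 means it receives it more often.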

Theil Index

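The Theil index is the generalized entropy index with $\alpha = 1$. For more information, see Generalized Entropy Index.

$$\frac{1}{n}\sum_{i=1}^{n}\frac{b_i}{\mu}\ln\frac{b_i}{\mu}$$

In aif360, $b_i = \hat{y}_i - y_i + 1$ is the benefit of individual $i$ and $\mu$ is the mean of the $b_i$.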

Fairer scores are close to 0.
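The sketch below computes the Theil index from true labels and predictions, assuming the per-individual benefit that aif360 uses (prediction minus true label plus 1). The function name and the toy data are hypothetical.

```python
import math


def theil_index(y_true, y_pred):
    """Generalized entropy index with alpha = 1 over per-individual benefits.

    Benefit: 0 for a false negative, 1 for a correct prediction,
    2 for a false positive. Fails with a division by zero when every
    prediction is a false negative (mean benefit 0).
    """
    benefits = [p - t + 1 for t, p in zip(y_true, y_pred)]
    mean_benefit = sum(benefits) / len(benefits)
    total = 0.0
    for b in benefits:
        ratio = b / mean_benefit
        if ratio > 0:  # the limit of x * ln(x) as x -> 0 is 0
            total += ratio * math.log(ratio)
    return total / len(benefits)


# Toy data, invented for illustration: one false negative, one false positive.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]

theil = theil_index(y_true, y_pred)
print(round(theil, 4))  # 0.3466
```

When every prediction is correct, all benefits equal 1 and the index is 0.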

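Some metrics need predictions while others need only the original dataset. This is why we use two classes from the aif360 package: ClassificationMetric and BinaryLabelDatasetMetric.

For metrics that require predictions (Equal Opportunity Difference, Average Absolute Odds Difference, Theil Index), use ClassificationMetric.

For metrics that don't require predictions (Statistical Parity Difference, Disparate Impact), use BinaryLabelDatasetMetric.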
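As a rough sketch of how the two aif360 classes divide the work, the hypothetical helper below collects all five metrics. It assumes `dataset_true` and `dataset_pred` are aif360 `BinaryLabelDataset` objects, where `dataset_pred` is a copy of `dataset_true` whose labels were replaced by model predictions. The function name `fairness_report` and the dictionary keys are illustrative and not part of the parity package.

```python
def fairness_report(dataset_true, dataset_pred, unprivileged_groups, privileged_groups):
    """Collect the five fairness metrics from two aif360 BinaryLabelDataset objects."""
    # Imported inside the function so this sketch can be loaded without aif360.
    from aif360.metrics import BinaryLabelDatasetMetric, ClassificationMetric

    # Metrics that compare predictions against the original labels.
    classified = ClassificationMetric(
        dataset_true,
        dataset_pred,
        unprivileged_groups=unprivileged_groups,
        privileged_groups=privileged_groups,
    )
    # Metrics that only need a single labeled dataset.
    labeled = BinaryLabelDatasetMetric(
        dataset_true,
        unprivileged_groups=unprivileged_groups,
        privileged_groups=privileged_groups,
    )
    return {
        "statistical_parity_difference": labeled.statistical_parity_difference(),
        "disparate_impact": labeled.disparate_impact(),
        "equal_opportunity_difference": classified.equal_opportunity_difference(),
        "average_abs_odds_difference": classified.average_abs_odds_difference(),
        "theil_index": classified.theil_index(),
    }
```

aif360 expects the groups as lists of dictionaries that map a protected attribute name to a value, for example `[{'Race-White': 1}]` for the privileged group if that column is encoded as 0/1 (an assumption about the encoding, not something the package guarantees).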

Sample Display

The fairness metrics should display like this:

Sample image of the fairness metrics.
