
Bias Detector is a Python package for detecting bias in machine learning models


Bias Detector

Bias Detector is a Python package for detecting bias in machine learning models used for making high-stakes decisions.

Based on the user's email address, first and last name, or zip code, the package estimates the probability of the user belonging to different genders/races. The model's predictions per gender/race are then compared using various bias metrics.
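
For intuition, group membership is estimated from name statistics rather than observed directly. The sketch below is purely illustrative; the lookup table, values and function name are hypothetical and are not the package's internal data or API:

import pandas as pd

# Hypothetical first-name statistics standing in for the US census-style data
# the package relies on; the numbers are made up for illustration.
name_stats = pd.DataFrame(
    {'male': [0.99, 0.01], 'female': [0.01, 0.99]},
    index=['john', 'mary'],
)

def gender_probabilities(first_name):
    # Estimated P(gender | first name); unknown names fall back to a uniform prior.
    name = first_name.strip().lower()
    if name in name_stats.index:
        return name_stats.loc[name].to_dict()
    return {'male': 0.5, 'female': 0.5}

gender_probabilities('Mary')  # {'male': 0.01, 'female': 0.99}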

Using this package, data scientists can gain insight into whether their model is biased.

The Bias Detector developers can be contacted on Stack Overflow using the bias-detector tag. We would appreciate your feedback!

Supported Metrics

There are many metrics that can be used to detect bias; we currently support the following three, illustrated by a short sketch after the list:

  1. Statistical Parity - tests whether the probability of being classified as positive by the model is the same for the two groups.
  2. Equal Opportunity - tests whether the True Positive Rates of the two groups are equal (how likely the model is to correctly predict the positive class for each group).
  3. Predictive Equality - tests whether the False Positive Rates of the two groups are equal (how likely the model is to incorrectly predict the positive class for each group).
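
Here is a minimal sketch of the three metrics for two hard-labelled groups. The package itself works with estimated group-membership probabilities, so this is an illustrative simplification, not its exact implementation:

import numpy as np

def statistical_parity_diff(y_pred, group_a, group_b):
    # P(pred=1 | group A) - P(pred=1 | group B)
    return y_pred[group_a].mean() - y_pred[group_b].mean()

def equal_opportunity_diff(y_true, y_pred, group_a, group_b):
    # TPR_A - TPR_B, where TPR = P(pred=1 | true=1)
    return y_pred[group_a & (y_true == 1)].mean() - y_pred[group_b & (y_true == 1)].mean()

def predictive_equality_diff(y_true, y_pred, group_a, group_b):
    # FPR_A - FPR_B, where FPR = P(pred=1 | true=0)
    return y_pred[group_a & (y_true == 0)].mean() - y_pred[group_b & (y_true == 0)].mean()

# Toy data: binary labels/predictions and boolean group-membership masks.
y_true = np.array([1, 0, 1, 0, 1, 0])
y_pred = np.array([1, 1, 0, 0, 1, 0])
group_a = np.array([True, True, True, False, False, False])
statistical_parity_diff(y_pred, group_a, ~group_a)  # 2/3 - 1/3 ≈ 0.33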

Attention

The Bias Detector is based on statistical data from the US and should therefore only be used with US-originated data. We hope to support more countries in the future.

Usage

Install the package

!pip install bias-detector

Calculate bias metrics based on users data, y_true and y_pred:

from bias_detector.BiasDetector import BiasDetector

bias_report = BiasDetector().get_bias_report(
    first_names=first_names,
    last_names=last_names,
    zip_codes=zip_codes,
    y_true=y_true,
    y_pred=y_pred,
    country='US',
)
bias_report.plot_summary()
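
The inputs are per-user sequences of equal length (e.g. pandas Series or lists) holding the users' names, zip codes, true labels and binary predictions. A minimal sketch with made-up data (the column names and values here are hypothetical):

import pandas as pd

# Toy dataset; in practice these come from your own data and your model.
df = pd.DataFrame({
    'first_name': ['John', 'Mary', 'James', 'Linda'],
    'last_name': ['Smith', 'Johnson', 'Brown', 'Davis'],
    'zip_code': ['10001', '94103', '60601', '30301'],
})
first_names, last_names, zip_codes = df['first_name'], df['last_name'], df['zip_code']
y_true = pd.Series([1, 0, 1, 0])  # ground-truth binary labels
y_pred = pd.Series([1, 1, 0, 0])  # the model's binary predictions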

Example of the report output:

bias_report.plot_summary()

bias_report.print_summary()

  • Statistical Parity:
    • We observed the following statistically significant differences:
      • P(pred=1|Male) - P(pred=1|Female) = 0.55 - 0.49 = 0.053 ± 0.026 (α=0.01, p-value=1e-07)
  • Equal Opportunity:
    • We observed the following statistically significant differences:
      • TPR_Male - TPR_Female = 0.56 - 0.51 = 0.047 ± 0.036 (α=0.01, p-value=0.00097)
  • Predictive Equality:
    • We observed the following statistically significant differences:
      • FPR_Male - FPR_Female = 0.54 - 0.48 = 0.06 ± 0.036 (α=0.01, p-value=2e-05)
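
Each reported difference comes with a confidence interval and a p-value. Below is a minimal sketch of the kind of two-proportion z-test that produces such numbers; it is illustrative only and not necessarily the package's exact procedure:

import numpy as np
from scipy.stats import norm

def two_proportion_test(k1, n1, k2, n2, alpha=0.01):
    # k1/n1 and k2/n2: positive counts and sizes of the two groups.
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    z = (p1 - p2) / np.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    p_value = 2 * (1 - norm.cdf(abs(z)))
    # Confidence-interval margin for the difference at the given alpha.
    margin = norm.ppf(1 - alpha / 2) * np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return p1 - p2, margin, p_value

two_proportion_test(550, 1000, 490, 1000)  # a 0.06 difference with its margin and p-value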

bias_report.plot_groups()

Contributing

See CONTRIBUTING.md

References

  1. Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan, 2019. A Survey on Bias and Fairness in Machine Learning.
  2. Moritz Hardt, Eric Price, and Nathan Srebro, 2016. Equality of Opportunity in Supervised Learning.
  3. Ioan Voicu, 2018. Using First Name Information to Improve Race and Ethnicity Classification. Statistics and Public Policy, 5:1, 1-13. DOI: 10.1080/2330443X.2018.1427012.
