Bias Detector

Bias Detector is a python package for detecting gender/race bias in binary classification models.

Based on the user's first and last name and zip code, the package estimates the probability of the user belonging to different genders/races. The model's predictions per gender/race are then compared using various bias metrics.

Using this package you can gain insight into whether your model is biased, and if so, to what extent.

The Bias Detector is based on statistical data from the US, so it performs best on US-originated data. However, it is possible to adapt the package to other countries by adding the relevant statistical information. If you have such data, you are welcome to open a Pull Request to add it to the package.

If you have any questions please let us know. You can learn more about our research here.

Supported Metrics

There are many metrics that can be used to detect bias; we currently support the following three (a toy sketch of how they can be computed follows the list):

  1. Statistical Parity - tests whether the probability of being classified into the positive class by the model is equal for the 2 groups.
  2. Equal Opportunity - tests whether the True Positive Rates of the 2 groups are equal (how likely the model is to correctly predict the positive class for each group).
  3. Predictive Equality - tests whether the False Positive Rates of the 2 groups are equal (how likely the model is to incorrectly predict the positive class for each group).
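
As an illustration of these definitions (this is not the package's internal implementation), the three group differences can be computed directly from binary labels, binary predictions, and a boolean group mask:

import numpy as np

def bias_metric_differences(y_true, y_pred, group_a_mask):
    """Toy sketch: compare group A (mask=True) against group B (mask=False)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    a = np.asarray(group_a_mask, dtype=bool)
    b = ~a

    def positive_rate(mask):  # P(pred=1) within the group -> Statistical Parity
        return y_pred[mask].mean()

    def tpr(mask):            # P(pred=1 | y=1) within the group -> Equal Opportunity
        return y_pred[mask & (y_true == 1)].mean()

    def fpr(mask):            # P(pred=1 | y=0) within the group -> Predictive Equality
        return y_pred[mask & (y_true == 0)].mean()

    return {
        'statistical_parity_diff': positive_rate(a) - positive_rate(b),
        'equal_opportunity_diff': tpr(a) - tpr(b),
        'predictive_equality_diff': fpr(a) - fpr(b),
    }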

Usage

Install the package

!pip install bias-detector

Create a bias detector instance:

from bias_detector.BiasDetector import BiasDetector
bias_detector = BiasDetector(country='US')

Generate bias report:

bias_report = bias_detector.get_bias_report(first_names=first_names, last_names=last_names, zip_codes=zip_codes, y_true=y_true, y_pred=y_pred)
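
The exact accepted input types are not shown on this page; as a minimal sketch (assuming pandas Series / numpy arrays of equal length, with hypothetical toy data), the call looks like this:

import numpy as np
import pandas as pd

# Hypothetical toy inputs -- replace with your own users and model outputs.
first_names = pd.Series(['john', 'mary', 'carlos'])
last_names = pd.Series(['smith', 'johnson', 'garcia'])
zip_codes = pd.Series(['10001', '94103', '33101'])  # US zip codes as strings
y_true = np.array([1, 0, 1])                        # ground-truth binary labels
y_pred = np.array([1, 0, 0])                        # model's binary predictions

bias_report = bias_detector.get_bias_report(first_names=first_names, last_names=last_names,
                                            zip_codes=zip_codes, y_true=y_true, y_pred=y_pred)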

Visualize the bias report:

bias_report.plot_summary()
bias_report.print_summary()
  • Statistical Parity:
  • We observed the following statistically significant differences (𝛼=0.01):
    • P(pred=1|Male) - P(pred=1|Female) = 0.55 - 0.49 = 0.053±0.026 (p-value=1e-07)
  • Equal Opportunity:
  • We observed the following statistically significant differences (𝛼=0.01):
    • TPR_Male - TPR_Female = 0.56 - 0.51 = 0.047±0.036 (p-value=0.00097)
  • Predictive Equality:
  • We observed the following statistically significant differences (𝛼=0.01):
    • FPR_Male - FPR_Female = 0.54 - 0.48 = 0.06±0.036 (p-value=2e-05)
bias_report.plot_groups()
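
The summary reports each difference with a margin and a p-value at 𝛼=0.01. The exact test used by the package is not spelled out on this page, but a standard two-proportion z-test produces results of this form; a sketch with hypothetical group counts:

import numpy as np
from scipy import stats

def two_proportion_ztest(successes_a, n_a, successes_b, n_b, alpha=0.01):
    """Difference in proportions, (1 - alpha) margin, and two-sided p-value."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se_pooled = np.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    p_value = 2 * stats.norm.sf(abs((p_a - p_b) / se_pooled))
    # margin of the confidence interval for the difference (unpooled standard error)
    se = np.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    margin = stats.norm.ppf(1 - alpha / 2) * se
    return p_a - p_b, margin, p_value

# e.g. P(pred=1|Male) vs P(pred=1|Female), with hypothetical counts
diff, margin, p_value = two_proportion_ztest(successes_a=550, n_a=1000, successes_b=490, n_b=1000)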

Show gender/race correlation with model features:

bias_detector.get_features_groups_correlation(first_names=first_names, last_names=last_names, zip_codes=zip_codes, features=features)

Sample output from the Titanic demo:

feature male_correlation female_correlation white_correlation black_correlation api_correlation hispanic_correlation native_correlation
ticket_class -0.243730 0.010038 -0.122978 -0.152287 0.128161 -0.003452 -0.029846
age 0.234712 -0.168692 0.165937 -0.059513 -0.044503 -0.058893 0.036010
num_of_siblings_or_spouses_aboard 0.027651 0.025737 0.029292 0.066896 -0.061708 -0.072092 0.138135
num_of_parents_or_children_aboard 0.057575 0.042770 0.048623 0.099354 -0.064993 -0.100496 0.064185
fare 0.053703 0.071300 0.076330 0.061158 -0.001893 -0.067631 0.058121
embarked_Cherbourg -0.073627 -0.013599 -0.093890 -0.075407 -0.007720 0.124144 -0.020478
embarked_Queenstown -0.019206 0.169752 0.110737 -0.049664 -0.049379 0.011407 -0.054550
embarked_Southampton 0.082538 -0.090631 0.011149 0.100265 0.038108 -0.116438 0.050909
sex_female -0.327044 0.615262 0.047330 0.073640 -0.051959 0.074259 0.011737
sex_male 0.327044 -0.615262 -0.047330 -0.073640 0.051959 -0.074259 -0.011737
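
These values can be read as correlations between each model feature and the estimated probability of a user belonging to each group. A rough sketch of how such a table could be built (the package's own group-probability estimation is more involved), assuming a hypothetical group_probs DataFrame row-aligned with the features:

import pandas as pd

def features_groups_correlation(features: pd.DataFrame, group_probs: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation of every feature column with every group-probability column."""
    corr = pd.concat([features, group_probs], axis=1).corr()
    # keep only the feature-vs-group block of the full correlation matrix
    return corr.loc[features.columns, group_probs.columns]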

Fuzzy extraction of first/last names from emails:

bias_detector.fuzzily_get_emails_full_names(emails)

This method returns a DataFrame with first_name and last_name columns fuzzily extracted from the users' emails. Note that the accuracy of this method varies between emails and data sets.

Sample output for synthetic emails:

email first_name last_name
holleybeverly@gmail.com holley beverly
breweradrienne@gmail.com adrienne brewer
craigreed@gmail.com craig reed
battagliahenry@gmail.com henry battaglia
apaget@gmail.com paget
briana@gmail.com briana
apena@gmail.com pena
jacka@gmail.com jacka
mattiea@gmail.com mattie
patricia_calder@gmail.com patricia calder
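
If you only have user emails, one possible workflow (a sketch; rows where no name could be extracted, as in some of the examples above, would need extra handling) is to chain the fuzzy extraction into the bias report:

# Extract names from the emails, then reuse them for the bias report.
names = bias_detector.fuzzily_get_emails_full_names(emails)
bias_report = bias_detector.get_bias_report(first_names=names['first_name'],
                                            last_names=names['last_name'],
                                            zip_codes=zip_codes,  # if available
                                            y_true=y_true, y_pred=y_pred)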

Contributing

See CONTRIBUTING.md.

References

  1. Elhanan Mishraky, Aviv Ben Arie, Yair Horesh, Shir Meir Lador (2021). Bias Detection by Using Name Disparity Tables Across Protected Groups. DOI: 10.1016/j.jrt.2021.100020
  2. Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, Aram Galstyan (2019). A Survey on Bias and Fairness in Machine Learning.
  3. Moritz Hardt, Eric Price, Nathan Srebro (2016). Equality of Opportunity in Supervised Learning.
  4. Ioan Voicu (2018). Using First Name Information to Improve Race and Ethnicity Classification. Statistics and Public Policy, 5:1, 1-13. DOI: 10.1080/2330443X.2018.1427012
