
Identify bias and measure fairness of your data

Project description

This is a fork of the FairLens project. We went back to version 0.1.0 and modified some dependencies so that it can be included in gabarit (https://github.com/OSS-Pole-Emploi/gabarit).


FairLens

FairLens is an open source Python library for automatically discovering bias and measuring fairness in data. The package can be used to quickly identify bias, and provides multiple metrics to measure fairness across a range of sensitive and legally protected characteristics such as age, race and sex.

Bias in my data?

It's very simple to quickly start understanding any biases that may be present in your data.

import pandas as pd
import fairlens as fl

# Load in the data
df = pd.read_csv("datasets/compas.csv")

# Automatically generate a report
fscorer = fl.FairnessScorer(
    df,
    target_attribute="RawScore",
    sensitive_attributes=[
        "Sex",
        "Ethnicity",
        "MaritalStatus"
    ]
)
fscorer.demographic_report()
Sensitive Attributes: ['Ethnicity', 'MaritalStatus', 'Sex']

                         Group Distance  Proportion  Counts   P-Value
African-American, Single, Male    0.249    0.291011    5902 3.62e-251
      African-American, Single    0.202    0.369163    7487 1.30e-196
                       Married    0.301    0.134313    2724 7.37e-193
        African-American, Male    0.201    0.353138    7162 4.03e-188
                 Married, Male    0.281    0.108229    2195 9.69e-139
              African-American    0.156    0.444899    9023 3.25e-133
                      Divorced    0.321    0.063754    1293 7.51e-112
            Caucasian, Married    0.351    0.049504    1004 7.73e-106
                  Single, Male    0.121    0.582910   11822  3.30e-95
           Caucasian, Divorced    0.341    0.037473     760  1.28e-76

Weighted Mean Statistical Distance: 0.14081832462333957
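
To give a feel for the Group, Distance, Proportion and Counts columns in the report above, here is a minimal sketch that computes comparable quantities by hand. It is not FairLens's implementation: it assumes the distance is a Kolmogorov-Smirnov distance between a group's target values and everyone else's, and the rows, `ks_distance` and `group_summary` are hypothetical names and toy values made up for illustration.

```python
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Largest gap between the empirical CDFs of two numeric samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def group_summary(rows, group):
    """Compare one group's scores against the scores of all other rows."""
    inside = [score for g, score in rows if g == group]
    outside = [score for g, score in rows if g != group]
    return {
        "distance": ks_distance(inside, outside),
        "proportion": len(inside) / len(rows),
        "count": len(inside),
    }


# Toy (group, score) rows; assumed values, not taken from the COMPAS data.
rows = [
    ("A", 1), ("A", 2), ("A", 2), ("A", 3),
    ("B", 3), ("B", 4), ("B", 4), ("B", 5),
]

summary = group_summary(rows, "A")
print(summary)
```

A distance near 0 means the group's scores are distributed much like everyone else's, while a value near 1 means the two distributions barely overlap.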

Check out the documentation to get started, or try out FairLens now in Google Colab!

See our previous blog posts for our take on bias and fairness in ML.

Core Features

Some of the main features of FairLens are:

  • Measuring Bias - FairLens can be used to measure the extent and significance of biases in datasets using a wide range of statistical distances and metrics.

  • Sensitive Attribute and Proxy Detection - Data scientists may be unaware of protected or sensitive attributes in their data, and of hidden correlations between these columns and other non-protected columns. FairLens can quickly identify sensitive columns and flag the non-sensitive columns that act as proxies for them.

  • Visualization Tools - FairLens has a range of tools that can be used to generate meaningful and descriptive diagrams of the distributions in a dataset before delving further to quantify them. For instance, FairLens can be used to visualize the distribution of a target with respect to different sensitive demographics, or to draw a correlation heatmap.

  • Fairness Scorer - The fairness scorer is a simple tool that data scientists can use to get started with FairLens. It takes in a dataset and a target variable and automatically generates a report that highlights hidden biases and correlations and includes various diagrams.
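
Proxy detection, as described in the list above, comes down to measuring how strongly a non-sensitive column is associated with a sensitive one. The sketch below illustrates the idea with Cramér's V for categorical columns. It is an assumption-laden illustration rather than the method FairLens uses: `cramers_v`, `find_proxies`, the 0.8 threshold and the toy columns are all hypothetical.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V association between two categorical sequences (0 to 1)."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals, y_totals = Counter(xs), Counter(ys)
    chi2 = 0.0
    for x, x_count in x_totals.items():
        for y, y_count in y_totals.items():
            expected = x_count * y_count / n
            chi2 += (joint[(x, y)] - expected) ** 2 / expected
    k = min(len(x_totals), len(y_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


def find_proxies(columns, sensitive, threshold=0.8):
    """Names of non-sensitive columns strongly associated with `sensitive`."""
    return [
        name
        for name, values in columns.items()
        if name != sensitive
        and cramers_v(columns[sensitive], values) >= threshold
    ]


# Toy data: "Title" mirrors "Sex" exactly, "City" only weakly.
columns = {
    "Sex": ["M", "M", "F", "F", "M", "F"],
    "Title": ["Mr", "Mr", "Ms", "Ms", "Mr", "Ms"],
    "City": ["X", "Y", "X", "Y", "X", "Y"],
}

proxies = find_proxies(columns, "Sex")
print(proxies)
```

Dropping a sensitive column is not enough when a flagged proxy remains in the data, because a model can still recover the sensitive attribute from it.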

The goal of FairLens is to enable data scientists to gain a deeper understanding of their data and to help ensure fair and ethical use of data in analysis and machine learning tasks. The insights gained from FairLens can be harnessed by the Bias Mitigation feature of the Synthesized platform, which is able to automagically remove bias using the power of synthetic data.

Installation

FairLens can be installed using pip:

pip install fairlens
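
After installing, it can be useful to confirm which distribution and version ended up in the environment. The following is a small sketch using only the standard library. The name `fairlens` comes from the pip command above, while `fairlens-pe` is an assumed distribution name for this fork, and `installed_version` is a hypothetical helper.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "fairlens" is the name used in the pip command; "fairlens-pe" is an
# assumed name for this fork's distribution.
for name in ("fairlens", "fairlens-pe"):
    print(name, installed_version(name))
```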

Contributing

FairLens is under active development, and we appreciate community contributions. See CONTRIBUTING.md for how to get started.

The repository's current roadmap is maintained as a GitHub project.

License

This project is licensed under the terms of the BSD 3-Clause license.
