Skip to main content

Identify bias and measure fairness of your data

Project description

FairLens Logo

Open In Colab Documentation Status CI PyPI PyPI - Downloads Python version License Code style: black GitHub Repo stars

FairLens

FairLens is an open source Python library for automatically discovering bias and measuring fairness in data. The package can be used to quickly identify bias, and provides multiple metrics to measure fairness across a range of sensitive and legally protected characteristics such as age, race and sex.

Bias in my data?

It's very simple to quickly start understanding any biases that may be present in your data.

import pandas as pd
import fairlens as fl

# Load in the data
df = pd.read_csv("datasets/compas.csv")

# Automatically generate a report
fscorer = fl.FairnessScorer(
    df,
    target_attribute="RawScore",
    sensitive_attributes=[
        "Sex",
        "Ethnicity",
        "MaritalStatus"
    ]
)
fscorer.demographic_report()
Sensitive Attributes: ['Ethnicity', 'MaritalStatus', 'Sex']

                         Group Distance  Proportion  Counts   P-Value
African-American, Single, Male    0.249    0.291011    5902 3.62e-251
      African-American, Single    0.202    0.369163    7487 1.30e-196
                       Married    0.301    0.134313    2724 7.37e-193
        African-American, Male    0.201    0.353138    7162 4.03e-188
                 Married, Male    0.281    0.108229    2195 9.69e-139
              African-American    0.156    0.444899    9023 3.25e-133
                      Divorced    0.321    0.063754    1293 7.51e-112
            Caucasian, Married    0.351    0.049504    1004 7.73e-106
                  Single, Male    0.121    0.582910   11822  3.30e-95
           Caucasian, Divorced    0.341    0.037473     760  1.28e-76

Weighted Mean Statistical Distance: 0.14081832462333957

Check out the documentation to get started, or try out FairLens now in Google Colab!

See some of our previous blog posts for our take on bias and fairness in ML:

Core Features

Some of the main features of Fairlens are:

  • Measuring Bias - FairLens can be used to measure the extent and significance of biases in datasets using a wide range of statistical distances and metrics.

  • Sensitive Attribute and Proxy Detection - Data Scientists may be unaware of protected or sensitive attributes in their data, and potentially hidden correlations between these columns and other non-protected columns in their data. FairLens can quickly identify sensitive columns and flag hidden correlations and the non-sensitive proxies.

  • Visualization Tools - FairLens has a range of tools that be used to generate meaningful and descriptive diagrams of different distributions in the dataset before delving further in to quantify them. For instance, FairLens can be used to visualize the distribution of a target with respect to different sensitive demographics, or a correlation heatmap.

  • Fairness Scorer - The fairness scorer is a simple tool which data scientists can use to get started with FairLens. It is designed to just take in a dataset and a target variable and to automatically generate a report highlighting hidden biases, correlations, and containing various diagrams.

The goal of FairLens is to enable data scientists to gain a deeper understanding of their data, and helps to to ensure fair and ethical use of data in analysis and machine learning tasks. The insights gained from FairLens can be harnessed by the Bias Mitigation feature of the Synthesized platform, which is able to automagically remove bias using the power of synthetic data.

Installation

FairLens can be installed using pip

pip install fairlens

Contributing

FairLens is under active development, and we appreciate community contributions. See CONTRIBUTING.md for how to get started.

The repository's current roadmap is maintained as a Github project here.

License

This project is licensed under the terms of the BSD 3 license.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fairlens-0.1.0.tar.gz (1.8 MB view details)

Uploaded Source

Built Distribution

fairlens-0.1.0-py3-none-any.whl (36.1 kB view details)

Uploaded Python 3

File details

Details for the file fairlens-0.1.0.tar.gz.

File metadata

  • Download URL: fairlens-0.1.0.tar.gz
  • Upload date:
  • Size: 1.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.3 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.0 CPython/3.9.6

File hashes

Hashes for fairlens-0.1.0.tar.gz
Algorithm Hash digest
SHA256 8f5ae43370cf1a7e87594e5af228714c0c5b720effda0d1bc1d26c55690222b5
MD5 cb8447c09b9af3afb15097121fe3caf3
BLAKE2b-256 d847ac830df22760b0be93fb7e586296e149e73c539ca173d2d2c5ba3941d8ee

See more details on using hashes here.

File details

Details for the file fairlens-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: fairlens-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 36.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.3 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.0 CPython/3.9.6

File hashes

Hashes for fairlens-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 41be0f3a052e846641245120c8ccc6c5f3a4d2b45e19116535a6498ad3c17193
MD5 16d0e91e60add13b5000152919fffb86
BLAKE2b-256 bf4c66f3abfb8bb25ab457a5c435ce74617def2e8878f98d96a3d8b4f7b57c6d

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page