Detect, recognize, and de-bias textual data.

Detecting Bias and ensuring Fairness in AI solutions

This package detects and mitigates bias in NLP tasks. It is an end-to-end framework that takes data in raw form, preprocesses it, detects the various types of bias, and mitigates them. The output is text that is free from bias.

Installation

Use the package manager pip to install Dbias, then install the en_pipeline wheel hosted on Hugging Face:

pip install Dbias
pip install https://huggingface.co/d4data/en_pipeline/resolve/main/en_pipeline-any-py3-none-any.whl
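
After installing, it can help to confirm that both packages are importable before running the examples. The following is a minimal sketch using only the standard library. `is_installed` is a hypothetical helper name, and the module name `en_pipeline` is an assumption inferred from the wheel filename rather than something this page confirms.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


if __name__ == "__main__":
    # "Dbias" is the import name used in the usage examples;
    # "en_pipeline" is assumed from the wheel filename.
    for name in ("Dbias", "en_pipeline"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```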

Usage

To de-bias a news article

from Dbias.text_debiasing import run

# Returns unbiased recommendations for a given sentence fragment.
run("Billie Eilish issues apology for mouthing an anti-Asian derogatory term in a resurfaced video.")

To classify whether a news article is biased

from Dbias.bias_classification import classifier

# Returns the classification label for a given sentence fragment.
classifier("Nevertheless, Trump and other Republicans have tarred the protests as havens for terrorists intent on destroying property.")

To recognize the biased words/phrases

from Dbias.bias_recognition import recognizer

# Returns the biased entities extracted from a given sentence fragment.
recognizer("Christians should make clear that the perpetuation of objectionable vaccines and the lack of alternatives is a kind of coercion.")

To mask out the biased portions of a given sentence fragment

from Dbias.bias_masking import masking

# Returns the given sentence fragment with its biased portions masked out.
masking("The fact that the abortion rate among American blacks is far higher than the rate for whites is routinely chronicled and mourned.")
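
To make the idea of masking concrete, here is a minimal standalone sketch that replaces given phrases with a mask token. It does not use the Dbias models and is not the package's implementation: `mask_phrases` is a hypothetical helper, the `[MASK]` token is an assumption (the token the package's language model expects may differ), and the phrases in the example are hand-picked for illustration rather than output from the recognizer.

```python
import re
from typing import Iterable


def mask_phrases(text: str, phrases: Iterable[str], mask_token: str = "[MASK]") -> str:
    """Replace each whole-word occurrence of the given phrases with mask_token."""
    # Mask longer phrases first so a short phrase cannot split a longer one.
    for phrase in sorted(phrases, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        # A function replacement keeps mask_token literal (no backslash escapes).
        text = re.sub(pattern, lambda _match: mask_token, text, flags=re.IGNORECASE)
    return text


# Hand-picked phrases, for illustration only.
print(mask_phrases(
    "Republicans have tarred the protests as havens for terrorists.",
    ["tarred", "terrorists"],
))
# prints: Republicans have [MASK] the protests as havens for [MASK].
```

A masked language model can then propose candidate words for each masked position, which is the role the third model described in the About section appears to play.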

About

This is a combined pipeline comprising three Transformer models that de-bias, or reduce the amount of bias in, news articles. The three models are:

  • An English sequence classification model, trained on the MBIC dataset, to detect bias and fairness in sentences (news articles). This model was built on top of the distilbert-base-uncased model and trained for 30 epochs with a batch size of 16, a learning rate of 5e-5, and a maximum sequence length of 512.
  • An entity recognition model, trained on the MBIC dataset, to recognize the biased words/phrases in a sentence. This model was built on top of roberta-base as offered by spaCy transformers.
  • A masked language model, pretrained on English text with a masked language modeling (MLM) objective.
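
One way the three stages might fit together is sketched below. This is an assumption about the flow (classify, then recognize, then substitute), not the package's actual code. `debias_sentence` and its three callable parameters are hypothetical names, and the toy lexicon stands in for the real models so the example runs without any downloads.

```python
from typing import Callable, List


def debias_sentence(
    sentence: str,
    is_biased: Callable[[str], bool],
    find_biased_phrases: Callable[[str], List[str]],
    suggest_replacement: Callable[[str, str], str],
) -> str:
    """Chain three stages: classify, recognize, then replace."""
    # Stage 1: leave sentences the classifier considers unbiased untouched.
    if not is_biased(sentence):
        return sentence
    # Stage 2: locate the biased words or phrases.
    phrases = find_biased_phrases(sentence)
    # Stage 3: ask the language-model stand-in for a substitute for each phrase.
    for phrase in phrases:
        sentence = sentence.replace(phrase, suggest_replacement(sentence, phrase))
    return sentence


# Toy stand-ins, for illustration only; they are not the Dbias models.
LEXICON = {"tarred": "described"}
result = debias_sentence(
    "Officials tarred the protests.",
    is_biased=lambda s: any(word in s for word in LEXICON),
    find_biased_phrases=lambda s: [word for word in LEXICON if word in s],
    suggest_replacement=lambda s, phrase: LEXICON[phrase],
)
print(result)  # prints: Officials described the protests.
```

Passing the stages in as callables keeps the sketch independent of how each model is loaded, so the real classifier, recognizer, and language model could be wrapped and swapped in.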

Author

This model is part of the research topic "Bias and Fairness in AI" conducted by Deepak John Reji and Shaina Raza. If you use this work (code, model, or dataset), please star the repository:

Bias & Fairness in AI, (2022), GitHub repository, https://github.com/dreji18/Fairness-in-AI

Please also cite our paper.

License

MIT License

Download files

Source Distribution

Dbias-0.1.5.tar.gz (5.8 kB)

Uploaded Source

Built Distribution

Dbias-0.1.5-py3-none-any.whl (7.8 kB)

Uploaded Python 3

File details

Details for the file Dbias-0.1.5.tar.gz.

File metadata

  • Download URL: Dbias-0.1.5.tar.gz
  • Upload date:
  • Size: 5.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.8.13

File hashes

Hashes for Dbias-0.1.5.tar.gz
Algorithm Hash digest
SHA256 1f4db98b70ca772205f64a2d9da0b62592b93625d4ad1c2bcfb938a3a04075e5
MD5 99c7493a0730399f9bba571452940990
BLAKE2b-256 b0690b5218fa6ac51e3130e81f006d222791e25527eef7bb09d3ac3db20ec4ad
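
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; `sha256_of_file` is a hypothetical helper name, and the archive is assumed to have been downloaded into the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # SHA256 value copied from the listing above for Dbias-0.1.5.tar.gz.
    expected = "1f4db98b70ca772205f64a2d9da0b62592b93625d4ad1c2bcfb938a3a04075e5"
    archive = Path("Dbias-0.1.5.tar.gz")  # assumed download location
    if archive.exists():
        print("match" if sha256_of_file(archive) == expected else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```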

File details

Details for the file Dbias-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: Dbias-0.1.5-py3-none-any.whl
  • Upload date:
  • Size: 7.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.8.13

File hashes

Hashes for Dbias-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 43728f55b18b595769b4e76b2b14d502105f1838b13065da34568175626cc150
MD5 ee173aed6eb3990941be20613cadcef5
BLAKE2b-256 f42d0bc7a0c0901e50f194ca142b945dd9483549595dde265d2b29a7d6aef603
