A package based on LLMs for detecting bias, performing named entity recognition, and debiasing text.
Project description
UnBIAS - Text Analysis & Debiasing Toolkit
UnBIAS is a text analysis and debiasing toolkit that helps assess and rectify biases in textual content. Built on state-of-the-art Transformer models, the toolkit offers:
Features
- Bias Classification: Evaluate textual content and classify its level of bias.
- Named Entity Recognition for Bias: Detect specific terms or entities in the text that may carry biased sentiment.
- Text Debiasing: Process any text and receive a debiased version in return. This ensures the content is neutral with respect to gender, race, and age group, and is free from toxic or harmful language.
**Our models are built on BERT, RoBERTa, and quantized Meta Llama-2-7B models.**
Additional Highlights
- Pre-trained Models: Uses specialized models from Hugging Face's Transformers library, tailored for bias detection and debiasing tasks.
- Efficient Pipelines: Intuitive pipelines make the toolkit easy to incorporate into applications or other projects.
- Analytical Tools: Utilities transform results into structured data for further analysis.
Installation
To install UnBIAS, use pip:
pip install UnBIAS
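To confirm that the installation succeeded, the installed version can be queried from Python's standard library. This is a minimal sketch, not part of UnBIAS itself: the helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the name `UnBIAS`, as in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "UnBIAS" is the distribution name used in the pip command above.
    found = installed_version("UnBIAS")
    print(f"UnBIAS version: {found}" if found else "UnBIAS is not installed")
```

Returning `None` rather than raising keeps the check easy to use in setup scripts that want to decide whether to install the package.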
from UnBIAS import run_pipeline_on_texts
# Define your test sentences
test_sentences = [
    "Women are just too emotional to be leaders.",
    "All young people are lazy and addicted to their phones.",
    "People from that country are always dishonest and corrupt.",
    "Men are inherently better at science and math than women.",
    "The elderly are just burdens to society; they contribute nothing.",
    "All poor people are lazy and don't want to work.",
    "People who follow that religion are all terrorists.",
    "Immigrants are taking all our jobs and ruining the country.",
    "All wealthy people achieved success only through manipulation and deceit.",
    "People with tattoos are not professional.",
    "All politicians are liars and cannot be trusted.",
    "Blue-collar workers don't have any real skills.",
    "Anyone who doesn't attend college is unintelligent.",
    "Only people from the city are sophisticated.",
    "Rural folks have backward mindsets and don't understand progress.",
    "All artists are always broke and unreliable.",
    "Anyone from that region is uneducated and narrow-minded.",
    "People without jobs are simply not trying hard enough."
]
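# Run the pipeline on the test sentences
results = run_pipeline_on_texts(test_sentences)
# Preview the first rows (displays in a notebook; wrap in print() in a script)
results.head()
# Save the full results to a CSV file
results.to_csv('UnBIAS-results.csv')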
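Once the CSV has been written, it can be inspected without loading any models. The sketch below uses only the standard library and makes no assumption about which columns UnBIAS produces; the helper name `summarize_csv` and its optional `column` argument are hypothetical. If `results` is a pandas DataFrame (which the `head()` and `to_csv()` calls suggest, though this page does not state it), the first column of the file will be the unnamed index.

```python
import csv
from collections import Counter
from pathlib import Path


def summarize_csv(path, column=None):
    """Return (column_names, row_count, value_counts) for a CSV file.

    value_counts tallies the values of `column` when that column exists;
    otherwise it is an empty Counter.
    """
    counts = Counter()
    row_count = 0
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        columns = list(reader.fieldnames or [])
        for row in reader:
            row_count += 1
            if column is not None and column in row:
                counts[row[column]] += 1
    return columns, row_count, counts


if __name__ == "__main__":
    # File name taken from the to_csv() call above.
    csv_path = Path("UnBIAS-results.csv")
    if csv_path.exists():
        columns, row_count, _ = summarize_csv(csv_path)
        print(f"{row_count} rows; columns: {columns}")
    else:
        print(f"{csv_path} not found; run the pipeline first")
```

Printing the column names first is a cheap way to discover the output schema before writing any analysis that depends on specific fields.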
Documentation
Visit the documentation for more detailed instructions and examples.
License
This project is licensed under the MIT License - see the LICENSE.md file for details.
Contact
Shaina Raza, PhD
Applied Machine Learning Scientist - Responsible AI
Vector Institute for Artificial Intelligence
For any queries or feedback, feel free to contact Shaina Raza at Shaina.raza@utoronto.ca.
We hope UnBIAS proves useful in your journey to make the digital world a more inclusive and unbiased space.
File details
Details for the file UnBIAS-3.0.2.tar.gz.
File metadata
- Download URL: UnBIAS-3.0.2.tar.gz
- Upload date:
- Size: 5.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | 27c5b158fe3c9ea863396c385401945ecfc2bd7a7e90acde24239fede84747bd
MD5 | c16baf906ad119c9d65f904824571050
BLAKE2b-256 | a8c58a0680734bf4f0e8f2c7bb97adda50f28b783ac16437cfc7e252a779c486
File details
Details for the file UnBIAS-3.0.2-py3-none-any.whl.
File metadata
- Download URL: UnBIAS-3.0.2-py3-none-any.whl
- Upload date:
- Size: 6.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | 96ecf19dc7262746e7d0a6b5a88f29bafdb5a1d7d0b839ef70731d0673d50d92
MD5 | 110e4277bd2b0eda04a923b3b7e6d8fd
BLAKE2b-256 | 7a43c09375da41594300710836b383bb6a2ae97e34e2644a0acd4e6deb296de7
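The SHA256 digests listed above can be used to check that a downloaded file is intact. The following is a minimal sketch using Python's `hashlib`; the helper names `sha256_of` and `matches` are hypothetical, and the local path assumes the source distribution was saved to the current directory under its original file name.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Digest copied from the hash table for UnBIAS-3.0.2.tar.gz above.
    expected = "27c5b158fe3c9ea863396c385401945ecfc2bd7a7e90acde24239fede84747bd"
    sdist = Path("UnBIAS-3.0.2.tar.gz")  # assumed local download location
    if sdist.exists():
        print("OK" if matches(sdist, expected) else "MISMATCH")
    else:
        print(f"{sdist} not found")
```

Reading in chunks keeps memory use constant regardless of file size, which matters little for a 5.9 kB archive but makes the helper reusable for larger downloads.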