A library for detecting and analyzing bias in text, datasets, and language models.

BiasCheck: An Open-Source Library for Bias Detection

BiasCheck is a robust and modular Python library designed to analyze and detect bias in text, models, and datasets. It provides tools for researchers, data scientists, and developers to measure various forms of bias (e.g., stereotypical, cultural) and assess the quality of language model outputs or textual data.

Features

Modular Design: BiasCheck offers modular and extensible classes for different bias analysis tasks.
Bias Detection: Analyze text, datasets, language models, or databases for various types of bias.
Support for RAG: Automatically create Retrieval-Augmented Generation (RAG) pipelines using documents or PDFs.
Sentiment Analysis: Assess sentiment polarity alongside bias.
Visualization: Visualize flagged sentences and bias types in your analysis.

Main Classes

1. DocuCheck

Analyze bias in standalone text documents or files.

Key Features:

Accepts text data or documents (e.g., PDF, TXT).
Detects flagged sentences and calculates a bias score.
Optionally uses a list of polarizing terms for context-specific bias detection.

Example:

from biascheck.analysis.docucheck import DocuCheck

data = "This is a sample document that may contain biases."
terms = ["biased", "lazy", "discrimination"]

analyzer = DocuCheck(data=data, terms=terms)
result = analyzer.analyze(verbose=False)
print(result)
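
To make the idea of flagged sentences and a bias score concrete, here is a minimal, self-contained sketch of term-based flagging. It is not BiasCheck's implementation: the helper name `flag_sentences`, the naive sentence splitter, and the score definition (the share of sentences that contain a listed term) are all assumptions chosen for illustration.

```python
import re


def flag_sentences(text, terms):
    """Hypothetical helper: flag sentences that contain any listed term."""
    # Naive split after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Whole-word, case-insensitive patterns, one per term.
    patterns = [re.compile(r"\b" + re.escape(t) + r"\b", re.IGNORECASE) for t in terms]
    flagged = [s for s in sentences if any(p.search(s) for p in patterns)]
    # Assumed score: fraction of sentences that were flagged.
    score = len(flagged) / len(sentences) if sentences else 0.0
    return {"flagged_sentences": flagged, "bias_score": score}


sample = (
    "The new hires are lazy. The office opens at nine. "
    "Discrimination has no place here."
)
sketch_result = flag_sentences(sample, ["biased", "lazy", "discrimination"])
print(sketch_result)
```

A plain term match like this cannot tell a biased statement from one that merely mentions a term, which is one reason a real analyzer would combine it with model-based checks.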

2. SetCheck

Analyze entire datasets (e.g., DataFrames) for skewed or biased records.

Key Features:

Works with Python DataFrames and CSV files.
Adds bias-related columns to the dataset.
Returns flagged records and overall bias analysis.

Example:

from biascheck.analysis.setcheck import SetCheck

data = [{"text": "A biased example."}, {"text": "A neutral sentence."}]
terms = ["bias", "stereotype"]

analyzer = SetCheck(data=data, input_cols=["text"], terms=terms)
flagged_df = analyzer.analyze(top_n=5)
print(flagged_df)
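
The pattern of adding bias-related columns to each record and then returning the flagged ones can be sketched with the standard library alone. Everything below is illustrative: `annotate_records`, the column names `matched_terms` and `flagged`, and the substring matching rule are assumptions, not SetCheck's actual output schema.

```python
import csv
import io


def annotate_records(rows, input_cols, terms):
    """Hypothetical helper: add bias-related columns to each record."""
    lowered = [t.lower() for t in terms]
    annotated = []
    for row in rows:
        # Join the chosen input columns into one lowercase string.
        text = " ".join(str(row.get(col, "")) for col in input_cols).lower()
        # Substring match, so "bias" also matches "biased".
        hits = [t for t in lowered if t in text]
        annotated.append({**row, "matched_terms": hits, "flagged": bool(hits)})
    return annotated


# A small in-memory CSV stands in for a file on disk.
csv_text = "id,text\n1,A biased example.\n2,A neutral sentence.\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

annotated = annotate_records(rows, input_cols=["text"], terms=["bias", "stereotype"])
flagged_rows = [r for r in annotated if r["flagged"]]
print(flagged_rows)
```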

3. ModuCheck

Analyze bias in language model outputs using Hugging Face models.

Key Features:

Supports Hugging Face models and pipelines.
Detects bias in generated outputs based on user-provided topics.
Automatically builds a RAG pipeline if a document is provided.
Saves flagged outputs and bias results to a DataFrame.

Example:

from biascheck.analysis.moducheck import ModuCheck
from transformers import pipeline

# Initialize a Hugging Face pipeline
model = pipeline("text-generation", model="gpt2")
topics = ["The role of gender in leadership", "Cultural diversity"]

analyzer = ModuCheck(model=model, terms=["bias", "stereotype"], document="file.pdf")
result = analyzer.analyze(topics=topics, num_responses=5)
print(result)
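
The overall loop (prompt the model on each topic several times, then flag the outputs) can be sketched without downloading a model. In this sketch `fake_generate` is a stand-in that returns canned text, and `probe_model` and its record fields are hypothetical names, not part of ModuCheck.

```python
def fake_generate(prompt, index):
    """Stand-in for a text-generation model; returns canned text."""
    canned = [
        "Leaders come from every background.",
        "That stereotype about leaders is common.",
    ]
    return f"{prompt}: {canned[index % len(canned)]}"


def probe_model(generate, topics, terms, num_responses=2):
    """Hypothetical helper: collect responses per topic and flag term hits."""
    records = []
    for topic in topics:
        for i in range(num_responses):
            output = generate(topic, i)
            hits = [t for t in terms if t.lower() in output.lower()]
            records.append(
                {
                    "topic": topic,
                    "response": output,
                    "matched_terms": hits,
                    "flagged": bool(hits),
                }
            )
    return records


records = probe_model(
    fake_generate,
    topics=["The role of gender in leadership", "Cultural diversity"],
    terms=["bias", "stereotype"],
)
for record in records:
    print(record["flagged"], record["response"])
```

Passing the generator in as a function keeps the flagging logic independent of any particular model, so the same loop could wrap a Hugging Face pipeline call.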

4. RAGCheck

Analyze bias in RAG pipelines by combining document retrieval and natural language generation.

Key Features:

Builds Retrieval-Augmented Generation pipelines from documents or PDFs.
Supports hypothesis-based contextual bias detection using NLI models.
Integrates FAISS for vectorized document retrieval.
Identifies bias in retrieved content and generated outputs.

Example:

from biascheck.analysis.ragcheck import RAGCheck
from transformers import pipeline

# Initialize a Hugging Face pipeline
model = pipeline("text-generation", model="gpt2")
terms = ["bias", "discrimination"]

analyzer = RAGCheck(model=model, document="sample.pdf", terms=terms, verbose=True)
result = analyzer.analyze(top_n=5)
print(result)
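
The retrieval step of a RAG pipeline can be illustrated with a toy ranker. This sketch deliberately swaps in bag-of-words cosine similarity for the FAISS vector index mentioned above, and it omits the generation and NLI steps; the function names and sample chunks are assumptions made for the example.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Bag-of-words vector: lowercase word -> count."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_n=2):
    """Return the top_n chunks most similar to the query."""
    q = vectorize(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)
    return ranked[:top_n]


chunks = [
    "Hiring panels should use structured interviews.",
    "Discrimination in hiring remains a documented problem.",
    "The cafeteria opens at noon.",
]
top_chunks = retrieve("discrimination in hiring", chunks, top_n=2)

# Check the retrieved context for listed terms before it reaches the model.
terms = ["bias", "discrimination"]
flagged_chunks = [c for c in top_chunks if any(t in c.lower() for t in terms)]
print(top_chunks)
print(flagged_chunks)
```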

5. Visualiser

Visualize the results of bias analysis.

Key Features:

Generates bar charts for flagged bias categories.
Visualizes flagged sentences and bias distribution.

Example:

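from biascheck.visualisation.visualiser import Visualiser

# flagged_records must already exist; it is assumed here to be the
# output of an earlier analysis (e.g. the result of SetCheck.analyze()).
visualiser = Visualiser()
visualiser.plot_bias_categories(flagged_records)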
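
As a rough picture of what a bar chart of flagged bias categories summarizes, the sketch below counts categories and renders text bars. The record schema (a `category` key per record) and both helper names are assumptions for illustration; the Visualiser's real input format and plotting backend are not documented here.

```python
from collections import Counter


def bias_category_counts(flagged_records):
    """Count how many flagged records fall into each category."""
    return Counter(r["category"] for r in flagged_records)


def render_bars(counts, width=20):
    """Render one text bar per category, scaled to the largest count."""
    if not counts:
        return ""
    peak = max(counts.values())
    label_width = max(len(c) for c in counts)
    lines = []
    for category, n in counts.most_common():
        bar = "#" * max(1, round(width * n / peak))
        lines.append(f"{category:<{label_width}} {bar} {n}")
    return "\n".join(lines)


# Hypothetical flagged records with an assumed "category" field.
sample_records = [
    {"sentence": "Example A", "category": "stereotypical"},
    {"sentence": "Example B", "category": "cultural"},
    {"sentence": "Example C", "category": "stereotypical"},
]
counts = bias_category_counts(sample_records)
chart = render_bars(counts)
print(chart)
```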

6. BaseCheck (under construction)

Analyze bias in databases, following the same approach as the other analysis classes.

Key Features:

Database Compatibility: Supports both vector databases (e.g., FAISS) and graph databases (e.g., Neo4j).
Saves flagged outputs and bias results to a DataFrame.

Installation

Prerequisites

Python 3.9 or 3.10
pip (Python package installer)
For GPU support: CUDA-compatible GPU and CUDA toolkit
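
A quick way to confirm that the active interpreter meets the version prerequisite is a small check like the one below. The helper name `is_supported` is hypothetical; the supported set simply mirrors the versions listed above.

```python
import sys

# Versions listed under Prerequisites.
SUPPORTED = {(3, 9), (3, 10)}


def is_supported(version_info=None):
    """Return True if the (major, minor) version is in the supported set."""
    major, minor = tuple(version_info or sys.version_info)[:2]
    return (major, minor) in SUPPORTED


current = f"{sys.version_info.major}.{sys.version_info.minor}"
print(f"Python {current} supported: {is_supported()}")
```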

Basic Installation

For CPU-only installation:

pip install biascheck

Optional Dependencies

For GPU support (requires CUDA-compatible GPU):

pip install "biascheck[gpu]"

For development and testing:

pip install "biascheck[test]"

For all features (GPU + testing):

pip install "biascheck[all]"
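
After installing, the standard library can report which version of a package is present in the current environment. This is a generic sketch using `importlib.metadata`; the helper name `installed_version` is an assumption, and it returns `None` when the package is not installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("biascheck:", installed_version("biascheck"))
```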

Platform-Specific Notes

macOS

No additional requirements for basic installation
CUDA is not available on current macOS (NVIDIA ended macOS support after CUDA 10.2), so CUDA-based GPU support cannot be used there; use the CPU-only installation

Linux

No additional requirements for basic installation
For GPU support, ensure CUDA toolkit is installed
Some distributions may require additional system packages for PDF processing

Windows

No additional requirements for basic installation
For GPU support, ensure CUDA toolkit is installed
May require Visual C++ Redistributable for some dependencies

Troubleshooting

If you encounter any installation issues:

Ensure you're using Python 3.9 or 3.10

Try creating a fresh virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install biascheck

For GPU-related issues, verify CUDA installation:
```
nvidia-smi  # Should show GPU information
```
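
The same check can be made from Python, which also shows whether PyTorch itself can see the GPU. This is a sketch, not part of BiasCheck: `gpu_diagnostics` is a hypothetical helper, and it treats a missing `torch` install as "CUDA not available" rather than raising.

```python
import shutil


def gpu_diagnostics():
    """Report whether nvidia-smi is on PATH and whether PyTorch sees CUDA."""
    report = {"nvidia_smi_on_path": shutil.which("nvidia-smi") is not None}
    try:
        import torch  # Optional: only present if PyTorch is installed.
    except ImportError:
        report["torch_installed"] = False
        report["cuda_available"] = False
    else:
        report["torch_installed"] = True
        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


gpu_report = gpu_diagnostics()
print(gpu_report)
```

If `nvidia_smi_on_path` is true but `cuda_available` is false, the usual suspect is a CPU-only PyTorch build rather than the driver.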

If specific dependencies fail, try installing them separately:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install biascheck

System Requirements

Minimum 4GB RAM (8GB recommended)
2GB free disk space
For GPU support: NVIDIA GPU with CUDA support

Usage

Run Examples

The notebooks/ directory contains example scripts for all analysis classes:

python notebooks/moducheck_example.py
python notebooks/docucheck_example.py

Contributing

We welcome contributions! Please fork the repository, make your changes, and submit a pull request. Ensure all new features are covered with appropriate tests.

Future Work

Multimodal Support: Expand the library to include image, video, and audio bias detection.
Enhanced RAG Pipelines: Improve integration with custom retrievers.
Advanced Bias Categories: Expand predefined bias categories for deeper contextual analysis.

Contact

For questions, suggestions, or feedback, reach out to the project maintainer:

Name: Arjun Balaji

Release history

0.8.10 (Apr 25, 2025)
0.8.9 (Apr 25, 2025), this version
0.8.8 (Apr 25, 2025)
0.8.7 (Apr 25, 2025)

Download files

Source Distribution

biascheck-0.8.9.tar.gz (21.7 kB)

Uploaded Apr 25, 2025 Source

Built Distribution

biascheck-0.8.9-py3-none-any.whl (27.9 kB)

Uploaded Apr 25, 2025 Python 3

File details

Details for the file biascheck-0.8.9.tar.gz.

File metadata

Download URL: biascheck-0.8.9.tar.gz
Upload date: Apr 25, 2025
Size: 21.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.17

File hashes

Hashes for biascheck-0.8.9.tar.gz
Algorithm	Hash digest
SHA256	`6586707d31b4e896a259446f7495ad74b6271890a20ee020e231bb04e195f9ed`
MD5	`7ec8283c33acd098d4ef5a9a20b1f70a`
BLAKE2b-256	`2c557738bb7da553cb8f1a98e0b3d7931072b200346114e707c5701455217764`
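
A downloaded file can be checked against a published SHA256 digest with `hashlib`. The sketch below is a generic verification routine; the helper names are hypothetical, and the demo hashes a temporary file so that it runs without the real archive.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Return True if the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo on a temporary file so the example is runnable as-is.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name
demo_ok = matches_digest(demo_path, hashlib.sha256(b"hello").hexdigest())
os.remove(demo_path)
print("demo digest matches:", demo_ok)

# For the real archive, pass the SHA256 value from the table above:
# matches_digest(
#     "biascheck-0.8.9.tar.gz",
#     "6586707d31b4e896a259446f7495ad74b6271890a20ee020e231bb04e195f9ed",
# )
```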

File details

Details for the file biascheck-0.8.9-py3-none-any.whl.

File metadata

Download URL: biascheck-0.8.9-py3-none-any.whl
Upload date: Apr 25, 2025
Size: 27.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.17

File hashes

Hashes for biascheck-0.8.9-py3-none-any.whl
Algorithm	Hash digest
SHA256	`15abb12b3e4a770e0a5d25ed142734c997d67668eca40f4ac0e097e933778a81`
MD5	`9c534159b8eb8ce050132d2c539a4506`
BLAKE2b-256	`bb225d4602d12b73be97dae79e2acd519cb34eb36e9f98316fab05e604264fff`
