unbias-plus

Bias detection and debiasing in text: identify biased segments, classify severity, get reasoning and neutral replacements per segment, and a full neutral rewrite. Structured output (binary label, severity, biased segments with offsets) via CLI, REST API, or Python.

Overview

Input text → analysis → validated BiasResult: binary label (biased/unbiased), overall severity (0–10), biased_segments (original phrase, replacement, severity, bias type, reasoning, character offsets), and full unbiased_text. Entry points: CLI (unbias-plus), REST API (FastAPI + demo UI), or Python (UnBiasPlus).
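As a rough picture of the result shape described above, the sketch below mirrors the listed fields with plain dataclasses. The package itself uses Pydantic models in schema.py. The class names BiasedSegmentSketch and BiasResultSketch, the exclusive end offset, and the sample values are assumptions made for illustration; the values are hand-written, not model output.

```python
from dataclasses import dataclass, field


@dataclass
class BiasedSegmentSketch:
    """Illustrative stand-in for BiasedSegment (hypothetical class name)."""

    original: str      # biased phrase as it appears in the input
    replacement: str   # suggested neutral phrasing
    severity: str      # per-segment severity, e.g. "low", "medium", "high"
    bias_type: str     # e.g. "loaded_language"
    reasoning: str     # why the phrase is considered biased
    start: int         # character offset where the phrase begins
    end: int           # character offset where it ends (assumed exclusive)


@dataclass
class BiasResultSketch:
    """Illustrative stand-in for BiasResult (hypothetical class name)."""

    binary_label: str  # "biased" or "unbiased"
    severity: int      # overall severity, 0-10
    unbiased_text: str
    biased_segments: list[BiasedSegmentSketch] = field(default_factory=list)


text = "Women are too emotional to lead."
phrase = "too emotional to lead"
start = text.index(phrase)

# Hand-written sample values, not real model output.
segment = BiasedSegmentSketch(
    original=phrase,
    replacement="(neutral replacement goes here)",
    severity="high",
    bias_type="stereotypical_association",
    reasoning="(explanation goes here)",
    start=start,
    end=start + len(phrase),
)
result = BiasResultSketch(
    binary_label="biased",
    severity=8,
    unbiased_text="(full neutral rewrite goes here)",
    biased_segments=[segment],
)

# The offsets slice the original phrase back out of the input text.
print(text[segment.start:segment.end])
```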

Project structure:

unbias-plus/
├── src/unbias_plus/
│   ├── __init__.py      # UnBiasPlus, BiasResult, BiasedSegment, serve
│   ├── cli.py           # unbias-plus entry point (--text, --file, --serve)
│   ├── api.py           # FastAPI app, /health, /analyze, serve()
│   ├── pipeline.py      # UnBiasPlus: prompt → model → parse → result
│   ├── model.py         # UnBiasModel: load LM, generate(), 4-bit optional
│   ├── prompt.py        # build_prompt(text), system prompt
│   ├── parser.py        # parse_llm_output() → BiasResult
│   ├── schema.py        # BiasResult, BiasedSegment (Pydantic)
│   ├── formatter.py     # format_cli, format_dict, format_json
│   └── demo/            # bundled web UI (served at / when using --serve)
│       ├── static/      # script.js, style.css
│       └── templates/   # index.html
├── tests/
│   ├── conftest.py      # fixtures (sample_result, sample_json, …)
│   └── unbias_plus/     # test_api, test_pipeline, test_parser, …
├── pyproject.toml
└── README.md

Features

Bias detection: Identifies biased phrases in text and returns them as segments with character-level offsets for highlighting.
Classification: Binary label (biased/unbiased), per-segment severity (low/medium/high), and bias type (e.g. loaded_language, stereotypical_association).
Reasoning: Each segment includes an explanation of why it is considered biased.
Debiasing: Per-segment neutral replacements and a full rewritten unbiased_text.
Structured output: Pydantic-validated BiasResult with binary_label, severity (0–10), biased_segments, and unbiased_text.
Demo UI: --serve launches a FastAPI server that also serves a visual web interface at http://localhost:8000.
CLI: Analyze from the command line with --text or --file, or start the API + UI with --serve. Optional 4-bit quantization and JSON output.
REST API: FastAPI server with /health and /analyze (POST JSON {"text": "..."}). Model loaded at startup via lifespan.
Python API: Use UnBiasPlus in code; call analyze(), analyze_to_cli(), analyze_to_dict(), or analyze_to_json().
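To illustrate how character-level offsets can drive highlighting, here is a minimal sketch that wraps each span in markers. The highlight helper and the [[ ]] markers are hypothetical and not part of the package; it assumes spans do not overlap and that end offsets are exclusive.

```python
def highlight(
    text: str,
    spans: list[tuple[int, int]],
    open_mark: str = "[[",
    close_mark: str = "]]",
) -> str:
    """Wrap each (start, end) span of text in markers.

    Assumes non-overlapping spans with exclusive end offsets.
    """
    pieces = []
    cursor = 0
    for start, end in sorted(spans):
        pieces.append(text[cursor:start])                          # untouched text before the span
        pieces.append(open_mark + text[start:end] + close_mark)    # the marked span
        cursor = end
    pieces.append(text[cursor:])                                   # trailing text after the last span
    return "".join(pieces)


text = "Women are too emotional to lead."
marked = highlight(text, [(10, 31)])
print(marked)  # Women are [[too emotional to lead]].
```

With a real result, the spans would come from (seg.start, seg.end) for each entry in biased_segments.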

Requirements

Python ≥3.10, <3.12
CUDA 12.4 recommended (PyTorch + CUDA deps in pyproject.toml). CPU is supported with device="cpu".
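A quick way to confirm that an interpreter falls inside the supported range before installing is sketched below. The helper name python_supported is hypothetical; only the version bounds come from the requirement above.

```python
import sys


def python_supported(version=None) -> bool:
    """Return True if the (major, minor) version satisfies >=3.10, <3.12."""
    major_minor = tuple((version or sys.version_info)[:2])
    return (3, 10) <= major_minor < (3, 12)


if __name__ == "__main__":
    status = "supported" if python_supported() else "outside the supported range"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is {status}")
```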

Installation

The project uses uv for dependency management. Install uv, then from the project root:

uv sync
source .venv/bin/activate   # or .venv\Scripts\activate on Windows

For development (tests, linting, type checking):

uv sync --dev
source .venv/bin/activate

Optional: flash-attn (GPU only)

For training or faster inference with flash attention, install the train extra (requires CUDA/nvcc to build):

uv sync --extra train
# On HPC: load CUDA first, e.g. module load cuda/12.4.0

Default uv sync does not install flash-attn, so CI and CPU-only setups work without it.

Usage

Command line

# Analyze a string
unbias-plus --text "Women are too emotional to lead."

# Analyze a file, output JSON
unbias-plus --file article.txt --json

# Start API server + demo UI (default model, port 8000)
unbias-plus --serve
unbias-plus --serve --model path/to/model --port 8000
unbias-plus --serve --load-in-4bit   # reduce VRAM

Options: --model, --load-in-4bit, --max-new-tokens, --host, --port, --json.
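For scripting, the CLI can also be driven from Python. The sketch below builds an argument list from the flags shown above and parses the output. The helpers build_command and analyze_via_cli are hypothetical, and two things are assumptions: that --json prints only JSON to stdout, and that exactly one of --text or --file should be given.

```python
import json
import subprocess


def build_command(text=None, file=None, model=None, load_in_4bit=False):
    """Build an unbias-plus invocation that requests JSON output."""
    # Restricting to exactly one input source is this sketch's own choice.
    if (text is None) == (file is None):
        raise ValueError("pass exactly one of text or file")
    cmd = ["unbias-plus"]
    cmd += ["--text", text] if text is not None else ["--file", file]
    if model:
        cmd += ["--model", model]
    if load_in_4bit:
        cmd.append("--load-in-4bit")
    cmd.append("--json")
    return cmd


def analyze_via_cli(**kwargs) -> dict:
    """Run the CLI and parse stdout (assumes stdout contains only JSON)."""
    completed = subprocess.run(
        build_command(**kwargs), capture_output=True, text=True, check=True
    )
    return json.loads(completed.stdout)


print(build_command(file="article.txt", load_in_4bit=True))
```

Passing the arguments as a list avoids shell quoting issues when the input text contains quotes or spaces.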

Test the model (CLI)

After uv sync (and optionally uv sync --extra train on a GPU machine), verify the pipeline with:

# Default install (no flash-attn); use a small model or --load-in-4bit on GPU
uv run unbias-plus --text "Women are too emotional to lead."

# With your own model path
uv run unbias-plus --text "Some biased sentence." --model path/to/your/model

# JSON output
uv run unbias-plus --text "Test." --json

Or in Python (same env):

uv run python -c "
from unbias_plus import UnBiasPlus
pipe = UnBiasPlus()  # or UnBiasPlus('your-model-id', load_in_4bit=True)
text = 'Women are too emotional to lead.'
print(pipe.analyze_to_cli(text))
"

REST API + Demo UI

Start the server with unbias-plus --serve (or serve() in Python). This starts a single FastAPI server that:

Serves the visual demo UI at http://localhost:8000/
Exposes GET /health → {"status": "ok", "model": "<model_name_or_path>"}
Exposes POST /analyze → Body: {"text": "Your text here"}. Returns JSON matching BiasResult.

Programmatic start:

from unbias_plus import serve
serve("your-hf-model-id", port=8000, load_in_4bit=False)
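Once the server is running, the POST /analyze endpoint described above can be called from any HTTP client. Here is a minimal standard-library sketch. The helper names build_analyze_request and analyze_remote, the default base URL, and the timeout are assumptions; the path and body shape come from the endpoint description.

```python
import json
import urllib.request


def build_analyze_request(
    text: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build a POST /analyze request with a JSON body of the form {"text": ...}."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_remote(
    text: str, base_url: str = "http://localhost:8000", timeout: float = 120.0
) -> dict:
    """Send the request and decode the JSON response (requires a running server)."""
    request = build_analyze_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_analyze_request("Your text here")
print(req.get_method(), req.full_url)
```

The generous timeout is a guess: generation on a large model can take a while, so adjust it to your hardware.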

Running on a remote server or HPC node: if the server runs on a remote machine, use SSH port forwarding to reach the UI from your local browser:

ssh -L 8000:localhost:8000 user@your-server.com
# or through a login node to a compute node:
ssh -L 8000:gpu-node-hostname:8000 user@login-node.com

Then open http://localhost:8000. If port 8000 is already in use locally, use a different local port (e.g. -L 8001:...) and open http://localhost:8001.

If you're using the VS Code Remote - SSH extension, ports are typically forwarded automatically and listed in the Ports tab.
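If you are unsure which local port is free before setting up the tunnel, a small check like the following can help. The helper first_free_port is hypothetical, and binding to 127.0.0.1 is only an approximation of what ssh -L will attempt on your machine.

```python
import socket


def first_free_port(start: int = 8000, attempts: int = 20, host: str = "127.0.0.1") -> int:
    """Return the first port in [start, start + attempts) that can be bound on host."""
    for port in range(start, min(start + attempts, 65536)):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is taken (or not permitted); try the next one
            return port
    raise RuntimeError(f"no free port found starting at {start}")


if __name__ == "__main__":
    port = first_free_port(8000)
    print(f"ssh -L {port}:localhost:8000 user@your-server.com")
```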

Python API

from unbias_plus import UnBiasPlus, BiasResult, BiasedSegment

pipe = UnBiasPlus("your-hf-model-id", load_in_4bit=False)
result = pipe.analyze("Women are too emotional to lead.")

print(result.binary_label)   # "biased" | "unbiased"
print(result.severity)       # 0–10
print(result.bias_found)     # bool

for seg in result.biased_segments:
    print(seg.original, seg.replacement, seg.severity, seg.bias_type, seg.reasoning)
    print(seg.start, seg.end)  # character offsets in original text

print(result.unbiased_text)  # full neutral rewrite

# Formatted outputs
cli_str  = pipe.analyze_to_cli("...")    # human-readable colored terminal output
d        = pipe.analyze_to_dict("...")   # plain dict
json_str = pipe.analyze_to_json("...")   # pretty-printed JSON string
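The per-segment offsets also make it possible to apply only the suggested replacements, leaving the rest of the text untouched; this is different from unbiased_text, which is a full rewrite. The sketch below assumes the dict from analyze_to_dict() mirrors the BiasResult field names, which is not confirmed here, and that end offsets are exclusive. The helpers make_segment and apply_replacements and the sample data are hand-written for illustration.

```python
def make_segment(text: str, phrase: str, replacement: str) -> dict:
    """Build a segment dict with offsets computed from the text (illustration only)."""
    start = text.index(phrase)
    return {
        "original": phrase,
        "replacement": replacement,
        "start": start,
        "end": start + len(phrase),
    }


def apply_replacements(text: str, segments: list[dict]) -> str:
    """Substitute each replacement at its offsets.

    Works right to left so that earlier offsets stay valid after each edit.
    Assumes non-overlapping segments.
    """
    result = text
    for seg in sorted(segments, key=lambda s: s["start"], reverse=True):
        result = result[: seg["start"]] + seg["replacement"] + result[seg["end"] :]
    return result


# Hand-written sample, not model output.
text = "The senator rammed through a reckless bill."
segments = [
    make_segment(text, "rammed through", "passed"),
    make_segment(text, "reckless", "contested"),
]
print(apply_replacements(text, segments))  # The senator passed a contested bill.
```

With a real result, segments would be d["biased_segments"] from the dict returned by analyze_to_dict(), provided the key names match.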

Training

The Qwen3-8B checkpoint shipped with the demo was trained with supervised fine-tuning (SFT) on the vector-institute/unbias-plus-dataset dataset hosted on Hugging Face.

Standalone scripts that reproduce both stages live in training/, along with a sanity-check inference runner. They depend on the [train] optional extra (peft, trl, unsloth, flash-attn) and require an A100 or comparable GPU.

See training/README.md for details, CLI invocations, and resource sizing.

Development

Tests: pytest (see pyproject.toml for markers). Run from repo root: uv run pytest tests/.
Linting / formatting: ruff (format + lint), config in pyproject.toml.
Type checking: mypy with strict options, mypy_path = ["src", "training"].

Team

Developed by the AI Engineering team at the Vector Institute. UnBias-Plus is licensed under the Apache License 2.0. See LICENSE.md in the repository.

For research collaborations, partnerships, or technical inquiries, please contact Shaina Raza, PhD at shaina.raza@vectorinstitute.ai.

Acknowledgement

Resources used in preparing this research are provided, in part, by the Province of Ontario, the Government of Canada through CIFAR, and companies sponsoring the Vector Institute.

This research is also supported by the European Union's Horizon Europe research and innovation programme under the AIXPERT project (Grant Agreement No. 101214389).

Citation

If you use UnBias-Plus in your research, please cite:

@article{radwan2026unbias,
  title={UnBias-Plus: Detect, Explain, and Rewrite Bias},
  author={Radwan, Ahmed Y and ElKady, Ahmed and Chaduvula, Sindhuja and Hafez, Mohamed and Krishnan, Amrit and Raza, Shaina},
  journal={arXiv preprint arXiv:2606.23412},
  year={2026}
}

Support

Open an issue on GitHub: https://github.com/VectorInstitute/unbias-plus/issues
For contribution guidelines, see CONTRIBUTING.md
