Toxicity Detector

An LLM-based pipeline for detecting toxic speech.

Setup

This project uses uv for dependency management.

Prerequisites

Python 3.12 or higher
uv package manager

Installation

Install uv (if not already installed) by following the installation instructions in the uv documentation.

Clone the repository:

```
git clone https://github.com/debatelab/toxicity-detector.git
cd toxicity-detector
```

Install dependencies:
```
uv sync
```

This creates a virtual environment and installs all dependencies specified in pyproject.toml.

Install development dependencies (optional):
```
uv sync --group dev
```

Environment Variables

Create a .env file in the project root with the following variables:

```
# API keys (named as specified in the model config files)

# Optional: custom app config file path
TOXICITY_DETECTOR_APP_CONFIG_FILE=./config/app_config.yaml
```
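
The lookup of the optional variable can be pictured roughly as follows. This is a minimal sketch, not the package's actual code: the helper name `resolve_app_config_path` is hypothetical, and the fallback to `./config/app_config.yaml` is an assumption based on the example value above.

```python
import os
from pathlib import Path

# Assumed fallback path; it mirrors the example value shown above.
DEFAULT_APP_CONFIG = "./config/app_config.yaml"


def resolve_app_config_path(env=None) -> Path:
    """Return the app config path from the environment, or the assumed default."""
    env = os.environ if env is None else env
    # An unset or empty variable falls back to the default path.
    value = env.get("TOXICITY_DETECTOR_APP_CONFIG_FILE") or DEFAULT_APP_CONFIG
    return Path(value)
```

Passing a plain dictionary instead of `os.environ` makes the lookup easy to try out without touching the real environment.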

Configuration of the Pipeline

TODO: Add a section on configuring the pipeline using the YAML files in the config/ directory.

You can also customize the prompt templates used in the pipeline by modifying or supplying the relevant parts of the configuration. For details, refer to default_pipeline_config.yaml.
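
One way to picture "providing the relevant parts" of a configuration is a recursive overlay of a partial configuration on top of defaults. The sketch below illustrates that idea only: whether the package merges configurations this way is an assumption, `merge_config` is a hypothetical helper, and the keys shown are invented for illustration (the real ones are in default_pipeline_config.yaml).

```python
from copy import deepcopy


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay `overrides` on `defaults` without mutating either."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new keys replace the default outright.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical keys, for illustration only.
defaults = {
    "model": "some-model",
    "prompt_templates": {
        "system": "default system prompt",
        "user": "default user prompt",
    },
}
overrides = {"prompt_templates": {"system": "custom system prompt"}}

config = merge_config(defaults, overrides)
```

Here only the `system` template is replaced; the `user` template and the `model` entry keep their default values.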

Running the Pipeline

Using the CLI

The simplest way to run toxicity detection from the command line:

```
# Basic usage
uv run toxicity-detector detect \
  --text "Your text to analyze" \
  --pipeline-config ./config/pipeline_config.yaml

# With all options
uv run toxicity-detector detect \
  --text "Your text to analyze" \
  --pipeline-config ./config/pipeline_config.yaml \
  --toxicity-type personalized_toxic_speech \
  --source "chat" \
  --context "Additional context here" \
  --save \
  --verbose
```
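
When the CLI is driven from a script, it can help to assemble the argument list programmatically. The following is a sketch under stated assumptions: `build_detect_command` is a hypothetical helper, not part of the package, and it uses only the flags shown in the commands above.

```python
import shlex


def build_detect_command(
    text,
    pipeline_config,
    toxicity_type=None,
    source=None,
    context=None,
    save=False,
    verbose=False,
):
    """Assemble the argument list for `uv run toxicity-detector detect`."""
    cmd = [
        "uv", "run", "toxicity-detector", "detect",
        "--text", text,
        "--pipeline-config", str(pipeline_config),
    ]
    # Optional value flags are added only when a value is given.
    for flag, value in (
        ("--toxicity-type", toxicity_type),
        ("--source", source),
        ("--context", context),
    ):
        if value is not None:
            cmd += [flag, value]
    # Boolean switches take no value.
    if save:
        cmd.append("--save")
    if verbose:
        cmd.append("--verbose")
    return cmd


cmd = build_detect_command(
    "Your text to analyze",
    "./config/pipeline_config.yaml",
    source="chat",
    verbose=True,
)
# shlex.join quotes arguments so the line can be pasted into a POSIX shell.
printable = shlex.join(cmd)
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`; passing a list rather than a single string avoids shell-quoting problems with the analyzed text.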

Using Python

```
from toxicity_detector import detect_toxicity, PipelineConfig

# Load pipeline configuration from YAML file
pipeline_config = PipelineConfig.from_file('./config/pipeline_config.yaml')

# The text to analyze for toxicity
input_text = 'Peter is dumb.'

# Run toxicity detection
result = detect_toxicity(
    input_text=input_text,  # The text to be analyzed
    user_input_source=None,  # Optional: identifier for the source of the input (e.g., 'chat', 'comment')
    toxicity_type='personalized_toxicity',  # Type of toxicity analysis to perform
    context_info=None,  # Optional: additional context about the conversation or situation
    pipeline_config=pipeline_config,  # Configuration specifying model, paths, and behavior
    serialize_result=True,  # If True, saves the result to disk as YAML
)

# Display the toxicity verdict
print(result.answer['contains_toxicity'])
```
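
To analyze several texts, the call above can be wrapped in a small loop. The sketch below is illustrative rather than part of the package: `screen_texts` is a hypothetical helper, and `fake_detect` is a stand-in that only mimics the `result.answer['contains_toxicity']` shape used above so the loop can be tried without a model or API key. The type of the real verdict value is not specified here; a boolean is assumed.

```python
from types import SimpleNamespace


def screen_texts(texts, detect):
    """Run `detect` on each text and pair the text with its verdict."""
    verdicts = []
    for text in texts:
        result = detect(text)
        verdicts.append((text, result.answer["contains_toxicity"]))
    return verdicts


def fake_detect(text):
    """Stand-in detector for a dry run; NOT the real pipeline."""
    return SimpleNamespace(answer={"contains_toxicity": "dumb" in text.lower()})


verdicts = screen_texts(["Peter is dumb.", "Peter is kind."], fake_detect)
```

With the real pipeline, `detect` could be a small wrapper such as `lambda text: detect_toxicity(input_text=text, ...)` that fills in the remaining keyword arguments shown above.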

We also provide an example notebook that demonstrates how to run the toxicity detection pipeline with a Hugging Face API key.

Running the Gradio App

The project includes a Gradio web interface for interactive toxicity detection.

Using the CLI

Start the app with the app subcommand:

```
# With app configuration file
uv run toxicity-detector app --app-config ./config/app_config.yaml

# With pipeline configuration file (uses default app settings)
uv run toxicity-detector app --pipeline-config ./config/pipeline_config.yaml

# With custom server settings
uv run toxicity-detector app \
  --app-config ./config/app_config.yaml \
  --server-port 8080 \
  --share
```

The app will start and be accessible at http://localhost:7860 by default (or your specified port).
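
In scripted setups it can be useful to check whether the app is already listening before opening the browser or running follow-up steps. The helper below is a generic TCP check written for illustration; `port_is_open` is a hypothetical name and not part of the package.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_is_open("localhost", 7860)` checks the default port mentioned above; substitute the value passed to `--server-port` if you changed it.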

Alternative Methods

Direct Python execution (uses environment variable or default config path):

```
uv run python src/toxicity_detector/app/app.py
```

Using the activated virtual environment:

```
# Activate the virtual environment
source .venv/bin/activate  # On Linux/Mac
# or
.venv\Scripts\activate  # On Windows

# Run the app with CLI
toxicity-detector app --app-config ./config/app_config.yaml

# or for live reloading during development
gradio src/toxicity_detector/app/app.py
```

Developer Mode

To enable developer mode with additional configuration options, update your config/app_config.yaml:

```
developer_mode: true
```

Project Structure

```
toxicity-detector/
├── config/                          # Configuration files
│   ├── app_config.yaml             # App configuration
│   └── default_model_config_*.yaml # Model configurations
├── src/
│   └── toxicity_detector/
│       ├── __init__.py
│       ├── app.py                  # Gradio web interface
│       ├── backend.py              # Core detection logic
│       └── chains.py               # LangChain pipelines
├── logs/                           # Application logs
├── notebooks/                      # Jupyter notebooks for testing
├── pyproject.toml                  # Project dependencies
└── README.md                       # This file
```

Development

Code Style

The project follows PEP 8 guidelines with a maximum line length of 88 characters.

Run linting checks:

```
uv run flake8 src/
```

Running Tests

Run all tests:

```
uv run pytest
```

Run tests with verbose output:

```
uv run pytest -v
```

Run a specific test file:

```
uv run pytest tests/test_config.py
```

Run tests with coverage report:

```
uv run pytest --cov=src/toxicity_detector
```

Alternative: Using the activated virtual environment:

```
# Activate the virtual environment first
source .venv/bin/activate  # On Linux/Mac
# or
.venv\Scripts\activate  # On Windows

# Then run pytest directly
pytest tests/
pytest tests/test_config.py -v
```

Working with Notebooks

To use Jupyter notebooks for development:

```
# Install dev dependencies if not already done
uv sync --group dev

# Start Jupyter
uv run jupyter notebook notebooks/
```

License

See the LICENSE file for details.

Release history

0.1.0 (Jan 16, 2026)
0.1.0b3, pre-release (Jan 16, 2026), this version
0.1.0b2, pre-release (Jan 16, 2026)
0.1.0b1, pre-release (Jan 8, 2026)
