
Policy-driven agent for real-time text moderation

Project description

Lexicont: Lightweight Policy-Driven Moderation Agent



Lexicont is a moderation system built as a policy-driven agent that combines fast rule-based filters, machine learning, and LLM reasoning. It processes the majority of inputs in milliseconds and only invokes the LLM when confidence is low, making it suitable for production environments where low latency is required.

Overview

  • Fast layers - profanity detection, fuzzy matching, and a toxicity ML classifier handle most inputs in milliseconds.
  • Early stop - high-confidence blocks skip all remaining stages.
  • LLM layers - intermediate triage and final judgment with Qwen via llama.cpp or Ollama.
  • RAG support - Qdrant vector database for retrieving similar examples.
  • Policy-driven agent - an explicit control loop keeps behaviour predictable.

How It Works

The system uses a strictly linear pipeline with configurable early exits:

graph TD
    A[Input Text] --> B[Stage 1: Profanity Filter]

    subgraph "Policy Engine"
        B --> C[Stage 2: Fuzzy Trigger]
        C --> D[Stage 3: Toxicity ML Classifier]
        D --> E[Stage 3.5: LLM Entry Judge]

        B -->|"High Confidence → Block"| Z
        C -->|"High Confidence → Block"| Z
        D -->|"High Confidence → Block"| Z
        
        E -->|"High Confidence → Block"| Z
        E -->|"Low Confidence"| F
    end

    F[Stage 4: LLM Judge + Optional RAG] --> Z[Final Decision: BLOCK / REVIEW / PASS]

    style Z fill:#4ade80,stroke:#22c55e
    style A fill:#60a5fa,stroke:#3b82f6

Architecture

The moderation pipeline follows a fixed order of stages:

  1. profanity_filter - dictionary + leetspeak detection
  2. fuzzy_trigger - partial ratio matching
  3. toxicity_ml - multilingual toxicity classifier
  4. llm_entry_judge - LLM triage (stage 3.5) that normalises the text and decides whether to invoke stage 4
  5. llm_judge - final LLM judgment (stage 4) with optional RAG

Control logic:

  • After stage 1 or stage 2, if confidence >= 0.85 and decision is block, the pipeline stops.
  • Stage 4 is invoked only if either the triage layer (stage 3.5) explicitly allows it, or triage is disabled and max confidence < 0.80.
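This control logic can be sketched in a few lines of Python. This is a simplified illustration, not the library's actual code: `run_pipeline` and `StageResult` are hypothetical names, the thresholds mirror the defaults described below, and the real agent applies the early stop only after the cheap stages.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class StageResult:
    decision: str       # "block", "review", or "pass"
    confidence: float

EARLY_STOP = 0.85       # general.early_stop_confidence
STAGE4_TRIGGER = 0.80   # general.stage4_trigger_confidence

def run_pipeline(
    text: str,
    stages: list[Callable[[str], StageResult]],
    final_judge: Callable[[str], StageResult],
) -> StageResult:
    """Linear pipeline with an early exit on a high-confidence block."""
    best = StageResult("pass", 0.0)
    for stage in stages:
        result = stage(text)
        if result.decision == "block" and result.confidence >= EARLY_STOP:
            return result  # early stop: skip all remaining stages
        if result.confidence > best.confidence:
            best = result
    # invoke the expensive final LLM judge only when confidence is low
    if best.confidence < STAGE4_TRIGGER:
        return final_judge(text)
    return best
```

The key design point is that the expensive call sits behind two gates: a high-confidence block exits before it, and a high-confidence pass never reaches it.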

Quick Start

1. Local installation with Poetry

git clone https://github.com/corefrg/lexicont.git
cd lexicont
poetry install

Test it immediately:

poetry run lexicont check "buy fake documents through traffic police"

2. Run LLM backend (required for LLM stages)

Option A - Ollama (easiest)

ollama serve          # in one terminal
ollama pull qwen3:4b  # in another terminal

Option B - llama.cpp

# Example with quantized model
./llama-server -m Qwen_Qwen3-4B-Q4_K_M.gguf \
  --host 0.0.0.0 \
  --port 11434 \
  -c 4096 \
  --threads 12 \
  -ngl 0

3. Docker Compose (full stack)

docker-compose up -d

Test the API:

curl -X POST http://localhost:8000/moderate \
  -H "Content-Type: application/json" \
  -d '{"text":"buy fake documents"}'

Usage

Command line

# Legacy syntax
poetry run lexicont "text to moderate"

# New explicit command
poetry run lexicont check "text to moderate"

# With debug output
poetry run lexicont check "text" --log-level DEBUG --verbose

# Interactive mode
poetry run lexicont

HTTP API

Start the server:

poetry run uvicorn lexicont.api:app --host 0.0.0.0 --port 8000

Endpoints:

  • POST /moderate - returns decision, confidence, and stage details
  • GET /health - status check

From Python

from lexicont.pipeline import run

result = run("your text here")
print(result.final_decision)   # block, review or pass
print(result.max_confidence)
print(result.explanation)

Configuration

Create your own configuration files

poetry run lexicont init --dir my_configs

Then set environment variables so the library uses your files:

# Windows
set LEXICONT_CONFIG=my_configs\moderation_config.yaml
set LEXICONT_RULES=my_configs\moderation_rules.v1.yaml
set LEXICONT_PATTERNS=my_configs\patterns.jsonl

# Linux / macOS
export LEXICONT_CONFIG=my_configs/moderation_config.yaml
export LEXICONT_RULES=my_configs/moderation_rules.v1.yaml
export LEXICONT_PATTERNS=my_configs/patterns.jsonl

Main configuration files

  • moderation_config.yaml - thresholds, stage toggles, LLM settings, RAG (env var: LEXICONT_CONFIG)
  • moderation_rules.v1.yaml - custom phrase lists for the profanity and fuzzy stages (env var: LEXICONT_RULES)
  • patterns.jsonl - examples for RAG semantic search (env var: LEXICONT_PATTERNS)

Each file has built-in defaults. You can override them with your own files.

Important options in moderation_config.yaml

  • general.early_stop_confidence - threshold for early termination (default 0.85)
  • general.stage4_trigger_confidence - when to call final LLM (default 0.80)
  • general.enable_stage1/2/3 - toggle individual stages
  • enable_llm_entry_judge / enable_llm_judge - enable/disable LLM layers
  • llm_judge.backend - llamacpp or ollama
  • llm_judge.rag.enabled - turn RAG on or off
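A hypothetical fragment combining these options might look as follows. The exact nesting of the enable_llm_entry_judge / enable_llm_judge keys is an assumption; check the defaults generated by lexicont init for the authoritative layout.

```yaml
general:
  early_stop_confidence: 0.85
  stage4_trigger_confidence: 0.80
  enable_stage1: true
  enable_stage2: true
  enable_stage3: true
enable_llm_entry_judge: true
enable_llm_judge: true
llm_judge:
  backend: ollama   # or llamacpp
  rag:
    enabled: false
```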

moderation_rules.v1.yaml example

categories:
  profanity:
    - fuck you
  illegal:
    - buy fake license

patterns.jsonl example (one JSON per line)

{"text": "buy a license through traffic police", "label": "offer to buy forged documents", "category": "illegal"}

Development

poetry install
pre-commit install
ruff format src tests
ruff check --fix src tests

To add a new filter, implement a function in the filters/ directory and register it in agent.py.
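A new filter might look like the sketch below. The function signature and return format are assumptions for illustration; match the shape of the existing functions in filters/ when writing a real one.

```python
# filters/length_check.py -- illustrative example, not a real Lexicont filter
def check_excessive_length(text: str, limit: int = 5000) -> dict:
    """Send unusually long inputs to human review."""
    if len(text) > limit:
        return {"decision": "review", "confidence": 0.6,
                "reason": f"input exceeds {limit} characters"}
    return {"decision": "pass", "confidence": 0.0, "reason": ""}
```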

License

MIT


Built With

  • Python 3.11+
  • Pydantic (data validation)
  • FastAPI (HTTP API)
  • detoxify (multilingual toxicity classifier)
  • rapidfuzz (fuzzy matching)
  • Qwen3-4B (quantized LLM)
  • Qdrant (vector database for RAG)

Download files

Download the file for your platform.

Source Distribution

lexicont-0.1.5.tar.gz (25.1 kB)

Uploaded Source

Built Distribution


lexicont-0.1.5-py3-none-any.whl (28.0 kB)

Uploaded Python 3

File details

Details for the file lexicont-0.1.5.tar.gz.

File metadata

  • Download URL: lexicont-0.1.5.tar.gz
  • Upload date:
  • Size: 25.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for lexicont-0.1.5.tar.gz
Algorithm Hash digest
SHA256 a4d6fec6fef1ffa7dc533dfa94f30bc1f7387e0547389deacd15c3ebcea099c1
MD5 213f55b91d79bc21df932814138ef1c7
BLAKE2b-256 44453f98b11db88d1f7947c973e7f5c67202f1ff16e527e4b6f75565841d7cb0


File details

Details for the file lexicont-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: lexicont-0.1.5-py3-none-any.whl
  • Upload date:
  • Size: 28.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for lexicont-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 8393eb21ebbc373512ccffad6e56b2b93abe0126e1462063848a675b31323caa
MD5 5698f8d9b5c3695d801503731a43c3b0
BLAKE2b-256 ccb5992a4e00e51e8ffbb2182e630569517c2b90ff58adcecb8901d27afe3893

