
Policy-driven agent for real-time text moderation


Lexicont: Lightweight Policy-Driven Moderation Agent



Lexicont is a moderation system built as a policy-driven agent that combines fast rule-based filters, machine learning, and LLM reasoning. It processes the majority of inputs in milliseconds and only invokes the LLM when confidence is low, making it suitable for production environments where low latency is required.

Overview

  • Fast layers - profanity detection, fuzzy matching, and toxicity ML - handle most inputs in milliseconds.
  • Early stop - high-confidence blocks skip further processing.
  • LLM layers - intermediate triage and final judgment with Qwen via llama.cpp or Ollama.
  • RAG support - Qdrant vector database for retrieving similar examples.
  • Policy-driven agent - an explicit control loop ensures predictable behaviour.

How It Works

The system uses a strictly linear pipeline with configurable early exits:

graph TD
    A[Input Text] --> B[Stage 1: Profanity Filter]

    subgraph "Policy Engine"
        B --> C[Stage 2: Fuzzy Trigger]
        C --> D[Stage 3: Toxicity ML Classifier]
        D --> E[Stage 3.5: LLM Entry Judge]

        B -->|"High Confidence → Block"| Z
        C -->|"High Confidence → Block"| Z
        D -->|"High Confidence → Block"| Z
        
        E -->|"High Confidence → Block"| Z
        E -->|"Low Confidence"| F
    end

    F[Stage 4: LLM Judge + Optional RAG] --> Z[Final Decision: BLOCK / REVIEW / PASS]

    style Z fill:#4ade80,stroke:#22c55e
    style A fill:#60a5fa,stroke:#3b82f6

Architecture

The moderation pipeline follows a fixed order of stages:

  1. profanity_filter (stage 1) - dictionary + leetspeak detection
  2. fuzzy_trigger (stage 2) - partial-ratio fuzzy matching
  3. toxicity_ml (stage 3) - multilingual toxicity classifier
  4. llm_entry_judge (stage 3.5) - LLM triage: normalises the text and decides whether stage 4 runs
  5. llm_judge (stage 4) - final LLM judgment, optionally with RAG

Control logic:

  • After any of stages 1-3, if confidence >= 0.85 and the decision is block, the pipeline stops early.
  • Stage 4 is invoked only if the triage layer (stage 3.5) explicitly allows it, or if triage is disabled and the maximum confidence so far is below 0.80.
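The control logic above can be sketched as a plain Python loop. This is an illustrative sketch, not Lexicont's actual implementation: the stage interface, the `StageResult` shape, and the function names are assumptions; only the thresholds (0.85 and 0.80) come from the rules above.

```python
# Illustrative sketch of the early-exit control loop (names are hypothetical).
from dataclasses import dataclass

EARLY_STOP_CONFIDENCE = 0.85   # general.early_stop_confidence
STAGE4_TRIGGER = 0.80          # general.stage4_trigger_confidence


@dataclass
class StageResult:
    decision: str      # "block", "review" or "pass"
    confidence: float


def run_pipeline(text, fast_stages, entry_judge=None, final_judge=None):
    """Run fast stages in order, stopping on a high-confidence block."""
    max_conf = 0.0
    for stage in fast_stages:
        result = stage(text)
        max_conf = max(max_conf, result.confidence)
        if result.decision == "block" and result.confidence >= EARLY_STOP_CONFIDENCE:
            return result  # early stop: skip remaining stages and the LLMs

    # Stage 3.5 (triage) either blocks outright or hands off to stage 4.
    if entry_judge is not None:
        triage = entry_judge(text)
        if triage.decision == "block" and triage.confidence >= EARLY_STOP_CONFIDENCE:
            return triage
        return final_judge(text)

    # Triage disabled: stage 4 runs only when the fast stages were unsure.
    if max_conf < STAGE4_TRIGGER:
        return final_judge(text)
    return StageResult("pass", max_conf)
```

The point of the explicit loop is predictability: every path through the pipeline is visible in one function, rather than hidden behind an LLM's tool-use decisions.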

Quick Start

1. Local installation with Poetry

git clone https://github.com/corefrg/lexicont.git
cd lexicont
poetry install

Test it immediately:

poetry run lexicont check "buy fake documents through traffic police"

2. Run LLM backend (required for LLM stages)

Option A - Ollama (easiest)

ollama serve          # in one terminal
ollama pull qwen3:4b  # in another terminal

Option B - llama.cpp

# Example with quantized model
./llama-server -m Qwen_Qwen3-4B-Q4_K_M.gguf \
  --host 0.0.0.0 \
  --port 11434 \
  -c 4096 \
  --threads 12 \
  -ngl 0

3. Docker Compose (full stack)

docker-compose up -d

Test the API:

curl -X POST http://localhost:8000/moderate \
  -H "Content-Type: application/json" \
  -d '{"text":"buy fake documents"}'

Usage

Command line

# Legacy syntax
poetry run lexicont "text to moderate"

# New explicit command
poetry run lexicont check "text to moderate"

# With debug output
poetry run lexicont check "text" --log-level DEBUG --verbose

# Interactive mode
poetry run lexicont

HTTP API

Start the server:

poetry run uvicorn lexicont.api:app --host 0.0.0.0 --port 8000

Endpoints:

  • POST /moderate - returns decision, confidence, and stage details
  • GET /health - status check
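For reference, here is one way to call POST /moderate from Python using only the standard library. The request body matches the curl example above; the exact shape of the JSON response (decision, confidence, stage details) is described only loosely here, so the client returns the decoded payload as-is.

```python
# Minimal stdlib client for the Lexicont HTTP API.
import json
from urllib import request


def build_payload(text: str) -> bytes:
    """Encode the request body expected by POST /moderate."""
    return json.dumps({"text": text}).encode("utf-8")


def moderate(text: str, base_url: str = "http://localhost:8000") -> dict:
    """POST the text to /moderate and return the decoded JSON response."""
    req = request.Request(
        f"{base_url}/moderate",
        data=build_payload(text),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```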

From Python

from lexicont.pipeline import run

result = run("your text here")
print(result.final_decision)   # block, review or pass
print(result.max_confidence)
print(result.explanation)

Configuration

Create your own configuration files

poetry run lexicont init --dir my_configs

Then set environment variables so the library uses your files:

# Windows
set LEXICONT_CONFIG=my_configs\moderation_config.yaml
set LEXICONT_RULES=my_configs\moderation_rules.v1.yaml
set LEXICONT_PATTERNS=my_configs\patterns.jsonl

# Linux / macOS
export LEXICONT_CONFIG=my_configs/moderation_config.yaml
export LEXICONT_RULES=my_configs/moderation_rules.v1.yaml
export LEXICONT_PATTERNS=my_configs/patterns.jsonl
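The same overrides can be set programmatically instead of in the shell. This assumes the library reads the environment variables at import or first use, so set them before importing lexicont:

```python
# Point Lexicont at generated config files from Python; set these
# before importing lexicont (assumption: the variables are read lazily
# at import/first use, matching the shell examples above).
import os
from pathlib import Path

cfg_dir = Path("my_configs")
os.environ["LEXICONT_CONFIG"] = str(cfg_dir / "moderation_config.yaml")
os.environ["LEXICONT_RULES"] = str(cfg_dir / "moderation_rules.v1.yaml")
os.environ["LEXICONT_PATTERNS"] = str(cfg_dir / "patterns.jsonl")
```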

Main configuration files

File                       Purpose                                        Environment Variable
moderation_config.yaml     Thresholds, stage toggles, LLM settings, RAG   LEXICONT_CONFIG
moderation_rules.v1.yaml   Custom phrase lists for profanity and fuzzy    LEXICONT_RULES
patterns.jsonl             Examples for RAG semantic search               LEXICONT_PATTERNS

Each file has built-in defaults. You can override them with your own files.

Important options in moderation_config.yaml

  • general.early_stop_confidence - threshold for early termination (default 0.85)
  • general.stage4_trigger_confidence - when to call final LLM (default 0.80)
  • general.enable_stage1/2/3 - toggle individual stages
  • enable_llm_entry_judge / enable_llm_judge - enable/disable LLM layers
  • llm_judge.backend - llamacpp or ollama
  • llm_judge.rag.enabled - turn RAG on or off

moderation_rules.v1.yaml example

categories:
  profanity:
    - fuck you
  illegal:
    - buy fake license

patterns.jsonl example (one JSON per line)

{"text": "buy a license through traffic police", "label": "offer to buy forged documents", "category": "illegal"}

Development

poetry install
pre-commit install
ruff format src tests
ruff check --fix src tests

To add a new filter, implement a function in the filters/ directory and register it in agent.py.
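As an illustration, a new filter might look like the sketch below. The exact signature and result type that agent.py expects are assumptions; FilterResult and link_spam_filter are hypothetical names, shown only to convey the shape of a fast rule-based stage.

```python
# Hypothetical fast filter: flag messages that are mostly URLs as link spam.
import re
from dataclasses import dataclass


@dataclass
class FilterResult:
    decision: str      # "block", "review" or "pass"
    confidence: float
    reason: str = ""


URL_RE = re.compile(r"https?://\S+")


def link_spam_filter(text: str) -> FilterResult:
    """Block messages dominated by URLs; flag others with links for review."""
    urls = URL_RE.findall(text)
    if not urls:
        return FilterResult("pass", 0.0)
    url_chars = sum(len(u) for u in urls)
    ratio = url_chars / max(len(text), 1)
    if ratio > 0.5:
        return FilterResult("block", 0.9, "message is mostly URLs")
    return FilterResult("review", 0.5, f"{len(urls)} URL(s) found")
```

A filter like this fits the fast-layer pattern: pure Python, no model calls, and a confidence high enough to trigger the early stop only in the clear-cut case.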

License

MIT


Built With

  • Python 3.11+
  • Pydantic (data validation)
  • FastAPI (HTTP API)
  • detoxify (multilingual toxicity classifier)
  • rapidfuzz (fuzzy matching)
  • Qwen3-4B (quantized LLM)
  • Qdrant (vector database for RAG)
