# Lexicont: Lightweight Policy-Driven Moderation Agent

A policy-driven agent for real-time text moderation.

Lexicont is a moderation system built as a policy-driven agent that combines fast rule-based filters, machine-learning classifiers, and LLM reasoning. It processes the majority of inputs in milliseconds and invokes the LLM only when confidence is low, making it suitable for production environments where low latency is required.

## Overview
- Fast layers - profanity detection, fuzzy matching, and a toxicity ML classifier handle most inputs in milliseconds.
- Early stop - high-confidence blocks skip all further stages.
- LLM layers - intermediate triage and final judgment with Qwen via llama.cpp or Ollama.
- RAG support - Qdrant vector database for retrieving similar moderation examples.
- Policy-driven agent - an explicit control loop ensures predictable behaviour.
## How It Works

The system uses a strictly linear pipeline with configurable early exits:

```mermaid
graph TD
    A[Input Text] --> B[Stage 1: Profanity Filter]
    subgraph "Policy Engine"
        B --> C[Stage 2: Fuzzy Trigger]
        C --> D[Stage 3: Toxicity ML Classifier]
        D --> E[Stage 3.5: LLM Entry Judge]
        B -->|"High Confidence → Block"| Z
        C -->|"High Confidence → Block"| Z
        D -->|"High Confidence → Block"| Z
        E -->|"High Confidence → Block"| Z
        E -->|"Low Confidence"| F
    end
    F[Stage 4: LLM Judge + Optional RAG] --> Z[Final Decision: BLOCK / REVIEW / PASS]
    style Z fill:#4ade80,stroke:#22c55e
    style A fill:#60a5fa,stroke:#3b82f6
```
## Architecture

The moderation pipeline follows a fixed order of stages:

- `profanity_filter` - dictionary and leetspeak detection
- `fuzzy_trigger` - partial-ratio matching
- `toxicity_ml` - multilingual toxicity classifier
- `llm_entry_judge` - LLM that normalises the text and decides whether to invoke stage 4
- `llm_judge` - final LLM judgment, optionally with RAG
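The fuzzy stage relies on partial-ratio matching (rapidfuzz in the real pipeline). As a rough, standard-library-only illustration of the idea - not the project's actual implementation, which uses rapidfuzz and the phrase lists from moderation_rules.v1.yaml:

```python
from difflib import SequenceMatcher

def partial_ratio(needle: str, haystack: str) -> float:
    """Best similarity (0-100) of `needle` against any equally sized
    window of `haystack` - a simplified stand-in for
    rapidfuzz.fuzz.partial_ratio."""
    needle, haystack = needle.lower(), haystack.lower()
    if len(needle) > len(haystack):
        needle, haystack = haystack, needle
    n = len(needle)
    best = 0.0
    # Slide a needle-sized window across the haystack, keep the best score.
    for i in range(len(haystack) - n + 1):
        score = SequenceMatcher(None, needle, haystack[i:i + n]).ratio()
        best = max(best, score)
    return best * 100

# An exact substring scores 100; near-misses score slightly lower:
print(partial_ratio("buy fake license", "i want to buy fake license now"))  # 100.0
```

This is why a banned phrase buried inside a longer message still triggers the stage even when the full strings differ.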
Control logic:

- After stage 1 or stage 2, if confidence >= 0.85 and the decision is block, the pipeline stops.
- Stage 4 is invoked only if the triage layer (stage 3.5) explicitly allows it, or if triage is disabled and the maximum confidence so far is < 0.80.
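The control logic above can be sketched as a plain loop. This is illustrative only: the stage functions, result fields, and return types here are simplified stand-ins for the actual agent implementation.

```python
from dataclasses import dataclass

@dataclass
class StageResult:
    decision: str       # "block", "review" or "pass"
    confidence: float

EARLY_STOP = 0.85        # general.early_stop_confidence
STAGE4_TRIGGER = 0.80    # general.stage4_trigger_confidence

def run_pipeline(text, stages, llm_judge):
    """stages: the ordered fast layers; llm_judge: stage 4."""
    results = []
    for stage in stages:
        res = stage(text)
        results.append(res)
        # Early exit on a high-confidence block from a fast layer.
        if res.decision == "block" and res.confidence >= EARLY_STOP:
            return res
    # Stage 4 fires only when the fast layers stayed uncertain.
    if max(r.confidence for r in results) < STAGE4_TRIGGER:
        return llm_judge(text)
    # Otherwise keep the most confident fast-layer verdict.
    return max(results, key=lambda r: r.confidence)

# Toy stage functions: the fast layer is uncertain, so the LLM decides.
clean = run_pipeline(
    "hello world",
    stages=[lambda t: StageResult("pass", 0.2)],
    llm_judge=lambda t: StageResult("pass", 0.9),
)
print(clean.decision)  # pass
```

The key property is that the expensive LLM call sits behind both the early-stop check and the stage-4 trigger threshold, so confident decisions never pay for it.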
## Quick Start

### 1. Local installation with Poetry

```shell
git clone https://github.com/corefrg/lexicont.git
cd lexicont
poetry install
```

Test it immediately:

```shell
poetry run lexicont check "buy fake documents through traffic police"
```
### 2. Run an LLM backend (required for the LLM stages)

Option A - Ollama (easiest):

```shell
ollama serve          # in one terminal
ollama pull qwen3:4b  # in another terminal
```

Option B - llama.cpp:

```shell
# Example with a quantized model
./llama-server -m Qwen_Qwen3-4B-Q4_K_M.gguf \
  --host 0.0.0.0 \
  --port 11434 \
  -c 4096 \
  --threads 12 \
  -ngl 0
```
### 3. Docker Compose (full stack)

```shell
docker-compose up -d
```

Test the API:

```shell
curl -X POST http://localhost:8000/moderate \
  -H "Content-Type: application/json" \
  -d '{"text":"buy fake documents"}'
```
## Usage

### Command line

```shell
# Legacy syntax
poetry run lexicont "text to moderate"

# New explicit command
poetry run lexicont check "text to moderate"

# With debug output
poetry run lexicont check "text" --log-level DEBUG --verbose

# Interactive mode
poetry run lexicont
```
### HTTP API

Start the server:

```shell
poetry run uvicorn lexicont.api:app --host 0.0.0.0 --port 8000
```

Endpoints:

- `POST /moderate` - returns the decision, confidence, and stage details
- `GET /health` - status check
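A client typically keys off the decision and confidence in the JSON response. The field names below are an assumption inferred from the attributes the Python pipeline exposes (`final_decision`, `max_confidence`); check your server's actual schema before relying on them.

```python
import json

# Hypothetical /moderate response body - field names are assumed,
# not confirmed by the API documentation.
sample = '{"final_decision": "block", "max_confidence": 0.93, "stages": []}'

def summarize(raw: str) -> str:
    """Render a one-line verdict from a raw JSON response."""
    data = json.loads(raw)
    return f"{data['final_decision']} ({data['max_confidence']:.2f})"

print(summarize(sample))  # block (0.93)
```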
### From Python

```python
from lexicont.pipeline import run

result = run("your text here")
print(result.final_decision)  # block, review or pass
print(result.max_confidence)
print(result.explanation)
```
## Configuration

Create your own configuration files:

```shell
poetry run lexicont init --dir my_configs
```

Then set environment variables so the library uses your files:

```shell
# Windows
set LEXICONT_CONFIG=my_configs\moderation_config.yaml
set LEXICONT_RULES=my_configs\moderation_rules.v1.yaml
set LEXICONT_PATTERNS=my_configs\patterns.jsonl
```

```shell
# Linux / macOS
export LEXICONT_CONFIG=my_configs/moderation_config.yaml
export LEXICONT_RULES=my_configs/moderation_rules.v1.yaml
export LEXICONT_PATTERNS=my_configs/patterns.jsonl
```
### Main configuration files
| File | Purpose | Environment Variable |
|---|---|---|
| moderation_config.yaml | Thresholds, stage toggles, LLM settings, RAG | LEXICONT_CONFIG |
| moderation_rules.v1.yaml | Custom phrase lists for profanity and fuzzy | LEXICONT_RULES |
| patterns.jsonl | Examples for RAG semantic search | LEXICONT_PATTERNS |
Each file has built-in defaults. You can override them with your own files.
### Important options in moderation_config.yaml

- `general.early_stop_confidence` - threshold for early termination (default 0.85)
- `general.stage4_trigger_confidence` - when to call the final LLM (default 0.80)
- `general.enable_stage1/2/3` - toggle individual stages
- `enable_llm_entry_judge` / `enable_llm_judge` - enable/disable the LLM layers
- `llm_judge.backend` - `llamacpp` or `ollama`
- `llm_judge.rag.enabled` - turn RAG on or off
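Pulling those options together, a minimal override file might look like this. The key names come from the list above, but the exact nesting is an assumption - compare against the defaults generated by `lexicont init` before using it.

```yaml
# Sketch of moderation_config.yaml - verify nesting against
# the file generated by `lexicont init`.
general:
  early_stop_confidence: 0.90     # demand more confidence before early exit
  stage4_trigger_confidence: 0.75 # send more borderline cases to the LLM
  enable_stage3: true
enable_llm_entry_judge: true
enable_llm_judge: true
llm_judge:
  backend: ollama                 # or llamacpp
  rag:
    enabled: false
```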
### moderation_rules.v1.yaml example

```yaml
categories:
  profanity:
    - fuck you
  illegal:
    - buy fake license
```
### patterns.jsonl example (one JSON object per line)

```json
{"text": "buy a license through traffic police", "label": "offer to buy forged documents", "category": "illegal"}
```
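Since every line must be a standalone JSON object, it is worth validating a patterns file before pointing `LEXICONT_PATTERNS` at it. A small checker (the required field names are taken from the example above; this helper is not part of the library):

```python
import json

REQUIRED = {"text", "label", "category"}

def validate_patterns(lines):
    """Parse JSONL lines, raising on malformed rows or missing fields."""
    patterns = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        row = json.loads(line)  # raises json.JSONDecodeError on bad JSON
        missing = REQUIRED - row.keys()
        if missing:
            raise ValueError(f"line {lineno}: missing {sorted(missing)}")
        patterns.append(row)
    return patterns

rows = validate_patterns([
    '{"text": "buy a license through traffic police", '
    '"label": "offer to buy forged documents", "category": "illegal"}',
])
print(rows[0]["category"])  # illegal
```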
## Development

```shell
poetry install
pre-commit install
ruff format src tests
ruff check --fix src tests
```

To add a new filter, implement a function in the filters/ directory and register it in agent.py.
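Conceptually, a filter is just a function that scores text. The signature and return shape below are hypothetical - mirror an existing module in filters/ for the real interface, then register the function in agent.py as described above.

```python
# Hypothetical filter shape - not the library's actual interface.
def url_spam_filter(text: str) -> dict:
    """Flag messages that are mostly links (illustrative heuristic)."""
    words = text.split()
    links = sum(1 for w in words if w.startswith(("http://", "https://")))
    ratio = links / len(words) if words else 0.0
    decision = "block" if ratio > 0.5 else "pass"
    return {"decision": decision, "confidence": ratio}

print(url_spam_filter("http://a.example http://b.example buy")["decision"])  # block
```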
## License

MIT
## Built With
- Python 3.11+
- Pydantic (data validation)
- FastAPI (HTTP API)
- detoxify (multilingual toxicity classifier)
- rapidfuzz (fuzzy matching)
- Qwen3-4B (quantized LLM)
- Qdrant (vector database for RAG)
## File details

### lexicont-0.1.5.tar.gz (source distribution)

- Size: 25.1 kB
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2

| Algorithm | Hash digest |
|---|---|
| SHA256 | `a4d6fec6fef1ffa7dc533dfa94f30bc1f7387e0547389deacd15c3ebcea099c1` |
| MD5 | `213f55b91d79bc21df932814138ef1c7` |
| BLAKE2b-256 | `44453f98b11db88d1f7947c973e7f5c67202f1ff16e527e4b6f75565841d7cb0` |
### lexicont-0.1.5-py3-none-any.whl (built distribution, Python 3)

- Size: 28.0 kB
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2

| Algorithm | Hash digest |
|---|---|
| SHA256 | `8393eb21ebbc373512ccffad6e56b2b93abe0126e1462063848a675b31323caa` |
| MD5 | `5698f8d9b5c3695d801503731a43c3b0` |
| BLAKE2b-256 | `ccb5992a4e00e51e8ffbb2182e630569517c2b90ff58adcecb8901d27afe3893` |