Enterprise AI guardrails: input gateway, multi-agent debate output verification, confidence scoring, and RAG pipeline — all in one SDK.

Guardrails Enterprise — AI Guardrails & ML Infrastructure

Enterprise AI security and ML infrastructure platform. Wraps any enterprise LLM (healthcare, banking, legal, HR) with input and output guardrail layers, backed by a regulatory RAG corpus.

📁 Repository Structure

guardrails-enterprise/
├── docs/                    # All documentation
│   ├── README.md           # PII NER pipeline documentation
│   ├── QUICK_START.md      # Quick start guide
│   └── SETUP_GUIDE.md      # Detailed setup instructions
├── config/                  # Configuration files
├── dags/                    # Airflow DAGs
├── plugins/                 # Airflow plugins
├── scripts/                 # Utility scripts
│   ├── setup_airflow.sh
│   ├── start_airflow.sh
│   └── stop_airflow.sh
├── docker/                  # Docker configuration
│   ├── Dockerfile
│   └── docker-compose.yml
├── gateway/                 # Phase 1: Input guardrail (3 classifiers)
├── rag/                     # Phase 2: RAG pipeline (Qdrant + Qwen3-4B embedder)
├── multi_agent/             # Phase 3: MAD output guardrail (active)
├── multi_agent_debate/       # MAD pipeline with SQLite storage for GRPO
├── confidence/              # Phase 4: Confidence Scoring Engine
├── rlhf/                    # Phase 5: GRPO feedback loop
├── finetuning/              # Fine-tuning scripts for all 4 models
├── synthetic_data/          # Synthetic evaluation dataset generation
├── MAD_SETUP_GUIDE.md       # Standalone MAD setup guide
└── requirements.txt         # Root-level shared dependencies

🚀 Quick Start

For detailed setup instructions, see docs/QUICK_START.md and docs/SETUP_GUIDE.md.

✅ Active Pipelines

PII NER Pipeline

  • Location: dags/pii_ner_pipeline.py
  • Purpose: Download, EDA, and BIO NER transformation of the ai4privacy/pii-masking-200k dataset → GCS
  • Status: ✅ Active
  • Tasks: download_from_huggingface → load_raw_data → perform_eda → transform_data → upload_processed_data
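The BIO NER transformation step above converts character-level PII span annotations into per-token BIO tags. The sketch below illustrates the idea with a minimal, self-contained function; the function name, token representation, and label set are assumptions for illustration, not the DAG's actual API.

```python
# Hedged sketch of a BIO tagging step: map PII character spans onto
# tokens, emitting B- for the first token of an entity and I- for the rest.

def to_bio_tags(tokens, spans):
    """tokens: list of (start, end, text); spans: list of (start, end, label)."""
    tags = ["O"] * len(tokens)
    for s_start, s_end, label in spans:
        inside = False  # becomes True after the first token of this entity
        for i, (t_start, t_end, _) in enumerate(tokens):
            if t_start >= s_start and t_end <= s_end:
                tags[i] = ("I-" if inside else "B-") + label
                inside = True
    return tags

tokens = [(0, 4, "John"), (5, 8, "Doe"), (9, 14, "lives"),
          (15, 17, "in"), (18, 24, "Boston")]
spans = [(0, 8, "NAME"), (18, 24, "CITY")]
print(to_bio_tags(tokens, spans))
# → ['B-NAME', 'I-NAME', 'O', 'O', 'B-CITY']
```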

Multi-Agent Debate (MAD) Output Guardrail

  • Location: multi_agent/ (core pipeline), multi_agent_debate/ (with SQLite storage)
  • Purpose: Verifies enterprise LLM answers against a regulatory evidence corpus. Extracts atomic claims, runs a 2-cycle adversarial debate (Agent A vs Agent B), routes through a partially-blind judge, and returns a routing decision (DELIVER / RETRY / HARD_BLOCK / HUMAN_REVIEW)
  • Status: ✅ Active
  • Run: uvicorn multi_agent.api:app --port 8001 --reload
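The final routing step can be pictured as a pure function from per-claim judge verdicts to one of the four decisions. The thresholds and verdict labels below are illustrative assumptions, not the pipeline's actual values:

```python
# Illustrative sketch of the MAD routing decision. Verdict labels
# ('supported' / 'unsupported' / 'uncertain') and all thresholds are
# assumptions made for this example.

def route(verdicts):
    """verdicts: one judge verdict string per atomic claim."""
    if not verdicts:
        return "HUMAN_REVIEW"        # nothing to verify: escalate
    unsupported = verdicts.count("unsupported") / len(verdicts)
    uncertain = verdicts.count("uncertain") / len(verdicts)
    if unsupported > 0.5:
        return "HARD_BLOCK"          # majority of claims contradict the corpus
    if unsupported > 0:
        return "RETRY"               # some claims failed; regenerate the answer
    if uncertain > 0.3:
        return "HUMAN_REVIEW"        # evidence too thin to decide automatically
    return "DELIVER"

print(route(["supported", "supported", "supported"]))    # → DELIVER
print(route(["supported", "unsupported", "supported"]))  # → RETRY
```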

🔮 Planned / In Development

Gateway — Input Guardrail

  • Location: gateway/
  • Purpose: 3 parallel classifiers (prompt injection, PII detection, jailbreak detection) with a weighted decision engine. Blocks threats before they reach the LLM.
  • Status: 🚧 In Development
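A weighted decision engine over the three classifier scores might look like the sketch below. The weights and block threshold are assumptions chosen for illustration; the gateway's real values and API will differ.

```python
# Hedged sketch of a weighted decision engine: combine three classifier
# probabilities into one risk score and block above a threshold.
# WEIGHTS and BLOCK_THRESHOLD are illustrative assumptions.

WEIGHTS = {"prompt_injection": 0.4, "jailbreak": 0.4, "pii": 0.2}
BLOCK_THRESHOLD = 0.5

def gateway_decision(scores):
    """scores: dict mapping classifier name -> probability in [0, 1]."""
    risk = sum(WEIGHTS[name] * scores.get(name, 0.0) for name in WEIGHTS)
    return ("BLOCK" if risk >= BLOCK_THRESHOLD else "ALLOW"), risk

decision, risk = gateway_decision(
    {"prompt_injection": 0.9, "jailbreak": 0.8, "pii": 0.1}
)
print(decision, round(risk, 2))  # → BLOCK 0.7
```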

RAG Pipeline

  • Location: rag/
  • Purpose: Qdrant vector store + Qwen3-4B embedder + BM25 hybrid + cross-encoder reranker over a 72+ regulatory document corpus
  • Status: 🚧 In Development
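The hybrid step merges the dense (vector) ranking and the BM25 ranking before the cross-encoder reranker. One common fusion method is reciprocal rank fusion, sketched below with toy document IDs; this is an illustration of the fusion idea, not the pipeline's actual retrieval code.

```python
# Minimal sketch of hybrid retrieval fusion via reciprocal rank fusion
# (RRF): each ranking contributes 1/(k + rank) per document, and documents
# are re-sorted by the summed score. Document IDs are made up.

def rrf(rankings, k=60):
    """rankings: list of ranked doc-id lists; returns fused doc ids, best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["hipaa_164", "gdpr_art9", "sox_404"]   # vector-store ranking
bm25 = ["gdpr_art9", "hipaa_164", "glba_501"]   # keyword ranking
print(rrf([dense, bm25]))
```

Documents ranked highly by both retrievers float to the top, which is why RRF is a popular default before an expensive cross-encoder rerank.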

Confidence Scoring Engine

  • Location: confidence/
  • Purpose: Compute a final confidence score from LLM faithfulness, hallucination rate, RAG relevancy, and judge evaluation
  • Status: 🚧 In Development
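One simple way to combine the four signals is a weighted average, with the hallucination rate inverted so that all signals point the same way. The weights below are assumptions for illustration only:

```python
# Hedged sketch of a confidence score as a weighted average of the four
# signals listed above. Weights are illustrative assumptions; all inputs
# are assumed to lie in [0, 1].

def confidence_score(faithfulness, hallucination_rate, rag_relevancy, judge_score,
                     weights=(0.35, 0.25, 0.2, 0.2)):
    """hallucination_rate is inverted, since lower hallucination is better."""
    signals = (faithfulness, 1.0 - hallucination_rate, rag_relevancy, judge_score)
    return sum(w * s for w, s in zip(weights, signals))

print(round(confidence_score(0.9, 0.1, 0.8, 0.85), 3))  # → 0.87
```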

GRPO Feedback Loop

  • Location: rlhf/
  • Purpose: GRPO-based fine-tuning loop for Agent A (Brier reward) and Agent B (precision reward) using data written by the MAD pipeline to SQLite
  • Status: 🚧 In Development
  • Deployment handoff: GRPO-finetuned Agent A/B LoRA adapters are shared through Drive. See GRPO LoRA Deployment Handoff for adapter file manifest, vLLM model names (agent_a, agent_b), prompts, JSON-only schema, generation settings, AWS serving notes, and held-out eval results.
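The Brier reward for Agent A can be sketched as the negative Brier score of the agent's claim-level confidence against the ground-truth label, with GRPO normalizing rewards within each group of sampled completions. Group size and any additional reward shaping are assumptions here:

```python
# Sketch of a Brier-style reward and a GRPO-style group-relative
# advantage. Both are simplified illustrations of the training signal,
# not the project's actual reward functions.

def brier_reward(predicted_prob, label):
    """predicted_prob in [0, 1]; label 0 or 1. Higher reward = better calibrated."""
    return -(predicted_prob - label) ** 2

def grpo_advantages(rewards):
    """Group-relative advantage: (r - group mean) / group std."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5 or 1.0  # avoid division by zero for identical rewards
    return [(r - mean) / std for r in rewards]

# Three sampled completions whose confidence on a true claim varies:
rewards = [brier_reward(p, 1) for p in (0.9, 0.6, 0.2)]
print([round(r, 2) for r in rewards])  # → [-0.01, -0.16, -0.64]
print([round(a, 2) for a in grpo_advantages(rewards)])
```

The best-calibrated completion (confidence 0.9 on a true claim) receives the highest relative advantage, which is the gradient signal GRPO amplifies.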

Finetuning Pipelines

  • Location: finetuning/
  • Purpose: Fine-tuning scripts for RoBERTa (jailbreak + PII), Llama-Prompt-Guard-2-86M (prompt injection), Qwen2.5-3B-Instruct (LLM generator), and Qwen3-4B (RAG embedder)
  • Status: 🚧 In Development

Synthetic Evaluation Dataset

  • Location: synthetic_data/
  • Purpose: Generate 200-example healthcare-domain evaluation set (4 error types: fully_correct, missing_caveat, hallucinated_specific, jurisdiction_blind) for MAD pipeline benchmarking
  • Status: 🚧 In Development
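A balanced 200-example set would place 50 examples in each of the four error-type buckets. The sketch below only illustrates that stratification; the record fields are assumptions, and the real generator would fill in actual questions and answers.

```python
# Hedged sketch of stratifying the evaluation set evenly across the four
# error types (200 examples -> 50 per type). Record fields beyond
# error_type are illustrative assumptions.
from itertools import cycle

ERROR_TYPES = ["fully_correct", "missing_caveat",
               "hallucinated_specific", "jurisdiction_blind"]

def make_eval_set(n=200):
    labels = cycle(ERROR_TYPES)  # round-robin assignment keeps types balanced
    return [{"id": i, "domain": "healthcare", "error_type": next(labels)}
            for i in range(n)]

dataset = make_eval_set()
counts = {t: sum(1 for ex in dataset if ex["error_type"] == t)
          for t in ERROR_TYPES}
print(counts)  # → 50 of each error type
```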

🛠️ Development

Prerequisites

  • Docker & Docker Compose
  • Python 3.10+
  • Google Cloud Platform account (for GCS)
  • Ollama (for MAD pipeline — ollama pull qwen2.5:7b)

Airflow Setup

# Setup Airflow
./scripts/setup_airflow.sh

# Start services
./scripts/start_airflow.sh

# Stop services
./scripts/stop_airflow.sh

MAD Pipeline (Quick Start)

pip install -r multi_agent/requirements.txt
ollama pull qwen2.5:7b && ollama serve
python -m multi_agent.run_test

📚 Documentation

  • docs/README.md — PII NER pipeline documentation
  • docs/QUICK_START.md — quick start guide
  • docs/SETUP_GUIDE.md — detailed setup instructions
  • MAD_SETUP_GUIDE.md — standalone MAD setup guide
📦 Release

guardrails_enterprise 0.2.0 is published on PyPI as a source distribution (guardrails_enterprise-0.2.0.tar.gz, 195.3 kB) and a Python 3 wheel (guardrails_enterprise-0.2.0-py3-none-any.whl, 238.3 kB).