Real-time LLM hallucination guardrail — NLI + RAG fact-checking with token-level streaming halt

Director-AI — Real-time LLM Hallucination Guardrail

Try it live on Hugging Face Spaces →

Sales Pitch & Pricing · Contact Sales · invest@anulum.li


What It Does

Director-AI sits between your LLM and the user. It scores every output for hallucination before it reaches anyone — and can halt generation mid-stream if coherence drops below threshold.

from director_ai import CoherenceAgent

agent = CoherenceAgent()
result = agent.process("What color is the sky?")

print(result.coherence.score)      # 0.94 — high coherence
print(result.coherence.approved)   # True
print(result.coherence.h_logical)  # 0.10 — low contradiction probability
print(result.coherence.h_factual)  # 0.10 — low factual deviation

Three things make it different:

  1. Token-level streaming halt — not post-hoc review. The safety kernel monitors coherence token-by-token and severs output the moment it degrades.
  2. Dual-entropy scoring — NLI contradiction detection (DeBERTa) + RAG fact-checking against your own knowledge base. Both must pass.
  3. Your data, your rules — ingest PDFs, directories, or any text into a ChromaDB-backed knowledge base. The scorer checks LLM output against your ground truth, not a generic model.

Architecture

          ┌──────────────────────────┐
          │    Coherence Agent       │
          │    (Orchestrator)        │
          └─────────┬────────────────┘
                    │
       ┌────────────┼────────────────┐
       │            │                │
┌──────▼──────┐ ┌───▼──────────┐ ┌───▼────────────┐
│  Generator  │ │  Coherence   │ │  Safety        │
│  (LLM       │ │  Scorer      │ │  Kernel        │
│   backend)  │ │              │ │  (streaming    │
│             │ │  NLI + RAG   │ │   interlock)   │
└─────────────┘ └───┬──────────┘ └────────────────┘
                    │
          ┌─────────▼─────────┐
          │  Ground Truth     │
          │  Store            │
          │  (ChromaDB / RAM) │
          └───────────────────┘

Installation

Performance note: Without a GPU, NLI scoring via ONNX Runtime runs at ~383 ms/pair (CPU) vs 14.6 ms/pair (CUDA). For production NLI workloads, a CUDA-capable GPU is strongly recommended. The heuristic-only mode (pip install director-ai without [nli]) runs at <0.1 ms but has ~55% accuracy — suitable for development/testing, not production gating.

# Basic install (heuristic scoring, no GPU needed)
pip install director-ai

# With NLI model (DeBERTa-based contradiction detection)
pip install director-ai[nli]

# With vector store (ChromaDB for custom knowledge bases)
pip install director-ai[vector]

# With high-quality embeddings (bge-large-en-v1.5)
pip install director-ai[embeddings]

# With ONNX Runtime (portable deployment, no PyTorch needed)
pip install director-ai[onnx]

# With 8-bit quantized NLI
pip install director-ai[quantize]

# Framework integrations
pip install director-ai[langchain]
pip install director-ai[llamaindex]
pip install director-ai[langgraph]
pip install director-ai[haystack]
pip install director-ai[crewai]

# With REST API server
pip install director-ai[server]

# Several extras at once (NLI, vector store, server, embeddings)
pip install "director-ai[nli,vector,server,embeddings]"

# Development
git clone https://github.com/anulum/director-ai.git
cd director-ai
pip install -e ".[dev]"

Docker

# CPU-only (heuristic scoring, ~200 MB image)
docker run -p 8080:8080 ghcr.io/anulum/director-ai:latest

# GPU-enabled (ONNX CUDA, 14 ms/pair on GTX 1060, 0.9 ms on Ada)
docker run --gpus all -p 8080:8080 ghcr.io/anulum/director-ai:gpu

# docker compose
docker compose up                    # CPU
docker compose --profile gpu up      # GPU (NVIDIA)
docker compose --profile full up     # CPU + ChromaDB

Verify:

curl localhost:8080/v1/health
curl -X POST localhost:8080/v1/review \
  -H 'Content-Type: application/json' \
  -d '{"prompt":"What color is the sky?","response":"The sky is blue."}'
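
The same check can be scripted from Python. The sketch below uses only the standard library and reuses the endpoint and payload from the curl call above. The helper names `build_review_request` and `send_review` and the default base URL are illustrative, and nothing is assumed about the shape of the JSON the server returns.

```python
import json
import urllib.request


def build_review_request(prompt, response, base_url="http://localhost:8080"):
    """Build (but do not send) a POST request for the /v1/review endpoint."""
    payload = json.dumps({"prompt": prompt, "response": response}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/review",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_review(request, timeout=10):
    """Send the request and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the container running, `send_review(build_review_request("What color is the sky?", "The sky is blue."))` should return the same body as the curl command.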

Production scaling

# Multi-worker (each gets own NLI model instance)
uvicorn director_ai.server:app --workers 4 --host 0.0.0.0 --port 8080

# GPU multi-replica with Docker Compose
docker compose --profile gpu up --scale director-api-gpu=3
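
| Workload       | CPU     | RAM    | GPU VRAM | Latency      |
|----------------|---------|--------|----------|--------------|
| Heuristic only | 1 core  | 256 MB | -        | <0.1 ms      |
| ONNX GPU batch | 2 cores | 2 GB   | 1.2 GB   | 14.6 ms/pair |
| ONNX CPU batch | 4 cores | 2 GB   | -        | 383 ms/pair  |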

Full production guide: docs-site/deployment/production.md.

Usage

Score a single response

from director_ai.core import CoherenceScorer, GroundTruthStore

store = GroundTruthStore()
store.add("sky color", "The sky is blue due to Rayleigh scattering.")

scorer = CoherenceScorer(threshold=0.6, ground_truth_store=store)
approved, score = scorer.review("What color is the sky?", "The sky is green.")

print(approved)     # False — contradicts ground truth
print(score.score)  # 0.42

With a real LLM backend

from director_ai import CoherenceAgent

# Works with any OpenAI-compatible endpoint (llama.cpp, vLLM, Ollama, etc.)
agent = CoherenceAgent(llm_api_url="http://localhost:8080/completion")
result = agent.process("Explain quantum entanglement")

if result.halted:
    print("Output blocked — coherence too low")
else:
    print(result.output)

Token-level streaming with halt

from director_ai.core import StreamingKernel

kernel = StreamingKernel(hard_limit=0.4, window_size=5, window_threshold=0.5)

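# my_token_iterator: your iterable of LLM tokens (placeholder).
# my_scorer: your callable that returns a coherence score for a token (placeholder).
session = kernel.stream_tokens(
    token_generator=my_token_iterator,
    coherence_callback=my_scorer,
)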

for event in session.events:
    if event.halted:
        print(f"\n[HALTED — {session.halt_reason}]")
        break
    print(event.token, end="")
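
To make the halt logic concrete, here is a minimal pure-Python sketch of a sliding-window interlock. It is not the StreamingKernel implementation. It assumes that generation halts when a single token score falls below `hard_limit`, or when the mean of the last `window_size` scores falls below `window_threshold`. The function name `stream_with_halt` and the reason strings are hypothetical.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, Tuple


def stream_with_halt(
    tokens: Iterable[str],
    score_token: Callable[[str], float],
    hard_limit: float = 0.4,
    window_size: int = 5,
    window_threshold: float = 0.5,
) -> Iterator[Tuple[str, bool, str]]:
    """Yield (token, halted, reason) and stop at the first halt condition."""
    window = deque(maxlen=window_size)  # keeps only the most recent scores
    for token in tokens:
        score = score_token(token)
        window.append(score)
        # Assumed rule 1: one very low score halts immediately.
        if score < hard_limit:
            yield token, True, "hard_limit"
            return
        # Assumed rule 2: a full window with a low average halts.
        if len(window) == window_size and sum(window) / window_size < window_threshold:
            yield token, True, "window_average"
            return
        yield token, False, ""


# Toy scores standing in for a real coherence callback.
toy_scores = {"The": 0.9, "sky": 0.9, "is": 0.8, "green": 0.3}
events = list(stream_with_halt(toy_scores, toy_scores.get))
```

In this toy run the first three tokens pass and the fourth triggers the hard limit, so a consumer that breaks on `halted` never emits it.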

NLI-based scoring (requires torch)

from director_ai.core import CoherenceScorer

scorer = CoherenceScorer(use_nli=True, threshold=0.6)
approved, score = scorer.review(
    "The Earth orbits the Sun.",
    "The Sun orbits the Earth."
)
print(score.h_logical)  # High — NLI detects contradiction

Custom knowledge base with ChromaDB

from director_ai.core import CoherenceScorer, VectorGroundTruthStore

store = VectorGroundTruthStore()  # Uses ChromaDB
store.add_fact("company policy", "Refunds are available within 30 days.")
store.add_fact("pricing", "Enterprise plan starts at $99/month.")

scorer = CoherenceScorer(ground_truth_store=store)
approved, score = scorer.review(
    "What is the refund policy?",
    "We offer full refunds within 90 days."  # Wrong
)
# approved = False — contradicts your KB

LangChain integration

pip install director-ai[langchain,nli]

from director_ai.integrations.langchain import DirectorAIGuard

guard = DirectorAIGuard(
    facts={"refund": "Refunds available within 30 days."},
    threshold=0.6,
    use_nli=True,
)

# Pipe after any LLM in a chain
chain = my_llm | guard
result = chain.invoke({"query": "What is the refund policy?"})

print(result["approved"])  # False if hallucinated
print(result["score"])     # 0.0–1.0 coherence

Raises HallucinationError if raise_on_fail=True. Async supported via ainvoke().

LlamaIndex integration

pip install director-ai[llamaindex,nli]

from director_ai.integrations.llamaindex import DirectorAIPostprocessor

postprocessor = DirectorAIPostprocessor(
    facts={"pricing": "Enterprise plan starts at $99/month."},
    threshold=0.6,
)

# Filters out hallucinated nodes before they reach the user
query_engine = index.as_query_engine(
    node_postprocessors=[postprocessor]
)
response = query_engine.query("What does Enterprise cost?")

Adds director_ai_score metadata to surviving nodes. Also usable standalone via postprocessor.check(query, response).

LangGraph integration

from director_ai.integrations.langgraph import director_ai_node, director_ai_conditional_edge

node = director_ai_node(facts={"policy": "Refunds within 30 days."}, on_fail="flag")
edge = director_ai_conditional_edge("output", "retry")
# Wire into your LangGraph StateGraph

Haystack integration

from director_ai.integrations.haystack import DirectorAIChecker

checker = DirectorAIChecker(facts={"policy": "Refunds within 30 days."})
result = checker.run(query="Refund policy?", replies=["60-day refunds."])
print(result["scores"])  # [CoherenceScore(...)]

CrewAI integration

from director_ai.integrations.crewai import DirectorAITool

tool = DirectorAITool(facts={"policy": "Refunds within 30 days."})
result = tool.check("Refund policy?", "We offer 30-day refunds.")
print(result["approved"])  # True

Score caching

from director_ai.core import CoherenceScorer

scorer = CoherenceScorer(
    threshold=0.6,
    cache_size=1024,   # LRU cache for streaming deduplication
    cache_ttl=300,     # TTL in seconds
)
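
For readers curious what an LRU cache with blake2b keys and a TTL might look like, here is a self-contained sketch. It is an illustration, not the contents of cache.py. The class name `ScoreCache`, its methods, and the key layout are assumptions.

```python
import hashlib
import time
from collections import OrderedDict


class ScoreCache:
    """LRU cache with a per-entry TTL, keyed by a blake2b digest of the inputs."""

    def __init__(self, max_size=1024, ttl_seconds=300.0, clock=time.monotonic):
        self.max_size = max_size
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = OrderedDict()  # key -> (expiry_time, score)

    @staticmethod
    def make_key(prompt, response):
        digest = hashlib.blake2b(digest_size=16)
        data = prompt.encode("utf-8")
        # Length-prefix the prompt so ("ab", "c") and ("a", "bc") hash differently.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
        digest.update(response.encode("utf-8"))
        return digest.hexdigest()

    def get(self, prompt, response):
        key = self.make_key(prompt, response)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expiry, score = entry
        if self._clock() >= expiry:
            del self._entries[key]  # entry outlived its TTL
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return score

    def put(self, prompt, response, score):
        key = self.make_key(prompt, response)
        self._entries[key] = (self._clock() + self.ttl_seconds, score)
        self._entries.move_to_end(key)
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict least recently used
```

During streaming, the same (prompt, partial response) pair can be scored repeatedly, so a cache like this skips the NLI call whenever the pair was seen within the TTL.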

More examples

| Example                | Backend   | What it shows                     |
|------------------------|-----------|-----------------------------------|
| quickstart.py          | None      | Guard any output in 10 lines      |
| openai_guard.py        | OpenAI    | Score + streaming halt for GPT-4o |
| ollama_guard.py        | Ollama    | Local LLM guard with Llama 3      |
| langchain_guard.py     | LangChain | Full chain guardrail              |
| streaming_halt_demo.py | Simulated | All 3 halt mechanisms visualised  |

Interactive demo

Open in Colab

pip install director-ai gradio
python demo/app.py

Scoring Formula

Coherence = 1 - (0.6 * H_logical + 0.4 * H_factual)

| Component | Source              | Range | Meaning                   |
|-----------|---------------------|-------|---------------------------|
| H_logical | NLI model (DeBERTa) | 0-1   | Contradiction probability |
| H_factual | RAG retrieval       | 0-1   | Ground truth deviation    |

  • Score >= 0.6 → approved (configurable)
  • Score < 0.5 → safety kernel emergency halt
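
As a worked example, the formula and the two thresholds can be written out directly. The function names below are illustrative. The weight names `w_logic` and `w_fact` follow the defaults mentioned under Known Limitations. Labelling the band between 0.5 and 0.6 as "rejected" is an assumption, since the README only names the two outer cases.

```python
def coherence(h_logical, h_factual, w_logic=0.6, w_fact=0.4):
    """Coherence = 1 - (w_logic * H_logical + w_fact * H_factual)."""
    if not (0.0 <= h_logical <= 1.0 and 0.0 <= h_factual <= 1.0):
        raise ValueError("H_logical and H_factual must lie in [0, 1]")
    return 1.0 - (w_logic * h_logical + w_fact * h_factual)


def classify(score, approve_threshold=0.6, halt_threshold=0.5):
    """Map a coherence score to the outcomes listed above."""
    if score >= approve_threshold:
        return "approved"
    if score < halt_threshold:
        return "halt"
    return "rejected"  # assumed label for scores in [0.5, 0.6)


# A strong contradiction signal with moderate factual deviation:
# 1 - (0.6 * 0.8 + 0.4 * 0.3) = 1 - 0.60 = 0.40, which is below the halt threshold.
example_score = coherence(h_logical=0.8, h_factual=0.3)
example_outcome = classify(example_score)
```

Because H_logical carries the larger weight, a confident NLI contradiction alone (H_logical near 1, H_factual near 0) already pulls the score down to about 0.4.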

Benchmarks

Accuracy — LLM-AggreFact (29,320 samples, 11 datasets)

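| Model                             | AggreFact Balanced Acc | Latency (measured)             |
|-----------------------------------|------------------------|--------------------------------|
| FactCG-DeBERTa-v3-Large (default) | 75.8%                  | 14.6 ms/pair (ONNX GPU batch)  |
| FactCG-DeBERTa-v3-Large (PyTorch) | 75.8%                  | 19 ms/pair (GPU batch)         |
| DeBERTa-v3-base (legacy)          | 66.2%                  | 220 ms (CPU)                   |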

Per-dataset highlights (FactCG, threshold 0.46):

| Dataset         | Balanced Accuracy | Notes                      |
|-----------------|-------------------|----------------------------|
| Lfqa            | 87.3%             | Long-form QA               |
| TofuEval-MediaS | 86.2%             | Media summarization        |
| ClaimVerify     | 82.1%             | Factual claims             |
| FactCheck-GPT   | 81.1%             | GPT-generated text         |
| RAGTruth        | 79.0%             | RAG-specific hallucination |
| AggreFact-CNN   | 68.8%             | Summarization (weak spot)  |
| ExpertQA        | 59.1%             | Expert Q&A (weak spot)     |
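
For reference, balanced accuracy is the mean of per-class recall, so a model cannot score well by always predicting the majority class. The sketch below computes it for binary labels after thresholding scores at 0.46. The helper names, the label convention (1 = supported, 0 = unsupported) and the toy data are illustrative, and treating a score at or above the threshold as "supported" is an assumption.

```python
def predict(scores, threshold=0.46):
    """Turn model scores into binary labels (assumed: >= threshold means supported)."""
    return [1 if s >= threshold else 0 for s in scores]


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall for binary labels 0 and 1."""
    recalls = []
    for label in (0, 1):
        total = sum(1 for t in y_true if t == label)
        if total == 0:
            raise ValueError(f"no examples with label {label}")
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls.append(correct / total)
    return sum(recalls) / len(recalls)


# Toy data: four supported claims, two unsupported ones.
y_true = [1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.5, 0.3, 0.2, 0.6]
toy_ba = balanced_accuracy(y_true, predict(scores))
```

Here recall is 3/4 on the supported class and 1/2 on the unsupported class, giving a balanced accuracy of 0.625, while plain accuracy would be 4/6.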

Latency — Measured on GTX 1060 6GB

| Pipeline                     | Median  | Per-pair | Notes               |
|------------------------------|---------|----------|---------------------|
| ONNX GPU batch (16 pairs)    | 233 ms  | 14.6 ms  | Fastest             |
| PyTorch GPU batch (16 pairs) | 304 ms  | 19.0 ms  | 10.4x vs sequential |
| ONNX GPU sequential          | 1042 ms | 65.1 ms  |                     |
| PyTorch GPU sequential       | 3145 ms | 196.6 ms | Baseline            |
| ONNX CPU batch (16 pairs)    | 6124 ms | 383 ms   | No GPU              |
| Lightweight (no NLI)         | 0.08 ms | 0.08 ms  | Heuristic only      |
| Streaming session            | 0.02 ms | 0.02 ms  | Token-level         |

Run: python -m benchmarks.latency_bench --nli --onnx --device cuda
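
The per-pair column is the median batch time divided by the number of pairs. The short check below reproduces that arithmetic from the medians in the table. It assumes the sequential rows also process 16 pairs, which matches their per-pair figures.

```python
BATCH_SIZE = 16  # pairs per measurement (assumed for the sequential rows too)

median_ms = {
    "ONNX GPU batch": 233,
    "PyTorch GPU batch": 304,
    "ONNX GPU sequential": 1042,
    "PyTorch GPU sequential": 3145,
    "ONNX CPU batch": 6124,
}

# Per-pair latency is the median divided by the number of pairs scored.
per_pair_ms = {name: ms / BATCH_SIZE for name, ms in median_ms.items()}

# Batching speedup for PyTorch on GPU: sequential median over batched median.
pytorch_batch_speedup = median_ms["PyTorch GPU sequential"] / median_ms["PyTorch GPU batch"]

for name, value in per_pair_ms.items():
    print(f"{name}: {value:.1f} ms/pair")
print(f"PyTorch batching speedup: about {pytorch_batch_speedup:.0f}x")
```

The same division gives a quick capacity estimate: at 14.6 ms/pair, one GPU worker scores roughly 68 pairs per second.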

Head-to-head (same benchmark — LLM-AggreFact leaderboard)

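| Tool                 | Bal. Acc | Params | Latency            | Streaming |
|----------------------|----------|--------|--------------------|-----------|
| Bespoke-MiniCheck-7B | 77.4%    | 7B     | ~100 ms (GPU)      | No        |
| Director-AI (FactCG) | 75.8%    | 0.4B   | 14.6 ms (ONNX GPU) | Yes       |
| MiniCheck-Flan-T5-L  | 75.0%    | 0.8B   | ~120 ms            | No        |
| MiniCheck-DeBERTa-L  | 72.6%    | 0.4B   | ~120 ms            | No        |
| HHEM-2.1-Open        | 71.8%    | ~0.4B  | ~200 ms            | No        |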

Honest assessment: 75.8% balanced accuracy ranks 4th on the LLM-AggreFact leaderboard — within 1.6pp of the top 7B model at 17x fewer params. With ONNX GPU batching, Director-AI is now faster than every competitor at this accuracy tier (14.6 ms/pair vs ~100-200 ms). Director-AI's unique value is the system: NLI + KB facts + streaming token-level halt. No competitor offers real-time streaming gating.

Full competitor analysis with per-class metrics, E2E results, and 14 benchmark scripts: benchmarks/comparison/COMPETITOR_COMPARISON.md. Reproduce all results with benchmarks/ scripts.

Known Limitations

  1. Heuristic fallback is weak: Without pip install director-ai[nli], scoring uses word-overlap heuristics (~55% accuracy). Pass strict_mode=True to disable heuristics and return neutral 0.5 instead.
  2. Summarisation and expert Q&A are weak spots: NLI models (including DeBERTa) under-perform on summarisation (AggreFact-CNN: 68.8%) and expert Q&A (ExpertQA: 59.1%). Best for factual QA and claim verification.
  3. Chunked NLI: Bidirectional chunked scoring (v1.5.0+) handles long premises and hypotheses. Very short chunks may lose context — prefer documents with 3+ sentences per chunk.
  4. Weights are domain-dependent: Default w_logic=0.6, w_fact=0.4 suits general QA. Adjust via constructor args for your domain.
  5. ONNX CPU is slow: 383 ms/pair without GPU. Use onnxruntime-gpu for production NLI workloads.
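
One way to follow the chunking advice in item 3 is to pre-split documents into groups of at least three sentences before adding them to the knowledge base. The helper below is a hypothetical preprocessing step, not the library's chunker, and its regex sentence splitter is deliberately naive (it breaks on ., ! or ? followed by whitespace).

```python
import re


def chunk_sentences(text, sentences_per_chunk=3):
    """Group sentences into chunks of at least `sentences_per_chunk` where possible."""
    # Naive split: a sentence ends at ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    sentences = [p.strip() for p in parts if p.strip()]
    chunks = [
        sentences[i:i + sentences_per_chunk]
        for i in range(0, len(sentences), sentences_per_chunk)
    ]
    # Fold a short trailing chunk into the previous one so no chunk is too small.
    if len(chunks) > 1 and len(chunks[-1]) < sentences_per_chunk:
        tail = chunks.pop()
        chunks[-1].extend(tail)
    return [" ".join(chunk) for chunk in chunks]


policy_text = (
    "Refunds are available within 30 days. Requests need a receipt. "
    "Shipping fees are not refunded. Store credit is offered after 30 days. "
    "Gift cards are final sale. Exchanges are free. Contact support to start."
)
policy_chunks = chunk_sentences(policy_text)
```

The seven sentences above become two chunks of three and four sentences. A document with fewer than three sentences stays as a single short chunk.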

Package Structure

src/director_ai/
├── core/                           # Production API
│   ├── agent.py                    # CoherenceAgent — main orchestrator
│   ├── scorer.py                   # Dual-entropy coherence scorer
│   ├── kernel.py                   # Safety kernel (streaming interlock)
│   ├── streaming.py                # Token-level streaming oversight
│   ├── async_streaming.py          # Non-blocking async streaming
│   ├── nli.py                      # NLI scorer (FactCG-DeBERTa-v3-Large)
│   ├── actor.py                    # LLM generator interface
│   ├── knowledge.py                # Ground truth store (in-memory)
│   ├── vector_store.py             # Vector store (ChromaDB / sentence-transformers)
│   ├── cache.py                    # LRU score cache (blake2b, TTL)
│   ├── policy.py                   # YAML declarative policy engine
│   ├── audit.py                    # Structured JSONL audit logger
│   ├── tenant.py                   # Multi-tenant KB isolation
│   ├── sanitizer.py                # Prompt injection hardening
│   └── types.py                    # CoherenceScore, ReviewResult
├── integrations/                   # Framework integrations
│   ├── langchain.py                # LangChain Runnable guardrail
│   ├── llamaindex.py               # LlamaIndex postprocessor
│   ├── langgraph.py                # LangGraph node + conditional edge
│   ├── haystack.py                 # Haystack 2.x component
│   └── crewai.py                   # CrewAI tool
├── cli.py                          # CLI: review, process, batch, serve
├── server.py                       # FastAPI REST wrapper
benchmarks/                         # AggreFact evaluation suite
training/                           # DeBERTa fine-tuning pipeline

Testing

pytest tests/ -v

License & Pricing

Dual-licensed:

  1. Open-Source: GNU AGPL v3.0 — research, personal use, open-source projects. Full source, self-host, no restrictions beyond AGPL copyleft obligations.
  2. Commercial: Proprietary license from ANULUM — removes copyleft, allows closed-source and SaaS deployment.

Commercial Tiers

| Tier       | Monthly | Yearly | Best for |
|------------|---------|--------|----------|
| Hobbyist   | $9      | $90    | Students, side projects, experiments. 1 local deployment, community support (GitHub/Discord), delayed updates. |
| Indie      | $49     | $490   | Solo devs, bootstrapped teams (<$2M ARR). 1 production deployment, email support, 12 months updates. |
| Pro        | $249    | $2,490 | Startups & scale-ups. Unlimited internal devs, multiple envs, Slack priority support, early releases. |
| Enterprise | Custom  | Custom | Large orgs. SLA (99.9%), on-prem/air-gapped, SOC2/HIPAA-ready, dedicated engineer, custom NLI fine-tunes. |

Perpetual license: $1,299 one-time (Indie equivalent). First 50 commercial licensees: 50% off first year.

Contact: anulum.li/contact or invest@anulum.li

See NOTICE for full terms and third-party acknowledgements.

Q: Is the core library Apache 2.0?
A: No. The entire Director-AI package (core, server, integrations) is licensed under AGPL v3.0. A commercial license is available for closed-source deployment.

Roadmap

Shipped in v1.2.0

  • Score caching — LRU cache with blake2b keys and TTL for streaming dedup
  • Framework integrations — LangGraph nodes, Haystack components, CrewAI tools
  • Quantized NLI — 8-bit bitsandbytes quantization for <80ms GPU inference
  • Upgraded embeddings — bge-large-en-v1.5 via SentenceTransformerBackend
  • MkDocs site — full API reference, deployment guides, domain cookbooks
  • Enhanced demo — side-by-side comparison with token-level highlighting

Shipped in v1.1.0

  • Native SDK interceptors: guard(OpenAI(), facts={...}) wraps any OpenAI/Anthropic client with transparent coherence scoring
  • MiniCheck backend — 72.6% balanced accuracy on LLM-AggreFact
  • Evidence return — every CoherenceScore carries top-K chunks, NLI premise/hypothesis, and similarity distances
  • Graceful fallbacks: fallback="retrieval" / "disclaimer" + soft warning zone + streaming on_halt callback

Completed

  • Score caching, LangGraph/Haystack/CrewAI, quantized NLI, MkDocs site
  • director-ai eval — structured CLI benchmarking
  • Native OpenAI/Anthropic SDK interceptors (guard())
  • Evidence schema on all rejections
  • Graceful fallback patterns (retrieval, disclaimer, soft warning)
  • End-to-end guardrail benchmark (600+ traces, 8 metrics)
  • HuggingFace Spaces live demo

Next

  • Cross-encoder reranking for RAG retrieval
  • LLM-as-judge hybrid escalation for low-confidence NLI
  • director-ai serve --workers N multi-process mode

Citation

@software{sotek2026director,
  author    = {Sotek, Miroslav},
  title     = {Director-AI: Real-time LLM Hallucination Guardrail},
  year      = {2026},
  url       = {https://github.com/anulum/director-ai},
  version   = {1.5.1},
  license   = {AGPL-3.0-or-later}
}

Contributing

See CONTRIBUTING.md for guidelines. By contributing, you agree to the Code of Conduct and AGPL v3 licensing terms.

Security

See SECURITY.md for reporting vulnerabilities.
