Director-AI

Real-time LLM hallucination guardrail — NLI + RAG fact-checking with token-level streaming halt



What It Does

Director-AI sits between your LLM and the user. It scores every output for hallucination before it reaches anyone — and can halt generation mid-stream if coherence drops below the configured threshold.

graph LR
    LLM["LLM<br/>(any provider)"] --> D["Director-AI"]
    D --> S["Scorer<br/>NLI + RAG"]
    D --> K["StreamingKernel<br/>token-level halt"]
    S --> V{Approved?}
    K --> V
    V -->|Yes| U["User"]
    V -->|No| H["HALT + evidence"]

Three things make it different:

  1. Token-level streaming halt — not post-hoc review. Severs output the moment coherence degrades.
  2. Dual-entropy scoring — NLI contradiction detection (DeBERTa) + RAG fact-checking against your knowledge base.
  3. Your data, your rules — ingest your own documents. The scorer checks against your ground truth.
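The dual-entropy combination can be thought of as a weighted blend of the two sub-scores. The sketch below is illustrative only, not the library's internals; the default weights mirror the documented w_logic=0.6, w_fact=0.4 values:

```python
def combined_score(logic: float, fact: float,
                   w_logic: float = 0.6, w_fact: float = 0.4) -> float:
    """Blend the NLI (logic) and RAG (fact) sub-scores into one coherence score.

    Toy sketch: defaults mirror the documented w_logic=0.6, w_fact=0.4
    weights, not the actual CoherenceScorer implementation.
    """
    return w_logic * logic + w_fact * fact
```

A perfectly logical but factually unsupported answer (logic=1.0, fact=0.0) would score 0.6 under these weights — right at the default threshold, which is why weight tuning matters per domain.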

Scope

100% Python — no compiled extensions required. Works on any platform with Python 3.10+.

| Layer | Packages | Install |
|---|---|---|
| Core (zero heavy deps) | CoherenceScorer, StreamingKernel, GroundTruthStore, SafetyKernel | `pip install director-ai` |
| NLI models | DeBERTa, FactCG, MiniCheck, ONNX Runtime | `pip install director-ai[nli]` |
| Vector DBs | ChromaDB, Pinecone, Weaviate, Qdrant | `pip install director-ai[vector]` |
| LLM judge | OpenAI, Anthropic escalation | `pip install director-ai[openai]` |
| Observability | OpenTelemetry spans | `pip install director-ai[otel]` |
| Server | FastAPI + Uvicorn | `pip install director-ai[server]` |
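Because the heavy layers are optional extras, code can probe for them at runtime before choosing a scoring backend. A minimal sketch (the module names are assumed from the packages listed above):

```python
import importlib.util


def extra_installed(module_name: str) -> bool:
    # True if the runtime dependency backing an optional extra is importable.
    return importlib.util.find_spec(module_name) is not None


# e.g. extra_installed("onnxruntime") for [nli],
#      extra_installed("chromadb") for [vector]
```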

Quickstart

| Method | Command |
|---|---|
| pip install | `pip install director-ai` |
| CLI scaffold | `director-ai quickstart --profile medical` |
| Colab | Open in Colab |
| HF Spaces | Try it live |
| Docker | `docker run -p 8080:8080 ghcr.io/anulum/director-ai:latest` |

6-line guard

from director_ai import guard
from openai import OpenAI

client = guard(
    OpenAI(),
    facts={"refund_policy": "Refunds within 30 days only"},
    threshold=0.6,
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the refund policy?"}],
)

Catch and inspect a halt

from director_ai import guard, HallucinationError
from openai import OpenAI

client = guard(OpenAI(), facts={"policy": "Refunds within 30 days only"})

try:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "What is the refund policy?"}],
    )
except HallucinationError as exc:
    print(f"HALTED: coherence={exc.score.score:.3f}")
    print(f"Evidence: {exc.score.evidence}")

Score a response

from director_ai.core import CoherenceScorer, GroundTruthStore

store = GroundTruthStore()
store.add("sky color", "The sky is blue due to Rayleigh scattering.")

scorer = CoherenceScorer(threshold=0.6, ground_truth_store=store)
approved, score = scorer.review("What color is the sky?", "The sky is green.")

print(approved)     # False
print(score.score)  # 0.42

Streaming halt

from director_ai.core import StreamingKernel

kernel = StreamingKernel(hard_limit=0.4, window_size=5)
session = kernel.stream_tokens(token_generator, lambda tok: my_scorer(tok))

if session.halted:
    print(f"Halted at token {session.halt_index}: {session.halt_reason}")
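The `token_generator` and `my_scorer` arguments above are user-supplied. A toy sketch of their expected shapes — the keyword check here is a stand-in for a real NLI scorer, and the canned token list stands in for a streaming LLM client:

```python
def token_generator():
    # Yields tokens from any streaming source (here, a canned sequence).
    yield from ["The", " sky", " is", " green", "."]


def my_scorer(token: str) -> float:
    # Returns a coherence score in [0, 1] per token. A real deployment
    # would score the rolling window with NLI, not a keyword check.
    return 0.2 if "green" in token else 0.9
```

With `hard_limit=0.4`, the kernel would sever this stream at " green", since 0.2 falls below the limit.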

Installation

pip install director-ai                      # heuristic scoring
pip install director-ai[nli]                 # NLI model (DeBERTa)
pip install director-ai[vector]              # ChromaDB knowledge base
pip install "director-ai[nli,vector,server]" # production stack

Framework integrations: [langchain], [llamaindex], [langgraph], [haystack], [crewai].

Full installation guide: docs.

Docker

docker run -p 8080:8080 ghcr.io/anulum/director-ai:latest        # CPU
docker run --gpus all -p 8080:8080 ghcr.io/anulum/director-ai:gpu # GPU

Benchmarks

Accuracy — LLM-AggreFact (29,320 samples)

| Model | Balanced Acc | Params | Latency | Streaming |
|---|---|---|---|---|
| Bespoke-MiniCheck-7B | 77.4% | 7B | ~100 ms | No |
| Director-AI (FactCG) | 75.8% | 0.4B | 14.6 ms | Yes |
| MiniCheck-Flan-T5-L | 75.0% | 0.8B | ~120 ms | No |
| MiniCheck-DeBERTa-L | 72.6% | 0.4B | ~120 ms | No |

75.8% balanced accuracy with ~17x fewer parameters than the leader. 14.6 ms/pair with ONNX GPU batching — faster than every competitor at this accuracy tier. Director-AI's distinguishing value is the integrated system: NLI scoring + knowledge base + streaming halt.

Full results: benchmarks/comparison/COMPETITOR_COMPARISON.md.

Performance Trade-offs

| Backend | Latency (GPU) | Latency (CPU) | Accuracy | Streaming | When to use |
|---|---|---|---|---|---|
| Heuristic (no NLI) | <0.1 ms | <0.1 ms | ~55% | Yes | Prototyping, latency-critical |
| ONNX GPU batch | 14.6 ms/pair | 383 ms/pair | 75.8% | Yes | Production GPU |
| PyTorch GPU batch | 19.0 ms/pair | N/A | 75.8% | Yes | When ONNX unavailable |
| PyTorch GPU seq | 197 ms/pair | N/A | 75.8% | Yes | Single-pair scoring |
| Hybrid (NLI + LLM judge) | 200-500 ms | 500-2000 ms | ~78% est. | Yes | Max accuracy, summarisation |

Scoring cadence amortises per-token overhead: at score_every_n=4 the scoring callback runs once every 4 tokens, so divide its cost by 4. See docs-site/guide/streaming-overhead.md.
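The cadence arithmetic is simple division; a sketch:

```python
def amortised_overhead_ms(callback_ms: float, score_every_n: int) -> float:
    # The scoring callback fires once every score_every_n tokens, so
    # the amortised per-token cost is the callback cost over the cadence.
    return callback_ms / score_every_n


amortised_overhead_ms(14.6, 4)  # 3.65 ms per token for the ONNX GPU backend
```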

End-to-End Pipeline (300 traces)

Full pipeline (CoherenceAgent + GroundTruthStore + StreamingKernel):

| Metric | Value |
|---|---|
| Catch rate (recall) | 46.7% |
| Precision | 56.9% |
| F1 | 51.3% |
| Evidence coverage | 100% (every rejection includes supporting chunks) |
| Avg latency | 15.8 ms (p95: 40 ms) |

Dialogue catch rate is 80%; QA and summarisation are lower (36%, 24%) due to NLI weakness on short-form text. Hybrid mode improves summarisation. Run: python -m benchmarks.e2e_eval.
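The F1 figure follows from precision and recall in the usual way; a quick check:

```python
def f1(precision: float, recall: float) -> float:
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


round(f1(0.569, 0.467), 3)  # 0.513, matching the table
```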

Domain Presets

8 built-in profiles with tuned thresholds:

director-ai config --profile medical   # threshold=0.75, NLI on, reranker on
director-ai config --profile finance   # threshold=0.70, w_fact=0.6
director-ai config --profile legal     # threshold=0.68, w_logic=0.6
director-ai config --profile creative  # threshold=0.40, permissive

Known Limitations

  1. Heuristic fallback is weak: Without [nli], scoring uses word-overlap heuristics (~55% accuracy). Use strict_mode=True to reject (divergence=0.9) instead of guessing.
  2. Summarisation is a weak spot: NLI models under-perform on summarisation (AggreFact-CNN: 68.8%, ExpertQA: 59.1%).
  3. ONNX CPU is slow: 383 ms/pair without GPU. Use onnxruntime-gpu for production.
  4. Weights are domain-dependent: Default w_logic=0.6, w_fact=0.4 suits general QA. Adjust for your domain.
  5. Chunked NLI: Very short chunks (<3 sentences) may lose context.
  6. LLM-as-judge sends data externally: When llm_judge_enabled=True, truncated prompt+response (500 chars) are sent to the configured provider (OpenAI/Anthropic). Do not enable in privacy-sensitive deployments without user consent.
  7. guard() provider coverage: guard() auto-detects OpenAI-compatible clients (OpenAI, vLLM, Groq, LiteLLM, Ollama, Together) via client.chat.completions.create and Anthropic via client.messages.create. AWS Bedrock, Google Gemini, and Cohere have different SDK shapes — use the low-level CoherenceScorer.review() API instead.

Migrating from 1.x

| 1.x name | 2.x name | Notes |
|---|---|---|
| DirectorModule | CoherenceScorer | Same API, new name |
| BackfireKernel | SafetyKernel | Same API, new name |
| StrangeLoopAgent | CoherenceAgent | Same API, new name |
| KnowledgeBase | GroundTruthStore | Same API, new name |
| MockActor | MockGenerator | Same API, new name |
| RealActor | LLMGenerator | Same API, new name |

Old names still work but emit DeprecationWarning. They will be removed in 3.0.

Breaking changes in 2.3.0:

  • strict_mode=True now rejects (divergence=0.9) when NLI is unavailable, instead of returning neutral 0.5.
  • guard() uses duck-type detection instead of module-name checks. Custom clients that expose client.chat.completions.create are now accepted.
  • Enterprise modules are lazy-loaded since 2.2.0 — import director_ai no longer pulls heavy deps.
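The duck-type detection described above can be approximated by attribute probing. A sketch — not guard()'s actual code:

```python
def looks_openai_compatible(client) -> bool:
    # An object qualifies if it exposes client.chat.completions.create,
    # regardless of which module or SDK it comes from.
    completions = getattr(getattr(client, "chat", None), "completions", None)
    return callable(getattr(completions, "create", None))
```

This is why custom or self-hosted clients now pass detection, while SDKs with different shapes (Bedrock, Gemini, Cohere) do not.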

Citation

@software{sotek2026director,
  author    = {Sotek, Miroslav},
  title     = {Director-AI: Real-time LLM Hallucination Guardrail},
  year      = {2026},
  url       = {https://github.com/anulum/director-ai},
  version   = {2.3.0},
  license   = {AGPL-3.0-or-later}
}

License

Dual-licensed:

  1. Open-Source: GNU AGPL v3.0 — research, personal use, open-source projects.
  2. Commercial: Proprietary license — removes copyleft for closed-source and SaaS.

See Licensing for pricing tiers and FAQ.

Contact: anulum.li/contact | invest@anulum.li

Contributing

See CONTRIBUTING.md. By contributing, you agree to AGPL v3 terms.
