Real-time LLM hallucination guardrail — NLI + RAG fact-checking with token-level streaming halt

Director-AI — Real-time LLM Hallucination Guardrail

What It Does

Director-AI sits between your LLM and the user. It scores every output for hallucination before it reaches anyone — and can halt generation mid-stream if coherence drops below threshold.

graph LR
    LLM["LLM<br/>(any provider)"] --> D["Director-AI"]
    D --> S["Scorer<br/>NLI + RAG"]
    D --> K["StreamingKernel<br/>token-level halt"]
    S --> V{Approved?}
    K --> V
    V -->|Yes| U["User"]
    V -->|No| H["HALT + evidence"]

Three things make it different:

  1. Token-level streaming halt — not post-hoc review. Severs output the moment coherence degrades.
  2. Dual-entropy scoring — NLI contradiction detection (DeBERTa) + RAG fact-checking against your knowledge base.
  3. Your data, your rules — ingest your own documents. The scorer checks against your ground truth.

Scope

100% Python — no compiled extensions required. Works on any platform with Python 3.10+.

| Layer | Packages | Install |
|---|---|---|
| Core (zero heavy deps) | CoherenceScorer, StreamingKernel, GroundTruthStore, SafetyKernel | pip install director-ai |
| NLI models | DeBERTa, FactCG, MiniCheck, ONNX Runtime | pip install director-ai[nli] |
| Vector DBs | ChromaDB, Pinecone, Weaviate, Qdrant | pip install director-ai[vector] |
| LLM judge | OpenAI, Anthropic escalation | pip install director-ai[openai] |
| Observability | OpenTelemetry spans | pip install director-ai[otel] |
| Server | FastAPI + Uvicorn | pip install director-ai[server] |

Quickstart

| Method | Command |
|---|---|
| pip install | pip install director-ai |
| CLI scaffold | director-ai quickstart --profile medical |
| Colab | Open in Colab |
| HF Spaces | Try it live |
| Docker | docker run -p 8080:8080 ghcr.io/anulum/director-ai:latest |

6-line guard

from director_ai import guard
from openai import OpenAI

client = guard(
    OpenAI(),
    facts={"refund_policy": "Refunds within 30 days only"},
    threshold=0.6,
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the refund policy?"}],
)

Catch and inspect a halt

from director_ai import guard, HallucinationError
from openai import OpenAI

client = guard(OpenAI(), facts={"policy": "Refunds within 30 days only"})

try:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "What is the refund policy?"}],
    )
except HallucinationError as exc:
    print(f"HALTED: coherence={exc.score.score:.3f}")
    print(f"Evidence: {exc.score.evidence}")

Score a response

from director_ai.core import CoherenceScorer, GroundTruthStore

store = GroundTruthStore()
store.add("sky color", "The sky is blue due to Rayleigh scattering.")

scorer = CoherenceScorer(threshold=0.6, ground_truth_store=store)
approved, score = scorer.review("What color is the sky?", "The sky is green.")

print(approved)     # False
print(score.score)  # 0.42

Streaming halt

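from director_ai.core import StreamingKernel

# token_generator: an iterable of tokens from your LLM stream (you supply this).
# my_scorer: a callable invoked with each token, expected to return a coherence score (you supply this).
kernel = StreamingKernel(hard_limit=0.4, window_size=5)
session = kernel.stream_tokens(token_generator, my_scorer)

if session.halted:
    print(f"Halted at token {session.halt_index}: {session.halt_reason}")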
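The kernel's internals are not shown in this README. As a rough illustration of what a token-level halt driven by `hard_limit` and `window_size` could look like, here is a minimal pure-Python sketch. It assumes two halt conditions: a single token score below `hard_limit`, or a sliding-window mean below a threshold. The names `StreamResult`, `stream_with_halt`, and `window_threshold` are hypothetical and are not part of the Director-AI API.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Iterable, Optional


@dataclass
class StreamResult:
    """Hypothetical result record; not the library's session object."""
    tokens: list = field(default_factory=list)
    halted: bool = False
    halt_index: Optional[int] = None
    halt_reason: str = ""


def stream_with_halt(
    tokens: Iterable[str],
    score_token: Callable[[str], float],
    hard_limit: float = 0.4,
    window_size: int = 5,
    window_threshold: float = 0.5,  # assumed parameter, for illustration only
) -> StreamResult:
    """Emit tokens until a coherence score breaches one of the limits."""
    result = StreamResult()
    window = deque(maxlen=window_size)  # keeps only the most recent scores
    for index, token in enumerate(tokens):
        score = score_token(token)
        window.append(score)
        if score < hard_limit:
            result.halted, result.halt_index = True, index
            result.halt_reason = f"score {score:.2f} below hard_limit"
            break
        window_full = len(window) == window_size
        if window_full and sum(window) / window_size < window_threshold:
            result.halted, result.halt_index = True, index
            result.halt_reason = "window average below window_threshold"
            break
        result.tokens.append(token)  # only approved tokens are emitted
    return result


# Toy scores, invented for illustration: coherence drops at the fourth token.
scores = {"The": 0.9, "sky": 0.9, "is": 0.8, "green": 0.2, ".": 0.9}
demo = stream_with_halt(scores.keys(), scores.__getitem__)
print(demo.halted, demo.halt_index, demo.tokens)  # True 3 ['The', 'sky', 'is']
```

The offending token is never appended to the output, which is the property that distinguishes a streaming halt from post-hoc review.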

Installation

pip install director-ai                      # heuristic scoring
pip install "director-ai[nli]"               # NLI model (DeBERTa)
pip install "director-ai[vector]"            # ChromaDB knowledge base
pip install "director-ai[nli,vector,server]" # production stack

Framework integrations: [langchain], [llamaindex], [langgraph], [haystack], [crewai].

Full installation guide: docs.

Docker

docker run -p 8080:8080 ghcr.io/anulum/director-ai:latest        # CPU
docker run --gpus all -p 8080:8080 ghcr.io/anulum/director-ai:gpu # GPU

Benchmarks

Accuracy — LLM-AggreFact (29,320 samples)

| Model | Balanced Acc | Params | Latency | Streaming |
|---|---|---|---|---|
| Bespoke-MiniCheck-7B | 77.4% | 7B | ~100 ms | No |
| Director-AI (FactCG) | 75.8% | 0.4B | 14.6 ms | Yes |
| MiniCheck-Flan-T5-L | 75.0% | 0.8B | ~120 ms | No |
| MiniCheck-DeBERTa-L | 72.6% | 0.4B | ~120 ms | No |

75.8% balanced accuracy at 17x fewer params than the leader. 14.6 ms/pair with ONNX GPU batching — faster than every competitor at this accuracy tier. Director-AI's unique value is the system: NLI + KB + streaming halt.
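For readers unfamiliar with the metric: balanced accuracy is the mean of the recall on each class, so a model cannot score well simply by favouring the majority label. The sketch below computes it on toy labels invented for illustration (they are not benchmark data) and also reproduces the parameter ratio quoted above from the table's 7B and 0.4B figures.

```python
def balanced_accuracy(y_true: list[int], y_pred: list[int]) -> float:
    """Mean of recall on the positive class and recall on the negative class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    positives = sum(y_true)
    negatives = len(y_true) - positives
    return 0.5 * (tp / positives + tn / negatives)


# Toy labels (1 = supported claim, 0 = unsupported claim), made up for this example.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"plain accuracy:    {plain_accuracy:.2f}")                      # 0.90
print(f"balanced accuracy: {balanced_accuracy(y_true, y_pred):.2f}")   # 0.75

# Parameter ratio between the 7B leader and the 0.4B model in the table.
param_ratio = 7.0 / 0.4
print(f"parameter ratio:   {param_ratio:.1f}x")                        # 17.5x
```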

Full results: benchmarks/comparison/COMPETITOR_COMPARISON.md.

Performance Trade-offs

| Backend | Latency (GPU) | Latency (CPU) | Accuracy | Streaming | When to use |
|---|---|---|---|---|---|
| Heuristic (no NLI) | <0.1 ms | <0.1 ms | ~55% | Yes | Prototyping, latency-critical |
| ONNX GPU batch | 14.6 ms/pair | 383 ms/pair | 75.8% | Yes | Production GPU |
| PyTorch GPU batch | 19.0 ms/pair | N/A | 75.8% | Yes | When ONNX unavailable |
| PyTorch GPU seq | 197 ms/pair | N/A | 75.8% | Yes | Single-pair scoring |
| Hybrid (NLI + LLM judge) | 200-500 ms | 500-2000 ms | ~78% est. | Yes | Max accuracy, summarisation |

Scoring cadence sets the per-token overhead: with score_every_n=4 the scoring callback runs once every four tokens, so its amortised cost per token is a quarter of the per-call cost. See docs-site/guide/streaming-overhead.md.
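The cadence arithmetic can be made concrete with a short calculation. This sketch assumes each scoring call costs a fixed 14.6 ms (the ONNX GPU batch figure from the table) regardless of how much text has accumulated, which may not hold in practice. The helper `amortised_overhead_ms` is a hypothetical name used only here.

```python
def amortised_overhead_ms(callback_ms: float, score_every_n: int) -> float:
    """Average scoring overhead per token when only every n-th token is scored."""
    if score_every_n < 1:
        raise ValueError("score_every_n must be >= 1")
    return callback_ms / score_every_n


# Assumed per-call cost, taken from the ONNX GPU batch row above.
CALLBACK_MS = 14.6

for n in (1, 2, 4):
    print(f"score_every_n={n}: {amortised_overhead_ms(CALLBACK_MS, n):.2f} ms/token")
```

A larger `score_every_n` lowers overhead but also means more unscored tokens can reach the user before a halt triggers, so the setting trades latency against how quickly a drop in coherence is caught.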

End-to-End Pipeline (300 traces)

Full pipeline (CoherenceAgent + GroundTruthStore + StreamingKernel):

| Metric | Value |
|---|---|
| Catch rate (recall) | 46.7% |
| Precision | 56.9% |
| F1 | 51.3% |
| Evidence coverage | 100% (every rejection includes supporting chunks) |
| Avg latency | 15.8 ms (p95: 40 ms) |

Dialogue catch rate is 80%; QA and summarisation are lower (36%, 24%) due to NLI weakness on short-form text. Hybrid mode improves summarisation. Run: python -m benchmarks.e2e_eval.
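The reported figures are mutually consistent, which can be checked in a few lines. F1 is the harmonic mean of precision and recall. The overall catch rate matches the unweighted mean of the three per-task rates, which would follow if the 300 traces were split evenly across tasks; that even split is an assumption here, not something stated above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the end-to-end table.
f1 = f1_score(precision=0.569, recall=0.467)
print(f"F1 = {f1:.1%}")  # F1 = 51.3%

# Per-task catch rates as reported; equal task sizes are assumed.
catch_rates = {"dialogue": 0.80, "qa": 0.36, "summarisation": 0.24}
mean_catch_rate = sum(catch_rates.values()) / len(catch_rates)
print(f"mean catch rate = {mean_catch_rate:.1%}")  # mean catch rate = 46.7%
```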

Domain Presets

8 built-in profiles with tuned thresholds. Four of them are shown here:

director-ai config --profile medical   # threshold=0.75, NLI on, reranker on
director-ai config --profile finance   # threshold=0.70, w_fact=0.6
director-ai config --profile legal     # threshold=0.68, w_logic=0.6
director-ai config --profile creative  # threshold=0.40, permissive
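To see what the different thresholds mean in practice, the sketch below applies the four threshold values from the comments above to a single score. The `PROFILE_THRESHOLDS` dict and the `approved` helper are hypothetical stand-ins, and the pass rule (score at or above the threshold) is an assumption rather than the library's documented behaviour.

```python
# Threshold values copied from the CLI comments above; the dict itself is hypothetical.
PROFILE_THRESHOLDS = {
    "medical": 0.75,
    "finance": 0.70,
    "legal": 0.68,
    "creative": 0.40,
}


def approved(score: float, profile: str) -> bool:
    """Assumed rule: a response passes when its score meets the profile threshold."""
    return score >= PROFILE_THRESHOLDS[profile]


# The same coherence score is judged differently under each profile.
score = 0.72
for name in PROFILE_THRESHOLDS:
    print(f"{name:<9} threshold={PROFILE_THRESHOLDS[name]:.2f} approved={approved(score, name)}")
```

With a score of 0.72, the response is rejected under the medical profile and accepted under the other three.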

Known Limitations

  1. Heuristic fallback is weak: Without [nli], scoring uses word-overlap heuristics (~55% accuracy). Set strict_mode=True to reject (divergence=0.9) when NLI is unavailable instead of guessing.
  2. Summarisation is a weak spot: NLI models under-perform on summarisation (AggreFact-CNN: 68.8%, ExpertQA: 59.1%).
  3. ONNX CPU is slow: 383 ms/pair without GPU. Use onnxruntime-gpu for production.
  4. Weights are domain-dependent: Default w_logic=0.6, w_fact=0.4 suits general QA. Adjust for your domain.
  5. Chunked NLI: Very short chunks (<3 sentences) may lose context.
  6. LLM-as-judge sends data externally: When llm_judge_enabled=True, truncated prompt+response (500 chars) are sent to the configured provider (OpenAI/Anthropic). Do not enable in privacy-sensitive deployments without user consent.
  7. guard() provider coverage: guard() auto-detects OpenAI-compatible clients (OpenAI, vLLM, Groq, LiteLLM, Ollama, Together) via client.chat.completions.create and Anthropic via client.messages.create. AWS Bedrock, Google Gemini, and Cohere have different SDK shapes — use the low-level CoherenceScorer.review() API instead.
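Limitations 1 and 4 interact: the weights decide how much a missing NLI signal matters. The sketch below is one plausible reading, not the library's formula. It assumes coherence is one minus a weighted sum of a logical and a factual divergence, and that an unavailable NLI model contributes 0.9 in strict mode or a neutral 0.5 otherwise. The function name and the combination rule are assumptions; only the default weights, the 0.9 and 0.5 values, and the 0.6 threshold come from this README.

```python
from typing import Optional

NEUTRAL_DIVERGENCE = 0.5  # value returned when NLI is unavailable and strict mode is off
STRICT_DIVERGENCE = 0.9   # value used when NLI is unavailable and strict mode is on


def coherence(
    logical_divergence: Optional[float],  # None models "NLI model not installed"
    factual_divergence: float,
    w_logic: float = 0.6,
    w_fact: float = 0.4,
    strict_mode: bool = False,
) -> float:
    """Assumed combination rule: 1 - weighted sum of the two divergences."""
    if logical_divergence is None:
        logical_divergence = STRICT_DIVERGENCE if strict_mode else NEUTRAL_DIVERGENCE
    return 1.0 - (w_logic * logical_divergence + w_fact * factual_divergence)


THRESHOLD = 0.6
lenient = coherence(None, factual_divergence=0.2, strict_mode=False)
strict = coherence(None, factual_divergence=0.2, strict_mode=True)
print(f"strict_mode=False: {lenient:.2f} approved={lenient >= THRESHOLD}")
print(f"strict_mode=True:  {strict:.2f} approved={strict >= THRESHOLD}")
```

Under these assumptions the same response passes without strict mode and is rejected with it, which is the intended effect of failing closed when the NLI model is missing.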

Migrating from 1.x

| 1.x name | 2.x name | Notes |
|---|---|---|
| DirectorModule | CoherenceScorer | Same API, new name |
| BackfireKernel | SafetyKernel | Same API, new name |
| StrangeLoopAgent | CoherenceAgent | Same API, new name |
| KnowledgeBase | GroundTruthStore | Same API, new name |
| MockActor | MockGenerator | Same API, new name |
| RealActor | LLMGenerator | Same API, new name |

Old names still work but emit DeprecationWarning. They will be removed in 3.0.
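How the old names are wired up is not described here. One common pattern for this kind of rename is a thin subclass that warns on construction, sketched below with a stand-in `CoherenceScorer` class. Both the stand-in class and the `deprecated_alias` helper are hypothetical and only illustrate the behaviour a caller would observe.

```python
import warnings


class CoherenceScorer:
    """Stand-in class for illustration; not the library's implementation."""

    def __init__(self, threshold: float = 0.6):
        self.threshold = threshold


def deprecated_alias(new_cls: type, old_name: str) -> type:
    """Return a subclass of new_cls that warns when instantiated under old_name."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated; use {new_cls.__name__} instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not at this helper
        )
        new_cls.__init__(self, *args, **kwargs)

    return type(old_name, (new_cls,), {"__init__": __init__})


DirectorModule = deprecated_alias(CoherenceScorer, "DirectorModule")

# Record warnings so the example can show what was emitted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    scorer = DirectorModule(threshold=0.7)

print(isinstance(scorer, CoherenceScorer), caught[0].category.__name__)
# True DeprecationWarning
```

To find remaining 1.x names before upgrading, running a test suite with `python -W error::DeprecationWarning` turns each such warning into an exception.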

Breaking changes in 2.3.0:

  • strict_mode=True now rejects (divergence=0.9) when NLI is unavailable, instead of returning neutral 0.5.
  • guard() uses duck-type detection instead of module-name checks. Custom clients that expose client.chat.completions.create are now accepted.
  • Enterprise modules are lazy-loaded since 2.2.0 — import director_ai no longer pulls heavy deps.
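Duck-type detection means a client is classified by the methods it exposes rather than by the package it comes from. The sketch below shows the general idea using the two attribute paths named in this README (`client.chat.completions.create` and `client.messages.create`). The function `detect_client_shape` and its return labels are hypothetical, and the fake clients are built with `SimpleNamespace` purely for demonstration.

```python
from types import SimpleNamespace


def detect_client_shape(client: object) -> str:
    """Classify a client by the callables it exposes, not by its module name."""
    completions = getattr(getattr(client, "chat", None), "completions", None)
    if callable(getattr(completions, "create", None)):
        return "openai-compatible"
    messages = getattr(client, "messages", None)
    if callable(getattr(messages, "create", None)):
        return "anthropic"
    return "unsupported"


# Fake clients for illustration; real SDK objects would be passed in practice.
openai_like = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=lambda **kw: None))
)
anthropic_like = SimpleNamespace(messages=SimpleNamespace(create=lambda **kw: None))
other = SimpleNamespace(invoke_model=lambda **kw: None)

for name, client in [("openai_like", openai_like),
                     ("anthropic_like", anthropic_like),
                     ("other", other)]:
    print(name, "->", detect_client_shape(client))
```

A client that falls into the "unsupported" branch corresponds to the SDK shapes listed under Known Limitations, where the low-level CoherenceScorer.review() API is the suggested route.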

Citation

@software{sotek2026director,
  author    = {Sotek, Miroslav},
  title     = {Director-AI: Real-time LLM Hallucination Guardrail},
  year      = {2026},
  url       = {https://github.com/anulum/director-ai},
  version   = {2.6.0},
  license   = {AGPL-3.0-or-later}
}

License

Dual-licensed:

  1. Open-Source: GNU AGPL v3.0 — research, personal use, open-source projects.
  2. Commercial: Proprietary license — removes copyleft for closed-source and SaaS.

See Licensing for pricing tiers and FAQ.

Contact: anulum.li/contact | invest@anulum.li

Contributing

See CONTRIBUTING.md. By contributing, you agree to AGPL v3 terms.
