# Director-AI

Real-time LLM hallucination guardrail — NLI + RAG fact-checking with token-level streaming halt.
## What It Does

Director-AI sits between your LLM and the user. It scores every output for hallucination before it reaches anyone — and can halt generation mid-stream if coherence drops below threshold.

```python
from director_ai import CoherenceAgent

agent = CoherenceAgent()
result = agent.process("What color is the sky?")

print(result.coherence.score)      # 0.94 — high coherence
print(result.coherence.approved)   # True
print(result.coherence.h_logical)  # 0.10 — low contradiction probability
print(result.coherence.h_factual)  # 0.10 — low factual deviation
```
Three things make it different:
- Token-level streaming halt — not post-hoc review. The safety kernel monitors coherence token-by-token and severs output the moment it degrades.
- Dual-entropy scoring — NLI contradiction detection (DeBERTa) + RAG fact-checking against your own knowledge base. Both must pass.
- Your data, your rules — ingest PDFs, directories, or any text into a ChromaDB-backed knowledge base. The scorer checks LLM output against your ground truth, not a generic model.
## Architecture

```
┌──────────────────────────┐
│     Coherence Agent      │
│      (Orchestrator)      │
└─────────┬────────────────┘
          │
 ┌────────┼─────────────────┐
 │        │                 │
┌──────▼──────┐ ┌───▼──────────┐ ┌───▼────────────┐
│  Generator  │ │  Coherence   │ │    Safety      │
│    (LLM     │ │   Scorer     │ │    Kernel      │
│   backend)  │ │              │ │  (streaming    │
│             │ │  NLI + RAG   │ │   interlock)   │
└─────────────┘ └───┬──────────┘ └────────────────┘
                    │
          ┌─────────▼─────────┐
          │   Ground Truth    │
          │      Store        │
          │ (ChromaDB / RAM)  │
          └───────────────────┘
```
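For intuition, the flow in the diagram boils down to a generate → score → gate loop. A minimal sketch of that loop (illustrative only — these function names are stand-ins, not the library's API):

```python
def generate(prompt):
    # Stand-in for the LLM backend: yields output tokens as they are produced.
    for token in ["The", " sky", " is", " blue", "."]:
        yield token

def score(prompt, text):
    # Stand-in for the dual-entropy scorer: returns a coherence value in [0, 1].
    return 0.2 if "green" in text else 0.9

def run(prompt, hard_limit=0.5):
    """Orchestrator: stream tokens, re-score the running output, halt on low coherence."""
    output = ""
    for token in generate(prompt):
        output += token
        if score(prompt, output) < hard_limit:
            return output, True   # severed mid-stream by the safety kernel
    return output, False          # completed and approved

text, halted = run("What color is the sky?")
```

The real components are pluggable along exactly these seams: any token generator, any scorer, and the kernel's halt policy in between.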
Installation
# Basic install (heuristic scoring, no GPU needed)
pip install director-ai
# With NLI model (DeBERTa-based contradiction detection)
pip install director-ai[nli]
# With vector store (ChromaDB for custom knowledge bases)
pip install director-ai[vector]
# With LangChain or LlamaIndex
pip install director-ai[langchain]
pip install director-ai[llamaindex]
# With REST API server
pip install director-ai[server]
# Fine-tuning pipeline
pip install director-ai[train]
# Everything
pip install "director-ai[nli,vector,server]"
# Development
git clone https://github.com/anulum/director-ai.git
cd director-ai
pip install -e ".[dev]"
## Usage

### Score a single response

```python
from director_ai.core import CoherenceScorer, GroundTruthStore

store = GroundTruthStore()
store.add("sky color", "The sky is blue due to Rayleigh scattering.")

scorer = CoherenceScorer(threshold=0.6, ground_truth_store=store)
approved, score = scorer.review("What color is the sky?", "The sky is green.")

print(approved)     # False — contradicts ground truth
print(score.score)  # 0.42
```
### With a real LLM backend

```python
from director_ai import CoherenceAgent

# Works with any OpenAI-compatible endpoint (llama.cpp, vLLM, Ollama, etc.)
agent = CoherenceAgent(llm_api_url="http://localhost:8080/completion")

result = agent.process("Explain quantum entanglement")
if result.halted:
    print("Output blocked — coherence too low")
else:
    print(result.output)
```
### Token-level streaming with halt

```python
from director_ai.core import StreamingKernel

kernel = StreamingKernel(hard_limit=0.4, window_size=5, window_threshold=0.5)
session = kernel.stream_tokens(
    token_generator=my_token_iterator,
    coherence_callback=lambda tok: my_scorer(tok),
)

for event in session.events:
    if event.halted:
        print(f"\n[HALTED — {session.halt_reason}]")
        break
    print(event.token, end="")
```
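Those three parameters imply two distinct halt triggers: a hard per-token floor (`hard_limit`) and a rolling-window average (`window_size`, `window_threshold`). A toy re-implementation of that rule, to show how they interact — a sketch under those assumptions, not the library's kernel:

```python
from collections import deque

def stream_with_halt(scores, hard_limit=0.4, window_size=5, window_threshold=0.5):
    """Return (indices of accepted tokens, halt reason or None)."""
    window = deque(maxlen=window_size)
    accepted = []
    for i, s in enumerate(scores):
        if s < hard_limit:
            return accepted, "hard_limit"        # single token below the hard floor
        window.append(s)
        if len(window) == window_size and sum(window) / window_size < window_threshold:
            return accepted, "window_average"    # sustained degradation over the window
        accepted.append(i)
    return accepted, None                        # stream completed cleanly

# A stream that degrades gradually: no token breaches the hard floor,
# but the rolling average eventually does.
tokens, reason = stream_with_halt([0.9, 0.8, 0.6, 0.45, 0.42, 0.41, 0.41])
# reason == "window_average"
```

The hard limit catches a single catastrophic token; the window catches a slow slide that never dips below the floor on any one token.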
### NLI-based scoring (requires torch)

```python
from director_ai.core import CoherenceScorer

scorer = CoherenceScorer(use_nli=True, threshold=0.6)
approved, score = scorer.review(
    "The Earth orbits the Sun.",
    "The Sun orbits the Earth.",
)
print(score.h_logical)  # High — NLI detects contradiction
```
### Custom knowledge base with ChromaDB

```python
from director_ai.core import CoherenceScorer, VectorGroundTruthStore

store = VectorGroundTruthStore()  # Uses ChromaDB
store.add_fact("company policy", "Refunds are available within 30 days.")
store.add_fact("pricing", "Enterprise plan starts at $99/month.")

scorer = CoherenceScorer(ground_truth_store=store)
approved, score = scorer.review(
    "What is the refund policy?",
    "We offer full refunds within 90 days.",  # Wrong
)
# approved = False — contradicts your KB
```
## Integration examples

See `examples/` for ready-to-run integrations:

| Example | Backend | What it shows |
|---|---|---|
| `quickstart.py` | None | Guard any output in 10 lines |
| `openai_guard.py` | OpenAI | Score + streaming halt for GPT-4o |
| `ollama_guard.py` | Ollama | Local LLM guard with Llama 3 |
| `langchain_guard.py` | LangChain | Output checker for chains |
| `streaming_halt_demo.py` | Simulated | All 3 halt mechanisms visualised |
## Interactive demo

Try Director-AI in the browser — no install needed. Or run the Gradio demo locally:

```bash
pip install director-ai gradio
python demo/app.py
```
## Scoring Formula

```
Coherence = 1 - (0.6 * H_logical + 0.4 * H_factual)
```
| Component | Source | Range | Meaning |
|---|---|---|---|
| H_logical | NLI model (DeBERTa) | 0-1 | Contradiction probability |
| H_factual | RAG retrieval | 0-1 | Ground truth deviation |
- Score >= 0.6 → approved (configurable)
- Score < 0.5 → safety kernel emergency halt
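The formula is simple enough to evaluate by hand. A direct transcription with the two thresholds above (a hypothetical helper for illustration, not part of the package):

```python
def coherence(h_logical: float, h_factual: float) -> float:
    # Weighted dual-entropy blend: logical contradiction weighs
    # more heavily (0.6) than factual deviation (0.4).
    return 1 - (0.6 * h_logical + 0.4 * h_factual)

good = coherence(h_logical=0.10, h_factual=0.10)  # ≈ 0.90 → approved (>= 0.6)
bad = coherence(h_logical=0.9, h_factual=0.6)     # ≈ 0.22 → emergency halt (< 0.5)
```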
## Benchmarks
Evaluated on LLM-AggreFact (29,320 samples across 11 datasets):
| Model | AggreFact Balanced Acc | Latency (avg) |
|---|---|---|
| DeBERTa-v3-base (baseline) | 66.2% | 220 ms |
| Fine-tuned DeBERTa-v3-large | 64.7% | 223 ms |
| Fine-tuned DeBERTa-v3-base | 59.0% | 220 ms |
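Balanced accuracy, the metric in these tables, is the unweighted mean of per-class recall, which makes it robust to the label imbalance typical of hallucination datasets. A small self-contained illustration (not taken from the benchmark code):

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of recall over the classes present in y_true."""
    recalls = []
    for c in set(y_true):
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        total = sum(1 for t in y_true if t == c)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)

y_true = [0] * 9 + [1] * 2   # 9 supported claims, 2 hallucinated
y_pred = [0] * 9 + [1, 0]    # all supported correct, 1 of 2 hallucinations caught
bacc = balanced_accuracy(y_true, y_pred)  # 0.75, vs plain accuracy 10/11 ≈ 0.91
```

A classifier that labels everything "supported" would score ~91% plain accuracy here but only 50% balanced accuracy, which is why the tables use the latter.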
Per-dataset highlights:
| Dataset | Balanced Accuracy | Notes |
|---|---|---|
| Reveal | 80.7% | Strong on factual claims |
| FactCheck-GPT | 71.7% | Good on GPT-generated text |
| Lfqa | 64.8% | Long-form QA |
| RAGTruth | 58.9% | RAG-specific hallucination |
| AggreFact-CNN | 53.0% | Summarization (known weak spot) |
Head-to-head (same benchmark, same metric — LLM-AggreFact leaderboard):
| Tool | Bal. Acc | Params | Latency | Streaming |
|---|---|---|---|---|
| Bespoke-MiniCheck-7B | 77.4% | 7B | ~100 ms (GPU) | No |
| MiniCheck-Flan-T5-L | 75.0% | 0.8B | ~120 ms | No |
| MiniCheck-DeBERTa-L | 72.6% | 0.4B | ~120 ms | No |
| HHEM-2.1-Open | 71.8% | ~0.4B | ~200 ms | No |
| Director-AI | 66.2% | 0.4B | 220 ms | Yes |
Honest assessment: The NLI scorer alone is not state-of-the-art. Director-AI's value is in the system — combining NLI with your own KB facts, streaming token-level gating, and configurable halt thresholds. No competitor offers real-time streaming halt. The NLI component is pluggable; swap in any model that improves on these numbers.
Full comparison with SelfCheckGPT, RAGAS, NeMo Guardrails, Lynx, and others in `benchmarks/comparison/`. Benchmark scripts in `benchmarks/`. Fine-tuning pipeline in `training/`.
## Package Structure

```
src/director_ai/
├── core/                  # Production API
│   ├── agent.py           # CoherenceAgent — main orchestrator
│   ├── scorer.py          # Dual-entropy coherence scorer
│   ├── kernel.py          # Safety kernel (streaming interlock)
│   ├── streaming.py       # Token-level streaming oversight
│   ├── async_streaming.py # Non-blocking async streaming
│   ├── nli.py             # NLI scorer (DeBERTa)
│   ├── actor.py           # LLM generator interface
│   ├── knowledge.py       # Ground truth store (in-memory)
│   ├── vector_store.py    # Vector store (ChromaDB backend)
│   ├── policy.py          # YAML declarative policy engine
│   ├── audit.py           # Structured JSONL audit logger
│   ├── tenant.py          # Multi-tenant KB isolation
│   ├── sanitizer.py       # Prompt injection hardening
│   ├── bridge.py          # Physics-backed scorer (optional)
│   └── types.py           # CoherenceScore, ReviewResult
├── integrations/          # Framework integrations
│   ├── langchain.py       # LangChain Runnable guardrail
│   └── llamaindex.py      # LlamaIndex postprocessor
├── cli.py                 # CLI: review, process, batch, serve
└── server.py              # FastAPI REST wrapper
benchmarks/                # AggreFact evaluation suite
training/                  # DeBERTa fine-tuning pipeline
```
## Testing

```bash
pytest tests/ -v
```
## License
Dual-licensed:
- Open-Source: GNU AGPL v3.0 — academic research, personal use, open-source projects
- Commercial: Proprietary license from ANULUM — closed-source and commercial use
See NOTICE for full terms and third-party acknowledgements.
Citation
@software{sotek2026director,
author = {Sotek, Miroslav},
title = {Director-AI: Real-time LLM Hallucination Guardrail},
year = {2026},
url = {https://github.com/anulum/director-ai},
version = {1.0.0},
license = {AGPL-3.0-or-later}
}
## Contributing

See CONTRIBUTING.md for guidelines. By contributing, you agree to the Code of Conduct and AGPL v3 licensing terms.

## Security

See SECURITY.md for reporting vulnerabilities.