Unified AI reliability platform. One install, 12 diagnostic engines. Continuous monitoring, fault diagnosis, and compliance for LLM pipelines.

llmguardrail

One install. 12 diagnostic engines. Your AI pipeline's immune system.

Stop guessing why your LLM app is broken. llmguardrail runs 12 specialized diagnostic engines across your entire AI stack — RAG pipelines, agent loops, chain-of-thought reasoning, prompt stability, model swaps, and output drift — in a single scan.

pip install llm-sentry

The Problem

You have an LLM in production. It breaks. You don't know why.

  • Is it retrieval? Generation? Both?
  • Is your agent stuck in a loop?
  • Did your last prompt change break something?
  • Is your chain-of-thought reasoning actually coherent?
  • Did swapping models change behavior?

You're left stitching together 5+ tools, each with different APIs, different install processes, different report formats.

llmguardrail gives you one API, one report, one answer.

Quick Start

import llmguardrail as lg

# Run a full diagnostic scan
report = lg.scan(
    pipeline_name="my_rag_app",
    checks=["rag", "coherence", "agents"],
    rag_queries=[
        ("What is the return policy?",
         [("Returns accepted within 30 days", 0.95)],
         "Our return policy allows returns within 30 days."),
    ],
    coherence_traces=[
        ("1. User asked about returns\n2. Policy doc found\n3. Answer generated",
         "Returns are allowed within 30 days"),
    ],
    agent_task="answer customer questions",
    agent_actions=[
        ("search_docs", "found 3 results"),
        ("generate_answer", "answer produced"),
    ],
)

print(report.summary())
# Pipeline: my_rag_app
# Health: HEALTHY (85%)
# Checks: 3 run
#   [+] rag: healthy (90%)
#   [+] coherence: healthy (88%)
#   [+] agents: healthy (100%)

12 Diagnostic Engines

| Engine | What it catches | Package |
|---|---|---|
| RAG Pathology | Retrieval miss, poor grounding, context noise (Four Soils) | rag-pathology |
| Chain Probe | CASCADE fault analysis: finds root cause in multi-step pipelines | chain-probe |
| Agent Patrol | Futile cycles, oscillation, stall, drift, abandonment in agents | agent-patrol |
| CoT Coherence | Reasoning gaps, contradictions, unsupported conclusions | cot-coherence |
| Prompt Brittleness | Prompts that break under paraphrase stress | prompt-brittleness |
| Inject Lock | Prompt injection vulnerability detection | inject-lock |
| LLM Mutation | Mutation testing for prompt robustness | llm-mutation |
| Model Parity | Behavioral drift when swapping models | model-parity |
| Spec Drift | Output schema violations in production | spec-drift |
| Drift Sentinel | PR intent vs. actual code drift | drift-sentinel |
| LLM Contract | Runtime output contract enforcement | llm-contract |
| Context Recall | Context window position bias auditing | context-recall |

Every engine is zero-dependency. No OpenAI key required. No LLM calls to evaluate LLMs.
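
To give a feel for what an evaluation without LLM calls can look like, here is a minimal sketch of a purely lexical grounding score. It is an illustration only and not llmguardrail's actual algorithm; the names `tokenize` and `grounding_score` are hypothetical, and token overlap is a deliberately crude stand-in for a real grounding metric.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def grounding_score(answer: str, chunks: list[str]) -> float:
    """Fraction of distinct answer tokens found in at least one retrieved chunk.

    Hypothetical helper: uses only the standard library, no model calls.
    """
    answer_tokens = tokenize(answer)
    if not answer_tokens:
        return 0.0
    context_tokens: set[str] = set()
    for chunk in chunks:
        context_tokens |= tokenize(chunk)
    return len(answer_tokens & context_tokens) / len(answer_tokens)


score = grounding_score(
    "Our return policy allows returns within 30 days.",
    ["Returns accepted within 30 days"],
)
print(f"{score:.2f}")  # 4 of 8 distinct answer tokens are covered -> 0.50
```

A low score here would suggest the answer contains material the retrieved context does not support, which is the kind of signal a grounding check looks for.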

Use Individual Engines

# RAG diagnosis
from llmguardrail.rag import RAGDiagnoser, RAGQuery, Chunk

diagnoser = RAGDiagnoser("my_pipeline")
diagnosis = diagnoser.diagnose_query(RAGQuery(
    query="What is GDP?",
    retrieved_chunks=[Chunk("GDP is 6%", score=0.9)],
    generated_answer="GDP is 6%",
))
print(diagnosis.soil_type)  # SoilType.GOOD

# Agent monitoring
from llmguardrail.agents import PatrolMonitor

monitor = PatrolMonitor(task_description="answer questions")
report = monitor.observe(action="searching", result="no results")

# Chain fault analysis
from llmguardrail.chains import Pipeline

pipeline = Pipeline("my_chain")
# ... add steps with @pipeline.probe ...
result = pipeline.cascade(initial_input="data")
print(result.root_cause_step)
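
The "futile cycle" idea behind agent monitoring can be sketched without the library. The snippet below is a rough approximation under stated assumptions: it flags any (action, result) pair that repeats at least `min_repeats` times in a trace. The function name and the threshold are hypothetical, and llmguardrail's own detection logic may differ.

```python
from collections import Counter


def find_futile_cycles(steps, min_repeats=3):
    """Return (action, result) pairs seen at least `min_repeats` times.

    Hypothetical helper: an agent that keeps repeating the same action
    with the same outcome is probably not making progress.
    """
    counts = Counter(steps)
    return [pair for pair, n in counts.items() if n >= min_repeats]


trace = [
    ("search_docs", "no results"),
    ("rephrase_query", "ok"),
    ("search_docs", "no results"),
    ("search_docs", "no results"),
]
print(find_futile_cycles(trace))  # [('search_docs', 'no results')]
```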

Scan History & Trends

from llmguardrail import ScanStore, scan

report = scan(pipeline_name="prod", checks=["rag", "coherence"])

store = ScanStore("guardrail.db")
store.save(report)

# Track health over time
trend = store.trend("prod", last_n=10)
print(trend)  # [0.72, 0.75, 0.78, 0.81, ...]

# Full history
history = store.get_history("prod")
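
For intuition about how a score trend can be persisted, here is a small stand-in built on the standard library's `sqlite3` module. `TinyScanStore` and its schema are assumptions made for illustration; this is not how `ScanStore` is necessarily implemented.

```python
import sqlite3
import time


class TinyScanStore:
    """Hypothetical stand-in for a scan store; not llmguardrail's ScanStore."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS scans ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "pipeline TEXT NOT NULL, score REAL NOT NULL, ts REAL NOT NULL)"
        )

    def save(self, pipeline, score):
        """Append one health score for the given pipeline."""
        self.conn.execute(
            "INSERT INTO scans (pipeline, score, ts) VALUES (?, ?, ?)",
            (pipeline, score, time.time()),
        )
        self.conn.commit()

    def trend(self, pipeline, last_n=10):
        """Return the most recent `last_n` scores, oldest first."""
        rows = self.conn.execute(
            "SELECT score FROM scans WHERE pipeline = ? "
            "ORDER BY id DESC LIMIT ?",
            (pipeline, last_n),
        ).fetchall()
        return [row[0] for row in reversed(rows)]


store = TinyScanStore()
for s in (0.72, 0.75, 0.78, 0.81):
    store.save("prod", s)
print(store.trend("prod", last_n=3))  # [0.75, 0.78, 0.81]
```

Ordering by the autoincrement id rather than the timestamp keeps the trend stable even when several scans are saved within the same clock tick.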

CLI

# Check which engines are installed
llmguardrail status

# View scan history
llmguardrail history --pipeline my_app

Custom Checks

from llmguardrail import register_check, CheckResult, HealthStatus, scan

def my_custom_check(**kwargs) -> CheckResult:
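    # Your custom diagnostic logic goes here. run_my_diagnostics() is a
    # placeholder for your own function; it is not provided by llmguardrail.
    score = run_my_diagnostics()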
    return CheckResult(
        check_name="my_check",
        score=score,
        status=HealthStatus.from_score(score),
        recommendations=["Fix X"] if score < 0.7 else [],
    )

register_check("my_check", my_custom_check)
report = scan(checks=["rag", "my_check"])
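
To make the shape of a custom check concrete, the following self-contained sketch mirrors the pattern above with local stand-ins, so it runs without llmguardrail installed. Everything here is an assumption for illustration: `Status`, `Result`, the `DEGRADED` and `CRITICAL` names, the 0.7 and 0.4 thresholds, and the `length_check` logic are hypothetical and may not match the library's `HealthStatus` or `CheckResult`.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    """Hypothetical stand-in for a health status enum."""
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    CRITICAL = "critical"

    @classmethod
    def from_score(cls, score: float) -> "Status":
        # Thresholds are assumed for this sketch only.
        if score >= 0.7:
            return cls.HEALTHY
        if score >= 0.4:
            return cls.DEGRADED
        return cls.CRITICAL


@dataclass
class Result:
    """Hypothetical stand-in for a check result record."""
    check_name: str
    score: float
    status: Status
    recommendations: list = field(default_factory=list)


def length_check(outputs, max_chars=200):
    """Score is the share of outputs that stay within `max_chars`."""
    if not outputs:
        return Result("length_check", 1.0, Status.from_score(1.0))
    within = sum(len(text) <= max_chars for text in outputs)
    score = within / len(outputs)
    return Result(
        check_name="length_check",
        score=score,
        status=Status.from_score(score),
        recommendations=["Shorten long outputs"] if score < 0.7 else [],
    )


result = length_check(["a" * 10, "b" * 300])
print(result.status, result.score)  # Status.DEGRADED 0.5
```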

Why not RAGAS / DeepEval / TruLens?

| Feature | llmguardrail | RAGAS | DeepEval | TruLens |
|---|---|---|---|---|
| Needs OpenAI key | No | Yes | Yes | Yes |
| Diagnoses failure type | Yes | No (just scores) | No | No |
| Agent monitoring | Yes | No | No | No |
| Chain fault analysis | Yes | No | No | No |
| Prompt robustness | Yes | No | Partial | No |
| Zero dependencies | Yes | No | No | No |
| Works offline | Yes | No | No | No |

License

MIT
