A composable validation framework for LLM inputs and outputs

Composable validation guardrails for LLM pipelines — accuracy, relevancy, toxicity, privacy, and bias checks in one pipeline.

Documentation →


Install

pip install validate-llm

Requires Python 3.11+. Optional extras:

pip install "validate-llm[demo]"   # FastAPI demo server + web UI
pip install "validate-llm[test]"   # pytest + datasets
pip install "validate-llm[dev]"    # everything

Quick start

from llm_validation_framework import ValidationFramework, LLMProvider, Pipe
from llm_validation_framework import ToxicityAgent, PrivacyAgent, AccuracyAgent
from llm_validation_framework.config_loader import load_api_key

llm = LLMProvider(provider="anthropic", model="claude-haiku-4-5-20251001", key=load_api_key())

vf = ValidationFramework(
    llm=llm,
    input_guardrail=Pipe(steps=[ToxicityAgent()], verbose=False),
    output_guardrail=Pipe(steps=[ToxicityAgent(), PrivacyAgent(), AccuracyAgent()], verbose=False),
)

result = vf.validate("What is the Pacific Ocean?")
print(result["status"], result["score"])  # e.g. PASS 0.87 (the score varies by run)

validate() returns a structured dict with status, score, and per-agent results for both the input and output guardrails. See the docs for the full schema.
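
Callers will usually want to branch on that result. The sketch below is a minimal illustration that assumes only the two keys shown in the quick start, status and score, and that "PASS" is the passing status as in the example output. The is_acceptable helper and its min_score threshold are hypothetical and not part of the library; a stand-in dict is used so the snippet runs without an API key.

```python
from typing import Any, Mapping


def is_acceptable(result: Mapping[str, Any], min_score: float = 0.5) -> bool:
    """Return True when the result passed and its score meets a caller-chosen floor.

    Assumes only the "status" and "score" keys from the quick start. The
    threshold is a hypothetical application-level policy, not a library default.
    """
    status = result.get("status")
    score = result.get("score")
    if status != "PASS" or score is None:
        return False
    return float(score) >= min_score


# Stand-in for the dict returned by vf.validate(...); the values are illustrative.
example_result = {"status": "PASS", "score": 0.87}
print(is_acceptable(example_result))                 # True
print(is_acceptable(example_result, min_score=0.9))  # False
```

Keeping the threshold in application code, rather than relying on status alone, lets different call sites apply stricter or looser policies to the same result.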

Agents

Agent | What it does | Needs API key
ToxicityAgent | Three-layer check: profanity filter → toxicity model → semantic similarity | No
PrivacyAgent | Regex scan for SSN, credit cards, API keys; optional system prompt leakage detection | No
AccuracyAgent | LLM-as-a-judge factual accuracy + relevancy, with optional RAG grounding | Yes
RelevancyAgent | LLM-as-a-judge check that the answer addresses the question | Yes
BiasAgent | LLM-as-a-judge scan for stereotypes and discriminatory language | Yes

ToxicityAgent and PrivacyAgent run fully locally with no external calls.
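
To give a feel for what a local regex scan of the kind PrivacyAgent performs might involve, here is a standalone sketch using only the standard library. It is not the library's implementation: the patterns, the PATTERNS table, and the scan_for_pii function are assumptions made for illustration, and real detectors typically need broader patterns and checksum validation.

```python
import re

# Illustrative patterns only; the library's actual patterns may differ.
PATTERNS = {
    # US Social Security number in the common 3-2-4 dashed form.
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # 16-digit card number, optionally grouped by spaces or dashes.
    "credit_card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
    # Hypothetical "sk-" prefixed secret with at least 16 trailing characters.
    "api_key": re.compile(r"\bsk-[A-Za-z0-9_-]{16,}"),
}


def scan_for_pii(text: str) -> dict[str, list[str]]:
    """Return the matches found per category, omitting empty categories."""
    found: dict[str, list[str]] = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[name] = matches
    return found


print(scan_for_pii("SSN 123-45-6789, card 4111 1111 1111 1111"))
print(scan_for_pii("What is the Pacific Ocean?"))  # {}
```

Because everything here is pattern matching on the input string, a check like this needs no network access, which is consistent with the "No" entries in the API key column.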

Config

Set the API key as an environment variable:

export ANTHROPIC_API_KEY=your-key

Or create a config.ini at the repo root (gitignored):

[ANTHROPIC]
API_KEY=your-key

Provider names passed to LLMProvider follow litellm's naming (for example, "anthropic" in the quick start).

RAG grounding

Pass a retriever to AccuracyAgent to ground factual checks against your own corpus:

from llm_validation_framework import AccuracyAgent, RAGProvider

accuracy = AccuracyAgent(rag=RAGProvider(your_vectorstore.as_retriever()))

See the RAG Integration guide for a full walkthrough.

Demo

The demo is a FastAPI backend plus a static web UI. Start each in its own terminal:

# Terminal 1
uvicorn demo.api_server:app --host 127.0.0.1 --port 5050

# Terminal 2
python demo/serve_ui.py

Open http://127.0.0.1:8000.

Contributors

  • Hitha Shri Nagaruru
  • James Wu
  • Lewis Lui
  • Thomas Yeoh

License

MIT — see LICENSE
