# validate-llm

A composable validation framework for LLM inputs and outputs: accuracy, relevancy, toxicity, privacy, and bias guardrails, composed into a single pipeline.
## Install

```bash
pip install validate-llm
```

Requires Python 3.11+. Optional extras:

```bash
pip install "validate-llm[demo]"   # FastAPI demo server + web UI
pip install "validate-llm[test]"   # pytest + datasets
pip install "validate-llm[dev]"    # everything
```
## Quick start

```python
from llm_validation_framework import ValidationFramework, LLMProvider, Pipe
from llm_validation_framework import ToxicityAgent, PrivacyAgent, AccuracyAgent
from llm_validation_framework.config_loader import load_api_key

llm = LLMProvider(provider="anthropic", model="claude-haiku-4-5-20251001", key=load_api_key())

vf = ValidationFramework(
    llm=llm,
    input_guardrail=Pipe(steps=[ToxicityAgent()], verbose=False),
    output_guardrail=Pipe(steps=[ToxicityAgent(), PrivacyAgent(), AccuracyAgent()], verbose=False),
)

result = vf.validate("What is the Pacific Ocean?")
print(result["status"], result["score"])  # e.g. PASS 0.87
```

`validate()` returns a structured dict with `status`, `score`, and per-agent results for both the input and output guardrails. See the docs for the full schema.
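Downstream code typically gates on that dict. The sketch below assumes only the two keys shown above (`status` and `score`); the `"FAIL"` status value and the score threshold are illustrative assumptions, not part of the documented schema:

```python
def should_block(result: dict, threshold: float = 0.5) -> bool:
    """Decide whether to block a response, given a validate() result
    shaped like {"status": ..., "score": ...}. Threshold is illustrative."""
    return result["status"] != "PASS" or result["score"] < threshold

# Gate on a result shaped like the quick-start output:
blocked = should_block({"status": "PASS", "score": 0.87})
print(blocked)  # False: passed validation and above threshold
```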
## Agents

| Agent | What it does | Needs API key |
|---|---|---|
| `ToxicityAgent` | Three-layer check: profanity filter → toxicity model → semantic similarity | No |
| `PrivacyAgent` | Regex scan for SSNs, credit cards, and API keys; optional system-prompt leakage detection | No |
| `AccuracyAgent` | LLM-as-a-judge factual accuracy + relevancy, with optional RAG grounding | Yes |
| `RelevancyAgent` | LLM-as-a-judge check that the answer addresses the question | Yes |
| `BiasAgent` | LLM-as-a-judge scan for stereotypes and discriminatory language | Yes |

`ToxicityAgent` and `PrivacyAgent` run fully locally, with no external calls.
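To illustrate the kind of regex scan `PrivacyAgent` performs, here is a minimal stdlib sketch. The patterns are illustrative stand-ins; the library's actual patterns and match logic may differ:

```python
import re

# Illustrative patterns only -- not the library's actual rules.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                 # 123-45-6789
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),        # 13-16 digits
    "api_key": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),           # sk-... secrets
}

def scan_pii(text: str) -> list[str]:
    """Return the names of the PII categories detected in text."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]

print(scan_pii("My SSN is 123-45-6789"))  # ['ssn']
```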
## Config

```bash
export ANTHROPIC_API_KEY=your-key
```

Or create a `config.ini` at the repo root (gitignored):

```ini
[ANTHROPIC]
API_KEY=your-key
```

Supported provider names follow litellm's naming.
## RAG grounding

Pass a retriever to `AccuracyAgent` to ground factual checks against your own corpus:

```python
from llm_validation_framework import AccuracyAgent, RAGProvider

accuracy = AccuracyAgent(rag=RAGProvider(your_vectorstore.as_retriever()))
```

See the RAG Integration guide for a full walkthrough.
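The exact retriever contract `RAGProvider` expects isn't documented here. Assuming it follows the common LangChain-style `get_relevant_documents(query)` interface implied by `as_retriever()`, a toy keyword retriever for local experimentation might look like this (names and interface are assumptions):

```python
class KeywordRetriever:
    """Toy LangChain-style retriever: returns corpus entries that share a
    word with the query. A stand-in for a real vector store retriever."""

    def __init__(self, corpus: list[str]):
        self.corpus = corpus

    def get_relevant_documents(self, query: str) -> list[str]:
        terms = set(query.lower().split())
        return [doc for doc in self.corpus if terms & set(doc.lower().split())]

retriever = KeywordRetriever([
    "The Pacific Ocean is the largest ocean on Earth.",
    "Mount Everest is the tallest mountain above sea level.",
])
docs = retriever.get_relevant_documents("pacific ocean facts")
print(len(docs))  # 1: only the Pacific Ocean entry overlaps the query
```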
## Demo

The demo is a FastAPI backend plus a static web UI.

```bash
# Terminal 1: API server
uvicorn demo.api_server:app --host 127.0.0.1 --port 5050

# Terminal 2: web UI
python demo/serve_ui.py
```

Then open http://127.0.0.1:8000.
## Contributors
- Hitha Shri Nagaruru
- James Wu
- Lewis Lui
- Thomas Yeoh
## License

MIT. See LICENSE.