llm-response-validator

Validate LLM responses against schemas, types, and constraints. Catch invalid JSON, missing fields, wrong types, and hallucinated formats before they crash your pipeline.

The Pain

You ask GPT-4 for JSON and it wraps it in markdown code blocks. Or returns 4 fields instead of 5. Or puts a string where you need an int. Your downstream code crashes at 3 AM.
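
Two of these failure modes can be reproduced with the standard library alone. The snippet below is only an illustration: `llm_output` is a made-up reply, not output from a real model, and nothing here uses this package.

```python
import json

# Hypothetical model reply: valid JSON, but wrapped in a markdown code fence.
llm_output = '```json\n{"name": "Alice", "age": "30"}\n```'

try:
    json.loads(llm_output)
    parsed_ok = True
except json.JSONDecodeError as exc:
    parsed_ok = False
    print(f"json.loads failed: {exc.msg}")

# Even with the fence lines dropped, the types can still be wrong.
inner = "\n".join(llm_output.splitlines()[1:-1])
data = json.loads(inner)
print(type(data["age"]).__name__)  # str, although an int was wanted
```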

Install

pip install llm-response-validator

Quick Start

from llm_response_validator import validate, extract_json, ensure

# Extract JSON from LLM output (handles markdown blocks, extra text, etc.)
raw = '''Here's the data:
```json
{"name": "Alice", "age": 30, "scores": [95, 87]}
```
'''
data = extract_json(raw)
# {"name": "Alice", "age": 30, "scores": [95, 87]}

# Validate against a schema
schema = {
    "name": {"type": "string", "required": True},
    "age": {"type": "int", "min": 0, "max": 150},
    "email": {"type": "string", "required": True},
    "scores": {"type": "list", "min_length": 1},
}
result = validate(data, schema)
print(result.valid)   # False
print(result.errors)  # ["Missing required field: email"]

# ensure() - validate or raise
data = ensure(raw, schema)  # Extracts JSON + validates, raises on failure
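
Because a validate-or-raise call fails loudly, it fits naturally inside a retry loop. The sketch below is one possible pattern, not part of this package: `retry_until_valid`, `generate`, and `check` are hypothetical names, and `json.loads` stands in for the checker so the example runs without the library. With this package, `check` could be something like `lambda text: ensure(text, schema)`, and the `except` clause would need to catch whatever `ensure` actually raises, which this README does not specify.

```python
import json
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_until_valid(
    generate: Callable[[], str],
    check: Callable[[str], T],
    max_attempts: int = 3,
) -> T:
    """Call generate() until check() accepts its output or attempts run out."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return check(generate())
        except ValueError as exc:  # assumption: the checker raises a ValueError subclass
            last_error = exc
    raise RuntimeError(f"no valid response after {max_attempts} attempts") from last_error


# Stand-in for a model: the first reply is chatter, the second is usable JSON.
replies = iter(["Sure! Here you go:", '{"ok": true}'])
result = retry_until_valid(lambda: next(replies), json.loads)
print(result)  # {'ok': True}
```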


## Schema Definition

```python
schema = {
    "field_name": {
        "type": "string",       # string, int, float, bool, list, dict, any
        "required": True,       # Field must be present (default: False)
        "min": 0,               # Min value for numbers
        "max": 100,             # Max value for numbers
        "min_length": 1,        # Min length for strings/lists
        "max_length": 500,      # Max length for strings/lists
        "pattern": r"^\w+$",    # Regex pattern for strings
        "enum": ["a", "b"],     # Allowed values
        "default": "unknown",   # Default if missing (makes it not required)
        "items": {              # Schema for list items
            "type": "string"
        },
        "properties": {         # Schema for nested dict
            "sub_field": {"type": "int"}
        }
    }
}
```
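
To make the schema keys concrete, here is a rough sketch of how a checker for a flat schema of this shape could work. It is not the library's implementation: `check_fields` and `TYPE_MAP` are hypothetical names, the error wording is modeled on the Quick Start output, and `pattern`, `enum`, `items`, `properties`, and type coercion are left out.

```python
from typing import Any

# Schema type names mapped to Python types (a subset, for illustration).
TYPE_MAP = {
    "string": str,
    "int": int,
    "float": (int, float),
    "bool": bool,
    "list": list,
    "dict": dict,
}


def check_fields(data: dict[str, Any], schema: dict[str, dict[str, Any]]) -> list[str]:
    """Return error strings for a flat schema; an empty list means valid."""
    errors: list[str] = []
    for name, rules in schema.items():
        if name not in data:
            # A default makes the field optional, as the schema comments describe.
            if rules.get("required") and "default" not in rules:
                errors.append(f"Missing required field: {name}")
            continue

        value = data[name]
        type_name = rules.get("type", "any")
        expected = TYPE_MAP.get(type_name)  # None for "any": skip the type check
        is_bool = isinstance(value, bool)
        # bool is a subclass of int, so reject True/False for numeric fields.
        if expected is not None and (
            not isinstance(value, expected) or (is_bool and type_name != "bool")
        ):
            errors.append(f"{name}: expected {type_name}")
            continue

        if isinstance(value, (int, float)) and not is_bool:
            if "min" in rules and value < rules["min"]:
                errors.append(f"{name}: below min {rules['min']}")
            if "max" in rules and value > rules["max"]:
                errors.append(f"{name}: above max {rules['max']}")
        if isinstance(value, (str, list)):
            if "min_length" in rules and len(value) < rules["min_length"]:
                errors.append(f"{name}: shorter than min_length {rules['min_length']}")
            if "max_length" in rules and len(value) > rules["max_length"]:
                errors.append(f"{name}: longer than max_length {rules['max_length']}")
    return errors


schema = {
    "name": {"type": "string", "required": True},
    "age": {"type": "int", "min": 0, "max": 150},
    "email": {"type": "string", "required": True},
    "scores": {"type": "list", "min_length": 1},
}
data = {"name": "Alice", "age": 30, "scores": [95, 87]}
print(check_fields(data, schema))  # ['Missing required field: email']
```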

API

from llm_response_validator import (
    validate,       # Validate dict against schema
    extract_json,   # Extract JSON from messy LLM output
    ensure,         # Extract + validate + raise on error
    repair_json,    # Attempt to fix common JSON errors
    ValidationResult,
    ValidationError,
)

# Extract JSON (handles code blocks, extra text, multiple objects)
data = extract_json(llm_output)              # Returns dict/list or None
data = extract_json(llm_output, default={})  # With default

# Repair common JSON issues
fixed = repair_json('{"name": "test",}')     # Removes trailing comma
fixed = repair_json("{'name': 'test'}")      # Fixes single quotes

# Validate
result = validate(data, schema)
result.valid          # bool
result.errors         # list of error strings
result.warnings       # list of warning strings
result.cleaned_data   # data with defaults applied and types coerced

# Ensure (extract + validate + raise)
data = ensure(llm_output, schema)  # Returns cleaned data or raises
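
For readers curious how extraction from messy output can work, the following is a minimal stdlib-only sketch of one approach: look inside fenced code blocks first, then scan the whole text for the first position where a JSON value decodes. `find_json` and `FENCE_RE` are hypothetical names chosen for this example; how `extract_json` works internally is not documented here and may differ.

```python
import json
import re
from typing import Any

# Contents of ``` or ```json fenced blocks.
FENCE_RE = re.compile(r"```(?:json)?\s*\n(.*?)```", re.DOTALL)


def find_json(text: str, default: Any = None) -> Any:
    """Return the first JSON object or array found in text, else default."""
    decoder = json.JSONDecoder()
    # Prefer fenced block contents, then fall back to the full text.
    candidates = FENCE_RE.findall(text) + [text]
    for candidate in candidates:
        for match in re.finditer(r"[{\[]", candidate):
            try:
                # raw_decode parses one JSON value and ignores trailing text.
                value, _end = decoder.raw_decode(candidate, match.start())
                return value
            except json.JSONDecodeError:
                continue
    return default


fenced = 'Here\'s the data:\n```json\n{"name": "Alice", "age": 30}\n```\n'
print(find_json(fenced))                       # {'name': 'Alice', 'age': 30}
print(find_json("Result: [1, 2, 3] done"))     # [1, 2, 3]
print(find_json("no json here", default={}))   # {}
```

Scanning with `raw_decode` means surrounding chatter does not need to be stripped first; the trade-off is that the first decodable value wins, even if a later one was the intended payload.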

Features

  • JSON extraction — pulls JSON from code blocks, mixed text, multiple formats
  • JSON repair — fixes trailing commas, single quotes, unquoted keys, missing brackets
  • Type validation — string, int, float, bool, list, dict with coercion
  • Nested schemas — validate deeply nested structures
  • Range/length checks — min, max, min_length, max_length
  • Pattern matching — regex validation for string fields
  • Enum validation — restrict to allowed values
  • Defaults — fill missing fields with defaults
  • Zero dependencies — pure Python, stdlib only
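
As an illustration of the repair idea, here is a small stdlib-only sketch that tolerates two of the listed problems, trailing commas and single quotes. It is an assumption about how such a repair could be done, not the code behind `repair_json`; `loads_lenient` is a hypothetical name, and unlike `repair_json` (which the examples above show returning a fixed value named `fixed`) it returns the parsed data directly.

```python
import ast
import json
import re
from typing import Any

# A comma followed only by whitespace and a closing brace or bracket.
TRAILING_COMMA_RE = re.compile(r",\s*([}\]])")


def loads_lenient(text: str) -> Any:
    """Parse JSON, tolerating trailing commas and Python-style single quotes."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Caveat: this regex also rewrites matching sequences inside string values.
    without_commas = TRAILING_COMMA_RE.sub(r"\1", text)
    try:
        return json.loads(without_commas)
    except json.JSONDecodeError:
        pass
    # Single-quoted input is often a valid Python literal. literal_eval accepts
    # literals only and raises ValueError or SyntaxError for anything else.
    return ast.literal_eval(without_commas)


print(loads_lenient('{"name": "test",}'))   # {'name': 'test'}
print(loads_lenient("{'name': 'test'}"))    # {'name': 'test'}
```

The `ast.literal_eval` fallback does not understand JSON's `true`, `false`, or `null`, so single-quoted input containing those still fails in this sketch.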

License

MIT
