Validate, repair, and retry LLM structured outputs

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

ncorder

These details have not been verified by PyPI

Project description

outputguard

Stop wrestling with broken LLM JSON. Validate, repair, and retry — automatically.

The Problem

LLMs produce broken JSON constantly. They wrap it in markdown fences, leave trailing commas, use Python True/False instead of true/false, sprinkle in NaN, truncate mid-object when they hit token limits, and helpfully add commentary around the JSON you asked for. Every AI application ends up writing the same brittle json.loads() + try/except + regex gauntlet.

The Solution

import outputguard

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name", "age"]
}

# Typical LLM output — fenced, trailing comma, single quotes
llm_output = '''```json
{'name': 'Alice', 'age': 30,}
```'''

result = outputguard.validate_and_repair(llm_output, schema)
print(result.valid)              # True
print(result.data)               # {'name': 'Alice', 'age': 30}
print(result.strategies_applied) # ['strip_fences', 'fix_quotes', 'fix_commas']

Fourteen repair strategies, JSON Schema validation, retry prompt generation, and a CLI — in one tiny package with three dependencies.

Installation

pip install outputguard

Or with uv:

uv add outputguard

Quick Start

Validate & Repair

The most common pattern — validate against a schema, auto-repair if broken, get clean data back:

import outputguard

result = outputguard.validate_and_repair(llm_output, schema)

if result.valid:
    process(result.data)                  # Clean, validated dict
    if result.repaired:
        log(result.strategies_applied)    # What was fixed
else:
    handle_errors(result.errors)          # Detailed error paths

Repair Only

When you just need parseable JSON and don't have a schema:

result = outputguard.repair(broken_json)
print(result.text)                # Clean JSON string
print(result.strategies_applied)  # ['fix_booleans', 'fix_commas']

Validate Only

Check JSON against a schema without attempting repair:

result = outputguard.validate(llm_output, schema)
for error in result.errors:
    print(f"{error.path}: {error.message}")
    # $.age: 'thirty' is not of type 'integer'

Retry Loop

When repair is not enough, generate a correction prompt and send it back to the LLM:

import outputguard

def get_structured_output(llm, prompt, schema, max_retries=3):
    for attempt in range(max_retries + 1):
        raw = llm.generate(prompt)
        result = outputguard.validate_and_repair(raw, schema)

        if result.valid:
            return result.data

        # Generate a targeted correction prompt
        prompt = outputguard.retry_prompt(raw, schema, result.errors)

    raise RuntimeError("Failed to get valid output")

The retry prompt tells the LLM exactly what went wrong — which fields are missing, which types are incorrect, and what the schema expects. Works with any LLM provider.

CLI

# Validate JSON against a schema
outputguard validate output.json -s schema.json

# Validate with auto-repair
outputguard validate output.json -s schema.json --repair

# Repair only (no schema)
outputguard repair output.json

# Pipe from stdin
echo '{name: "Alice", age: 30,}' | outputguard repair -

# Generate a retry prompt
outputguard retry-prompt output.json -s schema.json

# List all repair strategies
outputguard strategies

What It Fixes

Fourteen strategies, applied in order. Each one targets a specific class of LLM JSON malformation:

#	Strategy	Before	After
1	`strip_fences`	```json\n{"a": 1}\n```	`{"a": 1}`
2	`extract_json`	`Sure! Here's the JSON: {"a": 1} Let me know!`	`{"a": 1}`
3	`remove_comments`	`{"a": 1} // a comment`	`{"a": 1}`
4	`fix_commas`	`{"a": 1, "b": 2,}`	`{"a": 1, "b": 2}`
5	`fix_quotes`	`{'a': 'hello'}`	`{"a": "hello"}`
6	`fix_keys`	`{a: 1, b: 2}`	`{"a": 1, "b": 2}`
7	`fix_values`	`{"a": NaN, "b": Infinity}`	`{"a": null, "b": null}`
8	`fix_booleans`	`{"a": True, "b": None}`	`{"a": true, "b": null}`
9	`fix_truncated`	`{"a": 1, "b": "hel`	`{"a": 1, "b": "hel"}`
10	`fix_ellipsis`	`{"items": [1, 2, ...]}`	`{"items": [1, 2]}`
11	`fix_unicode`	`{"a": "\u00"}`	`{"a": "�"}`
12	`fix_inner_quotes`	`{"a": " "hello" "}`	`{"a": " \"hello\" "}`
13	`fix_closers`	`{"a": [1, 2, 3`	`{"a": [1, 2, 3]}`
14	`fix_newlines`	`{"a": "line1↵line2"}`	`{"a": "line1\nline2"}`

Tested Against 290 Real LLM Models

We tested outputguard against every text-generation model on OpenRouter — 290 models across 40+ providers.

Result: 99% success rate (286/290 models handled correctly)

	Count
Models tested	290
Valid immediately	225 (78%)
Repaired by outputguard	61 (21%)
Failed	4 (1%)

The 61 repaired outputs were fixed automatically — 55 needed strip_fences, 4 needed extract_json, 2 needed fix_truncated. The 4 failures were models that returned corrupted/garbled output (tokenizer issues, not JSON problems).

Highlighted model results (click to expand)

Model	Provider	Result	Fix Applied
GPT-5 Mini	OpenAI	✅ Clean	—
GPT-5 Pro	OpenAI	✅ Clean	—
GPT-4.1 Mini	OpenAI	✅ Clean	—
Claude Sonnet 4.6	Anthropic	✅ Clean	—
Claude Opus 4.7	Anthropic	✅ Clean	—
Claude Haiku 4.5	Anthropic	🛠️ Repaired	`strip_fences`
Gemini 2.5 Flash	Google	✅ Clean	—
Gemini 2.5 Pro	Google	🛠️ Repaired	`strip_fences`
Gemini 3.1 Flash Lite	Google	✅ Clean	—
Grok 4.1 Fast	xAI	✅ Clean	—
Grok 4.3	xAI	✅ Clean	—
Mistral Medium 3.5	Mistral	✅ Clean	—
Mistral Large	Mistral	✅ Clean	—
DeepSeek v4 Pro	DeepSeek	✅ Clean	—
DeepSeek v3.2	DeepSeek	🛠️ Repaired	`strip_fences`
Llama 4 Maverick	Meta	✅ Clean	—
Llama 4 Scout	Meta	🛠️ Repaired	`strip_fences`
Qwen 3.6 Flash	Alibaba	✅ Clean	—
Qwen 3 Max	Alibaba	✅ Clean	—
Kimi K2.6	Moonshot	✅ Clean	—
GLM 5.1	Zhipu	✅ Clean	—
Command A	Cohere	✅ Clean	—
Phi-4	Microsoft	🛠️ Repaired	`strip_fences`
Nova Premier	Amazon	🛠️ Repaired	`strip_fences`
Seed 1.6	ByteDance	✅ Clean	—
Mercury 2	Inception	✅ Clean	—

All 290 raw model outputs are committed as test fixtures. Run python -m tests.real_model_runner sweep to re-test against every model yourself.

Test Suite

1,347 tests across 7 testing dimensions:

Category	Tests	What it covers
Strategy exhaustive	159	Every strategy pushed to edge cases
Adversarial & fuzzing	286	141 chaotic inputs, concurrency, performance
API contracts	145	`parse()`, exceptions, reports, CLI, registry
LLM corpus	119	Real failure patterns from 7 model families
Combinations	115	Multi-strategy interactions, ordering, idempotency
Real model fixtures	580	Actual outputs from 290 LLM models
Core & integration	414	Strategies, validator, repairer, guard, stress

uv run pytest tests/ -q
# 1,881 passed in 1.52s

Configuration

Use the OutputGuard class for fine-grained control over which strategies run:

from outputguard import OutputGuard

# Strict mode — only fix formatting, not content
strict = OutputGuard(
    strategies=["strip_fences", "fix_commas"],
    max_repair_attempts=1,
)
result = strict.validate_and_repair(text, schema)

# Aggressive mode — all strategies, more attempts
aggressive = OutputGuard(
    strategies=None,          # All 13 strategies (default)
    max_repair_attempts=5,
)
result = aggressive.validate_and_repair(text, schema)

RepairReport

For debugging and observability, RepairReport gives you a full breakdown of what happened:

from outputguard.report import RepairReport

report = RepairReport(
    original_text=original,
    final_text=repaired,
    success=True,
    steps=steps,
)

print(report.summary)
# Repaired using 2 strategy(ies): strip_fences, fix_commas

print(report.confidence)   # 0.8 — fewer strategies = higher confidence
print(report.diff)         # Unified diff from original to repaired
print(report.step_diffs()) # Per-strategy diffs for verbose logging

Confidence scoring is a heuristic from 0.0 to 1.0. It decreases as more strategies are needed and as the text changes more. Useful for deciding whether to trust a repair or escalate to a retry.

API Reference

Module-level Functions

Function	Returns	Description
`validate(text, schema)`	`ValidationResult`	Validate JSON against a schema
`repair(text)`	`RepairResult`	Auto-repair malformed JSON
`validate_and_repair(text, schema)`	`ValidationResult`	Validate, repair if needed, re-validate
`retry_prompt(text, schema, errors)`	`str`	Generate a correction prompt for the LLM

Classes

Class	Description
`OutputGuard`	Configurable pipeline with strategy selection and retry limits
`ValidationResult`	Result with `valid`, `data`, `errors`, `repaired`, `strategies_applied`
`RepairResult`	Result with `repaired`, `text`, `strategies_applied`, `parse_error`
`ValidationError`	Error detail with `message`, `path`, `schema_path`, `value`
`RepairReport`	Detailed report with `diff`, `confidence`, `summary`, `step_diffs()`

Exceptions

Exception	Description
`OutputGuardError`	Base exception
`ParseError`	JSON could not be parsed even after repair
`SchemaValidationError`	JSON parsed but does not match the schema
`RepairError`	Repair was attempted but failed
`StrategyError`	A specific repair strategy encountered an error

CLI Reference

outputguard [COMMAND] [OPTIONS]

Command	Description
`validate INPUT -s SCHEMA`	Validate JSON against a schema
`validate INPUT -s SCHEMA --repair`	Validate with auto-repair
`repair INPUT`	Repair malformed JSON
`repair INPUT --strategies strip_fences,fix_commas`	Repair with specific strategies
`retry-prompt INPUT -s SCHEMA`	Generate a correction prompt
`strategies`	List all available strategies

All commands accept -f json for machine-readable output, -o FILE to write to a file, and - as INPUT to read from stdin.

Why outputguard?

	`json.loads()` + regex	outputguard
Repair strategies	Roll your own	14, tested and ordered
Schema validation	Separate library	Built in (jsonschema)
Retry prompts	Write your own	One function call
Confidence scoring	No	Yes
Truncated JSON	Breaks	Recovers
Tests	Probably zero	1,347 (including real LLM outputs)
LLM dependencies	—	None (works with any provider)
Footprint	—	3 deps: click, jsonschema, rich

outputguard has no opinion about which LLM you use. It operates on strings and schemas — plug it into OpenAI, Anthropic, local models, or anything else.

Examples

See the examples/ directory for complete, runnable scripts:

basic_usage.py — Core validate/repair workflow
retry_loop.py — Retry pattern with correction prompts
custom_pipeline.py — Custom strategy configuration
batch_processing.py — Process multiple outputs with statistics

Contributing

Contributions are welcome. Please open an issue first to discuss what you'd like to change.

git clone https://github.com/ndcorder/outputguard.git
cd outputguard
uv sync --dev
uv run pytest tests/ -v

TypeScript / JavaScript

Looking for a JS/TS version? See outputguard-js — same 13 strategies, same API shape, TypeScript-native.

License

MIT

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

ncorder

These details have not been verified by PyPI

Release history Release notifications | RSS feed

2.1.1

May 14, 2026

2.1.0

May 11, 2026

2.0.0

May 10, 2026

1.0.0

May 10, 2026

This version

0.2.0

May 9, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

outputguard-0.2.0.tar.gz (156.5 kB view details)

Uploaded May 9, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

outputguard-0.2.0-py3-none-any.whl (28.7 kB view details)

Uploaded May 9, 2026 Python 3

File details

Details for the file outputguard-0.2.0.tar.gz.

File metadata

Download URL: outputguard-0.2.0.tar.gz
Upload date: May 9, 2026
Size: 156.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for outputguard-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`f791c2e70d94fa9f1a4b1fab3db4c34a9dbbce6f2b3a3c503881bf6a19d21fbd`
MD5	`1e51468a77abb044d36117dac572f18f`
BLAKE2b-256	`13e74c9b307efd766a97b31ba405c323d84fcd14f7b7eb59e7e43195cd7367aa`

See more details on using hashes here.

Provenance

The following attestation bundles were made for outputguard-0.2.0.tar.gz:

Publisher: ci.yml on ndcorder/outputguard

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: outputguard-0.2.0.tar.gz
- Subject digest: f791c2e70d94fa9f1a4b1fab3db4c34a9dbbce6f2b3a3c503881bf6a19d21fbd
- Sigstore transparency entry: 1487496198
- Sigstore integration time: May 9, 2026
Source repository:
- Permalink: ndcorder/outputguard@5636bfe94831f3c708d13d591f42278593646231
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/ndcorder
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: ci.yml@5636bfe94831f3c708d13d591f42278593646231
- Trigger Event: push

File details

Details for the file outputguard-0.2.0-py3-none-any.whl.

File metadata

Download URL: outputguard-0.2.0-py3-none-any.whl
Upload date: May 9, 2026
Size: 28.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for outputguard-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0e40239973039eb8b76af79753bcf4ec090c47596f900c56ab4170d46b94b8b4`
MD5	`28c2923fc15e2aabab54586f3c1fa9f6`
BLAKE2b-256	`6abdee340f78b59280be28bb426de10a8e19a80a824641615fcb5683f7298b23`

See more details on using hashes here.

Provenance

The following attestation bundles were made for outputguard-0.2.0-py3-none-any.whl:

Publisher: ci.yml on ndcorder/outputguard

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: outputguard-0.2.0-py3-none-any.whl
- Subject digest: 0e40239973039eb8b76af79753bcf4ec090c47596f900c56ab4170d46b94b8b4
- Sigstore transparency entry: 1487496224
- Sigstore integration time: May 9, 2026
Source repository:
- Permalink: ndcorder/outputguard@5636bfe94831f3c708d13d591f42278593646231
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/ndcorder
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: ci.yml@5636bfe94831f3c708d13d591f42278593646231
- Trigger Event: push

outputguard 0.2.0

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

outputguard

The Problem

The Solution

Installation

Quick Start

Validate & Repair

Repair Only

Validate Only

Retry Loop

CLI

What It Fixes

Tested Against 290 Real LLM Models

Test Suite

Configuration

RepairReport

API Reference

Module-level Functions

Classes

Exceptions

CLI Reference

Why outputguard?

Examples

Contributing

TypeScript / JavaScript

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance