LLM JSON Repair

Robust JSON parsing for LLM outputs with automatic repair and field extraction.

LLMs (like Claude, GPT, etc.) often produce malformed JSON due to:

Trailing commas in arrays and objects
Unquoted property names
Truncated output from context length limits
Markdown wrapping (```json blocks)
JavaScript literals (undefined, NaN)

This library handles all of these issues automatically.
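
To see why a repair step is needed at all, the sketch below feeds one sample of each failure class to Python's standard `json.loads`. It uses only the standard library and none of this package's functions; the sample strings and the helper name `stdlib_rejects` are made up for illustration.

```python
import json

# One illustrative sample per failure class listed above (made-up inputs).
SAMPLES = {
    "trailing comma": '{"items": [1, 2, 3,]}',
    "unquoted key": '{foo: "bar"}',
    "truncated output": '{"items": [1, 2',
    "markdown wrapping": '```json\n{"status": "ok"}\n```',
    "JavaScript literal": '{"x": undefined}',
}


def stdlib_rejects(text: str) -> bool:
    """Return True if the standard-library parser refuses the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return True
    return False


for label, sample in SAMPLES.items():
    print(f"{label}: rejected={stdlib_rejects(sample)}")
```

Note that `json.loads` accepts a bare `NaN` by default, so `undefined` is the JavaScript-literal sample used here.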

Installation

pip install llm-json-repair

Or install from source:

pip install -e .

Quick Start

from llm_json_repair import parse_json

# Handles trailing commas
result = parse_json('{"items": [1, 2, 3,]}')
print(result.data)  # {'items': [1, 2, 3]}

# Extracts from markdown code blocks
text = '```json\n{"status": "ok"}\n```'
result = parse_json(text)
print(result.data)  # {'status': 'ok'}

# Reports what was fixed
result = parse_json('{"items": [1, 2, 3,]}')
print(result.was_repaired)      # True
print(result.repair_actions)    # ['removed_trailing_commas']

Features

Automatic Repair

The parse_json() function automatically fixes common issues:

from llm_json_repair import parse_json

# Trailing commas
parse_json('{"a": 1,}').data  # {'a': 1}

# Unquoted keys
parse_json('{foo: "bar"}').data  # {'foo': 'bar'}

# Missing closing brackets
parse_json('{"items": [1, 2').data  # {'items': [1, 2]}

# JavaScript undefined/NaN
parse_json('{"x": undefined}').data  # {'x': None}
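
As an illustration of how the missing-bracket case can be handled, here is a minimal sketch that tracks open brackets outside of strings and appends the missing closers. `close_open_brackets` is a hypothetical helper written for this example, not a function of this library, and the library's actual repair logic may differ.

```python
import json


def close_open_brackets(text: str) -> str:
    """Append the closers that a truncated JSON text is missing.

    Hypothetical helper: brackets inside string literals are ignored,
    and an unterminated string is closed before the brackets.
    """
    stack = []          # closers still owed, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    if in_string:
        if escaped:
            text = text[:-1]  # drop a dangling backslash
        text += '"'
    return text + "".join(reversed(stack))


print(json.loads(close_open_brackets('{"items": [1, 2')))
```

This sketch does not cover every truncation point: input cut off right after a colon or comma still fails to parse after closing, so a fuller repair would also need to trim the incomplete tail.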

Field Extraction for Truncated Responses

When JSON is too broken to parse, extract specific fields:

from llm_json_repair import FieldExtractor, extract_field

# LLM response was truncated mid-JSON
malformed = '''{"facts": ["fact1", "fact2"],
                "confidence": 0.8,
                "reasoning": "Based on the ana'''

# Extract what we can
extractor = FieldExtractor()
extractor.add_string_array("facts")
extractor.add_number("confidence")

result = extractor.extract(malformed)
print(result["facts"])       # ['fact1', 'fact2']
print(result["confidence"])  # 0.8

# Or use the convenience function
facts = extract_field(malformed, "facts", "string_array")
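
One way such extraction could work is to search for each field by pattern instead of parsing the whole document. The sketch below is an assumption about the general technique, not this library's code; `sketch_extract_number` and `sketch_extract_string_array` are hypothetical names.

```python
import json
import re

# Matches one JSON string literal plus an optional trailing comma.
_STRING_ITEM = re.compile(r'\s*"((?:[^"\\]|\\.)*)"\s*(,?)')


def sketch_extract_number(text, field):
    """Find `"field": <number>` anywhere in the text; None if absent.

    Always returns a float, even for integer-looking values.
    """
    pattern = rf'"{re.escape(field)}"\s*:\s*(-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)'
    match = re.search(pattern, text)
    return float(match.group(1)) if match else None


def sketch_extract_string_array(text, field):
    """Collect the complete string items of `"field": [...]`, even if the
    array or the surrounding object is cut off."""
    start = re.search(rf'"{re.escape(field)}"\s*:\s*\[', text)
    if start is None:
        return None
    items, pos = [], start.end()
    while True:
        item = _STRING_ITEM.match(text, pos)
        if item is None:
            break
        try:
            # Decode escapes such as \n or \u00e9 via the JSON parser.
            items.append(json.loads(f'"{item.group(1)}"', strict=False))
        except json.JSONDecodeError:
            items.append(item.group(1))  # keep raw text if escapes are invalid
        pos = item.end()
        if not item.group(2):  # no comma, so the array ended or was cut off
            break
    return items


truncated = '''{"facts": ["fact1", "fact2"],
                "confidence": 0.8,
                "reasoning": "Based on the ana'''
print(sketch_extract_string_array(truncated, "facts"))  # ['fact1', 'fact2']
print(sketch_extract_number(truncated, "confidence"))   # 0.8
```

Because the search is by field name only, a same-named key nested deeper in the document would also match; a production extractor would need to account for nesting.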

Strict Mode

Raise an exception instead of returning a result whose data is None for unparseable input:

from llm_json_repair import parse_json, ParseError

try:
    result = parse_json("not json", strict=True)
except ParseError as e:
    print(f"Failed: {e}")
    print(f"Tried: {e.attempts}")
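
The strict/non-strict contract can be illustrated with a small self-contained sketch. Everything here is hypothetical: `SketchParseError`, `sketch_parse`, and the strategy names are stand-ins invented for this example, and for brevity the non-strict path returns plain `None` instead of a result object.

```python
import json


class SketchParseError(ValueError):
    """Hypothetical stand-in for an error that records what was tried."""

    def __init__(self, message, attempts):
        super().__init__(message)
        self.attempts = attempts


def sketch_parse(text, *, strict=False):
    """Try a few parsing strategies; raise or return None on failure."""
    attempts = []
    candidates = [("direct", text)]
    # Second strategy: keep only the span from the first "{" to the last "}".
    first, last = text.find("{"), text.rfind("}")
    if 0 <= first < last:
        candidates.append(("outermost_braces", text[first:last + 1]))
    for name, candidate in candidates:
        attempts.append(name)
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    if strict:
        raise SketchParseError("could not parse input as JSON", attempts)
    return None


print(sketch_parse('Answer: {"a": 1} done'))  # {'a': 1}
try:
    sketch_parse("not json", strict=True)
except SketchParseError as exc:
    print(exc.attempts)  # ['direct']
```

Recording the attempted strategies on the exception makes failures easier to debug, since the caller can see which fallbacks ran before giving up.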

API Reference

`parse_json(text, *, strict=False, extract_from_text=True)`

Main entry point for parsing JSON from LLM output.

Parameters:

text: The text containing JSON to parse
strict: If True, raise ParseError on failure instead of returning a ParseResult whose data is None
extract_from_text: If True, try to extract JSON from markdown/prose

Returns: ParseResult with:

data: The parsed JSON data (or None if parsing failed)
was_repaired: Whether repairs were needed
repair_actions: List of repairs applied
original_text: The original input
repaired_text: The text after repairs
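
To make the shape of the result concrete, the sketch below models the five documented fields as a dataclass and shows one way calling code might branch on them. `SketchParseResult` and `describe` are hypothetical stand-ins; the real ParseResult may be defined differently.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchParseResult:
    """Hypothetical mirror of the documented result fields."""

    data: Any = None
    was_repaired: bool = False
    repair_actions: list[str] = field(default_factory=list)
    original_text: str = ""
    repaired_text: str = ""


def describe(result: SketchParseResult) -> str:
    """Summarize a result the way calling code might branch on it."""
    if result.data is None:
        return "parse failed"
    if result.was_repaired:
        return "parsed after: " + ", ".join(result.repair_actions)
    return "parsed cleanly"


print(describe(SketchParseResult(data={"a": 1})))
```

In this sketch a document consisting of a literal `null` would be indistinguishable from a failed parse; callers that need to tell the two apart may prefer strict mode.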

`repair_json(text)`

Low-level function to apply repairs without parsing.

Returns: Tuple of (repaired_text, list_of_repairs)
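
A minimal sketch of a function with the same return shape, assuming simple regex-based repairs, might look like this. `sketch_repair` and the action name `quoted_keys` are hypothetical; only `removed_trailing_commas` appears in the Quick Start output.

```python
import json
import re


def sketch_repair(text):
    """Apply two regex repairs and report which ones changed the text.

    Returns a tuple of (repaired_text, list_of_repairs).
    """
    repairs = []

    # Drop a comma that directly precedes a closing bracket or brace.
    fixed = re.sub(r",(\s*[}\]])", r"\1", text)
    if fixed != text:
        repairs.append("removed_trailing_commas")
        text = fixed

    # Quote bare identifiers used as keys after "{" or ",".
    fixed = re.sub(r'([{,]\s*)([A-Za-z_][A-Za-z0-9_]*)(\s*:)', r'\1"\2"\3', text)
    if fixed != text:
        repairs.append("quoted_keys")  # hypothetical action name
        text = fixed

    return text, repairs


repaired, actions = sketch_repair('{foo: [1, 2,], bar: 3,}')
print(repaired)  # {"foo": [1, 2], "bar": 3}
print(actions)   # ['removed_trailing_commas', 'quoted_keys']
print(json.loads(repaired))
```

These regexes are not string-aware, so text inside a string value that happens to look like a trailing comma or a bare key would also be rewritten. A robust implementation would tokenize the input first.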

`extract_json_from_text(text)`

Extract JSON from text that may contain markdown or prose.

Returns: The extracted JSON string, or None
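
The extraction step could be sketched as follows, assuming a fenced block takes priority and a brace or bracket search is the fallback. `sketch_extract_json` is a hypothetical name, and the library's own heuristics may differ.

```python
import re

# A fenced block, optionally tagged "json", with its body captured.
_FENCE = re.compile(r"```(?:json)?\s*\n(.*?)```", re.DOTALL | re.IGNORECASE)


def sketch_extract_json(text):
    """Return the JSON-looking part of the text, or None if there is none."""
    fenced = _FENCE.search(text)
    if fenced:
        return fenced.group(1).strip()

    # Fallback: span from the first opening bracket to the last matching closer.
    starts = [i for i in (text.find("{"), text.find("[")) if i != -1]
    if not starts:
        return None
    start = min(starts)
    closer = "}" if text[start] == "{" else "]"
    end = text.rfind(closer)
    if end <= start:
        return None
    return text[start:end + 1]


print(sketch_extract_json('```json\n{"status": "ok"}\n```'))  # {"status": "ok"}
print(sketch_extract_json('Sure! Here it is: {"a": [1, 2]} Hope that helps.'))
```

The fallback returns None when no closing bracket is found, so truncated output would need a separate repair pass before extraction in this sketch.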

`FieldExtractor`

Builder for extracting specific fields from malformed JSON.

extractor = FieldExtractor()
extractor.add_string("name")
extractor.add_number("count")
extractor.add_boolean("active")
extractor.add_string_array("tags")
extractor.add_object_array("items")
extractor.add_object("metadata")

result = extractor.extract(text)

Convenience Functions

extract_field(text, field_name, field_type="auto")
extract_array(text, field_name)
extract_object(text, field_name)

Testing

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run with coverage
pytest --cov=llm_json_repair

License

MIT

Project details

Version 1.0.0 (Dec 29, 2025)

