
High-performance JSON repair library using Rust


fast_json_repair

PyPI version Python 3.11-3.14 License: MIT

A high-performance JSON repair library for Python, powered by Rust. This is a drop-in replacement for json_repair with significant performance improvements.

๐Ÿ™ Attribution

This library is a Rust port of the excellent json_repair library created by Stefano Baccianella. The original Python implementation is a brilliant solution for fixing malformed JSON from Large Language Models (LLMs), and this port aims to bring the same functionality with improved performance.

All credit for the original concept, logic, and implementation goes to Stefano Baccianella. This Rust port maintains API compatibility with the original library while leveraging Rust's performance benefits.

If you find this library useful, please also consider starring the original json_repair repository.

Features

  • 📦 Available on PyPI: pip install fast-json-repair
  • 🚀 Rust Performance: Core repair logic implemented in Rust for maximum speed
  • 🔧 Automatic Repair: Fixes common JSON errors automatically
  • 🐍 Python Compatible: Works with Python 3.11-3.14
  • 🔄 Drop-in Replacement: Compatible API with the original json_repair library
  • ⚡ Fast JSON Parsing: Uses orjson for JSON parsing operations

Compatibility with Original json_repair

This library exposes the same core API as the original json_repair library; a few auxiliary features are not ported:

✅ Included:

  • repair_json() - Main repair function with return_objects, skip_json_loads, ensure_ascii, indent parameters
  • loads() - Convenience function for loading broken JSON directly to Python objects
  • All repair capabilities: quotes, literals, commas, brackets, escape sequences, Unicode

โŒ Not Included:

  • File operations (load(), from_file()) - Use Python's built-in file handling + repair_json()
  • CLI tool - Library-only implementation
  • Streaming support - Not yet implemented
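Since load() and from_file() are not ported, reading from disk is left to the caller. The helper below is a minimal sketch of that pattern; load_json_file is a hypothetical name, not part of the library, and it defaults to the standard-library parser so that it runs even where the package is not installed.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, Union


def load_json_file(
    path: Union[str, Path],
    parse: Callable[[str], Any] = json.loads,
    encoding: str = "utf-8",
) -> Any:
    """Read a text file and pass its contents to `parse`.

    `parse` defaults to the stdlib parser so this sketch runs without the
    package; pass `fast_json_repair.loads` to repair malformed files.
    """
    return parse(Path(path).read_text(encoding=encoding))


# Demo with a temporary file and the stdlib parser.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.json"
    sample.write_text('{"name": "John", "age": 30}', encoding="utf-8")
    data = load_json_file(sample)
    # With the package installed, the call site would look like:
    #   from fast_json_repair import loads
    #   data = load_json_file(sample, parse=loads)

print(data)  # {'name': 'John', 'age': 30}
```

Passing the parser as an argument keeps the file handling independent of which repair library is installed.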

Key Differences:

  • 🚀 About 20x faster on average, and up to 110x faster for large objects with long strings
  • 🔢 Unquoted numbers are parsed as numbers (not strings)
  • 📦 Uses orjson for high-performance JSON operations

Installation

Quick Install

pip install fast-json-repair

Build from Source


Prerequisites

  • Python 3.11-3.14
  • Rust toolchain (curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh)
  • uv (recommended) or pip

Quick Start with uv (Recommended)

# Clone the repository
git clone https://github.com/dvideby0/fast_json_repair.git
cd fast_json_repair

# Run the automated setup script
./setup.sh

The setup script will:

  • ✅ Install uv and Rust if needed
  • ✅ Create a virtual environment (.venv)
  • ✅ Install all dependencies
  • ✅ Build the Rust extension
  • ✅ Verify the installation

Manual Build Steps

# Clone the repository
git clone https://github.com/dvideby0/fast_json_repair.git
cd fast_json_repair

# Option 1: Using uv (fast!)
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv sync
maturin develop --release

# Option 2: Using pip
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install maturin orjson
maturin develop --release
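After a manual build, one quick way to confirm that the extension landed in the active environment is to ask Python whether the modules can be found. This is a sketch only; it is not what setup.sh runs, and is_importable is a hypothetical helper name.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# Check the package itself and the orjson dependency it relies on.
for name in ("fast_json_repair", "orjson"):
    status = "found" if is_importable(name) else "missing"
    print(f"{name}: {status}")
```

If either module is reported as missing, the virtual environment is probably not activated or the maturin build did not complete.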

Usage

from fast_json_repair import repair_json, loads

# Fix broken JSON
broken = "{'name': 'John', 'age': 30}"  # Single quotes
fixed = repair_json(broken)
print(fixed)  # {"age":30,"name":"John"}

# Parse directly to Python object
data = loads("{'key': 'value'}")
print(data)  # {'key': 'value'}

# Handle Unicode properly
text = "{'message': '你好世界'}"
result = repair_json(text, ensure_ascii=False)
print(result)  # {"message":"你好世界"}

# Format with indentation
formatted = repair_json("{'a': 1}", indent=2)

What It Repairs

Automatically fixes common JSON formatting issues:

| Issue | Fix |
|---|---|
| Single quotes | Double quotes |
| Unquoted keys | Quoted keys |
| Python literals (True/False/None) | JSON (true/false/null) |
| Trailing commas | Removed |
| Missing commas | Added |
| Extra commas | Removed |
| Unclosed brackets/braces | Auto-closed |
| Invalid escape sequences | Fixed |
| Unicode characters | Preserved or escaped (configurable) |
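To give a feel for what one of these repairs involves, here is a toy pure-Python sketch of auto-closing unclosed brackets and braces with a stack. It is an illustration only and not the library's Rust algorithm; close_unclosed is a hypothetical name, and the sketch assumes every double-quoted string in the input is already terminated.

```python
import json


def close_unclosed(text: str) -> str:
    """Append the closers for any '[' or '{' left open in `text`.

    Toy illustration: tracks double-quoted strings and backslash escapes so
    brackets inside strings are ignored. Assumes strings are terminated and
    handles none of the other issues in the table.
    """
    closers = {"{": "}", "[": "]"}
    stack = []          # closers still owed, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False       # this character was escaped, skip it
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in closers:
            stack.append(closers[ch])
        elif stack and ch == stack[-1]:
            stack.pop()               # matched the innermost open bracket
    return text + "".join(reversed(stack))


repaired = close_unclosed('{"a": [1, 2')
print(repaired)              # {"a": [1, 2]}
print(json.loads(repaired))  # {'a': [1, 2]}
```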

API Reference

repair_json(json_string, **kwargs)

Repairs invalid JSON and returns a valid JSON string, or a parsed Python object when return_objects=True.

Parameters:

  • json_string (str): The potentially invalid JSON string to repair
  • return_objects (bool): If True, return parsed Python object instead of JSON string
  • skip_json_loads (bool): If True, skip initial validation for better performance
  • ensure_ascii (bool): If True, escape non-ASCII characters in output
  • indent (int): Number of spaces for indentation (None for compact output)

Returns:

  • str or object: Repaired JSON string or parsed Python object
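The effect of ensure_ascii and indent can be previewed with the standard library alone. The sketch below assumes the two options behave like the same-named json.dumps arguments, which the Fallback Path description in the Performance section suggests; the library's compact output omits the spaces after separators that json.dumps adds, so whitespace may differ.

```python
import json

payload = {"message": "你好"}

# ensure_ascii=True escapes every non-ASCII character as \uXXXX.
escaped = json.dumps(payload, ensure_ascii=True)
# ensure_ascii=False keeps the characters as they are.
literal = json.dumps(payload, ensure_ascii=False)
# indent=2 pretty-prints with two spaces per nesting level.
pretty = json.dumps(payload, ensure_ascii=False, indent=2)

print(escaped)  # {"message": "\u4f60\u597d"}
print(literal)  # {"message": "你好"}
print(pretty)
```

Both forms decode to the same object, so the choice only matters for consumers that cannot handle non-ASCII bytes.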

loads(json_string, **kwargs)

Repairs an invalid JSON string and parses it into a Python object.

Parameters:

  • json_string (str): The potentially invalid JSON string to repair and parse
  • **kwargs: Additional arguments passed to repair_json

Returns:

  • object: The parsed Python object

Performance

This Rust-based implementation provides significant performance improvements over the pure Python original.

Fast Path Optimization

The library automatically uses the fastest path when possible:

Fast Path (uses orjson for serialization):

  • Valid JSON input
  • ensure_ascii=False
  • indent is either None (compact) or 2

Fallback Path (uses stdlib json):

  • Valid JSON input with ensure_ascii=True
  • Valid JSON input with indent values other than None or 2

Repair Path (uses Rust implementation):

  • Any invalid JSON that needs repair
  • Always respects ensure_ascii and indent settings

For maximum performance with valid JSON:

# Fastest - uses orjson throughout
result = repair_json(valid_json, ensure_ascii=False, indent=2)

# Slower - falls back to json.dumps for formatting
result = repair_json(valid_json, ensure_ascii=True)  # ASCII escaping
result = repair_json(valid_json, indent=4)  # Custom indentation
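The three paths described above can be summarised as a small decision function. This is a sketch of the documented rules, not the library's source: choose_path is a hypothetical name, and the validity check uses the standard-library json.loads as a stand-in for whatever parser the library actually runs first.

```python
import json
from typing import Optional


def choose_path(text: str, ensure_ascii: bool, indent: Optional[int]) -> str:
    """Return which documented path would handle this call."""
    try:
        json.loads(text)  # stand-in validity check (assumption)
    except ValueError:
        # Invalid JSON always goes through the Rust repair logic.
        return "repair"
    if not ensure_ascii and indent in (None, 2):
        # Valid JSON, no ASCII escaping, compact or 2-space output.
        return "fast"
    # Valid JSON with ASCII escaping or a custom indent width.
    return "fallback"


print(choose_path('{"a": 1}', ensure_ascii=False, indent=2))     # fast
print(choose_path('{"a": 1}', ensure_ascii=True, indent=None))   # fallback
print(choose_path("{'a': 1}", ensure_ascii=False, indent=None))  # repair
```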

Benchmark Results

Comprehensive comparison of fast_json_repair vs json_repair across 20 test cases (10 invalid JSON, 10 valid JSON) with both ensure_ascii settings:

Invalid JSON (needs repair)

| Test Case | fast_json_repair (ms) | json_repair (ms) | Speedup |
|---|---|---|---|
| Simple quotes (ascii=T) | 0.007 | 0.032 | 🚀 4.7x |
| Simple quotes (ascii=F) | 0.006 | 0.037 | 🚀 5.7x |
| Medium nested (ascii=T) | 0.020 | 0.192 | 🚀 9.6x |
| Medium nested (ascii=F) | 0.019 | 0.197 | 🚀 10.5x |
| Large array 1000 (ascii=T) | 0.246 | 2.273 | 🚀 9.3x |
| Large array 1000 (ascii=F) | 0.237 | 2.162 | 🚀 9.1x |
| Deep nesting 50 (ascii=T) | 0.055 | 0.410 | 🚀 7.5x |
| Deep nesting 50 (ascii=F) | 0.050 | 0.420 | 🚀 8.4x |
| Large object 500 (ascii=T) | 0.404 | 27.339 | 🚀 67.7x |
| Large object 500 (ascii=F) | 0.408 | 26.436 | 🚀 64.8x |
| Complex mixed (ascii=T) | 0.033 | 0.408 | 🚀 12.2x |
| Complex mixed (ascii=F) | 0.035 | 0.401 | 🚀 11.4x |
| Very large 5000 (ascii=T) | 29.531 | 580.959 | 🚀 19.7x |
| Very large 5000 (ascii=F) | 28.526 | 581.489 | 🚀 20.4x |
| Long strings 10K (ascii=T) | 0.040 | 4.403 | 🚀 110.2x |
| Long strings 10K (ascii=F) | 0.040 | 4.360 | 🚀 108.7x |

Valid JSON (fast path)

| Test Case | fast_json_repair (ms) | json_repair (ms) | Speedup |
|---|---|---|---|
| Small ASCII (ascii=T) | 0.003 | 0.004 | 🚀 1.3x |
| Small ASCII (ascii=F) | 0.002 | 0.005 | 🚀 2.9x |
| Nested structure (ascii=T) | 0.007 | 0.008 | 🚀 1.2x |
| Nested structure (ascii=F) | 0.003 | 0.008 | 🚀 2.4x |
| Large array 1000 (ascii=T) | 0.799 | 0.907 | 🚀 1.1x |
| Large array 1000 (ascii=F) | 0.421 | 0.903 | 🚀 2.1x |
| Large object 500 (ascii=T) | 0.506 | 0.590 | 🚀 1.2x |
| Large object 500 (ascii=F) | 0.281 | 0.571 | 🚀 2.0x |

Overall: 19.7x faster across all test cases
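Each per-row speedup is the json_repair time divided by the fast_json_repair time. The short check below reproduces two rows from the published timings; speedup is a hypothetical helper name. Rows with sub-0.01 ms timings may not reproduce exactly, because the printed timings are rounded to three decimals.

```python
def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """Ratio of the baseline time to the candidate time."""
    return baseline_ms / candidate_ms


# (json_repair ms, fast_json_repair ms), copied from the table above.
rows = {
    "Large object 500 (ascii=T)": (27.339, 0.404),
    "Very large 5000 (ascii=T)": (580.959, 29.531),
}

for name, (baseline, candidate) in rows.items():
    print(f"{name}: {speedup(baseline, candidate):.1f}x")
# Large object 500 (ascii=T): 67.7x
# Very large 5000 (ascii=T): 19.7x
```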

Key Insights:

  • 🚀 = fast_json_repair is faster (all test cases)
  • Invalid JSON repair: 5-110x faster
  • Valid JSON with ensure_ascii=False: 2-3x faster (uses orjson fast path)
  • Valid JSON with ensure_ascii=True: 1.1-1.3x faster
  • Best performance gains: Long strings (110x), large objects (68x), very large arrays (20x)

Performance Advantages

  • Large JSON documents: 10-70x faster for documents with many keys/values
  • Long strings: Up to 110x faster for documents with large string values
  • Very large arrays: 20x faster for arrays with thousands of elements
  • Deeply nested structures: 7-10x faster with consistent performance
  • Memory efficiency: Lower memory footprint due to Rust's zero-cost abstractions and optimized allocations

Run python benchmark.py to test performance on your system. See PERFORMANCE.md for detailed analysis.
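For a quick spot check without the full suite, a per-call timing can be taken with timeit. The helper below is a sketch and not the project's benchmark.py; mean_ms is a hypothetical name, and the demo times the standard-library parser so that it runs anywhere.

```python
import json
import timeit
from typing import Callable


def mean_ms(fn: Callable[[str], object], payload: str, number: int = 1000) -> float:
    """Average wall-clock time of fn(payload) in milliseconds."""
    total_seconds = timeit.timeit(lambda: fn(payload), number=number)
    return total_seconds / number * 1000.0


payload = json.dumps({"items": list(range(1000))})
baseline = mean_ms(json.loads, payload)
print(f"json.loads: {baseline:.3f} ms per call")

# With both packages installed, the same helper could time
# fast_json_repair.repair_json against json_repair.repair_json
# on an identical malformed payload.
```

Results vary by machine and load, so compare ratios measured in the same run instead of absolute numbers.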

AWS Deployment

Works seamlessly on AWS with pre-built wheels for all architectures:

  • x86_64 - Standard EC2 instances (t2, t3, m5, c5, etc.)
  • ARM64/aarch64 - Graviton instances (t4g, m6g, c6g, etc.)

# Install on any AWS instance - pip auto-selects the correct wheel
pip install fast-json-repair
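To see which of the two architectures an instance reports, Python's platform module is enough. The mapping below is a simplified sketch; pip itself selects wheels from full platform compatibility tags, and wheel_arch is a hypothetical helper name.

```python
import platform


def wheel_arch(machine: str) -> str:
    """Map a platform.machine() value to the wheel architecture family."""
    machine = machine.lower()
    if machine in ("x86_64", "amd64"):
        return "x86_64"
    if machine in ("aarch64", "arm64"):
        return "aarch64"
    return "unknown"


# On a Graviton instance this is expected to report aarch64.
print(wheel_arch(platform.machine()))
```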

For Lambda layers and cross-compilation, see DEPLOYMENT.md.

Development

Quick Reference

| Task | Command | VS Code Task |
|---|---|---|
| Setup | ./setup.sh | - |
| Build (debug) | maturin develop | 🔧 Build: Development |
| Build (release) | maturin develop --release | 🚀 Build: Development (Release) |
| Run tests | pytest tests/ -v | 🧪 Test: Python (All) |
| Run benchmarks | python benchmark.py | ⚡ Benchmark: Run Full Suite |
| Format code | cargo fmt && black . && isort . | ✨ Format: All (Rust + Python) |
| Lint Rust | cargo clippy | 🦀 Rust: Clippy |
| Lint Python | ruff check . | 🐍 Python: Lint (Ruff) |
| Full check | maturin develop && pytest && python benchmark.py | ✅ Full Check: Build + Test + Benchmark |

Quick Setup

# Automated setup (recommended)
./setup.sh

# Or manually with uv
uv venv && source .venv/bin/activate
uv sync
maturin develop

VS Code Integration

This project includes a complete VS Code workspace configuration:

Getting Started:

  1. Open the project folder in VS Code
  2. Install recommended extensions (you'll see a prompt)
  3. The Python interpreter will auto-detect .venv
  4. Press Cmd+Shift+P → "Tasks: Run Task" to see all available commands

Available Tasks:

  • 🔧 Build Tasks: Debug build, release build, wheels, cross-platform builds
  • 🧪 Test Tasks: Run all tests, quick tests, coverage reports
  • ⚡ Benchmark Tasks: Full benchmarks, quick benchmarks, save results
  • 🦀 Rust Tasks: Check, clippy, format, clean
  • 🐍 Python Tasks: Format (black), sort imports (isort), lint (ruff)
  • 🚢 Workflows: Full check (build+test+benchmark), release prep, quality checks

Debugging:

  • Press F5 to debug Python tests
  • Set breakpoints in Python code
  • Use "Debug: Select and Start Debugging" for specific configs

Common Commands

See the Quick Reference table above for the most common tasks. Additional commands:

# Code quality
black .                # Format Python code
isort .                # Sort Python imports
ruff check .           # Lint Python code

# Cross-platform builds (requires zig)
maturin build --release --target x86_64-unknown-linux-gnu --zig
maturin build --release --target aarch64-unknown-linux-gnu --zig
maturin build --release --target universal2-apple-darwin

Project Structure

fast_json_repair/
├── src/
│   └── lib.rs              # Rust implementation (core repair logic)
├── python/
│   └── fast_json_repair/
│       └── __init__.py     # Python API wrapper
├── tests/
│   └── test_all.py         # Python test suite
├── benchmark.py            # Performance benchmarks
├── pyproject.toml          # Python package configuration
├── Cargo.toml              # Rust package configuration
└── .vscode/                # VS Code workspace settings (local)
    ├── settings.json       # Python/Rust interpreter & formatting
    ├── tasks.json          # Build/test/benchmark tasks
    ├── launch.json         # Debug configurations
    └── extensions.json     # Recommended extensions

Typical Workflow

  1. Make Changes - Edit Rust (src/) or Python (python/) code
  2. Rebuild - maturin develop or VS Code task 🔧 Build: Development
  3. Test - pytest tests/ -v or VS Code task 🧪 Test: Python (All)
  4. Benchmark - python benchmark.py or VS Code task ⚡ Benchmark: Run Full Suite
  5. Release - maturin build --release when ready to publish

License

MIT License (same as original json_repair)

Credits & Acknowledgments

Original Author

  • Stefano Baccianella - Creator of the original json_repair library
    • Original concept and algorithm design
    • Python implementation that this library is based on
    • Comprehensive test cases and edge case handling

This Rust Port

  • Performance optimization through Rust implementation
  • Maintains API compatibility with the original for repair_json() and loads()
  • Uses PyO3 for Python bindings
  • Uses orjson for fast JSON parsing

Special Thanks

A huge thank you to Stefano Baccianella for creating json_repair and making it open source. This library wouldn't exist without the original brilliant implementation that has helped countless developers handle malformed JSON from LLMs.

If you appreciate this performance-focused port, please also show support for the original json_repair project that made it all possible.
