High-performance JSON repair library using Rust

fast_json_repair

A high-performance JSON repair library for Python, powered by Rust. It is a drop-in replacement for json_repair's repair_json() and loads() functions and runs about 5x faster on average in the benchmarks listed under Performance.

🙏 Attribution

This library is a Rust port of the excellent json_repair library created by Stefano Baccianella. The original Python implementation is a brilliant solution for fixing malformed JSON from Large Language Models (LLMs), and this port aims to bring the same functionality with improved performance.

All credit for the original concept, logic, and implementation goes to Stefano Baccianella. This Rust port maintains API compatibility with the original library while leveraging Rust's performance benefits.

If you find this library useful, please also consider starring the original json_repair repository.

Features

  • 🚀 Rust Performance: Core repair logic implemented in Rust for maximum speed
  • 🔧 Automatic Repair: Fixes common JSON errors automatically
  • 🐍 Python Compatible: Works with Python 3.11+
  • 📦 Drop-in Replacement: Compatible API with the original json_repair library
  • Fast JSON Parsing: Uses orjson for JSON parsing operations
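
Because the API is meant to be a drop-in replacement, a project can prefer this package and fall back to the original when it is not installed. The following is a minimal sketch of that pattern, not part of either library: the helper name pick_repair_json is hypothetical, and the only assumption is that both packages expose a callable named repair_json, as described in this README.

```python
import importlib

def pick_repair_json(candidates=("fast_json_repair", "json_repair")):
    """Return (module_name, repair_json) for the first importable candidate.

    Hypothetical helper: returns (None, None) when no candidate is installed
    or when none of them exposes a callable named repair_json.
    """
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # not installed, try the next candidate
        func = getattr(module, "repair_json", None)
        if callable(func):
            return name, func
    return None, None

backend_name, repair_json = pick_repair_json()
if repair_json is None:
    print("No JSON repair backend installed")
else:
    print(f"Using {backend_name}")
```

Resolving the backend once at import time keeps the rest of the code unaware of which implementation is in use.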

Compatibility with Original json_repair

✅ Included Features

  • repair_json() - Main repair function with these parameters:
    • return_objects - Return parsed Python objects instead of JSON string
    • skip_json_loads - Skip initial JSON validation for better performance
    • ensure_ascii - Control Unicode escaping in output
    • indent - Format output with indentation
  • loads() - Convenience function for loading broken JSON directly to Python objects
  • All repair capabilities:
    • Single quotes → double quotes
    • Unquoted keys → quoted keys
    • Python literals (True/False/None) → JSON literals (true/false/null)
    • Trailing commas removal
    • Missing commas addition
    • Auto-closing unclosed brackets/braces
    • Escape sequence handling
    • Unicode support
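
To make the ensure_ascii and indent options concrete, the sketch below shows the equivalent flags on Python's standard json module. It assumes that repair_json() gives these parameters the same meaning as json.dumps does; whitespace in the library's real output may differ, since it serializes with orjson rather than the standard library.

```python
import json

data = {"message": "你好世界"}

# ensure_ascii=True escapes every non-ASCII character as \uXXXX
escaped = json.dumps(data, ensure_ascii=True)

# ensure_ascii=False keeps the original characters in the output
preserved = json.dumps(data, ensure_ascii=False)

# indent=2 pretty-prints with two spaces per nesting level
indented = json.dumps({"a": 1}, indent=2)

print(escaped)
print(preserved)
print(indented)
```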

❌ Not Included (By Design)

  • File operations (load(), from_file()) - Use Python's built-in file handling + repair_json()
  • CLI tool - This is a library-only implementation
  • Streaming support (stream_stable parameter) - Not yet implemented
  • Custom JSON encoder parameters - Uses orjson's optimized defaults
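
Since load() and from_file() are not provided, file handling can be wrapped in a few lines. The helper below is a hypothetical sketch (load_json_file is not part of this library). It takes the parser as an argument so it runs with the standard json.loads as shown, and accepts fast_json_repair.loads in the same position.

```python
import json
import tempfile
from pathlib import Path

def load_json_file(path, parse=json.loads, encoding="utf-8"):
    """Read a text file and hand its contents to the given parser.

    Hypothetical helper. Pass parse=fast_json_repair.loads to tolerate
    malformed JSON; the default strict parser is used here for the demo.
    """
    text = Path(path).read_text(encoding=encoding)
    return parse(text)

# Demo with the strict parser and a temporary file
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.json"
    sample.write_text('{"key": "value"}', encoding="utf-8")
    demo = load_json_file(sample)

print(demo)
```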

🔄 Differences

  • Number parsing: Unquoted numbers are parsed as numbers (not strings)
  • Performance: 5x faster average, up to 15x for large objects
  • Dependencies: Requires orjson instead of standard json library

Installation

Quick Install

Install from PyPI:

pip install fast-json-repair

Install from GitHub Releases

Download pre-built wheels from the Releases page:

# For macOS (Intel)
pip install https://github.com/dvideby0/fast_json_repair/releases/download/v0.1.0/fast_json_repair-0.1.0-cp311-abi3-macosx_10_12_x86_64.whl

# For macOS (Apple Silicon)  
pip install https://github.com/dvideby0/fast_json_repair/releases/download/v0.1.0/fast_json_repair-0.1.0-cp311-abi3-macosx_11_0_arm64.whl

# For Linux x86_64
pip install https://github.com/dvideby0/fast_json_repair/releases/download/v0.1.0/fast_json_repair-0.1.0-cp311-abi3-manylinux_2_17_x86_64.whl

# For Linux ARM64 (AWS Graviton)
pip install https://github.com/dvideby0/fast_json_repair/releases/download/v0.1.0/fast_json_repair-0.1.0-cp311-abi3-manylinux_2_17_aarch64.whl

# For Windows
pip install https://github.com/dvideby0/fast_json_repair/releases/download/v0.1.0/fast_json_repair-0.1.0-cp311-abi3-win_amd64.whl

Build from Source

Prerequisites

  • Python 3.11 or higher
  • Rust toolchain (curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh)

Build Steps

# Clone the repository
git clone https://github.com/dvideby0/fast_json_repair.git
cd fast_json_repair

# Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install build dependencies
pip install maturin orjson

# Build and install the package
maturin develop --release

Usage

from fast_json_repair import repair_json, loads

# Fix broken JSON
broken = "{'name': 'John', 'age': 30}"  # Single quotes
fixed = repair_json(broken)
print(fixed)  # {"age":30,"name":"John"}

# Parse directly to Python object
data = loads("{'key': 'value'}")
print(data)  # {'key': 'value'}

# Handle Unicode properly
text = "{'message': '你好世界'}"
result = repair_json(text, ensure_ascii=False)
print(result)  # {"message":"你好世界"}

# Format with indentation
formatted = repair_json("{'a': 1}", indent=2)

What it repairs

  • Single quotes → Double quotes
  • Unquoted keys → Quoted keys
  • Python literals → JSON equivalents (True→true, False→false, None→null)
  • Trailing commas → Removed
  • Missing commas → Added
  • Extra commas → Removed
  • Unclosed brackets/braces → Auto-closed
  • Escape sequences → Properly handled
  • Unicode characters → Preserved or escaped based on settings
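
As an illustration of one repair from the list above, the toy function below auto-closes brackets, braces, and strings that were left open. It is a simplified sketch written for this README, not the algorithm the Rust core uses, and it handles only this single class of error.

```python
import json

def close_unclosed(text):
    """Append the closers needed to balance open brackets, braces, and strings.

    Toy illustration only: it does not fix quotes, commas, or literals.
    """
    closers = {"{": "}", "[": "]"}
    stack = []          # closers still owed, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False        # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in closers:
            stack.append(closers[ch])
        elif ch in "}]" and stack and stack[-1] == ch:
            stack.pop()
    suffix = '"' if in_string else ""
    return text + suffix + "".join(reversed(stack))

repaired = close_unclosed('{"a": [1, 2, {"b": "x')
print(repaired)
print(json.loads(repaired))
```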

API Reference

repair_json(json_string, **kwargs)

Repairs invalid JSON and returns a valid JSON string, or a parsed Python object when return_objects is True.

Parameters:

  • json_string (str): The potentially invalid JSON string to repair
  • return_objects (bool): If True, return parsed Python object instead of JSON string
  • skip_json_loads (bool): If True, skip initial validation for better performance
  • ensure_ascii (bool): If True, escape non-ASCII characters in output
  • indent (int): Number of spaces for indentation (None for compact output)

Returns:

  • str or object: Repaired JSON string or parsed Python object
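
The skip_json_loads flag is easier to reason about with the control flow written out. The sketch below assumes the flag follows the same idea as in the original json_repair: a strict parse is attempted first, and the repair step runs only when that fails or when validation is skipped. The names parse_lenient and toy_repair are hypothetical, and toy_repair stands in for the real repair function.

```python
import json

def parse_lenient(text, repair, skip_validation=False):
    """Parse text strictly first; call repair only when that fails."""
    if not skip_validation:
        try:
            return json.loads(text)   # fast path: input was already valid
        except json.JSONDecodeError:
            pass                      # fall through to the repair step
    return json.loads(repair(text))

calls = []  # records every input that reached the repair step

def toy_repair(text):
    """Stand-in for a real repair function: only swaps quote characters."""
    calls.append(text)
    return text.replace("'", '"')

valid = parse_lenient('{"a": 1}', toy_repair)   # repair is not called
fixed = parse_lenient("{'a': 1}", toy_repair)   # repair is called once
print(valid, fixed, len(calls))
```

Skipping validation saves the strict-parse attempt when the input is known to be malformed, at the cost of running the repair step on input that might already have been valid.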

loads(json_string, **kwargs)

Repairs an invalid JSON string and parses it into a Python object.

Parameters:

  • json_string (str): The potentially invalid JSON string to repair and parse
  • **kwargs: Additional arguments passed to repair_json

Returns:

  • object: The parsed Python object

Performance

This Rust-based implementation provides significant performance improvements over the pure Python original.

Benchmark Results

Test Case                        Rust (ms)   Python (ms)   Speedup
Simple quotes                         0.02          0.04      1.8x
Medium nested structure               0.08          0.19      2.4x
Large array (1000 items)              1.13          2.23      2.0x
Deep nesting (50 levels)              0.16          0.38      2.4x
Large object (500 keys)               1.87         29.50     15.8x
Complex mixed issues                  0.12          0.39      3.4x
Very large array (5000 items)       106.00        514.00      4.9x
Unicode and special chars             0.06          0.15      2.5x
Long strings (10K chars)              0.59          3.60      6.1x

Overall: ~5x faster than the pure Python implementation, with up to 15x speedup for certain workloads.

Performance Advantages

  • Large JSON documents: 5-15x faster for documents with many keys/values
  • Deeply nested structures: 2-3x faster with consistent performance
  • Documents with many errors: Scales better with number of repairs needed
  • Memory efficiency: Lower memory footprint due to Rust's zero-cost abstractions

Run python benchmark.py to test performance on your system.
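
For a quick comparison on your own data, a small timing harness is enough. This is a minimal sketch built on the standard timeit module. It is not the project's benchmark.py, and the helper name time_call is hypothetical. The demo times json.loads so that it runs without either repair library installed.

```python
import json
import timeit

def time_call(func, payload, repeat=5, number=100):
    """Return the best observed time per call, in milliseconds."""
    timer = timeit.Timer(lambda: func(payload))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1000.0

# Sample payload: a valid JSON array of 1000 small objects
payload = json.dumps([{"id": i, "name": f"item {i}"} for i in range(1000)])

baseline_ms = time_call(json.loads, payload)
print(f"json.loads: {baseline_ms:.3f} ms per call")
```

To compare the two repair libraries, pass each one's repair_json as func with the same malformed payload and divide the two results.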

Differences from original json_repair

  • Core repair logic implemented in Rust for performance
  • Uses orjson instead of standard json library
  • Numbers in unquoted values are parsed as numbers (not strings)
  • Focused on string repair only (no file operations in core library)

AWS Deployment

This library runs on AWS Linux systems. Pre-built wheels are available for:

  • x86_64 (standard EC2 instances like t2, t3, m5, c5)
  • ARM64/aarch64 (Graviton instances like t4g, m6g, c6g)

Quick Deploy to AWS

# For x86_64 instances
pip install target/wheels/fast_json_repair-0.1.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl orjson

# For ARM64/Graviton instances
pip install target/wheels/fast_json_repair-0.1.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl orjson

Building Linux Wheels (Cross-compilation from macOS)

# Install build tools
cargo install cargo-zigbuild
brew install zig

# Add Linux targets
rustup target add x86_64-unknown-linux-gnu
rustup target add aarch64-unknown-linux-gnu

# Build Linux x86_64 wheel
maturin build --release --target x86_64-unknown-linux-gnu --zig

# Build Linux ARM64 wheel (for AWS Graviton)
maturin build --release --target aarch64-unknown-linux-gnu --zig

AWS Lambda Deployment

Create a Lambda layer:

mkdir -p lambda-layer/python
pip install -t lambda-layer/python/ target/wheels/fast_json_repair-*.whl orjson
cd lambda-layer && zip -r fast_json_repair_layer.zip python

See DEPLOYMENT.md for detailed deployment instructions.

Development

# Run tests
python test_repair.py

# Build for current platform
maturin develop

# Build release wheels
maturin build --release

License

MIT License (same as original json_repair)

Credits & Acknowledgments

Original Author

  • Stefano Baccianella - Creator of the original json_repair library
    • Original concept and algorithm design
    • Python implementation that this library is based on
    • Comprehensive test cases and edge case handling

This Rust Port

  • Performance optimization through Rust implementation
  • Maintains API compatibility with the original for repair_json() and loads()
  • Uses PyO3 for Python bindings
  • Uses orjson for fast JSON parsing

Special Thanks

A huge thank you to Stefano Baccianella for creating json_repair and making it open source. This library wouldn't exist without the original brilliant implementation that has helped countless developers handle malformed JSON from LLMs.

If you appreciate this performance-focused port, please also show support for the original json_repair project that made it all possible.
