High-performance JSON repair library using Rust
fast_json_repair
A high-performance JSON repair library for Python, powered by Rust. This is a drop-in replacement for json_repair with significant performance improvements.
🙏 Attribution
This library is a Rust port of the excellent json_repair library created by Stefano Baccianella. The original Python implementation is a brilliant solution for fixing malformed JSON from Large Language Models (LLMs), and this port aims to bring the same functionality with improved performance.
All credit for the original concept, logic, and implementation goes to Stefano Baccianella. This Rust port maintains API compatibility with the original library while leveraging Rust's performance benefits.
If you find this library useful, please also consider starring the original json_repair repository.
Features
- 🚀 Rust Performance: Core repair logic implemented in Rust for maximum speed
- 🔧 Automatic Repair: Fixes common JSON errors automatically
- 🐍 Python Compatible: Works with Python 3.11+
- 📦 Drop-in Replacement: Compatible API with the original json_repair library
- ⚡ Fast JSON Parsing: Uses orjson for JSON parsing operations
Compatibility with Original json_repair
✅ Included Features
- repair_json() - Main repair function with all parameters:
  - return_objects - Return parsed Python objects instead of JSON string
  - skip_json_loads - Skip initial JSON validation for better performance
  - ensure_ascii - Control Unicode escaping in output
  - indent - Format output with indentation
- loads() - Convenience function for loading broken JSON directly to Python objects
- All repair capabilities:
  - Single quotes → double quotes
  - Unquoted keys → quoted keys
  - Python literals (True/False/None) → JSON literals (true/false/null)
  - Trailing commas removal
  - Missing commas addition
  - Auto-closing unclosed brackets/braces
  - Escape sequence handling
  - Unicode support
❌ Not Included (By Design)
- File operations (load(), from_file()) - Use Python's built-in file handling + repair_json()
- CLI tool - This is a library-only implementation
- Streaming support (stream_stable parameter) - Not yet implemented
- Custom JSON encoder parameters - Uses orjson's optimized defaults
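Since load() and from_file() are not provided, reading a file and repairing its contents takes a few lines of ordinary Python. The helper below is a minimal sketch: the name load_repaired and its loads/encoding parameters are hypothetical, not part of fast_json_repair. It only assumes that fast_json_repair.loads accepts a string, as described in the API Reference.

```python
from pathlib import Path


def load_repaired(path, loads=None, encoding="utf-8"):
    """Read a text file and parse its (possibly malformed) JSON content.

    `loads` is any callable mapping a string to a Python object. It defaults
    to fast_json_repair.loads, imported lazily so that a different parser
    can be injected (for example in tests).
    """
    if loads is None:
        # Assumption: fast_json_repair is installed in this environment.
        from fast_json_repair import loads
    text = Path(path).read_text(encoding=encoding)
    return loads(text)
```

With the library installed, `load_repaired("response.json")` would return the parsed object for a hypothetical file named response.json.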
🔄 Differences
- Number parsing: Unquoted numbers are parsed as numbers (not strings)
- Performance: About 5x faster on average, up to 15x for large objects
- Dependencies: Requires orjson instead of the standard json library
Installation
Quick Install (Coming Soon)
Once published to PyPI, you'll be able to install with:
pip install fast-json-repair
Install from GitHub Releases
Download pre-built wheels from the Releases page:
# For macOS (Intel)
pip install https://github.com/dvideby0/fast_json_repair/releases/download/v0.1.0/fast_json_repair-0.1.0-cp311-abi3-macosx_10_12_x86_64.whl
# For macOS (Apple Silicon)
pip install https://github.com/dvideby0/fast_json_repair/releases/download/v0.1.0/fast_json_repair-0.1.0-cp311-abi3-macosx_11_0_arm64.whl
# For Linux x86_64
pip install https://github.com/dvideby0/fast_json_repair/releases/download/v0.1.0/fast_json_repair-0.1.0-cp311-abi3-manylinux_2_17_x86_64.whl
# For Linux ARM64 (AWS Graviton)
pip install https://github.com/dvideby0/fast_json_repair/releases/download/v0.1.0/fast_json_repair-0.1.0-cp311-abi3-manylinux_2_17_aarch64.whl
# For Windows
pip install https://github.com/dvideby0/fast_json_repair/releases/download/v0.1.0/fast_json_repair-0.1.0-cp311-abi3-win_amd64.whl
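Picking the right wheel by hand is easy to get wrong, so the lookup can be scripted. The following is a sketch only: wheel_url is a hypothetical helper, the URLs are copied from the commands above, and the platform keys assume the usual values reported by platform.system() and platform.machine().

```python
import platform

BASE = "https://github.com/dvideby0/fast_json_repair/releases/download/v0.1.0/"

# (system, machine) -> wheel file name, taken from the install commands above.
WHEELS = {
    ("Darwin", "x86_64"): "fast_json_repair-0.1.0-cp311-abi3-macosx_10_12_x86_64.whl",
    ("Darwin", "arm64"): "fast_json_repair-0.1.0-cp311-abi3-macosx_11_0_arm64.whl",
    ("Linux", "x86_64"): "fast_json_repair-0.1.0-cp311-abi3-manylinux_2_17_x86_64.whl",
    ("Linux", "aarch64"): "fast_json_repair-0.1.0-cp311-abi3-manylinux_2_17_aarch64.whl",
    ("Windows", "AMD64"): "fast_json_repair-0.1.0-cp311-abi3-win_amd64.whl",
}


def wheel_url(system=None, machine=None):
    """Return the release URL for the given (or current) platform."""
    system = system or platform.system()
    machine = machine or platform.machine()
    try:
        return BASE + WHEELS[(system, machine)]
    except KeyError:
        raise RuntimeError(
            f"no pre-built wheel listed for {system}/{machine}; build from source"
        ) from None
```

Calling `wheel_url()` with no arguments resolves the URL for the machine it runs on; the result can then be passed to pip install.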
Build from Source
Prerequisites
- Python 3.11 or higher
- Rust toolchain (curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh)
Build Steps
# Clone the repository
git clone https://github.com/dvideby0/fast_json_repair.git
cd fast_json_repair
# Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install build dependencies
pip install maturin orjson
# Build and install the package
maturin develop --release
Usage
from fast_json_repair import repair_json, loads
# Fix broken JSON
broken = "{'name': 'John', 'age': 30}" # Single quotes
fixed = repair_json(broken)
print(fixed) # {"age":30,"name":"John"}
# Parse directly to Python object
data = loads("{'key': 'value'}")
print(data) # {'key': 'value'}
# Handle Unicode properly
text = "{'message': '你好世界'}"
result = repair_json(text, ensure_ascii=False)
print(result) # {"message":"你好世界"}
# Format with indentation
formatted = repair_json("{'a': 1}", indent=2)
What it repairs
- Single quotes → Double quotes
- Unquoted keys → Quoted keys
- Python literals → JSON equivalents (True→true, False→false, None→null)
- Trailing commas → Removed
- Missing commas → Added
- Extra commas → Removed
- Unclosed brackets/braces → Auto-closed
- Escape sequences → Properly handled
- Unicode characters → Preserved or escaped based on settings
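To check the categories above against an installed build, a small smoke test helps. The sketch below is not part of the library: the sample inputs and the function names are illustrative. It confirms that each input is rejected by the strict standard-library parser, then reports which inputs a given repair function fails to turn into valid JSON.

```python
import json

# One illustrative malformed input per repair category (hypothetical samples).
SAMPLES = {
    "single quotes": "{'a': 1}",
    "unquoted keys": "{a: 1}",
    "python literals": '{"a": True, "b": None}',
    "trailing comma": "[1, 2, 3,]",
    "missing comma": '{"a": 1 "b": 2}',
    "extra comma": '{"a": 1,, "b": 2}',
    "unclosed bracket": '{"a": [1, 2',
}


def rejected_by_strict_parser(text):
    """True if the standard json module refuses to parse `text`."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return True
    return False


def unrepaired(repair, samples=SAMPLES):
    """Names of samples whose repaired output is still not valid JSON."""
    return [
        name
        for name, text in samples.items()
        if rejected_by_strict_parser(repair(text))
    ]
```

Passing repair_json (imported from fast_json_repair) to `unrepaired` should yield an empty list if every category is handled as described; any names it returns point at cases worth investigating.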
API Reference
repair_json(json_string, **kwargs)
Repairs invalid JSON and returns a valid JSON string.
Parameters:
- json_string (str): The potentially invalid JSON string to repair
- return_objects (bool): If True, return parsed Python object instead of JSON string
- skip_json_loads (bool): If True, skip initial validation for better performance
- ensure_ascii (bool): If True, escape non-ASCII characters in output
- indent (int): Number of spaces for indentation (None for compact output)
Returns:
- str or object: Repaired JSON string or parsed Python object
loads(json_string, **kwargs)
Repairs an invalid JSON string and parses it into a Python object.
Parameters:
- json_string (str): The potentially invalid JSON string to repair and parse
- **kwargs: Additional arguments passed to repair_json
Returns:
- object: The parsed Python object
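A common caller-side pattern is to try a strict parse first and fall back to repair only when that fails. The sketch below illustrates that pattern; it is not a description of what the library does internally. The name parse_json_lenient and its fallback parameter are hypothetical, and the default fallback assumes fast_json_repair.loads as documented above.

```python
import json


def parse_json_lenient(text, fallback=None):
    """Parse `text` strictly; on failure, hand it to a repairing parser.

    `fallback` is any callable mapping a string to a Python object. It
    defaults to fast_json_repair.loads, imported only when actually needed.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        if fallback is None:
            # Assumption: fast_json_repair is installed in this environment.
            from fast_json_repair import loads as fallback
        return fallback(text)
```

Well-formed input never reaches the repair step, so the repair cost is paid only for malformed documents.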
Performance
This Rust-based implementation provides significant performance improvements over the pure Python original.
Benchmark Results
| Test Case | Rust (ms) | Python (ms) | Speedup |
|---|---|---|---|
| Simple quotes | 0.02 | 0.04 | 1.8x |
| Medium nested structure | 0.08 | 0.19 | 2.4x |
| Large array (1000 items) | 1.13 | 2.23 | 2.0x |
| Deep nesting (50 levels) | 0.16 | 0.38 | 2.4x |
| Large object (500 keys) | 1.87 | 29.50 | 15.8x |
| Complex mixed issues | 0.12 | 0.39 | 3.4x |
| Very large array (5000 items) | 106.00 | 514.00 | 4.9x |
| Unicode and special chars | 0.06 | 0.15 | 2.5x |
| Long strings (10K chars) | 0.59 | 3.60 | 6.1x |
Overall: ~5x faster than the pure Python implementation, with up to 15x speedup for certain workloads.
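The summary figure can be related back to the table with a little arithmetic. The snippet below assumes that "overall" refers to the arithmetic mean of the Speedup column; the source does not state how the figure was derived. The median is included because one large outlier pulls the mean upward.

```python
from statistics import mean, median

# Speedup column copied from the benchmark table above.
speedups = {
    "Simple quotes": 1.8,
    "Medium nested structure": 2.4,
    "Large array (1000 items)": 2.0,
    "Deep nesting (50 levels)": 2.4,
    "Large object (500 keys)": 15.8,
    "Complex mixed issues": 3.4,
    "Very large array (5000 items)": 4.9,
    "Unicode and special chars": 2.5,
    "Long strings (10K chars)": 6.1,
}

avg = mean(speedups.values())
mid = median(speedups.values())
best_case = max(speedups, key=speedups.get)

print(f"mean speedup:   {avg:.1f}x")   # mean speedup:   4.6x
print(f"median speedup: {mid:.1f}x")   # median speedup: 2.5x
print(f"best case:      {best_case} ({speedups[best_case]}x)")
```

Under that assumption the mean is about 4.6x, consistent with the "~5x" summary, while a typical test case sits closer to 2.5x.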
Performance Advantages
- Large JSON documents: 5-15x faster for documents with many keys/values
- Deeply nested structures: 2-3x faster with consistent performance
- Documents with many errors: Scales better with number of repairs needed
- Memory efficiency: Lower memory footprint due to Rust's zero-cost abstractions
Run python benchmark.py to test performance on your system.
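For a quick comparison on your own payloads, a small timing harness is enough. The sketch below is not the repository's benchmark.py, whose contents are not shown here; the function names are hypothetical. It takes the best of several runs to reduce noise from other processes.

```python
import time


def time_ms(func, payload, repeat=100):
    """Best-of-`repeat` wall-clock time in milliseconds for func(payload)."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(payload)
        best = min(best, time.perf_counter() - start)
    return best * 1000.0


def compare(candidates, payload, repeat=100):
    """Time each named callable on the same payload.

    `candidates` maps a label to a callable, for example
    {"rust": fast_json_repair.repair_json, "python": json_repair.repair_json}.
    """
    return {name: time_ms(func, payload, repeat) for name, func in candidates.items()}
```

Dividing the two timings in the returned dict gives a speedup figure comparable to the table above, for your hardware and your data.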
Differences from original json_repair
- Core repair logic implemented in Rust for performance
- Uses orjson instead of standard json library
- Numbers in unquoted values are parsed as numbers (not strings)
- Focused on string repair only (no file operations in core library)
AWS Deployment
This library runs on AWS Linux systems. Pre-built wheels are available for:
- x86_64 (standard EC2 instances like t2, t3, m5, c5)
- ARM64/aarch64 (Graviton instances like t4g, m6g, c6g)
Quick Deploy to AWS
# For x86_64 instances
pip install target/wheels/fast_json_repair-0.1.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl orjson
# For ARM64/Graviton instances
pip install target/wheels/fast_json_repair-0.1.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl orjson
Building Linux Wheels (Cross-compilation from macOS)
# Install build tools
cargo install cargo-zigbuild
brew install zig
# Add Linux targets
rustup target add x86_64-unknown-linux-gnu
rustup target add aarch64-unknown-linux-gnu
# Build Linux x86_64 wheel
maturin build --release --target x86_64-unknown-linux-gnu --zig
# Build Linux ARM64 wheel (for AWS Graviton)
maturin build --release --target aarch64-unknown-linux-gnu --zig
AWS Lambda Deployment
Create a Lambda layer:
mkdir -p lambda-layer/python
pip install -t lambda-layer/python/ target/wheels/fast_json_repair-*.whl orjson
cd lambda-layer && zip -r fast_json_repair_layer.zip python
See DEPLOYMENT.md for detailed deployment instructions.
Development
# Run tests
python test_repair.py
# Build for current platform
maturin develop
# Build release wheels
maturin build --release
License
MIT License (same as original json_repair)
Credits & Acknowledgments
Original Author
- Stefano Baccianella - Creator of the original json_repair library
- Original concept and algorithm design
- Python implementation that this library is based on
- Comprehensive test cases and edge case handling
This Rust Port
- Performance optimization through Rust implementation
- Maintains full API compatibility with the original
- Uses PyO3 for Python bindings
- Uses orjson for fast JSON parsing
Special Thanks
A huge thank you to Stefano Baccianella for creating json_repair and making it open source. This library wouldn't exist without the original brilliant implementation that has helped countless developers handle malformed JSON from LLMs.
If you appreciate this performance-focused port, please also show support for the original json_repair project that made it all possible.
File details
Details for the file fast_json_repair-0.1.2.tar.gz.
File metadata
- Download URL: fast_json_repair-0.1.2.tar.gz
- Upload date:
- Size: 24.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ea1053e30396142d206ef4a1f252b8e308d4d47e562d2c79c45126ef41ee2a50 |
| MD5 | d0a8e6f69bf5aa96b16eece5ffb5f43d |
| BLAKE2b-256 | 3a72eb29c9e9a15b498ab2e066d2c370c0e24e9e9d3a9fb0538ef61b264572b0 |
File details
Details for the file fast_json_repair-0.1.2-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: fast_json_repair-0.1.2-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 244.7 kB
- Tags: CPython 3.11+, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4a9d0c58649f5f4e05ee45870170000ef384f69570fdd46c040aee210462f55a |
| MD5 | 6c2ffafab127d470001d7f6cafc4a1e8 |
| BLAKE2b-256 | ba1772708bf073a95d1758f21a8fb896b9d74d58693cf4cb46a8de048b713a4b |
File details
Details for the file fast_json_repair-0.1.2-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.
File metadata
- Download URL: fast_json_repair-0.1.2-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
- Upload date:
- Size: 232.6 kB
- Tags: CPython 3.11+, manylinux: glibc 2.17+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b780d29d01594456b5a8674e18b5c784a77c5b5a48a4c991f3605cc227108fa8 |
| MD5 | 139731367542e0db0846a070dca50098 |
| BLAKE2b-256 | 9e77c9002e33a61e1db23c91666237f3df79a0a0c8c5580ebfd259b38de22047 |
File details
Details for the file fast_json_repair-0.1.2-cp311-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: fast_json_repair-0.1.2-cp311-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 211.7 kB
- Tags: CPython 3.11+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 96484880db1b28716ade66147e20265af82ae06239939f89152966f9ded027ad |
| MD5 | 0b8570e9d2dbd822890cfaa18e6b4a13 |
| BLAKE2b-256 | a8981e2876b7bdabaed08d8bb0a9d8b0ca079727015fcc57b4e78ffcafb49317 |