High-performance JSON repair library using Rust
fast_json_repair
A high-performance JSON repair library for Python, powered by Rust. This is a drop-in replacement for json_repair with significant performance improvements.
Attribution
This library is a Rust port of the excellent json_repair library created by Stefano Baccianella. The original Python implementation is a brilliant solution for fixing malformed JSON from Large Language Models (LLMs), and this port aims to bring the same functionality with improved performance.
All credit for the original concept, logic, and implementation goes to Stefano Baccianella. This Rust port maintains API compatibility with the original library while leveraging Rust's performance benefits.
If you find this library useful, please also consider starring the original json_repair repository.
Features
- Available on PyPI: pip install fast-json-repair
- Rust Performance: Core repair logic implemented in Rust for maximum speed
- Automatic Repair: Fixes common JSON errors automatically
- Python Compatible: Works with Python 3.11-3.14
- Drop-in Replacement: Compatible API with the original json_repair library
- Fast JSON Parsing: Uses orjson for JSON parsing operations
Compatibility with Original json_repair
This is a drop-in replacement for the core API of the original json_repair library:
Included:
- repair_json() - Main repair function with return_objects, skip_json_loads, ensure_ascii, indent parameters
- loads() - Convenience function for loading broken JSON directly to Python objects
- All repair capabilities: quotes, literals, commas, brackets, escape sequences, Unicode
Not Included:
- File operations (load(), from_file()) - Use Python's built-in file handling + repair_json()
- CLI tool - Library-only implementation
- Streaming support - Not yet implemented
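Because load() and from_file() are not provided, reading the file yourself and passing the text to loads() covers the same ground. The following is a minimal sketch, not part of the library: the helper name load_repaired is hypothetical, and the fallback to the standard-library parser is an assumption added only so the snippet still runs where the package is not installed (no repair happens in that case).

```python
import json
from pathlib import Path
from typing import Any

try:
    from fast_json_repair import loads as repair_loads
except ImportError:
    # Assumption for this sketch only: without the package, fall back to
    # strict parsing so the example still runs (nothing is repaired then).
    repair_loads = json.loads


def load_repaired(path: str, encoding: str = "utf-8") -> Any:
    """Hypothetical helper: read a file and parse it, repairing if needed."""
    text = Path(path).read_text(encoding=encoding)
    return repair_loads(text)
```

Calling load_repaired("response.json") would then return the parsed Python object for that file; the file name here is only an illustration.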
Key Differences:
- 20x faster on average, up to 110x for inputs with long strings
- Unquoted numbers parsed as numbers (not strings)
- Uses orjson for high-performance JSON operations
Installation
Quick Install
pip install fast-json-repair
Build from Source
Prerequisites
- Python 3.11-3.14
- Rust toolchain (curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh)
- uv (recommended) or pip
Quick Start with uv (Recommended)
# Clone the repository
git clone https://github.com/dvideby0/fast_json_repair.git
cd fast_json_repair
# Run the automated setup script
./setup.sh
The setup script will:
- Install uv and Rust if needed
- Create a virtual environment (.venv)
- Install all dependencies
- Build the Rust extension
- Verify the installation
Manual Build Steps
# Clone the repository
git clone https://github.com/dvideby0/fast_json_repair.git
cd fast_json_repair
# Option 1: Using uv (fast!)
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv sync
maturin develop --release
# Option 2: Using pip
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install maturin orjson
maturin develop --release
Usage
from fast_json_repair import repair_json, loads
# Fix broken JSON
broken = "{'name': 'John', 'age': 30}" # Single quotes
fixed = repair_json(broken)
print(fixed) # {"age":30,"name":"John"}
# Parse directly to Python object
data = loads("{'key': 'value'}")
print(data) # {'key': 'value'}
# Handle Unicode properly
text = "{'message': '你好世界'}"
result = repair_json(text, ensure_ascii=False)
print(result) # {"message":"你好世界"}
# Format with indentation
formatted = repair_json("{'a': 1}", indent=2)
What It Repairs
Automatically fixes common JSON formatting issues:
| Issue | Fix |
|---|---|
| Single quotes | → Double quotes |
| Unquoted keys | → Quoted keys |
| Python literals (True/False/None) | → JSON (true/false/null) |
| Trailing commas | Removed |
| Missing commas | Added |
| Extra commas | Removed |
| Unclosed brackets/braces | Auto-closed |
| Invalid escape sequences | Fixed |
| Unicode characters | Preserved or escaped (configurable) |
API Reference
repair_json(json_string, **kwargs)
Repairs invalid JSON and returns a valid JSON string.
Parameters:
- json_string (str): The potentially invalid JSON string to repair
- return_objects (bool): If True, return the parsed Python object instead of a JSON string
- skip_json_loads (bool): If True, skip initial validation for better performance
- ensure_ascii (bool): If True, escape non-ASCII characters in output
- indent (int): Number of spaces for indentation (None for compact output)
Returns:
- str or object: Repaired JSON string or parsed Python object
loads(json_string, **kwargs)
Repairs an invalid JSON string and parses it into a Python object.
Parameters:
- json_string (str): The potentially invalid JSON string to repair and parse
- **kwargs: Additional arguments passed to repair_json
Returns:
- object: The parsed Python object
Performance
This Rust-based implementation provides significant performance improvements over the pure Python original.
Fast Path Optimization
The library automatically uses the fastest path when possible:
Fast Path (uses orjson for serialization):
- Valid JSON input
- ensure_ascii=False
- indent is either None (compact) or 2
Fallback Path (uses stdlib json):
- Valid JSON input with ensure_ascii=True
- Valid JSON input with indent values other than None or 2
Repair Path (uses Rust implementation):
- Any invalid JSON that needs repair
- Always respects ensure_ascii and indent settings
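The three paths listed above amount to a small decision rule. The sketch below restates that rule in plain Python to make it easy to check which path a given call would take. It is an illustration of the documented behavior, not the library's code: select_path is a hypothetical name, and json.loads is used as a stand-in for the library's own validity check.

```python
import json
from typing import Optional


def select_path(text: str, ensure_ascii: bool, indent: Optional[int]) -> str:
    """Return which of the three documented paths an input would take."""
    try:
        json.loads(text)  # stand-in validity check, not the library's own
    except ValueError:
        return "repair"  # invalid JSON goes through the Rust repair code
    if not ensure_ascii and indent in (None, 2):
        return "fast"  # serialization handled by orjson
    return "fallback"  # serialization handled by the stdlib json module


print(select_path('{"a": 1}', ensure_ascii=False, indent=2))     # fast
print(select_path('{"a": 1}', ensure_ascii=True, indent=None))   # fallback
print(select_path("{'a': 1}", ensure_ascii=False, indent=None))  # repair
```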
For maximum performance with valid JSON:
# Fastest - uses orjson throughout
result = repair_json(valid_json, ensure_ascii=False, indent=2)
# Slower - falls back to json.dumps for formatting
result = repair_json(valid_json, ensure_ascii=True) # ASCII escaping
result = repair_json(valid_json, indent=4) # Custom indentation
Benchmark Results
Comprehensive comparison of fast_json_repair vs json_repair across 20 test cases (10 invalid JSON, 10 valid JSON) with both ensure_ascii settings:
| Test Case | fast_json_repair (ms) | json_repair (ms) | Speedup |
|---|---|---|---|
| Invalid JSON (needs repair) | | | |
| Simple quotes (ascii=T) | 0.007 | 0.032 | 4.7x |
| Simple quotes (ascii=F) | 0.006 | 0.037 | 5.7x |
| Medium nested (ascii=T) | 0.020 | 0.192 | 9.6x |
| Medium nested (ascii=F) | 0.019 | 0.197 | 10.5x |
| Large array 1000 (ascii=T) | 0.246 | 2.273 | 9.3x |
| Large array 1000 (ascii=F) | 0.237 | 2.162 | 9.1x |
| Deep nesting 50 (ascii=T) | 0.055 | 0.410 | 7.5x |
| Deep nesting 50 (ascii=F) | 0.050 | 0.420 | 8.4x |
| Large object 500 (ascii=T) | 0.404 | 27.339 | 67.7x |
| Large object 500 (ascii=F) | 0.408 | 26.436 | 64.8x |
| Complex mixed (ascii=T) | 0.033 | 0.408 | 12.2x |
| Complex mixed (ascii=F) | 0.035 | 0.401 | 11.4x |
| Very large 5000 (ascii=T) | 29.531 | 580.959 | 19.7x |
| Very large 5000 (ascii=F) | 28.526 | 581.489 | 20.4x |
| Long strings 10K (ascii=T) | 0.040 | 4.403 | 110.2x |
| Long strings 10K (ascii=F) | 0.040 | 4.360 | 108.7x |
| Valid JSON (fast path) | | | |
| Small ASCII (ascii=T) | 0.003 | 0.004 | 1.3x |
| Small ASCII (ascii=F) | 0.002 | 0.005 | 2.9x |
| Nested structure (ascii=T) | 0.007 | 0.008 | 1.2x |
| Nested structure (ascii=F) | 0.003 | 0.008 | 2.4x |
| Large array 1000 (ascii=T) | 0.799 | 0.907 | 1.1x |
| Large array 1000 (ascii=F) | 0.421 | 0.903 | 2.1x |
| Large object 500 (ascii=T) | 0.506 | 0.590 | 1.2x |
| Large object 500 (ascii=F) | 0.281 | 0.571 | 2.0x |
Overall: 19.7x faster across all test cases
Key Insights:
- fast_json_repair is faster in all test cases
- Invalid JSON repair: 5-110x faster
- Valid JSON with ensure_ascii=False: 2-3x faster (uses orjson fast path)
- Valid JSON with ensure_ascii=True: 1.1-1.3x faster
- Best performance gains: Long strings (110x), large objects (68x), very large arrays (20x)
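Each value in the Speedup column is the json_repair time divided by the fast_json_repair time for the same row. The short sketch below recomputes two rows from the table; the function name speedup is hypothetical. Rows with very small timings will not always reproduce exactly from the rounded milliseconds shown, presumably because the published ratios were taken from unrounded measurements.

```python
def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """Ratio of baseline time to candidate time, rounded to one decimal."""
    return round(baseline_ms / candidate_ms, 1)


# Arguments: (json_repair ms, fast_json_repair ms), copied from the table.
print(speedup(27.339, 0.404))    # Large object 500 (ascii=T): 67.7
print(speedup(580.959, 29.531))  # Very large 5000 (ascii=T): 19.7
```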
Performance Advantages
- Large JSON documents: 10-70x faster for documents with many keys/values
- Long strings: Up to 110x faster for documents with large string values
- Very large arrays: 20x faster for arrays with thousands of elements
- Deeply nested structures: 7-10x faster with consistent performance
- Memory efficiency: Lower memory footprint due to Rust's zero-cost abstractions and optimized allocations
Run python benchmark.py to test performance on your system. See PERFORMANCE.md for detailed analysis.
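For a quick check on your own payloads without the full suite, a small timing helper built on the standard-library timeit module is often enough. This is a sketch under stated assumptions and not the project's benchmark.py: best_ms is a hypothetical name, and json.loads is timed here only so the snippet runs without any third-party package.

```python
import json
import timeit
from typing import Callable


def best_ms(func: Callable[[str], object], payload: str,
            repeat: int = 5, number: int = 100) -> float:
    """Best-of-`repeat` average time per call, in milliseconds."""
    timer = timeit.Timer(lambda: func(payload))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1000


payload = json.dumps({"items": list(range(1000))})
print(f"json.loads: {best_ms(json.loads, payload):.3f} ms per call")
```

To compare the two libraries, pass each library's repair_json as func with the same payload and divide the two results.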
AWS Deployment
Works on AWS with pre-built Linux wheels for both instance architectures:
- x86_64 - Standard EC2 instances (t2, t3, m5, c5, etc.)
- ARM64/aarch64 - Graviton instances (t4g, m6g, c6g, etc.)
# Install on any AWS instance - pip auto-selects the correct wheel
pip install fast-json-repair
For Lambda layers and cross-compilation, see DEPLOYMENT.md.
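To see ahead of time which published wheel a machine is likely to receive, the platform can be compared against the wheel filenames listed for release 0.2.0. The sketch below is a simplification: expected_wheel_tag and WHEEL_TAGS are hypothetical names, only the first platform tag of each filename is used, and pip's real selection also considers the Python version and, on Linux, the glibc version.

```python
import platform
from typing import Optional

# First platform tag of each wheel filename published for 0.2.0.
WHEEL_TAGS = {
    ("Linux", "x86_64"): "manylinux_2_17_x86_64",
    ("Linux", "aarch64"): "manylinux_2_17_aarch64",
    ("Darwin", "arm64"): "macosx_11_0_arm64",
    ("Darwin", "x86_64"): "macosx_10_12_x86_64",
    ("Windows", "AMD64"): "win_amd64",
}


def expected_wheel_tag(system: Optional[str] = None,
                       machine: Optional[str] = None) -> Optional[str]:
    """Return the matching wheel tag, or None if no wheel is listed."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return WHEEL_TAGS.get((system, machine))


print(expected_wheel_tag("Linux", "aarch64"))  # manylinux_2_17_aarch64
```

A result of None means no listed wheel matches, in which case pip would have to build from the source distribution and needs the Rust toolchain.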
Development
Quick Reference
| Task | Command | VS Code Task |
|---|---|---|
| Setup | ./setup.sh | - |
| Build (debug) | maturin develop | Build: Development |
| Build (release) | maturin develop --release | Build: Development (Release) |
| Run tests | pytest tests/ -v | Test: Python (All) |
| Run benchmarks | python benchmark.py | Benchmark: Run Full Suite |
| Format code | cargo fmt && black . && isort . | Format: All (Rust + Python) |
| Lint Rust | cargo clippy | Rust: Clippy |
| Lint Python | ruff check . | Python: Lint (Ruff) |
| Full check | maturin develop && pytest && python benchmark.py | Full Check: Build + Test + Benchmark |
Quick Setup
# Automated setup (recommended)
./setup.sh
# Or manually with uv
uv venv && source .venv/bin/activate
uv sync
maturin develop
VS Code Integration
This project includes a complete VS Code workspace configuration:
Getting Started:
- Open the project folder in VS Code
- Install recommended extensions (you'll see a prompt)
- The Python interpreter will auto-detect .venv
- Press Cmd+Shift+P → "Tasks: Run Task" to see all available commands
Available Tasks:
- Build Tasks: Debug build, release build, wheels, cross-platform builds
- Test Tasks: Run all tests, quick tests, coverage reports
- Benchmark Tasks: Full benchmarks, quick benchmarks, save results
- Rust Tasks: Check, clippy, format, clean
- Python Tasks: Format (black), sort imports (isort), lint (ruff)
- Workflows: Full check (build+test+benchmark), release prep, quality checks
Debugging:
- Press F5 to debug Python tests
- Set breakpoints in Python code
- Use "Debug: Select and Start Debugging" for specific configs
Common Commands
See the Quick Reference table above for the most common tasks. Additional commands:
# Code quality
black . # Format Python code
isort . # Sort Python imports
ruff check . # Lint Python code
# Cross-platform builds (requires zig)
maturin build --release --target x86_64-unknown-linux-gnu --zig
maturin build --release --target aarch64-unknown-linux-gnu --zig
maturin build --release --target universal2-apple-darwin
Project Structure
fast_json_repair/
├── src/
│   └── lib.rs # Rust implementation (core repair logic)
├── python/
│   └── fast_json_repair/
│       └── __init__.py # Python API wrapper
├── tests/
│   └── test_all.py # Python test suite
├── benchmark.py # Performance benchmarks
├── pyproject.toml # Python package configuration
├── Cargo.toml # Rust package configuration
└── .vscode/ # VS Code workspace settings (local)
    ├── settings.json # Python/Rust interpreter & formatting
    ├── tasks.json # Build/test/benchmark tasks
    ├── launch.json # Debug configurations
    └── extensions.json # Recommended extensions
Typical Workflow
- Make Changes - Edit Rust (src/) or Python (python/) code
- Rebuild - maturin develop or VS Code task "Build: Development"
- Test - pytest tests/ -v or VS Code task "Test: Python (All)"
- Benchmark - python benchmark.py or VS Code task "Benchmark: Run Full Suite"
- Release - maturin build --release when ready to publish
License
MIT License (same as original json_repair)
Credits & Acknowledgments
Original Author
- Stefano Baccianella - Creator of the original json_repair library
- Original concept and algorithm design
- Python implementation that this library is based on
- Comprehensive test cases and edge case handling
This Rust Port
- Performance optimization through Rust implementation
- Maintains full API compatibility with the original
- Uses PyO3 for Python bindings
- Uses orjson for fast JSON parsing
Special Thanks
A huge thank you to Stefano Baccianella for creating json_repair and making it open source. This library wouldn't exist without the original brilliant implementation that has helped countless developers handle malformed JSON from LLMs.
If you appreciate this performance-focused port, please also show support for the original json_repair project that made it all possible.
File details
Details for the file fast_json_repair-0.2.0.tar.gz.
File metadata
- Download URL: fast_json_repair-0.2.0.tar.gz
- Upload date:
- Size: 67.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e8c188bd6534dd753b6499d7a24ef455c7aa26f5a558da097a11a16d5e291bb9 |
| MD5 | 0798d8194263271104d15514a7fc4c87 |
| BLAKE2b-256 | cea18a3804b79e283d7482b21d60338a9672c60410452f68521658a31eefd869 |
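A downloaded file can be checked against the SHA256 digest listed above with the standard-library hashlib module. This is a minimal sketch: sha256_of is a hypothetical helper name, and the path passed to it is whatever location the file was saved to.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest listed for fast_json_repair-0.2.0.tar.gz.
EXPECTED = "e8c188bd6534dd753b6499d7a24ef455c7aa26f5a558da097a11a16d5e291bb9"
```

After downloading the source distribution, sha256_of("fast_json_repair-0.2.0.tar.gz") == EXPECTED should evaluate to True for an intact file.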
File details
Details for the file fast_json_repair-0.2.0-cp311-abi3-win_amd64.whl.
File metadata
- Download URL: fast_json_repair-0.2.0-cp311-abi3-win_amd64.whl
- Upload date:
- Size: 148.2 kB
- Tags: CPython 3.11+, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4b0aef384ddb30c49c82ab8773caa21fb62dd6a3f3394bdaed21f611e9c8f13a |
| MD5 | 8c7e989a71b13ff6569da280e320163e |
| BLAKE2b-256 | 0afee6cdccaec098d937aca227c820648d46258e47affb4a683519ee81f38ae0 |
File details
Details for the file fast_json_repair-0.2.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: fast_json_repair-0.2.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 249.6 kB
- Tags: CPython 3.11+, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 112cc0e60b57dbc2e461baac69c3a3b7ab58f84528bdaad502af1786699b8f32 |
| MD5 | a86e3397f99e0083eff68fd58d0a8932 |
| BLAKE2b-256 | fe8c1a345a69f79acafb83d8025eaabd9dcbdfca0330795945569d0458c3937e |
File details
Details for the file fast_json_repair-0.2.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.
File metadata
- Download URL: fast_json_repair-0.2.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
- Upload date:
- Size: 237.1 kB
- Tags: CPython 3.11+, manylinux: glibc 2.17+ ARM64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ebb6799c580d5222835e280ce79effb97045f4d9db87c836db380a4a99d5f59b |
| MD5 | 6a2fad2debe8874bba5b5debdae6158a |
| BLAKE2b-256 | 0a7b38d0597015cca95c63dafab8fcb929517ad1dd21aabeedc86f06bcd7c3aa |
File details
Details for the file fast_json_repair-0.2.0-cp311-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: fast_json_repair-0.2.0-cp311-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 217.1 kB
- Tags: CPython 3.11+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2481db7ff64a52fc32cc5bcd9b98cddb0d598100350e00b1d9646e006dfec4ca |
| MD5 | c849908c8eeb12a8e924fa1c91d394cc |
| BLAKE2b-256 | 80ab4eea9ea8cbbfe5b158496d63d85bd6d97bff3dfa7e843401b4971a3b2e9e |
File details
Details for the file fast_json_repair-0.2.0-cp311-abi3-macosx_10_12_x86_64.whl.
File metadata
- Download URL: fast_json_repair-0.2.0-cp311-abi3-macosx_10_12_x86_64.whl
- Upload date:
- Size: 230.9 kB
- Tags: CPython 3.11+, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 77b948676a302d8e9a712115a89a80b59bd1bdf4a5e120e303ae172e47b14926 |
| MD5 | 03064532d19a9f85a322bcf293901dbe |
| BLAKE2b-256 | 3f7cec7a9b21c417e1365372899fb57d1c537d6c1f7592d11b5a15e73d71275b |