A package to repair broken json strings
Project description
This simple package can be used to repair a broken JSON file. It is particularly useful if you are using LLMs, because they frequently produce malformed JSON output.
Inspired by https://github.com/josdejong/jsonrepair with contributions by GPT-4
How to use
from json_repair import repair_json
good_json_string = repair_json(bad_json_string)
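A common way to use a repair step is to try strict parsing first and only fall back to repairing when that fails. The sketch below illustrates that pattern. The helper name `loads_with_repair` and its `repair` parameter are hypothetical and not part of this package; the repair function is passed in so the example runs on the standard library alone.

```python
import json
from typing import Any, Callable


def loads_with_repair(text: str, repair: Callable[[str], str]) -> Any:
    """Parse text as JSON, calling repair only if strict parsing fails.

    `repair` is any function that maps a broken JSON string to a fixed one
    (hypothetical parameter, introduced for this illustration).
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Strict parsing failed: repair once, then parse the repaired string.
        # If the repaired string is still invalid, the error propagates.
        return json.loads(repair(text))
```

With this package installed, the call would presumably look like `loads_with_repair(bad_json_string, repair_json)`, since `repair_json` takes a string and returns a string as shown above.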
How it works
This module will parse the JSON file following the BNF definition:
<json> ::= <primitive> | <container>
<primitive> ::= <number> | <string> | <boolean>
; Where:
; <number> is a valid real number expressed in one of a number of given formats
; <string> is a string of valid characters enclosed in quotes
; <boolean> is one of the literal strings 'true', 'false', or 'null' (unquoted)
<container> ::= <object> | <array>
<array> ::= '[' [ <json> *(', ' <json>) ] ']' ; A sequence of JSON values separated by commas
<object> ::= '{' [ <member> *(', ' <member>) ] '}' ; A sequence of 'members'
<member> ::= <string> ': ' <json> ; A pair consisting of a name, and a JSON value
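To make the grammar concrete, here is a minimal sketch of a strict recursive-descent parser with one function per rule. It is not this package's implementation: the function names are made up for illustration, strings are assumed to contain no escape sequences, and any input that deviates from the grammar raises `ValueError` instead of being repaired.

```python
import re

_NUMBER = re.compile(r"-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?")
_LITERALS = {"true": True, "false": False, "null": None}


def _skip_ws(text, pos):
    """Advance past whitespace and return the new position."""
    while pos < len(text) and text[pos] in " \t\r\n":
        pos += 1
    return pos


def _parse_string(text, pos):
    """<string>: characters enclosed in double quotes (no escape handling)."""
    end = text.find('"', pos + 1)
    if end == -1:
        raise ValueError(f"unterminated string starting at index {pos}")
    return text[pos + 1:end], end + 1


def _parse_array(text, pos):
    """<array> ::= '[' [ <json> *(',' <json>) ] ']'"""
    items = []
    pos = _skip_ws(text, pos + 1)
    if pos < len(text) and text[pos] == "]":
        return items, pos + 1
    while True:
        value, pos = _parse_value(text, pos)
        items.append(value)
        pos = _skip_ws(text, pos)
        if pos < len(text) and text[pos] == ",":
            pos += 1
        elif pos < len(text) and text[pos] == "]":
            return items, pos + 1
        else:
            raise ValueError(f"expected ',' or ']' at index {pos}")


def _parse_object(text, pos):
    """<object> ::= '{' [ <member> *(',' <member>) ] '}'"""
    members = {}
    pos = _skip_ws(text, pos + 1)
    if pos < len(text) and text[pos] == "}":
        return members, pos + 1
    while True:
        # <member> ::= <string> ':' <json>
        pos = _skip_ws(text, pos)
        if pos >= len(text) or text[pos] != '"':
            raise ValueError(f"expected a quoted member name at index {pos}")
        key, pos = _parse_string(text, pos)
        pos = _skip_ws(text, pos)
        if pos >= len(text) or text[pos] != ":":
            raise ValueError(f"expected ':' at index {pos}")
        members[key], pos = _parse_value(text, pos + 1)
        pos = _skip_ws(text, pos)
        if pos < len(text) and text[pos] == ",":
            pos += 1
        elif pos < len(text) and text[pos] == "}":
            return members, pos + 1
        else:
            raise ValueError(f"expected ',' or '}}' at index {pos}")


def _parse_value(text, pos):
    """<json> ::= <primitive> | <container>; returns (value, next_pos)."""
    pos = _skip_ws(text, pos)
    if pos >= len(text):
        raise ValueError("unexpected end of input")
    ch = text[pos]
    if ch == "{":
        return _parse_object(text, pos)
    if ch == "[":
        return _parse_array(text, pos)
    if ch == '"':
        return _parse_string(text, pos)
    for word, value in _LITERALS.items():
        if text.startswith(word, pos):
            return value, pos + len(word)
    match = _NUMBER.match(text, pos)
    if match:
        token = match.group()
        number = float(token) if any(c in token for c in ".eE") else int(token)
        return number, match.end()
    raise ValueError(f"unexpected character {ch!r} at index {pos}")


def parse_json(text):
    """Parse a complete JSON document, rejecting trailing data."""
    value, pos = _parse_value(text, 0)
    pos = _skip_ws(text, pos)
    if pos != len(text):
        raise ValueError(f"unexpected trailing data at index {pos}")
    return value
```

Each place where this sketch raises `ValueError` is a point where a repairing parser could apply a heuristic instead of giving up.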
If something is wrong (a missing parenthesis or quote, for example), it will use a few simple heuristics to fix the JSON string:
- Add the missing parentheses if the parser believes that the array or object should be closed
- Quote strings or add missing single quotes
- Adjust whitespace and remove line breaks
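As an illustration of the first heuristic, the sketch below closes arrays and objects that were left open, and also terminates a string that was cut off. It is a simplified stand-in, not the code this package actually runs: the function name `close_open_containers` is hypothetical, and it only handles truncated input, not misplaced or mismatched closers.

```python
def close_open_containers(text: str) -> str:
    """Append the closing quote, brackets, and braces that text is missing."""
    closers = {"{": "}", "[": "]"}
    stack = []         # closers still expected, innermost last
    in_string = False  # True while scanning inside a double-quoted string
    escaped = False    # True right after a backslash inside a string
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in closers:
            stack.append(closers[ch])
        elif stack and ch == stack[-1]:
            stack.pop()
    repaired = text
    if in_string:
        # The input ended in the middle of a string: close it first.
        repaired += '"'
    # Close the innermost container first.
    return repaired + "".join(reversed(stack))
```

For example, `close_open_containers('{"a": [1, 2')` returns `'{"a": [1, 2]}'`. Brackets that appear inside string values are ignored, so already valid input is returned unchanged.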
I am sure some corner cases are missing. If you have examples, please open an issue or, even better, push a PR.
How to develop
Just create a virtual environment with requirements.txt. The setup uses pre-commit to make sure all tests are run.
Hashes for json_repair-0.1.6-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | d8d9fa082fbbf5ba48555f2d6d3f7cfe4d8d4644d4f980c71e72cf565e5b037b
MD5 | d8191b36356b06c5ad6adee92727d7d9
BLAKE2b-256 | a6035779f7cb12f26d14317b314c94c27688563581616c25fb812a9aeac0c006