Introduction
gjson-py is a Python package that provides a simple way to filter and extract data from JSON-like objects or JSON files, using the GJSON syntax.
Within the limits imposed by the differences between the two languages, and with some limitations, it is the Python equivalent of the Go GJSON package. The main difference from GJSON is that gjson-py does not work directly on JSON strings but on JSON-like Python objects: either the object returned by json.load() or json.loads(), or any Python object that is JSON-serializable.
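As a minimal illustration of what "JSON-like Python object" means here, the sketch below uses only the standard json module. It shows that a decoded JSON string and a hand-built, JSON-serializable dictionary are the same kind of input; the variable names are illustrative only.

```python
import json

# A JSON document as text, e.g. read from a file or an HTTP response.
text = '{"name": {"first": "Tom", "last": "Anderson"}, "age": 37}'

# Decoding yields plain Python objects (dict, list, str, int, ...).
decoded = json.loads(text)

# A hand-built, JSON-serializable structure is an equivalent input.
built = {'name': {'first': 'Tom', 'last': 'Anderson'}, 'age': 37}

print(decoded == built)
```

Either object, rather than the raw string, is what gets passed to the library.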
A detailed list of the GJSON features supported by gjson-py is provided below.
See also the full gjson-py documentation.
Installation
gjson-py is available on the Python Package Index (PyPI) and can be easily installed with:
pip install gjson
How to use the library
gjson-py provides different ways to perform queries on JSON-like objects.
gjson.get()
A quick accessor to the GJSON functionality, exposed for simplicity of use. It is particularly useful for performing a single query on a given object:
>>> import gjson
>>> data = {'name': {'first': 'Tom', 'last': 'Anderson'}, 'age': 37}
>>> gjson.get(data, 'name.first')
'Tom'
It can also return a JSON-encoded string and, on failure, either raise an exception or return None, depending on how it is called. See the full API documentation for more details.
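The two behaviours just described (JSON-encoded output, and raising versus returning None on failure) can be sketched in plain Python. This is not gjson-py's implementation: lookup() is a hypothetical helper that only handles dot-separated object keys, and its as_str and quiet parameter names are assumptions made for this sketch. Check the API documentation for the real signature.

```python
import json


def lookup(obj, path, *, as_str=False, quiet=False):
    """Hypothetical helper: follow a dot-separated path of object keys."""
    current = obj
    try:
        for key in path.split('.'):
            if not isinstance(current, dict):
                raise KeyError(key)
            current = current[key]
    except KeyError:
        if quiet:
            return None  # Failure is swallowed when quiet is set.
        raise  # Otherwise the failure is propagated to the caller.
    # Optionally return the JSON-encoded form instead of the Python value.
    return json.dumps(current) if as_str else current


data = {'name': {'first': 'Tom', 'last': 'Anderson'}, 'age': 37}
print(lookup(data, 'name.first'))
print(lookup(data, 'name.first', as_str=True))
print(lookup(data, 'name.middle', quiet=True))
```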
GJSON class
The GJSON class provides full access to the gjson-py API and allows performing multiple queries on the same object:
>>> import gjson
>>> data = {'name': {'first': 'Tom', 'last': 'Anderson'}, 'age': 37}
>>> source = gjson.GJSON(data)
>>> source.get('name.first')
'Tom'
>>> str(source)
'{"name": {"first": "Tom", "last": "Anderson"}, "age": 37}'
>>> source.getj('name.first')
'"Tom"'
>>> name = source.get_gjson('name')
>>> name.get('first')
'Tom'
>>> name
<gjson.GJSON object at 0x102735b20>
See the full API documentation for more details.
How to use the CLI
gjson-py also provides a command line interface (CLI) for ease of use:
$ echo '{"name": {"first": "Tom", "last": "Anderson"}, "age": 37}' > test.json
$ cat test.json | gjson 'name.first' # Read from stdin
"Tom"
$ gjson test.json 'age' # Read from a file
37
$ cat test.json | gjson - 'name.first' # Explicitly read from stdin
"Tom"
JSON Lines
JSON Lines support in the CLI allows for different use cases. All the examples in this section operate on a test.json file generated with:
$ echo -e '{"name": "Gilbert", "age": 61}\n{"name": "Alexa", "age": 34}\n{"name": "May", "age": 57}' > test.json
Apply the same query to each line
With the -l/--lines CLI argument, gjson-py applies the query to each input line and filters the data accordingly. Lines are read one at a time, so memory usage does not grow with the size of the input. This is useful, for example, when tailing log files in JSON format.
$ gjson --lines test.json 'age'
61
34
57
$ tail -f log.json | gjson --lines 'bytes_sent' # Dummy example
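The line-by-line behaviour can be pictured with a small generator. This is only a sketch of the idea, using the standard library and a hypothetical query_lines() helper that supports a single top-level key. It is not how gjson-py itself is implemented.

```python
import io
import json


def query_lines(stream, key):
    """Yield the value of `key` for each JSON line, one line at a time."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # Skip blank lines.
        # Only the current line is decoded and held in memory.
        yield json.loads(line)[key]


# In-memory stand-in for the test.json file used in this section.
sample = io.StringIO(
    '{"name": "Gilbert", "age": 61}\n'
    '{"name": "Alexa", "age": 34}\n'
    '{"name": "May", "age": 57}\n'
)
ages = list(query_lines(sample, 'age'))
print(ages)
```

Because the generator consumes the stream lazily, the same approach works on an unbounded input such as a tailed log file.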
Encapsulate all lines in an array, then apply the query
Using the special query prefix syntax .., as described in GJSON’s documentation for JSON Lines, gjson-py will read all lines from the input and encapsulate them into an array. This approach has of course the memory overhead of loading the whole input to perform the query.
$ gjson test.json '..#.name'
["Gilbert", "Alexa", "May"]
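In plain Python terms, the .. prefix followed by #.name behaves roughly like the sketch below: every line is decoded first, the results are collected in one list, and only then is the query applied. The code uses only the standard library and is an illustration, not gjson-py's implementation.

```python
import json

lines = [
    '{"name": "Gilbert", "age": 61}',
    '{"name": "Alexa", "age": 34}',
    '{"name": "May", "age": 57}',
]

# Read every line and wrap the decoded objects in a single array.
# This is where the whole input ends up in memory.
array = [json.loads(line) for line in lines]

# '#.name' collects the 'name' key of every element of the array.
names = [item['name'] for item in array]
print(json.dumps(names))
```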
Filter lines based on their values
Combining the -l/--lines CLI argument with the special query prefix .. described above makes it possible to filter input lines based on their values. In this case gjson-py encapsulates each line in an array, so that the GJSON Queries syntax can be used to filter them. As the encapsulation is performed on each line separately, there is no memory overhead. Technically, a line is filtered out because the query found no match on that line, so if any line is filtered out the final exit code will be 1.
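The per-line filtering and the resulting exit code can be sketched as follows. The filter_lines() helper is hypothetical and hard-codes an age threshold. It only mirrors the behaviour described in the paragraph above and is not taken from gjson-py.

```python
import json


def filter_lines(lines, min_age):
    """Return the matching names and an exit code (1 if any line was dropped)."""
    outputs = []
    exit_code = 0
    for line in lines:
        # Each line is wrapped in its own single-element array.
        wrapped = [json.loads(line)]
        # Rough equivalent of the '#(age>N)' query: keep matching elements.
        matches = [item for item in wrapped if item['age'] > min_age]
        if matches:
            outputs.append(matches[0]['name'])
        else:
            # No match on this line: it is filtered out and flagged.
            exit_code = 1
    return outputs, exit_code


lines = [
    '{"name": "Gilbert", "age": 61}',
    '{"name": "Alexa", "age": 34}',
    '{"name": "May", "age": 57}',
]
result = filter_lines(lines, 40)
print(result)
```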
$ gjson --lines test.json '..#(age>40).name'
"Gilbert"
"May"
Filter lines and apply query to the result
The methods above can be combined, for example to filter or extract data from the lines first and then apply a query to the aggregated result. The memory overhead in this case depends on the amount of data produced by the first filtering/extraction step.
$ gjson --lines test.json 'age' | gjson '..@sort'
[34, 57, 61]
$ gjson --lines test.json '..#(age>40).age' | gjson '..@sort'
[57, 61]
Query syntax
For the generic query syntax refer to the original GJSON Path Syntax documentation.
Supported GJSON features
This is the list of GJSON features and how they are supported by gjson-py:
GJSON feature | Supported by gjson-py | Notes
---|---|---
Path Structure | YES |
Basic | YES |
Wildcards | YES |
Escape Character | YES |
Arrays | YES |
Queries | PARTIALLY | Subqueries are not supported [1]
Dot vs Pipe | YES |
Modifiers | PARTIALLY | See the table below
Modifier arguments | YES | Only a JSON object is accepted as argument
Custom modifiers | YES | Only a JSON object is accepted as argument
Multipaths | NO |
Literals | NO |
JSON Lines | YES |
This is the list of modifiers and how they are supported by gjson-py:
GJSON Modifier | Supported by gjson-py | Notes
---|---|---
@reverse | YES |
@ugly | YES |
@pretty | PARTIALLY | The width argument is not supported
@this | YES |
@valid | YES |
@flatten | YES |
@join | NO |
@keys | YES |
@values | YES |
@tostr | NO |
@fromstr | NO |
@group | NO |
@sort | YES | Not present in GJSON
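For readers unfamiliar with the modifiers, the following sketch gives rough plain-Python analogues for a few of them on simple inputs. These are approximations meant only to convey the idea. They are not how gjson-py implements the modifiers, and they ignore modifier arguments entirely.

```python
data = {'b': 2, 'a': 1}
items = [3, 1, 2]

# Rough analogues on simple inputs; the real modifiers handle more cases.
analogues = {
    '@reverse': list(reversed(items)),  # Reverse the order of an array.
    '@keys': list(data.keys()),         # Keys of an object, as an array.
    '@values': list(data.values()),     # Values of an object, as an array.
    '@sort': sorted(items),             # gjson-py addition: sort an array.
}

for modifier, value in analogues.items():
    print(modifier, value)
```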