Introduction

gjson-py is a Python package that provides a simple way to filter and extract data from JSON-like objects or JSON files, using the GJSON syntax.

Allowing for the language differences and with some limitations, it is the Python equivalent of the Go GJSON package. The main difference from GJSON is that gjson-py doesn’t work directly with JSON strings but with JSON-like Python objects: either the object returned by json.load() or json.loads(), or any Python object that is JSON-serializable.

A detailed list of the GJSON features supported by gjson-py is provided below.

See also the full gjson-py documentation.

Installation

gjson-py is available on the Python Package Index (PyPI) and can be easily installed with:

pip install gjson

It’s also available as a Debian package (python3-gjson) on Debian systems starting from Debian 12 (bookworm) and can be installed with:

apt-get install python3-gjson

A .deb package for the current stable and unstable Debian versions is also available for download on the releases page on GitHub.

How to use the library

gjson-py provides different ways to perform queries on JSON-like objects.

gjson.get()

A quick accessor to the GJSON functionality, exposed for convenience. It is particularly useful for performing a single query on a given object:

>>> import gjson
>>> data = {'name': {'first': 'Tom', 'last': 'Anderson'}, 'age': 37}
>>> gjson.get(data, 'name.first')
'Tom'

It can also return a JSON-encoded string, and on failure it can either raise an exception or return None, depending on how it is called. See the full API documentation for more details.
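
The two options mentioned here can be pictured with a plain-Python sketch. The helper get_path and its as_str and quiet parameters are hypothetical names chosen for this illustration, and the helper only understands dotted keys. It is not gjson-py's implementation, so check the API documentation for the real signature.

```python
import json
from typing import Any


def get_path(obj: Any, path: str, *, as_str: bool = False, quiet: bool = False) -> Any:
    """Follow a dotted path such as 'name.first' through nested dicts (illustration only)."""
    current = obj
    try:
        for key in path.split('.'):
            current = current[key]
    except (KeyError, TypeError):
        if quiet:
            return None  # swallow the failure instead of raising
        raise LookupError(f'no match for {path!r}') from None
    # Optionally return the JSON-encoded form instead of the Python object.
    return json.dumps(current) if as_str else current


data = {'name': {'first': 'Tom', 'last': 'Anderson'}, 'age': 37}
print(get_path(data, 'name.first'))               # Tom
print(get_path(data, 'name.first', as_str=True))  # "Tom"
print(get_path(data, 'name.middle', quiet=True))  # None
```

Without quiet=True the last call would raise LookupError, which mirrors the choice between raising and returning None.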

GJSON class

The GJSON class provides full access to the gjson-py API and allows multiple queries to be performed on the same object:

>>> import gjson
>>> data = {'name': {'first': 'Tom', 'last': 'Anderson'}, 'age': 37}
>>> source = gjson.GJSON(data)
>>> source.get('name.first')
'Tom'
>>> str(source)
'{"name": {"first": "Tom", "last": "Anderson"}, "age": 37}'
>>> source.getj('name.first')
'"Tom"'
>>> name = source.get_gjson('name')
>>> name.get('first')
'Tom'
>>> name
<gjson.GJSON object at 0x102735b20>

See the full API documentation for more details.

How to use the CLI

gjson-py also provides a command line interface (CLI) for ease of use:

$ echo '{"name": {"first": "Tom", "last": "Anderson"}, "age": 37}' > test.json
$ cat test.json | gjson 'name.first'  # Read from stdin
"Tom"
$ gjson test.json 'age'  # Read from a file
37
$ cat test.json | gjson - 'name.first'  # Explicitly read from stdin
"Tom"

JSON Lines

JSON Lines support in the CLI allows for different use cases. All the examples in this section operate on a test.json file generated with:

$ echo -e '{"name": "Gilbert", "age": 61}\n{"name": "Alexa", "age": 34}\n{"name": "May", "age": 57}' > test.json

Apply the same query to each line

Using the -l/--lines CLI argument, for each input line gjson-py applies the query and filters the data according to it. Lines are read one by one so there is no memory overhead for the processing. It can be used while tailing log files in JSON format for example.

$ gjson --lines test.json 'age'
61
34
57
$ tail -f log.json | gjson --lines 'bytes_sent'  # Dummy example
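
The line-by-line behaviour can be sketched in plain Python. This is an illustration of the idea only, not gjson-py's code: query_lines is a hypothetical helper that handles a single top-level key, and the io.StringIO object stands in for the test.json file.

```python
import io
import json
from typing import Any, Iterator, TextIO


def query_lines(stream: TextIO, key: str) -> Iterator[Any]:
    """Yield the value of `key` from each JSON line, one line at a time."""
    for line in stream:  # text streams are iterated lazily, line by line
        line = line.strip()
        if not line:
            continue  # skip empty lines
        yield json.loads(line)[key]


# Stand-in for the test.json file used in this section.
sample = io.StringIO(
    '{"name": "Gilbert", "age": 61}\n'
    '{"name": "Alexa", "age": 34}\n'
    '{"name": "May", "age": 57}\n'
)
for value in query_lines(sample, 'age'):
    print(json.dumps(value))  # prints 61, 34 and 57 on separate lines
```

Because the generator holds only one parsed line at a time, the same approach works on an unbounded stream such as the output of tail -f.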

Encapsulate all lines in an array, then apply the query

Using the special query prefix syntax .., as described in GJSON’s documentation for JSON Lines, gjson-py will read all lines from the input and encapsulate them into an array. This approach has of course the memory overhead of loading the whole input to perform the query.

$ gjson test.json '..#.name'
["Gilbert", "Alexa", "May"]
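
As a rough plain-Python picture of this mode (not gjson-py's implementation), every line is parsed up front into one list before anything is selected. The names JSON_LINES and load_all are hypothetical and exist only for this sketch.

```python
import json
from typing import Any, List

# Same content as the test.json file used in this section.
JSON_LINES = (
    '{"name": "Gilbert", "age": 61}\n'
    '{"name": "Alexa", "age": 34}\n'
    '{"name": "May", "age": 57}\n'
)


def load_all(text: str) -> List[Any]:
    """Parse every non-empty line and collect the results in a single list."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


rows = load_all(JSON_LINES)             # the whole input is now in memory
names = [row['name'] for row in rows]   # roughly what '#.name' selects
print(json.dumps(names))                # ["Gilbert", "Alexa", "May"]
```

The list built by load_all is where the memory overhead comes from: its size grows with the input.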

Filter lines based on their values

Combining the -l/--lines CLI argument with the special query prefix .. described above, it’s possible to filter input lines based on their values. In this case gjson-py encapsulates each line in an array so that it is possible to use the Queries GJSON syntax to filter them. As the encapsulation is performed on each line, there is no memory overhead. A line is filtered out when the query finds no match on it, which technically counts as a failed query, so the final exit code will be 1 if any line is filtered.

$ gjson --lines test.json '..#(age>40).name'
"Gilbert"
"May"
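
The per-line wrapping and the resulting exit code can be sketched as follows. This is a simplified plain-Python model under the assumptions stated in the comments, not gjson-py's code; filter_lines is a hypothetical helper hard-wired to the age/name fields of the example.

```python
import json
from typing import Any, Iterable, List, Tuple


def filter_lines(lines: Iterable[str], min_age: int) -> Tuple[List[Any], int]:
    """Return the names of the matching lines and a CLI-style exit code."""
    matches: List[Any] = []
    exit_code = 0
    for line in lines:
        wrapped = [json.loads(line)]  # each line becomes a one-item array
        hits = [item['name'] for item in wrapped if item['age'] > min_age]
        if hits:
            matches.append(hits[0])
        else:
            exit_code = 1  # a line without a match counts as a failed query
    return matches, exit_code


LINES = [
    '{"name": "Gilbert", "age": 61}',
    '{"name": "Alexa", "age": 34}',
    '{"name": "May", "age": 57}',
]
print(filter_lines(LINES, 40))  # (['Gilbert', 'May'], 1)
```

Only one wrapped line exists at any moment, which is why this mode keeps the memory footprint flat.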

Filter lines and apply query to the result

By combining the methods above it is possible, for example, to filter or extract data from the lines first and then apply a query to the aggregated result. The memory overhead in this case depends on the amount of data resulting from the first filtering/extraction.

$ gjson --lines test.json 'age' | gjson '..@sort'
[34, 57, 61]
$ gjson --lines test.json '..#(age>40).age' | gjson '..@sort'
[57, 61]
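
The two-stage pipeline can be modelled in plain Python as a lazy first stage feeding an aggregating second stage. This is only a sketch of the data flow, assuming the age field of the example; sorted_ages is a hypothetical helper, not part of gjson-py.

```python
import json
from typing import Iterable, List, Optional


def sorted_ages(lines: Iterable[str], min_age: Optional[int] = None) -> List[int]:
    """Extract 'age' from each line, optionally filter, then sort the result."""
    # First stage: lazy, per-line extraction (like the first gjson --lines call).
    ages = (json.loads(line)['age'] for line in lines)
    if min_age is not None:
        ages = (age for age in ages if age > min_age)
    # Second stage: only the extracted values are collected and sorted.
    return sorted(ages)


LINES = [
    '{"name": "Gilbert", "age": 61}',
    '{"name": "Alexa", "age": 34}',
    '{"name": "May", "age": 57}',
]
print(json.dumps(sorted_ages(LINES)))              # [34, 57, 61]
print(json.dumps(sorted_ages(LINES, min_age=40)))  # [57, 61]
```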

Query syntax

For the generic query syntax refer to the original GJSON Path Syntax documentation.

Supported GJSON features

This is the list of GJSON features and how they are supported by gjson-py:

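==================  =====================  ===================================================
GJSON feature       Supported by gjson-py  Notes
==================  =====================  ===================================================
Path Structure      YES
Basic               YES
Wildcards           YES
Escape Character    YES
Arrays              YES
Queries             YES                    Using Python’s operators [1] [2]
Dot vs Pipe         YES
Modifiers           YES                    See the table below for all the details
Modifier arguments  YES                    Only a JSON object is accepted as argument
Custom modifiers    YES                    Only a JSON object is accepted as argument [3]
Multipaths          YES                    Object keys, if specified, must be JSON strings [4]
Literals            YES                    Including infinite and NaN values [5]
JSON Lines          YES                    CLI support [6] [7]
==================  =====================  ===================================================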

This is the list of modifiers present in GJSON and how they are supported by gjson-py:

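==============  =====================  =======================================
GJSON Modifier  Supported by gjson-py  Notes
==============  =====================  =======================================
@reverse        YES
@ugly           YES
@pretty         PARTIALLY              The width argument is not supported
@this           YES
@valid          YES
@flatten        YES
@join           PARTIALLY              Preserving duplicate keys not supported
@keys           YES                    Valid only on JSON objects (mappings)
@values         YES                    Valid only on JSON objects (mappings)
@tostr          YES
@fromstr        YES
@group          YES
==============  =====================  =======================================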

Additional features

Additional modifiers

This is the list of additional modifiers specific to gjson-py not present in GJSON:

  • @ascii: escapes all non-ASCII characters when printing/returning the string representation of the object, ensuring that the output is made only of ASCII characters. It’s implemented using the ensure_ascii argument of Python’s json module. This modifier doesn’t accept any arguments.

  • @sort: sorts a mapping object by its keys or a sequence object by its values. This modifier doesn’t accept any arguments.

  • @top_n: given a sequence object, groups the items in the sequence, counting how many occurrences of each value are present. It returns a mapping object where the keys are the distinct values of the list and the values are the number of times the key was present in the list, ordered from the most common to the least common item. The items in the original sequence object must be Python hashable. This modifier accepts an optional argument n to return just the N items with the highest counts. When the n argument is not provided all items are returned. Example usage:

    $ echo '["a", "b", "c", "b", "c", "c"]' | gjson '@top_n'
    {"c": 3, "b": 2, "a": 1}
    $ echo '["a", "b", "c", "b", "c", "c"]' | gjson '@top_n:{"n":2}'
    {"c": 3, "b": 2}

  • @sum_n: given a sequence of objects, groups the items in the sequence using a grouping key and sums the values of the provided sum key. It returns a mapping object where the keys are the distinct values of the grouping key and the values are the sums of all the values of the sum key for each distinct grouped key, ordered from the highest sum to the lowest. The values of the grouping key must be Python hashable. The values of the sum key must be integers or floats. This modifier requires two mandatory arguments, group and sum, whose values are the respective keys in the objects of the sequence. An optional n argument is also accepted to return just the top N items with the highest sum. Example usage:

    $ echo '[{"key": "a", "time": 1}, {"key": "b", "time": 2}, {"key": "c", "time": 3}, {"key": "a", "time": 4}]' > test.json
    $ gjson test.json '@sum_n:{"group": "key", "sum": "time"}'
    {"a": 5, "c": 3, "b": 2}
    $ gjson test.json '@sum_n:{"group": "key", "sum": "time", "n": 2}'
    {"a": 5, "c": 3}
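
The semantics of @top_n and @sum_n described above can be approximated in plain Python with the standard library. This is a sketch of the described behaviour, not gjson-py's implementation: the function names and the sum_key parameter are hypothetical, and the ordering of items with equal counts or sums is an assumption of the sketch.

```python
from collections import Counter, defaultdict
from typing import Any, Dict, Hashable, Iterable, Mapping, Optional


def top_n(items: Iterable[Hashable], n: Optional[int] = None) -> Dict[Hashable, int]:
    """Count occurrences, most common first; keep only the top n if given."""
    return dict(Counter(items).most_common(n))


def sum_n(items: Iterable[Mapping[str, Any]], group: str, sum_key: str,
          n: Optional[int] = None) -> Dict[Hashable, float]:
    """Sum `sum_key` per distinct `group` value, highest sum first."""
    totals: Dict[Hashable, float] = defaultdict(int)
    for item in items:
        totals[item[group]] += item[sum_key]
    ranked = sorted(totals.items(), key=lambda pair: pair[1], reverse=True)
    return dict(ranked[:n])  # a slice up to None keeps every item


print(top_n(['a', 'b', 'c', 'b', 'c', 'c']))       # {'c': 3, 'b': 2, 'a': 1}
print(top_n(['a', 'b', 'c', 'b', 'c', 'c'], n=2))  # {'c': 3, 'b': 2}

rows = [{'key': 'a', 'time': 1}, {'key': 'b', 'time': 2},
        {'key': 'c', 'time': 3}, {'key': 'a', 'time': 4}]
print(sum_n(rows, 'key', 'time'))       # {'a': 5, 'c': 3, 'b': 2}
print(sum_n(rows, 'key', 'time', n=2))  # {'a': 5, 'c': 3}
```

Both helpers rely on dictionaries preserving insertion order, which is what keeps the ranking visible in the returned mapping.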
