More JSON Tools

This set of modules provides the following benefits:

Serialize more data structures into JSON
More flexibility in what's accepted as "JSON"
Iterate over massive JSON easily (mo_json.stream)
Provide a bijection between strictly typed JSON and dynamically typed JSON

Recent Changes

Version 6.x.x - The typed encoder no longer encodes to typed multivalues; instead, it encodes to an array of typed values. For example, instead of
```
{"a": {"~n~": [1, 2]}}
```
we get
```
{"a": {"~a~": [{"~n~": 1},{"~n~": 2}]}} 
```

Usage

Encode using `__json__`

Add a __json__ method to any class you wish to serialize to JSON. It is incumbent on you to ensure valid JSON is emitted:

class MyClass(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __json__(self):
        separator = "{"
        for k, v in self.__dict__.items():
            yield separator
            separator = ","
            yield value2json(k)+": "+value2json(v)
        yield "}"

With the __json__ method defined, you may use the value2json function:

from mo_json import value2json

result = value2json(MyClass(a="name", b=42))
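Because `__json__` emits raw text, it can be worth checking that the yielded pieces join into valid JSON. The following is a minimal sketch of such a check using only the standard library: `json.dumps` stands in for `value2json`, and `joined_json` is a hypothetical helper that is not part of mo_json.

```python
import json


class MyClass(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __json__(self):
        # Same shape as the example above, with json.dumps standing in
        # for value2json so the sketch has no third-party dependency.
        separator = "{"
        for k, v in self.__dict__.items():
            yield separator
            separator = ","
            yield json.dumps(k) + ": " + json.dumps(v)
        yield "}"


def joined_json(obj):
    """Hypothetical helper: join the pieces from __json__ and confirm they parse."""
    text = "".join(obj.__json__())
    json.loads(text)  # raises json.JSONDecodeError if the text is not valid JSON
    return text


text = joined_json(MyClass(a="name", b=42))
print(text)
```

Running a check like this in a unit test catches a misplaced separator or missing brace before the output reaches a consumer.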

Encode using `__data__`

Add a __data__ method that will convert your class into some JSON-serializable data structures. You may find this easier to implement than emitting pure JSON. If both __data__ and __json__ exist, then __json__ is used.

from mo_json import value2json

class MyClass(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __data__(self):
        return self.__dict__

result = value2json(MyClass(a="name", b=42))
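One way to picture the `__data__` hook is as a fallback that an encoder consults when it meets an object it cannot serialize directly. The sketch below imitates that idea with the standard library's `json.dumps(default=...)`; `to_json` and `Point` are hypothetical names, and this is not a description of how mo_json dispatches internally.

```python
import json


class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __data__(self):
        # Return plain, JSON-serializable structures.
        return {"x": self.x, "y": self.y}


def to_json(value):
    """Hypothetical encoder: fall back to __data__ for unknown objects."""
    def fallback(obj):
        if hasattr(obj, "__data__"):
            return obj.__data__()
        raise TypeError("cannot serialize " + type(obj).__name__)

    return json.dumps(value, default=fallback)


encoded = to_json({"origin": Point(0, 0), "corner": Point(3, 4)})
print(encoded)
```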

Decoding

The json2value function provides a couple of options:

flexible - will be very forgiving of JSON accepted (see hjson)
leaves - will interpret keys with dots (".") as dot-delimited paths

from mo_json import json2value

result = json2value(
    "http.headers.referer: http://example.com", 
    flexible=True, 
    leaves=True
)
assert result=={'http': {'headers': {'referer': 'http://example.com'}}}

Notice the lack of quotes in the JSON (hjson) and the deep structure created by the dot-delimited path name.
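The effect of `leaves` can be pictured as expanding each dot-delimited key into a chain of nested objects. Here is a minimal sketch of that expansion over an already-decoded dict; `expand_leaves` is a hypothetical helper, and mo_json's own handling (for example, of conflicting keys) may differ.

```python
def expand_leaves(flat):
    """Hypothetical helper: turn {"a.b.c": v} into {"a": {"b": {"c": v}}}."""
    result = {}
    for key, value in flat.items():
        *parents, last = key.split(".")
        node = result
        for name in parents:
            # Create intermediate objects on demand.
            # Conflicts such as "a" and "a.b" both being set are not handled.
            node = node.setdefault(name, {})
        node[last] = value
    return result


nested = expand_leaves({"http.headers.referer": "http://example.com"})
print(nested)
```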

Running tests

pip install -r tests/requirements.txt
set PYTHONPATH=.    
python.exe -m unittest discover tests

Module Details

Method `mo_json.scrub()`

Remove, or convert, objects in a structure that are not JSON-izable. It is faster to scrub and then use the default (C-based) Python encoder than it is to use a default serializer that forces the use of the interpreted Python encoder.
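To illustrate the scrub-then-encode idea, the sketch below walks a structure, converts a few common non-JSON types, drops callables, and then hands the result to the standard `json.dumps`. The conversion rules here (ISO strings for dates, floats for decimals) are assumptions chosen for the example; `scrub_sketch` is a hypothetical name and does not reproduce the rules used by `mo_json.scrub()`.

```python
import json
from datetime import date, datetime
from decimal import Decimal


def scrub_sketch(value):
    """Hypothetical scrubber: return a copy made only of JSON-izable values."""
    if value is None or isinstance(value, (bool, int, float, str)):
        return value
    if isinstance(value, Decimal):
        return float(value)  # assumption: precision loss is acceptable
    if isinstance(value, (datetime, date)):
        return value.isoformat()  # assumption: dates become ISO 8601 strings
    if isinstance(value, dict):
        return {str(k): scrub_sketch(v) for k, v in value.items() if not callable(v)}
    if isinstance(value, (list, tuple, set, frozenset)):
        return [scrub_sketch(v) for v in value if not callable(v)]
    return str(value)  # last resort for unknown objects


raw = {"when": date(2024, 4, 4), "price": Decimal("1.5"), "tags": ("a", "b"), "callback": len}
clean = scrub_sketch(raw)
encoded = json.dumps(clean)  # no default= hook needed after scrubbing
print(encoded)
```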

Module `mo_json.stream`

A module that supports queries over very large JSON strings. The overall objective is to make a large JSON document appear like a hierarchical database, where arrays of any depth can be queried like tables.

Limitations

This is not a generic streaming JSON parser. It is only intended to break down the top-level array, or object, to reduce memory usage.

Array values must be the last object property - If you query into a nested array, all sibling properties found after that array are ignored, so they must not be in expected_vars. The code will raise an exception if it cannot extract all expected variables.

Method `mo_json.stream.parse()`

Will return an iterator over all objects found in the JSON stream.

Parameters:

json - a parameter-less function that, when called, returns some number of bytes from the JSON stream. It can also be a string.
path - a dot-delimited string specifying the path to the nested JSON. Use "." if your JSON starts with [, and is a list.
expected_vars - a list of strings specifying the full property names required (all other properties are ignored)
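A parameter-less function that returns bytes can be built from any file-like object with `functools.partial`. The sketch below uses an in-memory `io.BytesIO` and a deliberately tiny chunk size to show the shape of such a function; whether `parse()` relies on an empty result to detect the end of the stream is an assumption here, not something this README states.

```python
import io
from functools import partial

source = b'[{"a": 1}, {"a": 2}]'
stream = io.BytesIO(source)  # stands in for an open file or network response

# A parameter-less callable: each call returns up to 8 more bytes.
read_chunk = partial(stream.read, 8)

chunks = []
while True:
    chunk = read_chunk()
    if not chunk:  # BytesIO returns b"" once the stream is exhausted
        break
    chunks.append(chunk)

print(chunks)
```

In practice the chunk size would be much larger, and `read_chunk` itself would be passed as the `json` argument.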

Common Usage

The most common use of parse() is to iterate over all the objects in a large, top-level, array:

from mo_json.stream import parse
parse(json, path=".", expected_vars=["."])

For example, given the following JSON:

[
    {"a": 1},
    {"a": 2},
    {"a": 3},
    {"a": 4}
]

returns a generator that provides

{"a": 1}
{"a": 2}
{"a": 3}
{"a": 4}

Examples

Simple Iteration

json = '{"b": "done", "a": [1, 2, 3]}'
parse(json, path="a", expected_vars=["a", "b"])

We will iterate through the array found on property a, and return both a and b variables. It will return the following values:

{"b": "done", "a": 1}
{"b": "done", "a": 2}
{"b": "done", "a": 3}

Bad - Property follows array

The same query, but different JSON with b following a:

json = '{"a": [1, 2, 3], "b": "done"}'
parse(json, path="a", expected_vars=["a", "b"])

Since property b follows the array we're iterating over, this will raise an error.

Good - No need for following properties

The same JSON, but different query, which does not require b:

json = '{"a": [1, 2, 3], "b": "done"}'
parse(json, path="a", expected_vars=["a"])

If we do not require b, then streaming will proceed just fine:

{"a": 1}
{"a": 2}
{"a": 3}
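The ordering rule in the three examples above follows from reading the document in a single pass: by the time the array's elements are emitted, only the properties that appeared before it are known. The sketch below checks that condition for a small, top-level JSON object using the standard library; `can_stream` is a hypothetical helper that loads the whole document, so it only illustrates the rule and is not how mo_json detects the problem.

```python
import json


def can_stream(json_text, array_key, expected_vars):
    """Hypothetical check: are all expected top-level properties seen
    no later than the array being iterated? Assumes a top-level object."""
    # object_pairs_hook=list preserves the order in which keys appear.
    pairs = json.loads(json_text, object_pairs_hook=list)
    keys = [k for k, _ in pairs]
    if array_key not in keys:
        return False
    seen = set(keys[: keys.index(array_key) + 1])
    return all(name in seen for name in expected_vars)


before = can_stream('{"b": "done", "a": [1, 2, 3]}', "a", ["a", "b"])
after = can_stream('{"a": [1, 2, 3], "b": "done"}', "a", ["a", "b"])
not_needed = can_stream('{"a": [1, 2, 3], "b": "done"}', "a", ["a"])
print(before, after, not_needed)
```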

Complex Objects

This streamer was meant for very long lists of complex objects. Use dot-delimited names to refer to the full name of a property:

json = [{"a": {"b": 1, "c": 2}}, {"a": {"b": 3, "c": 4}}, ...
parse(json, path=".", expected_vars=["a.c"])

The dot (.) can be used to refer to the top-most array. Notice the structure is maintained, but only includes the required variables.

{"a": {"c": 2}}
{"a": {"c": 4}}
...
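The structure-preserving selection shown above can be sketched as a projection over a decoded object: look up each dot-delimited name, and copy the value into the same position in a new object. `project` and its helpers are hypothetical names for illustration; the real streamer does this work while parsing rather than on decoded data.

```python
_MISSING = object()  # sentinel, so that None remains a legitimate value


def _get_path(doc, path):
    node = doc
    for name in path.split("."):
        if not isinstance(node, dict) or name not in node:
            return _MISSING
        node = node[name]
    return node


def _set_path(out, path, value):
    *parents, last = path.split(".")
    node = out
    for name in parents:
        node = node.setdefault(name, {})
    node[last] = value


def project(doc, names):
    """Hypothetical helper: keep only the named properties, preserving nesting."""
    out = {}
    for name in names:
        value = _get_path(doc, name)
        if value is not _MISSING:
            _set_path(out, name, value)
    return out


records = [{"a": {"b": 1, "c": 2}}, {"a": {"b": 3, "c": 4}}]
projected = [project(r, ["a.c"]) for r in records]
print(projected)
```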

Nested Arrays

Nested array iteration is meant to mimic a left-join from parent to child table; as such, it includes every record in the parent.

json = """[
    {"o": 1, "a": [{"b": 1}, {"b": 2}, {"b": 3}, {"b": 4}]},
    {"o": 2, "a": {"b": 5}},
    {"o": 3}
]"""
parse(json, path=[".", "a"], expected_vars=["o", "a.b"])

The path parameter can be a list, which indicates which properties are expected to hold an array and should be iterated over. Note that a property holding a single value instead of an array is treated as a singleton array, and a missing array still produces a result.

{"o": 1, "a": {"b": 1}}
{"o": 1, "a": {"b": 2}}
{"o": 1, "a": {"b": 3}}
{"o": 1, "a": {"b": 4}}
{"o": 2, "a": {"b": 5}}
{"o": 3}
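The left-join behaviour can be sketched over already-decoded records: each parent is repeated once per child, a lone child is treated as a one-element array, and a parent with no children still yields one row. `left_join_sketch` is a hypothetical helper; treating an empty array the same as a missing one is an assumption of this sketch, not a documented behaviour.

```python
def left_join_sketch(parents, child):
    """Hypothetical helper: yield one row per (parent, child element) pair."""
    for parent in parents:
        rest = {k: v for k, v in parent.items() if k != child}
        children = parent.get(child, [])
        if not isinstance(children, list):
            children = [children]  # a single value acts as a singleton array
        if not children:
            yield rest  # assumption: missing or empty arrays keep the parent row
        for item in children:
            yield {**rest, child: item}


records = [
    {"o": 1, "a": [{"b": 1}, {"b": 2}, {"b": 3}, {"b": 4}]},
    {"o": 2, "a": {"b": 5}},
    {"o": 3},
]
rows = list(left_join_sketch(records, "a"))
for row in rows:
    print(row)
```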

Large top-level objects

Some JSON is a single large object, rather than an array of objects. In these cases, you can use the items operator to iterate through all name/value pairs of an object:

json = """{
    "a": "test",
    "b": 2,
    "c": [1, 2]
}"""
parse(json, {"items": "."}, {"name", "value"})

produces an iterator of

{"name": "a", "value": "test"} 
{"name": "b", "value": 2} 
{"name": "c", "value": [1,2]}
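For a document small enough to load at once, the shape of the `items` output can be reproduced with the standard library. This sketch only illustrates the name/value records; `items_sketch` is a hypothetical helper, and unlike the streaming parser it reads the whole object into memory.

```python
import json


def items_sketch(json_text):
    """Hypothetical helper: yield one {"name", "value"} record per property."""
    for name, value in json.loads(json_text).items():
        yield {"name": name, "value": value}


pairs = list(items_sketch('{"a": "test", "b": 2, "c": [1, 2]}'))
for pair in pairs:
    print(pair)
```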

Module `typed_encoder`

One reason that NoSQL document stores are wonderful is that their schema can automatically expand to accept new properties. Unfortunately, this flexibility is not limitless: a string assigned to a property prevents an object being assigned to the same property, or vice versa. This flexibility is under attack by the strict-typing zealots who, in their self-righteous delusion, believe explicit types are better. They make the lives of humans worse, as we are forced to toil over endless schema modifications.

This module translates JSON documents into "typed" form, which allows document containers to store both objects and primitives in the same property. This also enables the storage of values with no containing object!

The typed JSON has a different form than the original, and queries into the document store must take this into account. This conversion is intended to be hidden behind a query abstraction layer that can understand this format.

How it works

There are three main conversions:

Primitive values are replaced with single-property objects, where the property name indicates the data type of the value stored:
```
{"a": true} -> {"a": {"~b~": true}} 
{"a": 1   } -> {"a": {"~n~": 1   }} 
{"a": "1" } -> {"a": {"~s~": "1" }}
```
JSON objects get an additional property, ~e~, to mark existence. This allows us to query for object existence, and to count the number of objects.
```
{"a": {}} -> {"a": {"~e~": 1}, "~e~": 1}  
```
JSON arrays are contained in a new object, along with ~e~ to count the number of elements in the array:
```
{"a": [1, 2, 3]} -> {"a": {
    "~e~": 3, 
    "~a~": [
        {"~n~": 1},
        {"~n~": 2},
        {"~n~": 3}
    ]
}}
```
Note the sum of a.~e~ works for both objects and arrays; letting us interpret sub-objects as single-value nested object arrays.
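The three conversions can be sketched in a few lines of Python. `to_typed` and `from_typed` below are hypothetical names written for this illustration; they are not the module's `typed_encode()` or `json2typed()`. The sketch adds `~e~` to every object, including the top-level one, as in the `{"a": {}}` example; it rejects `null`, and it assumes no original property is itself named like a marker.

```python
def to_typed(value):
    """Hypothetical encoder following the three rules described above."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"~b~": value}
    if isinstance(value, (int, float)):
        return {"~n~": value}
    if isinstance(value, str):
        return {"~s~": value}
    if isinstance(value, dict):
        typed = {k: to_typed(v) for k, v in value.items()}
        typed["~e~"] = 1  # existence marker
        return typed
    if isinstance(value, list):
        return {"~e~": len(value), "~a~": [to_typed(v) for v in value]}
    raise TypeError("not handled in this sketch: " + type(value).__name__)


def from_typed(typed):
    """Inverse of to_typed, showing that the mapping can be undone."""
    if "~a~" in typed:
        return [from_typed(v) for v in typed["~a~"]]
    for marker in ("~b~", "~n~", "~s~"):
        if marker in typed:
            return typed[marker]
    return {k: from_typed(v) for k, v in typed.items() if k != "~e~"}


doc = {"a": [1, "x", {"b": True}], "c": {}}
typed_doc = to_typed(doc)
round_trip = from_typed(typed_doc)
print(typed_doc)
```

The round trip from `doc` to `typed_doc` and back is what makes the two forms interchangeable behind a query layer.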

Function `typed_encode()`

Accepts a dict, list, or primitive value, and generates the typed JSON that can be inserted into a document store.

Function `json2typed()`

Accepts an existing JSON unicode string and returns the equivalent typed JSON unicode string.

Update Mar 2016 - PyPy version 5.x appears to have improved C integration to the point that the C library callbacks are no longer a significant overhead: this pure Python JSON encoder is no longer faster than a compound C/Python solution.

Fast JSON encoder used in convert.value2json() when running in PyPy. Run the speed test to compare with the default implementation and ujson.


mo-json 6.584.24095
