A Python library for fast serialization and deserialization of complex Python objects to and from JSON.

serializejson

Authors: Baptiste de La Gorce

PyPI: https://pypi.org/project/serializejson

Documentation: https://smartaudiotools.github.io/serializejson

Sources: https://github.com/SmartAudioTools/serializejson

Issues: https://github.com/SmartAudioTools/serializejson/issues

Noncommercial license: Prosperity Public License 3.0.0

Commercial license: Patron License 1.0.0 (sponsor me or contact me!)

serializejson is a Python library for fast serialization and deserialization of Python objects to and from JSON, designed as a safe, interoperable and human-readable drop-in replacement for the Python pickle package. Complex Python object hierarchies can be serialized, deserialized or updated in one go, which makes it possible, for example, to save or restore a complete application state in a few lines of code. The library is built upon python-rapidjson, pybase64 and blosc (for optional zstandard compression).

Some of the main features:

  • supports Python 3.7 or later (earlier versions may also work).

  • serializes arbitrary Python objects into a dictionary by adding a __class__ key and, where needed, __init__, __new__, __state__ and __items__ keys.

  • calls the same object methods as pickle, so almost all picklable objects can be serialized with serializejson without any modification.

  • objects that are not already picklable can always be made serializable by adding methods to the object or by creating plugins for pickle or serializejson.

  • generally about 2x slower than pickle for dumping and 3x slower than pickle for loading (on our benchmark), except for big arrays (optimization is planned).

  • serializes and deserializes bytes and bytearray very quickly in base64 thanks to pybase64 and lossless blosc compression.

  • serializes properties and attributes with getters and setters if wanted (unlike pickle).

  • JSON data remains directly loadable even if you have turned some attributes into slots or properties in your code since your last serialization (unlike pickle).

  • can serialize __init__(self,..) arguments by name instead of by position, which allows arguments with default values to be skipped and makes the JSON data robust to changes in the order of __init__ parameters.

  • serialized objects generally take less space than when serialized with pickle: for binary data, the size increase of about one third due to base64 encoding is in general more than compensated by the lossless blosc compression.

  • serialized objects are human-readable. Unlike pickled data, your data will never become unreadable if your code evolves: you will always be able to modify your data with a text editor (for example with find & replace if you rename an attribute).

  • serialized objects are text and can therefore be versioned and compared with version-control and diff tools.

  • can safely load untrusted or unauthenticated sources if the authorized_classes list parameter is carefully restricted to the strictly necessary classes (unlike pickle).

  • can update existing objects recursively instead of overriding them. serializejson can be used to save and restore a complete application state in place (⚠ not yet well tested).

  • filters out attributes starting with “_” by default (unlike pickle). You can keep them if wanted with filter_ = False.

  • numpy arrays can be serialized as lists, with automatic conversion in both directions, or in a conservative way.

  • supports circular references and serializes duplicated objects only once, using a “$ref” key and the path to the first occurrence in the JSON: {“$ref”: “root.xxx.elt”} (⚠ not yet if the object is a list or dictionary).

  • accepts JSON with comments (// and /* */) if accept_comments = True.

  • can automatically recognize objects in JSON from their key names and recreate them without a __class__ key, if their classes are passed in recognized_classes.

  • serializejson is easily interoperable outside of the Python ecosystem, thanks to this recognition of objects from key names or to __class__ translation between Python classes and classes of other languages.

  • dump and load accept a file path given as a string.

  • can iteratively encode (with append) and decode (with an iterator) a list in a JSON file, which saves memory during serialization and deserialization and is useful for logs.
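The __class__ / __init__ tagging and the idea behind authorized_classes from the list above can be sketched with the standard json module alone. This is only an illustration of the concept, not serializejson's implementation: the names AUTHORIZED, to_tagged and from_tagged are hypothetical, and the sketch handles only set and frozenset.

```python
import json

# Hypothetical allow-list playing the role of authorized_classes.
AUTHORIZED = {"set": set, "frozenset": frozenset}

def to_tagged(obj):
    """Called by json.dumps for objects it cannot serialize natively."""
    name = type(obj).__name__
    if name in AUTHORIZED:
        # sorted() gives a stable output; it assumes the items are comparable.
        return {"__class__": name, "__init__": sorted(obj)}
    raise TypeError(f"cannot serialize {name}")

def from_tagged(dct):
    """Called by json.loads for every decoded JSON object."""
    name = dct.get("__class__")
    if name is None:
        return dct  # a plain dictionary, returned unchanged
    if name not in AUTHORIZED:
        # Refuse to build anything that is not explicitly allowed.
        raise ValueError(f"class {name!r} is not authorized")
    return AUTHORIZED[name](dct["__init__"])

text = json.dumps({"tags": {2, 1}}, default=to_tagged)
restored = json.loads(text, object_hook=from_tagged)
```

Because the decoder only ever instantiates classes found in the allow-list, a document naming any other class is rejected instead of being executed, which is the property that makes loading untrusted input reasonable in this sketch.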

Installation

Latest official release

pip install serializejson

Unreleased development version

pip install git+https://github.com/SmartAudioTools/serializejson.git

Examples

Serialization with the function-based API

import serializejson

# serialize to a string
object1 = set([1,2])
dumped1 = serializejson.dumps(object1)
loaded1 = serializejson.loads(dumped1)
print(dumped1)
# >{
# >        "__class__": "set",
# >        "__init__": [1,2]
# >}


# serialize to a file
object2 = set([3,4])
serializejson.dump(object2,"dumped2.json")
loaded2 = serializejson.load("dumped2.json")

Serialization with the class-based API

import serializejson
encoder = serializejson.Encoder()
decoder = serializejson.Decoder()

# serialize to a string
object1 = set([1,2])
dumped1 = encoder.dumps(object1)
loaded1 = decoder.loads(dumped1)
print(dumped1)

# serialize to a file
object2 = set([3,4])
encoder.dump(object2,"dumped2.json")
loaded2 = decoder.load("dumped2.json")

Update existing object

import serializejson
object1 = set([1,2])
object2 = set([3,4])
dumped1 = serializejson.dumps(object1)
print(f"id {id(object2)} :  {object2}")
serializejson.loads(dumped1,obj = object2, updatables_classes = [set])
print(f"id {id(object2)} :  {object2}")
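To make the difference between updating and replacing an object concrete, here is a minimal standard-library sketch that does not use serializejson. The helper update_set_in_place is hypothetical; it only illustrates what it means for an object to keep its identity while its content changes, and says nothing about how serializejson performs updates internally.

```python
def update_set_in_place(target, new_items):
    """Replace the content of `target` without creating a new set."""
    target.clear()
    target.update(new_items)
    return target

object2 = {3, 4}
alias = object2                  # a second reference to the same set
identity_before = id(object2)

update_set_in_place(object2, [1, 2])

# The object is the same one as before; only its content changed.
same_object = id(object2) == identity_before
# Any other reference to the object sees the new content too.
alias_sees_update = alias == {1, 2}
```

Every existing reference to the object (here alias) observes the new content, which is the behaviour an application needs when it restores its state in place.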

Iterative serialization and deserialization

import serializejson
encoder = serializejson.Encoder("my_list.json",indent = None)
for elt in range(3):
    encoder.append(elt)
with open("my_list.json") as f:
    print(f.read())
for elt in serializejson.Decoder("my_list.json"):
    print(elt)
# >[0,1,2]
# >0
# >1
# >2
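One way to append to a JSON list on disk without loading it into memory is to overwrite the closing bracket. The sketch below shows that technique with the standard library only. It is an assumption about how such appending can work, not a description of serializejson's Encoder, and the helper name append_json is hypothetical.

```python
import json
import os
import tempfile

def append_json(path, item):
    """Append `item` to the JSON list stored at `path`, creating it if needed."""
    encoded = json.dumps(item).encode("utf-8")
    if not os.path.exists(path) or os.path.getsize(path) == 0:
        with open(path, "wb") as f:
            f.write(b"[" + encoded + b"]")
        return
    with open(path, "rb+") as f:
        f.seek(-2, os.SEEK_END)
        tail = f.read(2)
        if not tail.endswith(b"]"):
            raise ValueError("file does not end with a JSON list")
        # No comma is needed when the existing list is empty ("[]").
        separator = b"" if tail == b"[]" else b","
        f.seek(-1, os.SEEK_END)   # position on the closing bracket
        f.write(separator + encoded + b"]")

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "my_list.json")
    for elt in range(3):
        append_json(path, elt)
    with open(path, encoding="utf-8") as f:
        content = f.read()
```

Each call touches only the end of the file, so the cost of an append does not grow with the size of the list. The sketch covers the writing side only; reading the elements back one by one needs a streaming parser.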

More examples and the complete documentation are available at https://smartaudiotools.github.io/serializejson

License

Copyright 2020 Baptiste de La Gorce

For noncommercial use, or for commercial use during a thirty-day free-trial period, this project is licensed under the Prosperity Public License 3.0.0.

For unlimited commercial use, this project is licensed under the Patron License 1.0.0. To acquire a license, please contact me or simply sponsor me on GitHub under the appropriate tier! This funding model helps make my work sustainable and compensates me for the work it took to write this library.

Third-party contributions are licensed under the Apache License, Version 2.0 and belong to their respective authors.

History

Version 0.3.4

Date: 2023-06-11

  • Restore documentation

Version 0.3.3

Date: 2022-10-18

  • Big speed improvement for bytes and numpy array serialization

Version 0.3.2

Date: 2022-10-01

  • API changed

  • add better support for circular references and duplicates with {“$ref”: …}

Version 0.2.0

Date: 2021-02-18

  • API changed

  • can serialize dict with non-string keys

  • add support for circular references and duplicates with {“$ref”: …}

Version 0.1.0

Date: 2020-11-28

  • change description for PyPI

  • add license for PyPI

  • enable load of tuple, time.struct_time, Counter, OrderedDict and defaultdict

Version 0.0.4

Date: 2020-11-24

  • API changed

  • add plugins support

  • add bytes, bytearray and numpy.array compression with blosc zstd

  • fix iterative append and decode (not fully tested).

  • fix dump of numpy types without conversion to Python types (not yet numpy.float64)
