A JIT implementation for Marshmallow to speed up dumping and loading objects
🔥 Deep-Fried Marshmallow – Makes Marshmallow a Chicken Nugget
I need to be honest with you — I have no idea how to compare the speed of a marshmallow and the speed of a chicken nugget. I really liked that headline, though, so let's just assume that a nugget is indeed faster than a marshmallow. So is this project, Deep-Fried Marshmallow, faster than vanilla Marshmallow. Or, to be precise, it makes Marshmallow faster.
Deep-Fried Marshmallow implements a JIT for Marshmallow that speeds up dumping and loading objects by roughly 3-5x (depending on your schema). It lets you keep the great API that Marshmallow provides without sacrificing performance.
Benchmark Result:
Original Dump Time: 220.50 usec/dump
Original Load Time: 536.51 usec/load
Optimized Dump Time: 58.67 usec/dump
Optimized Load Time: 118.44 usec/load
Speed up for dump: 3.76x
Speed up for load: 4.53x
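The speed-up figures are simply the ratio of the original time to the optimized time. A small sketch that recomputes them from the timings printed above (the dictionary names are only illustrative):

```python
# Timings copied from the benchmark output above, in microseconds per call.
original = {"dump": 220.50, "load": 536.51}
optimized = {"dump": 58.67, "load": 118.44}

# Speed-up is the original time divided by the optimized time.
speedups = {op: original[op] / optimized[op] for op in original}

for op, ratio in speedups.items():
    print(f"Speed up for {op}: {ratio:.2f}x")
```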
Deep-Fried Marshmallow is a fork of the great Toasted Marshmallow project that, sadly, has been abandoned for years. Deep-Fried Marshmallow introduces many changes that make it compatible with recent versions of Marshmallow (3.13+). It also changes the way the library interacts with Marshmallow, which means that Marshmallow's own code no longer needs to be forked and modified for the JIT magic to work. That's a whole new level of magic!
Installing Deep-Fried Marshmallow
pip install DeepFriedMarshmallow
# or, if your project uses Poetry:
poetry add DeepFriedMarshmallow
If your project doesn't list vanilla Marshmallow in its requirements, the latest version will be installed alongside Deep-Fried Marshmallow. You are free to pin any version you need, as long as it is v3.13 or newer.
Enabling Deep-Fried Marshmallow
Enabling Deep-Fried Marshmallow on an existing schema takes a single code change: make your schemas inherit from the JitSchema class in the deepfriedmarshmallow package instead of Schema from marshmallow.
For example, this block of code:
from marshmallow import Schema, fields

class ArtistSchema(Schema):
    name = fields.Str()

class AlbumSchema(Schema):
    title = fields.Str()
    release_date = fields.Date()
    artist = fields.Nested(ArtistSchema())

schema = AlbumSchema()
Should become this:
from marshmallow import fields
from deepfriedmarshmallow import JitSchema

class ArtistSchema(JitSchema):
    name = fields.Str()

class AlbumSchema(JitSchema):
    title = fields.Str()
    release_date = fields.Date()
    artist = fields.Nested(ArtistSchema())

schema = AlbumSchema()
And that's it!
How it works
Deep-Fried Marshmallow works by generating code at runtime to optimize dumping objects without going through layers and layers of reflection. The generated code optimistically assumes the objects being passed in are schematically valid, falling back to the original Marshmallow code on failure.
For example, taking AlbumSchema from above, Deep-Fried Marshmallow will generate the following methods:
def InstanceSerializer(obj):
    res = {}
    value = obj.title; value = value() if callable(value) else value; value = str(value) if value is not None else None; res["title"] = value
    value = obj.release_date; value = value() if callable(value) else value; res["release_date"] = _field_release_date__serialize(value, "release_date", obj)
    value = obj.artist; value = value() if callable(value) else value; res["artist"] = _field_artist__serialize(value, "artist", obj)
    return res

def DictSerializer(obj):
    res = {}
    if "title" in obj:
        value = obj["title"]; value = value() if callable(value) else value; value = str(value) if value is not None else None; res["title"] = value
    if "release_date" in obj:
        value = obj["release_date"]; value = value() if callable(value) else value; res["release_date"] = _field_release_date__serialize(value, "release_date", obj)
    if "artist" in obj:
        value = obj["artist"]; value = value() if callable(value) else value; res["artist"] = _field_artist__serialize(value, "artist", obj)
    return res

def HybridSerializer(obj):
    res = {}
    try:
        value = obj["title"]
    except (KeyError, AttributeError, IndexError, TypeError):
        value = obj.title
    value = value; value = value() if callable(value) else value; value = str(value) if value is not None else None; res["title"] = value
    try:
        value = obj["release_date"]
    except (KeyError, AttributeError, IndexError, TypeError):
        value = obj.release_date
    value = value; value = value() if callable(value) else value; res["release_date"] = _field_release_date__serialize(value, "release_date", obj)
    try:
        value = obj["artist"]
    except (KeyError, AttributeError, IndexError, TypeError):
        value = obj.artist
    value = value; value = value() if callable(value) else value; res["artist"] = _field_artist__serialize(value, "artist", obj)
    return res
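To make the idea behind the generated methods above more concrete, here is a minimal, standard-library-only sketch of runtime code generation with an optimistic fast path. It is not Deep-Fried Marshmallow's actual implementation: build_instance_serializer, generic_serializer and make_dump are hypothetical names, and the fallback function merely stands in for the original Marshmallow code path.

```python
from types import SimpleNamespace


def build_instance_serializer(field_names):
    """Generate and compile a serializer that reads each field as an attribute.

    Assumes every field name is a valid Python identifier.
    """
    lines = ["def InstanceSerializer(obj):", "    res = {}"]
    for name in field_names:
        # One straight-line assignment per field, so the generated
        # function contains no loop and no per-field lookup.
        lines.append(f"    res[{name!r}] = obj.{name}")
    lines.append("    return res")
    source = "\n".join(lines)

    namespace = {}
    exec(source, namespace)  # compile the generated source once
    return namespace["InstanceSerializer"], source


def generic_serializer(obj, field_names):
    """Slow, reflective path (a stand-in for the original Marshmallow code)."""
    result = {}
    for name in field_names:
        if isinstance(obj, dict):
            result[name] = obj.get(name)
        else:
            result[name] = getattr(obj, name, None)
    return result


def make_dump(field_names):
    fast, _source = build_instance_serializer(field_names)

    def dump(obj):
        try:
            return fast(obj)  # optimistic: assume the object fits the schema
        except Exception:
            return generic_serializer(obj, field_names)  # fall back on failure

    return dump


dump = make_dump(["title", "release_date"])
album = SimpleNamespace(title="Example Album", release_date="2020-01-01")
print(dump(album))                # served by the generated fast path
print(dump({"title": "Other"}))   # attribute access fails, fallback is used
```

The second call shows the fallback: a plain dict has no title attribute, so the generated function raises and the slower generic path produces the result instead.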
Deep-Fried Marshmallow will invoke the proper serializer based upon the input.
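How the library decides which serializer to call is not spelled out here, so the following is only a plausible sketch of such a dispatch, under the assumption that mappings go to the dict serializer, other subscriptable objects go to the hybrid one, and everything else is treated as a plain instance. The simplified serializers, pick_serializer and the Row class are all hypothetical.

```python
from collections.abc import Mapping
from types import SimpleNamespace

FIELDS = ("title", "release_date")


def instance_serializer(obj):
    # Attribute access only.
    return {name: getattr(obj, name) for name in FIELDS}


def dict_serializer(obj):
    # Key access only; missing keys are skipped.
    return {name: obj[name] for name in FIELDS if name in obj}


def hybrid_serializer(obj):
    # Try key access first, then fall back to attribute access per field.
    res = {}
    for name in FIELDS:
        try:
            res[name] = obj[name]
        except (KeyError, AttributeError, IndexError, TypeError):
            res[name] = getattr(obj, name)
    return res


def pick_serializer(obj):
    """Assumed selection rule; the real library may decide differently."""
    if isinstance(obj, Mapping):
        return dict_serializer
    if hasattr(obj, "__getitem__"):
        return hybrid_serializer
    return instance_serializer


class Row:
    """Hypothetical object that supports both obj[key] and obj.attr."""

    def __init__(self, **values):
        self._values = values
        self.release_date = "2020-01-01"

    def __getitem__(self, key):
        return self._values[key]


for candidate in ({"title": "A"}, SimpleNamespace(title="B", release_date=None), Row(title="C")):
    serializer = pick_serializer(candidate)
    print(serializer.__name__, serializer(candidate))
```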
Since Deep-Fried Marshmallow generates code at runtime, it's critical you re-use Schema objects. If you're creating a new Schema object every time you serialize or deserialize an object, you're likely to experience much worse performance.
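The cost of not re-using schemas can be illustrated with a toy model. ToySchema below is a hypothetical stand-in, not the real JitSchema, and it assumes code is generated lazily on first use, once per schema instance. It counts how often code generation runs in each usage pattern.

```python
class ToySchema:
    """Hypothetical stand-in for a JIT-backed schema."""

    compile_count = 0  # how many times any instance has generated code

    def __init__(self, *field_names):
        self.field_names = field_names
        self._serializer = None  # generated lazily, once per instance

    def dump(self, obj):
        if self._serializer is None:
            type(self).compile_count += 1
            body = ", ".join(f"{n!r}: obj[{n!r}]" for n in self.field_names)
            namespace = {}
            exec(f"def serialize(obj):\n    return {{{body}}}", namespace)
            self._serializer = namespace["serialize"]
        return self._serializer(obj)


records = [{"title": f"Album {i}"} for i in range(100)]

# Anti-pattern: a new schema per record regenerates the code every time.
for record in records:
    ToySchema("title").dump(record)
per_call_compiles = ToySchema.compile_count

# Preferred: build the schema once (for example at module level) and share it.
ToySchema.compile_count = 0
shared = ToySchema("title")
for record in records:
    shared.dump(record)
shared_compiles = ToySchema.compile_count

print(per_call_compiles, shared_compiles)
```

In this model the first pattern pays the code-generation cost for every record, while the shared instance pays it once.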
Special thanks to
- @rowillia/@lyft — for creating Toasted Marshmallow
- @taion — for a PoC of injecting the JIT compiler by replacing the marshaller
- @Kalepa — for needing improved Marshmallow performance so that I could actually work on this project 😅
License
See LICENSE for details.
Contributing
Contributions, issues and feature requests are welcome!
Feel free to check existing issues before reporting a new one.
Show your support
Give this repository a ⭐️ if this project helped you!