mpkz - direct MessagePack zstd writer for Python

mpkz is meant as a replacement for JSON files on disk. The file format is optimized for fast reads, while still writing faster than Python's json module and achieving decent compression ratios.

mpkz is just MessagePack with zstd compression, implemented as efficiently as possible. In some experiments, the default compression level of 8 gave the best trade-off between speed and compression ratio.

MessagePack can encode a superset of JSON, adding dedicated types for binary data and integers. This means you can use mpkz as a drop-in replacement for JSON without any real downsides.
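To illustrate the binary-data point, here is a small sketch using only the standard library (it does not use mpkz itself; the `record` contents are made up for the example). The json module rejects bytes, so the usual workaround is a base64 round trip, which a format with a native binary type makes unnecessary.

```python
import base64
import json

# hypothetical record containing raw binary data
record = {"name": "thumbnail", "data": b"\x89PNG\r\n"}

# json has no binary type, so raw bytes are rejected outright
try:
    json.dumps(record)
    json_accepts_bytes = True
except TypeError:
    json_accepts_bytes = False

# the usual workaround: base64-encode the bytes into a string first
encoded = base64.b64encode(record["data"]).decode("ascii")
workaround = json.dumps({"name": record["name"], "data": encoded})

# reading it back requires undoing the base64 step by hand
restored = base64.b64decode(json.loads(workaround)["data"])
```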

Streaming

MessagePack was designed as a buffered protocol, so there can be multiple messages in a single stream.

In our case, this means that lists can be decoded one item at a time, potentially saving memory (see the streaming example below).

For this reason, if the object is a list, it is automatically encoded as a stream instead.
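The idea of several messages in one stream can be sketched with a hand-written encoder for a tiny subset of MessagePack (integers 0 to 127 and strings of up to 31 bytes). The helpers `encode` and `iter_messages` are hypothetical names for this illustration, not part of mpkz, and the exact layout mpkz uses for lists is an assumption here. The point is only that each message is self-delimiting, so a reader can yield items one at a time.

```python
import io

def encode(value):
    """Encode a tiny MessagePack subset: ints 0..127 and short strings."""
    if isinstance(value, int) and not isinstance(value, bool) and 0 <= value <= 0x7F:
        return bytes([value])                      # positive fixint: one byte
    if isinstance(value, str):
        raw = value.encode("utf-8")
        if len(raw) <= 31:
            return bytes([0xA0 | len(raw)]) + raw  # fixstr: length in the tag byte
    raise ValueError(f"outside the subset handled by this sketch: {value!r}")

def iter_messages(stream):
    """Yield one decoded message at a time until the stream is exhausted."""
    while True:
        head = stream.read(1)
        if not head:
            return                                 # clean end of stream
        tag = head[0]
        if tag <= 0x7F:
            yield tag                              # positive fixint
        elif 0xA0 <= tag <= 0xBF:
            yield stream.read(tag & 0x1F).decode("utf-8")  # fixstr payload
        else:
            raise ValueError(f"unsupported tag byte: {tag:#x}")

items = [1, "hello", 42, "world"]

# the messages are simply concatenated, with no enclosing array header
stream = io.BytesIO(b"".join(encode(item) for item in items))
decoded = list(iter_messages(stream))
```

Because no array header with a total length is written up front, a writer using this layout does not need to know the number of items in advance.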

Why not use messagepack and zstd from pypi?

With the Python packages, you first have to encode the whole object into memory as MessagePack, then compress those bytes with zstd, and finally write the compressed bytes to a file.

This quickly becomes impractical with larger amounts of data, so this implementation directly serializes the python objects into a streaming zstd compressor, avoiding copying data more than once.
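The difference between the two approaches can be sketched with standard-library stand-ins: json in place of MessagePack and zlib in place of zstd. This is not how mpkz is implemented internally; the function names `dump_buffered` and `dump_streaming` are hypothetical and only illustrate the general pattern.

```python
import io
import json
import zlib

def dump_buffered(rows, fileobj):
    """Encode everything first, then compress, then write (three full copies)."""
    encoded = "".join(json.dumps(row) + "\n" for row in rows).encode("utf-8")
    fileobj.write(zlib.compress(encoded))

def dump_streaming(rows, fileobj):
    """Feed each encoded row straight into an incremental compressor."""
    compressor = zlib.compressobj()
    for row in rows:
        chunk = (json.dumps(row) + "\n").encode("utf-8")
        # compress() may return b"" while the compressor buffers internally
        fileobj.write(compressor.compress(chunk))
    fileobj.write(compressor.flush())  # emit whatever is still buffered

# a generator, so the full data set never exists in memory at once
rows = ({"id": i, "name": f"row {i}"} for i in range(1000))

out = io.BytesIO()
dump_streaming(rows, out)

# decompressing everything at once here is only for checking the result
restored = [json.loads(line) for line in zlib.decompress(out.getvalue()).splitlines()]
```

In the streaming variant, peak memory is bounded by one encoded row plus the compressor's internal buffers, regardless of how many rows are written.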

API

Basic

If you just want something that works like json, you can use the load/dump functions:

import mpkz

# Working with Files
with open("example.mpz", "wb") as f:
    mpkz.dump([1, 2, 3, 4, 5], f)
with open("example.mpz", "rb") as f:
    assert mpkz.load(f) == [1, 2, 3, 4, 5]

# Working with Bytes
# (named "data" rather than "input" to avoid shadowing the built-in)
data = {"greeting": "Hello World"}
binary = mpkz.dumps(data)
output = mpkz.loads(binary)
assert data == output

Streaming

The streaming API is useful for cases where the data you want to write would not fit entirely into memory.

import mpkz

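# saving all rows of a Django model into a file
# (a model manager is not iterable by itself; .values() yields plain
# dicts and .iterator() avoids caching the whole queryset in memory)
writer = mpkz.create("export.mpz")
writer.extend(MyModel.objects.values().iterator())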

# you can also append records one by one
writer = mpkz.create("example2.mpz")
writer.append("hello")
writer.append("world")

# and this is how you would iterate over the contents
# of a file without loading the whole file into memory
for row in mpkz.open("export.mpz"):
    print(row)
