Safe LZ4 data compression library.

safelz4

Python bindings for lz4_flex, the fastest pure-Rust implementation of the LZ4 compression algorithm.

Installation

Pip

You can install safelz4 with pip:

pip install safelz4

From source

Building from source requires a Rust toolchain:

# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Make sure it's up to date and using stable channel
rustup update
git clone https://github.com/LVivona/safelz4.git
cd safelz4
pip install setuptools_rust
pip install maturin
# install
pip install -e .

Getting Started

Block Format

The block format is suitable only for smaller chunks of data, because each block must be compressed or decompressed entirely in memory. For larger inputs, use the frame format instead: it supports streaming and carries metadata that makes long byte sequences easier to handle. See the LZ4 block format specification for details.

import os
import sys
from typing import Generator, Union

from safelz4.block import compress_prepend_size, decompress_size_prepended


def chunk_blocks(
    filename: Union[os.PathLike, str], chunk_size: int = 1024 * 1024
) -> Generator[bytes, None, None]:
    """Read a file in chunks and yield each chunk as a compressed block."""
    with open(filename, "rb") as f:
        # Each chunk is compressed on its own, with its size prepended.
        while content := f.read(chunk_size):
            yield compress_prepend_size(content)


# Compress the file in 1 MiB chunks (the default chunk_size).
blocks = chunk_blocks("dickens.txt")

for block in blocks:
    output = decompress_size_prepended(block)
    # Write raw bytes so a character split across two chunks still prints.
    sys.stdout.buffer.write(output)
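The function names above mention a prepended size. As a minimal sketch of what that means, assuming safelz4 keeps the layout documented by lz4_flex (the uncompressed size stored as a 4-byte little-endian unsigned integer in front of the compressed bytes), the prefix can be read with the standard library alone. The helper name `split_size_prefix` is hypothetical and is not part of safelz4.

```python
import struct

# Assumed layout: 4-byte little-endian unsigned integer, then the payload.
PREFIX = struct.Struct("<I")


def split_size_prefix(block: bytes):
    """Split a size-prepended block into (uncompressed_size, payload)."""
    if len(block) < PREFIX.size:
        raise ValueError("block is too short to hold a size prefix")
    (size,) = PREFIX.unpack_from(block, 0)
    return size, block[PREFIX.size:]


# Hand-built stand-in for a real block: a prefix claiming 11 bytes,
# followed by placeholder bytes (not valid LZ4 data).
fake_block = PREFIX.pack(11) + b"\x00\x01\x02"
size, payload = split_size_prefix(fake_block)
print(size, len(payload))  # 11 3
```

Knowing the uncompressed size up front is what lets the decompressor allocate its output buffer in one step, which is also why a block has to fit in memory.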

Frame Format

Frames are containers that encapsulate a set of compressed blocks. Information about the blocks is stored both in the frame header and within the blocks themselves. Read more in the LZ4 frame format specification.

import safelz4

# Read the whole file and write it out as a single LZ4 frame.
with open("dickens.txt", "rb") as file:
    buffer = file.read()
safelz4.compress_into_file("dickens.lz4", buffer)

# Stream the decompressed data back, 100 bytes at a time.
with safelz4.open("dickens.lz4", "rb") as f:
    while content := f.read(100):
        print(content.decode("utf-8"))
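To make the frame header concrete, here is a small sketch that inspects the first bytes of a frame using only the standard library. It follows the public LZ4 frame format (magic number 0x184D2204 stored little-endian, then the FLG and BD descriptor bytes). The function name `describe_frame_header` is hypothetical and is not a safelz4 API.

```python
LZ4_FRAME_MAGIC = bytes.fromhex("04224d18")  # 0x184D2204, little-endian

# BD byte, bits 6-4: code for the maximum block size.
BLOCK_MAX_SIZES = {4: 64 * 1024, 5: 256 * 1024, 6: 1024 * 1024, 7: 4 * 1024 * 1024}


def describe_frame_header(data: bytes) -> dict:
    """Decode the magic number, FLG byte and BD byte of an LZ4 frame."""
    if len(data) < 6 or data[:4] != LZ4_FRAME_MAGIC:
        raise ValueError("not an LZ4 frame")
    flg, bd = data[4], data[5]
    return {
        "version": flg >> 6,                      # must be 1
        "block_independence": bool(flg & 0x20),
        "block_checksum": bool(flg & 0x10),
        "content_size": bool(flg & 0x08),
        "content_checksum": bool(flg & 0x04),
        "dict_id": bool(flg & 0x01),
        "block_max_size": BLOCK_MAX_SIZES.get((bd >> 4) & 0x07),
    }


# Hand-built header bytes for illustration: FLG=0x64, BD=0x40.
header = LZ4_FRAME_MAGIC + bytes([0x64, 0x40])
print(describe_frame_header(header)["block_max_size"])  # 65536
```

Reading the first six bytes of a file such as dickens.lz4 and passing them to this function would show which options the writer used, assuming the file is a standard LZ4 frame.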

Benchmarks

Benchmark results are available in the benches folder. We evaluated performance in two key scenarios:

Full byte availability, where the entire buffer is accessible during compression and decompression.

Streamed access, using reader and writer interfaces with chunked input.

Summary

In full-buffer scenarios the two libraries are close. Compression is effectively a tie (geometric mean 1.01x), while lz4 is faster at decompression (geometric mean 1.26x), especially on larger files.

In reader/writer scenarios (chunked input, 1024 bytes), safelz4 outperforms lz4 on every file: about 2.5x faster for compression and about 2x faster for decompression (geometric means 2.51x and 2.03x).

Streamed access (chunk 1024 bytes)

open (compression)

| Benchmark | safelz4 | lz4 |
|---|---|---|
| ctx_compression_writer_compression_1k.txt | 8.84 us | 22.5 us: 2.54x slower |
| ctx_compression_writer_compression_34k.txt | 9.07 us | 22.6 us: 2.49x slower |
| ctx_compression_writer_compression_65k.txt | 9.18 us | 23.0 us: 2.50x slower |
| ctx_compression_writer_compression_66k_JSON.txt | 9.18 us | 23.1 us: 2.51x slower |
| ctx_compression_writer_dickens.txt | 9.16 us | 23.9 us: 2.61x slower |
| ctx_compression_writer_hdfs.json | 9.21 us | 22.9 us: 2.49x slower |
| ctx_compression_writer_reymont.pdf | 9.26 us | 22.9 us: 2.48x slower |
| ctx_compression_writer_xml_collection.xml | 9.27 us | 23.1 us: 2.49x slower |
| Geometric mean | (ref) | 2.51x slower |

open (decompression)

| Benchmark | safelz4 | lz4 |
|---|---|---|
| ctx_decompression_writer_compression_1k.txt | 11.0 us | 17.6 us: 1.59x slower |
| ctx_decompression_writer_compression_34k.txt | 23.8 us | 46.2 us: 1.94x slower |
| ctx_decompression_writer_compression_65k.txt | 34.6 us | 68.6 us: 1.98x slower |
| ctx_decompression_writer_compression_66k_JSON.txt | 27.1 us | 61.9 us: 2.28x slower |
| ctx_decompression_writer_dickens.txt | 4.11 ms | 8.67 ms: 2.11x slower |
| ctx_decompression_writer_hdfs.json | 1.77 ms | 4.39 ms: 2.48x slower |
| ctx_decompression_writer_reymont.pdf | 2.92 ms | 5.74 ms: 1.97x slower |
| ctx_decompression_writer_xml_collection.xml | 1.99 ms | 3.97 ms: 2.00x slower |
| Geometric mean | (ref) | 2.03x slower |
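The geometric-mean rows can be checked from the per-file ratios. The sketch below recomputes them with the standard library, using the rounded ratios printed in the streamed-access results, so it only reproduces the summary figures to two decimal places.

```python
from statistics import geometric_mean

# lz4 time divided by safelz4 time, per file, as printed in the results.
compression_ratios = [2.54, 2.49, 2.50, 2.51, 2.61, 2.49, 2.48, 2.49]
decompression_ratios = [1.59, 1.94, 1.98, 2.28, 2.11, 2.48, 1.97, 2.00]

# The geometric mean is the n-th root of the product of the ratios,
# which is the usual way to average speedup factors.
print(f"compression:   {geometric_mean(compression_ratios):.2f}x")    # 2.51x
print(f"decompression: {geometric_mean(decompression_ratios):.2f}x")  # 2.03x
```

A geometric mean is used rather than an arithmetic mean because it treats a 2x slowdown and a 2x speedup as cancelling out.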

Full byte availability Run(s)

frame.compress

| Benchmark | safelz4 | lz4 |
|---|---|---|
| compression_compression_1k.txt | 829 ns | 839 ns: 1.01x slower |
| compression_compression_34k.txt | 26.3 us | 32.5 us: 1.23x slower |
| compression_compression_65k.txt | 49.9 us | 60.1 us: 1.20x slower |
| compression_compression_66k_JSON.txt | 26.5 us | 24.7 us: 1.07x faster |
| compression_dickens.txt | 17.0 ms | 15.9 ms: 1.07x faster |
| compression_hdfs.json | 3.16 ms | 2.63 ms: 1.20x faster |
| compression_reymont.pdf | 12.3 ms | 11.4 ms: 1.08x faster |
| compression_xml_collection.xml | 4.58 ms | 4.12 ms: 1.11x faster |
| Geometric mean | (ref) | 1.01x faster |

frame.decompress

| Benchmark | safelz4 | lz4 |
|---|---|---|
| decompress_compression_1k.txt | 612 ns | 416 ns: 1.47x faster |
| decompress_compression_34k.txt | 8.96 us | 10.0 us: 1.12x slower |
| decompress_compression_65k.txt | 15.4 us | 17.1 us: 1.11x slower |
| decompress_compression_66k_JSON.txt | 9.45 us | 8.04 us: 1.18x faster |
| decompress_dickens.txt | 4.00 ms | 2.13 ms: 1.88x faster |
| decompress_hdfs.json | 1.50 ms | 1.03 ms: 1.45x faster |
| decompress_reymont.pdf | 2.42 ms | 1.99 ms: 1.21x faster |
| decompress_xml_collection.xml | 1.68 ms | 1.19 ms: 1.41x faster |
| Geometric mean | (ref) | 1.26x faster |

NOTE: All benchmarks were run with the Python package pyperf on a system with an Apple M4 Max processor and 36 GB of unified memory. In every table safelz4 is the reference, so "slower" and "faster" describe lz4 relative to safelz4.

Acknowledgement

This project acknowledges the outstanding work of Yann Collet.

Special thanks also to the maintainers of the lz4_flex Rust crate for providing a safe, pure-Rust implementation of LZ4 compression and decompression.

Other Implementations

Other Python LZ4 implementations include:

  • python-lz4

Licence

MIT License

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • safelz4-0.0.4.tar.gz (116.3 kB): Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

  • safelz4-0.0.4-cp38-abi3-win_amd64.whl (232.7 kB): CPython 3.8+, Windows x86-64
  • safelz4-0.0.4-cp38-abi3-win32.whl (225.2 kB): CPython 3.8+, Windows x86
  • safelz4-0.0.4-cp38-abi3-musllinux_1_2_x86_64.whl (549.5 kB): CPython 3.8+, musllinux: musl 1.2+ x86-64
  • safelz4-0.0.4-cp38-abi3-musllinux_1_2_i686.whl (583.6 kB): CPython 3.8+, musllinux: musl 1.2+ i686
  • safelz4-0.0.4-cp38-abi3-musllinux_1_2_armv7l.whl (650.6 kB): CPython 3.8+, musllinux: musl 1.2+ ARMv7l
  • safelz4-0.0.4-cp38-abi3-musllinux_1_2_aarch64.whl (557.9 kB): CPython 3.8+, musllinux: musl 1.2+ ARM64
  • safelz4-0.0.4-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (379.4 kB): CPython 3.8+, manylinux: glibc 2.17+ x86-64
  • safelz4-0.0.4-cp38-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl (419.3 kB): CPython 3.8+, manylinux: glibc 2.17+ s390x
  • safelz4-0.0.4-cp38-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (516.9 kB): CPython 3.8+, manylinux: glibc 2.17+ ppc64le
  • safelz4-0.0.4-cp38-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (387.9 kB): CPython 3.8+, manylinux: glibc 2.17+ ARMv7l
  • safelz4-0.0.4-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (380.5 kB): CPython 3.8+, manylinux: glibc 2.17+ ARM64
  • safelz4-0.0.4-cp38-abi3-manylinux_2_5_i686.manylinux1_i686.whl (408.4 kB): CPython 3.8+, manylinux: glibc 2.5+ i686
  • safelz4-0.0.4-cp38-abi3-macosx_11_0_arm64.whl (344.6 kB): CPython 3.8+, macOS 11.0+ ARM64
  • safelz4-0.0.4-cp38-abi3-macosx_10_12_x86_64.whl (354.5 kB): CPython 3.8+, macOS 10.12+ x86-64
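The wheel names above follow the standard wheel naming scheme: distribution, version, Python tag, ABI tag and platform tag, separated by hyphens. The sketch below splits a name into those parts. It is a simplified illustration that assumes no optional build tag is present, and `parse_wheel_filename` is a hypothetical helper, not part of safelz4 or pip.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its five hyphen-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:  # a sixth field would be an optional build tag
        raise ValueError("expected name-version-python-abi-platform")
    return WheelTags(*parts)


tags = parse_wheel_filename(
    "safelz4-0.0.4-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(tags.python_tag, tags.abi_tag)  # cp38 abi3
print(tags.platform_tag.split("."))   # two platform tags joined by a dot
```

Here `cp38` together with `abi3` indicates a wheel built against CPython's stable ABI, which is why one file per platform covers CPython 3.8 and later.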

File details

Details for the file safelz4-0.0.4.tar.gz.

File metadata

  • Download URL: safelz4-0.0.4.tar.gz
  • Upload date:
  • Size: 116.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.8.7

File hashes

Hashes for safelz4-0.0.4.tar.gz
Algorithm Hash digest
SHA256 9886d419423c5ddc3ecc03d0c1deaa401d2a23e37e8f4cdfb5d98a9a4e3b4ee0
MD5 b856c0a1fc8f83437ff4975a94425f0c
BLAKE2b-256 5d4d78ec95c568a96cc3911cf44b746d8e360c01e60dca23f63f6302b1079222

See more details on using hashes here.
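A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below is one way to do it. The helper names are hypothetical, and the local path in the final lines is an assumption about where the file was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for safelz4-0.0.4.tar.gz.
PUBLISHED_SHA256 = "9886d419423c5ddc3ecc03d0c1deaa401d2a23e37e8f4cdfb5d98a9a4e3b4ee0"


def sha256_of(path, chunk_size=1024 * 1024) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex: str) -> bool:
    """Compare a file's digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Assumed download location; the check only runs if the file is present.
archive = Path("safelz4-0.0.4.tar.gz")
if archive.exists():
    print(matches_published_digest(archive, PUBLISHED_SHA256))
```

pip can also enforce digests at install time through its hash-checking mode, which avoids doing this by hand.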

File details

Details for the file safelz4-0.0.4-cp38-abi3-win_amd64.whl.

File metadata

  • Download URL: safelz4-0.0.4-cp38-abi3-win_amd64.whl
  • Upload date:
  • Size: 232.7 kB
  • Tags: CPython 3.8+, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.8.7

File hashes

Hashes for safelz4-0.0.4-cp38-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 096b4e2bc3ebcd877b42d945c6dac4a6262982482b528ab0068d8b8b8a526387
MD5 7e8b859ecab1d97d9e770b4a94102a48
BLAKE2b-256 2f35eaeb8124a49922ffacb044593efe13e4c3da452976e2ff96878a6d2fa02c

File details

Details for the file safelz4-0.0.4-cp38-abi3-win32.whl.

File metadata

  • Download URL: safelz4-0.0.4-cp38-abi3-win32.whl
  • Upload date:
  • Size: 225.2 kB
  • Tags: CPython 3.8+, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.8.7

File hashes

Hashes for safelz4-0.0.4-cp38-abi3-win32.whl
Algorithm Hash digest
SHA256 d5b6b4de5f827d5eeb02e53582a1a557e8ab3a293d1af4381a9a96ead9f67bd6
MD5 ad01cbc88392ab7fa2ec9ef3b8cff5c7
BLAKE2b-256 2c621ef4114c122694d72c3c6983317dcbd576a714985060a728dbeebb3fd376

File details

Details for the file safelz4-0.0.4-cp38-abi3-musllinux_1_2_x86_64.whl.

File hashes

Hashes for safelz4-0.0.4-cp38-abi3-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 ca874ad1e0ecaca2fd3acb2c6011a7c4bf1104bb14f45d1f4ec7ee0115ae8abf
MD5 79204144a7c3aba373ceb5c75cd7453c
BLAKE2b-256 e018a0a0750c2e5a57459aea510c8f2ace9dd3b967fb432b8a596d46895645a4

File details

Details for the file safelz4-0.0.4-cp38-abi3-musllinux_1_2_i686.whl.

File hashes

Hashes for safelz4-0.0.4-cp38-abi3-musllinux_1_2_i686.whl
Algorithm Hash digest
SHA256 04e6570602ac414346e5d5913b77be65194c5589c9320c309fd5606abc577fbe
MD5 14695e4581317dda9278f7a3a33d2ecd
BLAKE2b-256 bf7f5cba00b20bd61a9fa419fab0e905d67f9ae07b609b70dc62c5e0240b0227

File details

Details for the file safelz4-0.0.4-cp38-abi3-musllinux_1_2_armv7l.whl.

File hashes

Hashes for safelz4-0.0.4-cp38-abi3-musllinux_1_2_armv7l.whl
Algorithm Hash digest
SHA256 ad4af74fae2568e205f3a4cd88215449011306e2ea6996ced06ed7ebfe5e1bb0
MD5 b93ea98250d9ad13128c4ff119c12736
BLAKE2b-256 01a66af535e7b1d8d6659a3123b759ca6c9e2452afb311367d75ecc4db937378

File details

Details for the file safelz4-0.0.4-cp38-abi3-musllinux_1_2_aarch64.whl.

File hashes

Hashes for safelz4-0.0.4-cp38-abi3-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 8ec092990b7918fc6ede6994280342d29cb0433fb47ca6ed2f3af4d7fdd88ad2
MD5 95acabc6931b8540840e5f66453536b6
BLAKE2b-256 d6e6c1a43b80c657c3bdd97aadd4b6a583664574de06e6b9c02a494993b3479d

File details

Details for the file safelz4-0.0.4-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for safelz4-0.0.4-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 dd0e27ee98ad3dabbf9d35c869a8e43634a819de43bd077cd5f08873c9485c5d
MD5 558c492c6dc59d6940a06831d13f96bd
BLAKE2b-256 234764bcc50bb12d92ced73814462a6217c05bb760c7d1edd0352409f77313f9

File details

Details for the file safelz4-0.0.4-cp38-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl.

File hashes

Hashes for safelz4-0.0.4-cp38-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl
Algorithm Hash digest
SHA256 49c3a0147542936b41debc2dd0e326f8dd2c59acdaac733e561fdcbc514720c2
MD5 7891cd00045c2ab6a0603d9d88972be4
BLAKE2b-256 34cdaf08d61e546db3ebaa416938a10e9435bc4fdfea8c4f2e8a91daffe9d684

File details

Details for the file safelz4-0.0.4-cp38-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl.

File hashes

Hashes for safelz4-0.0.4-cp38-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl
Algorithm Hash digest
SHA256 bf6926160e52503cfbd0bb261fbb5e807a5a5fcd382db9e83834a9cdf0a4ca37
MD5 5a52a9b2ddea2bd6a9d31bdf8554310a
BLAKE2b-256 a67346e21a4c9d3009347e67ede431ae1e5b072735d54da81ea7ae5cab824f1b

File details

Details for the file safelz4-0.0.4-cp38-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl.

File hashes

Hashes for safelz4-0.0.4-cp38-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl
Algorithm Hash digest
SHA256 a1b22a10d9b7eda1046ea8868b41cd61f2d6ecce70d74c48745e2652cbf3c392
MD5 47cbce9f7183c146b666919bc9641809
BLAKE2b-256 9e4336e1a3d1a4498ec878abc75aa7fd8d6b10473fe8555df832f60b6bbd3cf3

File details

Details for the file safelz4-0.0.4-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for safelz4-0.0.4-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 b41bf25e8ef4802ae3ecb9a3331c285879d66ac5c5b61c6f36d723f55ec3a159
MD5 d0a6f6cddd85e2c636abfb8e55bf1276
BLAKE2b-256 c34146036153796b4cfbf5101ba015b863fe3af39f85ffd19689c77502284dcd

File details

Details for the file safelz4-0.0.4-cp38-abi3-manylinux_2_5_i686.manylinux1_i686.whl.

File hashes

Hashes for safelz4-0.0.4-cp38-abi3-manylinux_2_5_i686.manylinux1_i686.whl
Algorithm Hash digest
SHA256 9fae7affb9411722c275f6b36d1a4f843c0083dae4771535b61ad8a90616d42b
MD5 55c94aa91ff00730f52426c009a219c6
BLAKE2b-256 78ea21b4c74b70ed09fe804ac1ac9ee8fd537c702ad99b192eec680318a96536

File details

Details for the file safelz4-0.0.4-cp38-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for safelz4-0.0.4-cp38-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 3b830613009e13fd38619eef30c114035e38956108c24803d859fbab69b688d2
MD5 ace23214601c1328317fd44f3783941e
BLAKE2b-256 b5bd67006c594108c6285576ae2f1752c576f168af7e8fe16a39df2d6b5806b1

File details

Details for the file safelz4-0.0.4-cp38-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for safelz4-0.0.4-cp38-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 6dfb183cee10e3babe65d2de80a5ff850d7657a989abbc34952ad08fd3086373
MD5 c9fab2428c9342dd98293f7569644a03
BLAKE2b-256 92d8c3df0fd0e505e4059a3735f5c9e68e2789318c7bae5fce6b8557a68225d0
