
Safe LZ4 data compression library.


safelz4


Python bindings for lz4_flex, the fastest pure-Rust implementation of the LZ4 compression algorithm.

Installation

Pip

You can install safelz4 with pip:

pip install safelz4

From source

Building from source requires Rust:

# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Make sure it's up to date and using stable channel
rustup update
git clone https://github.com/LVivona/safelz4.git
cd safelz4
pip install setuptools_rust
pip install maturin
# install
pip install -e .
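
After either install route, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not part of safelz4.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("safelz4")
if version:
    print(f"safelz4 {version} is installed")
else:
    print("safelz4 is not installed in this environment")
```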

Getting Started

Block Format


The block format is suitable only for smaller chunks of data, because each block must be compressed or decompressed entirely in memory. For larger inputs, use the frame format instead: it supports streaming and carries metadata that makes long byte sequences easier to handle. See the specs for details.

import os
import sys
from typing import Union, Generator
from safelz4.block import compress_prepend_size, decompress_size_prepended

def chunk_blocks(filename: Union[os.PathLike, str], chunk_size: int = 1048576) -> Generator[bytes, None, None]:
    """Read the file in chunks and yield each chunk as a compressed block."""
    with open(filename, "rb") as f:
        while content := f.read(chunk_size):
            yield compress_prepend_size(content)

# Compress in 1 MiB chunks
blocks = chunk_blocks("dickens.txt")

for block in blocks:
    # Decompress the block yielded by the generator
    output = decompress_size_prepended(block)
    sys.stdout.write(output.decode("utf-8"))
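
The blocks in the example above only exist in memory. To store several independent blocks in one file, a reader needs some way to find where each block ends. The sketch below shows one possible approach: a 4-byte little-endian length in front of every block. This layout is an assumption made for illustration only; it is not the LZ4 frame format, and `write_blocks` and `read_blocks` are hypothetical helpers. Plain byte strings stand in for compressed blocks so the snippet runs without safelz4.

```python
import io
import struct
from typing import BinaryIO, Iterable, Iterator

# Hypothetical container layout: 4-byte little-endian length, then the block.
_LEN = struct.Struct("<I")


def write_blocks(stream: BinaryIO, blocks: Iterable[bytes]) -> int:
    """Write each block preceded by its length; return the number of blocks."""
    count = 0
    for block in blocks:
        stream.write(_LEN.pack(len(block)))
        stream.write(block)
        count += 1
    return count


def read_blocks(stream: BinaryIO) -> Iterator[bytes]:
    """Yield the blocks written by write_blocks, in order."""
    while header := stream.read(_LEN.size):
        if len(header) != _LEN.size:
            raise ValueError("truncated length prefix")
        (length,) = _LEN.unpack(header)
        block = stream.read(length)
        if len(block) != length:
            raise ValueError("truncated block")
        yield block


# Round trip with placeholder byte strings standing in for compressed blocks.
store = io.BytesIO()
write_blocks(store, [b"first", b"", b"third block"])
store.seek(0)
restored = list(read_blocks(store))
```

In practice the iterable passed to `write_blocks` could be the `chunk_blocks` generator, and each block from `read_blocks` could be handed to `decompress_size_prepended`. The frame format described next handles this kind of bookkeeping itself.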

Frame Format


Frames are containers that encapsulate a set of compressed blocks. Information about the blocks is stored both in the frame header and within the blocks themselves. Read more in the specs.

import safelz4

# Compress the whole file into an LZ4 frame on disk
with open("dickens.txt", "rb") as file:
    buffer = file.read()
safelz4.compress_into_file("dickens.lz4", buffer)

# Read the decompressed data back 100 bytes at a time
with safelz4.open("dickens.lz4", "rb") as f:
    while content := f.read(100):
        print(content.decode("utf-8"))
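
Reading a fixed number of bytes and decoding each chunk separately can fail when a multi-byte UTF-8 character straddles a chunk boundary. A minimal sketch of one way around this uses the standard library's incremental decoder. `iter_text` is a hypothetical helper, and it assumes only that the reader has a `read(n)` method returning bytes, as the object returned by `safelz4.open` does in the example above. An in-memory `io.BytesIO` stands in for the compressed file here.

```python
import codecs
import io
from typing import BinaryIO, Iterator


def iter_text(reader: BinaryIO, chunk_size: int = 100, encoding: str = "utf-8") -> Iterator[str]:
    """Decode a binary stream chunk by chunk without splitting characters."""
    decoder = codecs.getincrementaldecoder(encoding)()
    while chunk := reader.read(chunk_size):
        # The decoder buffers incomplete byte sequences until the next chunk.
        text = decoder.decode(chunk)
        if text:
            yield text
    # final=True raises UnicodeDecodeError if the input ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# A tiny chunk size forces multi-byte characters to be split across reads.
sample = "naïve café ☕".encode("utf-8")
pieces = list(iter_text(io.BytesIO(sample), chunk_size=3))
```

Joining `pieces` gives back the original string even though several reads end in the middle of a character.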

Benchmarks

Benchmark results are available in the benches folder. We evaluated performance in two key scenarios:

Full byte availability, where the entire buffer is accessible during compression and decompression.

Streamed access, using reader and writer interfaces with chunked input.

Summary

In full-buffer scenarios, lz4 generally performs well and often outpaces safelz4, especially on larger files (geometric mean: 1.01x faster for compression and 1.26x faster for decompression). safelz4 remains competitive and is faster on the 34k and 65k text inputs.

In reader/writer scenarios (chunked input, 1024 bytes), safelz4 significantly outperforms lz4: lz4 is about 2.5x slower in compression and about 2x slower in decompression (geometric means of 2.51x and 2.03x), with individual decompression results ranging from 1.59x to 2.48x.

Streamed access (chunk 1024 bytes)

open

Benchmark                                        safelz4  lz4
ctx_compression_writer_compression_1k.txt        8.84 us  22.5 us: 2.54x slower
ctx_compression_writer_compression_34k.txt       9.07 us  22.6 us: 2.49x slower
ctx_compression_writer_compression_65k.txt       9.18 us  23.0 us: 2.50x slower
ctx_compression_writer_compression_66k_JSON.txt  9.18 us  23.1 us: 2.51x slower
ctx_compression_writer_dickens.txt               9.16 us  23.9 us: 2.61x slower
ctx_compression_writer_hdfs.json                 9.21 us  22.9 us: 2.49x slower
ctx_compression_writer_reymont.pdf               9.26 us  22.9 us: 2.48x slower
ctx_compression_writer_xml_collection.xml        9.27 us  23.1 us: 2.49x slower
Geometric mean                                   (ref)    2.51x slower

open

Benchmark                                          safelz4  lz4
ctx_decompression_writer_compression_1k.txt        11.0 us  17.6 us: 1.59x slower
ctx_decompression_writer_compression_34k.txt       23.8 us  46.2 us: 1.94x slower
ctx_decompression_writer_compression_65k.txt       34.6 us  68.6 us: 1.98x slower
ctx_decompression_writer_compression_66k_JSON.txt  27.1 us  61.9 us: 2.28x slower
ctx_decompression_writer_dickens.txt               4.11 ms  8.67 ms: 2.11x slower
ctx_decompression_writer_hdfs.json                 1.77 ms  4.39 ms: 2.48x slower
ctx_decompression_writer_reymont.pdf               2.92 ms  5.74 ms: 1.97x slower
ctx_decompression_writer_xml_collection.xml        1.99 ms  3.97 ms: 2.00x slower
Geometric mean                                     (ref)    2.03x slower
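
The "Geometric mean" rows can be checked against the per-file timings. The sketch below assumes the summary is the geometric mean of the per-file lz4/safelz4 time ratios and uses the rounded writer-compression figures from the first table, so the result should land close to the reported 2.51x rather than match it digit for digit.

```python
from statistics import geometric_mean

# (safelz4, lz4) timings in microseconds, copied from the writer compression table.
timings_us = [
    (8.84, 22.5),  # compression_1k.txt
    (9.07, 22.6),  # compression_34k.txt
    (9.18, 23.0),  # compression_65k.txt
    (9.18, 23.1),  # compression_66k_JSON.txt
    (9.16, 23.9),  # dickens.txt
    (9.21, 22.9),  # hdfs.json
    (9.26, 22.9),  # reymont.pdf
    (9.27, 23.1),  # xml_collection.xml
]

# A ratio above 1 means lz4 took longer than safelz4 on that file.
ratios = [lz4 / safelz4 for safelz4, lz4 in timings_us]
summary = geometric_mean(ratios)
print(f"lz4 vs safelz4, geometric mean of time ratios: {summary:.2f}x")
```

A geometric mean is the usual choice for summarising ratios because a 2x slowdown and a 2x speedup cancel out, which an arithmetic mean would not do.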

Full byte availability Run(s)

frame.compress

Benchmark                             safelz4  lz4
compression_compression_1k.txt        829 ns   839 ns: 1.01x slower
compression_compression_34k.txt       26.3 us  32.5 us: 1.23x slower
compression_compression_65k.txt       49.9 us  60.1 us: 1.20x slower
compression_compression_66k_JSON.txt  26.5 us  24.7 us: 1.07x faster
compression_dickens.txt               17.0 ms  15.9 ms: 1.07x faster
compression_hdfs.json                 3.16 ms  2.63 ms: 1.20x faster
compression_reymont.pdf               12.3 ms  11.4 ms: 1.08x faster
compression_xml_collection.xml        4.58 ms  4.12 ms: 1.11x faster
Geometric mean                        (ref)    1.01x faster

frame.decompress

Benchmark                            safelz4  lz4
decompress_compression_1k.txt        612 ns   416 ns: 1.47x faster
decompress_compression_34k.txt       8.96 us  10.0 us: 1.12x slower
decompress_compression_65k.txt       15.4 us  17.1 us: 1.11x slower
decompress_compression_66k_JSON.txt  9.45 us  8.04 us: 1.18x faster
decompress_dickens.txt               4.00 ms  2.13 ms: 1.88x faster
decompress_hdfs.json                 1.50 ms  1.03 ms: 1.45x faster
decompress_reymont.pdf               2.42 ms  1.99 ms: 1.21x faster
decompress_xml_collection.xml        1.68 ms  1.19 ms: 1.41x faster
Geometric mean                       (ref)    1.26x faster

NOTE: All benchmarks were run with the Python package pyperf on a system equipped with an Apple M4 Max processor and 36 GB of unified memory.

Acknowledgement

This project acknowledges the outstanding work of Yann Collet.

Special thanks also to the maintainers of the lz4_flex Rust crate for providing a safe, pure-Rust implementation of LZ4 compression and decompression.

Other Implementations

Other LZ4 implementations for Python include:

python-lz4

Licence

MIT License
