A lossless and near-lossless compression method optimized for numbers/tensors in the Foundation Models environment

ZipNN - A Lossless Compression Library for AI pipelines

Introduction

In data compression, achieving a high compression ratio together with fast compression and decompression often requires careful consideration of the data types and the nature of the datasets being compressed. For instance, different strategies may be optimal for floating-point numbers than for integers, and datasets in monotonic order may benefit from a different preparation step.

ZipNN is a lossless and near-lossless compression method optimized for numbers/tensors in the Foundation Models environment, designed to automatically prepare the data for compression according to its type. By simply calling zipnn.compress(data), users can rely on the package to apply the most effective compression technique under the hood.

Click here to explore the options we use for different datasets and data types

With zipnn, users can focus on their core tasks without worrying about the complexities of data compression, confident that the package will deliver the best possible results for their specific data types and structures.

For more details, please see our paper: Lossless and Near-Lossless Compression for Foundation Models

Currently, ZipNN compression methods are implemented on CPUs, and GPU implementations are on the way.

Given a specific dataset, ZipNN automatically rearranges the data according to its type and applies the most effective techniques for the given instance to improve compression ratios and rates.
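
To illustrate what rearranging data by type can mean, the following is a minimal sketch of byte grouping on float32 values, using only the Python standard library. It is not ZipNN's implementation: zlib stands in for the real back-end compressors, the input is synthetic, and group_bytes / ungroup_bytes are hypothetical helper names introduced for this illustration.

```python
import random
import struct
import zlib


def group_bytes(raw: bytes, width: int) -> bytes:
    """Place byte 0 of every element first, then byte 1 of every element, and so on.

    Assumes len(raw) is a multiple of width.
    """
    return b"".join(raw[i::width] for i in range(width))


def ungroup_bytes(grouped: bytes, width: int) -> bytes:
    """Invert group_bytes by interleaving the per-position byte streams."""
    n = len(grouped) // width
    out = bytearray(len(grouped))
    for i in range(width):
        out[i::width] = grouped[i * n:(i + 1) * n]
    return bytes(out)


# Synthetic float32 "weights" in a narrow range (an assumption, not real model data).
rng = random.Random(0)
values = [rng.gauss(0.0, 0.02) for _ in range(50_000)]
raw = struct.pack(f"<{len(values)}f", *values)

plain_size = len(zlib.compress(raw))
grouped_size = len(zlib.compress(group_bytes(raw, 4)))
print(f"raw={len(raw)} plain={plain_size} grouped={grouped_size}")

# Grouping is a pure permutation of bytes, so it is fully reversible.
assert ungroup_bytes(group_bytes(raw, 4), 4) == raw
```

On inputs like this, the grouped layout usually compresses to fewer bytes because the sign and exponent bytes of neighboring values are similar, but the exact sizes depend on the data and on the compressor.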

Flow Image

Results

Below is a comparison of compression results between ZipNN and several other methods on bfloat16 data.

Compressor name | Compression ratio / Output size | Compression throughput | Decompression throughput
ZipNN v0.2.0    | 1.51 / 66.3%                    | 1120MB/sec             | 1660MB/sec
ZSTD v1.5.6     | 1.27 / 78.3%                    | 785MB/sec              | 950MB/sec
LZ4             | 1 / 100%                        | ---                    | ---
Snappy          | 1 / 100%                        | ---                    | ---

  • Gzip and Zlib achieve compression ratios similar to ZSTD, but are much slower.
  • The above results are for single-threaded compression (working with a chunk size of 256KB).
  • Results are similar for other BF16 models such as Mistral, Llama-3, Llama-3.1, Arcee-Nova and Jamba.
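
The two numbers in the "Compression ratio / Output size" column are reciprocal views of the same measurement, and the throughput columns are bytes processed per second. The sketch below shows one way such figures could be computed for single-threaded compression in 256KB chunks. It is only an illustration: zlib stands in for the compressors in the table, the input is synthetic, and measure is a hypothetical helper, not the benchmark used for the results above.

```python
import time
import zlib

CHUNK_SIZE = 256 * 1024  # 256KB chunks, matching the note above


def measure(data: bytes, chunk_size: int = CHUNK_SIZE) -> dict:
    """Compress data chunk by chunk on a single thread and report basic metrics."""
    start = time.perf_counter()
    chunks = [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval

    compressed_size = sum(len(c) for c in chunks)

    # Lossless check: decompressing every chunk must restore the input exactly.
    restored = b"".join(zlib.decompress(c) for c in chunks)
    assert restored == data

    return {
        "chunks": len(chunks),
        "ratio": len(data) / compressed_size,               # original / compressed
        "output_pct": 100.0 * compressed_size / len(data),  # compressed / original
        "compress_mb_per_s": len(data) / (1024 * 1024) / elapsed,
    }


stats = measure(bytes(range(256)) * 4096)  # 1MB of synthetic, highly repetitive data
print(stats)
```

Because ratio and output size are reciprocals, a ratio of 1 corresponds to an output size of 100%, which is how the LZ4 and Snappy rows read.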

Installation using pip

pip install zipnn

Install from source code

git clone git@github.com:zipnn/zipnn.git
cd zipnn

We are using two submodules:

  • Cyan4973/FiniteStateEntropy [https://github.com/Cyan4973/FiniteStateEntropy]
  • facebook/zstd [https://github.com/facebook/zstd] tag 1.5.6

git submodule update --init --recursive

Compile locally using pip

pip install -e .

Dependencies

This project requires the following Python packages:

  • numpy
  • zstandard
  • torch

For compression methods other than ZSTD:

  • For lz4 method: pip install lz4
  • For snappy method: pip install python-snappy
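
Since lz4 and snappy are optional, it can be useful to check which back ends are importable before choosing a method. The following is a small sketch using only the standard library. The mapping from method name to module name is an assumption based on the pip packages listed above (zstandard, lz4, and python-snappy, which is imported as snappy), and available_methods is a hypothetical helper, not part of ZipNN.

```python
from importlib.util import find_spec

# Assumed mapping: ZipNN method name -> top-level module provided by its pip package.
METHOD_MODULES = {
    "zstd": "zstandard",
    "lz4": "lz4",
    "snappy": "snappy",  # installed via "pip install python-snappy"
}


def available_methods() -> dict:
    """Return, for each method name, whether its module can be imported here."""
    return {
        method: find_spec(module) is not None
        for method, module in METHOD_MODULES.items()
    }


print(available_methods())
```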

Usage

Import zipnn

from zipnn import ZipNN

Instantiate the class:

zpn = ZipNN(method='zstd', input_format='torch')

Create a tensor of 1M (1024*1024) random numbers drawn from a uniform distribution between -1 and 1, with dtype bfloat16:

import torch
original_tensor = torch.rand(1024*1024, dtype=torch.bfloat16) * 2 - 1

Compression:

compressed_data = zpn.compress(original_tensor)

Decompression:

decompressed_data = zpn.decompress(compressed_data)

Check for correctness:

torch.equal(original_tensor, decompressed_data)

Example

Example of synthetic data

In this example, ZipNN compresses and decompresses 1MB of random numbers between -1 and 1 in torch.Tensor format.

> python3 simple_example.py
...
Are the original and decompressed byte strings the same [TORCH]?  True

Example of a real model

In this example, ZipNN and ZSTD compress and decompress 1GB of the Granite model and validate that the original file and the decompressed file are equal.
The script reads the file and compresses and decompresses in Byte format.

> python3 simple_example_granite.py
...
Are the original and decompressed byte strings the same [BYTE]?  True

Configuration

The default configuration is ByteGrouping of 4 with vanilla ZSTD (running with 8 threads), and the input and output formats are "byte". For more advanced options, please consider the following parameters:

  • method: Compression method. Supported values are zstd, lz4 and snappy (default value = 'zstd').

  • input_format: The input data format, can be one of the following: torch, numpy, byte (default value = 'byte').

  • bytearray_dtype: The data type of the byte array, if input_format is 'byte'. If input_format is torch or numpy, the dtype will be derived from the data automatically (default value = 'float32').

  • threads: The maximum threads for the compression and the bit manipulation. If 0, the code decides according to the dataset length (default value = 1).

  • compression_threshold: Compression threshold for the byte grouping. Only relevant for compression that uses byte grouping (default value = 0.95).

  • byte_reorder: Number of byte groups and the group assigned to each byte, packed into one 8-bit value. The format is the following:

    • Bit Format:

      • [7] - Group 0/1: 4th Byte
      • [6-5] - Group 0/1/2: 3rd Byte
      • [4-3] - Group 0/1/2/3: 2nd Byte
      • [2-0] - Group 0/1/2/3/4: 1st Byte
    • Examples:

      • bf16: Two groups - 0_00_01_010 (decimal 10)
      • fp32: Four groups - 1_10_11_100 (decimal 220)
      • int32: Truncate two MSBs - 0_00_01_001 (decimal 9)
  • reorder_signbit: This parameter controls the reordering of the sign bit for float32 or bfloat16 to improve compression. Options are:

    • 255: No reordering of the sign bit.
    • 16: Reorders the sign bit for bfloat16.
    • 32: Reorders the sign bit for float32.
    • 0: Automatically decides based on the data type (default value = 0).
  • compression_chunk: Chunk size for compression. (default value = 256KB).
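
The byte_reorder bit format listed above can be unpacked with a few shifts and masks. The sketch below decodes a value into its four fields and reproduces the three documented examples. The function and key names are hypothetical and introduced only for illustration; the sketch assumes nothing about how ZipNN interprets the group numbers beyond the bit layout given above.

```python
def decode_byte_reorder(value: int) -> dict:
    """Split an 8-bit byte_reorder value into its per-byte group fields."""
    if not 0 <= value <= 0xFF:
        raise ValueError("byte_reorder must fit in 8 bits")
    return {
        "byte4_group": (value >> 7) & 0b1,   # bit  [7]
        "byte3_group": (value >> 5) & 0b11,  # bits [6-5]
        "byte2_group": (value >> 3) & 0b11,  # bits [4-3]
        "byte1_group": value & 0b111,        # bits [2-0]
    }


def encode_byte_reorder(byte4: int, byte3: int, byte2: int, byte1: int) -> int:
    """Pack the four group fields back into a single 8-bit value."""
    return (byte4 << 7) | (byte3 << 5) | (byte2 << 3) | byte1


# The documented examples, written with the same underscore grouping.
print(decode_byte_reorder(0b0_00_01_010))  # decimal 10: fields 0, 0, 1, 2
print(decode_byte_reorder(0b1_10_11_100))  # decimal 220: fields 1, 2, 3, 4
print(decode_byte_reorder(0b0_00_01_001))  # decimal 9: fields 0, 0, 1, 1
```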

Click here to explore additional ZipNN configuration options

Validation test

Run tests for Byte/File input types, Byte/File compression types, Byte/File decompression types.

python3 -m unittest discover -s tests/ -p test_suit.py

Support and Questions

We are excited to hear your feedback!

For issues and feature requests, please open a GitHub issue.

Contributing

We welcome and value all contributions to the project!

Change Log

v0.2.0
  • Change the byte ordering implementation to C (for better performance).

  • Change the bfloat16/float16 implementation to a C implementation with Huffman encoding, running on chunks of 256KB each.

  • float32 uses ZSTD compression, as in v0.1.1.

  • Add support for uint32 with ZSTD compression.

v0.1.1
  • Python implementation for compressing models in float32, float16 and bfloat16 with byte ordering and ZSTD.
