
Lossy-compression utility for sequence data in NumPy


lilcom

This package lossily compresses floating-point NumPy arrays into byte strings, with an accuracy specified by the user. The main anticipated use is in machine learning applications, for storing things like training data and models.

This package requires Python 3 and is not compatible with Python 2.

Installation with PyPI

From PyPI, you can install this with:

pip3 install lilcom

Installation with conda

conda install -c lilcom lilcom

How to use

The most common usage pattern will be as follows (showing Python code):

import numpy as np
import lilcom

a = np.random.randn(300, 500)
a_compressed = lilcom.compress(a)
# a_compressed is of type `bytes`, a byte string.
# In this case it will use about 1.3 bytes per element.

# decompress a
a_decompressed = lilcom.decompress(a_compressed)

The compression is lossy, so a_decompressed will not be exactly the same as a. The amount of error (absolute, not relative!) is determined by the optional tick_power argument to lilcom.compress() (default: -8), which is the power of 2 used for the step size between discretized values. The maximum error per element is 2**(tick_power-1); for example, with tick_power=-8 the step size is 1/256 and the maximum error is 1/512.
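The error bound above follows from rounding each value to the nearest multiple of the step size. The following is a minimal sketch of that arithmetic in plain Python; it does not call lilcom, and the helper name quantize is hypothetical, used only for illustration.

```python
import random

def quantize(x, tick_power=-8):
    """Round x to the nearest multiple of the step size 2**tick_power."""
    step = 2.0 ** tick_power
    return round(x / step) * step

random.seed(0)
values = [random.gauss(0.0, 1.0) for _ in range(10000)]

tick_power = -8
bound = 2.0 ** (tick_power - 1)  # half a step: 1/512 for tick_power = -8
max_error = max(abs(v - quantize(v, tick_power)) for v in values)

# Rounding to the nearest step can never be off by more than half a step.
print(max_error <= bound)
```

Raising tick_power by 1 doubles the step size and therefore doubles the worst-case absolute error.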

Installation from GitHub

To install lilcom from GitHub, first clone the repository:

git clone https://github.com/danpovey/lilcom.git

Then run setup.py with the install argument:

python3 setup.py install

You may need to add the --user flag if you don't have system privileges. Make sure a C++ compiler such as g++ or clang is installed. To test the installation, cd to the test directory and run:

python3 test_lilcom.py

Technical details

The algorithm regresses each element on the previous element (for a 1-d array) or, for general n-d arrays, on the previous elements along each of the axes; for a 2-d array, element a[i,j] is regressed on a[i-1,j] and a[i,j-1]. The regression coefficients are global and are written as part of the header of the byte string.
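To make the 2-d regression concrete, here is a rough sketch in plain Python that fits two global coefficients by least squares and computes the prediction residuals. This is an illustration under stated assumptions, not lilcom's C++ implementation: the function names fit_coefficients and residuals, the least-squares fitting method, and the synthetic test array are all hypothetical.

```python
import math
import random

def fit_coefficients(a):
    """Least-squares fit of a[i][j] ~ c_row * a[i-1][j] + c_col * a[i][j-1]."""
    s_uu = s_uv = s_vv = s_uy = s_vy = 0.0
    for i in range(1, len(a)):
        for j in range(1, len(a[0])):
            u, v, y = a[i - 1][j], a[i][j - 1], a[i][j]
            s_uu += u * u
            s_uv += u * v
            s_vv += v * v
            s_uy += u * y
            s_vy += v * y
    # Solve the 2x2 normal equations for the two global coefficients.
    det = s_uu * s_vv - s_uv * s_uv
    c_row = (s_uy * s_vv - s_uv * s_vy) / det
    c_col = (s_uu * s_vy - s_uv * s_uy) / det
    return c_row, c_col

def residuals(a, c_row, c_col):
    """Prediction errors for every element that has both neighbours."""
    return [a[i][j] - (c_row * a[i - 1][j] + c_col * a[i][j - 1])
            for i in range(1, len(a)) for j in range(1, len(a[0]))]

# Synthetic smooth array with a little noise (illustrative data only).
random.seed(0)
a = [[math.sin(0.1 * i + 0.05 * j) + random.gauss(0.0, 0.01)
      for j in range(40)] for i in range(30)]

c_row, c_col = fit_coefficients(a)
res = residuals(a, c_row, c_col)
signal_energy = sum(a[i][j] ** 2 for i in range(1, 30) for j in range(1, 40))
residual_energy = sum(r * r for r in res)
print(residual_energy / signal_energy)  # much smaller than 1 for smooth data
```

When neighbouring elements are correlated, the residuals are much smaller than the original values, which is what makes them cheap to encode.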

The elements are then integerized and the integers are compressed using an algorithm that gives good compression when successive elements tend to have about the same magnitude (the number of bits we're transmitting varies dynamically according to the magnitudes of the elements).
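The sketch below illustrates why a magnitude-dependent bit width helps. It assumes integerization means dividing by the step size 2**tick_power and rounding, and it simply counts a sign bit plus magnitude bits per element. The helpers integerize and signed_bits are hypothetical, and lilcom's actual bit-packing scheme is not described here and may differ.

```python
def integerize(x, tick_power=-8):
    """Express x as an integer number of steps of size 2**tick_power."""
    return round(x / 2.0 ** tick_power)

def signed_bits(n):
    """Bits needed for one sign bit plus the magnitude of n."""
    return 1 + abs(n).bit_length()

# Illustrative values: runs of small magnitudes and a run of large ones.
values = [0.004, -0.008, 0.002, 0.5, 0.45, -0.6, 0.003]
ints = [integerize(v) for v in values]
bits = [signed_bits(n) for n in ints]

variable_total = sum(bits)           # width follows each element's magnitude
fixed_total = max(bits) * len(ints)  # every element padded to the widest
print(ints, bits, variable_total, fixed_total)
```

A real coder also has to signal the bit width to the decoder, which is cheap when successive elements have similar magnitudes because the width then changes rarely.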

The core parts of the code are implemented in C++.
