Tamp

Tamp is a low-memory, DEFLATE-inspired lossless compression library.

Features

  • Various implementations available:

    • Pure Python:

      • tamp/compressor.py, tamp/decompressor.py

      • When available, Tamp will use a Python-bound C implementation for speed.

    • C library:

      • tamp/_c_src/

  • High compression ratios and low memory use.

  • Compact compression and decompression implementations.

    • Compiled C library is <4KB (compressor + decompressor).

  • Mid-stream flushing.

    • Allows for submission of messages while continuing to compress subsequent data.

  • Customizable dictionary for greater compression of small messages.

  • Convenient CLI interface.

Installation

Tamp contains 3 implementations:

  1. A desktop CPython implementation that is optimized for readability.

  2. A MicroPython viper implementation that is optimized for runtime performance.

  3. A C implementation (with Python bindings) for accelerated desktop use and for use in C projects.

Desktop Python

The Tamp library and CLI require Python >=3.8 and can be installed via:

pip install tamp

MicroPython

For MicroPython use, there are 3 main files:

  1. tamp/__init__.py - Always required.

  2. tamp/decompressor_viper.py - Required for on-device decompression.

  3. tamp/compressor_viper.py - Required for on-device compression.

For example, if on-device decompression isn’t used, then do not include decompressor_viper.py. If manually installing, just copy these files to your microcontroller’s /lib/tamp folder.

If using mip, tamp can be installed by specifying the appropriate package-*.json file.

mip install github:brianpugh/tamp  # Defaults to package.json: Compressor & Decompressor
mip install github:brianpugh/tamp/package-compressor.json  # Compressor only
mip install github:brianpugh/tamp/package-decompressor.json  # Decompressor only

If using Belay, tamp can be installed by adding the following to pyproject.toml.

[tool.belay.dependencies]
tamp = [
   "https://github.com/BrianPugh/tamp/blob/main/tamp/__init__.py",
   "https://github.com/BrianPugh/tamp/blob/main/tamp/compressor_viper.py",
   "https://github.com/BrianPugh/tamp/blob/main/tamp/decompressor_viper.py",
]

C

Copy the tamp/_c_src/tamp folder into your project. For more information, see the documentation.

Usage

Tamp works on desktop Python and MicroPython. On desktop, Tamp is bundled with the tamp command line tool for compressing and decompressing tamp files.

CLI

Compression

Use tamp compress to compress a file or stream. If no input file is specified, data from stdin will be read. If no output is specified, the compressed output stream will be written to stdout.

$ tamp compress --help

 Usage: tamp compress [OPTIONS] [INPUT_PATH]

 Compress an input file or stream.

╭─ Arguments ────────────────────────────────────────────────────────────────────────╮
   input_path      [INPUT_PATH]  Input file to compress or decompress. Defaults to  
                                 stdin.                                             
╰────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ──────────────────────────────────────────────────────────────────────────╮
 --output   -o      PATH                      Output file. Defaults to stdout.      
 --window   -w      INTEGER RANGE [8<=x<=15]  Number of bits used to represent the  
                                              dictionary window.                    
                                              [default: 10]                         
 --literal  -l      INTEGER RANGE [5<=x<=8]   Number of bits used to represent a    
                                              literal.                              
                                              [default: 8]                          
 --help                                       Show this message and exit.           
╰────────────────────────────────────────────────────────────────────────────────────╯

Example usage:

tamp compress enwik8 -o enwik8.tamp  # Compress a file
echo "hello world" | tamp compress | wc -c  # Compress a stream and print the compressed size.

The following options can impact compression ratios and memory usage:

  • window - The compressor looks back over the previous 2^window plaintext bytes to find a matching pattern. A larger window increases the chance of finding a longer match, but uses more memory, increases compression time, and makes each pattern token take up more space. Try smaller window values when compressing highly repetitive data or short messages.

  • literal - Number of bits used to represent each plaintext byte. For example, if all input data is 7-bit ASCII, then setting this to 7 will improve literal compression ratios by 11.1%. The default, 8 bits, can encode any binary data.
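The 11.1% figure can be reproduced with a little arithmetic. The sketch below assumes that each literal token costs one flag bit plus `literal` payload bits; that token layout is an assumption not stated above, and the helper names are hypothetical rather than part of the tamp package.

```python
# Assumption: a literal token is 1 flag bit followed by `literal` payload bits.
# These helpers are illustrative only; they are not part of the tamp package.


def literal_token_bits(literal: int) -> int:
    """Bits needed to emit one literal token under the assumed layout."""
    if not 5 <= literal <= 8:
        raise ValueError("literal must be in the CLI's documented range 5..8")
    return 1 + literal


def literal_savings(literal: int, baseline: int = 8) -> float:
    """Fractional size reduction of literal tokens relative to `baseline` bits."""
    return 1 - literal_token_bits(literal) / literal_token_bits(baseline)


def window_bytes(window: int) -> int:
    """Number of plaintext bytes the compressor can look back over."""
    if not 8 <= window <= 15:
        raise ValueError("window must be in the CLI's documented range 8..15")
    return 1 << window


if __name__ == "__main__":
    print(f"{literal_savings(7):.1%}")  # 11.1%
    print(window_bytes(10))             # 1024
```

Under this assumption, dropping from 8 to 7 literal bits shrinks each literal token from 9 bits to 8, which is where the 1/9 saving comes from.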

Decompression

Use tamp decompress to decompress a file or stream. If no input file is specified, data from stdin will be read. If no output is specified, the decompressed output stream will be written to stdout.

$ tamp decompress --help

 Usage: tamp decompress [OPTIONS] [INPUT_PATH]

 Decompress an input file or stream.

╭─ Arguments ────────────────────────────────────────────────────────────────────────╮
   input_path      [INPUT_PATH]  Input file. If not provided, reads from stdin.     
╰────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ──────────────────────────────────────────────────────────────────────────╮
 --output  -o      PATH  Output file. Defaults to stdout.                           
 --help                  Show this message and exit.                                
╰────────────────────────────────────────────────────────────────────────────────────╯

Example usage:

tamp decompress enwik8.tamp -o enwik8
echo "hello world" | tamp compress | tamp decompress

Python

The Python library can perform one-shot compression, as well as operate on files/streams.

import tamp

# One-shot compression
string = b"I scream, you scream, we all scream for ice cream."
compressed_data = tamp.compress(string)
reconstructed = tamp.decompress(compressed_data)
assert reconstructed == string

# Streaming compression
with tamp.open("output.tamp", "wb") as f:
    for _ in range(10):
        f.write(string)

# Streaming decompression
with tamp.open("output.tamp", "rb") as f:
    reconstructed = f.read()

Benchmark

In the following section, we compare Tamp against:

  • zlib, a Python builtin gzip-compatible DEFLATE compression library.

  • heatshrink, a data compression library for embedded/real-time systems. Heatshrink has similar goals to Tamp.

All of these are LZ-based compression algorithms, and tests were performed using a 1KB (10 bit) window. Since zlib already uses significantly more memory by default, the lowest memory level (memLevel=1) was used in these benchmarks. It should be noted that higher zlib memory levels will have greater compression ratios than Tamp. Currently, there is no MicroPython-compatible zlib or heatshrink compression implementation, so these numbers are provided simply as a reference.
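For readers who want to try a comparable zlib configuration, here is a minimal sketch using Python's standard zlib module with a 10-bit window and memLevel=1, as described above. The compression level (9) and the use of a zlib-wrapped stream are assumptions; the exact settings used for the benchmarks are not given here.

```python
import zlib

WINDOW_BITS = 10  # 1KB window, as stated for the benchmarks
MEM_LEVEL = 1     # lowest zlib memory level, as stated for the benchmarks
LEVEL = 9         # assumption: the compression level is not stated above


def zlib_low_memory_compress(data: bytes) -> bytes:
    """Compress `data` with a small window and the lowest zlib memLevel."""
    compressor = zlib.compressobj(
        level=LEVEL,
        method=zlib.DEFLATED,
        wbits=WINDOW_BITS,
        memLevel=MEM_LEVEL,
    )
    return compressor.compress(data) + compressor.flush()


def zlib_decompress(data: bytes) -> bytes:
    """Decompress a zlib stream; the default window accepts smaller ones."""
    return zlib.decompress(data)


if __name__ == "__main__":
    sample = b"I scream, you scream, we all scream for ice cream." * 10
    packed = zlib_low_memory_compress(sample)
    assert zlib_decompress(packed) == sample
    print(len(sample), len(packed))
```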

Compression Ratio

The following table shows compression algorithm performance over a variety of input data sourced from the Silesia Corpus and Enwik8. This should give a general idea of how these algorithms perform over a variety of input data types.

All sizes are in bytes.

dataset                        raw         tamp         zlib   heatshrink
enwik8                 100,000,000   51,635,633   56,205,166   56,110,394
build/silesia/dickens   10,192,446    5,546,761    6,049,169    6,155,768
build/silesia/mozilla   51,220,480   25,121,385   25,104,966   25,435,908
build/silesia/mr         9,970,564    5,027,032    4,864,734    5,442,180
build/silesia/nci       33,553,445    8,643,610    5,765,521    8,247,487
build/silesia/ooffice    6,152,192    3,814,938    4,077,277    3,994,589
build/silesia/osdb      10,085,684    8,520,835    8,625,159    8,747,527
build/silesia/reymont    6,627,202    2,847,981    2,897,661    2,910,251
build/silesia/samba     21,606,400    9,102,594    8,862,423    9,223,827
build/silesia/sao        7,251,944    6,137,755    6,506,417    6,400,926
build/silesia/webster   41,458,703   18,694,172   20,212,235   19,942,817
build/silesia/x-ray      8,474,240    7,510,606    7,351,750    8,059,723
build/silesia/xml        5,345,280    1,681,687    1,586,985    1,665,179

In the table above, Tamp produces smaller output than heatshrink on 11 of the 13 datasets and smaller output than zlib on 7 of 13, so it is generally very competitive with zlib. Although this is intended to be an apples-to-apples comparison, zlib still uses significantly more memory during both compression and decompression (see next section). Tamp achieves this competitive performance while using around 10x less memory.
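To make the comparison concrete, the following sketch recomputes compression ratios and head-to-head counts from the sizes in the table above. The data is copied from the table; the helper names are illustrative only.

```python
# (raw, tamp, zlib, heatshrink) sizes in bytes, copied from the table above.
SIZES = {
    "enwik8": (100_000_000, 51_635_633, 56_205_166, 56_110_394),
    "dickens": (10_192_446, 5_546_761, 6_049_169, 6_155_768),
    "mozilla": (51_220_480, 25_121_385, 25_104_966, 25_435_908),
    "mr": (9_970_564, 5_027_032, 4_864_734, 5_442_180),
    "nci": (33_553_445, 8_643_610, 5_765_521, 8_247_487),
    "ooffice": (6_152_192, 3_814_938, 4_077_277, 3_994_589),
    "osdb": (10_085_684, 8_520_835, 8_625_159, 8_747_527),
    "reymont": (6_627_202, 2_847_981, 2_897_661, 2_910_251),
    "samba": (21_606_400, 9_102_594, 8_862_423, 9_223_827),
    "sao": (7_251_944, 6_137_755, 6_506_417, 6_400_926),
    "webster": (41_458_703, 18_694_172, 20_212_235, 19_942_817),
    "x-ray": (8_474_240, 7_510_606, 7_351_750, 8_059_723),
    "xml": (5_345_280, 1_681_687, 1_586_985, 1_665_179),
}
RAW, TAMP, ZLIB, HEATSHRINK = range(4)  # column positions within each row


def ratio(dataset: str, column: int) -> float:
    """Compression ratio (raw size / compressed size) for one table cell."""
    row = SIZES[dataset]
    return row[RAW] / row[column]


def tamp_wins(rival_column: int) -> int:
    """Number of datasets where Tamp's output is smaller than the rival's."""
    return sum(1 for row in SIZES.values() if row[TAMP] < row[rival_column])


if __name__ == "__main__":
    print(f"enwik8 tamp ratio: {ratio('enwik8', TAMP):.2f}")
    print(f"tamp beats heatshrink on {tamp_wins(HEATSHRINK)} of {len(SIZES)}")
    print(f"tamp beats zlib on {tamp_wins(ZLIB)} of {len(SIZES)}")
```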

Memory Usage

The following table shows approximately how much memory each algorithm uses during compression and decompression.

Action         tamp               zlib                          heatshrink
Compression    (1 << windowBits)  (1 << (windowBits+2)) + 7 KB  (1 << windowBits)
Decompression  (1 << windowBits)  (1 << windowBits) + 7 KB      (1 << windowBits)

Both tamp and heatshrink have a few dozen bytes of overhead in addition to the primary window buffer; this overhead is implementation-specific and is ignored here for clarity.
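As a worked example, the formulas in the table can be evaluated for the 10-bit window used in the benchmarks. This sketch assumes that "7 KB" means 7 * 1024 bytes and, like the table, ignores the few dozen bytes of per-implementation overhead.

```python
KB = 1024  # assumption: "7 KB" in the table means 7 * 1024 bytes


def tamp_bytes(window_bits: int) -> int:
    """Approximate tamp memory use (same for compression and decompression)."""
    return 1 << window_bits


def heatshrink_bytes(window_bits: int) -> int:
    """Approximate heatshrink memory use without an index."""
    return 1 << window_bits


def zlib_compress_bytes(window_bits: int) -> int:
    """Approximate zlib memory use during compression."""
    return (1 << (window_bits + 2)) + 7 * KB


def zlib_decompress_bytes(window_bits: int) -> int:
    """Approximate zlib memory use during decompression."""
    return (1 << window_bits) + 7 * KB


if __name__ == "__main__":
    bits = 10
    print(tamp_bytes(bits))             # 1024
    print(zlib_compress_bytes(bits))    # 11264
    print(zlib_decompress_bytes(bits))  # 8192
```

With a 10-bit window, zlib needs roughly 11x the memory of tamp for compression and 8x for decompression under these formulas.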

Runtime

As a rough benchmark, here is the performance (in seconds) of these different compression algorithms on the 100MB enwik8 dataset. These tests were performed on an M1 MacBook Air.

Action         tamp (Python Reference)  tamp (C)  zlib  heatshrink (with index)  heatshrink (without index)
Compression    109.5                    16.45     4.84  6.22                     41.729
Decompression  54.0                     0.70      0.98  0.82                     0.82

Heatshrink v0.4.1 was used in these benchmarks. When heatshrink uses an index, an additional (1 << (windowBits + 1)) bytes of memory are used, tripling the memory requirement. Tamp could use similar indexing to increase compression speed, but does not, in order to stay focused on its primary goal of being a low-memory compressor.

To give an idea of Tamp’s speed on an embedded device, the following table shows compression/decompression speed in bytes/second for the first 100KB of enwik8 on a Raspberry Pi Pico (RP2040) at the default 125MHz clock rate. This isn’t exactly an apples-to-apples comparison because the C benchmark uses neither a filesystem nor dynamic memory allocation (and thus has less overhead), but it is good enough to get the idea across.
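The memory-for-speed trade-off of the heatshrink index can be sketched as follows. The formula and timings come from the text and the runtime table above; the function name is hypothetical.

```python
def heatshrink_compress_bytes(window_bits: int, use_index: bool) -> int:
    """Approximate heatshrink compression memory, with or without an index."""
    window = 1 << window_bits
    index = (1 << (window_bits + 1)) if use_index else 0
    return window + index


# enwik8 compression times in seconds, copied from the runtime table.
SECONDS_WITH_INDEX = 6.22
SECONDS_WITHOUT_INDEX = 41.729

if __name__ == "__main__":
    plain = heatshrink_compress_bytes(10, use_index=False)
    indexed = heatshrink_compress_bytes(10, use_index=True)
    print(plain, indexed, indexed // plain)  # 1024 3072 3
    print(f"{SECONDS_WITHOUT_INDEX / SECONDS_WITH_INDEX:.1f}x faster")
```

For a 10-bit window the index adds 2048 bytes on top of the 1024-byte window, which is the tripling mentioned above.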

Action         tamp (MicroPython Viper)  tamp (C)
Compression    ~4,300                    ~28,500
Decompression  ~42,000                   ~1,042,524
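These throughput figures can be turned into rough wall-clock estimates. The sketch below assumes that "100KB" means 100 * 1024 bytes and that throughput stays constant over the input; both are simplifications, and the names are illustrative only.

```python
# Approximate throughput in bytes/second, copied from the table above.
THROUGHPUT = {
    ("compression", "viper"): 4_300,
    ("compression", "c"): 28_500,
    ("decompression", "viper"): 42_000,
    ("decompression", "c"): 1_042_524,
}

PAYLOAD_BYTES = 100 * 1024  # assumption: "100KB" means 100 * 1024 bytes


def seconds_for(action: str, implementation: str, n_bytes: int = PAYLOAD_BYTES) -> float:
    """Estimated time to process `n_bytes` at the tabulated throughput."""
    return n_bytes / THROUGHPUT[(action, implementation)]


def c_speedup(action: str) -> float:
    """How many times faster the C build is than the Viper build."""
    return THROUGHPUT[(action, "c")] / THROUGHPUT[(action, "viper")]


if __name__ == "__main__":
    for action in ("compression", "decompression"):
        print(
            f"{action}: viper {seconds_for(action, 'viper'):.1f}s, "
            f"C {seconds_for(action, 'c'):.2f}s, "
            f"speedup {c_speedup(action):.1f}x"
        )
```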

Binary Size

To give an idea of the resulting binary sizes, Tamp and other libraries were compiled for the Pi Pico (armv6m). All libraries were compiled with -O3. Sizes are reported in bytes.

Library             Compressor  Decompressor  Compressor + Decompressor
Tamp (MicroPython)  4429        4205          7554
Tamp (C)            2008        1972          3864
Heatshrink          2956        3876          6832
uzlib               2355        3963          6318

Heatshrink doesn’t include a high-level API; in an apples-to-apples comparison the Tamp library would be even smaller.
