Python bindings for libhuffman

libhuffman - The Huffman coding library

The Huffman library is a simple, pure C library for encoding and decoding data using a frequency-sorted binary tree. The implementation is straightforward; for background on Huffman coding, see the Wikipedia article on the subject.
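To make the idea of a frequency-sorted binary tree concrete, here is a minimal Python sketch of Huffman code construction. It is an illustration of the general technique only, not the C implementation used by libhuffman; the function name `build_codes` is hypothetical.

```python
import heapq
from collections import Counter


def build_codes(data: bytes) -> dict[int, str]:
    """Return a mapping from byte value to its Huffman bit string."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The tie breaker keeps the heap from ever comparing two dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


codes = build_codes(b"aaaabbc")
encoded = "".join(codes[b] for b in b"aaaabbc")
```

The most frequent byte receives the shortest code, so frequent symbols cost fewer bits than rare ones.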

Installation

The build mechanism of the library is based on the CMake tool, so you can install it on your distribution in the following way:

$ sudo cmake install

By default, the command above installs the library into /usr/local/lib and all required headers into /usr/local/include. The installation process is organized using CMake: create a new build directory and generate the required makefiles from inside it:

$ mkdir -p build
$ cd build
$ cmake ..

After that, run the install target:

$ make install

Usage

Encoding

To encode data, open either a file stream with huf_fdopen or an in-memory stream with huf_memopen. In the following example, both the input and the output of the encoder are in-memory buffers of 1 MiB.

void *bufin = NULL, *bufout = NULL;
huf_read_writer_t *input = NULL, *output = NULL;

// Allocate the necessary memory.
huf_memopen(&input, &bufin, HUF_1MIB_BUFFER);
huf_memopen(&output, &bufout, HUF_1MIB_BUFFER);

// Write the data for encoding to the input.
size_t input_len = 10;
input->write(input->stream, "0123456789", input_len);

Create a configuration used to encode the input string using the Huffman algorithm:

huf_config_t config = {
   .reader = input,
   .length = input_len,
   .writer = output,
   .blocksize = HUF_64KIB_BUFFER,
};

huf_error_t err = huf_encode(&config);
printf("%s\n", huf_error_string(err));

The fields of the configuration are:

  • reader - input ready for the encoding.
  • writer - output for the encoded data.
  • length - length of the data to encode, in bytes.
  • blocksize - the length of each chunk in bytes (instead of reading the file twice, libhuffman reads and encodes data by blocks).
  • reader_buffer_size - the opaque reader buffer size in bytes; if the buffer size is set to zero, all reads are unbuffered.
  • writer_buffer_size - the opaque writer buffer size in bytes; if the buffer size is set to zero, all writes are unbuffered.
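The role of blocksize can be sketched as follows: rather than scanning the whole input once to count symbol frequencies and a second time to encode it, the input is cut into fixed-size blocks that are processed one at a time. The Python sketch below only illustrates that concept. It assumes one frequency table per block, the helper name `iter_blocks` is hypothetical, and it says nothing about the actual format libhuffman produces.

```python
from collections import Counter
from typing import Iterator


def iter_blocks(data: bytes, blocksize: int) -> Iterator[bytes]:
    """Yield consecutive chunks of at most `blocksize` bytes."""
    if blocksize <= 0:
        raise ValueError("blocksize must be positive")
    for start in range(0, len(data), blocksize):
        yield data[start:start + blocksize]


data = b"aaaabbbbccccdddd"
# One frequency table per block (an assumption made for illustration).
tables = [Counter(block) for block in iter_blocks(data, 8)]
```

A smaller block size bounds memory use per step, at the cost of building more tables.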

During encoding, the output memory buffer may be grown automatically to fit all encoded bytes. To retrieve the new length of the buffer, use huf_memlen:

size_t out_size = 0;
huf_memlen(output, &out_size);

// The data is accessible through the `bufout` variable or using the `read` function:
uint8_t result[10] = {0};
size_t result_len = 10;

// result_len is an in/out parameter; after reading from the stream
// it contains the length of the encoded data that was read.
output->read(output->stream, result, &result_len);

Decoding

Decoding is similar to encoding, except that the reader attribute of the configuration should contain the data to decode:

input->write(input->stream, decoding, decoding_len);

huf_config_t config = {
    .reader = input,
    .length = input_len,
    .writer = output,
    .blocksize = HUF_64KIB_BUFFER,
};

// After decoding, the original data will be written to the `output`.
huf_decode(&config);
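Conceptually, decoding walks the encoded bit stream and emits a symbol each time the accumulated bits match a code word. The following Python sketch shows this with a hard-coded, hypothetical code table. How libhuffman stores its tree alongside the encoded data is not covered here, and `decode_bits` is an illustrative name, not part of the library.

```python
def decode_bits(bits: str, codes: dict[int, str]) -> bytes:
    """Decode a string of '0'/'1' characters using a prefix-free code table."""
    lookup = {code: sym for sym, code in codes.items()}
    out = bytearray()
    current = ""
    for bit in bits:
        current += bit
        # Because the code is prefix-free, the first match is the only match.
        if current in lookup:
            out.append(lookup[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code word")
    return bytes(out)


# Hypothetical prefix-free table for the symbols a, b and c.
table = {ord("a"): "1", ord("b"): "01", ord("c"): "00"}
decoded = decode_bits("1111010100", table)
```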

Resource Deallocation

Once encoding or decoding is complete, free the allocated resources:

// This does not free the underlying buffer; call free on the buffer separately.
huf_memclose(&input);
huf_memclose(&output);

free(bufin);
free(bufout);

For more examples, please refer to the tests directory.

Python Bindings

Python bindings for the libhuffman library are distributed as a PyPI package. To install the package, execute the following command:

pip install huffmanfile

You can use libhuffman for fast Huffman compression and decompression. The interface of the Python library is similar to the interface of the bz2 and lzma packages from Python's standard library.
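For orientation, these are the standard-library calls that the huffmanfile interface is modeled on. The sketch uses bz2 only, so it runs without libhuffman installed; which of these names huffmanfile provides, beyond those shown in this document, is not confirmed here.

```python
import bz2

data = b"Insert Data Here" * 4

# One-shot compression and decompression in memory.
packed = bz2.compress(data)
restored = bz2.decompress(packed)

# Incremental compression with a compressor object.
compressor = bz2.BZ2Compressor()
parts = [compressor.compress(data[:10]), compressor.compress(data[10:])]
parts.append(compressor.flush())
incremental = b"".join(parts)
```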

Examples of usage

Reading in a compressed file:

import huffmanfile
with huffmanfile.open("file.hm") as f:
    file_content = f.read()

Creating a compressed file:

import huffmanfile
data = b"Insert Data Here"
with huffmanfile.open("file.hm", "w") as f:
    f.write(data)

Compressing data in memory:

import huffmanfile
data_in = b"Insert Data Here"
data_out = huffmanfile.compress(data_in)

Incremental compression:

import huffmanfile
hfc = huffmanfile.HuffmanCompressor()
out1 = hfc.compress(b"Some data\n")
out2 = hfc.compress(b"Another piece of data\n")
out3 = hfc.compress(b"Even more data\n")
out4 = hfc.flush()
# Concatenate all the partial results:
result = b"".join([out1, out2, out3, out4])

Note that random data tends to compress poorly, while ordered, repetitive data usually yields a high compression ratio.
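One way to see why is to estimate the Shannon entropy of the byte distribution, which is a lower bound on the average number of bits per byte that a symbol-by-symbol code such as Huffman coding can reach. A small sketch using only the standard library; the function name `entropy_bits_per_byte` is illustrative and not part of huffmanfile.

```python
import math
import os
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the empirical byte distribution, in bits."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())


repetitive = b"ab" * 4096
random_like = os.urandom(8192)

low = entropy_bits_per_byte(repetitive)    # two equally likely symbols
high = entropy_bits_per_byte(random_like)  # close to 8 bits per byte
```

With two equally frequent byte values, about one bit per byte suffices, whereas uniformly random bytes leave almost nothing to save.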

License

The Huffman library is distributed under the MIT license, so you are free to do whatever you want with the code. See the LICENSE file for the full license text.

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

huffmanfile-1.0.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (96.5 kB)
CPython 3.12; manylinux: glibc 2.17+ x86-64

huffmanfile-1.0.4-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (88.0 kB)
CPython 3.12; manylinux: glibc 2.17+ i686; manylinux: glibc 2.5+ i686

huffmanfile-1.0.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (95.9 kB)
CPython 3.11; manylinux: glibc 2.17+ x86-64

huffmanfile-1.0.4-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (87.8 kB)
CPython 3.11; manylinux: glibc 2.17+ i686; manylinux: glibc 2.5+ i686

huffmanfile-1.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (95.9 kB)
CPython 3.10; manylinux: glibc 2.17+ x86-64

huffmanfile-1.0.4-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (87.7 kB)
CPython 3.10; manylinux: glibc 2.17+ i686; manylinux: glibc 2.5+ i686
