A C++ implementation with Python bindings of StreamVByte.

libstreamvbyte


Table of Contents
  1. About The Project
  2. Getting Started
  3. Roadmap
  4. Contributing
  5. License
  6. Reference
  7. Contact

About The Project

libstreamvbyte is a C++ implementation of StreamVByte, with Python bindings using pybind11.

StreamVByte is an integer compression technique that uses SIMD instructions (vectorization) to improve performance. The library is optimized for CPUs with the SSSE3 instruction set, which most x86_64 processors support. It can also be used on ARM processors and 32-bit architectures, although it falls back to a scalar implementation in those cases.
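To make the idea concrete, here is a minimal scalar sketch of the general StreamVByte scheme in pure Python: each integer is stored in 1 to 4 data bytes, and a separate stream of control bytes holds a 2-bit length code per integer. The function names and the exact layout (all control bytes first, then the data bytes) are assumptions made for illustration; this is not libstreamvbyte's implementation and its output is not guaranteed to match the library's byte format.

```python
def byte_length(value: int) -> int:
    """Return how many bytes (1 to 4) a 32-bit unsigned value needs."""
    if value < (1 << 8):
        return 1
    if value < (1 << 16):
        return 2
    if value < (1 << 24):
        return 3
    return 4


def svb_encode_sketch(values: list[int]) -> bytes:
    """Encode unsigned 32-bit integers as control bytes followed by data bytes."""
    control = bytearray((len(values) + 3) // 4)  # one control byte per 4 integers
    data = bytearray()
    for i, value in enumerate(values):
        if not 0 <= value < (1 << 32):
            raise ValueError("value does not fit in 32 unsigned bits")
        n = byte_length(value)
        # Store (length - 1) in the 2-bit slot that belongs to this integer.
        control[i // 4] |= (n - 1) << (2 * (i % 4))
        data += value.to_bytes(n, "little")
    return bytes(control) + bytes(data)


def svb_decode_sketch(buf: bytes, count: int) -> list[int]:
    """Decode `count` integers produced by svb_encode_sketch."""
    pos = (count + 3) // 4  # data bytes start right after the control bytes
    out = []
    for i in range(count):
        n = ((buf[i // 4] >> (2 * (i % 4))) & 0b11) + 1
        out.append(int.from_bytes(buf[pos:pos + n], "little"))
        pos += n
    return out


values = [1, 300, 70000, 2**32 - 1, 5]
encoded = svb_encode_sketch(values)
assert svb_decode_sketch(encoded, len(values)) == values
```

In this sketch the decoder needs the element count because the encoded buffer does not record it. Keeping the length codes apart from the data bytes is what lets a SIMD implementation decode several integers at once.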

With libstreamvbyte, you can compress integer sequences quickly, reducing the storage space and network bandwidth they require. The library is easy to use from Python through its pybind11 bindings. Whether you're working with large datasets or building a distributed computing system, libstreamvbyte can help you reduce the resources needed to handle your data.

Currently supports Python 3.10+ on Windows, Linux (manylinux_2_17, musllinux_1_1) and macOS (universal2).

Getting Started

Installation

For Python

Install from PyPI using pip.

pip install libstreamvbyte

Or install from a .whl file.

pip install "path/to/your/downloaded/whl"

To find the appropriate .whl file, see the releases page.

For C++

You must have CMake installed on your system.

# clone the repo
git clone https://github.com/wst24365888/libstreamvbyte
cd libstreamvbyte

# build and install
cmake .
make
sudo make install

Usage

For Python

Import libstreamvbyte first.

import libstreamvbyte as svb

The module exposes the following functions.

# Encode an array of unsigned integers into a byte array.
encode(arg0: numpy.ndarray[numpy.uint32]) -> numpy.ndarray[numpy.uint8]

# Decode a byte array into an array of unsigned integers.
# arg1 is the number of integers to decode.
decode(arg0: numpy.ndarray[numpy.uint8], arg1: int) -> numpy.ndarray[numpy.uint32]

# Encode an array of signed integers into an array of unsigned integers.
encode_zigzag(arg0: numpy.ndarray[numpy.int32]) -> numpy.ndarray[numpy.uint32]

# Decode an array of unsigned integers into an array of signed integers.
decode_zigzag(arg0: numpy.ndarray[numpy.uint32]) -> numpy.ndarray[numpy.int32]
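Zigzag encoding maps signed integers to unsigned ones so that values of small magnitude, whether positive or negative, become small unsigned values that need few bytes once compressed. The sketch below shows the standard 32-bit zigzag mapping in pure Python. That encode_zigzag and decode_zigzag use exactly this mapping is an assumption, and the function names here are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def zigzag_encode_sketch(n: int) -> int:
    """Map a signed 32-bit integer to an unsigned one: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    if not -(1 << 31) <= n < (1 << 31):
        raise ValueError("value does not fit in 32 signed bits")
    # The arithmetic shift n >> 31 is 0 for non-negative n and -1 for negative n.
    return ((n << 1) ^ (n >> 31)) & MASK32


def zigzag_decode_sketch(z: int) -> int:
    """Invert zigzag_encode_sketch."""
    return (z >> 1) ^ -(z & 1)


for n in (0, -1, 1, -2, 2**31 - 1, -2**31):
    assert zigzag_decode_sketch(zigzag_encode_sketch(n)) == n
```

Without this step, a negative value such as -1 would be stored as a large unsigned number in two's complement and would occupy the full 4 bytes.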

For C++

Include streamvbyte.h first.

#include "streamvbyte.h"

For the APIs, please refer to include/streamvbyte.h.

Example

For Python

import numpy as np
import libstreamvbyte as svb

N = 2**20 + 2

# type(original_data) == np.ndarray
# original_data.dtype == np.int32
original_data = np.random.randint(-2**31, 2**31, N, dtype=np.int32)

# type(compressed_bytes) == np.ndarray
# compressed_bytes.dtype == np.uint8
compressed_bytes = svb.encode(svb.encode_zigzag(original_data))

# type(recovered_data) == np.ndarray
# recovered_data.dtype == np.int32
recovered_data = svb.decode_zigzag(svb.decode(compressed_bytes, N))

For C++

#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

#include "streamvbyte.h"

int main() {
    std::size_t N = (1 << 20) + 2;

    std::vector<int32_t> original_data(N);
    for (std::size_t i = 0; i < N; i++) {
        original_data[i] = rand() - rand();
    }

    std::vector<uint8_t> compressed_bytes = streamvbyte::encode(streamvbyte::encode_zigzag(original_data));
    std::vector<int32_t> recovered_data = streamvbyte::decode_zigzag(streamvbyte::decode(compressed_bytes, N));

    return 0;
}

Compile it, linking against libstreamvbyte.

g++ -o example example.cpp -lstreamvbyte

Roadmap

  • Zigzag encoding/decoding.
  • Support ARM processors with NEON intrinsics.
  • Differential coding (delta encoding/decoding).
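For readers unfamiliar with the last item: differential coding stores the gaps between consecutive values rather than the values themselves, so a sorted or slowly changing sequence turns into small numbers that compress into fewer bytes. A minimal pure-Python sketch follows; delta_encode and delta_decode are hypothetical names and are not part of libstreamvbyte's API.

```python
def delta_encode(values: list[int]) -> list[int]:
    """Replace each value with its difference from the previous one (the first from 0)."""
    previous = 0
    deltas = []
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas: list[int]) -> list[int]:
    """Rebuild the original values by taking a running sum of the deltas."""
    total = 0
    values = []
    for delta in deltas:
        total += delta
        values.append(total)
    return values


timestamps = [1000, 1003, 1010, 1010]
assert delta_encode(timestamps) == [1000, 3, 7, 0]
assert delta_decode(delta_encode(timestamps)) == timestamps
```

If the input is not sorted, the deltas can be negative, which is where a zigzag step before byte compression becomes useful.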

See the open issues for a full list of proposed features (and known issues).

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feat/amazing-feature)
  3. Commit your Changes with Conventional Commits
  4. Push to the Branch (git push origin feat/amazing-feature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Reference

Contact

Author

Project Link
