A C++ implementation with Python bindings of StreamVByte.

libstreamvbyte


Table of Contents
  1. About The Project
  2. Getting Started
  3. Roadmap
  4. Contributing
  5. License
  6. Reference
  7. Contact

About The Project

libstreamvbyte is a C++ implementation of StreamVByte, with Python bindings using pybind11.

StreamVByte is an integer compression technique that uses SIMD instructions (vectorization) to improve performance. The library is optimized for CPUs with the SSSE3 instruction set, which most x86_64 processors support. It also runs on ARM processors and 32-bit architectures, where it falls back to a scalar implementation.
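
To make the idea concrete, here is a minimal scalar sketch of the general StreamVByte layout in pure Python: a run of control bytes, each holding four 2-bit length codes, followed by the variable-length data bytes. This is an illustration under stated assumptions, not libstreamvbyte's implementation. The names `byte_length`, `svb_encode`, and `svb_decode` are hypothetical, and the exact byte layout the library produces may differ.

```python
def byte_length(value: int) -> int:
    """Number of bytes (1 to 4) needed to store a 32-bit unsigned value."""
    if value < (1 << 8):
        return 1
    if value < (1 << 16):
        return 2
    if value < (1 << 24):
        return 3
    return 4


def svb_encode(values: list[int]) -> bytes:
    """Encode unsigned 32-bit integers as control bytes followed by data bytes."""
    control = bytearray((len(values) + 3) // 4)
    data = bytearray()
    for i, value in enumerate(values):
        if not 0 <= value < (1 << 32):
            raise ValueError("value does not fit in 32 bits")
        n = byte_length(value)
        # Each control byte holds four 2-bit codes; a code stores (length - 1).
        control[i // 4] |= (n - 1) << (2 * (i % 4))
        data += value.to_bytes(n, "little")
    return bytes(control + data)


def svb_decode(encoded: bytes, count: int) -> list[int]:
    """Decode `count` integers; the count is not stored in the stream."""
    pos = (count + 3) // 4  # data bytes start right after the control bytes
    values = []
    for i in range(count):
        n = ((encoded[i // 4] >> (2 * (i % 4))) & 0b11) + 1
        values.append(int.from_bytes(encoded[pos:pos + n], "little"))
        pos += n
    return values


values = [1, 300, 70000, 2**32 - 1]
encoded = svb_encode(values)
assert svb_decode(encoded, len(values)) == values
print(len(encoded))  # 1 control byte + (1 + 2 + 3 + 4) data bytes = 11
```

Keeping the length codes apart from the data is what makes the format friendly to SIMD: a vectorized decoder can read one control byte and expand four integers at once, where this sketch loops over them one at a time.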

With libstreamvbyte, you can quickly and efficiently compress integer sequences, reducing the amount of storage space and network bandwidth required. The library is easy to use and integrates seamlessly with Python via pybind11 bindings. Whether you're working with large datasets or building a distributed computing system, libstreamvbyte can help you improve performance and reduce the resources needed to handle your data.

Currently supports Python 3.10+ on Windows, Linux (manylinux_2_17, musllinux_1_1) and macOS (universal2).

Getting Started

Installation

For Python

Install from PyPI using pip.

pip install libstreamvbyte

Or install from a .whl file.

pip install "path/to/your/downloaded/whl"

To find the appropriate .whl file, see the project's releases page.

For C++

You must have CMake installed on your system.

# clone the repo
git clone https://github.com/wst24365888/libstreamvbyte
cd libstreamvbyte

# build and install
cmake .
make
sudo make install

Usage

For Python

Import libstreamvbyte first.

import libstreamvbyte as svb

And here are the APIs.

# Encode an array of unsigned integers into a byte array.
encode(arg0: numpy.ndarray[numpy.uint32]) -> numpy.ndarray[numpy.uint8]

# Decode a byte array into an array of unsigned integers.
# arg1 is the number of integers to decode.
decode(arg0: numpy.ndarray[numpy.uint8], arg1: int) -> numpy.ndarray[numpy.uint32]

# Encode an array of signed integers into an array of unsigned integers.
encode_zigzag(arg0: numpy.ndarray[numpy.int32]) -> numpy.ndarray[numpy.uint32]

# Decode an array of unsigned integers into an array of signed integers.
decode_zigzag(arg0: numpy.ndarray[numpy.uint32]) -> numpy.ndarray[numpy.int32]

# Check if the current wheel is a vectorized version.
is_vectorized_version() -> bool
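
The zigzag functions map signed integers to unsigned ones so that values of small magnitude, positive or negative, stay small and compress well. The sketch below shows the commonly used zigzag definition for 32-bit integers in pure Python. It assumes `encode_zigzag` and `decode_zigzag` follow that definition, which this README does not state. The names `zigzag_encode` and `zigzag_decode` are hypothetical and operate on single Python integers, not NumPy arrays.

```python
def zigzag_encode(n: int) -> int:
    """Map a signed 32-bit integer to an unsigned one: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    if not -(1 << 31) <= n < (1 << 31):
        raise ValueError("value does not fit in a signed 32-bit integer")
    # Python's >> on a negative int is an arithmetic shift, so n >> 31 is -1 or 0.
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def zigzag_decode(u: int) -> int:
    """Inverse of zigzag_encode."""
    return (u >> 1) ^ -(u & 1)


print([zigzag_encode(n) for n in (0, -1, 1, -2, 2)])  # [0, 1, 2, 3, 4]
```

Without this step, a negative number such as -1 would be stored as 0xFFFFFFFF and always take four bytes in a byte-oriented encoding.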

For C++

Include streamvbyte.h first.

#include "streamvbyte.h"

For the APIs, please refer to include/streamvbyte.h.

Example

For Python

import numpy as np
import libstreamvbyte as svb

N = 2**20 + 2

# type(original_data) == np.ndarray
# original_data.dtype == np.int32
original_data = np.random.randint(-2**31, 2**31, N, dtype=np.int32)

# type(compressed_bytes) == np.ndarray
# compressed_bytes.dtype == np.uint8
compressed_bytes = svb.encode(svb.encode_zigzag(original_data))

# type(recovered_data) == np.ndarray
# recovered_data.dtype == np.int32
recovered_data = svb.decode_zigzag(svb.decode(compressed_bytes, N))

For C++

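#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

#include "streamvbyte.h"

int main() {
    std::size_t N = (1 << 20) + 2;

    // Fill the input with signed values of mixed sign.
    std::vector<int32_t> original_data(N);
    for (std::size_t i = 0; i < N; ++i) {
        original_data[i] = std::rand() - std::rand();
    }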

    std::vector<uint8_t> compressed_bytes = streamvbyte::encode(streamvbyte::encode_zigzag(original_data));
    std::vector<int32_t> recovered_data = streamvbyte::decode_zigzag(streamvbyte::decode(compressed_bytes, N));

    return 0;
}

Compile it and link against libstreamvbyte.

g++ -o example example.cpp -lstreamvbyte

Roadmap

  • Zigzag encoding/decoding.
  • Support ARM processors with NEON intrinsics.
  • Differential coding (delta encoding/decoding).

See the open issues for a full list of proposed features (and known issues).
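
For readers unfamiliar with the differential coding item, the sketch below shows what delta encoding generally means: store each value as its difference from the previous one, so sorted or slowly changing sequences turn into small numbers that need fewer bytes. This is a generic illustration in pure Python, not the library's API. The names `delta_encode` and `delta_decode` are hypothetical.

```python
def delta_encode(values: list[int]) -> list[int]:
    """Replace each value with its difference from the previous one."""
    deltas = []
    previous = 0  # the first value is stored relative to 0, i.e. unchanged
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas: list[int]) -> list[int]:
    """Rebuild the original values with a running sum."""
    values = []
    total = 0
    for delta in deltas:
        total += delta
        values.append(total)
    return values


sorted_ids = [1000, 1003, 1010, 1011]
deltas = delta_encode(sorted_ids)
print(deltas)  # [1000, 3, 7, 1]
assert delta_decode(deltas) == sorted_ids
```

For unsorted input the differences can be negative, which is where a zigzag step before byte encoding becomes useful.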

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feat/amazing-feature)
  3. Commit your Changes with Conventional Commits
  4. Push to the Branch (git push origin feat/amazing-feature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Reference

Contact

Author

Project Link
