A Python frontend for the splinter-rs bitmap compression library

Splynters v0.1.2

A Python package for efficient compression of sparse bitmaps

Splynters is a Python wrapper around the splinter-rs library for zero-copy querying of compressed bitmaps.

We support Python >=3.8 on recent versions of Windows, macOS, and manylinux.

We also provide benchmarks against the PyRoaring implementation of Roaring Bitmaps, which you can view in the python/benchmarking.ipynb file and run for yourself with run_all_benchmarks.py.

This package can be installed from PyPI using

pip install splynters

Examples

Create a Splinter

Splinter bitmaps can be created from Python iterables, as follows:

import splynters
from splynters import Splinter

data = [1, 5, 789423, 23]
s = Splinter.from_list(data)

Add and remove elements

Elements can be added and removed in the same manner as with Python sets. Two removal methods are provided: .remove() raises a KeyError if the element is not present, while .discard() does nothing in that case.

s.add(6)

s.remove(5)

# will throw an error! uncomment to see:
# > KeyError: 'remove() could not find the key 99 in the splinter. For a fault-tolerant alternative to remove(), consider discard()'
# s.remove(99) 

s.discard(99) # will fail silently
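The difference between the two removal methods can be wrapped in a small helper when the caller wants to know whether anything was actually removed. The following is a minimal sketch, not part of splynters: remove_or_report is a hypothetical name, and it is exercised here with a built-in set, which raises the same KeyError for a missing element. It assumes a Splinter can be passed in the same way, based on the KeyError shown above.

```python
def remove_or_report(container, value):
    """Remove value from container; return False instead of raising if absent.

    Hypothetical helper: works with any object whose remove() raises KeyError
    for a missing element, such as a built-in set.
    """
    try:
        container.remove(value)
    except KeyError:
        return False
    return True


elements = {1, 5, 23, 789423}
print(remove_or_report(elements, 5))   # 5 was present, so it is removed
print(remove_or_report(elements, 99))  # 99 was absent, so nothing changes
```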

Check for an element

We can check for the presence of an element using the .contains() method. It accepts either a single element or an iterable of elements, and returns a single boolean or an array of booleans, respectively.

We also provide a .contains_many_parallel() method, which parallelizes the search to speed up lookups of many elements at once. The regular .contains() method is fast enough that the parallel version is unlikely to be faster unless you are searching for more than 10,000 elements.

data = [1, 5, 789423, 23]
s = Splinter.from_list(data)

s.add(6)
s.remove(5)

assert(s.contains(1))
assert(not s.contains(99))
assert(all(s.contains([1, 23, 789423]))) # one boolean per element, so combine them with all()
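The choice between the two lookup methods can be made automatically from the batch size. The sketch below is one possible way to do that. contains_batch and PARALLEL_THRESHOLD are hypothetical names, and the code assumes that .contains_many_parallel() takes a list of integers and returns booleans in the same order as .contains(), which should be checked against the installed version.

```python
PARALLEL_THRESHOLD = 10_000  # rough cut-off from the note above; tune by benchmarking


def contains_batch(splinter, values, threshold=PARALLEL_THRESHOLD):
    """Return one boolean per value, choosing the lookup method by batch size.

    Assumption: contains_many_parallel() accepts a list and returns results
    in input order, like contains() does for an iterable.
    """
    values = list(values)
    if len(values) > threshold:
        return list(splinter.contains_many_parallel(values))
    return list(splinter.contains(values))
```

A non-empty list of booleans is always truthy, so wrap the result in all() or any() before using it in a condition, for example all(contains_batch(s, [1, 23])).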

Membership of a single element can also be checked with Python's in operator:

assert(1 in s)
assert(99 not in s)

Elements and ranges of elements can also be accessed using Python's index and slice syntax, with negative numbers counting from the end:

assert(s[2] == 23) # access the third element
assert(s[-1] == 789423) # access the last element
assert(s[1:] == [6, 23, 789423]) # access all elements starting from the second
assert(s[0::2] == [1, 23]) # access every other element starting from the first
assert(s[::-1] == [789423, 23, 6, 1]) # access all elements backwards
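One way to reason about these results is to picture the elements as a list sorted in ascending order, so that an index is a position in that order, not an insertion position. The sketch below reproduces the same assertions with a plain sorted list. It is only a mental model inferred from the examples above, not a description of how Splinter stores or retrieves elements.

```python
# Same sequence of changes as above, applied to a built-in set.
elements = {1, 5, 789423, 23}
elements.add(6)
elements.remove(5)

ordered = sorted(elements)  # ascending order: [1, 6, 23, 789423]

assert ordered[2] == 23                      # third smallest element
assert ordered[-1] == 789423                 # largest element
assert ordered[1:] == [6, 23, 789423]        # everything after the smallest
assert ordered[0::2] == [1, 23]              # every other element
assert ordered[::-1] == [789423, 23, 6, 1]   # descending order
```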

Set operations

Splinters support the standard set operations (intersection, union, symmetric difference, and difference) through Python's bitwise operators:

data1 = [1, 5, 6, 789423, 23]
data2 = [1, 5, 789423, 23, 42]

s1 = Splinter.from_list(data1)
s2 = Splinter.from_list(data2)

splinter_intersection = s1 & s2
splinter_union = s1 | s2
splinter_xor = s1 ^ s2
splinter_sub = s1 - s2

assert(splinter_intersection.to_list() == [1, 5, 23, 789423])
assert(splinter_union.to_list() == [1, 5, 6, 23, 42, 789423]) # note: the output is sorted low to high, not necessarily in input order
assert(splinter_xor.to_list() == [6, 42]) 
assert(splinter_sub.to_list() == [6])
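Because these operators mirror Python's built-in set operators, a built-in set can serve as a reference when testing your own data. The sketch below is a hypothetical helper (check_against_sets is not part of splynters). It takes a constructor such as Splinter.from_list, applies each operator, and compares the sorted .to_list() output with the result for built-in sets.

```python
import operator

# Operator symbols mapped to the functions that implement them.
SET_OPERATORS = {
    "&": operator.and_,
    "|": operator.or_,
    "^": operator.xor,
    "-": operator.sub,
}


def check_against_sets(make, data1, data2):
    """Compare each binary operator on make(...) objects with built-in sets.

    make is a constructor such as Splinter.from_list; results must expose
    to_list(). Returns the symbols of any operators whose output differs.
    """
    mismatches = []
    for symbol, op in SET_OPERATORS.items():
        expected = sorted(op(set(data1), set(data2)))
        actual = op(make(data1), make(data2)).to_list()
        if actual != expected:
            mismatches.append(symbol)
    return mismatches
```

With splynters installed, check_against_sets(Splinter.from_list, data1, data2) would be expected to return an empty list for the data above.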

The in-place (augmented assignment) forms of these operators are also supported:

s1 = Splinter.from_list(data1)
s1 &= s2
assert(s1.to_list() == [1, 5, 23, 789423])

s1 = Splinter.from_list(data1)
s1 |= s2
assert(s1.to_list() == [1, 5, 6, 23, 42, 789423])

s1 = Splinter.from_list(data1)
s1 ^= s2
assert(s1.to_list() == [6, 42])

s1 = Splinter.from_list(data1)
s1 -= s2
assert(s1.to_list() == [6])

Explicit set-comparison methods are available as well:

data1 = [1, 5, 789423, 23, 42]
data2 = [1, 5, 789423, 42]
data3 = [2, 83, 98798]

s1 = Splinter.from_list(data1)
s2 = Splinter.from_list(data2)
s3 = Splinter.from_list(data3)

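s1.isdisjoint(s3) # s1 and s3 share no elements
s2.issubset(s1) # every element of s2 is also in s1
s1.issuperset(s2) # equivalently, s1 contains every element of s2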

Splinters also support comparison operators for equality, subsets, and proper subsets:

data1 = [1, 5, 6, 789423, 23]
data2 = [1, 5, 789423, 23, 42]
data3 = [1, 5, 789423, 42]

s1 = Splinter.from_list(data1)
s2 = Splinter.from_list(data2)
s3 = Splinter.from_list(data3)

assert(not s1==s2) # s1 and s2 are not equal
assert(s1 != s2)

assert(not s1 <= s2) # s1 is not a subset of s2
assert(not s1 < s2) # as a consequence of the above, s1 is also not a proper subset of s2
assert(s3 <= s2) # but s3 is a subset of s2
assert(s3 < s2) # and would you look at that, it's also a proper subset!

assert(not s1 >= s2) # s2 is not a subset of s1
assert(not s1 > s2) # s2 is not a proper subset of s1
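These comparisons follow the same conventions as Python's built-in sets, so the expected results can be checked without splynters installed. The sketch below repeats the comparisons on plain sets built from the same data and shows how the operators line up with the named methods. It illustrates the semantics only and does not exercise the Splinter implementation.

```python
a = {1, 5, 6, 789423, 23}
b = {1, 5, 789423, 23, 42}
c = {1, 5, 789423, 42}

assert a != b                               # different elements
assert not a <= b                           # 6 is in a but not in b
assert c <= b and c.issubset(b)             # <= matches issubset()
assert c < b                                # proper subset: b also has 23
assert not a >= b and not a.issuperset(b)   # >= matches issuperset()
assert c.isdisjoint({2, 83, 98798})         # no shared elements
```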

Compression Optimization

A Splinter object can be further compressed to its minimum memory footprint using the .optimize() method.

This operation is computationally expensive. Call it before serializing data or after very large changes to reduce the size in memory; calling it in a tight loop or after small changes is not recommended.

data = [1, 5, 789423, 23]
s = Splinter.from_list(data)

# ...
# a lot of changes
# ...

s.optimize()
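In practice this usually means batching changes and optimizing once per batch. The helper below is a minimal sketch of that pattern. apply_changes is a hypothetical name, and it relies only on the .add(), .discard(), and .optimize() methods described in this document.

```python
def apply_changes(splinter, to_add=(), to_remove=()):
    """Apply a batch of additions and removals, then optimize once at the end."""
    for value in to_add:
        splinter.add(value)
    for value in to_remove:
        splinter.discard(value)  # discard() ignores values that are absent
    splinter.optimize()  # one expensive call per batch, not one per change
    return splinter
```

A call such as apply_changes(s, to_add=range(1000), to_remove=[5, 99]) would then trigger a single optimization pass after all 1,002 changes.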

Serialization, Deserialization, and Pickling

A Splinter object can be serialized to bytes using the .to_bytes() method, and deserialized using .from_bytes().

data = [1, 5, 789423, 23]
s = Splinter.from_list(data)

b = s.to_bytes()

s_but_fancy_this_time = Splinter.from_bytes(b)
assert(s == s_but_fancy_this_time)
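The serialized bytes can be written to disk and read back later. The following sketch shows one way to do that with pathlib. save_bytes and load_bytes are hypothetical helper names, and the code assumes only that .to_bytes() returns a bytes object and that .from_bytes() accepts one, as described above.

```python
from pathlib import Path


def save_bytes(splinter, path):
    """Write the serialized form of splinter to path; return the byte count."""
    return Path(path).write_bytes(splinter.to_bytes())


def load_bytes(cls, path):
    """Rebuild an object from path using cls.from_bytes (e.g. cls=Splinter)."""
    return cls.from_bytes(Path(path).read_bytes())
```

For example, save_bytes(s, "bitmap.splinter") followed by load_bytes(Splinter, "bitmap.splinter") should give back an object equal to s; the file name here is arbitrary.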

In addition, a Splinter object's summary can be displayed by printing it (or, in a REPL, by evaluating its name), and its elements can be listed with .to_list():

print(s)
# > SplinterWrapper(len = 4, compressed_byte_size = 35)

print(s.to_list())
# > [1, 5, 23, 789423]

Splinter also implements __getstate__ and __setstate__, so that Splinters can be serialized and deserialized with pickle.

import pickle

s = Splinter.from_list([1, 5, 23, 789423])

pickled_s = pickle.dumps(s)
unpickled_s = pickle.loads(pickled_s)
assert(s == unpickled_s)
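For readers unfamiliar with the pickle hooks, the toy class below shows how __getstate__ and __setstate__ let an object be pickled as a single bytes payload. It is an illustration of the protocol only: BytesBacked and its text encoding are made up for this example and say nothing about how splynters implements these methods internally.

```python
import pickle


class BytesBacked:
    """Toy container that pickles itself as a single bytes payload."""

    def __init__(self, values):
        self.values = sorted(set(values))

    def to_bytes(self):
        # Toy encoding: comma-separated decimal text, not the Splinter format.
        return ",".join(map(str, self.values)).encode("ascii")

    def __getstate__(self):
        # pickle stores whatever this returns in place of the instance __dict__.
        return self.to_bytes()

    def __setstate__(self, state):
        # pickle calls this on a blank instance with the stored payload.
        text = state.decode("ascii")
        self.values = [int(v) for v in text.split(",")] if text else []


original = BytesBacked([1, 5, 23, 789423])
restored = pickle.loads(pickle.dumps(original))
assert restored.values == original.values
```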

Dependencies

At present, splynters has no additional dependencies.

Roadmap

  • Keep splynters up to date with the splinter-rs API and capabilities
  • Benchmark runtime performance of major operations against PyRoaring
  • Add optional numpy dependency to enable faster serialization to and from numpy arrays, Pandas series, and Polars series.
