Skip to main content

Contains a variety of ordered structures, in particular a SkipList.

Project description

SkipList

This project contains a SkipList implementation in C++ with Python bindings.

A SkipList behaves as an always sorted list with, typically, O(log(n)) cost for insertion, look-up and removal. This makes it ideal for such operations as computing the rolling median of a large dataset.

See the full documentation on this project at ReadTheDocs.

A SkipList is implemented as a singly linked list of ordered nodes where each node participates in a subset of, sparser, linked lists. These additional 'sparse' linked lists provide rapid indexing and mutation of the underlying linked list. It is a probabilistic data structure using a random function to determine how many 'sparse' linked lists any particular node participates in. As such SkipList is an alternative to binary tree, Wikipedia has a introductory page on SkipLists.

An advantage claimed for SkipLists are that the insert and remove logic is simpler (however I do not subscribe to this). The drawbacks of a SkipList include its larger space requirements and its O(log(N)) lookup behaviour compared to other, more restricted and specialised, data structures that may have either faster runtime behaviour or lower space requirements or both.

This project contains a SkipList implementation in C++ with Python bindings with:

  • No capacity restrictions apart from available memory.
  • Works with any C++ type that has meaningful comparison operators.
  • The C++ SkipList can be compiled as thread safe.
  • The Python SkipList is thread safe.
  • The SkipList has exhaustive internal integrity checks.
  • Python SkipLists can be long/float/bytes/object types, the latter can have user defined comparison functions.
  • With Python 3.8+ SkipLists can be combined with the multiprocessing.shared_memory module for concurrent operation on large arrays. For example concurrent rolling medians which speed up near linearly with the number of cores.
  • The implementation is extensively performance tested in C++ and Python.

There are a some novel features to this implementation:

Credits

Originally written by Paul Ross with credits to: Wilfred Hughes (AHL), Luke Sewell (AHL) and Terry Tsantagoeds (AHL).

Installation

C++

This SkipList requires:

  • A C++11 compiler.
  • -I<skiplist>/src/cpp as an include path.
  • <skiplist>/src/cpp/SkipList.cpp to be compiled/linked.
  • The macro SKIPLIST_THREAD_SUPPORT set if you want a thread safe SkipList using C++ mutexes.

Python

This SkipList version supports Python 3.6, 3.7, 3.8, 3.9 (and, probably, some earlier Python 3 versions). Earlier versions supported Python 2.7, this version might still do that.

From PyPi

$ pip install orderedstructs

From source

$ git clone https://github.com/paulross/skiplist.git
$ cd <skiplist>
$ python setup.py install

Testing

This SkipList has extensive tests for correctness and performance.

C++

To run all the C++ functional and performance tests:

$ cd <skiplist>/src/cpp
$ make release
$ ./SkipList_R.exe

To run the C++ functional tests with agressive internal integrity checks:

$ cd <skiplist>/src/cpp
$ make debug
$ ./SkipList_D.exe

To run all the C++ functional and performance tests for a thread safe SkipList:

$ cd <skiplist>/src/cpp
$ make release CXXFLAGS=-DSKIPLIST_THREAD_SUPPORT
$ ./SkipList_R.exe

Python

Testing requires pytest and hypothesis:

To run all the C++ functional and performance tests:

$ cd <skiplist>
$ py.test -vs tests/

Examples

Here are some examples of using a SkipList in your code:

C++

#include "SkipList.h"
    
// Declare with any type that has sane comparison.
OrderedStructs::SkipList::HeadNode<double> sl;

sl.insert(42.0);
sl.insert(21.0);
sl.insert(84.0);

sl.has(42.0) // true
sl.size()    // 3
sl.at(1)     // 42.0, throws OrderedStructs::SkipList::IndexError if index out of range

sl.remove(21.0); // throws OrderedStructs::SkipList::ValueError if value not present

sl.size()    // 2
sl.at(1)     // 84.0

The C++ SkipList is thread safe when compiled with the macro SKIPLIST_THREAD_SUPPORT, then a SkipList can then be shared across threads:

#include <thread>
#include <vector>

#include "SkipList.h"

void do_something(OrderedStructs::SkipList::HeadNode<double> *pSkipList) {
    // Insert/remove items into *pSkipList
    // Read items inserted by other threads.
}

OrderedStructs::SkipList::HeadNode<double> sl;
std::vector<std::thread> threads;

for (size_t i = 0; i < thread_count; ++i) {
    threads.push_back(std::thread(do_something, &sl));
}
for (auto &t: threads) {
    t.join();
}
// The SkipList now contains the totality of the thread actions.

Python

An example of using a SkipList of always ordered floats:

import orderedstructs

# Declare with a type. Supported types are long/float/bytes/object.
sl = orderedstructs.SkipList(float)

sl.insert(42.0)
sl.insert(21.0)
sl.insert(84.0)

sl.has(42.0) # True
sl.size()    # 3
sl.at(1)     # 42.0

sl.has(42.0) # True
sl.size()    # 3
sl.at(1)     # 42.0, raises IndexError if index out of range

sl.remove(21.0); # raises ValueError if value not present

sl.size()    # 2
sl.at(1)     # 84.0

The Python SkipList can be used with user defined objects with a user defined sort order. In this example the last name of the person takes precedence over the first name:

import functools

@functools.total_ordering
class Person:
    """Simple example of ordering based on last name/first name."""
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    def __eq__(self, other):
        try:
            return self.last_name == other.last_name and self.first_name == other.first_name
        except AttributeError:
            return NotImplemented

    def __lt__(self, other):
        try:
            return self.last_name < other.last_name or self.first_name < other.first_name
        except AttributeError:
            return NotImplemented

    def __str__(self):
        return '{}, {}'.format(self.last_name, self.first_name)

import orderedstructs

sl = orderedstructs.SkipList(object)

sl.insert(Person('Peter', 'Pan'))
sl.insert(Person('Alan', 'Pan'))
assert sl.size() == 2
assert str(sl.at(0)) == 'Pan, Alan' 
assert str(sl.at(1)) == 'Pan, Peter' 

The Python SkipList is thread safe when using any acceptable Python type even if that type has user defined comparison methods. This uses Pythons mutex machinery which is independent of C++ mutexes.

History

0.3.7 (2021-12-18)

  • Fix build on GCC (Issue #8).

0.3.6 (2021-12-18)

  • Add documentation on NaN in rolling median.
  • Add plots using shared_memory.
  • Add Python 3.10 support.

0.3.5 (2021-05-02)

  • Fix uncaught exception when trying to remove a NaN.

0.3.4 (2021-04-28)

  • Improve documentation mainly around multiprocessing.shared_memory and tests.

0.3.3 (2021-03-25)

  • Add Python benchmarks, fix some build issues.

0.3.2 (2021-03-18)

  • Fix lambda issues with Python 3.8, 3.9.

0.3.1 (2021-03-17)

  • Support Python 3.7, 3.8, 3.9.

0.3.0 (2017-08-18)

  • Public release.
  • Allows storing of PyObject* and rich comparison.

0.2.0

Python module now named orderedstructs.

0.1.0

Initial release.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

orderedstructs-0.3.7.tar.gz (33.5 kB view details)

Uploaded Source

Built Distributions

orderedstructs-0.3.7-cp310-cp310-macosx_10_9_universal2.whl (49.0 kB view details)

Uploaded CPython 3.10 macOS 10.9+ universal2 (ARM64, x86-64)

orderedstructs-0.3.7-cp39-cp39-macosx_10_9_x86_64.whl (48.9 kB view details)

Uploaded CPython 3.9 macOS 10.9+ x86-64

orderedstructs-0.3.7-cp38-cp38-macosx_10_9_x86_64.whl (49.1 kB view details)

Uploaded CPython 3.8 macOS 10.9+ x86-64

orderedstructs-0.3.7-cp37-cp37m-macosx_10_9_x86_64.whl (49.0 kB view details)

Uploaded CPython 3.7m macOS 10.9+ x86-64

orderedstructs-0.3.7-cp36-cp36m-macosx_10_9_x86_64.whl (49.1 kB view details)

Uploaded CPython 3.6m macOS 10.9+ x86-64

File details

Details for the file orderedstructs-0.3.7.tar.gz.

File metadata

  • Download URL: orderedstructs-0.3.7.tar.gz
  • Upload date:
  • Size: 33.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.9.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7

File hashes

Hashes for orderedstructs-0.3.7.tar.gz
Algorithm Hash digest
SHA256 da0d0df214d6e4208578bce0b481abca4298a97e913181193ef64c5631470e4a
MD5 3cfeaedf0877157167af22c7878d3867
BLAKE2b-256 0ea9fa679983c8cce81d596704bf83b48dfb2814511328a6e54856f934e021f2

See more details on using hashes here.

File details

Details for the file orderedstructs-0.3.7-cp310-cp310-macosx_10_9_universal2.whl.

File metadata

  • Download URL: orderedstructs-0.3.7-cp310-cp310-macosx_10_9_universal2.whl
  • Upload date:
  • Size: 49.0 kB
  • Tags: CPython 3.10, macOS 10.9+ universal2 (ARM64, x86-64)
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.9.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7

File hashes

Hashes for orderedstructs-0.3.7-cp310-cp310-macosx_10_9_universal2.whl
Algorithm Hash digest
SHA256 1f5f57769179764aa132d68e391e01b3cf6a3f10961db650d8ddf7bc5358cd76
MD5 e879e9a3ce397a122eb3fbaca8c4ff6c
BLAKE2b-256 aaa2ef36e5a82f1fb91e87ac22b646a0651c0db5c9ac415579ddf5e36a4adc5f

See more details on using hashes here.

File details

Details for the file orderedstructs-0.3.7-cp39-cp39-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: orderedstructs-0.3.7-cp39-cp39-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 48.9 kB
  • Tags: CPython 3.9, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.9.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7

File hashes

Hashes for orderedstructs-0.3.7-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 dae811c9f429be57daefe7d3980a17764d88501a0e7be9b54a562e75e9b520d2
MD5 b9a4e519ce30bb71b36dd3d85c60ff21
BLAKE2b-256 b85e27ff9342b82bf724b6e7e92d2a963d000b0641f9df1a4a161dc83244c579

See more details on using hashes here.

File details

Details for the file orderedstructs-0.3.7-cp38-cp38-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: orderedstructs-0.3.7-cp38-cp38-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 49.1 kB
  • Tags: CPython 3.8, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.9.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7

File hashes

Hashes for orderedstructs-0.3.7-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 6ec5fc22e2567df297f9d84425e381e91f78084473a79c14441751a5c29c738e
MD5 092162669ae7c4f478be113eb02d8e09
BLAKE2b-256 34e61c47282cd8d8c51b513893cbef46033a63ace4e185e1ac1a13c6b49b3520

See more details on using hashes here.

File details

Details for the file orderedstructs-0.3.7-cp37-cp37m-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: orderedstructs-0.3.7-cp37-cp37m-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 49.0 kB
  • Tags: CPython 3.7m, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.9.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7

File hashes

Hashes for orderedstructs-0.3.7-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 fb8d603542eb8c8ba4828d742a6c17df630612e64b55cae8e9229c841f9248bb
MD5 73291d44b9297bc44a4bad61b54fb526
BLAKE2b-256 1736fd90a9b2005112820d0d83f6c3def2d207234cdf4b30732abacfe2baf0a1

See more details on using hashes here.

File details

Details for the file orderedstructs-0.3.7-cp36-cp36m-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: orderedstructs-0.3.7-cp36-cp36m-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 49.1 kB
  • Tags: CPython 3.6m, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.9.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7

File hashes

Hashes for orderedstructs-0.3.7-cp36-cp36m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 6e3df67c4cd452462a9b78661cee4a7097a502a23125276000fd1ffeecd63734
MD5 bec49f820f59165ba93b001bcbfbcf4e
BLAKE2b-256 d7f49586a09b523cf79fb5aa2538da154da75771af4e65d2572c0a3d8e91da3e

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page