Skip to main content

Contains a variety of ordered structures, in particular a SkipList.

Project description

SkipList

This project contains a SkipList implementation in C++ with Python bindings.

A SkipList behaves as an always sorted list with, typically, O(log(n)) cost for insertion, look-up and removal. This makes it ideal for such operations as computing the rolling median of a large dataset.

See the full documentation on this project at ReadTheDocs.

A SkipList is implemented as a singly linked list of ordered nodes where each node participates in a subset of, sparser, linked lists. These additional 'sparse' linked lists provide rapid indexing and mutation of the underlying linked list. It is a probabilistic data structure using a random function to determine how many 'sparse' linked lists any particular node participates in. As such SkipList is an alternative to binary tree, Wikipedia has a introductory page on SkipLists.

An advantage claimed for SkipLists are that the insert and remove logic is simpler (however I do not subscribe to this). The drawbacks of a SkipList include its larger space requirements and its O(log(N)) lookup behaviour compared to other, more restricted and specialised, data structures that may have either faster runtime behaviour or lower space requirements or both.

This project contains a SkipList implementation in C++ with Python bindings with:

  • No capacity restrictions apart from available memory.
  • Works with any C++ type that has meaningful comparison operators.
  • The C++ SkipList can be compiled as thread safe.
  • The Python SkipList is thread safe.
  • The SkipList has exhaustive internal integrity checks.
  • Python SkipLists can be long/float/bytes/object types, the latter can have user defined comparison functions.
  • With Python 3.8+ SkipLists can be combined with the multiprocessing.shared_memory module for concurrent operation on large arrays. For example concurrent rolling medians.
  • This implementation is extensively performance tested in C++ and Python.

There are a some novel features to this implementation:

Credits

Originally written by Paul Ross with credits to: Wilfred Hughes (AHL), Luke Sewell (AHL) and Terry Tsantagoeds (AHL).

Installation

C++

This SkipList requires:

  • A C++11 compiler.
  • -I<skiplist>/src/cpp as an include path.
  • <skiplist>/src/cpp/SkipList.cpp to be compiled/linked.
  • The macro SKIPLIST_THREAD_SUPPORT set if you want a thread safe SkipList using C++ mutexes.

Python

This SkipList supports Python 2.7 and 3.6, 3.7, 3.8, 3.9 (and, probably, some earlier Python 3 versions).

From PyPi

$ pip install orderedstructs

From source

$ git clone https://github.com/paulross/skiplist.git
$ cd <skiplist>
$ python setup.py install

Testing

This SkipList has extensive tests for correctness and performance.

C++

To run all the C++ functional and performance tests:

$ cd <skiplist>/src/cpp
$ make release
$ ./SkipList_R.exe

To run the C++ functional tests with agressive internal integrity checks:

$ cd <skiplist>/src/cpp
$ make debug
$ ./SkipList_D.exe

To run all the C++ functional and performance tests for a thread safe SkipList:

$ cd <skiplist>/src/cpp
$ make release CXXFLAGS=-DSKIPLIST_THREAD_SUPPORT
$ ./SkipList_R.exe

Python

Testing requires pytest and hypothesis:

To run all the C++ functional and performance tests:

$ cd <skiplist>
$ py.test -vs tests/

Examples

Here are some examples of using a SkipList in your code:

C++

#include "SkipList.h"

// Declare with any type that has sane comparison.
OrderedStructs::SkipList::HeadNode<double> sl;

sl.insert(42.0);
sl.insert(21.0);
sl.insert(84.0);

sl.has(42.0) // true
sl.size()    // 3
sl.at(1)     // 42.0, throws OrderedStructs::SkipList::IndexError if index out of range

sl.remove(21.0); // throws OrderedStructs::SkipList::ValueError if value not present

sl.size()    // 2
sl.at(1)     // 84.0

The C++ SkipList is thread safe when compiled with the macro SKIPLIST_THREAD_SUPPORT, then a SkipList can then be shared across threads:

#include <thread>
#include <vector>

#include "SkipList.h"

void do_something(OrderedStructs::SkipList::HeadNode<double> *pSkipList) {
    // Insert/remove items into *pSkipList
    // Read items inserted by other threads.
}

OrderedStructs::SkipList::HeadNode<double> sl;
std::vector<std::thread> threads;

for (size_t i = 0; i < thread_count; ++i) {
    threads.push_back(std::thread(do_something, &sl));
}
for (auto &t: threads) {
    t.join();
}
// The SkipList now contains the totality of the thread actions.

Python

An example of using a SkipList of always ordered floats:

import orderedstructs

# Declare with a type. Supported types are long/float/bytes/object.
sl = orderedstructs.SkipList(float)

sl.insert(42.0)
sl.insert(21.0)
sl.insert(84.0)

sl.has(42.0) # True
sl.size()    # 3
sl.at(1)     # 42.0

sl.has(42.0) # True
sl.size()    # 3
sl.at(1)     # 42.0, raises IndexError if index out of range

sl.remove(21.0); # raises ValueError if value not present

sl.size()    # 2
sl.at(1)     # 84.0

The Python SkipList can be used with user defined objects with a user defined sort order. In this example the last name of the person takes precedence over the first name:

import functools

@functools.total_ordering
class Person:
    """Simple example of ordering based on last name/first name."""
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    def __eq__(self, other):
        try:
            return self.last_name == other.last_name and self.first_name == other.first_name
        except AttributeError:
            return NotImplemented

    def __lt__(self, other):
        try:
            return self.last_name < other.last_name or self.first_name < other.first_name
        except AttributeError:
            return NotImplemented

    def __str__(self):
        return '{}, {}'.format(self.last_name, self.first_name)

import orderedstructs

sl = orderedstructs.SkipList(object)

sl.insert(Person('Peter', 'Pan'))
sl.insert(Person('Alan', 'Pan'))
assert sl.size() == 2
assert str(sl.at(0)) == 'Pan, Alan' 
assert str(sl.at(1)) == 'Pan, Peter' 

The Python SkipList is thread safe when using any acceptable Python type even if that type has user defined comparison methods. This uses Pythons mutex machinery which is independent of C++ mutexes.

History

0.3.3 (2021-03-25)

  • Add Python benchmarks, fix some build issues.

0.3.2 (2021-03-18)

  • Fix lambda issues with Python 3.8, 3.9.

0.3.1 (2021-03-17)

  • Support Python 3.7, 3.8, 3.9.

0.3.0 (2017-08-18)

  • Public release.
  • Allows storing of PyObject* and rich comparison.

0.2.0

Python module now named orderedstructs.

0.1.0

Initial release.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

orderedstructs-0.3.3.tar.gz (33.4 kB view details)

Uploaded Source

Built Distributions

orderedstructs-0.3.3-cp39-cp39-macosx_10_9_x86_64.whl (49.7 kB view details)

Uploaded CPython 3.9 macOS 10.9+ x86-64

orderedstructs-0.3.3-cp38-cp38-macosx_10_9_x86_64.whl (49.8 kB view details)

Uploaded CPython 3.8 macOS 10.9+ x86-64

orderedstructs-0.3.3-cp37-cp37m-macosx_10_9_x86_64.whl (49.7 kB view details)

Uploaded CPython 3.7m macOS 10.9+ x86-64

orderedstructs-0.3.3-cp36-cp36m-macosx_10_6_intel.whl (102.8 kB view details)

Uploaded CPython 3.6m macOS 10.6+ intel

File details

Details for the file orderedstructs-0.3.3.tar.gz.

File metadata

  • Download URL: orderedstructs-0.3.3.tar.gz
  • Upload date:
  • Size: 33.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.9.0

File hashes

Hashes for orderedstructs-0.3.3.tar.gz
Algorithm Hash digest
SHA256 1dadb0f202ebbe1e5ce7fb1305752ff9e937eed6db6af951b0fcd4d65c1838cc
MD5 5ebab7019010e324b12cf4a98d2f21a4
BLAKE2b-256 aa8c3e6318516ef730a909d4713fab21a7d38ee90ea71b18204d8fb3e60b3fc8

See more details on using hashes here.

File details

Details for the file orderedstructs-0.3.3-cp39-cp39-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: orderedstructs-0.3.3-cp39-cp39-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 49.7 kB
  • Tags: CPython 3.9, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.9.0

File hashes

Hashes for orderedstructs-0.3.3-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 350f85044781865d3264ca77f328cef413ab831c21b9a6ac7ba6f4ccac87913f
MD5 36d61c9a2a3477c2a5419dd6e27ac9fe
BLAKE2b-256 593d8c43868d6cf893bace23b7b33c38a766fd107a77cc95b2e41ea948013405

See more details on using hashes here.

File details

Details for the file orderedstructs-0.3.3-cp38-cp38-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: orderedstructs-0.3.3-cp38-cp38-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 49.8 kB
  • Tags: CPython 3.8, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.9.0

File hashes

Hashes for orderedstructs-0.3.3-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 ff046f21c927cce01b5861226c53527350a4b98e5dd903a63347b1b624aa9a80
MD5 da1e3a409b2f0169758aba7db00fca09
BLAKE2b-256 886c313ab6ce8abb8770f8291efcb5636cc07a9bcfca8e64fc6b245bdd9e0b95

See more details on using hashes here.

File details

Details for the file orderedstructs-0.3.3-cp37-cp37m-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: orderedstructs-0.3.3-cp37-cp37m-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 49.7 kB
  • Tags: CPython 3.7m, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.9.0

File hashes

Hashes for orderedstructs-0.3.3-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 5c86a1d9059d90745ff0ec1111cefaddca5c0416e46bb72c4ff4e1da24e739e7
MD5 d27770b3aa72c94cbe69f88591cc23aa
BLAKE2b-256 93eb69575d913f7b8bd0d431e14244a77d332558f4522fdcd0d6adc1b706f247

See more details on using hashes here.

File details

Details for the file orderedstructs-0.3.3-cp36-cp36m-macosx_10_6_intel.whl.

File metadata

  • Download URL: orderedstructs-0.3.3-cp36-cp36m-macosx_10_6_intel.whl
  • Upload date:
  • Size: 102.8 kB
  • Tags: CPython 3.6m, macOS 10.6+ intel
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.9.0

File hashes

Hashes for orderedstructs-0.3.3-cp36-cp36m-macosx_10_6_intel.whl
Algorithm Hash digest
SHA256 922cb288c3eb872144001bb13d647522b1d7c1ccb36f9af3d58b2fb67a9b9182
MD5 7fcfafc0da4dd9c55854b044702c4671
BLAKE2b-256 be6cf3983393335dbcd607abf6c9e14839f157e11268ce14b51fd0d0484fff43

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page