Skip to main content

Contains a variety of ordered structures, in particular a SkipList.

Project description

SkipList

This project contains a SkipList implementation in C++ with Python bindings.

A SkipList behaves as an always sorted list with, typically, O(log(n)) cost for insertion, look-up and removal. This makes it ideal for such operations as computing the rolling median of a large dataset.

See the full documentation on this project at ReadTheDocs.

A SkipList is implemented as a singly linked list of ordered nodes where each node participates in a subset of, sparser, linked lists. These additional 'sparse' linked lists provide rapid indexing and mutation of the underlying linked list. It is a probabilistic data structure using a random function to determine how many 'sparse' linked lists any particular node participates in. As such SkipList is an alternative to binary tree, Wikipedia has a introductory page on SkipLists.

An advantage claimed for SkipLists are that the insert and remove logic is simpler (however I do not subscribe to this). The drawbacks of a SkipList include its larger space requirements and its O(log(N)) lookup behaviour compared to other, more restricted and specialised, data structures that may have either faster runtime behaviour or lower space requirements or both.

This project contains a SkipList implementation in C++ with Python bindings with:

  • No capacity restrictions apart from available memory.
  • Works with any C++ type that has meaningful comparison operators.
  • The C++ SkipList can be compiled as thread safe.
  • The Python SkipList is thread safe.
  • The SkipList has exhaustive internal integrity checks.
  • Python SkipLists can be long/float/bytes/object types, the latter can have user defined comparison functions.
  • With Python 3.8+ SkipLists can be combined with the multiprocessing.shared_memory module for concurrent operation on large arrays. For example concurrent rolling medians which speed up near linearly with the number of cores.
  • The implementation is extensively performance tested in C++ and Python.

There are a some novel features to this implementation:

Credits

Originally written by Paul Ross with credits to: Wilfred Hughes (AHL), Luke Sewell (AHL) and Terry Tsantagoeds (AHL).

Installation

C++

This SkipList requires:

  • A C++11 compiler.
  • -I<skiplist>/src/cpp as an include path.
  • <skiplist>/src/cpp/SkipList.cpp to be compiled/linked.
  • The macro SKIPLIST_THREAD_SUPPORT set if you want a thread safe SkipList using C++ mutexes.

Python

This SkipList version supports Python 3.6, 3.7, 3.8, 3.9 (and, probably, some earlier Python 3 versions). Earlier versions supported Python 2.7, this version might still do that.

From PyPi

$ pip install orderedstructs

From source

$ git clone https://github.com/paulross/skiplist.git
$ cd <skiplist>
$ python setup.py install

Testing

This SkipList has extensive tests for correctness and performance.

C++

To run all the C++ functional and performance tests:

$ cd <skiplist>/src/cpp
$ make release
$ ./SkipList_R.exe

To run the C++ functional tests with agressive internal integrity checks:

$ cd <skiplist>/src/cpp
$ make debug
$ ./SkipList_D.exe

To run all the C++ functional and performance tests for a thread safe SkipList:

$ cd <skiplist>/src/cpp
$ make release CXXFLAGS=-DSKIPLIST_THREAD_SUPPORT
$ ./SkipList_R.exe

Python

Testing requires pytest and hypothesis:

To run all the C++ functional and performance tests:

$ cd <skiplist>
$ py.test -vs tests/

Examples

Here are some examples of using a SkipList in your code:

C++

#include "SkipList.h"

// Declare with any type that has sane comparison.
OrderedStructs::SkipList::HeadNode<double> sl;

sl.insert(42.0);
sl.insert(21.0);
sl.insert(84.0);

sl.has(42.0) // true
sl.size()    // 3
sl.at(1)     // 42.0, throws OrderedStructs::SkipList::IndexError if index out of range

sl.remove(21.0); // throws OrderedStructs::SkipList::ValueError if value not present

sl.size()    // 2
sl.at(1)     // 84.0

The C++ SkipList is thread safe when compiled with the macro SKIPLIST_THREAD_SUPPORT, then a SkipList can then be shared across threads:

#include <thread>
#include <vector>

#include "SkipList.h"

void do_something(OrderedStructs::SkipList::HeadNode<double> *pSkipList) {
    // Insert/remove items into *pSkipList
    // Read items inserted by other threads.
}

OrderedStructs::SkipList::HeadNode<double> sl;
std::vector<std::thread> threads;

for (size_t i = 0; i < thread_count; ++i) {
    threads.push_back(std::thread(do_something, &sl));
}
for (auto &t: threads) {
    t.join();
}
// The SkipList now contains the totality of the thread actions.

Python

An example of using a SkipList of always ordered floats:

import orderedstructs

# Declare with a type. Supported types are long/float/bytes/object.
sl = orderedstructs.SkipList(float)

sl.insert(42.0)
sl.insert(21.0)
sl.insert(84.0)

sl.has(42.0) # True
sl.size()    # 3
sl.at(1)     # 42.0

sl.has(42.0) # True
sl.size()    # 3
sl.at(1)     # 42.0, raises IndexError if index out of range

sl.remove(21.0); # raises ValueError if value not present

sl.size()    # 2
sl.at(1)     # 84.0

The Python SkipList can be used with user defined objects with a user defined sort order. In this example the last name of the person takes precedence over the first name:

import functools

@functools.total_ordering
class Person:
    """Simple example of ordering based on last name/first name."""
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    def __eq__(self, other):
        try:
            return self.last_name == other.last_name and self.first_name == other.first_name
        except AttributeError:
            return NotImplemented

    def __lt__(self, other):
        try:
            return self.last_name < other.last_name or self.first_name < other.first_name
        except AttributeError:
            return NotImplemented

    def __str__(self):
        return '{}, {}'.format(self.last_name, self.first_name)

import orderedstructs

sl = orderedstructs.SkipList(object)

sl.insert(Person('Peter', 'Pan'))
sl.insert(Person('Alan', 'Pan'))
assert sl.size() == 2
assert str(sl.at(0)) == 'Pan, Alan' 
assert str(sl.at(1)) == 'Pan, Peter' 

The Python SkipList is thread safe when using any acceptable Python type even if that type has user defined comparison methods. This uses Pythons mutex machinery which is independent of C++ mutexes.

History

0.3.4 (2021-04-28)

  • Improve documentation mainly around multiprocessing.shared_memory and tests.

0.3.3 (2021-03-25)

  • Add Python benchmarks, fix some build issues.

0.3.2 (2021-03-18)

  • Fix lambda issues with Python 3.8, 3.9.

0.3.1 (2021-03-17)

  • Support Python 3.7, 3.8, 3.9.

0.3.0 (2017-08-18)

  • Public release.
  • Allows storing of PyObject* and rich comparison.

0.2.0

Python module now named orderedstructs.

0.1.0

Initial release.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

orderedstructs-0.3.4.tar.gz (33.5 kB view details)

Uploaded Source

Built Distributions

orderedstructs-0.3.4-cp39-cp39-macosx_10_9_x86_64.whl (49.8 kB view details)

Uploaded CPython 3.9 macOS 10.9+ x86-64

orderedstructs-0.3.4-cp38-cp38-macosx_10_9_x86_64.whl (49.9 kB view details)

Uploaded CPython 3.8 macOS 10.9+ x86-64

orderedstructs-0.3.4-cp37-cp37m-macosx_10_9_x86_64.whl (49.8 kB view details)

Uploaded CPython 3.7m macOS 10.9+ x86-64

orderedstructs-0.3.4-cp36-cp36m-macosx_10_6_intel.whl (103.0 kB view details)

Uploaded CPython 3.6m macOS 10.6+ intel

File details

Details for the file orderedstructs-0.3.4.tar.gz.

File metadata

  • Download URL: orderedstructs-0.3.4.tar.gz
  • Upload date:
  • Size: 33.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.0

File hashes

Hashes for orderedstructs-0.3.4.tar.gz
Algorithm Hash digest
SHA256 3844e102755f41e608555a33fce516a3c013955591c2530607f9e130a04584ba
MD5 7fa7ece6760494bb660e64b492f19e48
BLAKE2b-256 df6d37802d0da385187d1bfbb04d063cad30d3814e244e4efc320e479497b950

See more details on using hashes here.

File details

Details for the file orderedstructs-0.3.4-cp39-cp39-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: orderedstructs-0.3.4-cp39-cp39-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 49.8 kB
  • Tags: CPython 3.9, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.0

File hashes

Hashes for orderedstructs-0.3.4-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 3bc8e8a0efc0ecdb364c8215c7c80136661dfe58bded20478c9bc61a4bff6d20
MD5 018a2806bff22825daa66509ffa9c4a7
BLAKE2b-256 3ef1de2f1dbd08e0e8104b3e4feb652f0e376757aaa8a1303df7fa2931a03373

See more details on using hashes here.

File details

Details for the file orderedstructs-0.3.4-cp38-cp38-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: orderedstructs-0.3.4-cp38-cp38-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 49.9 kB
  • Tags: CPython 3.8, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.0

File hashes

Hashes for orderedstructs-0.3.4-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 98ac6800f16b17da03ae36d8e99f7dd07c4191bce9d3b424fa3ceede5ff63fed
MD5 dc1b77904aa7c08ca21cbd5c8b69512f
BLAKE2b-256 5eb02aeeb28ee0102dc7b03ee55ae831d6aa6868c9d45f70f9d4c3c545a21c6d

See more details on using hashes here.

File details

Details for the file orderedstructs-0.3.4-cp37-cp37m-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: orderedstructs-0.3.4-cp37-cp37m-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 49.8 kB
  • Tags: CPython 3.7m, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.0

File hashes

Hashes for orderedstructs-0.3.4-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 cc0ad4dad6dfc176025388f3a0a0e2adfc16fd880e1395c81f71183ea46b7786
MD5 69eb0d1b28b13c695520cf84f9eab369
BLAKE2b-256 055045cf20f5af7f259878b4e3cecf8d4047f9f0ac0c3a51683522fe161eb618

See more details on using hashes here.

File details

Details for the file orderedstructs-0.3.4-cp36-cp36m-macosx_10_6_intel.whl.

File metadata

  • Download URL: orderedstructs-0.3.4-cp36-cp36m-macosx_10_6_intel.whl
  • Upload date:
  • Size: 103.0 kB
  • Tags: CPython 3.6m, macOS 10.6+ intel
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.0

File hashes

Hashes for orderedstructs-0.3.4-cp36-cp36m-macosx_10_6_intel.whl
Algorithm Hash digest
SHA256 ef0703549bff01936c6078e6e39bab24006c70325e767a3a6e6b962bb1b21f0c
MD5 81b4379e90609a0af73e1d58218e0ef8
BLAKE2b-256 48417ae1ce3c6bd6eee0f073345b3ad0c9e5082e8a2724d9cdaa564a8d2188fd

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page