Contains a variety of ordered structures, in particular a SkipList.
Project description
SkipList
This project contains a SkipList implementation in C++ with Python bindings.
A SkipList behaves as an always sorted list with, typically, O(log(n)) cost for insertion, look-up and removal. This makes it ideal for such operations as computing the rolling median of a large dataset.
See the full documentation on this project at ReadTheDocs.
A SkipList is implemented as a singly linked list of ordered nodes where each node participates in a subset of, sparser, linked lists. These additional 'sparse' linked lists provide rapid indexing and mutation of the underlying linked list. It is a probabilistic data structure using a random function to determine how many 'sparse' linked lists any particular node participates in. As such SkipList is an alternative to binary tree, Wikipedia has a introductory page on SkipLists.
An advantage claimed for SkipLists are that the insert and remove logic is simpler (however I do not subscribe to this). The drawbacks of a SkipList include its larger space requirements and its O(log(N)) lookup behaviour compared to other, more restricted and specialised, data structures that may have either faster runtime behaviour or lower space requirements or both.
This project contains a SkipList implementation in C++ with Python bindings with:
- No capacity restrictions apart from available memory.
- Works with any C++ type that has meaningful comparison operators.
- The C++ SkipList can be compiled as thread safe.
- The Python SkipList is thread safe.
- The SkipList has exhaustive internal integrity checks.
- Python SkipLists can be long/float/bytes/object types, the latter can have user defined comparison functions.
- With Python 3.8+ SkipLists can be combined with the multiprocessing.shared_memory module for concurrent operation on large arrays. For example concurrent rolling medians which speed up near linearly with the number of cores.
- The implementation is extensively performance tested in C++ and Python.
There are a some novel features to this implementation:
- A SkipList is a probabilistic data structure but we have deterministic tests that work for any (sane) random number generator. See Testing a Probabilistic Structure
- This SkipList can dynamically generate visualisations of its current internal state. See Visualising a Skip List
Credits
Originally written by Paul Ross with credits to: Wilfred Hughes (AHL), Luke Sewell (AHL) and Terry Tsantagoeds (AHL).
Installation
C++
This SkipList requires:
- A C++11 compiler.
-I<skiplist>/src/cpp
as an include path.<skiplist>/src/cpp/SkipList.cpp
to be compiled/linked.- The macro
SKIPLIST_THREAD_SUPPORT
set if you want a thread safe SkipList using C++ mutexes.
Python
This SkipList version supports Python 3.7, 3.8, 3.9, 3.10, 3.11 (and, probably, some earlier Python 3 versions).
From PyPi
$ pip install orderedstructs
From source
$ git clone https://github.com/paulross/skiplist.git
$ cd <skiplist>
$ python setup.py install
Testing
This SkipList has extensive tests for correctness and performance.
C++
To run all the C++ functional and performance tests:
$ cd <skiplist>/src/cpp
$ make release
$ ./SkipList_R.exe
To run the C++ functional tests with agressive internal integrity checks:
$ cd <skiplist>/src/cpp
$ make debug
$ ./SkipList_D.exe
To run all the C++ functional and performance tests for a thread safe SkipList:
$ cd <skiplist>/src/cpp
$ make release CXXFLAGS=-DSKIPLIST_THREAD_SUPPORT
$ ./SkipList_R.exe
Python
Testing requires pytest
and hypothesis
:
To run all the C++ functional and performance tests:
$ cd <skiplist>
$ py.test -vs tests/
Examples
Here are some examples of using a SkipList in your code:
C++
#include "SkipList.h"
// Declare with any type that has sane comparison.
OrderedStructs::SkipList::HeadNode<double> sl;
sl.insert(42.0);
sl.insert(21.0);
sl.insert(84.0);
sl.has(42.0) // true
sl.size() // 3
sl.at(1) // 42.0, throws OrderedStructs::SkipList::IndexError if index out of range
sl.remove(21.0); // throws OrderedStructs::SkipList::ValueError if value not present
sl.size() // 2
sl.at(1) // 84.0
The C++ SkipList is thread safe when compiled with the macro SKIPLIST_THREAD_SUPPORT
, then a SkipList can then be shared across threads:
#include <thread>
#include <vector>
#include "SkipList.h"
void do_something(OrderedStructs::SkipList::HeadNode<double> *pSkipList) {
// Insert/remove items into *pSkipList
// Read items inserted by other threads.
}
OrderedStructs::SkipList::HeadNode<double> sl;
std::vector<std::thread> threads;
for (size_t i = 0; i < thread_count; ++i) {
threads.push_back(std::thread(do_something, &sl));
}
for (auto &t: threads) {
t.join();
}
// The SkipList now contains the totality of the thread actions.
Python
An example of using a SkipList of always ordered floats:
import orderedstructs
# Declare with a type. Supported types are long/float/bytes/object.
sl = orderedstructs.SkipList(float)
sl.insert(42.0)
sl.insert(21.0)
sl.insert(84.0)
sl.has(42.0) # True
sl.size() # 3
sl.at(1) # 42.0
sl.has(42.0) # True
sl.size() # 3
sl.at(1) # 42.0, raises IndexError if index out of range
sl.remove(21.0); # raises ValueError if value not present
sl.size() # 2
sl.at(1) # 84.0
The Python SkipList can be used with user defined objects with a user defined sort order. In this example the last name of the person takes precedence over the first name:
import functools
@functools.total_ordering
class Person:
"""Simple example of ordering based on last name/first name."""
def __init__(self, first_name, last_name):
self.first_name = first_name
self.last_name = last_name
def __eq__(self, other):
try:
return self.last_name == other.last_name and self.first_name == other.first_name
except AttributeError:
return NotImplemented
def __lt__(self, other):
try:
return self.last_name < other.last_name or self.first_name < other.first_name
except AttributeError:
return NotImplemented
def __str__(self):
return '{}, {}'.format(self.last_name, self.first_name)
import orderedstructs
sl = orderedstructs.SkipList(object)
sl.insert(Person('Peter', 'Pan'))
sl.insert(Person('Alan', 'Pan'))
assert sl.size() == 2
assert str(sl.at(0)) == 'Pan, Alan'
assert str(sl.at(1)) == 'Pan, Peter'
The Python SkipList is thread safe when using any acceptable Python type even if that type has user defined comparison methods. This uses Pythons mutex machinery which is independent of C++ mutexes.
History
0.3.11 (2023-08-28)
- Minor fixes missed in 0.3.10 release.
0.3.10 (2023-08-28)
- Minor fixes missed in 0.3.9 release.
0.3.9 (2023-08-28)
- Minor fixes missed in 0.3.8 release.
0.3.8 (2023-08-28)
- Add cmake build.
- Support for Python 3.7, 3.8, 3.9, 3.10, 3.11.
- Remove mention of Python 2.
0.3.7 (2021-12-18)
- Fix build on GCC (Issue #8).
0.3.6 (2021-12-18)
- Add documentation on NaN in rolling median.
- Add plots using shared_memory.
- Add Python 3.10 support.
0.3.5 (2021-05-02)
- Fix uncaught exception when trying to remove a NaN.
0.3.4 (2021-04-28)
- Improve documentation mainly around multiprocessing.shared_memory and tests.
0.3.3 (2021-03-25)
- Add Python benchmarks, fix some build issues.
0.3.2 (2021-03-18)
- Fix lambda issues with Python 3.8, 3.9.
0.3.1 (2021-03-17)
- Support Python 3.7, 3.8, 3.9.
0.3.0 (2017-08-18)
- Public release.
- Allows storing of
PyObject*
and rich comparison.
0.2.0
Python module now named orderedstructs
.
0.1.0
Initial release.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distributions
File details
Details for the file orderedstructs-0.3.11.tar.gz
.
File metadata
- Download URL: orderedstructs-0.3.11.tar.gz
- Upload date:
- Size: 33.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | a937e49395e554012dd94c97db75993f5ca4e1c0a209047d0ffce7c98c5f5924 |
|
MD5 | 7c93fa4c1d65c6844dbbb6b8888366bb |
|
BLAKE2b-256 | 18391284cb7ad5e5aca73482b7b278ba27bcbd48e2aeacaf79713a6891d9beef |
File details
Details for the file orderedstructs-0.3.11-cp311-cp311-macosx_10_9_universal2.whl
.
File metadata
- Download URL: orderedstructs-0.3.11-cp311-cp311-macosx_10_9_universal2.whl
- Upload date:
- Size: 92.0 kB
- Tags: CPython 3.11, macOS 10.9+ universal2 (ARM64, x86-64)
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | d0e12f94d41b32f807b14ddd807060f64d3e871179dee68b91ed93995c792dd6 |
|
MD5 | 593cbc26c70201940d4fbb41b55a5e85 |
|
BLAKE2b-256 | b40831b5e4c00be01f74b1b824037fd5872cf89008243666e9cdfc8c6ea8abeb |
File details
Details for the file orderedstructs-0.3.11-cp310-cp310-macosx_10_9_universal2.whl
.
File metadata
- Download URL: orderedstructs-0.3.11-cp310-cp310-macosx_10_9_universal2.whl
- Upload date:
- Size: 92.0 kB
- Tags: CPython 3.10, macOS 10.9+ universal2 (ARM64, x86-64)
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | edac81768eddf3ee8d803e1badba1e9dc8cb089069dd74f27f90d8b403e1d7d7 |
|
MD5 | 36220e452bd015245b1a331b130f0599 |
|
BLAKE2b-256 | 3f45f42cefd999e88979df9fba63a34dbae2b7c1bf0c04572336a2d72e878ac4 |
File details
Details for the file orderedstructs-0.3.11-cp39-cp39-macosx_10_9_universal2.whl
.
File metadata
- Download URL: orderedstructs-0.3.11-cp39-cp39-macosx_10_9_universal2.whl
- Upload date:
- Size: 92.0 kB
- Tags: CPython 3.9, macOS 10.9+ universal2 (ARM64, x86-64)
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | d78bbb1ec3bed94cdfc3227717847beaa71e5f7c4d2c864096f3c424ca7d24a9 |
|
MD5 | b96a5f7a3002e42b3b8748145ac5606d |
|
BLAKE2b-256 | d35ef51ad5f0297d965a5136bbcd3e958ef680b61e47ad5e3b8fb21735977936 |
File details
Details for the file orderedstructs-0.3.11-cp38-cp38-macosx_10_9_x86_64.whl
.
File metadata
- Download URL: orderedstructs-0.3.11-cp38-cp38-macosx_10_9_x86_64.whl
- Upload date:
- Size: 49.6 kB
- Tags: CPython 3.8, macOS 10.9+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | f4bb50d3edc450e9601c38100bb36be5a485e3780481a91bb7d93215d717687d |
|
MD5 | 66621162b3561ef1c28109c45164c436 |
|
BLAKE2b-256 | 65b280be83b778304c49daa01cf081480ce30fad7e8162ed95d1b86de9b730ee |
File details
Details for the file orderedstructs-0.3.11-cp37-cp37m-macosx_10_9_x86_64.whl
.
File metadata
- Download URL: orderedstructs-0.3.11-cp37-cp37m-macosx_10_9_x86_64.whl
- Upload date:
- Size: 49.7 kB
- Tags: CPython 3.7m, macOS 10.9+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 373c7e4b4ff6cf88fcdfbfd0bb09de60f7e3bbd695d359dfe551806edaf44eac |
|
MD5 | 171810f00b61632d26eecd6d60a21d94 |
|
BLAKE2b-256 | ca6c56b76c9137fc3e54977a9f188746b90f2326d15a9aa38dcbad2fcf275911 |