Skip to main content

Fast, pure Python indexable skip list

Project description

https://travis-ci.org/geertj/pyskiplist.svg?branch=master https://coveralls.io/repos/geertj/pyskiplist/badge.svg?branch=master

PySkipList is a fast, pure Python implementation of an indexable skiplist. It implements a SkipList data structure that provides an always sorted, list-like data structure for (key, value) pairs. It efficiently supports the following operations:

  • Insert a pair in the list, maintaining sorted order.

  • Find the value of a given key.

  • Remove a given pair based on a key.

  • Iterate over all pairs in sorted order.

  • Find the position of a given key.

  • Access a pair at a certain position.

  • Delete a pair at a certain position.

Since PySkipList is a pure Python implementation, it should work well on alternative Python implementations such as PyPy and Jython.

Example

The following provides a few examples on how to use the SkipList API:

>>> from pyskiplist import SkipList
>>> sl = SkipList()
>>> sl.insert('foo', 'bar')
>>> sl.insert('baz', 'qux')
>>> sl
SkipList((('baz', 'qux'), ('foo', 'bar')))
>>> sl.search('foo')
'bar'
>>> sl[0]
('baz', 'qux')
>>> sl.remove('foo')  # remove by key
>>> del sl[0]  # remove by position

Asymptotic Complexity

Below are the Big-O complexities of the various operations implemented by pyskiplist:

Operation

Complexity

insertion

O(log N)

search by key

O(log N)

removal by key

O(log N)

forward iteration

O(1)

find by position

O(log N)

access by position

O(log N)

delete by position

O(log N)

Performance

Below are the results of some performance tests. These are for Python 3.4.2 on my Linux laptop:

Test

Operations / second

Insert @ 1k nodes

45,056

Insert @ 10k nodes

42,137

Insert @ 100k nodes

28,086

Remove @ 1k nodes

54,316

Remove @ 10k nodes

46,240

Remove @ 100k nodes

35,114

Search @ 1k nodes

137,248

Search @ 10k nodes

109,480

Search @ 100k nodes

77,939

Memory usage

PySkipList tries to be efficient with regards to memory usage. The numbers below are for Python 3.4.2 on my Linux laptop. This specific test stores pairs of integer keys and an integer values in a skiplist. The total size of the two integers on this Python version is 56 bytes.

Nodes

Bytes / node

Overhead (fixed)

1k

164

108

10k

162

106

100k

162

106

Implementation notes

Reference papers on skiplists:

This implementation uses a novel (as far as I know) technique where it stores just a single link width per node, and only in nodes with level > 0. The link corresponds to the number of nodes skipped by the highest incoming link. Other implementations that I’ve seen all store a width for every link. This approach saves a lot of memory. The overhead should just be 1/e (0.37) integers per node. It makes an indexable skiplist almost as memory efficient as its non-indexable cousin.

Duplicate keys are allowed in this implementation, and insertion order is maintained.

Skiplist nodes are implemented as plain lists instead of objects. This saves memory. Kudos to http://pythonsweetness.tumblr.com/post/45227295342 for the idea.

The built-in Mersenne Twister is used as the random source. This is preferable over SystemRandom since it doesn’t require a system call and there is no need for cryptographically secure numbers.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyskiplist-1.0.0.tar.gz (8.0 kB view details)

Uploaded Source

Built Distribution

pyskiplist-1.0.0-py2.py3-none-any.whl (8.6 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file pyskiplist-1.0.0.tar.gz.

File metadata

  • Download URL: pyskiplist-1.0.0.tar.gz
  • Upload date:
  • Size: 8.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for pyskiplist-1.0.0.tar.gz
Algorithm Hash digest
SHA256 2296f67e6591e01c57ea1837aa7ae722fa8f4701a38069cc33ffd3e56d807ffa
MD5 bfee80f564d04937315dd6cbce9c75f7
BLAKE2b-256 eb938263fe0e391f65f9a9ade497ff9a165f4fc9559a84aa3db15190a85e768b

See more details on using hashes here.

File details

Details for the file pyskiplist-1.0.0-py2.py3-none-any.whl.

File metadata

File hashes

Hashes for pyskiplist-1.0.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 5fba901abf577538f4e97e909a1b629c63550cef8c9205c2414c822a3fc7a458
MD5 af517550031d59489edaf3161d39209c
BLAKE2b-256 31db576c89e0f78a7b29d62633548edf7ef871f3dc62d680a315e4f49d06f0db

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page