Immutable Log-Balanced Search Tree
Project description
lbst - Immutable Log-Balanced Search Tree
prototype: With CPython, in so far, it is faster to use LBST with 1000+ items. wall-clock time benchmarks show that this datastructure becomes interesting with PyPy 3.7 with 100+ items.
Benchmarks
Higher is better, less than one means copying is faster.
Small number of items = 20
cpython: ▇▇▇▇▇▇▇ 0.039
pypy : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 0.262
Large number of items = 1000
cpython: ▇▇ 0.9489
pypy : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 20.15
Kesako a Log-Balanced Search Tree?
-
A search tree is a dictionary that preserves ordering according to an user specified function, also known under the name sorted dictionary.
-
In the original description of the algorithm, LBST used a logarithm function to decide how to balance the tree.
Kesako an immutable datastructure?
Immutable datastructures, also known under the name persistent datastructures are datastructures that will produce new values instead of changing, mutating, the value in-place.
Immutable datastructures are useful in situations where you need to keep around previous versions of the data to have an history to ease debugging or to implement an undo feature such as in editors; another way immutable datastructures can be put to good use is to keep the data consistent in a concurrent or parallel programming setting, while a flow of executition, the writer, produce a new version of the datastructure, the readers still have access the previous version of the truth without requiring readers to wait for the writer to finish achieving single-writer / multiple readers without locks.
When is an immutable datastructure useful?
Anytime you copy a big data structure, you may instead use an immutable datastructure.
What is the difference with dict
insertion order?
Python builtin dict
are sorted according to the time of insertion,
if "z"
is added to a dictionary first, then "a"
is added, then the
dictionary will have the keys in the following order ["z", ..., "a"]
. That is not always the best approach performance-wise.
The following kind of code is a hint that you may use LBST:
frob = dict()
frob[key1] = value1
frob[key2] = value2
frob[key3] = value3
...
# then re-create the dictionary with an order given by `mykeyfunc`:
frob = {k: v for k in sorted(frob.keys(), key=mykeyfunc)
That is somekind of copy, see the previous hint, that re-orders the
dictionary keys according to mykeyfunc
in order for instance to
speed up linear lookup. Using LBST, you can build a large
dictionary that is sorted at construction time, save a few cycles by
avoding a reconstruction, duplicated effort, copies, and keep the
overall wall-clock time under control.
lbst.make()
Return an immutable search tree, ordered according to Python builtin
rich comparison, that can be overriden in user created types with the
method called
__lt__
.
lbst.set(tree, key, value)
Return a tree based on tree
where key
is associated with
value
.
lbst.get(tree, key, default=None)
Return the value associated with key
in tree
. If key
is not
present in tree
, returns default
.
lbst.delete(tree, key)
Return a tree based on tree
where key
is not present.
lbst.size(tree)
Return the size of tree
.
lbst.start(tree)
Return the smallest key present in tree
.
lbst.end(tree)
Return the biggest key present in tree
.
lbst.cursor(tree)
Return a cursor for tree
. The initial position is not specified. A
cursor is stateful: its position is changed / mutated in-place.
lbst.cursor_clone(cursor)
Return a cursor at the same position as cursor
that does not share
state with cursor
.
lbst.cursor_seek(cursor, key)
Position cursor
near key
. If key
is present in the tree
associated with cursor
, then the cursor will be exactly positioned
on key
and lbst.cursor_seek
will return 0
. Otherwise, when key
is not present in the tree associated with cursor
, there is two
cases: 1) if the cursor is positioned after, then lbst.cursor_seek
returns 1
, and 2) if the cursor is positioned before, then
lbst.cursor_seek
return -1
.
In other words, lbst.cursor_seek
:
- Return
-1
, thencursor
is beforekey
; - Return
0
, thencursor
is onkey
; - Return
1
, thencursor
is afterkey
.
lbst.cursor_key(cursor)
Return the key associated with cursor
.
lbst.cursor_value(cursor)
Return the value associated with cursor
.
lbst.cursor_next(cursor)
Move cursor
to the next position, that is a bigger key that is just
after the current key. Returns False
if cursor
reached the end of
the key space i.e. there is no next key. Otherwise, it returns
True
.
lbst.cursor_previous(cursor)
Move cursor
to the previous position, that is a smaller key that is
just before the current key. Returns False
if cursor
reached the start of
the key space i.e. there is no previous key. Otherwise, it returns
True
.
lbst.to_dict(tree)
Return a dict
representation of tree
. The returned dict
has the
keys in the same order as tree
.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file lbst-0.2.0.tar.gz
.
File metadata
- Download URL: lbst-0.2.0.tar.gz
- Upload date:
- Size: 8.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.13 CPython/3.9.7 Linux/5.13.0-28-generic
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 252b2a99086b0b883a653fedb689221e5b1725c94ce54cd8fdd36825862d7d8c |
|
MD5 | 16782cc2a2e16da2eb510ce05a09756b |
|
BLAKE2b-256 | 5ffb6433090f97f1583f493ac0df14050cef735dc6c51137c16758f87bc457d6 |
File details
Details for the file lbst-0.2.0-py3-none-any.whl
.
File metadata
- Download URL: lbst-0.2.0-py3-none-any.whl
- Upload date:
- Size: 8.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.13 CPython/3.9.7 Linux/5.13.0-28-generic
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | d1c62f43012d6e32a8681f057c6ec394db23b9bcd0643cf6e0cca66c97f4c099 |
|
MD5 | 54deb47ab76c28ad6b66a07ae3b3dc45 |
|
BLAKE2b-256 | 0bb5cbf7c5b57b3c03f69f92e05a07a717c1c58056da2c9975f4a237d2f8aab8 |