An immutable, copy-on-write (COW) tree-based sorted dictionary with order-statistic operations
Project description
cowtreedict
An immutable, copy-on-write (COW) sorted dictionary for Python with O(log n) order-statistic operations.
Why This Exists
No Python package on PyPI provides all four of these properties together:
| Library | Persistent | Sorted | Order-statistic | Hashable |
|---|---|---|---|---|
dict / frozenset |
no / yes | no | no | no / yes |
pyrsistent.PMap |
yes | no | no | yes |
sortedcontainers.SortedDict |
no | yes | yes | no |
immutables.Map (EdgeDB) |
yes | no | no | no |
cowtreedict.COWTreeDict |
yes | yes | yes | yes |
COWTreeDict fills this gap. If you need a sorted dict that is also immutable, hashable, and supports positional access — this is it.
Motivation
sortedcontainersgives you sorted + order statistics, but it's mutable. Sharing state across versions requires explicit deep copies — O(n) per snapshot.pyrsistentgives you persistence + hashability, but it's hash-based and unordered. No sorted iteration, no "what's the k-th smallest key?".immutables(from the EdgeDB team) gives you a fast persistent mapping, but again unsorted and not even hashable.
If you want persistent + sorted + positional access + hashable in Python, you had to roll your own. Now you don't.
Languages like Clojure (sorted-map), Scala (TreeMap), and Haskell (Data.Map) have had this in their standard libraries for years. Python didn't — until now.
Installation
pip install cowtreedict
Example Usage
from cowtreedict import COWTreeDict
# Construction (same interface as frozen_odict() plus lt)
d = COWTreeDict({'c': 3, 'a': 1, 'b': 2})
# Keys are always sorted
list(d) # ['a', 'b', 'c']
# Mutations return new instances — the original is never modified
d2 = d.set('d', 4)
d3 = d2.delete('a')
list(d) # ['a', 'b', 'c'] — unchanged
list(d3) # ['b', 'c', 'd']
# Order-statistic operations in O(log n)
d.select(0) # 'a' — smallest key
d.select(-1) # 'c' — largest key
d.rank('b') # 1 — number of keys < 'b'
d.peekitem(0) # ('a', 1) — item at position 0
d.peekitem(-1) # ('c', 3) — last item
# Range iteration in O(log n + k)
list(d.irange('a', 'b')) # ['a', 'b']
list(d.irange('a', 'c', inclusive=(False, False))) # ['b']
# Hashable — can be used as dict keys or in sets
cache = {d: 'result'}
cache[COWTreeDict({'a': 1, 'b': 2, 'c': 3})] # 'result'
# Structural sharing minimizes memory
# d and d2 share all nodes except those on the path to 'd'
Use Cases
- Functional-style programming — chain transformations without defensive copying.
- Undo/redo — keep a stack of previous versions at near-zero cost.
- Event sourcing / temporal versioning — each event produces a new version; all versions are queryable.
- Concurrent reads — no locks needed since instances are immutable.
- Sorted dicts as dict keys or set elements — because
COWTreeDictis hashable. - Competitive programming — O(log n) rank and select without pulling in C extensions.
API Reference
Construction
| Method | Description | Time |
|---|---|---|
COWTreeDict() |
Empty dict | O(1) |
COWTreeDict(mapping) |
From another mapping | O(n log n) |
COWTreeDict(iterable) |
From (key, value) pairs | O(n log n) |
COWTreeDict(mapping, lt=func) |
Custom comparison order | O(n log n) |
COW-Mutating Methods
All return a new instance. The original is never modified.
| Method | Description | Time |
|---|---|---|
d.set(key, value) |
Insert or replace a key | O(log n) |
d.delete(key) |
Remove a key (raises KeyError if absent) |
O(log n) |
d.discard(key) |
Remove a key (returns self if absent) | O(log n) |
d.update(mapping_or_pairs, **kw) |
Merge mappings | O(m log(n+m)) |
d.pop(key[, default]) |
Returns (new_dict, value) |
O(log n) |
d.setdefault(key, val) |
Returns (new_or_same_dict, value) |
O(log n) |
d.popitem(index=-1) |
Returns (new_dict, key, value) by position |
O(log n) |
d.clear() |
Returns empty instance | O(1) |
Mapping (Read) Methods
| Method | Description | Time |
|---|---|---|
d[key] |
Get value by key | O(log n) |
key in d |
Membership test | O(log n) |
len(d) |
Number of items | O(1) |
iter(d) |
Iterate keys in sorted order | O(n) |
reversed(d) |
Iterate keys in reverse sorted order | O(n) |
d.keys() |
Sorted keys view | O(1) |
d.values() |
Values in key-sorted order view | O(1) |
d.items() |
Sorted (key, value) pairs view | O(1) |
d.get(key[, default]) |
Get with default (inherited from Mapping) | O(log n) |
d == other |
Equality comparison | O(n) |
hash(d) |
Hash (cached after first call) | O(n) first, O(1) after |
Order-Statistic Operations
| Method | Description | Time |
|---|---|---|
d.select(index) |
Key at sorted position (negative indexing supported) | O(log n) |
d.rank(key) |
Number of keys strictly less than key | O(log n) |
d.peekitem(index=-1) |
(key, value) at sorted position |
O(log n) |
d.bisect_left(key) |
Alias for rank() |
O(log n) |
d.bisect_right(key) |
Number of keys ≤ key | O(log n) |
Range Operations
| Method | Description | Time |
|---|---|---|
d.irange(min, max, inclusive=(True, True)) |
Iterate keys in range | O(log n + k) |
d.irange_items(min, max, inclusive=(True, True)) |
Iterate (key, value) in range | O(log n + k) |
How It Works
COWTreeDict is backed by an AVL tree augmented with subtree sizes for O(log n) positional access.
Path copying provides persistence: when a key is inserted or deleted, only the nodes on the path from the root to the affected leaf are copied (O(log n) nodes). All other nodes are shared between the old and new tree.
Original: After d.set('f', 6):
c c'
/ \ / \
a e → a e' (only c, e, f are new nodes)
/ \ / \
d (nil) d f (a and d are shared)
This means:
- Each mutation allocates only O(log n) new node objects.
- Old versions remain valid and accessible with zero overhead.
- Memory usage grows linearly with the number of distinct mutations, not with the total number of versions.
Part of the COW Family
COWTreeDict is designed alongside cowlist to form a coherent family of persistent Python collections:
| Package | Type | Backing structure |
|---|---|---|
cowlist |
Sequence |
Copy-on-write list with O(1) slicing |
cowtreedict |
Mapping |
AVL tree with path copying |
Both are immutable, hashable, and share the same conventions (__new__-only construction, cached tuplehash, COW mutation methods returning new instances).
Dependencies
tuplehash— Pure Python reimplementation of CPython'stuplehashfor computing hash values without materializing intermediate tuples.
Contributing
Contributions are welcome! Please submit pull requests or open issues on the GitHub repository.
License
This project is licensed under the Apache License 2.0.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file cowtreedict-0.1.0a0.tar.gz.
File metadata
- Download URL: cowtreedict-0.1.0a0.tar.gz
- Upload date:
- Size: 13.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0efb03eb264ec68b17ab98a6deac826ea2044cb104ad2516f87aeab8a8681d0d
|
|
| MD5 |
8c246f67639cf88d20b4d584e621dc05
|
|
| BLAKE2b-256 |
fe9b0ceaf343d5495837de3ac7cbdc35db6dea0b0d41e6bb1e7d8b40e40caa76
|
File details
Details for the file cowtreedict-0.1.0a0-py2.py3-none-any.whl.
File metadata
- Download URL: cowtreedict-0.1.0a0-py2.py3-none-any.whl
- Upload date:
- Size: 13.7 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
bd553e7f6e110560a1a3a80201f6b25a2fccc74fe33e7b0dacbe2a9d98727dbb
|
|
| MD5 |
a07c632ed7fffbfa970604836dedca8e
|
|
| BLAKE2b-256 |
5f46529f3a10cbc617969f32ae00397bafc3847904190f01c6490c5c7f2ab188
|