Skip to main content

A simple binary trie implementation

Project description

Binary Trie

Python binary trie implementation, helping with XOR distances.

Author: Guillaume Michel

Purpose

IPFS and libp2p are built upon the Kademlia DHT, which uses the XOR distance as a distance metric between keys. As explained in this blogpost, the XOR distance is tricky to understand and represent. This distance metric is odd for it is non linear. For instance, $2 \oplus 3 = 1$ but $3 \oplus 4 = 7$. Ordering $N_8$ by XOR distance to 2 gives the following result: $[2,3,0,1,6,7,4,5]$

One popular representation for XOR distance is the Binary Trie. A binary brie is a simple Trie with keyspace {$0,1$}$^n$ with $n$ being the size of the keyspace. The perfect visual representation of the XOR distance would be a N-dimensional binary trie. The example below displays a binary trie containing the keys $[2,3,4,6,7,9,11,13]$.

Alt text

Usage

Install

pip install binary-trie

Import

from binary_trie import Trie, bytes_to_bitstring, bitstring_to_bytes, int_to_bitstring, bitstring_to_int

Creating an empty trie

trie = Trie()

Adding keys

The add(key) method takes a bitstring as input. The bitstring can either be provided directly, or be computer from an int or bytes using the int_to_bitstring() and bytes_to_bitstring() functions. Note that the l parameter represent the bit length of the binary representation. It is important that all keys share the same bit length. The bit length can be omitted in bytes_to_bistring() when working with bit lengths multiple of 8.

trie.add("0010")
trie.add(4*"0")
trie.add(int_to_bitstring(3, l=4))
trie.add(bytes_to_bitstring(b'\x0e', l=4))

The add function returns True on success, and False if the key was already in the trie (not inserted).

Finding keys

The find(key) method returns True if the provided key is in the Trie, False otherwise.

trie.find("0010") # True
trie.find("0100") # False

Finding the closest keys to a target

The n_closest(key, n) method returns the n closest keys to the provided key in the trie, according to the XOR distance. The keys are sorted according to the distance to the target key. Note that only leaves of the trie will be returned, not intermediary nodes.

trie.n_closest("0001", 1) # ["0000"]
trie.n_closest("0010", 3) # ["0010", "0011", "0000"]

Attaching metadata to Trie nodes

It is possible to attach an object as metadata to a trie leaf node.

class MyObj(object):
    def __init__(self, key, name):
        self.key = key
        self.name = name

obj = MyObj(int_to_bitstring(10, 4), "Node 10")
trie.add(obj.key, obj)

trie.find(obj.key).name # "Node 10"
trie.n_closest(obj.key, 1) # [obj]

Helpers

There are 4 helpers functions to facilitate the use of this implementation with keys being not only bitstring, but also bytes or int. These helper functions help translate bytes and int to bitstring and reciprocally.

def bytes_to_bitstring(data: bytes, l: int=8*len(data)) -> bytes:
    ...
bytes_to_bitstring(b'\xff\x00') # "1111111100000000"
bytes_to_bitstring(b'\xf3',l=4) # "0011"

def bitstring_to_bytes(bs: str) -> bytes:
    ...
bitstring_to_bytes("1111111100000000") # b'\xff\x00'
bitstring_to_bytes("0011")             # b'\x03'

def int_to_bitstring(i, l: int) -> bytes:
    ...
int_to_bitstring(6, 4)  # "0110"
int_to_bitstring(6, 16) # "0000000000000110"

def bitstring_to_int(bs: str) -> int:
    ...
bitstring_to_int("0110")             # 6
bitstring_to_int("0000000000000110") # 6

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

binary-trie-0.0.5.tar.gz (8.2 kB view details)

Uploaded Source

Built Distribution

binary_trie-0.0.5-py3-none-any.whl (8.9 kB view details)

Uploaded Python 3

File details

Details for the file binary-trie-0.0.5.tar.gz.

File metadata

  • Download URL: binary-trie-0.0.5.tar.gz
  • Upload date:
  • Size: 8.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.10.5

File hashes

Hashes for binary-trie-0.0.5.tar.gz
Algorithm Hash digest
SHA256 eba596fc75c20a5ce7b708437fae23bf4cb4ac8706896432d6588bd4dc1a7e8c
MD5 4114f0dba46e7930cd78a870d03c04a0
BLAKE2b-256 df69d36c63df43c4ebac635a8ec1815952d17cde836063e9eea1fd2d1d5e481b

See more details on using hashes here.

File details

Details for the file binary_trie-0.0.5-py3-none-any.whl.

File metadata

  • Download URL: binary_trie-0.0.5-py3-none-any.whl
  • Upload date:
  • Size: 8.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.10.5

File hashes

Hashes for binary_trie-0.0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 2143a7e71bfa9e96e4f83578dd812b2d09c3dfe6c5314bfb4e638f7be6a1c839
MD5 1e949f195e1da3b7523a708a549820ba
BLAKE2b-256 19176d4acea551f27ad8dd0d0d288234d82f4242935911dd9f205f821b3d3bef

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page