Lightweight, async-first Kademlia DHT, with dynamic distance-based cache TTL and switchable JSON/Bencode wire serialization
kademlia-dynamic
Lightweight, async-first Kademlia DHT implementation in pure Python with zero external dependencies. It is distinct from the kademlia package on PyPI and imports as kademlia_dynamic.
Installation
pip install kademlia-dynamic
Or from source:
cd kademlia_dynamic
python -m pip install -e .
from kademlia_dynamic import KademliaServer, Peer, generate_node_id
Quick Start
import asyncio
from kademlia_dynamic import KademliaServer

async def main():
    server = KademliaServer()  # or KademliaServer(serialization="bencode")
    await server.listen(port=8000)  # defaults to 127.0.0.1; pass ip="0.0.0.0" to accept LAN/WAN peers
    await server.bootstrap([("192.168.1.100", 8000)])
    await server.set("my_key", "my_value")  # str
    await server.set("my_blob", b"\x00binary")  # or bytes
    value = await server.get("my_key")
    print(value)
    server.stop()

asyncio.run(main())
API Reference
KademliaServer
Main DHT node class
KademliaServer(serialization: str = "json", verify_response_source: bool = True)
- serialization: "json" (default) or "bencode" wire format
- verify_response_source: when True (default), responses are only accepted from the IP:port the query was sent to; forged responses from other sources are dropped. Set to False if peers legitimately reply from a different address (e.g. some NAT setups).
Additional keyword-only tuning arguments (k, alpha, query_timeout, ...) are listed under Configuration.
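The source check behind verify_response_source can be pictured with a small standalone sketch. The class name PendingQueries, its methods, and the use of an rpc_id to match responses to queries are assumptions made for illustration; the package's internal bookkeeping may differ.

```python
from typing import Dict, Tuple

Address = Tuple[str, int]


class PendingQueries:
    """Hypothetical tracker for outstanding RPCs (not the package's real class)."""

    def __init__(self, verify_source: bool = True) -> None:
        self.verify_source = verify_source
        self._pending: Dict[str, Address] = {}

    def register(self, rpc_id: str, sent_to: Address) -> None:
        # Remember which IP:port the query went to.
        self._pending[rpc_id] = sent_to

    def accept(self, rpc_id: str, source: Address) -> bool:
        expected = self._pending.get(rpc_id)
        if expected is None:
            return False  # unknown or already answered query
        if self.verify_source and source != expected:
            return False  # reply from an unexpected address: drop, keep waiting
        del self._pending[rpc_id]
        return True
```

With verify_source=False the same reply would be accepted from any address, which is the trade-off described above for some NAT setups.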
Methods
- async listen(port: int, ip: str = "127.0.0.1"): Start listening for UDP packets. Pass ip="0.0.0.0" to accept connections from other machines (LAN/WAN). Only do this behind a firewall/NAT you control.
- async bootstrap(nodes: List[Tuple[str, int]]): Join the network via bootstrap nodes
- async set(key: str, value: str | bytes): Store a value in the DHT
- async get(key: str) -> Optional[str | bytes]: Retrieve a value from the DHT, with the same type as stored
- async find_node(target_id: str) -> List[Peer]: Find peers close to the target ID
- async find_value(key: str) -> Tuple[Optional[str | bytes], List[Peer]]: Search for a value; return the closest peers if it is not found
- stop(): Shut down the node and close the transport
Peer
Represents a network node
peer = Peer(node_id="abc123...", ip="192.168.1.100", port=8000)
peer.to_dict() # Serializable dict
Peer.from_dict(data) # Deserialize
Utility Functions
- generate_node_id() -> str: Generate a random 160-bit node ID (SHA1)
- hash_key_to_node_id(key: str) -> str: Hash a key into the node ID space
- xor_distance(hex_id_a: str, hex_id_b: str) -> int: Compute the XOR distance between two hex IDs
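To make the ID space concrete, here is a minimal standalone sketch of what helpers like these typically do. The _sketch function names are hypothetical, and hashing random bytes or the UTF-8 key with SHA1 is an assumption about the approach, not a copy of the package's code.

```python
import hashlib
import os


def generate_node_id_sketch() -> str:
    # 160-bit ID as 40 hex characters, derived from random bytes (assumed approach).
    return hashlib.sha1(os.urandom(20)).hexdigest()


def hash_key_to_node_id_sketch(key: str) -> str:
    # Map an arbitrary string key into the same 160-bit space.
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def xor_distance_sketch(hex_id_a: str, hex_id_b: str) -> int:
    # Kademlia distance: bitwise XOR of the two IDs, read as an integer.
    return int(hex_id_a, 16) ^ int(hex_id_b, 16)
```

A distance of 0 means the two IDs are identical; smaller integers mean "closer" in the Kademlia sense.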
Comparison to Canonical Kademlia
Canonical references: the Kademlia paper (Maymounkov & Mazières, 2002) and BEP 5 (the BitTorrent DHT protocol).
Similarities
- 160-bit node IDs (SHA1)
- XOR distance metric
- K-buckets (K=20) with replacement cache
- Alpha concurrency (α=3)
- Ping, find_node, store/retrieve operations
- Periodic bucket refresh, value republishing, expiry cleanup
- Asynchronous UDP protocol
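The k-bucket with replacement cache listed above can be sketched as follows. The KBucketSketch class and its method names are hypothetical, and the eviction policy shown (promote the most recently seen replacement) is one common choice rather than a description of this package's internals.

```python
from collections import OrderedDict
from typing import Tuple

Address = Tuple[str, int]


class KBucketSketch:
    """Illustrative k-bucket: up to k live peers plus a bounded replacement cache."""

    def __init__(self, k: int = 20) -> None:
        self.k = k
        self.peers: "OrderedDict[str, Address]" = OrderedDict()
        self.replacements: "OrderedDict[str, Address]" = OrderedDict()

    def touch(self, node_id: str, addr: Address) -> None:
        if node_id in self.peers:
            # Known peer: refresh its address and mark it most recently seen.
            self.peers[node_id] = addr
            self.peers.move_to_end(node_id)
        elif len(self.peers) < self.k:
            self.peers[node_id] = addr
        else:
            # Bucket full: park the newcomer in the replacement cache.
            self.replacements.pop(node_id, None)
            self.replacements[node_id] = addr
            while len(self.replacements) > self.k:
                self.replacements.popitem(last=False)

    def evict(self, node_id: str) -> None:
        # Drop an unresponsive peer and promote the freshest replacement, if any.
        if self.peers.pop(node_id, None) is not None and self.replacements:
            new_id, new_addr = self.replacements.popitem(last=True)
            self.peers[new_id] = new_addr
```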
Differences vs canonical Kademlia / BEP 5
| Feature | This Implementation | Canonical Kademlia / BEP 5 |
|---|---|---|
| Language | Python | (Language-agnostic spec) |
| Async Model | asyncio | Not specified (implementation-dependent) |
| Serialization | JSON (default) or Bencode | Bencode (bencoding) |
| Value TTL | Dynamic (distance-based) | Fixed intervals |
| Original Publisher Tracking | Yes | Not specified |
| Cached Value TTL | Inversely proportional to distance | Fixed short TTL |
| RPC Protocol | JSON or bencoded messages over UDP | Bencoded dict over UDP |
| Socket Binding | User-specified IP | Auto-detect |
Behavioral Notes
- Node Discovery: Includes implicit peer discovery via sender fields in all responses (not explicit in BEP 5)
- Cache TTL: Cached values expire faster (shorter TTL) for distant nodes, reducing stale caches
- Bucket Refresh: Proactive refresh every 3600s (1h) of buckets that haven't seen activity
- Value Republishing and Expiry: Original publishers republish every 24h; non-publishers restore cached copies every 1h. The default value TTL is 24h + 10s (key_expiry_seconds).
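The distance-based cache TTL in the notes above could look roughly like the sketch below. The exact formula used by the package is not stated here, so the linear scaling on the bit length of the XOR distance is purely an assumption, as is the function name cache_ttl_sketch. Only the bounds (86410 s maximum, 600 s minimum) come from the documented defaults.

```python
ID_BITS = 160  # node ID width


def cache_ttl_sketch(key_id: str, node_id: str,
                     max_ttl: int = 86410, min_ttl: int = 600) -> int:
    """Return a TTL in seconds that shrinks as the caching node gets farther from the key."""
    distance = int(key_id, 16) ^ int(node_id, 16)
    # bit_length() is 0 for identical IDs and 160 for maximally distant ones.
    closeness = 1.0 - distance.bit_length() / ID_BITS
    return max(min_ttl, int(max_ttl * closeness))
```

A node whose ID matches the key gets the full TTL, while a node at the far end of the ID space falls back to the minimum, so distant caches go stale sooner.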
Configuration
All tunables are per-server constructor arguments (keyword-only). The module-level constants in kademlia_dynamic/kademlia.py are only their defaults.
server = KademliaServer(
    serialization="json",                         # or "bencode"
    verify_response_source=True,                  # drop responses from unexpected addresses
    k=20,                                         # peers per bucket (K_BUCKET_SIZE)
    alpha=3,                                      # parallel queries (ALPHA_CONCURRENCY)
    query_timeout=2.0,                            # RPC timeout, seconds
    bucket_refresh_interval=3600,                 # refresh stale buckets + expiry sweep (s)
    key_expiry_seconds=86410,                     # value TTL (24h + 10s)
    non_publisher_restore_interval=3600,          # cache restore interval (1h)
    original_publisher_republish_interval=86400,  # publisher republish (24h)
    min_cache_ttl_seconds=600,                    # minimum cached value TTL (10m)
    max_dispatch_tasks=64,                        # backpressure: max concurrent inbound handlers
)
The bind address is chosen per node via listen(port, ip=...). Node ID width (160-bit) is fixed — it is tied to SHA1.
All nodes in a network should use the same k and alpha for predictable lookup behavior, and must use the same serialization.
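To show how k and alpha shape a lookup, here is a simplified iterative find_node over a toy in-memory network. Everything in it is illustrative: IDs are small integers instead of 160-bit hex strings, toy_query stands in for a real UDP RPC, and timeouts and failure handling are left out.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def iterative_find_node(
    target: int,
    seeds: Iterable[int],
    query: Callable[[int], Awaitable[List[int]]],
    k: int = 20,
    alpha: int = 3,
) -> List[int]:
    shortlist = set(seeds)
    queried = set()

    def closest() -> List[int]:
        # The k known nodes nearest to the target by XOR distance.
        return sorted(shortlist, key=lambda n: n ^ target)[:k]

    while True:
        # Ask at most alpha of the closest not-yet-queried nodes in parallel.
        batch = [n for n in closest() if n not in queried][:alpha]
        if not batch:
            return closest()
        queried.update(batch)
        for peers in await asyncio.gather(*(query(n) for n in batch)):
            shortlist.update(peers)


async def toy_query(node: int) -> List[int]:
    # Toy 16-node network: each node knows the four nodes that differ in one bit.
    return [node ^ 1, node ^ 2, node ^ 4, node ^ 8]
```

For example, asyncio.run(iterative_find_node(5, [10], toy_query, k=3)) walks from node 10 toward target 5 in rounds of up to three parallel queries. A larger alpha trades more traffic for fewer rounds, which is one reason mismatched settings across nodes make lookup behavior harder to predict.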
Design Notes
JSON vs Bencode
Both are built in, pure Python, zero dependencies. Pick per-server via the serialization constructor arg:
KademliaServer(serialization="json") # default: human-readable, easy to debug
KademliaServer(serialization="bencode") # BitTorrent-style bencode wire format
All peers in a network must use the same serialization to interoperate. None values are omitted from encoded messages in both formats (bencode has no null type); readers treat a missing key as None.
One bencode edge case: a str value that begins with the internal byte marker \x00bytes\x00 will round-trip back as bytes. Avoid leading NUL bytes in string values (or use the JSON codec, which does not have this ambiguity).
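For readers unfamiliar with bencode, the following minimal encoder shows the wire format and how None values can be omitted from dicts. It is a sketch, not the package's codec: the function name bencode_sketch is hypothetical, str values are simply encoded as UTF-8, and the internal bytes marker mentioned above is not reproduced.

```python
def bencode_sketch(obj) -> bytes:
    """Encode ints, str/bytes, lists and dicts in bencode; dict entries with None values are skipped."""
    if isinstance(obj, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(obj, int):
        return b"i%de" % obj
    if isinstance(obj, str):
        obj = obj.encode("utf-8")
    if isinstance(obj, bytes):
        return b"%d:%s" % (len(obj), obj)
    if isinstance(obj, list):
        return b"l" + b"".join(bencode_sketch(item) for item in obj) + b"e"
    if isinstance(obj, dict):
        items = [
            (k.encode("utf-8") if isinstance(k, str) else k, v)
            for k, v in obj.items()
            if v is not None  # bencode has no null type, so the key is dropped
        ]
        items.sort(key=lambda kv: kv[0])  # bencode dict keys are sorted as raw bytes
        return b"d" + b"".join(bencode_sketch(k) + bencode_sketch(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(obj).__name__}")
```

Because a plain encoder like this turns both str and bytes into the same byte string, a real codec needs some extra marker to restore the original type, which is where the edge case described above comes from.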
Values: str or bytes
set()/get() accept and return either str or bytes. On the wire, bencode stores bytes natively; JSON (which has no binary type) base64-encodes bytes transparently and decodes them back on receipt. The type is preserved on the round trip: a value stored as bytes never comes back as str, and vice versa.
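One way to carry bytes through JSON while preserving the type is a small tagged envelope, sketched below. The envelope layout ("type" and "data" fields) and the function names are assumptions for illustration; the package's actual JSON representation may differ.

```python
import base64
import json
from typing import Union

Value = Union[str, bytes]


def encode_value_sketch(value: Value) -> str:
    # Tag the payload so the receiver knows whether to base64-decode it.
    if isinstance(value, bytes):
        envelope = {"type": "bytes", "data": base64.b64encode(value).decode("ascii")}
    else:
        envelope = {"type": "str", "data": value}
    return json.dumps(envelope)


def decode_value_sketch(payload: str) -> Value:
    envelope = json.loads(payload)
    if envelope["type"] == "bytes":
        return base64.b64decode(envelope["data"])
    return envelope["data"]
```

The explicit tag is what keeps a str that happens to look like base64 from being decoded into bytes by mistake.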
Network Exposure
listen() binds 127.0.0.1 by default. To join a real network, pass ip="0.0.0.0" (binds all interfaces) and ensure the UDP port is open/forwarded on your firewall/router.
Thread Safety
Not thread-safe. Designed for single-threaded asyncio use. For multi-threaded access, wrap in locks or run each node in its own event loop.
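If other threads need to talk to a node, one option is to keep the node on a dedicated event loop thread and submit coroutines to it with asyncio.run_coroutine_threadsafe. The sketch below uses only the standard library; LoopThread is a hypothetical helper and fake_get is a stand-in for a call such as server.get(...).

```python
import asyncio
import threading


class LoopThread:
    """Runs one asyncio event loop in a background thread (illustrative helper)."""

    def __init__(self) -> None:
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self._thread.start()

    def call(self, coro, timeout: float = 5.0):
        # Safe to call from any thread: schedules the coroutine on the loop thread
        # and blocks the caller until the result is ready.
        return asyncio.run_coroutine_threadsafe(coro, self.loop).result(timeout)

    def close(self) -> None:
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
        self.loop.close()


async def fake_get(key: str) -> str:
    # Placeholder for a DHT lookup; a real node would be created on the loop thread.
    await asyncio.sleep(0)
    return f"value-for-{key}"
```

All access to the node then happens on a single loop, so no additional locking is needed around its internal state.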
Backpressure
A dispatch limit caps the number of concurrent inbound handler tasks (max_dispatch_tasks, default 64). Packets that arrive while the limit is reached are dropped with a warning log.
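The drop-when-full behavior can be sketched with a bounded set of handler tasks. BoundedDispatcher, its methods, and the demo coroutine are hypothetical names for illustration; the package's own dispatcher is not shown here and may be structured differently.

```python
import asyncio
import logging

logger = logging.getLogger("dispatch_sketch")


class BoundedDispatcher:
    """Caps the number of in-flight handler tasks; excess work is dropped."""

    def __init__(self, max_tasks: int = 64) -> None:
        self.max_tasks = max_tasks
        self.dropped = 0
        self._tasks = set()

    def dispatch(self, handler, *args) -> bool:
        # Must be called from within a running event loop.
        if len(self._tasks) >= self.max_tasks:
            self.dropped += 1
            logger.warning("dispatch limit reached, dropping packet")
            return False
        task = asyncio.create_task(handler(*args))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)  # free the slot when done
        return True

    async def drain(self) -> None:
        await asyncio.gather(*list(self._tasks))


async def demo():
    dispatcher = BoundedDispatcher(max_tasks=2)

    async def handler(n: int) -> None:
        await asyncio.sleep(0.01)  # simulate handling one datagram

    accepted = [dispatcher.dispatch(handler, i) for i in range(5)]
    await dispatcher.drain()
    return accepted, dispatcher.dropped
```

Dropping rather than queueing keeps memory bounded under a packet flood; since UDP is already lossy, callers are expected to rely on query timeouts and retries.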