Skip to main content

Lightweight, async-first Kademlia DHT, with dynamic distance-based cache TTL and switchable JSON/Bencode wire serialization

Project description

kademlia-dynamic

Lightweight, async-first Kademlia DHT implementation in pure Python with zero external dependencies. Distinct from the kademlia package on PyPI, imports as kademlia_dynamic.

Installation

pip install kademlia-dynamic

Or from source:

cd kademlia_dynamic
python -m pip install -e .
from kademlia_dynamic import KademliaServer, Peer, generate_node_id

Quick Start

import asyncio
from kademlia_dynamic import KademliaServer

async def main():
    server = KademliaServer()  # or KademliaServer(serialization="bencode")
    await server.listen(port=8000)  # defaults to 127.0.0.1; pass ip="0.0.0.0" to accept LAN/WAN peers

    await server.bootstrap([("192.168.1.100", 8000)])

    await server.set("my_key", "my_value")       # str
    await server.set("my_blob", b"\x00binary")   # or bytes
    value = await server.get("my_key")
    print(value)

    server.stop()

asyncio.run(main())

API Reference

KademliaServer

Main DHT node class

KademliaServer(serialization: str = "json", verify_response_source: bool = True)
  • serialization"json" (default) or "bencode" wire format
  • verify_response_source — when True (default), responses are only accepted from the IP:port the query was sent to; forged responses from other sources are dropped. Set to False if peers legitimately reply from a different address (e.g. some NAT setups).

Additional keyword-only tuning arguments (k, alpha, query_timeout, ...) are listed under Configuration.

Methods

  • async listen(port: int, ip: str = "127.0.0.1") — Start listening for UDP packets. Pass ip="0.0.0.0" to accept connections from other machines (LAN/WAN). Only do this behind a firewall/NAT you control.
  • async bootstrap(nodes: List[Tuple[str, int]]) — Join network from bootstrap nodes
  • async set(key: str, value: str | bytes) — Store value in DHT
  • async get(key: str) -> Optional[str | bytes] — Retrieve value from DHT, same type as stored
  • async find_node(target_id: str) -> List[Peer] — Find peers close to target ID
  • async find_value(key: str) -> Tuple[Optional[str | bytes], List[Peer]] — Search for value, return closest peers if not found
  • stop() — Shut down node and close transport

Peer

Represents a network node

peer = Peer(node_id="abc123...", ip="192.168.1.100", port=8000)
peer.to_dict()  # Serializable dict
Peer.from_dict(data)  # Deserialize

Utility Functions

  • generate_node_id() -> str — Generate random 160-bit node ID (SHA1)
  • hash_key_to_node_id(key: str) -> str — Hash key to node ID space
  • xor_distance(hex_id_a: str, hex_id_b: str) -> int — Compute XOR distance

Comparison to Canonical Kademlia

Canonical references: the Kademlia paper (Maymounkov & Mazières, 2002) and BEP 5 (the BitTorrent DHT protocol).

Similarities

  • 160-bit node IDs (SHA1)
  • XOR distance metric
  • K-buckets (K=20) with replacement cache
  • Alpha concurrency (α=3)
  • Ping, find_node, store/retrieve operations
  • Periodic bucket refresh, value republishing, expiry cleanup
  • Asynchronous UDP protocol

Differences vs canonical Kademlia / BEP 5

Feature This Implementation Canonical Kademlia / BEP 5
Language Python (Language-agnostic spec)
Async Model asyncio Not specified (implementation-dependent)
Serialization JSON (default) or Bencode Bencode (bencoding)
Value TTL Dynamic (distance-based) Paper: distance-based; BEP 5: not specified
Original Publisher Tracking Yes Paper: 24h republish rule; BEP 5: no
Cached Value TTL Inversely proportional to distance Paper: same idea (§2.5); BEP 5: fixed short TTL
RPC Protocol JSON or bencoded dict over UDP Bencoded dict over UDP
Socket Binding User-specified IP Auto-detect

Behavioral Notes

  • Node Discovery: Includes implicit peer discovery via sender fields in all responses (not explicit in BEP 5)
  • Cache TTL: Cached values expire faster (shorter TTL) for distant nodes, reducing stale caches
  • Bucket Refresh: Proactive refresh every 3600s (1h) of buckets that haven't seen activity
  • Value Expiry: Original publishers republish every 24h; non-publishers every 1h

Configuration

All tunables are per-server constructor arguments (keyword-only). The module-level constants in kademlia_dynamic/kademlia.py are only their defaults.

server = KademliaServer(
    serialization="json",                          # or "bencode"
    verify_response_source=True,                   # drop responses from unexpected addresses
    k=20,                                          # peers per bucket (K_BUCKET_SIZE)
    alpha=3,                                       # parallel queries (ALPHA_CONCURRENCY)
    query_timeout=2.0,                             # RPC timeout, seconds
    bucket_refresh_interval=3600,                  # refresh stale buckets + expiry sweep (s)
    key_expiry_seconds=86410,                      # value TTL (24h + 10s)
    non_publisher_restore_interval=3600,           # cache restore interval (1h)
    original_publisher_republish_interval=86400,   # publisher republish (24h)
    min_cache_ttl_seconds=600,                     # minimum cached value TTL (10m)
    max_dispatch_tasks=64,                         # backpressure: max concurrent inbound handlers
)

The bind address is chosen per node via listen(port, ip=...). Node ID width (160-bit) is fixed — it is tied to SHA1.

All nodes in a network should use the same k and alpha for predictable lookup behavior, and must use the same serialization.

Design Notes

JSON vs Bencode

Both are built in, pure Python, zero dependencies. Pick per-server via the serialization constructor arg:

KademliaServer(serialization="json")      # default: human-readable, easy to debug
KademliaServer(serialization="bencode")   # BitTorrent-style bencode wire format

All peers in a network must use the same serialization to interoperate. None values are omitted from encoded messages in both formats (bencode has no null type); readers treat a missing key as None.

One bencode edge case: a str value that begins with the internal byte marker \x00bytes\x00 will round-trip back as bytes. Avoid leading NUL bytes in string values (or use the JSON codec, which does not have this ambiguity).

Values: str or bytes

set()/get() accept and return either str or bytes. On the wire, bencode stores bytes natively; JSON (which has no binary type) base64-encodes bytes transparently and decodes them back on receipt. Type is preserved round-trip — a bytes value in never comes back as str, and vice versa.

Network Exposure

listen() binds 127.0.0.1 by default. To join a real network, pass ip="0.0.0.0" (binds all interfaces) and ensure the UDP port is open/forwarded on your firewall/router.

Thread Safety

Not thread-safe. Designed for single-threaded asyncio use. For multi-threaded access, wrap in locks or run each node in its own event loop.

Backpressure

Datagram dispatch queue limits concurrent inbound handler tasks (max_dispatch_tasks, default 64). Excess packets are dropped with a warning log.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kademlia_dynamic-1.1.1.tar.gz (20.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

kademlia_dynamic-1.1.1-py3-none-any.whl (13.0 kB view details)

Uploaded Python 3

File details

Details for the file kademlia_dynamic-1.1.1.tar.gz.

File metadata

  • Download URL: kademlia_dynamic-1.1.1.tar.gz
  • Upload date:
  • Size: 20.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for kademlia_dynamic-1.1.1.tar.gz
Algorithm Hash digest
SHA256 25f0ae4d77a95a9a463fcd6ed9f8b5eecc43aa47701166cb3f604f92a495314f
MD5 261973d82a306487a8ce5476eaa09227
BLAKE2b-256 f95c0b2c6ebde3ae355602f8342f0175f6120daf8bfa6b79f5bc763d5c18a0bb

See more details on using hashes here.

File details

Details for the file kademlia_dynamic-1.1.1-py3-none-any.whl.

File metadata

File hashes

Hashes for kademlia_dynamic-1.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 99659ab29a4fc5ecad1e72a547ad07b0f4007c27c288c6770ef609132c1e6a8c
MD5 16349fd82680a012d476ad547157b0fc
BLAKE2b-256 6cd1bb095d810dc65928c1f1dae9d6f180a6d9d43d62b685598ec1b0a5a9315a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page