irohds

Decentralized function memoization over iroh P2P

A drop-in Python decorator that caches function results and shares them automatically across every machine running the same code. No servers to manage, no configuration, no accounts.

If someone at another institution already computed train_model("cifar10", epochs=50), your machine downloads the result instead of spending hours recomputing it. If nobody has computed it yet, your machine does the work and makes the result available to everyone else.

import irohds

@irohds.memo
def train_model(dataset, epochs=10):
    ...  # hours of GPU time
    return model

result = train_model("cifar10", epochs=50)
# First run: computes (hours). Every subsequent run, on any peer: instant.

Who is this for

Research groups and institutions that repeatedly run expensive computations across many machines. If your lab has 20 people who all run the same preprocessing pipeline on the same datasets, irohds means only the first person waits. Everyone else gets the result in seconds.

Works across institutions, across continents, across networks. Peers find each other through the BitTorrent mainline DHT (16M+ nodes). No central server, no coordinator, no shared filesystem required.

:warning: Use responsibly

Only memoize functions whose results are worth sharing over a network. Fetching a result from a peer can take several seconds of network transfer. If the function itself finishes in under 15 seconds, you are better off with functools.cache, joblib.Memory, or diskcache.

The same rules that apply to joblib and diskcache apply here: avoid passing enormous objects (large DataFrames, full image tensors) as arguments. irohds hashes every argument to build the cache key; if serializing your arguments takes longer than running the function itself, memoization is costing you time rather than saving it.
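A common workaround, sketched here with a hypothetical helper (not part of the irohds API): pass a small, stable identifier such as a file path or content digest instead of the large object, and let the function load the data itself.

```python
import hashlib

def file_digest(path):
    """Hypothetical helper: a small, stable stand-in for a large file.
    Hash the file's bytes once, then memoize on the digest or path
    rather than on the loaded object."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Anti-pattern: @irohds.memo on preprocess(df) serializes the whole
# DataFrame into the cache key on every call.
# Better: @irohds.memo on preprocess(csv_path) keys on a short string,
# and the function loads the data itself.
```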

Install

uv add irohds

This installs the Python package and the Rust daemon binary. The daemon starts automatically on first use and installs itself as a system service (starts at boot, runs in a sandbox).

Usage

import irohds

# Basic: share results with all peers globally
@irohds.memo
def expensive_etl(dataset_path):
    ...
    return processed_data

# Namespaced: only share with peers using the same namespace
@irohds.memo(ns="my-lab")
def train(config):
    ...

# Large file outputs
@irohds.memo
def generate_embeddings(corpus):
    ...
    torch.save(embeddings, irohds.resolve("embeddings.pt"))
    return irohds.FileRef("embeddings.pt")

ref = generate_embeddings("pubmed-2024")
embeddings = torch.load(ref.path)  # file is on disk, ready to use

# Selective eviction
irohds.evict("mymodule.train")  # clear cached results for one function

# Pre-warm peer discovery (optional, reduces first-call latency)
irohds.join("my-lab")

How it works

On the first call: irohds hashes the function's AST and arguments into a cache key, executes the function, stores the result in a local content-addressed blob store, and announces it to peers via gossip.

On subsequent calls (same machine): the result is returned from an in-process dict (~0.1us) or from the local blob store via IPC (~0.2ms). No network involved.

On a different machine: irohds checks whether any peer has the result. If yes, it downloads it. If nobody has it yet, the function runs locally and the result is shared with peers.
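The key derivation above can be pictured with a small sketch. This is illustrative only: irohds's real key format is internal, and this version hashes the function's compiled code object as a cheap stand-in for the AST hash.

```python
import hashlib
import pickle

def cache_key(fn, *args, **kwargs):
    """Illustrative key derivation: combine the function's identity with
    its serialized arguments. irohds's real format is internal; here the
    compiled code object stands in for the AST hash."""
    h = hashlib.sha256()
    h.update(fn.__code__.co_code)                           # function body
    h.update(pickle.dumps((args, sorted(kwargs.items()))))  # arguments
    return h.hexdigest()

def add(a, b=1):
    return a + b

assert cache_key(add, 2, b=3) == cache_key(add, 2, b=3)  # deterministic
assert cache_key(add, 2, b=3) != cache_key(add, 5, b=3)  # args change the key
```

Because the key covers both the code and the arguments, editing the function body or calling it with different arguments produces a different key, so stale results are never returned.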

Peer discovery is automatic via three mechanisms:

  • Mainline DHT (global, zero config, 16M+ nodes)
  • mDNS (automatic on LAN)
  • Bootstrap peers (fallback for networks that block DHT)

The daemon (irohds-daemon) is a sandboxed Rust process that owns the blob store and handles gossip/P2P networking. It installs as a system service on first use. Python communicates with it over a Unix socket. The sandbox ensures iroh network traffic cannot access the host filesystem beyond the irohds data directory.
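The transport can be pictured with a toy round-trip: one client message answered by an in-process server over a Unix domain socket. The "PONG:" protocol below is invented for illustration; the real irohds wire protocol is internal to the daemon.

```python
import os
import socket
import tempfile
import threading

# Toy round-trip over a Unix domain socket: the same transport the
# Python client uses to reach irohds-daemon. The "PONG:" exchange is
# made up for illustration only.
sock_path = os.path.join(tempfile.mkdtemp(), "demo.sock")

server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(sock_path)
server.listen(1)

def serve_once():
    conn, _ = server.accept()
    with conn:
        conn.sendall(b"PONG:" + conn.recv(64))
    server.close()

t = threading.Thread(target=serve_once)
t.start()

client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(sock_path)
client.sendall(b"ping")
reply = client.recv(64)
client.close()
t.join()

assert reply == b"PONG:ping"
```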

Restricted networks

If mainline DHT is blocked (some universities, corporate networks), add known peers to ~/.local/share/irohds/config.toml:

bootstrap_peers = ["<hex-encoded-node-id>"]

Get a peer's node ID with irohds-daemon info.

Performance

Scenario                                     Latency
Repeated call, same process                  ~0.1us (in-process dict)
First call after process start, data local   ~0.2ms (one IPC round-trip)
First call after daemon restart, data local  ~1ms (load index + IPC)
Result available from remote peer            seconds (network transfer)
Full miss, compute locally                   depends on the function
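The first row can be reproduced offline with functools.cache as a stand-in for @irohds.memo (no daemon or network involved):

```python
import functools
import time

# Offline stand-in: functools.cache reproduces the "repeated call,
# same process" row above (an in-process dict hit).
@functools.cache
def slow_square(n):
    time.sleep(0.05)   # stand-in for expensive work
    return n * n

t0 = time.perf_counter()
slow_square(12)        # first call: actually computes
cold = time.perf_counter() - t0

t0 = time.perf_counter()
slow_square(12)        # repeat: served from the in-process cache
warm = time.perf_counter() - t0

assert warm < cold
```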

Developing

cargo build --manifest-path daemon/Cargo.toml  # build the daemon
make test                                       # Rust + Python tests
make test-vm                                    # NixOS QEMU P2P integration test

License

MIT
