Fleet management dashboard for local AI inference servers. Monitor Ollama, vLLM, and llama.cpp across your machines.

Propagul Mesh

One dashboard for all your local AI servers. See GPU temps, pull models remotely, route inference — without SSH, Grafana, or LiteLLM.

The Problem

You run Ollama on your desktop, vLLM on a home server, and TGI on a rented GPU. To check GPU temps, pull a model, or see which node has capacity, you have to SSH into each one. When something goes down overnight, you find out the next morning.

Propagul Mesh gives you one dashboard for all of them. Zero port-forwarding, zero Docker.

Quick Start

# 1. Install
pip install propagul-mesh

# 2. Start the agent (auto-detects Ollama, vLLM, TGI, llama.cpp, LM Studio)
propagul-mesh start --api-key pg_live_your_key

# 3. Open your dashboard
# → https://fleet.propagul.dev

Your node appears in the dashboard within 10 seconds. GPU temps, VRAM, loaded models — all live.

What You Get

| Feature | Description |
| --- | --- |
| Fleet Dashboard | Real-time node grid with VRAM, GPU util, temperature, and power sparklines |
| Remote Model Pull | Click a button to ollama pull llama3.1:8b on any node, no SSH |
| Multi-Backend | Auto-detects Ollama, vLLM, TGI, llama.cpp, LM Studio on each machine |
| Local Proxy | localhost:8787/v1/chat/completions, one OpenAI-compatible endpoint for all backends |
| Config Sync | Declare desired model state centrally. Nodes auto-pull and auto-delete to converge. CRDT-backed. |
| Health Alerts | Email notification when a node goes offline (via Resend, 4h cooldown) |
| Prompt Privacy | Prompts never leave your network. The cloud sees only telemetry (model names, VRAM, temps). |
| Zero Port-Forwarding | Agents connect outbound. No inbound ports, no Tailscale, no VPN. |
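
The Config Sync row describes a reconcile step: compare the centrally declared model set with what a node currently has, then pull what is missing and delete what is extra. A minimal sketch of that diff, assuming model state can be treated as a plain set of names. `plan_convergence` is a hypothetical helper, not part of the Propagul API, and the CRDT layer is left out entirely:

```python
def plan_convergence(desired, installed):
    """Return the actions that make `installed` match `desired`.

    Hypothetical helper for illustration; not Propagul's implementation.
    """
    desired, installed = set(desired), set(installed)
    return {
        "pull": sorted(desired - installed),    # declared but not on the node
        "delete": sorted(installed - desired),  # on the node but not declared
    }


plan = plan_convergence(
    desired={"llama3.1:8b", "qwen2.5:7b"},
    installed={"llama3.1:8b", "mistral:7b"},
)
print(plan)  # {'pull': ['qwen2.5:7b'], 'delete': ['mistral:7b']}
```

Running the same plan twice is harmless: once the node matches the declared state, both lists come back empty.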

Architecture

┌─────────────────────────────────────────────────────────────────┐
│  Your Machine(s)                                                │
│                                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
│  │ Ollama       │  │ vLLM         │  │ TGI          │          │
│  │ :11434       │  │ :8000        │  │ :8080        │          │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘          │
│         └─────────────────┼─────────────────┘                   │
│                    ┌──────┴───────┐                              │
│                    │ propagul-mesh│ ← pip install, single binary │
│                    │   Agent      │                              │
│                    │ + Local Proxy│ → localhost:8787/v1          │
│                    └──────┬───────┘                              │
│                           │ Telemetry only (no prompts)          │
└───────────────────────────┼─────────────────────────────────────┘
                            │ HTTPS (outbound only)
                   ┌────────┴────────┐
                   │ cloud.propagul  │
                   │ .dev            │
                   │                 │
                   │ Fleet Dashboard │
                   │ Sparklines      │
                   │ Model Management│
                   │ Health Alerts   │
                   └─────────────────┘

Key insight: This is the Tailscale model applied to GPU fleet management. Control plane in the cloud, data plane (your prompts) stays local.

Supported Backends

Backend Detection Model Listing Chat (sync) Chat (stream) Status
Ollama /api/version /api/tags Production
vLLM /v1/models Production
TGI /info ✅ (model_id) Production
llama.cpp /health /v1/models Production
LM Studio /v1/models Production

Backend detection is automatic — the agent probes each port and identifies the backend by its health endpoint signature (detect.py).
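
As a rough illustration of signature-based detection, the sketch below probes the endpoints named in the table and matches on the shape of the JSON that comes back. The matching rules, the probe order, and the names `PROBES` and `identify_backend` are assumptions made for this example; the actual logic in detect.py may differ. Servers that only expose `/v1/models` (vLLM, LM Studio) are reported generically here, since telling them apart needs more than this sketch checks.

```python
import http.client
import json
import urllib.request


def http_get_json(url, timeout=1.0):
    """Fetch a URL and return parsed JSON, or None on connection/parse failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        return None


def _is_dict_with(key):
    return lambda body: isinstance(body, dict) and key in body


# (path, predicate, label), tried in order. Illustrative rules only.
PROBES = [
    ("/api/version", _is_dict_with("version"), "ollama"),
    ("/info", _is_dict_with("model_id"), "tgi"),
    ("/health", lambda b: isinstance(b, dict) and b.get("status") == "ok", "llama.cpp"),
    ("/v1/models", lambda b: isinstance(b, dict) and isinstance(b.get("data"), list),
     "openai-compatible"),
]


def identify_backend(base_url, fetch=http_get_json):
    """Return a backend label for base_url, or None if nothing matched."""
    for path, matches, label in PROBES:
        if matches(fetch(base_url + path)):
            return label
    return None


# Demo with canned responses instead of real servers.
canned = {
    "http://node:11434/api/version": {"version": "0.0.0"},
    "http://node:8080/info": {"model_id": "some/model"},
}
print(identify_backend("http://node:11434", fetch=canned.get))  # ollama
print(identify_backend("http://node:8080", fetch=canned.get))   # tgi
```

Calling `identify_backend("http://localhost:11434")` without the `fetch` argument performs real HTTP requests with a one-second timeout per probe.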

Local Proxy

The agent runs a local OpenAI-compatible proxy on port 8787:

# List models from all detected backends
curl http://localhost:8787/v1/models

# Chat completion (routed to the right backend automatically)
curl http://localhost:8787/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.1:8b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Works with any OpenAI-compatible client (Python openai, LangChain, Cursor, Continue, etc.). Ollama's native API is automatically translated to OpenAI format.
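
For scripts that should not depend on a client library, the same request can be made with the Python standard library. This is a sketch, assuming the proxy returns the usual OpenAI chat-completion response shape (`choices[0].message.content`); `build_chat_request` and `chat` are hypothetical helper names.

```python
import json
import urllib.request

PROXY_URL = "http://localhost:8787/v1/chat/completions"


def build_chat_request(model, user_message, url=PROXY_URL):
    """Build the POST request for a single-turn chat completion."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model, user_message, timeout=60):
    """Send the request to the local proxy and return the reply text.

    Assumes an OpenAI-style response body; raises on network errors.
    """
    request = build_chat_request(model, user_message)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the agent running, `print(chat("llama3.1:8b", "Hello!"))` mirrors the curl call above.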

Security

| Layer | Implementation |
| --- | --- |
| API Authentication | HMAC-SHA256 key derivation (PROPAGUL_KEY_SALT) |
| Content Security Policy | Strict CSP with nonce-based script loading |
| Telemetry Validation | Depth-limited JSON parsing (MAX_DEPTH=5, MAX_KEYS=500) |
| Heartbeat Limits | 1 MB max payload, rate-limited per node |
| SSRF Protection | RFC 1918 allowlist for proxy targets (no arbitrary IP routing) |
| Prompt Privacy | Prompts are processed locally; cloud receives only metadata |
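
Two of these layers are easy to picture in code. The sketch below shows one plausible form of the SSRF allowlist and of the telemetry limits. Several details are assumptions, not taken from the project: loopback addresses are accepted alongside the RFC 1918 ranges, hostnames are rejected outright, MAX_KEYS is read as a total across the whole payload, and the limits are checked on an already parsed value. The function names are hypothetical.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

MAX_DEPTH = 5
MAX_KEYS = 500


def is_allowed_proxy_target(host):
    """Allow only literal private or loopback IPs (loopback is an assumption)."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal; this sketch does not resolve hostnames
    return addr.is_loopback or any(addr in net for net in RFC1918_NETWORKS)


def within_limits(value, max_depth=MAX_DEPTH, max_keys=MAX_KEYS):
    """Check container nesting depth and total dict keys of a parsed JSON value."""
    keys_seen = 0
    stack = [(value, 1)]  # (node, depth); the top-level container has depth 1
    while stack:
        node, depth = stack.pop()
        if isinstance(node, dict):
            keys_seen += len(node)
            children = node.values()
        elif isinstance(node, list):
            children = node
        else:
            continue  # scalars add neither depth nor keys
        if depth > max_depth or keys_seen > max_keys:
            return False
        stack.extend((child, depth + 1) for child in children)
    return True


print(is_allowed_proxy_target("192.168.1.2"))  # True
print(is_allowed_proxy_target("8.8.8.8"))      # False
print(within_limits({"gpu": {"temp_c": 61, "vram_mb": 8192}}))  # True
```

The traversal uses an explicit stack rather than recursion, so a deliberately deep payload cannot exhaust the interpreter's recursion limit during the check itself.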

Configuration

| Variable | Required | Description |
| --- | --- | --- |
| PROPAGUL_KEY_SALT | Yes | HMAC secret for API key hashing |
| PROPAGUL_REDIS_URL | Yes (server) | Redis connection URL |
| PROPAGUL_GOSSIP_SECRET | Recommended | Shared secret for gossip authentication |
| PROPAGUL_RESEND_API_KEY | No | Resend.com API key for email alerts |
| PROPAGUL_ALERT_EMAIL | No | Email address for health alerts |
| PROPAGUL_PUBLIC_HOST | No | Public hostname (default: fleet.propagul.dev) |
| PROPAGUL_ALLOW_PLAINTEXT | No | Set true for dev without TLS |
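
To make the role of PROPAGUL_KEY_SALT concrete, here is a sketch of keyed hashing of an API key with HMAC-SHA256 from the Python standard library. It assumes the server stores only the resulting digest and compares digests on each request; the helper names and the fallback salt are hypothetical, and the project's real scheme may differ.

```python
import hashlib
import hmac
import os


def hash_api_key(api_key, salt):
    """Return the hex HMAC-SHA256 of the API key, keyed with the server salt."""
    return hmac.new(salt.encode("utf-8"), api_key.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_api_key(api_key, stored_hash, salt):
    """Compare in constant time to avoid leaking digest prefixes via timing."""
    return hmac.compare_digest(hash_api_key(api_key, salt), stored_hash)


# The fallback value is for local experiments only.
salt = os.environ.get("PROPAGUL_KEY_SALT", "dev-only-salt")
stored = hash_api_key("pg_live_your_key", salt)
print(verify_api_key("pg_live_your_key", stored, salt))  # True
print(verify_api_key("pg_live_wrong_key", stored, salt))  # False
```

Because the digest depends on the salt, changing PROPAGUL_KEY_SALT invalidates every stored hash in a scheme like this one.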

How It Compares

Feature GPUStack LiteLLM llama-dash Propagul Mesh
GPU Fleet Dashboard ✅ (Grafana) ✅ (built-in)
Remote Model Pull ✅ (HF Catalog) ✅ (Ollama)
Multi-Backend ✅ (100+ providers) ❌ (llama.cpp only) ✅ (5 backends)
Zero-Config Install ❌ (K8s/Docker) ❌ (config file) ✅ (pip install + 1 cmd)
Prompt Privacy ❌ (centralized) ❌ (proxy sees all) ✅ (local) ✅ (Tailscale model)
CRDT Config Sync ✅ (unique)
Real-time Sparklines ✅ (Grafana) ✅ (SSE) ✅ (Chart.js + SSE)
Cost Tracking

When to use what

  • GPUStack: best for teams that already run K8s/Docker and need Hugging Face catalog integration.
  • LiteLLM: best as a centralized API gateway with cost tracking (not a fleet manager).
  • llama-dash: best for single-machine llama.cpp monitoring with SQLite logging.
  • Propagul Mesh: best if you want pip install plus one command to monitor multiple machines with different backends, and your prompts must stay local.

Advanced: State Sync SDK

Propagul also includes a P2P state synchronization SDK for AI agent workflows. This is the original core library — the fleet dashboard was built on top of it.

import asyncio

from propagul import AgentStateStore


async def main():
    store = AgentStateStore(room="my-project", node_id=1, port=9001)
    store.set("task", "research")

    # Agent crashes → restarts → recovers state from peers
    await store.start(peers=[("192.168.1.2", 9002)])
    print(store.get("task"))  # "research", recovered from peer

# `await` is only valid inside a coroutine, so run everything through asyncio
asyncio.run(main())

Features: ORMap CRDT, adaptive gossip protocol, crash recovery, conflict detection, CrewAI and LangGraph integrations. See docs/ for full SDK documentation.
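
For readers new to the term, an observed-remove map lets replicas write and delete keys independently and still converge after exchanging state. The toy class below is a state-based sketch of that idea, not Propagul's implementation: every write gets a unique tag, a removal tombstones only the tags that replica has seen, and a merge is a plain union. The class name, its methods, and the tie-break by highest tag are all choices made for this example.

```python
import uuid


class ORMap:
    """Toy state-based observed-remove map (illustrative only)."""

    def __init__(self):
        self.entries = {}        # unique tag -> (key, value)
        self.tombstones = set()  # tags that were overwritten or removed

    def _observed_tags(self, key):
        return [tag for tag, (k, _) in self.entries.items() if k == key]

    def set(self, key, value):
        # Retire the writes this replica has seen, then add a fresh tag.
        self.tombstones.update(self._observed_tags(key))
        self.entries[uuid.uuid4().hex] = (key, value)

    def remove(self, key):
        # Only observed tags are removed; unseen concurrent writes survive.
        self.tombstones.update(self._observed_tags(key))

    def merge(self, other):
        # Union of both sides: commutative, associative, idempotent.
        self.entries.update(other.entries)
        self.tombstones |= other.tombstones

    def get(self, key, default=None):
        live = [
            (tag, value)
            for tag, (k, value) in self.entries.items()
            if k == key and tag not in self.tombstones
        ]
        # Concurrent live writes are resolved deterministically by highest tag.
        return max(live, key=lambda item: item[0])[1] if live else default


a, b = ORMap(), ORMap()
a.set("task", "research")
b.merge(a)
b.remove("task")           # b deletes the value it has seen
a.set("task", "write-up")  # a concurrently writes a new value
a.merge(b)
b.merge(a)
print(a.get("task"), b.get("task"))  # write-up write-up
```

The concurrent write wins over the removal because the removal never observed its tag, which is the behaviour that gives the structure its name.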

License

MIT — Full source available.
