MoMa Hub

Distributed AI inference on consumer GPUs — Mixture of Models on Ollama

pip install momahub

What is MoMa Hub?

MoMa Hub is the infrastructure layer for the MoMa (Mixture of Models on Ollama) vision: a federated network where anyone with a gaming GPU can contribute inference capacity to the global AI commons — and route tasks to the right model on the right GPU automatically.

| Inspiration | What they shared | MoMa Hub equivalent    |
|-------------|------------------|------------------------|
| SETI@home   | Idle CPU cycles  | Idle GPU cycles        |
| Airbnb      | Spare bedrooms   | Spare VRAM             |
| Docker Hub  | Container images | Ollama runtime configs |
| GitHub      | Source code      | SPL prompt scripts     |

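A GTX 1080 Ti (11 GB VRAM, ~$150 used) can run a 7B model at interactive speeds when the weights are quantized to around 4 bits, as Ollama's default model tags typically are. The same model at FP16 needs about 14 GB for the weights alone and does not fit. There are millions of these cards sitting idle in gaming PCs. MoMa Hub organises them.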
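As a back-of-the-envelope check on the 11 GB figure, the sketch below estimates the memory taken by the weights of a 7B-parameter model at a few precisions. It is an approximation under stated assumptions: it counts weights only (no KV cache, activations, or runtime overhead), and real quantization formats usually spend slightly more than their nominal bits per weight. The helper name `weight_memory_gb` is illustrative and not part of momahub.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


VRAM_GB = 11  # GTX 1080 Ti

for label, bits in [("FP16", 16), ("8-bit", 8), ("4-bit", 4)]:
    need = weight_memory_gb(7e9, bits)
    # Weights-only comparison; leave headroom for KV cache in practice.
    verdict = "fits" if need < VRAM_GB else "does not fit"
    print(f"7B @ {label}: ~{need:.1f} GB of weights, {verdict} in {VRAM_GB} GB")
```

The estimate suggests why quantized models are the practical target for this class of card: FP16 weights alone exceed the available VRAM, while 8-bit and 4-bit weights leave room for context.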


Quick Start

# 1. Install
pip install momahub

# 2. Start the hub server
momahub serve --port 8765

# 3. Register your Ollama node (in another terminal)
momahub register --node-id home-gpu-0 \
                 --url http://localhost:11434 \
                 --gpu "GTX 1080 Ti" \
                 --vram 11 \
                 --models qwen2.5:7b --models mistral:7b

# 4. Check nodes
momahub nodes

# 5. Run inference through the hub
momahub infer --model qwen2.5:7b --prompt "Explain attention mechanisms"
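Before registering a node in step 3, it can help to confirm that Ollama is reachable and has already pulled the models you plan to advertise. The sketch below is not part of momahub; it queries Ollama's own `GET /api/tags` endpoint, which lists locally available models. The function names are hypothetical helpers, and the default URL assumes Ollama is listening on its standard port.

```python
import json
from urllib.request import urlopen


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from an Ollama /api/tags response body."""
    return [m["name"] for m in tags_payload.get("models", [])]


def missing_models(wanted: list[str], tags_payload: dict) -> list[str]:
    """Return the wanted models that the node has not pulled yet."""
    return sorted(set(wanted) - set(model_names(tags_payload)))


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Fetch /api/tags from a running Ollama server (performs a network call)."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

With Ollama running, `missing_models(["qwen2.5:7b", "mistral:7b"], fetch_tags())` returns the models that still need an `ollama pull` before the node is registered.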

Python SDK

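import asyncio
from momahub import MoMaHub, NodeInfo, InferenceRequest

# Create a hub and register one local Ollama node
hub = MoMaHub()
hub.register(NodeInfo(
    node_id="home-gpu-0",
    url="http://localhost:11434",  # Ollama's default port
    gpu_model="GTX 1080 Ti",
    models=["qwen2.5:7b"],         # models this node can serve
))

# infer() is awaitable, so drive it with asyncio.run() from synchronous code
resp = asyncio.run(hub.infer(InferenceRequest(
    model="qwen2.5:7b",
    prompt="Hello from MoMa Hub!",
)))
print(resp.content)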

Architecture

Consumer GPUs (GTX 1080 Ti × N)
  │  Ollama serve (one per GPU)
  │
  ▼
MoMa Hub  ──  FastAPI registry + round-robin router
  │
  ▼
Client CLI / Python SDK / SPL scripts

See docs/DESIGN.md for the full roadmap and hardware reference.
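To make the "registry + round-robin router" box concrete, here is a minimal sketch of round-robin selection among nodes that serve a requested model. It illustrates the general technique only and is not momahub's actual implementation: the `Node` and `RoundRobinRouter` classes, and the choice to keep one cursor per model, are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical registry entry: one Ollama server and the models it serves."""
    node_id: str
    url: str
    models: list[str]


class RoundRobinRouter:
    """Cycle through the registered nodes that serve the requested model."""

    def __init__(self) -> None:
        self._nodes: list[Node] = []
        self._cursor: dict[str, int] = {}  # next position, tracked per model

    def register(self, node: Node) -> None:
        self._nodes.append(node)

    def pick(self, model: str) -> Node:
        candidates = [n for n in self._nodes if model in n.models]
        if not candidates:
            raise LookupError(f"no registered node serves {model!r}")
        i = self._cursor.get(model, 0)
        self._cursor[model] = i + 1
        return candidates[i % len(candidates)]
```

Keeping a separate cursor per model means requests for one model do not skew the rotation for another. Round-robin ignores load and hardware differences, which is reasonable when every node is the same card.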


Roadmap

| Version    | Milestone                                                       |
|------------|-----------------------------------------------------------------|
| v0.1 (now) | Local MVP: register nodes, round-robin routing, CLI + SDK       |
| v0.2       | Persistent registry, capability-aware routing, heartbeat daemon |
| v0.3       | LAN mDNS discovery, SPL integration (USING HUB momahub://...)   |
| v0.4       | Internet federation, auth, public hub registry                  |
Contributing

MoMa Hub is planned as an open-source project under the Apache 2.0 license. The repository will go public once the MVP has been proven on a home setup of 4× GTX 1080 Ti cards. Follow the GitHub repository for updates.

https://github.com/digital-duck/momahub

License

Apache 2.0
