MoMa Hub

Distributed AI inference on consumer GPUs — Mixture of Models on Ollama

pip install momahub

What is MoMa Hub?

MoMa Hub is the infrastructure layer for the MoMa (Mixture of Models on Ollama) vision: a federated network where anyone with a gaming GPU can contribute inference capacity to the global AI commons — and route tasks to the right model on the right GPU automatically.

Inspiration   What they shared    MoMa Hub equivalent
SETI@home     Idle CPU cycles     Idle GPU cycles
Airbnb        Spare bedrooms      Spare VRAM
Docker Hub    Container images    Ollama runtime configs
GitHub        Source code         SPL prompt scripts

A GTX 1080 Ti (11 GB VRAM, ~$150 used) can run a quantised 7B model at interactive speed. Millions of these cards sit idle in gaming PCs. MoMa Hub organises them.
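
The VRAM claim above can be sanity-checked with simple arithmetic: weight memory is roughly parameter count times bits per weight. The sketch below is not part of momahub. The function names and the 1.5 GB allowance for KV cache and runtime buffers are assumptions chosen for illustration.

```python
def estimate_weight_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits_on_gpu(params_billion: float, bits_per_weight: float,
                vram_gb: float, overhead_gb: float = 1.5) -> bool:
    """True if weights plus an assumed fixed overhead fit in the given VRAM.

    overhead_gb is a rough guess for KV cache and runtime buffers,
    not a measured figure.
    """
    needed = estimate_weight_vram_gb(params_billion, bits_per_weight) + overhead_gb
    return needed <= vram_gb


# A 7B model at 4-bit quantisation versus 16-bit weights on an 11 GB card.
q4_gb = estimate_weight_vram_gb(7, 4)     # 3.5 GB of weights
fp16_gb = estimate_weight_vram_gb(7, 16)  # 14.0 GB of weights
print(q4_gb, fits_on_gpu(7, 4, vram_gb=11))
print(fp16_gb, fits_on_gpu(7, 16, vram_gb=11))
```

Under these assumptions a 4-bit 7B model needs about 3.5 GB for weights and fits comfortably in 11 GB, while the same model with 16-bit weights does not.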


Quick Start

# 1. Install
pip install momahub

# 2. Start the hub server
momahub serve --port 8765

# 3. Register your Ollama node (in another terminal)
momahub register --node-id home-gpu-0 \
                 --url http://localhost:11434 \
                 --gpu "GTX 1080 Ti" \
                 --vram 11 \
                 --models qwen2.5:7b --models mistral:7b

# 4. Check nodes
momahub nodes

# 5. Run inference through the hub
momahub infer --model qwen2.5:7b --prompt "Explain attention mechanisms"
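
For context, a registered node is a plain Ollama server, and Ollama exposes a REST endpoint at /api/generate. The sketch below shows the kind of request a hub could forward to a node. Whether momahub builds its requests this way is an assumption, and the helper names build_generate_request and generate are hypothetical.

```python
import json
import urllib.request


def build_generate_request(node_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST to an Ollama node's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=node_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(node_url: str, model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text (needs a running Ollama node)."""
    req = build_generate_request(node_url, model, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # With "stream": False, Ollama replies with one JSON object whose
        # "response" field holds the full completion.
        return json.loads(resp.read())["response"]


# Example call, left commented out because it requires a live node:
# print(generate("http://localhost:11434", "qwen2.5:7b", "Explain attention mechanisms"))
```

Calling a node directly like this is a quick way to confirm it is reachable before registering it with the hub.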

Python SDK

import asyncio
from momahub import MoMaHub, NodeInfo, InferenceRequest

hub = MoMaHub()
hub.register(NodeInfo(
    node_id="home-gpu-0",
    url="http://localhost:11434",
    gpu_model="GTX 1080 Ti",
    models=["qwen2.5:7b"],
))

resp = asyncio.run(hub.infer(InferenceRequest(
    model="qwen2.5:7b",
    prompt="Hello from MoMa Hub!",
)))
print(resp.content)

Architecture

Consumer GPUs (GTX 1080 Ti × N)
  │  Ollama serve (one per GPU)
  │
  ▼
MoMa Hub  ──  FastAPI registry + round-robin router
  │
  ▼
Client CLI / Python SDK / SPL scripts

See docs/DESIGN.md for the full roadmap and hardware reference.
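
The "registry + round-robin router" box in the diagram can be illustrated with a minimal sketch. This is not momahub's actual implementation. The Node and RoundRobinRouter classes are hypothetical, and the per-model cursor is one of several reasonable ways to rotate through nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical registry entry for one Ollama server."""
    node_id: str
    url: str
    models: list[str] = field(default_factory=list)


class RoundRobinRouter:
    """In-memory registry that rotates requests across nodes serving a model."""

    def __init__(self) -> None:
        self._nodes: dict[str, Node] = {}
        self._cursor: dict[str, int] = {}  # next position to use, tracked per model

    def register(self, node: Node) -> None:
        # Re-registering an existing node_id overwrites the earlier entry.
        self._nodes[node.node_id] = node

    def pick(self, model: str) -> Node:
        # Candidates keep registration order because dicts preserve insertion order.
        candidates = [n for n in self._nodes.values() if model in n.models]
        if not candidates:
            raise LookupError(f"no registered node serves {model!r}")
        index = self._cursor.get(model, 0) % len(candidates)
        self._cursor[model] = index + 1
        return candidates[index]


router = RoundRobinRouter()
router.register(Node("home-gpu-0", "http://localhost:11434", ["qwen2.5:7b"]))
router.register(Node("home-gpu-1", "http://localhost:11435", ["qwen2.5:7b", "mistral:7b"]))
print([router.pick("qwen2.5:7b").node_id for _ in range(3)])
```

Keeping a separate cursor per model means traffic for one model does not skew the rotation for another.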


Roadmap

Version      Milestone
v0.1 (now)   Local MVP: register nodes, round-robin routing, CLI + SDK
v0.2         Persistent registry, capability-aware routing, heartbeat daemon
v0.3         LAN mDNS discovery, SPL integration (USING HUB momahub://...)
v0.4         Internet federation, auth, public hub registry
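
To make the v0.2 items concrete, here is one way capability-aware routing with heartbeats might look. It is a speculative sketch, not the planned design. NodeStatus, pick_capable_node, the 30-second timeout, and the best-fit rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class NodeStatus:
    """Hypothetical registry record including capability and liveness data."""
    node_id: str
    models: tuple[str, ...]
    vram_gb: float
    last_heartbeat: float  # timestamp in seconds, e.g. from time.monotonic()


def pick_capable_node(nodes: Sequence[NodeStatus], model: str, min_vram_gb: float,
                      now: float, heartbeat_timeout: float = 30.0) -> Optional[NodeStatus]:
    """Return a live node that serves the model and has enough VRAM, or None."""
    eligible = [
        n for n in nodes
        if now - n.last_heartbeat <= heartbeat_timeout  # drop nodes that went silent
        and model in n.models
        and n.vram_gb >= min_vram_gb
    ]
    if not eligible:
        return None
    # Best fit: choose the smallest adequate GPU so larger cards stay free
    # for requests that need them.
    return min(eligible, key=lambda n: n.vram_gb)


nodes = [
    NodeStatus("home-gpu-0", ("qwen2.5:7b",), 11.0, last_heartbeat=100.0),
    NodeStatus("big-gpu-0", ("qwen2.5:7b",), 24.0, last_heartbeat=100.0),
    NodeStatus("stale-gpu", ("qwen2.5:7b",), 11.0, last_heartbeat=10.0),
]
chosen = pick_capable_node(nodes, "qwen2.5:7b", min_vram_gb=6.0, now=110.0)
print(chosen.node_id if chosen else None)
```

Passing the current time in as an argument keeps the selection logic deterministic and easy to test. A heartbeat daemon would only need to refresh last_heartbeat on each check-in.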

Contributing

MoMa Hub is planned as an open-source project (Apache 2.0). Once the MVP is proven on 4× GTX 1080 Ti at home, the repo will go public. Follow the GitHub repository for updates:

https://github.com/digital-duck/momahub

License

Apache 2.0

