Turn your PC into a private, OpenAI-compatible LLM provider in ~60 seconds

OllaBridge ⚡️

Your single gateway to ALL your LLMs — local, remote, anywhere.

PyPI version · Python 3.10+ · License: MIT · Code style: black

Quick Start · Why OllaBridge · Distributed Compute · Examples · Demo Client · MCP Mode


🎯 What is OllaBridge?

One gateway. All your LLMs. Everywhere.

OllaBridge is your single, OpenAI-compatible API for every LLM you run — on your laptop, workstation, free GPU servers, cloud instances, anywhere.

The Problem: You have models running everywhere (laptop, cloud GPU, friend's gaming PC), and every app needs different configs.

OllaBridge Solution: Apps connect to ONE place. OllaBridge routes to the right compute automatically.

graph TB
    A[Your Apps] -->|OpenAI SDK| B[OllaBridge<br/>Control Plane]

    B -->|Auto Routes| C[Local Laptop<br/>llama3.1]
    B -->|Auto Routes| D[Free GPU Cloud<br/>deepseek-r1]
    B -->|Auto Routes| E[Remote Workstation<br/>mixtral]

    C -.->|Dials Out| B
    D -.->|Dials Out| B
    E -.->|Dials Out| B

    style B fill:#6366f1,stroke:#4f46e5,stroke-width:3px,color:#fff
    style A fill:#10b981,stroke:#059669,stroke-width:2px,color:#fff
    style C fill:#8b5cf6,stroke:#7c3aed,stroke-width:2px,color:#fff
    style D fill:#ec4899,stroke:#db2777,stroke-width:2px,color:#fff
    style E fill:#f59e0b,stroke:#d97706,stroke-width:2px,color:#fff

Key Innovation: Compute nodes dial out to your gateway. No port forwarding, no VPN, no config hell.


🚀 Why OllaBridge?

🎯 Single Source of Truth

  • One URL for everything — Your apps never change code
  • Zero config — Add new GPUs without touching your app
  • Smart routing — OllaBridge picks the best node automatically
  • OpenAI compatible — Works with any SDK, framework, or tool

🛡️ Enterprise-Grade Security

  • API key authentication — Protect your LLMs
  • Rate limiting — Control usage per key
  • Request logging — Full audit trail
  • Encrypted connections — TLS for remote nodes

🌍 Works Everywhere

  • Free GPU clouds — Colab, Kaggle, Lightning AI (no port forwarding needed!)
  • Ephemeral instances — Nodes dial out, IPs don't matter
  • Behind firewalls — Your laptop can join from coffee shop WiFi
  • Mixed environments — Combine local + cloud seamlessly

🤖 AI Agent Ready

  • MCP server — Agents can control your infrastructure
  • Tool exposure — Manage nodes, routes, health via tools
  • Self-healing — Auto-install, auto-configure, auto-recover

⚡ 60-Second Start

Step 1: Install

pip install ollabridge

Step 2: Start Your Gateway

ollabridge start

That's it! You'll see:

✅ Ollama installed (if needed)
✅ Model downloaded (if needed)
✅ Gateway online at http://localhost:11435

╭─────────────────── 🚀 Gateway Ready ────────────────────╮
│                                                          │
│ ✅ OllaBridge is Online                                  │
│                                                          │
│ Model:        deepseek-r1                                │
│ Local API:    http://localhost:11435/v1                 │
│ Key:          sk-ollabridge-xY9kL2mN8pQ4rT6vW1zA        │
│                                                          │
│ Node join token:  eyJ0eXAi...                           │
│ Example node command:                                    │
│   ollabridge-node join --control http://localhost:11435 │
│                        --token eyJ0eXAi...              │
│                                                          │
╰──────────────────────────────────────────────────────────╯

Step 3: Use It!

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11435/v1",
    api_key="sk-ollabridge-xY9kL2mN8pQ4rT6vW1zA"
)

response = client.chat.completions.create(
    model="deepseek-r1",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)

Done! You're running private LLMs with the OpenAI API.


🌍 Add Any GPU in 60 Seconds

Have a free GPU on Colab? A remote workstation? Add it instantly:

On Your Remote GPU/Machine:

# Install
pip install ollabridge

# Join your gateway (copy the command from gateway startup)
ollabridge-node join \
  --control http://YOUR_GATEWAY_IP:11435 \
  --token eyJ0eXAi...

That's it! The remote GPU:

  • ✅ Auto-installs Ollama if needed
  • ✅ Auto-downloads models if needed
  • ✅ Dials out to your gateway (no port forwarding!)
  • ✅ Shows up as available compute

Your Apps See It Automatically

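# Same code, now uses both local + remote GPU!
from openai import OpenAI
client = OpenAI(base_url="http://localhost:11435/v1", api_key="your-key-here")
response = client.chat.completions.create(model="deepseek-r1", messages=[{"role": "user", "content": "Hello!"}])  # Auto-routed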

OllaBridge routes requests across all your nodes automatically.


🎯 Real-World Scenarios

Scenario 1: "I have a gaming PC at home"

# On your gaming PC:
ollabridge-node join --control https://your-gateway.com --token ...

# Now your laptop can use your gaming PC's GPU
# Even if you're at a coffee shop!

Scenario 2: "I want to use free Colab GPUs"

# In Colab notebook:
!pip install ollabridge
!ollabridge-node join --control https://your-gateway.com --token ...

# Now your production app can use free Colab compute
# Colab session ends? Start a new one. Zero config changes.

Scenario 3: "I have multiple cloud GPUs"

# Each GPU instance:
ollabridge-node join --control https://gateway.company.com --token ...

# Your team shares one API URL
# OllaBridge load-balances across all GPUs

💻 Use It Anywhere

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11435/v1",
    api_key="your-key-here"
)

# Chat
response = client.chat.completions.create(
    model="deepseek-r1",
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)

# Embeddings
embeddings = client.embeddings.create(
    model="nomic-embed-text",
    input="Hello, world!"
)

Node.js / TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:11435/v1",
  apiKey: process.env.OLLABRIDGE_KEY
});

const completion = await client.chat.completions.create({
  model: "deepseek-r1",
  messages: [{ role: "user", content: "Hello!" }]
});

LangChain

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="http://localhost:11435/v1",
    api_key="your-key-here",
    model="deepseek-r1"
)

response = llm.invoke("What is the meaning of life?")

cURL

curl -X POST http://localhost:11435/v1/chat/completions \
  -H "Authorization: Bearer your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Works with ANY OpenAI-compatible tool or library.
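If you would rather not install an SDK, the cURL call above can be reproduced with Python's standard library alone. This is a minimal sketch: `build_chat_request` and `send` are hypothetical helper names, and the response parsing assumes the standard OpenAI chat-completion shape that the SDK examples above rely on.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


req = build_chat_request("http://localhost:11435/v1", "your-key-here", "deepseek-r1", "Hello!")
# print(send(req))  # uncomment once the gateway is running
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before anything goes over the network.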


🎨 Try the Interactive Demo Client

Want to see OllaBridge in action? Check out our 2-click demo client in the example/ folder!

⚡ Quick Start

# 1. Install and start OllaBridge
cd example
./install-ollabridge.sh  # Mac/Linux
# or
.\install-ollabridge.ps1  # Windows

# 2. Run the demo client
make run

# 3. Open http://localhost:3000 in your browser

✨ Features

  • 🎯 Beautiful UI — Modern, responsive web interface
  • 🔌 Real Integration — Actual API calls to OllaBridge endpoints
  • 📊 Live Metrics — Request stats, latency, uptime tracking
  • 🔑 Auth Demo — See how API key authentication works
  • 📝 Best Practices — Production-ready code examples

📚 Perfect for Learning

The example client shows you:

  • ✅ How to connect to OllaBridge from a browser
  • ✅ How to handle CORS properly
  • ✅ How to implement authentication with API keys
  • ✅ How to load models dynamically
  • ✅ How to send chat requests and handle responses

View Full Documentation →


🤖 AI Agents Love OllaBridge

OllaBridge has a Model Context Protocol (MCP) server built-in.

Agents can:

  • ✅ Create enrollment tokens
  • ✅ List connected compute nodes
  • ✅ Check gateway health
  • ✅ Manage your LLM infrastructure via tools

Start MCP Server

ollabridge-mcp

Example: Agent Workflow

# Agent can call these tools:
await session.call_tool("ollabridge.enroll.create", {})
# → Returns enrollment token

await session.call_tool("ollabridge.runtimes.list", {})
# → Shows all connected nodes

await session.call_tool("ollabridge.gateway.health", {})
# → Checks gateway status

Use Case: "Hey Claude, add my workstation's GPU to our LLM gateway"

→ Agent creates token, gives you the command, you run it. Done.


🔐 Security & Configuration

Authentication

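OllaBridge auto-generates a secure API key when none is configured and prints it at startup. The key is written to .env only if you start the gateway with --write-env (see For Developers), in which case .env contains: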

API_KEYS=sk-ollabridge-xY9kL2mN8pQ4rT6vW1zA

Use it in your apps:

# Option 1: Bearer token
headers = {"Authorization": "Bearer sk-ollabridge-..."}

# Option 2: Custom header
headers = {"X-API-Key": "sk-ollabridge-..."}
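To make the two header options concrete, here is a rough sketch of how a server could accept either one. It is not OllaBridge's actual implementation; `extract_api_key` and `is_authorized` are hypothetical names used only for illustration.

```python
import hmac
from typing import Iterable, Mapping, Optional


def extract_api_key(headers: Mapping[str, str]) -> Optional[str]:
    """Return the key from 'Authorization: Bearer <key>' or 'X-API-Key: <key>'."""
    lowered = {name.lower(): value for name, value in headers.items()}
    auth = lowered.get("authorization", "")
    if auth.lower().startswith("bearer "):
        return auth[len("bearer "):].strip() or None
    return lowered.get("x-api-key") or None


def is_authorized(headers: Mapping[str, str], allowed_keys: Iterable[str]) -> bool:
    """Check the presented key against the configured keys."""
    key = extract_api_key(headers)
    if key is None:
        return False
    # compare_digest avoids leaking key contents through timing differences
    return any(hmac.compare_digest(key.encode(), k.encode()) for k in allowed_keys)


allowed = ["sk-ollabridge-abc123", "sk-ollabridge-def456"]
print(is_authorized({"Authorization": "Bearer sk-ollabridge-abc123"}, allowed))
print(is_authorized({"X-API-Key": "sk-ollabridge-def456"}, allowed))
```

Both header styles carry the same key, so a client can use whichever one its HTTP library or framework makes easier to set.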

Configuration (.env)

# API Keys (comma-separated for multiple)
API_KEYS=sk-ollabridge-abc123,sk-ollabridge-def456

# Server
HOST=0.0.0.0
PORT=11435

# Default models
DEFAULT_MODEL=deepseek-r1
DEFAULT_EMBED_MODEL=nomic-embed-text

# Rate limiting
RATE_LIMIT=60/minute

# Security
ENROLLMENT_SECRET=your-secret-here
ENROLLMENT_TTL_SECONDS=3600

# Database (optional)
DATABASE_URL=postgresql://user:pass@localhost/ollabridge
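The settings above are plain KEY=VALUE pairs. The sketch below shows one way such a file could be read, including the comma-separated API_KEYS and the count/period form of RATE_LIMIT. It is an illustration, not OllaBridge's own loader: `GatewayConfig`, `parse_env`, and `load_config` are hypothetical names, and the fallback defaults simply mirror the sample values.

```python
from dataclasses import dataclass


@dataclass
class GatewayConfig:
    api_keys: list[str]
    host: str
    port: int
    rate_limit_count: int
    rate_limit_period: str


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and '#' comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        values[name.strip()] = value.strip()
    return values


def load_config(text: str) -> GatewayConfig:
    env = parse_env(text)
    # RATE_LIMIT looks like "60/minute": a count and a period
    count, _, period = env.get("RATE_LIMIT", "60/minute").partition("/")
    return GatewayConfig(
        api_keys=[k.strip() for k in env.get("API_KEYS", "").split(",") if k.strip()],
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "11435")),
        rate_limit_count=int(count),
        rate_limit_period=period,
    )


SAMPLE = """
# API Keys (comma-separated for multiple)
API_KEYS=sk-ollabridge-abc123,sk-ollabridge-def456
HOST=0.0.0.0
PORT=11435
RATE_LIMIT=60/minute
"""

config = load_config(SAMPLE)
print(config)
```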

Enrollment Tokens

Create short-lived tokens for nodes to join:

ollabridge enroll-create --ttl 3600

Tokens expire automatically for security.


📡 API Reference

Core Endpoints

| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Gateway health + node count |
| /v1/chat/completions | POST | OpenAI-compatible chat |
| /v1/embeddings | POST | Generate embeddings |
| /v1/models | GET | List available models (aggregated from nodes) |

Admin Endpoints (require API key)

| Endpoint | Method | Description |
|---|---|---|
| /admin/recent | GET | Recent request logs |
| /admin/runtimes | GET | List connected nodes |
| /admin/enroll | POST | Create enrollment token |

Example: Check Connected Nodes

curl -H "X-API-Key: your-key" http://localhost:11435/admin/runtimes

Response:

{
  "runtimes": [
    {
      "node_id": "local",
      "connector": "local_ollama",
      "healthy": true,
      "tags": ["local"],
      "models": ["deepseek-r1", "llama3.1"]
    },
    {
      "node_id": "colab-gpu-1",
      "connector": "relay_link",
      "healthy": true,
      "tags": ["gpu", "free"],
      "models": ["mixtral", "codellama"]
    }
  ]
}
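A response like the one above is easy to summarize in a script. The following sketch parses the sample payload and answers two common questions: which models each healthy node offers, and which nodes serve a given model. The helper names `models_by_node` and `nodes_serving` are hypothetical.

```python
import json

SAMPLE_RESPONSE = """
{
  "runtimes": [
    {"node_id": "local", "connector": "local_ollama", "healthy": true,
     "tags": ["local"], "models": ["deepseek-r1", "llama3.1"]},
    {"node_id": "colab-gpu-1", "connector": "relay_link", "healthy": true,
     "tags": ["gpu", "free"], "models": ["mixtral", "codellama"]}
  ]
}
"""


def models_by_node(payload: dict) -> dict[str, list[str]]:
    """Map each healthy node to the models it reports."""
    return {rt["node_id"]: rt["models"] for rt in payload["runtimes"] if rt["healthy"]}


def nodes_serving(payload: dict, model: str) -> list[str]:
    """List the healthy nodes that report the given model."""
    return [node for node, models in models_by_node(payload).items() if model in models]


payload = json.loads(SAMPLE_RESPONSE)
print(models_by_node(payload))
print(nodes_serving(payload, "mixtral"))
```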

🏗️ Architecture Deep Dive

How It Works

  1. Control Plane (Gateway): Your apps connect here
  2. Nodes: Any machine with GPUs/CPUs running models
  3. Relay Link: Nodes dial OUT to gateway (WebSocket)
  4. Router: Picks the best node for each request

Why "Dial Out" Matters

Traditional (broken):

App → Gateway → Try to reach GPU
                ❌ Blocked by firewall
                ❌ NAT issues
                ❌ No public IP

OllaBridge (works everywhere):

App → Gateway ← GPU dials in
               ✅ Works from anywhere
               ✅ No port forwarding
               ✅ Ephemeral IPs OK
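The dial-out pattern can be simulated in a few lines. In this toy sketch the node opens the link and then waits for work, so the gateway never needs to know the node's address. An in-process asyncio.Queue stands in for the WebSocket connection, and the `Gateway` class and `node` coroutine are hypothetical names, not OllaBridge code.

```python
import asyncio


class Gateway:
    def __init__(self) -> None:
        # One queue per connected node; each stands in for an open connection
        self.links: dict[str, asyncio.Queue] = {}

    def accept(self, node_id: str) -> asyncio.Queue:
        """Called when a node dials in; returns the link for that node."""
        queue: asyncio.Queue = asyncio.Queue()
        self.links[node_id] = queue
        return queue

    async def request(self, node_id: str, prompt: str) -> str:
        """Push work down an already-open link and wait for the reply."""
        reply = asyncio.get_running_loop().create_future()
        await self.links[node_id].put((prompt, reply))
        return await reply


async def node(gateway: Gateway, node_id: str) -> None:
    link = gateway.accept(node_id)  # the node initiates the connection
    while True:
        prompt, reply = await link.get()
        reply.set_result(f"{node_id} handled: {prompt}")


async def demo() -> str:
    gateway = Gateway()
    worker = asyncio.create_task(node(gateway, "gaming-pc"))
    await asyncio.sleep(0)  # give the node a chance to dial in
    try:
        return await gateway.request("gaming-pc", "Hello!")
    finally:
        worker.cancel()


result = asyncio.run(demo())
print(result)
```

Because the node opens the link itself, only outbound traffic from the node is required, which is why firewalls, NAT, and changing IP addresses do not get in the way.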

Connector Types

  • RelayLink: Node dials out via WebSocket (default, works everywhere)
  • DirectEndpoint: HTTP to stable node (best performance)
  • LocalOllama: Built-in local runtime (zero config)

OllaBridge picks the right one automatically.


📈 Scaling

Add More Workers

ollabridge start --workers 4

Use PostgreSQL

pip install psycopg2-binary
export DATABASE_URL=postgresql://user:pass@localhost/ollabridge
ollabridge start --workers 8

Add More Nodes

# Just keep adding nodes!
ollabridge-node join --control ... --token ...

OllaBridge automatically load-balances across all healthy nodes.
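The routing policy itself is not described here, so the following is only a sketch of one simple approach: round-robin over the healthy nodes that report the requested model. `RoundRobinRouter` is a hypothetical class, and the runtime records reuse the field names from the /admin/runtimes example.

```python
from itertools import count


class RoundRobinRouter:
    def __init__(self, runtimes: list[dict]) -> None:
        self.runtimes = runtimes
        self._counter = count()

    def pick(self, model: str) -> str:
        """Return the next healthy node that serves the model."""
        candidates = [
            rt["node_id"]
            for rt in self.runtimes
            if rt["healthy"] and model in rt["models"]
        ]
        if not candidates:
            raise LookupError(f"no healthy node serves {model!r}")
        return candidates[next(self._counter) % len(candidates)]


router = RoundRobinRouter([
    {"node_id": "local", "healthy": True, "models": ["deepseek-r1", "llama3.1"]},
    {"node_id": "gpu-1", "healthy": True, "models": ["deepseek-r1"]},
    {"node_id": "gpu-2", "healthy": False, "models": ["deepseek-r1"]},
])
picks = [router.pick("deepseek-r1") for _ in range(3)]
print(picks)
```

Unhealthy nodes are filtered out on every call, so a node that recovers is picked up again without any extra bookkeeping.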


🌍 Public Access (Optional)

Quick Demo (Ngrok)

ollabridge start --share

Production (Cloudflare Tunnel)

# Terminal 1: Start gateway
ollabridge start

# Terminal 2: Expose it
cloudflared tunnel --url http://localhost:11435

Now your gateway has a public https:// URL!

Security: Always use API keys for public gateways.


🎓 Beginner's Guide

"I've never used LLMs before"

  1. Install: pip install ollabridge
  2. Start: ollabridge start
  3. Copy the API key from the output
  4. Use this code:
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11435/v1",
    api_key="PASTE_KEY_HERE"
)

response = client.chat.completions.create(
    model="deepseek-r1",
    messages=[{"role": "user", "content": "Explain Python in simple terms"}]
)

print(response.choices[0].message.content)

That's it! You're running AI models on your computer.

"I want to add my gaming PC's GPU"

  1. On your main computer (gateway):

    ollabridge start
    # Copy the "Node join token" and gateway URL
    
  2. On your gaming PC:

    pip install ollabridge
    ollabridge-node join --control http://GATEWAY_IP:11435 --token TOKEN_HERE
    
  3. Done! Your apps can now use your gaming PC's power.

"I want to use free Colab GPUs"

  1. Start your gateway at home:

    ollabridge start --share
    # Note the public URL (https://xxx.ngrok.io)
    
  2. In Colab notebook:

    !pip install ollabridge
    !ollabridge-node join --control https://xxx.ngrok.io --token YOUR_TOKEN
    
  3. Now your apps use FREE Colab GPUs!

Pro tip: When Colab disconnects, just restart and run step 2 again. Zero config changes needed.


🛠️ CLI Commands Reference

OllaBridge includes powerful CLI commands for diagnostics, testing, and management.

Diagnostic Commands

ollabridge doctor

Diagnose your OllaBridge setup (Ollama, gateway, auth, CORS):

ollabridge doctor

Output:

                     OllaBridge Doctor
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Check               ┃ Result                          ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Ollama /api/tags    │ ✅ OK                           │
│ OllaBridge /health  │ ✅ OK                           │
│ API_KEYS configured │ ✅ yes                          │
│ CORS_ORIGINS        │ http://localhost:5173,...       │
│ Auth usage          │ Use Authorization: Bearer <key> │
└─────────────────────┴─────────────────────────────────┘

Use case: Troubleshooting connection issues, verifying setup before deployment.

ollabridge models

List available models (requires API key):

ollabridge models --api-key sk-ollabridge-xY9kL2mN8pQ4rT6vW1zA

Output:

deepseek-r1
llama3.1
mixtral

Use case: Verify which models are available across all nodes.

ollabridge test-chat

Send a test chat completion (requires API key):

# Simple test
ollabridge test-chat "Hello, how are you?" --api-key sk-ollabridge-...

# Specify model
ollabridge test-chat "Explain quantum computing" \
  --model deepseek-r1 \
  --api-key sk-ollabridge-...

Output:

╭─────────── Assistant ───────────╮
│ Hello! I'm doing well, thank    │
│ you for asking. How can I help  │
│ you today?                       │
╰──────────────────────────────────╯

Use case: Verify end-to-end connectivity, test API keys, validate model responses.

Gateway Management

ollabridge start

Start the gateway (standard mode):

ollabridge start

ollabridge start --lan

Start with LAN URLs displayed (for classroom/shared networks):

ollabridge start --lan

Output includes:

🌐 LAN Access
LAN API base:    http://192.168.1.50:11435/v1
LAN Health:      http://192.168.1.50:11435/health

Example (with API key):
curl -H 'Authorization: Bearer <API_KEY>' http://192.168.1.50:11435/v1/models

Use case: Sharing your gateway with other devices on your network (Quest headsets, phones, other laptops).
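If you want to work out the LAN base URL yourself, for example to print it in a setup script, a common trick is to ask the operating system which local address it would use for outbound traffic. This sketch is independent of OllaBridge; `lan_ip` and `lan_api_base` are hypothetical helpers, and port 11435 is taken from the examples above.

```python
import socket


def lan_ip() -> str:
    """Best-effort guess at this machine's LAN address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only selects a route
        sock.connect(("10.255.255.255", 1))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"  # no usable route, fall back to loopback
    finally:
        sock.close()


def lan_api_base(port: int = 11435) -> str:
    return f"http://{lan_ip()}:{port}/v1"


print(lan_api_base())
```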

ollabridge start --share

Expose a public URL (via ngrok):

ollabridge start --share

Use case: Remote access, connecting nodes from anywhere.

ollabridge enroll-create

Create enrollment tokens for nodes:

ollabridge enroll-create --ttl 3600

Quick Tasks

| Task | Command |
|---|---|
| List models (API) | ollabridge models --api-key <key> |
| Test connectivity | ollabridge test-chat "test" --api-key <key> |
| Check health | curl http://localhost:11435/health |
| Diagnose setup | ollabridge doctor |
| See nodes | curl -H "X-API-Key: <key>" http://localhost:11435/admin/runtimes |
| View logs | curl -H "X-API-Key: <key>" http://localhost:11435/admin/recent |
| Create token | ollabridge enroll-create |

For Developers

OllaBridge requires an API key to authenticate requests. If no .env file is provided, or the .env file does not contain API_KEYS, OllaBridge will automatically generate a temporary, per-run secret API key (sk-ollabridge-...), print it to the screen, and use it only for the current run so you can start developing immediately. This key is not written to disk by default, which prevents accidental persistence of credentials and improves security. If you explicitly want OllaBridge to persist the generated API key, you must opt in by starting the gateway with:

ollabridge start --write-env

In this case, OllaBridge will write the generated key to .env. For production deployments, it is strongly recommended to set API_KEYS using environment variables or a secure secret manager, rather than relying on a .env file. This design provides safe defaults while avoiding unintentionally storing sensitive information.
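The behavior described above can be summarized as a small decision procedure: use the configured keys if there are any, otherwise generate a temporary key, and persist it only on request. The sketch below illustrates that logic; `resolve_api_keys` is a hypothetical function, and the key length and file handling are assumptions, not OllaBridge's actual code.

```python
import secrets
import tempfile
from pathlib import Path


def resolve_api_keys(
    environ: dict[str, str], env_file: Path, write_env: bool = False
) -> tuple[list[str], bool]:
    """Return (keys, generated). A generated key is persisted only when write_env is set."""
    configured = [k.strip() for k in environ.get("API_KEYS", "").split(",") if k.strip()]
    if configured:
        return configured, False
    key = "sk-ollabridge-" + secrets.token_urlsafe(24)
    if write_env:
        with env_file.open("a", encoding="utf-8") as fh:
            fh.write(f"API_KEYS={key}\n")
    return [key], True


with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    keys, generated = resolve_api_keys({}, env_path)
    persisted_by_default = env_path.exists()
    resolve_api_keys({}, env_path, write_env=True)
    persisted_with_flag = env_path.read_text().startswith("API_KEYS=sk-ollabridge-")

print(generated, persisted_by_default, persisted_with_flag)
```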


☁️ Optional: OllaBridge Cloud

OllaBridge Local can optionally connect to OllaBridge Cloud for multi-user, multi-device deployments.

Cloud Features

  • 🔐 Secure device pairing with user approval
  • 👥 Multi-user support with device ownership
  • 🌍 No port forwarding needed (devices dial out to Cloud)
  • 📱 Multi-device per user (PC + Quest + phone, etc.)
  • 🔄 Streaming support for real-time responses

Pairing Your Device with Cloud

# 1. Pair this device with OllaBridge Cloud
ollabridge-node cloud-pair --cloud https://your-cloud-url.com

# Shows pairing code - approve via web UI

# 2. Connect to Cloud (uses saved credentials)
ollabridge-node cloud-connect

How it works:

  1. cloud-pair gets a pairing code from Cloud
  2. You approve the code via Cloud's web UI
  3. Device credentials saved to ~/.ollabridge/cloud_device.json
  4. cloud-connect connects your device to Cloud relay
  5. Cloud routes requests to your device securely

Local Mode (Default) vs Cloud Mode

| Feature | Local Mode | Cloud Mode |
|---|---|---|
| Setup | ollabridge-node join --control <gateway> --token <token> | ollabridge-node cloud-pair --cloud <url> |
| Authentication | Enrollment token | Device pairing + approval |
| Users | Single self-hosted | Multi-user cloud accounts |
| Devices | Manual node management | Per-user device ownership |
| Streaming | Not yet | ✅ Supported |
| Port forwarding | Not needed (outbound) | Not needed (outbound) |

Both modes work together! Run local gateway + nodes for self-hosting, and optionally pair devices with Cloud for multi-user scenarios.


🗺️ Roadmap

  • ✅ Control Plane + Node architecture
  • ✅ Outbound-only node enrollment (no port forwarding)
  • ✅ MCP server for AI agent control
  • ✅ Multi-node load balancing
  • ✅ Diagnostic CLI commands (doctor, models, test-chat)
  • ✅ Enhanced CORS handling for browser clients
  • ✅ LAN mode for classroom/shared network deployments
  • ✅ Cloud compatibility (optional device pairing)
  • ✅ Streaming support for chat completions (Cloud mode)
  • 🚧 Tag-based routing (send "coding" requests to GPU nodes)
  • 🚧 Model-specific routing rules
  • 🚧 Web UI for node management
  • 🚧 Prometheus metrics
  • 🚧 Support for more runtimes (vLLM, llama.cpp, LM Studio)

🤝 Contributing

We welcome contributions! Areas we'd love help:

  • 🔌 More runtime adapters (vLLM, llama.cpp, etc.)
  • 🎨 Web UI for management
  • 📊 Better monitoring/metrics
  • 🔒 Security enhancements
  • 📖 Documentation improvements

How to contribute:

  1. Fork the repo
  2. Create a branch (git checkout -b feature/amazing)
  3. Make your changes
  4. Add tests
  5. Submit a PR

📄 License

Apache License 2.0 - see LICENSE


🌟 Star History

If OllaBridge helped you, give it a star! ⭐


Made with ❤️ for the local-first AI community

Stop paying cloud tokens. Use your own compute.
