Turn your PC into a private, OpenAI-compatible LLM provider in ~60 seconds

These details have not been verified by PyPI

Project description

OllaBridge ⚡️

Your single gateway to ALL your LLMs — local, remote, anywhere.

🎯 What is OllaBridge?

One gateway. All your LLMs. Everywhere.

OllaBridge is your single, OpenAI-compatible API for every LLM you run — on your laptop, workstation, free GPU servers, cloud instances, anywhere.

The Problem: You have models running everywhere (laptop, cloud GPU, friend's gaming PC), and every app needs different configs.

OllaBridge Solution: Apps connect to ONE place. OllaBridge routes to the right compute automatically.

graph TB
    A[Your Apps] -->|OpenAI SDK| B[OllaBridge<br/>Control Plane]

    B -->|Auto Routes| C[Local Laptop<br/>llama3.1]
    B -->|Auto Routes| D[Free GPU Cloud<br/>deepseek-r1]
    B -->|Auto Routes| E[Remote Workstation<br/>mixtral]

    C -.->|Dials Out| B
    D -.->|Dials Out| B
    E -.->|Dials Out| B

    style B fill:#6366f1,stroke:#4f46e5,stroke-width:3px,color:#fff
    style A fill:#10b981,stroke:#059669,stroke-width:2px,color:#fff
    style C fill:#8b5cf6,stroke:#7c3aed,stroke-width:2px,color:#fff
    style D fill:#ec4899,stroke:#db2777,stroke-width:2px,color:#fff
    style E fill:#f59e0b,stroke:#d97706,stroke-width:2px,color:#fff

Key Innovation: Compute nodes dial out to your gateway. No port forwarding, no VPN, no config hell.

🚀 Why OllaBridge?

🎯 Single Source of Truth

✅ One URL for everything — Your apps never change code
✅ Zero config — Add new GPUs without touching your app
✅ Smart routing — OllaBridge picks the best node automatically
✅ OpenAI compatible — Works with any SDK, framework, or tool

🛡️ Enterprise-Grade Security

✅ API key authentication — Protect your LLMs
✅ Rate limiting — Control usage per key
✅ Request logging — Full audit trail
✅ Encrypted connections — TLS for remote nodes

🌍 Works Everywhere

✅ Free GPU clouds — Colab, Kaggle, Lightning AI (no port forwarding needed!)
✅ Ephemeral instances — Nodes dial out, IPs don't matter
✅ Behind firewalls — Your laptop can join from coffee shop WiFi
✅ Mixed environments — Combine local + cloud seamlessly

🤖 AI Agent Ready

✅ MCP server — Agents can control your infrastructure
✅ Tool exposure — Manage nodes, routes, health via tools
✅ Self-healing — Auto-install, auto-configure, auto-recover

⚡ 60-Second Start

Step 1: Install

pip install ollabridge

Step 2: Start Your Gateway

ollabridge start

That's it! You'll see:

✅ Ollama installed (if needed)
✅ Model downloaded (if needed)
✅ Gateway online at http://localhost:11435

╭─────────────────── 🚀 Gateway Ready ────────────────────╮
│                                                          │
│ ✅ OllaBridge is Online                                  │
│                                                          │
│ Model:        deepseek-r1                                │
│ Local API:    http://localhost:11435/v1                 │
│ Key:          sk-ollabridge-xY9kL2mN8pQ4rT6vW1zA        │
│                                                          │
│ Node join token:  eyJ0eXAi...                           │
│ Example node command:                                    │
│   ollabridge-node join --control http://localhost:11435 │
│                        --token eyJ0eXAi...              │
│                                                          │
╰──────────────────────────────────────────────────────────╯

Step 3: Use It!

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11435/v1",
    api_key="sk-ollabridge-xY9kL2mN8pQ4rT6vW1zA"
)

response = client.chat.completions.create(
    model="deepseek-r1",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)

Done! You're running private LLMs behind an OpenAI-compatible API.

🌍 Add Any GPU in 60 Seconds

Have a free GPU on Colab? A remote workstation? Add it instantly:

On Your Remote GPU/Machine:

# Install
pip install ollabridge

# Join your gateway (copy the command from gateway startup)
ollabridge-node join \
  --control http://YOUR_GATEWAY_IP:11435 \
  --token eyJ0eXAi...

That's it! The remote GPU:

✅ Auto-installs Ollama if needed
✅ Auto-downloads models if needed
✅ Dials out to your gateway (no port forwarding!)
✅ Shows up as available compute

Your Apps See It Automatically

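# Same code, now uses both local + remote GPU!
from openai import OpenAI
client = OpenAI(base_url="http://localhost:11435/v1", api_key="your-key-here")
response = client.chat.completions.create(  # Auto-routed
    model="deepseek-r1", messages=[{"role": "user", "content": "Hello!"}]
)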

OllaBridge routes requests across all your nodes automatically.

🎯 Real-World Scenarios

Scenario 1: "I have a gaming PC at home"

# On your gaming PC:
ollabridge-node join --control https://your-gateway.com --token ...

# Now your laptop can use your gaming PC's GPU
# Even if you're at a coffee shop!

Scenario 2: "I want to use free Colab GPUs"

# In Colab notebook:
!pip install ollabridge
!ollabridge-node join --control https://your-gateway.com --token ...

# Now your production app can use free Colab compute
# Colab session ends? Start a new one. Zero config changes.

Scenario 3: "I have multiple cloud GPUs"

# Each GPU instance:
ollabridge-node join --control https://gateway.company.com --token ...

# Your team shares one API URL
# OllaBridge load-balances across all GPUs

💻 Use It Anywhere

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11435/v1",
    api_key="your-key-here"
)

# Chat
response = client.chat.completions.create(
    model="deepseek-r1",
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)

# Embeddings
embeddings = client.embeddings.create(
    model="nomic-embed-text",
    input="Hello, world!"
)

Node.js / TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:11435/v1",
  apiKey: process.env.OLLABRIDGE_KEY
});

const completion = await client.chat.completions.create({
  model: "deepseek-r1",
  messages: [{ role: "user", content: "Hello!" }]
});

LangChain

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="http://localhost:11435/v1",
    api_key="your-key-here",
    model="deepseek-r1"
)

response = llm.invoke("What is the meaning of life?")

cURL

curl -X POST http://localhost:11435/v1/chat/completions \
  -H "Authorization: Bearer your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

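Works with any OpenAI-compatible tool or library that talks to the chat completions, embeddings, or models endpoints. Streaming chat completions are still on the roadmap.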
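Where installing an SDK is not an option, the cURL call above can be reproduced with Python's standard library. The following is a minimal sketch, not part of OllaBridge: the helper name `build_chat_request` is hypothetical, and the endpoint path, headers, and payload shape are copied from the cURL example.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body (needs a running gateway)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a gateway running, `send(build_chat_request("http://localhost:11435/v1", "your-key-here", "deepseek-r1", "Hello!"))` should return a dictionary in the OpenAI chat completion shape.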

🤖 AI Agents Love OllaBridge

OllaBridge has a Model Context Protocol (MCP) server built-in.

Agents can:

✅ Create enrollment tokens
✅ List connected compute nodes
✅ Check gateway health
✅ Manage your LLM infrastructure via tools

Start MCP Server

ollabridge-mcp

Example: Agent Workflow

# Agent can call these tools (inside an async function, where `session` is an initialized MCP client session):
await session.call_tool("ollabridge.enroll.create", {})
# → Returns enrollment token

await session.call_tool("ollabridge.runtimes.list", {})
# → Shows all connected nodes

await session.call_tool("ollabridge.gateway.health", {})
# → Checks gateway status

Use Case: "Hey Claude, add my workstation's GPU to our LLM gateway"

→ Agent creates token, gives you the command, you run it. Done.

🔐 Security & Configuration

Authentication

OllaBridge auto-generates a secure API key on first run (saved in .env):

API_KEYS=sk-ollabridge-xY9kL2mN8pQ4rT6vW1zA

Use it in your apps:

# Option 1: Bearer token
headers = {"Authorization": "Bearer sk-ollabridge-..."}

# Option 2: Custom header
headers = {"X-API-Key": "sk-ollabridge-..."}
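To keep the key out of source code, the headers can be built from an environment variable. This is a small sketch under assumptions: the helper `auth_headers` is hypothetical, and the variable name `OLLABRIDGE_KEY` is borrowed from the Node.js example rather than being something OllaBridge reads itself.

```python
import os
from typing import Mapping


def auth_headers(style: str = "bearer", env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return headers for one of the two documented authentication styles."""
    api_key = env["OLLABRIDGE_KEY"]  # assumed variable name; raises KeyError if unset
    if style == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if style == "x-api-key":
        return {"X-API-Key": api_key}
    raise ValueError(f"unknown auth style: {style!r}")
```

Passing `env` explicitly makes the helper easy to test without touching the real environment.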

Configuration (`.env`)

# API Keys (comma-separated for multiple)
API_KEYS=sk-ollabridge-abc123,sk-ollabridge-def456

# Server
HOST=0.0.0.0
PORT=11435

# Default models
DEFAULT_MODEL=deepseek-r1
DEFAULT_EMBED_MODEL=nomic-embed-text

# Rate limiting
RATE_LIMIT=60/minute

# Security
ENROLLMENT_SECRET=your-secret-here
ENROLLMENT_TTL_SECONDS=3600

# Database (optional)
DATABASE_URL=postgresql://user:pass@localhost/ollabridge
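The value formats above (comma-separated keys, `count/period` rate limits) can be illustrated with a short parser. This is a sketch only: how OllaBridge actually parses its `.env` is not shown in this README, the function names are hypothetical, the accepted period names are an assumption, and the fallback values simply mirror the sample file rather than confirmed defaults.

```python
import os
from typing import Mapping

# Seconds per rate-limit period; the accepted period names are an assumption.
PERIOD_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def parse_api_keys(raw: str) -> list[str]:
    """Split a comma-separated API_KEYS value, dropping blanks and whitespace."""
    return [key.strip() for key in raw.split(",") if key.strip()]


def parse_rate_limit(raw: str) -> tuple[int, int]:
    """Turn a value such as '60/minute' into (max_requests, window_seconds)."""
    count, _, period = raw.partition("/")
    return int(count), PERIOD_SECONDS[period.strip().lower()]


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the settings from the sample file; fallbacks mirror that sample."""
    return {
        "api_keys": parse_api_keys(env.get("API_KEYS", "")),
        "host": env.get("HOST", "0.0.0.0"),
        "port": int(env.get("PORT", "11435")),
        "rate_limit": parse_rate_limit(env.get("RATE_LIMIT", "60/minute")),
        "enrollment_ttl_seconds": int(env.get("ENROLLMENT_TTL_SECONDS", "3600")),
    }
```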

Enrollment Tokens

Create short-lived tokens for nodes to join:

ollabridge enroll-create --ttl 3600

Tokens expire automatically for security.
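To show what a short-lived, signed token involves, here is a minimal sketch built on the standard library. It is not OllaBridge's token format (the `eyJ0eXAi...` prefix in the startup output suggests a JWT); the functions `create_token` and `verify_token` are hypothetical, and the secret plays the role of `ENROLLMENT_SECRET`.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: str, body: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), body.encode("ascii"), hashlib.sha256).digest()


def create_token(secret: str, ttl_seconds: int = 3600, now: Optional[float] = None) -> str:
    """Return 'payload.signature', where the payload carries an expiry time."""
    issued_at = time.time() if now is None else now
    body = _b64(json.dumps({"exp": issued_at + ttl_seconds}).encode("utf-8"))
    return f"{body}.{_b64(_sign(secret, body))}"


def verify_token(secret: str, token: str, now: Optional[float] = None) -> bool:
    """Accept the token only if the signature matches and it has not expired."""
    body, separator, signature = token.partition(".")
    if not separator:
        return False
    try:
        if not hmac.compare_digest(_unb64(signature), _sign(secret, body)):
            return False
        expires_at = json.loads(_unb64(body))["exp"]
    except (ValueError, KeyError, TypeError):
        return False
    current = time.time() if now is None else now
    return current < expires_at
```

The signature is checked before the payload is trusted, and `hmac.compare_digest` avoids leaking timing information during the comparison.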

📡 API Reference

Core Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Gateway health + node count |
| `/v1/chat/completions` | POST | OpenAI-compatible chat |
| `/v1/embeddings` | POST | Generate embeddings |
| `/v1/models` | GET | List available models (aggregated from nodes) |

Admin Endpoints (require API key)

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/admin/recent` | GET | Recent request logs |
| `/admin/runtimes` | GET | List connected nodes |
| `/admin/enroll` | POST | Create enrollment token |

Example: Check Connected Nodes

curl -H "X-API-Key: your-key" http://localhost:11435/admin/runtimes

Response:

{
  "runtimes": [
    {
      "node_id": "local",
      "connector": "local_ollama",
      "healthy": true,
      "tags": ["local"],
      "models": ["deepseek-r1", "llama3.1"]
    },
    {
      "node_id": "colab-gpu-1",
      "connector": "relay_link",
      "healthy": true,
      "tags": ["gpu", "free"],
      "models": ["mixtral", "codellama"]
    }
  ]
}
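A response in this shape is straightforward to summarize in Python. The sketch below assumes only the fields shown in the sample response (`runtimes`, `node_id`, `healthy`, `models`); the helper names are hypothetical.

```python
def healthy_nodes_for(payload: dict, model: str) -> list[str]:
    """Return the IDs of healthy nodes that list the given model."""
    return [
        runtime["node_id"]
        for runtime in payload.get("runtimes", [])
        if runtime.get("healthy") and model in runtime.get("models", [])
    ]


def available_models(payload: dict) -> list[str]:
    """Return the sorted, de-duplicated models served by healthy nodes."""
    models = set()
    for runtime in payload.get("runtimes", []):
        if runtime.get("healthy"):
            models.update(runtime.get("models", []))
    return sorted(models)
```

Feeding it the decoded JSON from `/admin/runtimes` gives a quick answer to "which node can serve this model right now?".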

🏗️ Architecture Deep Dive

How It Works

Control Plane (Gateway): Your apps connect here
Nodes: Any machine with GPUs/CPUs running models
Relay Link: Nodes dial OUT to gateway (WebSocket)
Router: Picks the best node for each request

Why "Dial Out" Matters

Traditional (broken):

App → Gateway → Try to reach GPU
                ❌ Blocked by firewall
                ❌ NAT issues
                ❌ No public IP

OllaBridge (works everywhere):

App → Gateway ← GPU dials in
               ✅ Works from anywhere
               ✅ No port forwarding
               ✅ Ephemeral IPs OK
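The dial-out idea can be demonstrated in a few lines of `asyncio`. This is a simplified sketch, not OllaBridge's relay code: it uses a plain TCP stream with newline-delimited JSON as a stand-in for the WebSocket relay link, runs gateway and node in one process, and the message fields (`node_id`, `id`, `prompt`, `output`) are invented for illustration.

```python
import asyncio
import json


async def run_demo() -> dict:
    """Run one gateway and one node in-process and route a single job."""
    loop = asyncio.get_running_loop()
    node_link: asyncio.Future = loop.create_future()

    async def on_node_connect(reader, writer):
        # Gateway side: a node has dialed in; keep its streams for later use.
        hello = json.loads(await reader.readline())
        node_link.set_result((hello["node_id"], reader, writer))

    # The gateway is the only party that listens on a port.
    server = await asyncio.start_server(on_node_connect, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    async def node():
        # Node side: outbound connection only; it never accepts connections.
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(json.dumps({"node_id": "demo-node"}).encode() + b"\n")
        await writer.drain()
        job = json.loads(await reader.readline())
        # Stand-in for running a model: echo the prompt in upper case.
        reply = {"id": job["id"], "output": job["prompt"].upper()}
        writer.write(json.dumps(reply).encode() + b"\n")
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    node_task = asyncio.create_task(node())
    node_id, reader, writer = await node_link

    # The gateway pushes the job down the connection the node opened.
    writer.write(json.dumps({"id": 1, "prompt": "hello"}).encode() + b"\n")
    await writer.drain()
    reply = json.loads(await reader.readline())

    await node_task
    writer.close()
    await writer.wait_closed()
    server.close()
    return {"node_id": node_id, **reply}
```

Running `asyncio.run(run_demo())` sends one job to the node over the connection the node itself opened, which is why no inbound port is needed on the node's side.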

Connector Types

RelayLink: Node dials out via WebSocket (default, works everywhere)
DirectEndpoint: HTTP to stable node (best performance)
LocalOllama: Built-in local runtime (zero config)

OllaBridge picks the right one automatically.

📈 Scaling

Add More Workers

ollabridge start --workers 4

Use PostgreSQL

pip install psycopg2-binary
export DATABASE_URL=postgresql://user:pass@localhost/ollabridge
ollabridge start --workers 8

Add More Nodes

# Just keep adding nodes!
ollabridge-node join --control ... --token ...

OllaBridge automatically load-balances across all healthy nodes.
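The README does not say which balancing strategy the router uses. As one plausible illustration, the sketch below does per-model round-robin over healthy nodes, using the runtime fields shown in the `/admin/runtimes` response; the class `RoundRobinRouter` is hypothetical and not OllaBridge's router.

```python
import itertools
from collections import defaultdict


class RoundRobinRouter:
    """Rotate requests for a model across the healthy nodes that serve it."""

    def __init__(self, runtimes: list[dict]) -> None:
        self._runtimes = runtimes
        # One independent counter per model name, each starting at 0.
        self._counters = defaultdict(itertools.count)

    def pick(self, model: str) -> str:
        candidates = [
            runtime["node_id"]
            for runtime in self._runtimes
            if runtime.get("healthy") and model in runtime.get("models", [])
        ]
        if not candidates:
            raise LookupError(f"no healthy node serves {model!r}")
        turn = next(self._counters[model])
        return candidates[turn % len(candidates)]
```

Unhealthy nodes are skipped at pick time, so a node that recovers rejoins the rotation on the next request without any extra bookkeeping.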

🌍 Public Access (Optional)

Quick Demo (Ngrok)

ollabridge start --share

Production (Cloudflare Tunnel)

# Terminal 1: Start gateway
ollabridge start

# Terminal 2: Expose it
cloudflared tunnel --url http://localhost:11435

Now your gateway has a public https:// URL!

Security: Always use API keys for public gateways.

🎓 Beginner's Guide

"I've never used LLMs before"

1. Install: `pip install ollabridge`
2. Start: `ollabridge start`
3. Copy the API key from the output
4. Use this code:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11435/v1",
    api_key="PASTE_KEY_HERE"
)

response = client.chat.completions.create(
    model="deepseek-r1",
    messages=[{"role": "user", "content": "Explain Python in simple terms"}]
)

print(response.choices[0].message.content)

That's it! You're running AI models on your computer.

"I want to add my gaming PC's GPU"

On your main computer (gateway):

ollabridge start
# Copy the "Node join token" and gateway URL

On your gaming PC:

pip install ollabridge
ollabridge-node join --control http://GATEWAY_IP:11435 --token TOKEN_HERE

Done! Your apps can now use your gaming PC's power.

"I want to use free Colab GPUs"

Start your gateway at home:

ollabridge start --share
# Note the public URL (https://xxx.ngrok.io)

In Colab notebook:

!pip install ollabridge
!ollabridge-node join --control https://xxx.ngrok.io --token YOUR_TOKEN

Now your apps use FREE Colab GPUs!

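Pro tip: When Colab disconnects, start a new session and run the Colab notebook commands again. If the enrollment token has expired, create a new one with `ollabridge enroll-create`. Your apps need no config changes.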

🛠️ Common Tasks

List Available Models

curl http://localhost:11435/v1/models

Check Gateway Health

curl http://localhost:11435/health

See Connected Nodes

curl -H "X-API-Key: your-key" http://localhost:11435/admin/runtimes

Create New Enrollment Token

ollabridge enroll-create

View Recent Requests

curl -H "X-API-Key: your-key" http://localhost:11435/admin/recent

🗺️ Roadmap

✅ Control Plane + Node architecture
✅ Outbound-only node enrollment (no port forwarding)
✅ MCP server for AI agent control
✅ Multi-node load balancing
🚧 Tag-based routing (send "coding" requests to GPU nodes)
🚧 Model-specific routing rules
🚧 Streaming support for chat completions
🚧 Web UI for node management
🚧 Prometheus metrics
🚧 Support for more runtimes (vLLM, llama.cpp, LM Studio)

🤝 Contributing

We welcome contributions! Areas we'd love help:

🔌 More runtime adapters (vLLM, llama.cpp, etc.)
🎨 Web UI for management
📊 Better monitoring/metrics
🔒 Security enhancements
📖 Documentation improvements

How to contribute:

1. Fork the repo
2. Create a branch (`git checkout -b feature/amazing`)
3. Make your changes
4. Add tests
5. Submit a PR

📄 License

MIT License - see LICENSE

🙏 Built With

FastAPI — Modern async web framework
Ollama — Run LLMs locally
WebSockets — Real-time node connections
SQLModel — Database with Python types

💬 Support

🌟 Star History

If OllaBridge helped you, give it a star! ⭐

Made with ❤️ for the local-first AI community

Stop paying for cloud tokens. Use your own compute.

Release history

0.1.3

Mar 23, 2026

0.1.2

Jan 22, 2026

0.1.1

Jan 6, 2026

This version

0.1.0

Dec 29, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ollabridge-0.1.0.tar.gz (35.6 kB view details)

Uploaded Dec 29, 2025 Source

Built Distribution

ollabridge-0.1.0-py3-none-any.whl (15.8 kB view details)

Uploaded Dec 29, 2025 Python 3

File details

Details for the file ollabridge-0.1.0.tar.gz.

File metadata

Download URL: ollabridge-0.1.0.tar.gz
Upload date: Dec 29, 2025
Size: 35.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ollabridge-0.1.0.tar.gz
| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | `13d8b8c88b1e5751edd310659f931f3b3540568babc6269e76215295c5d2d59f` |
| MD5 | `242ebceee90a346b8dbb69e391d52959` |
| BLAKE2b-256 | `74053133285125dc845599a2804313ec57cf82b27c7f1c3e2256c09f6852447f` |

Provenance

The following attestation bundles were made for ollabridge-0.1.0.tar.gz:

Publisher: publish.yml on ruslanmv/ollabridge

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ollabridge-0.1.0.tar.gz
- Subject digest: 13d8b8c88b1e5751edd310659f931f3b3540568babc6269e76215295c5d2d59f
- Sigstore transparency entry: 781143792
- Sigstore integration time: Dec 29, 2025
Source repository:
- Permalink: ruslanmv/ollabridge@7a69d8599a1a35c368dbe762613b2a2da3b3711c
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/ruslanmv
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7a69d8599a1a35c368dbe762613b2a2da3b3711c
- Trigger Event: release

File details

Details for the file ollabridge-0.1.0-py3-none-any.whl.

File metadata

Download URL: ollabridge-0.1.0-py3-none-any.whl
Upload date: Dec 29, 2025
Size: 15.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ollabridge-0.1.0-py3-none-any.whl
| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | `ce19e0d5bf988dfedf31036b6f4cc31c75ec4f158358f90d4df22344e7770037` |
| MD5 | `34cd16b8ca4a1764c3fe8aab2a429ef0` |
| BLAKE2b-256 | `c1854c13445adf144cc59da820bc236b46f197b32d4df9338ef9d42e57f3b290` |

Provenance

The following attestation bundles were made for ollabridge-0.1.0-py3-none-any.whl:

Publisher: publish.yml on ruslanmv/ollabridge

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ollabridge-0.1.0-py3-none-any.whl
- Subject digest: ce19e0d5bf988dfedf31036b6f4cc31c75ec4f158358f90d4df22344e7770037
- Sigstore transparency entry: 781143793
- Sigstore integration time: Dec 29, 2025
Source repository:
- Permalink: ruslanmv/ollabridge@7a69d8599a1a35c368dbe762613b2a2da3b3711c
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/ruslanmv
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7a69d8599a1a35c368dbe762613b2a2da3b3711c
- Trigger Event: release
