MatrixLLM: OpenAI-compatible multi-provider LLM router (OpenRouter-style) with optional relay nodes
Project description
MatrixLLM
OpenAI-compatible multi-provider LLM router with optional relay nodes.
Quick Start | How It Works | Multi-Provider Routing | OllaBridge Compatible
What is MatrixLLM?
MatrixLLM turns your computer into a private OpenAI-compatible API server.
Instead of sending your data to OpenAI, you can:
- Run AI models locally on your own computer
- Connect to multiple providers (OpenAI, Anthropic, Google, IBM) through one API
- Use the same code you'd use with OpenAI
Your App (uses OpenAI SDK)
            |
            v
  +------------------+
  |    MatrixLLM     |  <-- Runs on localhost:11435
  +------------------+
     /      |      \
    v       v       v
 Ollama   OpenAI   Anthropic (etc.)
 (local)  (cloud)  (cloud)
Quick Start for Beginners
What You Need
- Python 3.10 or newer (Download Python)
- 5 minutes of your time
Step 1: Install MatrixLLM
Open your terminal (Command Prompt on Windows, Terminal on Mac/Linux) and run:
pip install matrixllm
Step 2: Start the Server
matrixllm start
That's it! You'll see something like this:
╭─────────────────── Gateway Ready ───────────────────╮
│ │
│ ✅ MatrixLLM is Online │
│ │
│ Model: deepseek-r1 │
│ Local API: http://localhost:11435/v1 │
│ Health: http://localhost:11435/health │
│ Key: sk-matrixllm-xY9kL2mN8pQ4rT6v │
│ │
│ Ollabridge compatible: │
│ OLLAS_BASE_URL=http://localhost:11435/v1 │
│ OLLAS_API_KEY=sk-matrixllm-xY9kL2mN8pQ4rT6v │
│ OLLAS_MODEL=deepseek-r1 │
│ │
╰──────────────────────────────────────────────────────╯
Important: Copy the API key shown (starts with sk-matrixllm-). You'll need it!
Step 3: Use It in Your Code
from openai import OpenAI

# Connect to your local MatrixLLM server
client = OpenAI(
    base_url="http://localhost:11435/v1",
    api_key="sk-matrixllm-YOUR-KEY-HERE"  # Use the key from Step 2
)

# Send a message to the AI
response = client.chat.completions.create(
    model="deepseek-r1",
    messages=[{"role": "user", "content": "Hello! What can you do?"}]
)

# Print the AI's response
print(response.choices[0].message.content)
Step 4: Test with curl (Optional)
You can also test from the command line:
curl http://localhost:11435/v1/chat/completions \
  -H "Authorization: Bearer sk-matrixllm-YOUR-KEY-HERE" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
How It Works
API Keys Explained
When you run matrixllm start, it automatically generates a secure API key:
sk-matrixllm-xY9kL2mN8pQ4rT6vW1zA...
What does this mean?
- sk- = "secret key" (standard prefix)
- matrixllm- = identifies this as a MatrixLLM key
- xY9kL2mN... = random secure characters
You can set your own key by creating a .env file:
API_KEYS=my-custom-api-key
Or use multiple keys (comma-separated):
API_KEYS=key-for-app-1,key-for-app-2,key-for-testing
Authentication Methods
MatrixLLM accepts API keys in two ways:
# Method 1: Authorization header (recommended)
headers = {"Authorization": "Bearer sk-matrixllm-xxx"}
# Method 2: X-API-Key header
headers = {"X-API-Key": "sk-matrixllm-xxx"}
Both work identically. The OpenAI SDK uses Method 1 automatically.
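If you are not using the OpenAI SDK, either header can be set directly on a raw HTTP request. A minimal sketch using the requests library (assumes the server from Step 2 is running locally and deepseek-r1 is available):
import requests

# Chat completion request authenticated with the X-API-Key header (Method 2)
resp = requests.post(
    "http://localhost:11435/v1/chat/completions",
    headers={"X-API-Key": "sk-matrixllm-YOUR-KEY-HERE"},
    json={
        "model": "deepseek-r1",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])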
OllaBridge Compatibility
MatrixLLM is fully compatible with OllaBridge. Both projects share the same API interface, making it easy to switch between them or migrate your applications.
When to Use Each
| Feature | OllaBridge | MatrixLLM |
|---|---|---|
| Use Case | Simple local-only proxy | Multi-provider enterprise router |
| Ollama Support | Local only | Local + distributed nodes |
| Cloud Providers | No | OpenAI, Anthropic, Google, IBM |
| Distributed Compute | No | Yes (relay nodes) |
| Complexity | Minimal | Full-featured |
Choose OllaBridge when:
- You only need local Ollama models
- You want a lightweight, simple setup
- You don't need cloud provider integration
Choose MatrixLLM when:
- You need multi-provider routing (OpenAI, Anthropic, Google, IBM)
- You want distributed compute across multiple machines
- You need enterprise features like load balancing and failover
Shared Configuration
Both projects use the same:
- Port: 11435
- API structure: /v1/chat/completions, /v1/embeddings, /v1/models
- Environment variables: API_KEYS, OLLAMA_BASE_URL, DEFAULT_MODEL
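For example, a minimal .env that both projects understand (values are illustrative; see the Configuration Reference below for the full list):
# Minimal shared configuration (example values)
API_KEYS=my-secure-key
OLLAMA_BASE_URL=http://localhost:11434
DEFAULT_MODEL=deepseek-r1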
Using OLLAS_* Environment Variables
MatrixLLM supports OllaBridge-style environment variables for seamless migration:
| Variable | Description | Example |
|---|---|---|
| OLLAS_API_KEY | API key (alias for API_KEYS) | sk-matrixllm-xxx |
| OLLAS_BASE_URL | Server URL | http://localhost:11435/v1 |
| OLLAS_MODEL | Default model (alias for DEFAULT_MODEL) | deepseek-r1 |
Example .env file:
# OllaBridge-compatible configuration
OLLAS_API_KEY=sk-matrixllm-your-key-here
OLLAS_BASE_URL=http://localhost:11435/v1
OLLAS_MODEL=deepseek-r1
Example Python code:
import os
from openai import OpenAI

# Works with both MatrixLLM and OllaBridge
client = OpenAI(
    base_url=os.getenv("OLLAS_BASE_URL", "http://localhost:11435/v1"),
    api_key=os.getenv("OLLAS_API_KEY", "your-key-here"),
)

response = client.chat.completions.create(
    model=os.getenv("OLLAS_MODEL", "deepseek-r1"),
    messages=[{"role": "user", "content": "Hello!"}]
)
Migration Path
Switch between OllaBridge and MatrixLLM without changing your application code:
# Start with OllaBridge (simple local setup)
pip install ollabridge
ollabridge start
# Upgrade to MatrixLLM (when you need more features)
pip install matrixllm
matrixllm start
Your application code stays exactly the same!
Multi-Provider Routing
MatrixLLM can route requests to different AI providers based on a prefix in the model name (a short sketch of how prefix routing works follows the table below).
Supported Providers
| Provider | Model Prefix | Example |
|---|---|---|
| Local Ollama | (no prefix) | deepseek-r1, llama3 |
| OpenAI | openai/ | openai/gpt-4o-mini |
| Anthropic | anthropic/ | anthropic/claude-3-5-sonnet-latest |
| Google Gemini | google/ | google/gemini-1.5-pro |
| IBM watsonx | ibm/ | ibm/granite-3-8b-instruct |
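Conceptually, prefix routing splits the requested model name on the first slash and dispatches to the matching provider; names without a known prefix go to local Ollama. A minimal illustrative sketch of the idea (not MatrixLLM's actual implementation):
# Illustrative sketch of prefix-based routing (not the actual MatrixLLM code)
PROVIDERS = {"openai", "anthropic", "google", "ibm"}

def route(model: str) -> tuple[str, str]:
    """Return (provider, upstream_model_name) for a requested model string."""
    prefix, sep, rest = model.partition("/")
    if sep and prefix in PROVIDERS:
        return prefix, rest        # "openai/gpt-4o-mini" -> ("openai", "gpt-4o-mini")
    return "ollama", model         # no known prefix -> local Ollama

print(route("anthropic/claude-3-5-sonnet-latest"))  # ('anthropic', 'claude-3-5-sonnet-latest')
print(route("deepseek-r1"))                          # ('ollama', 'deepseek-r1')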
Set Up Multi-Provider Routing
Create a .env file with your API keys:
# Your MatrixLLM API key
API_KEYS=my-secure-key
# OpenAI (optional)
OPENAI_COMPAT_BASE_URL=https://api.openai.com/v1
OPENAI_COMPAT_API_KEY=sk-...
# Anthropic (optional)
ANTHROPIC_API_KEY=sk-ant-...
# Google Gemini (optional)
GEMINI_API_KEY=AIza...
# IBM watsonx (optional)
WATSONX_BASE_URL=https://us-south.ml.cloud.ibm.com
WATSONX_API_KEY=...
WATSONX_PROJECT_ID=...
Use Different Providers
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11435/v1",
    api_key="my-secure-key"
)

# Use local Ollama (default)
response = client.chat.completions.create(
    model="deepseek-r1",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Use OpenAI
response = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Use Anthropic Claude
response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet-latest",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Use Google Gemini
response = client.chat.completions.create(
    model="google/gemini-1.5-pro",
    messages=[{"role": "user", "content": "Hello!"}]
)
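The same pattern applies to IBM watsonx, assuming the WATSONX_* variables from the setup step are configured:
# Use IBM watsonx
response = client.chat.completions.create(
    model="ibm/granite-3-8b-instruct",
    messages=[{"role": "user", "content": "Hello!"}]
)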
Distributed Compute (Relay Nodes)
Add GPUs from anywhere without port forwarding. Nodes dial out to your gateway.
On Your Gateway (Control Plane)
matrixllm start
# Note the enrollment token shown
On Remote GPU/Machine (Node)
pip install matrixllm
matrixllm-node join \
--control http://YOUR_GATEWAY_IP:11435 \
--token YOUR_ENROLLMENT_TOKEN
Use Cases
- Gaming PC at home: Join your gateway from anywhere
- Free Colab/Kaggle GPUs: No port forwarding needed
- Cloud instances: Auto load balancing across nodes
API Reference
Endpoints
| Endpoint | Method | Auth Required | Description |
|---|---|---|---|
| /health | GET | No | Check if server is running |
| /v1/models | GET | Yes | List available models |
| /v1/chat/completions | POST | Yes | Generate chat responses |
| /v1/embeddings | POST | Yes | Generate text embeddings |
Quick Examples
# Check health (no auth needed)
curl http://localhost:11435/health
# List models
curl -H "Authorization: Bearer YOUR-KEY" \
http://localhost:11435/v1/models
# Chat completion
curl -X POST http://localhost:11435/v1/chat/completions \
-H "Authorization: Bearer YOUR-KEY" \
-H "Content-Type: application/json" \
-d '{"model": "deepseek-r1", "messages": [{"role": "user", "content": "Hi!"}]}'
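Embeddings use the same OpenAI-compatible request shape. A sketch assuming an embedding model such as nomic-embed-text (the configured default) is available in Ollama:
# Embeddings
curl -X POST http://localhost:11435/v1/embeddings \
  -H "Authorization: Bearer YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "nomic-embed-text", "input": "Hello!"}'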
CLI Commands
# Start the server
matrixllm start
# Start with options
matrixllm start --port 8080 --model llama3
# Show LAN URLs (for other devices on your network)
matrixllm start --lan
# Create public URL (via ngrok)
matrixllm start --share
# Check system health
matrixllm doctor
# List available models
matrixllm models --api-key YOUR-KEY
# Test chat
matrixllm test-chat "Hello!" --api-key YOUR-KEY
Configuration Reference
All Environment Variables
# === Server ===
PORT=11435 # Server port
HOST=0.0.0.0 # Bind address
CORS_ORIGINS=http://localhost:5173,http://localhost:3000
# === Authentication ===
API_KEYS=dev-key-change-me # Comma-separated API keys
# === Rate Limiting ===
RATE_LIMIT=60/minute # Requests per minute
# === Local Ollama ===
OLLAMA_BASE_URL=http://localhost:11434
DEFAULT_MODEL=deepseek-r1
DEFAULT_EMBED_MODEL=nomic-embed-text
# === Routing ===
ROUTING_MODE=prefix # prefix | fallback
# === Multi-Provider (Optional) ===
OPENAI_COMPAT_BASE_URL=https://api.openai.com/v1
OPENAI_COMPAT_API_KEY=
ANTHROPIC_API_KEY=
GEMINI_API_KEY=
WATSONX_BASE_URL=https://us-south.ml.cloud.ibm.com
WATSONX_API_KEY=
WATSONX_PROJECT_ID=
# === Relay Fabric ===
RELAY_ENABLED=true
ENROLLMENT_SECRET=dev-enroll-change-me
LOCAL_RUNTIME_ENABLED=true
# === OllaBridge Compatibility ===
# OLLAS_API_KEY= # Alias for API_KEYS
# OLLAS_BASE_URL= # Client base URL
# OLLAS_MODEL= # Alias for DEFAULT_MODEL
Troubleshooting
"Connection refused" error
Make sure the server is running:
matrixllm start
"Invalid API key" error
Check that you're using the correct key:
# The key is shown when you start the server
matrixllm start
# Look for: Key: sk-matrixllm-xxxxx
"Model not found" error
- Check available models:
curl -H "Authorization: Bearer YOUR-KEY" http://localhost:11435/v1/models
- For local models, make sure Ollama is running:
ollama list
Server won't start
Another service may already be using port 11435. Free the port or start MatrixLLM on a different one:
# Use a different port
matrixllm start --port 8080
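To see which process is holding the port, you can use standard operating-system tools (these are not MatrixLLM commands):
# macOS / Linux
lsof -i :11435
# Windows (Command Prompt)
netstat -ano | findstr 11435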
Development
# Install with dev dependencies
pip install -e ".[dev]"
# Run tests
make test
# Format code
make format
# Type check
make typecheck
License
Apache License 2.0 - see LICENSE
Built With
- FastAPI - Async web framework
- httpx - Async HTTP client
- Ollama - Local LLM runtime
- Pydantic - Data validation
MatrixLLM - Your unified gateway to all LLM providers
Project details
Release history
Download files
Source Distribution
Built Distribution
File details
Details for the file matrixllm-0.1.0.tar.gz.
File metadata
- Download URL: matrixllm-0.1.0.tar.gz
- Upload date:
- Size: 45.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b6cfebad43ef25629e1aa8527758a696766bc10ba2cb6ca76890b5840dd48b24 |
| MD5 | bd43454573e6aad97f150a8c6f4fec1c |
| BLAKE2b-256 | 15321d4bf5a710c55ceec9cb709e3252041d80335b64049fc23a5dad036a51c1 |
Provenance
The following attestation bundles were made for matrixllm-0.1.0.tar.gz:
Publisher: publish.yml on agent-matrix/matrix-llm

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: matrixllm-0.1.0.tar.gz
- Subject digest: b6cfebad43ef25629e1aa8527758a696766bc10ba2cb6ca76890b5840dd48b24
- Sigstore transparency entry: 844833401
- Sigstore integration time:
- Permalink: agent-matrix/matrix-llm@5daa4debd45b3522f5f4c514cfb3270c58cb849a
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/agent-matrix
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@5daa4debd45b3522f5f4c514cfb3270c58cb849a
- Trigger Event: release
File details
Details for the file matrixllm-0.1.0-py3-none-any.whl.
File metadata
- Download URL: matrixllm-0.1.0-py3-none-any.whl
- Upload date:
- Size: 56.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d86386056ca53785885e6508db5d3861ebc6f1a3f62ae2871031ecbeaf6cedbc |
| MD5 | 7f53169f8fa3d6551e160ca332687863 |
| BLAKE2b-256 | 8c11d94b2cc1dbd4b2465d33a21cc7c2b72132481af8c8329d813a511bd67f66 |
Provenance
The following attestation bundles were made for matrixllm-0.1.0-py3-none-any.whl:
Publisher: publish.yml on agent-matrix/matrix-llm

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: matrixllm-0.1.0-py3-none-any.whl
- Subject digest: d86386056ca53785885e6508db5d3861ebc6f1a3f62ae2871031ecbeaf6cedbc
- Sigstore transparency entry: 844833403
- Sigstore integration time:
- Permalink: agent-matrix/matrix-llm@5daa4debd45b3522f5f4c514cfb3270c58cb849a
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/agent-matrix
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@5daa4debd45b3522f5f4c514cfb3270c58cb849a
- Trigger Event: release