LLMHub - Open-source multi-provider LLM gateway

Project description

LLMHub

Lightweight gateway for routing LLM requests across providers.


Overview

LLMHub is a lightweight infrastructure layer that routes chat requests between multiple LLM providers.

It provides a single API and CLI interface to:

  • switch between providers
  • optimize cost and latency
  • run locally or in production

Designed for developers building AI-powered apps without vendor lock-in.


Features

  • Single /chat endpoint powered by FastAPI
  • Multi-provider routing (Gemini, Ollama, optional OpenAI)
  • Auto routing via rules or LLM-based agent
  • CLI for local workflows
  • .env-based configuration
  • Fallback handling & latency tracking
  • Business-friendly monitoring dashboard at /monitoring/dashboard
  • Operational monitoring APIs (/monitoring/overview, /monitoring/timeseries, /monitoring/failures)

Quick Start

1. Install

pip install -r requirements.txt

2. Configure

cp .env.example .env

Fill in the required keys.

3. Run server

uvicorn app.main:app --reload

Docs available at:

http://127.0.0.1:8000/docs

Example Request

{
	"message": "Hello",
	"preferred_provider": "auto",
	"max_cost_tier": "low",
	"timeout_ms": 120000
}

Example Response

{
	"answer": "string",
	"provider": "gemini",
	"model": "string",
	"latency_ms": 120,
	"request_id": "uuid",
	"fallback_used": false
}
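The request and response shapes above can be exercised from any HTTP client. A minimal sketch using only the standard library, assuming the server is running at the default uvicorn address (http://127.0.0.1:8000) and the endpoint is /chat as described above:

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:8000/chat"  # default uvicorn address; adjust as needed


def build_chat_payload(message, provider="auto", max_cost_tier="low", timeout_ms=120000):
    """Assemble the request body shown in the example above."""
    return {
        "message": message,
        "preferred_provider": provider,
        "max_cost_tier": max_cost_tier,
        "timeout_ms": timeout_ms,
    }


def chat(message, **kwargs):
    """POST the payload to the gateway and return the parsed JSON response."""
    data = json.dumps(build_chat_payload(message, **kwargs)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Example (requires the server to be running):
# reply = chat("Hello")
# print(reply["answer"], reply["provider"], reply["latency_ms"])
```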

CLI

Install:

pip install -e .

Usage:

llmhub chat "Hello" --provider auto
llmhub serve --reload

Routing Modes

rules

  • Fast
  • Deterministic
  • Based on heuristics

agent

  • Uses LLM (Gemini) for routing decisions
  • More flexible, but adds latency and cost
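To make the distinction concrete, here is a hypothetical sketch of what a rules-mode decision can look like. The actual heuristics live inside the package and may differ; this only illustrates the fast, deterministic idea:

```python
def route_by_rules(message: str, max_cost_tier: str = "low") -> str:
    """Pick a provider from simple, deterministic heuristics (illustrative only)."""
    # Long or code-heavy prompts go to the stronger hosted model,
    # unless the caller capped cost at the lowest tier.
    looks_complex = len(message) > 2000 or "```" in message
    if looks_complex and max_cost_tier != "low":
        return "gemini"
    # Everything else stays on the cheap local model.
    return "ollama"


print(route_by_rules("Hello"))                           # -> ollama
print(route_by_rules("x" * 3000, max_cost_tier="high"))  # -> gemini
```

Because the decision is a pure function of the request, it adds no extra latency or cost, which is exactly the trade-off against the agent mode.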

Configuration

Environment variables:

  • GEMINI_API_KEY
  • GEMINI_MODEL
  • OLLAMA_BASE_URL
  • OLLAMA_MODEL
  • ROUTER_MODE
  • ROUTER_MODEL

Do not commit .env to version control.
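A minimal .env sketch covering the variables above. All values are placeholders; the model names are illustrative assumptions, not project defaults (the Ollama address is the stock Ollama port):

```shell
# .env - keep this file out of version control
GEMINI_API_KEY=your-gemini-api-key
GEMINI_MODEL=gemini-1.5-flash             # illustrative; use a model your key can access
OLLAMA_BASE_URL=http://localhost:11434    # stock Ollama address
OLLAMA_MODEL=llama3:8b
ROUTER_MODE=rules                         # "rules" or "agent"
ROUTER_MODEL=gemini-1.5-flash             # model used when ROUTER_MODE=agent
```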


Troubleshooting

  • Ollama 404: the requested model has not been pulled locally. Pull it first:

    ollama pull llama3:8b

  • Gemini 404: the configured model name is likely not available to your API key. List the models available via the Gemini API and update GEMINI_MODEL accordingly.

Monitoring

Run the service and open:

http://127.0.0.1:8000/monitoring/dashboard

Key endpoints:

  • /metrics - Prometheus scrape endpoint
  • /monitoring/overview - KPI snapshot for selected time window
  • /monitoring/timeseries - request/error/latency series for charting
  • /monitoring/failures - latest failed requests list
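The JSON monitoring endpoints can be queried like any HTTP API. A small sketch, assuming the default uvicorn address; the response fields are defined by the service, so inspect the payload before relying on specific keys:

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # default uvicorn address; adjust as needed


def fetch_json(path, opener=urllib.request.urlopen):
    """GET a monitoring endpoint and parse its JSON body.

    The opener is injectable so the helper can be exercised without a
    live server.
    """
    with opener(BASE_URL + path) as resp:
        return json.loads(resp.read())


# Example (requires a running server):
# print(json.dumps(fetch_json("/monitoring/overview"), indent=2))
```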

Roadmap

  • OpenAI provider stabilization
  • Request caching
  • Metrics dashboard
  • Rate limiting
  • Plugin system

License

MIT

Project details


Download files

Download the file for your platform.

Source Distribution

llmhub_gateway-0.1.1.tar.gz (44.6 kB)

Built Distribution

llmhub_gateway-0.1.1-py3-none-any.whl (44.1 kB)

File details

Details for the file llmhub_gateway-0.1.1.tar.gz.

File metadata

  • Download URL: llmhub_gateway-0.1.1.tar.gz
  • Upload date:
  • Size: 44.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for llmhub_gateway-0.1.1.tar.gz:

  • SHA256: 3e2a5de3eecd3e7aad049de0b1c51612d783a42c5f89c039d84c365dd14e139d
  • MD5: 289b9d6ba5d4f352ea88ecaec2e68de8
  • BLAKE2b-256: a84de17dfc85d00fd21ab71e155957d9af4b4fd0108795d9442cd1119a80c1dd

File details

Details for the file llmhub_gateway-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: llmhub_gateway-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 44.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for llmhub_gateway-0.1.1-py3-none-any.whl:

  • SHA256: 879d557106a1554e665a4051e8804ed36de92537b3fe92099e20e68ad2a50aa4
  • MD5: 1d0ea35e416753b5a004da7758d176f1
  • BLAKE2b-256: d1812226f6d2fce239ae5e4f40756d4f14a7ad31247e7ea5eaa6567892da81df
