
LLMHub

Lightweight gateway for routing LLM requests across providers.


Overview

LLMHub is a lightweight infrastructure layer that routes chat requests between multiple LLM providers.

It provides a single API and CLI interface to:

  • switch between providers
  • optimize cost and latency
  • run locally or in production

Designed for developers building AI-powered apps without vendor lock-in.


Features

  • Single /chat endpoint powered by FastAPI
  • Multi-provider routing (Gemini, Ollama, optional OpenAI)
  • Auto routing via rules or LLM-based agent
  • CLI for local workflows
  • .env-based configuration
  • Fallback handling & latency tracking
  • Business-friendly monitoring dashboard at /monitoring/dashboard
  • Operational monitoring APIs (/monitoring/overview, /monitoring/timeseries, /monitoring/failures)

Quick Start

1. Install

pip install -r requirements.txt

2. Configure

cp .env.example .env

Fill in the required keys (see Configuration).

3. Run server

uvicorn app.main:app --reload

Interactive API docs are then available at:

http://127.0.0.1:8000/docs

Example Request

{
	"message": "Hello",
	"preferred_provider": "auto",
	"max_cost_tier": "low",
	"timeout_ms": 120000
}

Example Response

{
	"answer": "string",
	"provider": "gemini",
	"model": "string",
	"latency_ms": 120,
	"request_id": "uuid",
	"fallback_used": false
}
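Assuming the server from Quick Start is running on the default port, the request above can be sent with plain Python (standard library only; the helper names here are illustrative, not part of LLMHub):

```python
import json
import urllib.request


def build_payload(message, provider="auto", max_cost_tier="low", timeout_ms=120000):
    """Build the JSON body shown in Example Request."""
    return {
        "message": message,
        "preferred_provider": provider,
        "max_cost_tier": max_cost_tier,
        "timeout_ms": timeout_ms,
    }


def chat(message, base_url="http://127.0.0.1:8000", **kwargs):
    """POST to the /chat endpoint and return the parsed JSON response."""
    req = urllib.request.Request(
        base_url + "/chat",
        data=json.dumps(build_payload(message, **kwargs)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    result = chat("Hello")
    print(result["provider"], result["answer"])
```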

CLI

Install:

pip install -e .

Usage:

llmhub chat "Hello" --provider auto
llmhub serve --reload

Routing Modes

rules

  • Fast
  • Deterministic
  • Based on heuristics

agent

  • Uses LLM (Gemini) for routing decisions
  • More flexible, but adds latency and cost
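The actual heuristics used by the rules mode are not documented here, but a deterministic rules router of the kind described might look like this toy sketch (the thresholds and provider choices are purely illustrative):

```python
def route_by_rules(message: str, max_cost_tier: str = "low") -> str:
    """Toy heuristic router: illustrative only, not LLMHub's actual rules.

    Short, low-cost-tier prompts go to the local Ollama model (no
    per-token cost); everything else goes to the hosted Gemini provider.
    """
    if max_cost_tier == "low" and len(message) < 500:
        return "ollama"
    return "gemini"
```

Because the decision is a pure function of the request, this kind of router adds no latency or cost, which is the trade-off against the agent mode described above.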

Configuration

Environment variables:

  • GEMINI_API_KEY
  • GEMINI_MODEL
  • OLLAMA_BASE_URL
  • OLLAMA_MODEL
  • ROUTER_MODE
  • ROUTER_MODEL

Do not commit .env; it contains API keys.
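A minimal .env might look like the following (the Gemini model name is an example only; the Ollama values match the common local defaults and the model pulled in Troubleshooting):

```
GEMINI_API_KEY=your-key-here
GEMINI_MODEL=gemini-1.5-flash
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3:8b
ROUTER_MODE=rules
ROUTER_MODEL=gemini-1.5-flash
```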


Troubleshooting

  • Ollama returns 404: the configured model has not been pulled locally. Fix:

    ollama pull llama3:8b
    
  • Gemini returns 404: the configured model name may be wrong or unavailable to your key. List the models available via the Gemini API and set GEMINI_MODEL accordingly.

Monitoring

Run the service and open:

http://127.0.0.1:8000/monitoring/dashboard

Key endpoints:

  • /metrics - Prometheus scrape endpoint
  • /monitoring/overview - KPI snapshot for a selected time window
  • /monitoring/timeseries - request, error, and latency series for charting
  • /monitoring/failures - list of the most recent failed requests
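For scripted checks, the JSON monitoring endpoints above can be polled with the standard library. The response schemas are not documented here, so this sketch just prints whatever JSON each endpoint returns:

```python
import json
import urllib.request

# Endpoint paths taken from the list above.
MONITORING_ENDPOINTS = [
    "/monitoring/overview",
    "/monitoring/timeseries",
    "/monitoring/failures",
]


def fetch(endpoint, base_url="http://127.0.0.1:8000"):
    """GET a monitoring endpoint and parse the JSON body."""
    with urllib.request.urlopen(base_url + endpoint) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    for ep in MONITORING_ENDPOINTS:
        print(ep, fetch(ep))
```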

Roadmap

  • OpenAI provider stabilization
  • Request caching
  • Metrics dashboard
  • Rate limiting
  • Plugin system

License

MIT
