LLMHub - Open-source multi-provider LLM gateway
Lightweight gateway for routing LLM requests across providers.
Overview
LLMHub is a lightweight infrastructure layer that routes chat requests between multiple LLM providers.
It provides a single API and CLI interface to:
- switch between providers
- optimize cost and latency
- run locally or in production
Designed for developers building AI-powered apps without vendor lock-in.
Features
- Single /chat endpoint powered by FastAPI
- Multi-provider routing (Gemini, Ollama, optional OpenAI)
- Auto routing via rules or LLM-based agent
- CLI for local workflows
- .env-based configuration
- Fallback handling & latency tracking
- Business-friendly monitoring dashboard at /monitoring/dashboard
- Operational monitoring APIs (/monitoring/overview, /monitoring/timeseries, /monitoring/failures)
Quick Start
1. Install
pip install -r requirements.txt
2. Configure
cp .env.example .env
Fill in the required keys (see Configuration below).
3. Run server
uvicorn app.main:app --reload
Interactive API docs are available at:
http://127.0.0.1:8000/docs
Example Request
{
"message": "Hello",
"preferred_provider": "auto",
"max_cost_tier": "low",
"timeout_ms": 120000
}
Example Response
{
"answer": "string",
"provider": "gemini",
"model": "string",
"latency_ms": 120,
"request_id": "uuid",
"fallback_used": false
}
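A minimal Python sketch of a client round-trip, assuming the server from the Quick Start is running on 127.0.0.1:8000 and that /chat accepts POST with the JSON body shown above (the HTTP method is assumed, not documented here):

# Minimal client sketch; POST method and base URL assumed from Quick Start.
import requests

payload = {
    "message": "Hello",
    "preferred_provider": "auto",
    "max_cost_tier": "low",
    "timeout_ms": 120000,
}

resp = requests.post("http://127.0.0.1:8000/chat", json=payload, timeout=120)
resp.raise_for_status()

data = resp.json()
print(data["answer"])                   # generated answer
print(data["provider"], data["model"])  # backend that served the request
print(data["latency_ms"], "ms, fallback:", data["fallback_used"])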
CLI
Install:
pip install -e .
Usage:
llmhub chat "Hello" --provider auto
llmhub serve --reload
Routing Modes
rules
- Fast
- Deterministic
- Based on heuristics
agent
- Uses LLM (Gemini) for routing decisions
- More flexible, but adds latency and cost
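To make the distinction concrete, here is a hedged sketch of what a rules-mode decision could look like; the function name, threshold, and provider choices are illustrative assumptions, not LLMHub's actual heuristics:

# Illustrative rules-mode sketch; threshold and provider order are assumptions.
def route_by_rules(message: str, max_cost_tier: str = "low") -> str:
    # Cost-sensitive, short requests stay on the free local model.
    if max_cost_tier == "low" and len(message) < 2000:
        return "ollama"
    # Everything else goes to a hosted provider.
    return "gemini"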
Configuration
Environment variables:
- GEMINI_API_KEY
- GEMINI_MODEL
- OLLAMA_BASE_URL
- OLLAMA_MODEL
- ROUTER_MODE
- ROUTER_MODEL
.env must not be committed.
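A possible .env layout using the variables above; every value is a placeholder, and the defaults shown (e.g. the Ollama port and model names) are assumptions rather than documented defaults:

GEMINI_API_KEY=your-gemini-api-key
GEMINI_MODEL=gemini-1.5-flash
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3:8b
ROUTER_MODE=rules
ROUTER_MODEL=gemini-1.5-flash

ROUTER_MODE switches between the rules and agent routing modes described above; the OLLAMA_MODEL value matches the Troubleshooting example below.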
Troubleshooting
- Ollama 404: pull the missing model, e.g. ollama pull llama3:8b
- Gemini 404: check the available models via the API
Monitoring
Run service and open:
http://127.0.0.1:8000/monitoring/dashboard
Key endpoints:
- /metrics - Prometheus scrape endpoint
- /monitoring/overview - KPI snapshot for the selected time window
- /monitoring/timeseries - request/error/latency series for charting
- /monitoring/failures - latest failed requests list
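As a sketch, the KPI snapshot can also be polled from Python; this assumes the server from the Quick Start is running locally, and the response schema is not documented here:

# Minimal sketch: fetch the monitoring KPI snapshot from a local server.
import requests

resp = requests.get("http://127.0.0.1:8000/monitoring/overview", timeout=10)
resp.raise_for_status()
print(resp.json())  # KPI snapshot; exact fields are not documented here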
Roadmap
- OpenAI provider stabilization
- Request caching
- Metrics dashboard
- Rate limiting
- Plugin system
License
MIT