# Intelligent LLM Router

Intelligent LLM Router is a Python library that routes requests across multiple LLM providers with automatic failover, quota management, and response caching.
## Features
- Multi-Provider Support: OpenAI, Anthropic, Google Gemini, Groq, Mistral, Cohere, DeepSeek, Together, HuggingFace, OpenRouter, xAI, DashScope, and Ollama
- Automatic Failover: Automatically switches to the next provider if one fails
- Quota Management: Tracks RPM/RPD limits per provider and routes around rate limits
- Response Caching: Exact and semantic caching to reduce costs and latency
- Multiple Routing Strategies: Auto, Cost-Optimized, Quality-First, Latency-First, Round-Robin
- Vision & Embeddings: Full support for vision models and embeddings
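The failover behavior can be pictured as a loop that tries each configured provider in order and falls through to the next on error. This is only an illustrative sketch of the idea, not the library's internal code; the provider callables here are made up:

```python
# Illustrative failover loop -- NOT llm_router's actual implementation.
# Each "provider" is a stand-in callable that either returns a response
# or raises an exception.

def flaky_provider(prompt):
    raise ConnectionError("provider unavailable")

def healthy_provider(prompt):
    return f"answer to: {prompt}"

def route_with_failover(prompt, providers):
    """Try providers in order; return the first successful response."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # real code would catch narrower error types
            errors.append((name, exc))
    raise RuntimeError(f"all providers failed: {errors}")

provider_chain = [("openai", flaky_provider), ("groq", healthy_provider)]
name, answer = route_with_failover("What is 2 + 2?", provider_chain)
print(name, answer)  # falls through to the second provider
```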
## Installation

```bash
pip install llm_router
```

Or with optional extras (quoted so shells like zsh don't expand the brackets):

```bash
pip install "llm_router[server]"  # includes the FastAPI server
```
## Quick Start

```python
import asyncio

from llm_router import IntelligentRouter, RoutingOptions, RoutingStrategy

async def main():
    # Initialize the router
    router = IntelligentRouter()
    await router.start()

    # Define your request
    request_data = {
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ],
        "temperature": 0.7,
    }

    # Configure routing options (optional)
    options = RoutingOptions(
        strategy=RoutingStrategy.AUTO,
    )

    # Route the request
    response = await router.route(request_data, options)
    print(response["choices"][0]["message"]["content"])
    print(f"Provider: {response['routing_metadata']['provider']}")

    await router.stop()

asyncio.run(main())
```
## Configuration

### Environment Variables

All configuration can be done via environment variables with the `ROUTER_` prefix:
| Variable | Default | Description |
|---|---|---|
| `ROUTER_LLM_TIMEOUT` | `60` | Timeout for LLM calls in seconds |
| `ROUTER_MAX_RETRIES` | `3` | Maximum retry attempts |
| `ROUTER_ENABLE_OLLAMA_FALLBACK` | `true` | Enable Ollama fallback when cloud providers fail |
| `ROUTER_CACHE_DIR` | `/tmp/llm_router_cache` | Directory for the response cache |
| `ROUTER_RESPONSE_CACHE_TTL` | `3600` | Cache TTL in seconds |
| `ROUTER_DEFAULT_STRATEGY` | `auto` | Default routing strategy |
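These variables can also be set from Python before the router is created. A minimal sketch — the `ROUTER_` names come from the table above, the values here are arbitrary:

```python
import os

# Override router configuration for this process; environment variables
# must be set before the router reads its settings.
os.environ["ROUTER_LLM_TIMEOUT"] = "30"            # fail faster than the 60 s default
os.environ["ROUTER_MAX_RETRIES"] = "5"
os.environ["ROUTER_DEFAULT_STRATEGY"] = "latency_first"

print(os.environ["ROUTER_DEFAULT_STRATEGY"])
```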
### Provider API Keys

Set API keys as environment variables:

```bash
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
export GROQ_API_KEY=gsk_...
export GEMINI_API_KEY=AIza...
# etc.
```
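A quick way to see which provider keys are visible to the current process — the variable names below follow each provider's usual convention; adjust the mapping for your setup:

```python
import os

# Conventional environment-variable names for a few providers.
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "groq": "GROQ_API_KEY",
    "gemini": "GEMINI_API_KEY",
}

def configured_providers(env=os.environ):
    """Return the providers whose API key is present and non-empty."""
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var)]

print(configured_providers())
```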
## Routing Strategies

| Strategy | Description |
|---|---|
| `auto` | Balanced: quota + latency + quality (default) |
| `cost_optimized` | Maximize remaining free quota |
| `quality_first` | Prioritize the highest-quality models |
| `latency_first` | Prioritize the fastest-responding models |
| `round_robin` | Spread requests uniformly across providers |
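The round-robin strategy simply cycles through the providers in turn. The idea can be sketched with `itertools.cycle` — again an illustration of the concept, not the library's code:

```python
from itertools import cycle, islice

providers = ["openai", "anthropic", "groq"]
rotation = cycle(providers)

# Six consecutive requests spread uniformly across the three providers.
assignments = list(islice(rotation, 6))
print(assignments)  # ['openai', 'anthropic', 'groq', 'openai', 'anthropic', 'groq']
```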
## Request Options

```python
from llm_router import CachePolicy, RoutingOptions, RoutingStrategy

options = RoutingOptions(
    strategy=RoutingStrategy.COST_OPTIMIZED,
    free_tier_only=True,                     # only use free-tier providers
    preferred_providers=["groq", "gemini"],  # prefer these providers
    excluded_providers=["openai"],           # skip these providers
    cache_policy=CachePolicy.ENABLED,        # enable response caching
)
```
## API Reference

### IntelligentRouter

```python
router = IntelligentRouter()
await router.start()                                  # initialize the router
response = await router.route(request_data, options)  # route a request
stats = router.get_stats()                            # get router statistics
await router.stop()                                   # clean up
```
### Models

- `RoutingOptions`: Configuration for routing behavior
- `RoutingStrategy`: Enum of routing strategies
- `TaskType`: Enum of task types (chat, embeddings, vision, etc.)
- `CachePolicy`: Enum of caching behaviors
- `settings`: Global settings object
## Running the Server

```bash
# Install with the server extras
pip install "llm_router[server]"

# Run the server
llm-router

# Or with custom settings
ROUTER_PORT=8000 llm-router
```

The server provides an OpenAI-compatible API (built on FastAPI) at `/v1/chat/completions`, `/v1/embeddings`, and `/health`.
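Because the endpoints follow the familiar chat-completions shape, the server can be called with any HTTP client. A sketch using only the standard library, assuming the server from above is listening on `localhost:8000`:

```python
import json
from urllib import request

def build_chat_request(host, messages, temperature=0.7):
    """Build a chat-completions POST request for the router server."""
    url = f"{host}/v1/chat/completions"
    body = json.dumps({"messages": messages, "temperature": temperature}).encode()
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request(
    "http://localhost:8000",
    [{"role": "user", "content": "What is the capital of France?"}],
)

# Uncomment to actually send the request (requires a running server):
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```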
## Development

```bash
# Clone the repository
git clone https://github.com/anomalyco/llm_router.git
cd llm_router

# Install in development mode
pip install -e ".[dev]"

# Run the tests
pytest

# Run linting
ruff check src/
```
## License

MIT License - see LICENSE for details.