A Python library for selecting the best LLM model based on user input using any LLM via LiteLLM
Project description
LLM Selector
A Python library that uses any LLM (via LiteLLM) to intelligently select the best model for a given task.
Installation
From PyPI (Recommended)
pip install llm-selector
From Source
git clone https://github.com/YoannDev90/llm-selector.git
cd llm-selector
pip install .
Development Installation
git clone https://github.com/YoannDev90/llm-selector.git
cd llm-selector
pip install -e ".[test]"
Quick Start
import asyncio

from llm_selector import LLMSelector, LLMSelectorConfig

async def main():
    # Define available models
    models = {
        "gpt-4": {
            "description": "OpenAI GPT-4 for complex reasoning",
            "weight": 2.0,
            "cost_per_token": 0.00003
        },
        "claude-3": {
            "description": "Anthropic Claude 3 for balanced tasks",
            "weight": 1.0,
            "cost_per_token": 0.000015
        }
    }

    # Configure the selector LLM (the one that makes the decision)
    selector_configs = [
        LLMSelectorConfig("gpt-3.5-turbo", api_key="your-openai-key")
    ]

    # Create the selector
    selector = LLMSelector(models=models, selector_configs=selector_configs)

    # Select a model for a task
    result = await selector.select("Write a complex Python data analysis script")
    print(f"Selected: {result}")

asyncio.run(main())
Full Example
import asyncio

from llm_selector import LLMSelector, LLMSelectorConfig

async def main():
    # Available models, with weights and estimated costs
    models = {
        "openai/gpt-4": {
            "description": "OpenAI GPT-4 for complex reasoning and coding",
            "weight": 2.0,  # Twice as likely to be selected
            "cost_per_token": 0.00003  # Estimated cost per token
        },
        "anthropic/claude-3": {
            "description": "Anthropic Claude 3 for safe and helpful responses",
            "weight": 1.5,
            "cost_per_token": 0.000015
        },
        "cerebras/llama3.3-70b": {
            "description": "Cerebras Llama 3.3 70B for fast inference",
            "weight": 1.0
        },
    }

    # Selector service configurations, with fallbacks
    selector_configs = [
        LLMSelectorConfig(
            model="openrouter/openai/gpt-oss-20b",
            api_base="https://openrouter.ai/api/v1",
            api_key="env:OPENROUTER_API_KEY"  # Environment variable
        ),
        LLMSelectorConfig(
            model="anthropic/claude-3-haiku",
            api_key="dotenv:ANTHROPIC_KEY"  # Key stored in .env
        ),
    ]

    selector = LLMSelector(
        models=models,
        selector_configs=selector_configs,
        cache_enabled=True,
        debug=True
    )

    user_input = "I need to write a complex Python script for data analysis."
    selected_model = await selector.select(user_input)
    print(f"Selected model: {selected_model}")

    # Metrics
    metrics = selector.get_metrics()
    print(f"Cache hit rate: {metrics['cache_hit_rate']:.2%}")
    print(f"Total cost: ${selector.get_total_cost():.4f}")

asyncio.run(main())
Advanced Features
Synchronous Interface
selector = LLMSelector(models=models, selector_configs=configs)
result = selector.select_sync("My query") # Synchronous version
Caching
selector = LLMSelector(
    models=models,
    cache_enabled=True,   # Enable caching
    cache_ttl=3600,       # 1 hour TTL
    max_cache_size=1000   # Limit cache size
)

# Clear the cache when needed
selector.clear_cache()
Cost Tracking
# Set a budget limit
selector.set_budget_limit(10.0)  # $10 limit

# Check whether it has been exceeded
if selector.is_budget_exceeded():
    print("Budget exceeded!")

# Get a cost breakdown
total_cost = selector.get_total_cost()
cost_by_model = selector.get_metrics()['cost_by_model']
Metrics & Monitoring
metrics = selector.get_metrics()
print(f"""
Requests: {metrics['total_requests']}
Cache hits: {metrics['cache_hits']} ({metrics['cache_hit_rate']:.1%})
Success rate: {metrics['successful_selections']/metrics['total_requests']:.1%}
Average latency: {metrics['average_latency']:.2f}s
Total cost: ${metrics['total_cost']:.4f}
""")
# Reset metrics
selector.reset_metrics()
Retry Logic
selector = LLMSelector(
    models=models,
    max_retries=3,    # Max retry attempts
    retry_delay=1.0   # Base delay in seconds
)

# Network and rate-limit errors are retried automatically with exponential backoff
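The exact backoff schedule is not documented here; a common scheme, and a reasonable mental model for `retry_delay` as a "base delay", is to double the delay on each attempt. A minimal sketch (the formula is an assumption, not the library's exact implementation):

```python
def backoff_delay(attempt: int, base_delay: float = 1.0) -> float:
    """Exponential backoff: the base delay is doubled on each retry attempt."""
    return base_delay * (2 ** attempt)

# Delays for attempts 0..2 with the default base of 1.0s
print([backoff_delay(n) for n in range(3)])  # [1.0, 2.0, 4.0]
```

With `max_retries=3` and `retry_delay=1.0`, this would mean waiting roughly 1s, 2s, and 4s between successive attempts.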
API Reference
LLMSelector
Constructor
LLMSelector(
    models: dict,                   # Required: model configurations
    preprompt: str = None,          # System prompt for the selector
    selector_configs: list = None,  # LLM configs used for selection
    default_model: str = None,      # Fallback model
    validate_configs: bool = True,  # Validate configs on init
    max_retries: int = 3,           # Max API retry attempts
    retry_delay: float = 1.0,       # Base retry delay (seconds)
    cache_enabled: bool = True,     # Enable result caching
    cache_ttl: int = 3600,          # Cache TTL (seconds)
    debug: bool = False,            # Enable debug logging
    max_cache_size: int = 1000,     # Max cache entries
    **litellm_kwargs                # Additional litellm params
)
Methods
- select(input: str) -> str: Async model selection
- select_sync(input: str) -> str: Sync model selection
- get_metrics() -> dict: Get performance metrics
- reset_metrics(): Reset all metrics
- clear_cache(): Clear cached results
- get_total_cost() -> float: Get total API cost
- set_budget_limit(limit: float): Set cost budget
- is_budget_exceeded() -> bool: Check budget status
- validate_configuration() -> bool: Validate current config
LLMSelectorConfig
Constructor
LLMSelectorConfig(
    model: str,            # Required: model identifier
    api_base: str = None,  # Custom API base URL
    api_key: str = None,   # API key (plain/env/dotenv)
    **kwargs               # Additional litellm parameters
)
Methods
validate() -> bool: Validate configuration
Model Configuration Format
models = {
    "model_name": {
        "description": "Human-readable description",  # Required
        "weight": 1.0,             # Optional, default 1.0
        "cost_per_token": 0.00001  # Optional, for cost tracking
    }
}
Configuration
API Key Sources
API keys can be specified in three ways:
- Plain text: "sk-your-api-key"
- Environment variable: "env:VARIABLE_NAME"
- .env file: "dotenv:KEY_NAME"
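The prefix tells the library where to look the key up. A minimal sketch of that resolution logic (an illustration of the convention, not the library's actual code; in the real library the dotenv: case would read a .env file, sketched here via the process environment):

```python
import os

def resolve_api_key(value: str) -> str:
    """Resolve an API key string per the plain/env/dotenv convention (sketch)."""
    if value.startswith("env:"):
        # Look the key up in the process environment
        return os.environ[value[len("env:"):]]
    if value.startswith("dotenv:"):
        # Illustrative: assumes a .env loader has already populated the environment
        return os.environ[value[len("dotenv:"):]]
    # No prefix: the value is the key itself
    return value

os.environ["OPENROUTER_API_KEY"] = "sk-demo"
print(resolve_api_key("env:OPENROUTER_API_KEY"))  # sk-demo
print(resolve_api_key("sk-plain-key"))            # sk-plain-key
```

The env: and dotenv: forms keep secrets out of source code, which is why they are preferred in production.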
Environment Variables
Create a .env file in your project root:
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-key
Model Weights
Model weights influence selection probability:
- Higher weights (> 1.0) increase selection chance
- Lower weights (< 1.0) decrease selection chance
- Weights are relative to other models
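For intuition, treating weights as relative odds and normalizing them gives each model's share of the total weight (an illustrative sketch of "relative to other models", not the library's actual selection algorithm, which also weighs the task description):

```python
def selection_shares(weights: dict[str, float]) -> dict[str, float]:
    """Normalize relative model weights into probability-like shares."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}

# Weights from the Full Example above: total weight is 4.5
shares = selection_shares({"gpt-4": 2.0, "claude-3": 1.5, "llama3.3-70b": 1.0})
print(shares)  # gpt-4 gets 2.0/4.5 of the weight, about 0.44
```

Doubling every weight leaves the shares unchanged; only the ratios between models matter.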
Deployment
Docker
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
RUN pip install .
CMD ["python", "your_app.py"]
Production Considerations
- Set an appropriate max_cache_size based on memory constraints
- Configure cache_ttl based on how often model preferences change
- Use environment variables for API keys in production
- Monitor metrics regularly for performance optimization
- Set budget limits to control costs
Testing
The library includes comprehensive unit tests, integration tests, and performance benchmarks.
Install Test Dependencies
pip install -e ".[test]"
Run Tests
# All tests
pytest
# Unit tests only
pytest -m "not integration and not performance"
# Integration tests
pytest -m integration
# Performance tests
pytest -m performance
# With coverage
pytest --cov=llm_selector --cov-report=html
Test Coverage
- Unit Tests: Core functionality, error handling, configuration validation
- Integration Tests: Multi-provider scenarios, cost optimization, concurrent load
- Performance Tests: Latency benchmarks, throughput testing, memory usage
- Coverage Goal: 80%+ code coverage with detailed edge case testing
Contributing
We welcome contributions! Please see our Contributing Guide for details.
Development Setup
git clone https://github.com/YoannDev90/llm-selector.git
cd llm-selector
pip install -e ".[test]"
pre-commit install
Running Tests
# All tests
pytest
# With coverage
pytest --cov=llm_selector --cov-report=html
# Specific test categories
pytest -m "unit" # Unit tests only
pytest -m "integration" # Integration tests
pytest -m "performance" # Performance tests
Changelog
v0.1.0 (Current)
- Initial release with core LLM selection functionality
- Support for multiple LLM providers via LiteLLM
- Intelligent caching with TTL and size limits
- Cost tracking and budget management
- Comprehensive metrics and monitoring
- Retry logic with exponential backoff
- Async and sync interfaces
- Extensive test coverage
Publishing
For information on how to publish new versions to PyPI, see PUBLISHING.md.
The project includes automated publishing via GitHub Actions that triggers on version tags.
File details
Details for the file llm_selector-0.1.0.tar.gz.
File metadata
- Download URL: llm_selector-0.1.0.tar.gz
- Upload date:
- Size: 25.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f13fface905b7778c4fe581da495d87d2d11fd3e37e665aaa993bc8adc62baad |
| MD5 | 6a45b423698691a14fc556bb977d7e22 |
| BLAKE2b-256 | a26a99f2390539f09b7958ae28df3d8cf877b2dc11845ec217c2f10e5f1d9f04 |
File details
Details for the file llm_selector-0.1.0-py3-none-any.whl.
File metadata
- Download URL: llm_selector-0.1.0-py3-none-any.whl
- Upload date:
- Size: 14.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 99af03b75ba26908bff5fee0a25201fb38b1251dbd389f65d0b384b96f8f12a8 |
| MD5 | ef585c867ad186c540038029a28c4a27 |
| BLAKE2b-256 | 19480a2390544e4112d556a364bc20d2d2f3abd9172da73233f804cb1f7829d8 |