
LLM Selector

A Python library that uses any LLM (via LiteLLM) to intelligently select the best model for a given task.

Installation

From PyPI (Recommended)

pip install llm-selector

From Source

git clone https://github.com/YoannDev90/llm-selector.git
cd llm-selector
pip install .

Development Installation

git clone https://github.com/YoannDev90/llm-selector.git
cd llm-selector
pip install -e ".[test]"

Quick Start

import asyncio
from llm_selector import LLMSelector, LLMSelectorConfig

async def main():
    # Define available models
    models = {
        "gpt-4": {
            "description": "OpenAI GPT-4 for complex reasoning",
            "weight": 2.0,
            "cost_per_token": 0.00003
        },
        "claude-3": {
            "description": "Anthropic Claude 3 for balanced tasks",
            "weight": 1.0,
            "cost_per_token": 0.000015
        }
    }
    
    # Configure selector LLM (the one that makes the decision)
    selector_configs = [
        LLMSelectorConfig("gpt-3.5-turbo", api_key="your-openai-key")
    ]
    
    # Create selector
    selector = LLMSelector(models=models, selector_configs=selector_configs)
    
    # Select model for a task
    result = await selector.select("Write a complex Python data analysis script")
    print(f"Selected: {result}")

asyncio.run(main())
Full Example

import asyncio
from llm_selector import LLMSelector, LLMSelectorConfig

async def main():
    # Available models with weights and estimated costs
    models = {
        "openai/gpt-4": {
            "description": "OpenAI GPT-4 for complex reasoning and coding",
            "weight": 2.0,  # twice as likely to be selected
            "cost_per_token": 0.00003  # estimated cost per token
        },
        "anthropic/claude-3": {
            "description": "Anthropic Claude 3 for safe and helpful responses",
            "weight": 1.5,
            "cost_per_token": 0.000015
        },
        "cerebras/llama3.3-70b": {
            "description": "Cerebras Llama 3.3 70B for fast inference",
            "weight": 1.0
        },
    }
    
    # Selector service configurations, tried in order as fallbacks
    selector_configs = [
        LLMSelectorConfig(
            model="openrouter/openai/gpt-oss-20b",
            api_base="https://openrouter.ai/api/v1",
            api_key="env:OPENROUTER_API_KEY"  # environment variable
        ),
        LLMSelectorConfig(
            model="anthropic/claude-3-haiku",
            api_key="dotenv:ANTHROPIC_KEY"  # key stored in .env
        ),
    ]
    
    selector = LLMSelector(
        models=models, 
        selector_configs=selector_configs,
        cache_enabled=True,
        debug=True
    )
    
    user_input = "I need to write a complex Python script for data analysis."
    selected_model = await selector.select(user_input)
    print(f"Selected model: {selected_model}")
    
    # Metrics
    metrics = selector.get_metrics()
    print(f"Cache hit rate: {metrics['cache_hit_rate']:.2%}")
    print(f"Total cost: ${selector.get_total_cost():.4f}")

asyncio.run(main())

Advanced Features

Synchronous Interface

selector = LLMSelector(models=models, selector_configs=configs)
result = selector.select_sync("My query")  # Synchronous version

Caching

selector = LLMSelector(
    models=models,
    cache_enabled=True,      # Enable caching
    cache_ttl=3600,          # 1 hour TTL
    max_cache_size=1000      # Limit cache size
)

# Clear cache when needed
selector.clear_cache()
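Conceptually, the cache is a TTL-bounded, size-bounded store. The sketch below illustrates that general behavior in plain Python; it is not the library's actual internals, and the `TTLCache` class here is purely hypothetical:

```python
import time
from collections import OrderedDict

class TTLCache:
    """Minimal TTL + size-bounded cache (illustrative sketch only)."""

    def __init__(self, ttl=3600, max_size=1000):
        self.ttl = ttl
        self.max_size = max_size
        self._store = OrderedDict()  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # entry has expired
            return None
        return value

    def set(self, key, value):
        if len(self._store) >= self.max_size:
            self._store.popitem(last=False)  # evict the oldest entry
        self._store[key] = (value, time.monotonic() + self.ttl)

cache = TTLCache(ttl=60, max_size=2)
cache.set("query", "gpt-4")
assert cache.get("query") == "gpt-4"
```

Tuning `cache_ttl` and `max_cache_size` trades memory for fewer selector-LLM calls, as described under Production Considerations below.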

Cost Tracking

# Set budget limit
selector.set_budget_limit(10.0)  # $10 limit

# Check if exceeded
if selector.is_budget_exceeded():
    print("Budget exceeded!")

# Get cost breakdown
total_cost = selector.get_total_cost()
cost_by_model = selector.get_metrics()['cost_by_model']

Metrics & Monitoring

metrics = selector.get_metrics()
print(f"""
Requests: {metrics['total_requests']}
Cache hits: {metrics['cache_hits']} ({metrics['cache_hit_rate']:.1%})
Success rate: {metrics['successful_selections']/metrics['total_requests']:.1%}
Average latency: {metrics['average_latency']:.2f}s
Total cost: ${metrics['total_cost']:.4f}
""")

# Reset metrics
selector.reset_metrics()

Retry Logic

selector = LLMSelector(
    models=models,
    max_retries=3,      # Max retry attempts
    retry_delay=1.0     # Base delay in seconds
)
# Automatic exponential backoff for network/rate limit errors
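Exponential backoff typically doubles the base delay on each attempt. Assuming the library follows the common `retry_delay * 2**attempt` pattern (the exact formula is an assumption, not documented behavior), the waits would look like this:

```python
def backoff_delays(max_retries=3, retry_delay=1.0):
    """Illustrative: delay before each retry attempt, doubling every time.
    The library's exact backoff formula may differ."""
    return [retry_delay * (2 ** attempt) for attempt in range(max_retries)]

print(backoff_delays())  # [1.0, 2.0, 4.0]
```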

API Reference

LLMSelector

Constructor

LLMSelector(
    models: dict,                           # Required: Model configurations
    preprompt: str = None,                  # System prompt for selector
    selector_configs: list = None,          # LLM configs for selection
    default_model: str = None,              # Fallback model
    validate_configs: bool = True,          # Validate configs on init
    max_retries: int = 3,                   # Max API retry attempts
    retry_delay: float = 1.0,               # Base retry delay (seconds)
    cache_enabled: bool = True,             # Enable result caching
    cache_ttl: int = 3600,                  # Cache TTL (seconds)
    debug: bool = False,                    # Enable debug logging
    max_cache_size: int = 1000,             # Max cache entries
    **litellm_kwargs                         # Additional litellm params
)

Methods

  • select(input: str) -> str: Async model selection
  • select_sync(input: str) -> str: Sync model selection
  • get_metrics() -> dict: Get performance metrics
  • reset_metrics(): Reset all metrics
  • clear_cache(): Clear cached results
  • get_total_cost() -> float: Get total API cost
  • set_budget_limit(limit: float): Set cost budget
  • is_budget_exceeded() -> bool: Check budget status
  • validate_configuration() -> bool: Validate current config

LLMSelectorConfig

Constructor

LLMSelectorConfig(
    model: str,                    # Required: Model identifier
    api_base: str = None,          # Custom API base URL
    api_key: str = None,           # API key (plain/env/dotenv)
    **kwargs                       # Additional litellm parameters
)

Methods

  • validate() -> bool: Validate configuration

Model Configuration Format

models = {
    "model_name": {
        "description": "Human-readable description",    # Required
        "weight": 1.0,                                  # Optional, default 1.0
        "cost_per_token": 0.00001                       # Optional, for cost tracking
    }
}

Configuration

API Key Sources

API keys can be specified in three ways:

  • Plain text: "sk-your-api-key"
  • Environment variable: "env:VARIABLE_NAME"
  • .env file: "dotenv:KEY_NAME"
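The three forms could be resolved roughly as sketched below. This is an illustration of the documented behavior, not the library's actual code, and the `resolve_api_key` helper is hypothetical (in particular, the real `dotenv:` form would read from a .env file, e.g. via python-dotenv, rather than the process environment):

```python
import os

def resolve_api_key(spec):
    """Resolve a key spec: plain text, "env:VAR", or "dotenv:KEY" (sketch)."""
    if spec.startswith("env:"):
        return os.environ.get(spec[len("env:"):])
    if spec.startswith("dotenv:"):
        # Simplified: the real library would load this from a .env file.
        return os.environ.get(spec[len("dotenv:"):])
    return spec  # plain-text key, used as-is

os.environ["MY_KEY"] = "sk-demo"
assert resolve_api_key("env:MY_KEY") == "sk-demo"
assert resolve_api_key("sk-plain") == "sk-plain"
```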

Environment Variables

Create a .env file in your project root:

OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-key

Model Weights

Model weights influence selection probability:

  • Higher weights (> 1.0) increase selection chance
  • Lower weights (< 1.0) decrease selection chance
  • Weights are relative to other models
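If weights act as linear relative factors, as the points above suggest, normalizing them gives each model's baseline selection chance. A quick sketch (an interpretation of the documented semantics, not library code):

```python
def selection_probabilities(models):
    """Turn per-model weights into relative selection probabilities,
    assuming weights are linear relative factors (default 1.0)."""
    weights = {name: cfg.get("weight", 1.0) for name, cfg in models.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

models = {
    "gpt-4": {"description": "complex reasoning", "weight": 2.0},
    "claude-3": {"description": "balanced tasks", "weight": 1.0},
}
print(selection_probabilities(models))
# gpt-4 is twice as likely as claude-3 (2/3 vs 1/3)
```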

Deployment

Docker

FROM python:3.9-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .
RUN pip install .

CMD ["python", "your_app.py"]

Production Considerations

  • Set appropriate max_cache_size based on memory constraints
  • Configure cache_ttl based on how often model preferences change
  • Use environment variables for API keys in production
  • Monitor metrics regularly for performance optimization
  • Set budget limits to control costs

Testing

The library includes comprehensive unit tests, integration tests, and performance benchmarks.

Install Test Dependencies

pip install -e ".[test]"

Run Tests

# All tests
pytest

# Unit tests only
pytest -m "not integration and not performance"

# Integration tests
pytest -m integration

# Performance tests
pytest -m performance

# With coverage
pytest --cov=llm_selector --cov-report=html

Test Coverage

  • Unit Tests: Core functionality, error handling, configuration validation
  • Integration Tests: Multi-provider scenarios, cost optimization, concurrent load
  • Performance Tests: Latency benchmarks, throughput testing, memory usage
  • Coverage Goal: 80%+ code coverage with detailed edge case testing

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Setup

git clone https://github.com/YoannDev90/llm-selector.git
cd llm-selector
pip install -e ".[test]"
pre-commit install

Running Tests

# All tests
pytest

# With coverage
pytest --cov=llm_selector --cov-report=html

# Specific test categories
pytest -m "unit"          # Unit tests only
pytest -m "integration"   # Integration tests
pytest -m "performance"   # Performance tests

Changelog

v0.1.0 (Current)

  • Initial release with core LLM selection functionality
  • Support for multiple LLM providers via LiteLLM
  • Intelligent caching with TTL and size limits
  • Cost tracking and budget management
  • Comprehensive metrics and monitoring
  • Retry logic with exponential backoff
  • Async and sync interfaces
  • Extensive test coverage

Publishing

For information on how to publish new versions to PyPI, see PUBLISHING.md.

The project includes automated publishing via GitHub Actions that triggers on version tags.
