
llm-invoker


A Python library for managing multi-agent model invocation with automatic failover strategies, designed for POC development with seamless provider switching and conversation history management.

🎯 Why This Project Exists

The Problem

During the development of multi-agent systems and proof-of-concept projects, developers face several recurring challenges:

  1. Rate Limiting: Free and low-cost LLM providers impose strict rate limits, causing interruptions during active development
  2. Provider Reliability: Individual providers can experience downtime or temporary service issues
  3. Model Comparison: Developers need to test the same prompts across different models and providers to find the best fit
  4. Context Loss: When switching between providers manually, conversation history and context are often lost
  5. Configuration Complexity: Managing multiple API keys and provider configurations becomes cumbersome

The Solution

llmInvoker was created to solve these exact problems by providing:

  • Automatic Provider Switching: When one provider hits rate limits or fails, automatically switch to the next available provider
  • Context Preservation: Maintain conversation history across provider switches, ensuring continuity
  • Unified Interface: Single API to interact with multiple LLM providers (GitHub Models, OpenRouter, Google, OpenAI, Anthropic, etc.)
  • Development-Focused: Optimized for rapid prototyping and POC development workflows
  • Zero Configuration: Works out of the box with sensible defaults, but fully customizable when needed

This library was born from real-world frustration during multi-agent system development, where hitting rate limits would halt development flow and require manual intervention to switch providers.

✨ Features

  • 🔄 Automatic Failover: Seamlessly switch between providers when rate limits or errors occur
  • ⚡ Parallel Invocation: Compare responses from multiple models simultaneously
  • 💭 Conversation History: Maintain context across provider switches
  • 🔌 Multi-Provider Support: GitHub Models, OpenRouter, Google Generative AI, Hugging Face, OpenAI, Anthropic
  • 🔍 LangSmith Integration: Monitor token usage and trace executions
  • 🛠️ LangChain Compatible: Easy integration with existing multi-agent frameworks
  • ⚙️ Simple Configuration: Environment-based API key management with code-level provider setup

🚀 Installation

# Using uv (recommended for modern Python projects)
uv add llm-invoker

# Using pip
pip install llm-invoker

# For development/contribution
git clone https://github.com/RaedJlassi/llm-invoker.git
cd llm-invoker
uv sync --dev

โš™๏ธ Environment Setup

Create a .env file in your project root with your API keys (add only the providers you plan to use):

# OpenAI API Key
OPENAI_API_KEY=your_openai_api_key_here

# Anthropic API Key  
ANTHROPIC_API_KEY=your_anthropic_api_key_here

# GitHub Models API Key (free tier available)
GITHUB_TOKEN=your_github_token_here

# Google Generative AI API Key
GOOGLE_API_KEY=your_google_api_key_here

# Hugging Face API Key
HUGGINGFACE_API_KEY=your_huggingface_api_key_here

# OpenRouter API Key (aggregates multiple providers)
OPENROUTER_API_KEY=your_openrouter_api_key_here

# LangSmith Configuration (optional - for monitoring)
LANGSMITH_API_KEY=your_langsmith_api_key_here
LANGSMITH_PROJECT=multiagent_failover_poc

Note: You don't need all API keys. The library will automatically detect which providers are available based on your environment variables.
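
For example, a minimal sketch of relying on that auto-detection (it uses the use_defaults() helper documented under Advanced Configuration; exactly which providers get configured depends on which keys are present):

from llmInvoker import llmInvoker

# Per the note above, providers are detected from the API keys available
# in the environment; providers with missing keys are simply skipped.
invoker = llmInvoker(strategy="failover")
invoker.use_defaults()

response = invoker.invoke_sync("Hello!")
if response['success']:
    print(f"Answered by {response['provider']}")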

🎯 Use Cases

This library is particularly useful for:

🔬 Research & Prototyping

  • Multi-agent system development where different agents might use different models
  • POC development where you need reliable access to LLMs without manual intervention
  • Comparing model outputs across different providers for research purposes

๐Ÿ—๏ธ Development Workflows

  • Rate limit management during intensive development sessions
  • Provider redundancy for production applications that can't afford downtime
  • Cost optimization by utilizing free tiers across multiple providers

🤖 Multi-Agent Applications

  • Agent swarms where different agents can use different models
  • Fallback strategies for critical agent communications
  • Context preservation when agents switch between conversation partners

📊 Model Evaluation

  • A/B testing different models on the same prompts
  • Performance benchmarking across providers
  • Response quality comparison for specific use cases

🚀 Quick Start

1. Basic Usage

# Convenience function (recommended for simple use cases)
from llmInvoker import invoke_failover

response = invoke_failover(
    message="Explain quantum computing in simple terms",
    providers={
        "github": ["gpt-4o", "gpt-4o-mini"],
        "google": ["gemini-2.0-flash-exp"]
    }
)

if response['success']:
    print(response['response'])

2. Class-based Usage

from llmInvoker import llmInvoker

# Initialize with custom configuration
invoker = llmInvoker(
    strategy="failover",
    max_retries=3,
    timeout=30,
    enable_history=True
)
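
Continuing the example with the provider configuration and invocation calls shown in the sections below (a minimal sketch; the model lists are illustrative):

# Register the providers/models to try, in failover order
invoker.configure_providers(
    github=["gpt-4o", "gpt-4o-mini"],
    google=["gemini-2.0-flash-exp"]
)

# Synchronous invocation; returns the standardized response structure
response = invoker.invoke_sync("Explain quantum computing in simple terms")
if response['success']:
    print(f"[{response['provider']}/{response['model']}] {response['response']}")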

3. Convenience Functions

from llmInvoker import invoke_failover, invoke_parallel

# Quick failover
response = invoke_failover(
    "What are the benefits of renewable energy?",
    providers={
        "github": ["gpt-4o"],
        "google": ["gemini-2.0-flash-exp"]
    }
)

# Parallel comparison
response = invoke_parallel(
    "Explain machine learning in one sentence",
    providers={
        "github": ["gpt-4o"],
        "openrouter": ["deepseek/deepseek-r1"],
        "google": ["gemini-2.0-flash-exp"]
    }
)

# Compare responses from all providers
for result in response['successful_responses']:
    print(f"{result['provider']}: {result['response']}")

📋 Strategies

1. Failover Strategy

Tries providers in order until one succeeds:

invoker = llmInvoker(strategy="failover")
invoker.configure_providers(
    github=["gpt-4o", "gpt-4o-mini"],
    google=["gemini-2.0-flash-exp"],
    openrouter=["deepseek/deepseek-r1"]
)

2. Parallel Strategy

Invokes all providers simultaneously for comparison:

invoker = llmInvoker(strategy="parallel")
# Same configuration as above
response = invoker.invoke_sync("Your question here")
# Get multiple responses to compare

📋 Response Structure

All methods return a standardized response structure for consistency across different strategies and providers:

Successful Response

{
    'success': True,
    'response': str,           # The actual LLM response content
    'provider': str,           # Provider name (e.g., 'github', 'google', 'openrouter')
    'model': str,              # Model name (e.g., 'gpt-4o', 'gemini-2.0-flash-exp')
    'timestamp': str,          # ISO format timestamp
    'attempt': int,            # Number of attempts made (for failover strategy)
    'all_responses': list,     # All responses (for parallel strategy)
    'metadata': dict           # Additional metadata (tokens, latency, etc.)
}

Failed Response

{
    'success': False,
    'error': str,              # Error message describing what went wrong
    'provider': str,           # Last attempted provider (if any)
    'model': str,              # Last attempted model (if any)
    'timestamp': str,          # ISO format timestamp
    'attempt': int,            # Number of attempts made
    'metadata': dict           # Error metadata and debugging info
}

Parallel Strategy Response

When using the parallel strategy, successful responses include additional fields:

{
    'success': True,
    'response': str,           # Response from the first successful provider
    'all_responses': [         # List of all provider responses
        {
            'provider': str,
            'model': str,
            'response': str,   # Individual provider response
            'success': bool,   # Whether this specific provider succeeded
            'error': str,      # Error message if failed
            'metadata': dict   # Provider-specific metadata
        },
        # ... more provider responses
    ],
    'provider': str,           # First successful provider
    'model': str,              # First successful model
    'timestamp': str,
    'metadata': dict
}

Metadata Fields

The metadata field may contain:

{
    'tokens': {
        'prompt': int,         # Tokens used for prompt
        'completion': int,     # Tokens used for completion
        'total': int          # Total tokens used
    },
    'latency': float,         # Response time in seconds
    'rate_limit_info': dict,  # Rate limiting information
    'provider_config': dict,  # Provider-specific configuration used
    'langsmith_run_id': str,  # LangSmith tracking ID (if enabled)
    'conversation_id': str    # Conversation tracking ID (if history enabled)
}
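
As an illustration, here is a minimal sketch of reading these fields defensively (key names are taken from the structures above; which metadata keys are present may vary by provider and strategy):

response = invoker.invoke_sync("Summarize the failover strategy in one sentence")

if response['success']:
    print(f"Answered by {response['provider']}/{response['model']} at {response['timestamp']}")
    tokens = response.get('metadata', {}).get('tokens', {})
    print(f"Total tokens used: {tokens.get('total', 'n/a')}")
else:
    print(f"Failed after {response['attempt']} attempt(s): {response['error']}")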

🔧 Advanced Configuration

String-based Configuration

# Configure providers using string format
invoker.configure_from_string(
    "github['gpt-4o','gpt-4o-mini'],google['gemini-2.0-flash-exp'],openrouter['deepseek/deepseek-r1']"
)

Default Configurations

# Use default configurations for all available providers
invoker.use_defaults()

Custom Parameters

# Add model parameters
response = invoker.invoke_sync(
    "Your question",
    temperature=0.7,
    max_tokens=500,
    top_p=0.9
)

🤖 LangChain Integration

Seamlessly integrate with existing LangChain workflows:

from llmInvoker import llmInvoker

class LangChainWrapper:
    def __init__(self):
        self.invoker = llmInvoker(strategy="failover")
        self.invoker.use_defaults()

    async def __call__(self, prompt: str) -> str:
        response = await self.invoker.invoke(prompt)
        if response['success']:
            return self._extract_content(response['response'])
        raise Exception(f"All providers failed: {response['error']}")

    @staticmethod
    def _extract_content(content) -> str:
        # The response payload is typically a plain string; fall back to the
        # `.content` attribute for message-like objects.
        return content if isinstance(content, str) else getattr(content, "content", str(content))

# Use in LangChain chains
llm_wrapper = LangChainWrapper()
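
Because the wrapper's __call__ is asynchronous, here is a minimal sketch of driving it outside of a chain (standard asyncio; nothing library-specific is assumed):

import asyncio

async def main():
    answer = await llm_wrapper("Summarize the failover strategy in one sentence")
    print(answer)

asyncio.run(main())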

📊 Monitoring & History

Conversation History

# Enable history (default)
invoker = llmInvoker(enable_history=True)

# Get conversation summary
history = invoker.get_history()
summary = history.get_summary()
print(f"Total interactions: {summary['total_entries']}")
print(f"Providers used: {summary['providers_used']}")

# Export/import history
invoker.export_history("conversation_history.json")
invoker.import_history("conversation_history.json")

Provider Statistics

stats = invoker.get_provider_stats()
print(f"Total providers: {stats['total_providers']}")
print(f"Total models: {stats['total_models']}")
print(f"Provider details: {stats['providers']}")

LangSmith Integration

Automatic token usage tracking and execution tracing when LangSmith is configured:

# In .env file
LANGSMITH_API_KEY=your_langsmith_api_key
LANGSMITH_PROJECT=your_project_name
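
When tracing is enabled, the run ID is surfaced through the langsmith_run_id metadata key listed above; a minimal sketch of reading it (the key is absent when LangSmith is not configured):

response = invoker.invoke_sync("Trace this request")
run_id = response.get('metadata', {}).get('langsmith_run_id')
if run_id:
    print(f"LangSmith run: {run_id}")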

๐Ÿ” Examples

Check the examples/ directory for complete examples:

  • failover_example.py - Comprehensive failover strategy examples
  • parallel_invoke_example.py - Parallel invocation and response comparison
  • langchain_integration.py - Integration with LangChain and multi-agent frameworks

๐Ÿ› ๏ธ Development

Project Structure

multiagent_failover_invoke/
├── multiagent_failover_invoke/
│   ├── __init__.py          # Main exports
│   ├── core.py              # Core llmInvoker class
│   ├── providers.py         # Provider implementations
│   ├── strategies.py        # Strategy implementations
│   ├── history.py           # Conversation history management
│   ├── config.py            # Configuration management
│   └── utils.py             # Utility functions
├── examples/                # Usage examples
├── tests/                   # Test suite
├── .env.example             # Environment template
├── pyproject.toml           # Project configuration
└── README.md                # This file

🔧 Supported Providers

llm-invoker supports 6 major AI providers with automatic failover between them:

Providers & Models

| Provider | Models Supported | API Key Required | Status |
|----------|------------------|------------------|--------|
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo | OPENAI_API_KEY | ✅ Active |
| Anthropic | claude-3-5-sonnet-20241022, claude-3-haiku, claude-3-opus | ANTHROPIC_API_KEY | ✅ Active |
| GitHub Models | gpt-4o, gpt-4o-mini | GITHUB_TOKEN | ✅ Active |
| Google AI | gemini-2.0-flash-exp, gemini-1.5-pro, gemini-1.5-flash | GOOGLE_API_KEY | ✅ Active |
| OpenRouter | anthropic/claude-3.5-sonnet, openai/gpt-4o | OPENROUTER_API_KEY | ✅ Active |
| Hugging Face | Any text-generation model | HUGGINGFACE_API_KEY | ✅ Active |

Configuration Example

from llmInvoker import llmInvoker

# Configure multiple providers for failover
invoker = llmInvoker(strategy="failover")
invoker.configure_providers(
    openai=["gpt-4o", "gpt-4o-mini"],
    anthropic=["claude-3-5-sonnet-20241022"],
    github=["gpt-4o"],
    google=["gemini-2.0-flash-exp"],
    openrouter=["anthropic/claude-3.5-sonnet"],
    huggingface=["microsoft/DialoGPT-medium"]
)

๐Ÿค Use Cases

Perfect for:

  • POC Development: Rapid prototyping without worrying about rate limits
  • Multi-Agent Systems: LangGraph, CrewAI, AutoGen integration
  • Model Comparison: A/B testing different models on the same tasks
  • Reliability: Production backup strategies for mission-critical applications
  • Cost Optimization: Prefer free models and fall back to paid ones when needed (see the sketch below)
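
For the cost-optimization case, a minimal sketch: because the failover strategy tries providers in the order they are configured, free-tier providers can be listed first and paid providers last (the ordering and model choices below are illustrative):

from llmInvoker import llmInvoker

invoker = llmInvoker(strategy="failover")
invoker.configure_providers(
    github=["gpt-4o-mini"],                      # free tier first
    google=["gemini-2.0-flash-exp"],             # free tier
    openai=["gpt-4o-mini"],                      # paid fallback
    anthropic=["claude-3-5-sonnet-20241022"]     # paid fallback
)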

๐Ÿค Contributing

Contributions are welcome! This project was created to solve real-world development challenges, and we'd love to hear about your use cases and improvements.

Getting Started

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes and add tests
  4. Run tests: pytest tests/
  5. Submit a pull request

Development Setup

git clone https://github.com/yourusername/multiagent-failover-invoke.git
cd multiagent-failover-invoke
uv sync --dev

🧪 Testing

Run the test suite:

# Run all tests
pytest tests/

# Run with coverage
pytest tests/ --cov=llmInvoker

# Run specific test file
pytest tests/test_core.py

📚 Examples

The examples/ directory contains comprehensive examples:

  • failover_example.py - Basic and advanced failover strategies
  • parallel_invoke_example.py - Parallel model invocation
  • multimodal_example.py - Working with images and multimodal content
  • langchain_integration.py - Integration with LangChain workflows
  • quickstart.py - Quick start guide examples

📞 Support & Community

๐Ÿ‘จโ€๐Ÿ’ป Author

Jlassi Raed

Created during multi-agent system development at ENIT (École Nationale d'Ingénieurs de Tunis) to solve real-world rate limiting and provider reliability challenges during the POC phase.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

  • Thanks to all the LLM providers for their APIs and free tiers that make development accessible
  • Inspired by real-world challenges in multi-agent system development
  • Built for the developer community facing similar rate limiting and reliability issues

โญ If this project helps you in your development workflow, please consider giving it a star!
