llm-invoker
A Python library for managing multi-agent model invocation with automatic failover strategies, designed for POC development with seamless provider switching and conversation history management.
Why This Project Exists
The Problem
During the development of multi-agent systems and proof-of-concept projects, developers face several recurring challenges:
- Rate Limiting: Free and low-cost LLM providers impose strict rate limits, causing interruptions during active development
- Provider Reliability: Individual providers can experience downtime or temporary service issues
- Model Comparison: Developers need to test the same prompts across different models and providers to find the best fit
- Context Loss: When switching between providers manually, conversation history and context are often lost
- Configuration Complexity: Managing multiple API keys and provider configurations becomes cumbersome
The Solution
llmInvoker was created to solve these exact problems by providing:
- Automatic Provider Switching: When one provider hits rate limits or fails, automatically switch to the next available provider
- Context Preservation: Maintain conversation history across provider switches, ensuring continuity
- Unified Interface: Single API to interact with multiple LLM providers (GitHub Models, OpenRouter, Google, OpenAI, Anthropic, etc.)
- Development-Focused: Optimized for rapid prototyping and POC development workflows
- Zero Configuration: Works out of the box with sensible defaults, but fully customizable when needed
This library was born from real-world frustration during multi-agent system development, where hitting rate limits would halt development flow and require manual intervention to switch providers.
Features
- Automatic Failover: Seamlessly switch between providers when rate limits or errors occur
- Parallel Invocation: Compare responses from multiple models simultaneously
- Conversation History: Maintain context across provider switches
- Multi-Provider Support: GitHub Models, OpenRouter, Google Generative AI, Hugging Face, OpenAI, Anthropic
- LangSmith Integration: Monitor token usage and trace executions
- LangChain Compatible: Easy integration with existing multi-agent frameworks
- Simple Configuration: Environment-based API key management with code-level provider setup
Installation
# Using uv (recommended for modern Python projects)
uv add llm-invoker
# Using pip
pip install llm-invoker
# For development/contribution
git clone https://github.com/RaedJlassi/llm-invoker.git
cd llm-invoker
uv sync --dev
Environment Setup
Create a .env file in your project root with your API keys (add only the providers you plan to use):
# OpenAI API Key
OPENAI_API_KEY=your_openai_api_key_here
# Anthropic API Key
ANTHROPIC_API_KEY=your_anthropic_api_key_here
# GitHub Models API Key (free tier available)
GITHUB_TOKEN=your_github_token_here
# Google Generative AI API Key
GOOGLE_API_KEY=your_google_api_key_here
# Hugging Face API Key
HUGGINGFACE_API_KEY=your_huggingface_api_key_here
# OpenRouter API Key (aggregates multiple providers)
OPENROUTER_API_KEY=your_openrouter_api_key_here
# LangSmith Configuration (optional - for monitoring)
LANGSMITH_API_KEY=your_langsmith_api_key_here
LANGSMITH_PROJECT=multiagent_failover_poc
Note: You don't need all API keys. The library will automatically detect which providers are available based on your environment variables.
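The library detects providers through environment variables, so the .env file must be loaded into the environment before the first call. If your entry point does not do this already, a common pattern (assuming the python-dotenv package is installed) is:

from dotenv import load_dotenv

load_dotenv()  # export the keys from .env into os.environ

from llmInvoker import llmInvoker

invoker = llmInvoker()
invoker.use_defaults()  # configures every provider whose API key was found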
Use Cases
This library is particularly useful for:
Research & Prototyping
- Multi-agent system development where different agents might use different models
- POC development where you need reliable access to LLMs without manual intervention
- Comparing model outputs across different providers for research purposes
Development Workflows
- Rate limit management during intensive development sessions
- Provider redundancy for production applications that can't afford downtime
- Cost optimization by utilizing free tiers across multiple providers
Multi-Agent Applications
- Agent swarms where different agents can use different models
- Fallback strategies for critical agent communications
- Context preservation when agents switch between conversation partners
Model Evaluation
- A/B testing different models on the same prompts
- Performance benchmarking across providers
- Response quality comparison for specific use cases
Quick Start
1. Basic Usage (Convenience Function)
# Convenience function (recommended for simple use cases)
from llmInvoker import invoke_failover

response = invoke_failover(
    message="Explain quantum computing in simple terms",
    providers={
        "github": ["gpt-4o", "gpt-4o-mini"],
        "google": ["gemini-2.0-flash-exp"]
    }
)

if response['success']:
    print(response['response'])
2. Class-based Usage
from llmInvoker import llmInvoker

# Initialize with custom configuration
invoker = llmInvoker(
    strategy="failover",
    max_retries=3,
    timeout=30,
    enable_history=True
)
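Building on that constructor, a minimal end-to-end call could look like the following sketch, using configure_providers and invoke_sync as documented later in this README:

invoker.configure_providers(
    github=["gpt-4o", "gpt-4o-mini"],
    google=["gemini-2.0-flash-exp"]
)

response = invoker.invoke_sync("Summarize the benefits of automatic failover")
if response['success']:
    print(f"[{response['provider']}/{response['model']}] {response['response']}")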
Convenience Functions
from llmInvoker import invoke_failover, invoke_parallel

# Quick failover
response = invoke_failover(
    "What are the benefits of renewable energy?",
    providers={
        "github": ["gpt-4o"],
        "google": ["gemini-2.0-flash-exp"]
    }
)

# Parallel comparison
response = invoke_parallel(
    "Explain machine learning in one sentence",
    providers={
        "github": ["gpt-4o"],
        "openrouter": ["deepseek/deepseek-r1"],
        "google": ["gemini-2.0-flash-exp"]
    }
)

# Compare responses from all providers
for result in response['successful_responses']:
    print(f"{result['provider']}: {result['response']}")
Strategies
1. Failover Strategy
Tries providers in order until one succeeds:
invoker = llmInvoker(strategy="failover")
invoker.configure_providers(
    github=["gpt-4o", "gpt-4o-mini"],
    google=["gemini-2.0-flash-exp"],
    openrouter=["deepseek/deepseek-r1"]
)
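A minimal sketch of exercising this chain (invoke_sync and the response fields are documented under Response Structure below):

response = invoker.invoke_sync("Your question here")
if response['success']:
    # 'attempt' counts how many tries the failover chain needed
    print(response['provider'], response['model'], response['attempt'])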
2. Parallel Strategy
Invokes all providers simultaneously for comparison:
invoker = llmInvoker(strategy="parallel")
# Same configuration as above
response = invoker.invoke_sync("Your question here")
# Get multiple responses to compare
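To compare the individual results, iterate all_responses (see Response Structure below) and skip entries whose success flag is False:

for result in response['all_responses']:
    if result['success']:
        print(f"{result['provider']}/{result['model']}: {result['response']}")
    else:
        print(f"{result['provider']} failed: {result['error']}")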
Response Structure
All methods return a standardized response structure for consistency across different strategies and providers:
Successful Response
{
    'success': True,
    'response': str,        # The actual LLM response content
    'provider': str,        # Provider name (e.g., 'github', 'google', 'openrouter')
    'model': str,           # Model name (e.g., 'gpt-4o', 'gemini-2.0-flash-exp')
    'timestamp': str,       # ISO format timestamp
    'attempt': int,         # Number of attempts made (for failover strategy)
    'all_responses': list,  # All responses (for parallel strategy)
    'metadata': dict        # Additional metadata (tokens, latency, etc.)
}
Failed Response
{
    'success': False,
    'error': str,       # Error message describing what went wrong
    'provider': str,    # Last attempted provider (if any)
    'model': str,       # Last attempted model (if any)
    'timestamp': str,   # ISO format timestamp
    'attempt': int,     # Number of attempts made
    'metadata': dict    # Error metadata and debugging info
}
Parallel Strategy Response
When using parallel strategy, successful responses include additional fields:
{
    'success': True,
    'response': str,        # Response from the first successful provider
    'all_responses': [      # List of all provider responses
        {
            'provider': str,
            'model': str,
            'response': str,    # Individual provider response
            'success': bool,    # Whether this specific provider succeeded
            'error': str,       # Error message if failed
            'metadata': dict    # Provider-specific metadata
        },
        # ... more provider responses
    ],
    'provider': str,    # First successful provider
    'model': str,       # First successful model
    'timestamp': str,
    'metadata': dict
}
Metadata Fields
The metadata field may contain:
{
    'tokens': {
        'prompt': int,      # Tokens used for prompt
        'completion': int,  # Tokens used for completion
        'total': int        # Total tokens used
    },
    'latency': float,           # Response time in seconds
    'rate_limit_info': dict,    # Rate limiting information
    'provider_config': dict,    # Provider-specific configuration used
    'langsmith_run_id': str,    # LangSmith tracking ID (if enabled)
    'conversation_id': str      # Conversation tracking ID (if history enabled)
}
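Since the metadata keys "may" be present rather than guaranteed, defensive access is advisable. A minimal consumer, assuming an invoker configured with the failover strategy as above and relying only on the documented top-level fields:

response = invoker.invoke_sync("Compare solar and wind power")
if response['success']:
    print(f"{response['provider']}/{response['model']}: {response['response']}")
    tokens = response['metadata'].get('tokens', {})  # may be absent for some providers
    print(f"Total tokens: {tokens.get('total', 'n/a')}")
else:
    print(f"All providers failed: {response['error']}")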
Advanced Configuration
String-based Configuration
# Configure providers using string format
invoker.configure_from_string(
    "github['gpt-4o','gpt-4o-mini'],google['gemini-2.0-flash-exp'],openrouter['deepseek/deepseek-r1']"
)
Default Configurations
# Use default configurations for all available providers
invoker.use_defaults()
Custom Parameters
# Add model parameters
response = invoker.invoke_sync(
"Your question",
temperature=0.7,
max_tokens=500,
top_p=0.9
)
LangChain Integration
Seamlessly integrate with existing LangChain workflows:
from llmInvoker import llmInvoker

class LangChainWrapper:
    def __init__(self):
        self.invoker = llmInvoker(strategy="failover")
        self.invoker.use_defaults()

    async def __call__(self, prompt: str) -> str:
        response = await self.invoker.invoke(prompt)
        if response['success']:
            # 'response' is already the content string (see Response Structure)
            return str(response['response'])
        raise Exception(f"All providers failed: {response['error']}")

# Use in LangChain chains
llm_wrapper = LangChainWrapper()
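Because __call__ is a coroutine, it must be awaited; a quick standalone check (plain asyncio, outside any chain) might look like:

import asyncio

async def main():
    # llm_wrapper is the instance created above
    answer = await llm_wrapper("Summarize failover strategies in one sentence")
    print(answer)

asyncio.run(main())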
Monitoring & History
Conversation History
# Enable history (default)
invoker = llmInvoker(enable_history=True)
# Get conversation summary
history = invoker.get_history()
summary = history.get_summary()
print(f"Total interactions: {summary['total_entries']}")
print(f"Providers used: {summary['providers_used']}")
# Export/import history
invoker.export_history("conversation_history.json")
invoker.import_history("conversation_history.json")
Provider Statistics
stats = invoker.get_provider_stats()
print(f"Total providers: {stats['total_providers']}")
print(f"Total models: {stats['total_models']}")
print(f"Provider details: {stats['providers']}")
LangSmith Integration
Automatic token usage tracking and execution tracing when LangSmith is configured:
# In .env file
LANGSMITH_API_KEY=your_langsmith_api_key
LANGSMITH_PROJECT=your_project_name
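Once these variables are set, tracing happens automatically. To confirm from code that a run was tracked, you could read the langsmith_run_id documented in the metadata fields above (assuming an invoker configured as in the earlier examples):

response = invoker.invoke_sync("ping")
run_id = response['metadata'].get('langsmith_run_id')  # only present when LangSmith is enabled
if run_id:
    print(f"Traced in LangSmith: {run_id}")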
Development
Project Structure
multiagent_failover_invoke/
├── multiagent_failover_invoke/
│   ├── __init__.py        # Main exports
│   ├── core.py            # Core llmInvoker class
│   ├── providers.py       # Provider implementations
│   ├── strategies.py      # Strategy implementations
│   ├── history.py         # Conversation history management
│   ├── config.py          # Configuration management
│   └── utils.py           # Utility functions
├── examples/              # Usage examples
├── tests/                 # Test suite
├── .env.example           # Environment template
├── pyproject.toml         # Project configuration
└── README.md              # This file
Supported Providers
llm-invoker supports 6 major AI providers with automatic failover between them:
Providers & Models
| Provider | Models Supported | API Key Required | Status |
|---|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo | OPENAI_API_KEY | Active |
| Anthropic | claude-3-5-sonnet-20241022, claude-3-haiku, claude-3-opus | ANTHROPIC_API_KEY | Active |
| GitHub Models | gpt-4o, gpt-4o-mini | GITHUB_TOKEN | Active |
| Google AI | gemini-2.0-flash-exp, gemini-1.5-pro, gemini-1.5-flash | GOOGLE_API_KEY | Active |
| OpenRouter | anthropic/claude-3.5-sonnet, openai/gpt-4o | OPENROUTER_API_KEY | Active |
| Hugging Face | Any text-generation model | HUGGINGFACE_API_KEY | Active |
Configuration Example
from llmInvoker import llmInvoker

# Configure multiple providers for failover
invoker = llmInvoker(strategy="failover")
invoker.configure_providers(
    openai=["gpt-4o", "gpt-4o-mini"],
    anthropic=["claude-3-5-sonnet-20241022"],
    github=["gpt-4o"],
    google=["gemini-2.0-flash-exp"],
    openrouter=["anthropic/claude-3.5-sonnet"],
    huggingface=["microsoft/DialoGPT-medium"]
)
Use Cases
Perfect for:
- POC Development: Rapid prototyping without worrying about rate limits
- Multi-Agent Systems: LangGraph, CrewAI, AutoGen integration
- Model Comparison: A/B testing different models on the same tasks
- Reliability: Production backup strategies for mission-critical applications
- Cost Optimization: Prefer free models and fall back to paid ones only when needed (see the sketch below)
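Because the failover strategy tries providers in the order they are configured (see Strategies above), cost-first ordering is just a matter of listing free tiers before paid ones. A sketch, assuming configuration order is honored and that the GitHub Models and Google free tiers are preferred over paid OpenAI access:

from llmInvoker import llmInvoker

# Free-tier providers first; the paid provider is only reached if they fail
invoker = llmInvoker(strategy="failover")
invoker.configure_providers(
    github=["gpt-4o-mini"],           # free tier
    google=["gemini-2.0-flash-exp"],  # free tier
    openai=["gpt-4o-mini"]            # paid fallback
)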
Contributing
Contributions are welcome! This project was created to solve real-world development challenges, and we'd love to hear about your use cases and improvements.
Getting Started
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes and add tests
- Run tests: pytest tests/
- Submit a pull request
Development Setup
git clone https://github.com/yourusername/multiagent-failover-invoke.git
cd multiagent-failover-invoke
uv sync --dev
Testing
Run the test suite:
# Run all tests
pytest tests/
# Run with coverage
pytest tests/ --cov=llmInvoker
# Run specific test file
pytest tests/test_core.py
Examples
The examples/ directory contains comprehensive examples:
- failover_example.py - Basic and advanced failover strategies
- parallel_invoke_example.py - Parallel model invocation
- multimodal_example.py - Working with images and multimodal content
- langchain_integration.py - Integration with LangChain workflows
- quickstart.py - Quick start guide examples
Support & Community
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: Comprehensive examples in the examples/ directory
Author
Jlassi Raed
- Email: raed.jlassi@etudiant-enit.utm.tn
- GitHub: @RaedJlassi
Created during multi-agent system development at ENIT (École Nationale d'Ingénieurs de Tunis) to solve real-world rate limiting and provider reliability challenges in the POC phase.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Thanks to all the LLM providers for their APIs and free tiers that make development accessible
- Inspired by real-world challenges in multi-agent system development
- Built for the developer community facing similar rate limiting and reliability issues
If this project helps you in your development workflow, please consider giving it a star!
Download files
File details
Details for the file llm_invoker-0.1.3.tar.gz.
File metadata
- Download URL: llm_invoker-0.1.3.tar.gz
- Upload date:
- Size: 96.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8c8db2e1f976063d835f05ec384513bdb07c1f4f82aae9e234bb9ad3b3549d53 |
| MD5 | 9b1d52b77ab6f53265b3b38f2f16da9f |
| BLAKE2b-256 | eb301979938c2ebeeff9548a396f70ddfa8cabe8fc9d385c103c46c6578dfd28 |
File details
Details for the file llm_invoker-0.1.3-py3-none-any.whl.
File metadata
- Download URL: llm_invoker-0.1.3-py3-none-any.whl
- Upload date:
- Size: 23.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2d82faa8df8c3ce6ff177efc53c24bae6c8a57823dcef1c096547b928df7dcbf |
| MD5 | 038ee373c7043e1c1432395ba6612966 |
| BLAKE2b-256 | 58ef280388611e3d63e7814285daa7e97d4fc114ca70855991dfe433d9061069 |