
LLMHandler

Unified LLM Interface with Typed & Unstructured Responses

LLMHandler is a Python package that provides a single, consistent interface to interact with multiple large language model (LLM) providers. It supports both structured (Pydantic‑validated) and unstructured free‑form responses, along with advanced features like rate limiting, batch processing, and now per‑prompt partial failure handling.


Overview

LLMHandler unifies access to various LLM providers by letting you specify a model using a provider prefix (e.g. openai:gpt-4o-mini). The package automatically appends JSON schema instructions when a Pydantic model is provided to validate and parse responses. Alternatively, you can request unstructured free‑form text. Advanced features include rate limiting, batch processing, and partial failure handling when processing multiple prompts.
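The structured-response mechanism can be pictured with a short sketch. This is not LLMHandler's internal code, just an illustration of the idea: a Pydantic model's JSON schema is appended to the prompt as an instruction, and the raw reply is validated back into the model (the `Slogan` model and the canned reply below are hypothetical).

```python
import json
from pydantic import BaseModel

# Illustrative stand-in for a structured response type
# (the package ships its own, e.g. SimpleResponse).
class Slogan(BaseModel):
    text: str

# Roughly the idea: append the model's JSON schema to the prompt
# so the LLM knows the exact shape to produce...
schema_instruction = (
    "Respond with JSON matching this schema:\n"
    + json.dumps(Slogan.model_json_schema())
)

# ...then validate the raw reply back into the typed model.
raw_reply = '{"text": "Wake up and smell the difference."}'  # stand-in for an LLM reply
parsed = Slogan.model_validate_json(raw_reply)
print(parsed.text)
```

If the reply does not match the schema, `model_validate_json` raises a validation error, which is the kind of failure reported back in the response's error field.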


Features

  • Multi‑Provider Support:
    Switch easily between providers (OpenAI, Anthropic, Gemini, DeepSeek, Ollama, etc.) using a simple model identifier.

  • Structured & Unstructured Responses:
    Validate outputs using Pydantic models or receive raw text.

  • Batch Processing:
    Process multiple prompts together with results written to JSONL files.

  • Rate Limiting:
    Optionally control the number of requests per minute.

  • Partial-Failure Handling:
    When multiple prompts are provided, each prompt is processed individually. If one prompt fails (for example, if the prompt exceeds the model’s token limit or is excessively long), its failure is captured in a dedicated result (a PromptResult) while the others succeed.
    Example: if you pass a prompt that repeats "word " 2,000,001 times (over two million words), it exceeds the provider’s maximum input length, and the API’s error message (e.g. a 400 error stating the input is “too long”) is returned in that prompt’s result. This lets you handle errors on a per‑prompt basis without aborting the entire call.

  • Easy Configuration:
    Automatically load API keys and settings from a .env file.


Installation

Requirements

  • Python 3.9 or later

Using PDM

pdm install

Using Pip (when available)

pip install llmhandler

Configuration

Create a .env file in your project’s root and add your API keys:

OPENAI_API_KEY=your_openai_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key
GEMINI_API_KEY=your_gemini_api_key
OLLAMA_API_KEY=your_ollama_api_key
DEEPSEEK_API_KEY=your_deepseek_api_key

LLMHandler automatically loads these values at runtime.
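Conceptually, loading a .env file amounts to something like the sketch below. LLMHandler does this for you at runtime, so you normally never write this yourself; the function is illustrative only, not part of the package's API.

```python
import os

# Illustrative sketch: read KEY=value lines from a .env file into the
# process environment, skipping blanks and comments, without overwriting
# variables that are already set.
def load_env_file(path: str = ".env") -> None:
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```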


Model Format

Every model is passed as a string in the form:

<provider>:<model_name>
  • Provider Prefix: Identifies the integration class and loads the proper API key and settings.
  • Model Name: Often validated via a type alias (e.g. KnownModelName) to select the specific LLM.
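The convention is easy to see in code. The helper below is hypothetical (the package parses model strings internally); it just shows how a provider-prefixed identifier breaks down:

```python
# Illustrative: split "<provider>:<model_name>" into its two parts.
def split_model_id(model: str) -> tuple[str, str]:
    provider, _, name = model.partition(":")
    return provider, name

print(split_model_id("openai:gpt-4o-mini"))  # ('openai', 'gpt-4o-mini')
```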

Supported Providers and Their Models

  • OpenAI (prefix openai:)
    GPT‑4o series:
    openai:gpt-4o
    openai:gpt-4o-2024-05-13
    openai:gpt-4o-2024-08-06
    openai:gpt-4o-2024-11-20
    openai:gpt-4o-audio-preview
    openai:gpt-4o-audio-preview-2024-10-01
    openai:gpt-4o-audio-preview-2024-12-17
    openai:gpt-4o-mini
    openai:gpt-4o-mini-2024-07-18
    openai:gpt-4o-mini-audio-preview
    openai:gpt-4o-mini-audio-preview-2024-12-17

    o1 series:
    openai:o1
    openai:o1-2024-12-17
    openai:o1-mini
    openai:o1-mini-2024-09-12
    openai:o1-preview
    openai:o1-preview-2024-09-12

  • Anthropic (prefix anthropic:)
    anthropic:claude-3-5-haiku-latest
    anthropic:claude-3-5-sonnet-latest
    anthropic:claude-3-opus-latest

  • Gemini (prefix google-gla: for the Generative Language API, or google-vertex: for Vertex AI)
    gemini-1.0-pro
    gemini-1.5-flash
    gemini-1.5-flash-8b
    gemini-1.5-pro
    gemini-2.0-flash-exp
    gemini-2.0-flash-thinking-exp-01-21
    gemini-exp-1206

  • Ollama (prefix ollama:)
    Accepts any valid Ollama model. Common examples:
    ollama:llama3.2
    ollama:llama3.2-vision
    ollama:llama3.3-70b-specdec
    (See ollama.com/library)

  • DeepSeek (prefix deepseek:)
    deepseek:deepseek-chat

Note: For LLaMA-based models, Ollama (and providers like Groq, if available) are the primary options.


Usage Examples

Structured Response (Single Prompt)

import asyncio
from llmhandler.api_handler import UnifiedLLMHandler
from llmhandler._internal_models import SimpleResponse

async def structured_example():
    handler = UnifiedLLMHandler()  # API keys auto-loaded from .env
    result = await handler.process(
        prompts="Generate a catchy marketing slogan for a coffee brand.",
        model="openai:gpt-4o-mini",
        response_type=SimpleResponse
    )
    print("Structured Response:", result.data)

asyncio.run(structured_example())

Unstructured Response (Single Prompt)

import asyncio
from llmhandler.api_handler import UnifiedLLMHandler

async def unstructured_example():
    handler = UnifiedLLMHandler()
    result = await handler.process(
        prompts="Tell me a fun fact about dolphins.",
        model="openai:gpt-4o-mini"
        # No response_type provided: returns raw text.
    )
    print("Unstructured Response:", result)

asyncio.run(unstructured_example())

Multiple Prompts (Structured)

import asyncio
from llmhandler.api_handler import UnifiedLLMHandler
from llmhandler._internal_models import SimpleResponse

async def multiple_prompts_example():
    handler = UnifiedLLMHandler()
    prompts = [
        "Generate a slogan for a coffee brand.",
        "Create a tagline for a tea company."
    ]
    result = await handler.process(
        prompts=prompts,
        model="openai:gpt-4o-mini",
        response_type=SimpleResponse
    )
    print("Multiple Structured Responses:", result.data)

asyncio.run(multiple_prompts_example())

Batch Processing Example

import asyncio
from llmhandler.api_handler import UnifiedLLMHandler
from llmhandler._internal_models import SimpleResponse

async def batch_example():
    # Set a rate limit to avoid overwhelming the API
    handler = UnifiedLLMHandler(requests_per_minute=60)
    prompts = [
        "Generate a slogan for a coffee brand.",
        "Create a tagline for a tea company.",
        "Write a catchphrase for a juice brand."
    ]
    # Use batch_mode=True to process multiple prompts together (structured responses only)
    batch_result = await handler.process(
        prompts=prompts,
        model="openai:gpt-4o-mini",
        response_type=SimpleResponse,
        batch_mode=True
    )
    print("Batch Processing Result:", batch_result.data)

asyncio.run(batch_example())

Partial Failure Example

When processing multiple prompts, LLMHandler processes each prompt independently. If one prompt fails (for example, if the prompt is extremely long), its error is captured and returned along with the successful responses.

Below is an example that demonstrates this behavior. In this case, we deliberately send a “bad” prompt that repeats the word "word " 2,000,001 times (approximately 2 million words) so that it exceeds the model’s token limit. The resulting output will include an error for that prompt while still returning responses for the other prompts.

import asyncio
from llmhandler.api_handler import UnifiedLLMHandler
from llmhandler._internal_models import SimpleResponse

async def partial_failure_example():
    handler = UnifiedLLMHandler()
    # Two good prompts and one extremely long (bad) prompt.
    good_prompt = "Tell me a fun fact about penguins."
    # Construct a bad prompt that far exceeds any realistic token limit.
    # Here we repeat "word " 2,000,001 times (approximately 2 million words),
    # which should trigger a token limit error.
    bad_prompt = "word " * 2000001
    another_good = "What are the benefits of regular exercise?"
    partial_prompts = [good_prompt, bad_prompt, another_good]

    result = await handler.process(
        prompts=partial_prompts,
        model="openai:gpt-4o-mini",
        response_type=SimpleResponse
    )
    print("Partial Failure Real API Result:")
    # The returned object is a UnifiedResponse whose data is a list of PromptResult objects.
    results_list = result.data if hasattr(result, "data") else result
    for pr in results_list:
        display_prompt = pr.prompt if len(pr.prompt) < 60 else pr.prompt[:60] + "..."
        print(f"Prompt: {display_prompt}")
        if pr.error:
            print(f"  ERROR: {pr.error}")
        else:
            print(f"  Response: {pr.data}")
        print("-" * 40)

asyncio.run(partial_failure_example())

Advanced Features

  • Batch Processing & Rate Limiting:
    Initialize the handler with requests_per_minute to throttle calls. When processing a list of prompts, set batch_mode=True to handle them as a batch (supported only for structured responses).

  • Structured vs. Unstructured Responses:

    • Supply a Pydantic model as response_type for validated, structured output.
    • Omit or set response_type=None to receive raw, unstructured text.
  • Partial Failure Handling:
    When multiple prompts are submitted, each prompt is processed independently. If one prompt fails (for example, if you submit a prompt that far exceeds the maximum token limit—as with a prompt containing over 2 million words), the error is captured in its corresponding result. You will receive a list of results where each item (a PromptResult) contains the original prompt along with either a valid response or an error message. This lets you handle failures on a per‑prompt basis without aborting the entire request.

  • Troubleshooting:
    Error messages (such as schema validation failures, token limit errors, or overloaded service errors) are clearly reported in the error field of the UnifiedResponse or PromptResult. Make sure your model strings follow the <provider>:<model_name> format exactly.


Testing

A comprehensive test suite is included. To run it, execute:

pytest

Development & Contribution

Contributions are welcome! To set up your development environment:

  1. Clone the Repository:

    git clone https://github.com/yourusername/LLMHandler.git
    cd LLMHandler
    
  2. Install Dependencies:

    pdm install
    
  3. Run Tests:

    pytest
    
  4. Submit a Pull Request with your improvements or bug fixes.


License

This project is licensed under the MIT License.


Contact

For questions, feedback, or contributions, please reach out to:

Bryan Nsoh
Email: bryan.anye.5@gmail.com


Happy coding with LLMHandler!
