Lightweight, pluggable adapter for multiple LLM APIs (OpenAI, Anthropic, Google)

LLM API Adapter SDK for Python

Overview

This lightweight SDK for Python allows you to use LLM APIs from various providers and models through a unified interface. It is designed to be minimal, dependency-free, and easy to integrate into any Python project. Currently, the project supports API integration for OpenAI, Anthropic, and Google, focusing on chat functionality with consistent cost tracking and unified error handling.

Version

Current version: 0.2.1

Features

  • Unified Interface: Work seamlessly with different LLM providers using a single, consistent API.
  • Multiple Provider Support: Currently supports OpenAI, Anthropic, and Google APIs, allowing easy switching between them.
  • Chat Functionality: Provides an easy way to interact with chat-based LLMs.
  • Extensible Design: Built to easily extend support for additional providers and new functionalities in the future.
  • Error Handling: Standardized error messages across all supported LLMs, simplifying integration and debugging.
  • Flexible Configuration: Manage request parameters like temperature, max tokens, and other settings for fine-tuned control.
  • Token and Cost Accounting: Automatic calculation of token usage and cost per request.
  • Pricing Registry: Model prices are stored in a unified JSON registry with per-model input/output pricing and currency support.

Installation

To install the SDK, you can use pip:

pip install llm-api-adapter

Note: You will need to obtain API keys from each LLM provider you wish to use (OpenAI, Anthropic, Google). Refer to their respective documentation for instructions on obtaining API keys.
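
One common way to keep API keys out of source code is to read them from environment variables. The sketch below is a minimal example of that approach. The variable names (OPENAI_API_KEY, ANTHROPIC_API_KEY, GOOGLE_API_KEY) and the helper load_api_keys are assumptions made for illustration; the SDK itself takes the key through the api_key argument and does not prescribe where it comes from.

```python
import os

# Assumed environment variable names; the SDK does not prescribe them.
ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
}


def load_api_keys(environ=None):
    """Return {provider: key} for every provider whose variable is set and non-empty."""
    environ = os.environ if environ is None else environ
    keys = {}
    for provider, var_name in ENV_VARS.items():
        value = environ.get(var_name, "").strip()
        if value:
            keys[provider] = value
    return keys


if __name__ == "__main__":
    available = load_api_keys()
    print("Providers with a key configured:", sorted(available))
```

The returned dictionary can then supply the api_key argument for whichever provider you instantiate.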

Getting Started

Importing and Setting Up the Adapter

To start using the adapter, you need to import the necessary components:

from llm_api_adapter.models.messages.chat_message import (
    AIMessage, Prompt, UserMessage
)
from llm_api_adapter.universal_adapter import UniversalLLMAPIAdapter

Sending a Simple Request

The SDK supports three types of messages for interacting with the LLM:

  • Prompt: Use Prompt to set the context or initial prompt for the model.
  • UserMessage: Use UserMessage to send messages from the user during a conversation.
  • AIMessage: Use AIMessage to simulate responses from the assistant during a conversation.

Here is an example of how to send a simple request to the adapter:

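# Placeholder: replace with your real key, ideally loaded from a secret store.
openai_api_key = "YOUR_OPENAI_API_KEY"

messages = [
    UserMessage("Hi! Can you explain how artificial intelligence works?")
]

adapter = UniversalLLMAPIAdapter(
    organization="openai",
    model="gpt-5",
    api_key=openai_api_key
)

# The sampling parameters are described in the Parameters section.
response = adapter.generate_chat_answer(
    messages=messages,
    max_tokens=256,
    temperature=1.0,
    top_p=1.0
)
print(response.content)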

Parameters

  • max_tokens: The maximum number of tokens to generate in the response. This limits the length of the output.

  • temperature: Controls the randomness of the response. Higher values (e.g., 0.8) make the output more random, while lower values (e.g., 0.2) make it more focused and deterministic. Default value: 1.0 (range: 0 to 2).

  • top_p: Nucleus sampling threshold. The model samples only from the smallest set of most likely tokens whose cumulative probability reaches top_p, which makes responses more focused and coherent. Default value: 1.0 (range: 0 to 1).
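
To build intuition for temperature and top_p, the following sketch applies both to a toy three-token distribution. It is only an illustration of the general sampling technique: the providers apply these parameters on their side, and the function names apply_temperature and top_p_filter are hypothetical, not part of the SDK.

```python
import math


def apply_temperature(logits, temperature):
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # Limit case: all probability goes to the highest-scoring token (greedy).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in ranked:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    # Renormalize so the kept probabilities sum to 1 again.
    return {i: probs[i] / mass for i in kept}


logits = [2.0, 1.0, 0.0]  # toy scores for three candidate tokens
for t in (0.2, 1.0, 2.0):
    print(t, [round(p, 3) for p in apply_temperature(logits, t)])
print(top_p_filter(apply_temperature(logits, 1.0), top_p=0.9))
```

Lower temperatures concentrate probability on the top token, and a top_p of 0.9 drops the least likely token from this toy distribution before sampling.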

Handling Errors

Common Errors

The SDK provides a set of standardized errors for easier debugging and integration:

  • LLMAPIError: Base class for all API-related errors. This error is also used for any unexpected LLM API errors.

  • LLMAPIAuthorizationError: Raised when authentication or authorization fails.

  • LLMAPIRateLimitError: Raised when rate limits are exceeded.

  • LLMAPITokenLimitError: Raised when token limits are exceeded.

  • LLMAPIClientError: Raised when the client makes an invalid request.

  • LLMAPIServerError: Raised when the server encounters an error.

  • LLMAPITimeoutError: Raised when a request times out.

  • LLMAPIUsageLimitError: Raised when usage limits are exceeded.
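
These error classes lend themselves to a simple retry policy: transient failures (rate limit, timeout, server error) are usually worth retrying, while authorization or client errors are not. Below is a minimal sketch of that idea. This README does not show where the error classes are imported from, so the snippet defines local stand-in classes with the same names in order to run on its own; in real code you would import the SDK's classes instead. The helper call_with_retry and its parameters are hypothetical.

```python
import time


# Stand-ins that mirror the SDK's error names so this sketch is self-contained.
class LLMAPIError(Exception):
    pass


class LLMAPIRateLimitError(LLMAPIError):
    pass


class LLMAPITimeoutError(LLMAPIError):
    pass


class LLMAPIServerError(LLMAPIError):
    pass


class LLMAPIAuthorizationError(LLMAPIError):
    pass


# Errors treated as transient in this sketch; everything else propagates at once.
RETRYABLE_ERRORS = (LLMAPIRateLimitError, LLMAPITimeoutError, LLMAPIServerError)


def call_with_retry(request, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call request(); retry transient errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except RETRYABLE_ERRORS:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last transient error
            sleep(base_delay * 2 ** (attempt - 1))  # waits 1s, 2s, 4s, ...


# Demo with a fake request that fails twice before succeeding.
attempts = []


def flaky_request():
    attempts.append(1)
    if len(attempts) < 3:
        raise LLMAPIRateLimitError("rate limit exceeded")
    return "ok"


delays = []  # record the delays instead of actually sleeping
result = call_with_retry(flaky_request, sleep=delays.append)
print(result, delays)  # ok [1.0, 2.0]
```

With the SDK, request would typically be a small function or lambda that wraps adapter.generate_chat_answer(messages=messages).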

Configuration and Management

Using Different Providers and Models

The SDK allows you to easily switch between LLM providers and specify the model you want to use. Currently supported providers are OpenAI, Anthropic, and Google.

  • OpenAI: You can use models like gpt-5, gpt-5-mini, gpt-5-nano, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini.
  • Anthropic: Available models include claude-sonnet-4-5, claude-opus-4-1, claude-opus-4-0, claude-sonnet-4-0, claude-3-7-sonnet-latest, claude-3-5-haiku-latest, claude-3-haiku-20240307.
  • Google: Models such as gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite, gemini-2.0-flash, gemini-2.0-flash-lite can be used.

Example:

adapter = UniversalLLMAPIAdapter(
    organization="openai",
    model="gpt-5",
    api_key=openai_api_key
)

To use another provider, create the adapter with a different organization, model, and api_key.

Switching Providers

Note: Each instance of UniversalLLMAPIAdapter is tied to a specific provider and model. You cannot change the organization parameter of an existing adapter object; to use a different provider, create a new instance.

The following example sends the same messages to three providers, using one adapter instance per provider:

gpt = UniversalLLMAPIAdapter(
    organization="openai",
    model="gpt-5",
    api_key=openai_api_key
)
gpt_response = gpt.generate_chat_answer(messages=messages)
print(gpt_response.content)

claude = UniversalLLMAPIAdapter(
    organization="anthropic",
    model="claude-sonnet-4-5",
    api_key=anthropic_api_key
)
claude_response = claude.generate_chat_answer(messages=messages)
print(claude_response.content)

google = UniversalLLMAPIAdapter(
    organization="google",
    model="gemini-2.5-flash",
    api_key=google_api_key
)
google_response = google.generate_chat_answer(messages=messages)
print(google_response.content)
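
When the provider is chosen at runtime, it can help to keep per-provider defaults in one table and build the constructor arguments from it. A minimal sketch follows. DEFAULT_MODELS and adapter_kwargs are hypothetical helpers, not part of the SDK, and the default models are simply the ones used in the examples in this README.

```python
# Default model per provider, taken from the examples in this README.
DEFAULT_MODELS = {
    "openai": "gpt-5",
    "anthropic": "claude-sonnet-4-5",
    "google": "gemini-2.5-flash",
}


def adapter_kwargs(organization, api_key, model=None):
    """Build keyword arguments for UniversalLLMAPIAdapter(**kwargs)."""
    if organization not in DEFAULT_MODELS:
        supported = ", ".join(sorted(DEFAULT_MODELS))
        raise ValueError(
            f"Unknown organization {organization!r}; expected one of: {supported}"
        )
    return {
        "organization": organization,
        "model": model or DEFAULT_MODELS[organization],
        "api_key": api_key,
    }


kwargs = adapter_kwargs("anthropic", api_key="YOUR_ANTHROPIC_API_KEY")
print(kwargs["organization"], kwargs["model"])
# With the SDK installed: adapter = UniversalLLMAPIAdapter(**kwargs)
```

Validating the organization name up front gives a clear error before any network call is attempted.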

Example Use Case

Here is a complete example that uses all three message types in a multi-turn conversation:

from llm_api_adapter.models.messages.chat_message import (
    AIMessage, Prompt, UserMessage
)
from llm_api_adapter.universal_adapter import UniversalLLMAPIAdapter

messages = [
    Prompt(
        "You are a friendly assistant who explains complex concepts "
        "in simple terms."
    ),
    UserMessage("Hi! Can you explain how artificial intelligence works?"),
    AIMessage(
        "Sure! Artificial intelligence (AI) is a system that can perform "
        "tasks requiring human-like intelligence, such as recognizing images "
        "or understanding language. It learns by analyzing large amounts of "
        "data, finding patterns, and making predictions."
    ),
    UserMessage("How does AI learn?"),
]

adapter = UniversalLLMAPIAdapter(
    organization="openai",
    model="gpt-5",
    api_key=openai_api_key
)

response = adapter.generate_chat_answer(
    messages=messages,
    max_tokens=256,
    temperature=1.0,
    top_p=1.0
)
print(response.content)

The ChatResponse object returned by generate_chat_answer includes:

  1. model: The model that generated the response.
  2. response_id: Unique identifier for the response.
  3. timestamp: Response generation time.
  4. usage: Object containing input_tokens, output_tokens, and total_tokens.
  5. currency: The currency used for cost calculation.
  6. cost_input: Cost of input tokens.
  7. cost_output: Cost of output tokens.
  8. cost_total: Total combined cost.
  9. content: The generated text response.
  10. finish_reason: Reason why generation stopped (e.g., "stop", "length").
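
As an illustration of how these fields might be used together, the sketch below builds a one-line summary and flags responses that were cut off. It uses a stand-in object with made-up values rather than a real ChatResponse, so it runs without the SDK or an API key. The helper describe_response is hypothetical, and the sketch assumes that a finish_reason of "length" means the max_tokens limit was reached, as it does in OpenAI's API.

```python
from types import SimpleNamespace


def describe_response(response):
    """Return a one-line summary and whether the output looks truncated."""
    # Assumption: "length" means generation stopped at the max_tokens limit.
    truncated = response.finish_reason == "length"
    summary = (
        f"{response.model}: {response.usage.total_tokens} tokens, "
        f"{response.cost_total} {response.currency}"
    )
    if truncated:
        summary += " (truncated; consider raising max_tokens)"
    return summary, truncated


# Stand-in with made-up values, shaped like the fields listed above.
fake_response = SimpleNamespace(
    model="gpt-5",
    usage=SimpleNamespace(input_tokens=12, output_tokens=256, total_tokens=268),
    currency="USD",
    cost_total=0.0026,
    content="Artificial intelligence learns by ...",
    finish_reason="length",
)

summary, truncated = describe_response(fake_response)
print(summary)
```

The same function should work on a real response object as long as it exposes the attributes listed above.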

Token Usage and Pricing

Token Usage and Pricing Example

google = UniversalLLMAPIAdapter(
    organization="google",
    model="gemini-2.5-flash",
    api_key=google_api_key
)

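# chat_params holds the keyword arguments passed to generate_chat_answer.
chat_params = {"messages": messages, "max_tokens": 256}
response = google.generate_chat_answer(**chat_params)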

print(response.usage.input_tokens, "tokens", f"({response.cost_input} {response.currency})")
print(response.usage.output_tokens, "tokens", f"({response.cost_output} {response.currency})")
print(response.usage.total_tokens, "tokens", f"({response.cost_total} {response.currency})")

Overriding Pricing or Currency

google = UniversalLLMAPIAdapter(
    organization="google",
    model="gemini-2.5-flash",
    api_key=google_api_key
)

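# Override the registry prices (per 1M tokens) and the reporting currency.
google.pricing.set_in_per_1m(1.5)
google.pricing.set_out_per_1m(3)
google.pricing.set_currency("EUR")

# chat_params: keyword arguments for generate_chat_answer, e.g. {"messages": messages}
response = google.generate_chat_answer(**chat_params)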
print(response.content)
print(response.usage.input_tokens, "tokens", f"({response.cost_input} {response.currency})")
print(response.usage.output_tokens, "tokens", f"({response.cost_output} {response.currency})")
print(response.usage.total_tokens, "tokens", f"({response.cost_total} {response.currency})")
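
The setter names suggest that prices are expressed per one million tokens. Assuming the cost is linear in the token count (an assumption; the SDK's exact calculation and rounding are not documented here), the arithmetic can be sketched as follows. estimate_cost is a hypothetical helper, not part of the SDK.

```python
def estimate_cost(input_tokens, output_tokens, in_per_1m, out_per_1m):
    """Return (cost_input, cost_output, cost_total) for per-1M-token prices."""
    cost_input = input_tokens / 1_000_000 * in_per_1m
    cost_output = output_tokens / 1_000_000 * out_per_1m
    return cost_input, cost_output, cost_input + cost_output


# Prices from the override example: 1.5 in, 3 out (EUR per 1M tokens).
# The token counts are made up for illustration.
cost_in, cost_out, cost_total = estimate_cost(
    input_tokens=2_000, output_tokens=500, in_per_1m=1.5, out_per_1m=3
)
print(f"{cost_in:.4f} + {cost_out:.4f} = {cost_total:.4f} EUR")
```

For 2,000 input tokens and 500 output tokens, this gives 0.0030 + 0.0015 = 0.0045 EUR.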

Testing

This project uses pytest for testing. Tests are located in the tests/ directory.

Development & Testing

Note
This section is intended for developers working with the source code from GitHub.
It is not relevant for users installing the package from PyPI.

To run all tests, use the following command:

pytest

Alternatively, you can run the tests using the tests_runner.py script:

python tests/tests_runner.py

Dependencies

Ensure you have the required dependencies installed. You can install them using:

pip install -r requirements-test.txt

Test Structure

  • unit/: Contains unit tests for individual components.
  • integration/: Contains integration tests to verify the interaction between different parts of the system.

License

This project is licensed under the terms of the MIT License.
See the LICENSE file for details.
