# llm_async

Multi-LLM Provider Library

An async-first Python library for interacting with Large Language Model (LLM) providers.
## Features
- Async-first: Built with asyncio for high-performance, non-blocking operations.
- Provider Support: Supports OpenAI, Anthropic Claude, Google Gemini, and OpenRouter for chat completions.
- Tool Calling: Tool execution with unified tool definitions across providers.
- Structured Outputs: Enforce JSON schema validation on responses (OpenAI, Google, OpenRouter).
- Extensible: Easy to add new providers by inheriting from `BaseProvider`.
- Tested: Comprehensive test suite with high coverage.
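As a sketch of that extension point, a new provider roughly amounts to subclassing the base class and implementing `acomplete`. The `BaseProvider` shape below is a reduced stand-in for illustration, not the library's actual interface; check the source before subclassing:

```python
import abc
import asyncio


class BaseProvider(abc.ABC):
    """Illustrative stand-in for llm_async's base class, reduced to the essentials."""

    def __init__(self, api_key: str, base_url: str = ""):
        self.api_key = api_key
        self.base_url = base_url

    @abc.abstractmethod
    async def acomplete(self, model: str, messages: list, **kwargs):
        """Send a chat completion request and return the parsed response."""


class EchoProvider(BaseProvider):
    """Toy provider that echoes the last user message instead of calling an API."""

    async def acomplete(self, model: str, messages: list, **kwargs):
        return {"content": messages[-1]["content"]}


provider = EchoProvider(api_key="unused")
print(asyncio.run(provider.acomplete("toy", [{"role": "user", "content": "hi"}])))
```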
## Installation

### Using Poetry (Recommended)

```shell
poetry add llm_async
```

### Using pip

```shell
pip install git+https://github.com/sonic182/llm_async.git
```
## Usage

### Basic Chat Completion

#### OpenAI

```python
import asyncio

from llm_async import OpenAIProvider


async def main():
    # Initialize the provider with your API key
    provider = OpenAIProvider(api_key="your-openai-api-key")

    # Perform a chat completion
    response = await provider.acomplete(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello, how are you?"}
        ]
    )
    print(response.main_response.content)  # Output: the assistant's response


# Run the async function
asyncio.run(main())
```
#### OpenRouter

```python
import asyncio
import os

from llm_async import OpenRouterProvider


async def main():
    # Initialize the provider with your API key
    provider = OpenRouterProvider(api_key=os.getenv("OPENROUTER_API_KEY"))

    # Perform a chat completion
    response = await provider.acomplete(
        model="openrouter/auto",  # Let OpenRouter choose the best model
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello, how are you?"}
        ],
        http_referer="https://github.com/your-username/your-app",  # Optional
        x_title="My AI App"  # Optional
    )
    print(response.main_response.content)  # Output: the assistant's response


# Run the async function
asyncio.run(main())
```
#### Google Gemini

```python
import asyncio

from llm_async.providers.google import GoogleProvider


async def main():
    # Initialize the provider with your API key
    provider = GoogleProvider(api_key="your-google-gemini-api-key")

    # Perform a chat completion
    response = await provider.acomplete(
        model="gemini-2.5-flash",
        messages=[
            {"role": "user", "content": "Hello, how are you?"}
        ]
    )
    print(response.main_response.content)  # Output: the assistant's response


# Run the async function
asyncio.run(main())
```
### Custom Base URL

```python
provider = OpenAIProvider(
    api_key="your-api-key",
    base_url="https://custom-openai-endpoint.com/v1"
)
```
### Tool Usage

```python
import asyncio
import os

from llm_async.models import Tool
from llm_async.providers import OpenAIProvider

# Define a calculator tool
calculator_tool = Tool(
    name="calculator",
    description="Perform basic arithmetic operations",
    parameters={
        "type": "object",
        "properties": {
            "operation": {
                "type": "string",
                "enum": ["add", "subtract", "multiply", "divide"]
            },
            "a": {"type": "number"},
            "b": {"type": "number"}
        },
        "required": ["operation", "a", "b"]
    },
    input_schema={
        "type": "object",
        "properties": {
            "operation": {
                "type": "string",
                "enum": ["add", "subtract", "multiply", "divide"]
            },
            "a": {"type": "number"},
            "b": {"type": "number"}
        },
        "required": ["operation", "a", "b"]
    }
)


def calculator(operation: str, a: float, b: float) -> float:
    """Calculator function that can be called by the LLM."""
    if operation == "add":
        return a + b
    elif operation == "subtract":
        return a - b
    elif operation == "multiply":
        return a * b
    elif operation == "divide":
        return a / b
    return 0


async def main():
    # Initialize provider
    provider = OpenAIProvider(api_key=os.getenv("OPENAI_API_KEY"))

    # Tool executor mapping
    tools_map = {"calculator": calculator}

    # Initial user message
    messages = [{"role": "user", "content": "What is 15 + 27?"}]

    # First turn: ask the LLM to perform a calculation
    response = await provider.acomplete(
        model="gpt-4o-mini",
        messages=messages,
        tools=[calculator_tool]
    )

    # Execute the tool call
    tool_call = response.main_response.tool_calls[0]
    tool_result = await provider.execute_tool(tool_call, tools_map)

    # Second turn: send the tool result back to the LLM
    messages_with_tool = messages + [response.main_response.original_data] + [tool_result]
    final_response = await provider.acomplete(
        model="gpt-4o-mini",
        messages=messages_with_tool
    )
    print(final_response.main_response.content)  # Output: the final answer


asyncio.run(main())
```
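Independent of the library, the dispatch step (routing a tool call to a local Python function) can be sketched in isolation. The tool-call shape below, a dict with a `name` and a JSON-encoded `arguments` string, mirrors the OpenAI wire format and is an assumption here, not llm_async's internal representation:

```python
import json


def dispatch_tool_call(call: dict, tools_map: dict) -> str:
    """Look up the named tool and invoke it with the decoded arguments.

    `call` is assumed to follow an OpenAI-style shape:
    {"name": "calculator", "arguments": "{\"operation\": \"add\", ...}"}
    """
    func = tools_map[call["name"]]
    kwargs = json.loads(call["arguments"])
    result = func(**kwargs)
    # Tool results are sent back to the model as strings
    return json.dumps(result)


def calculator(operation: str, a: float, b: float) -> float:
    ops = {"add": a + b, "subtract": a - b, "multiply": a * b,
           "divide": a / b if b else 0.0}
    return ops[operation]


call = {"name": "calculator", "arguments": '{"operation": "add", "a": 15, "b": 27}'}
print(dispatch_tool_call(call, {"calculator": calculator}))  # prints 42
```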
### Structured Outputs

Enforce JSON schema validation on model responses for consistent, type-safe outputs.

```python
import asyncio
import json

from llm_async import OpenAIProvider
from llm_async.providers.google import GoogleProvider

# Define the response schema
response_schema = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "confidence": {"type": "number"}
    },
    "required": ["answer", "confidence"],
    "additionalProperties": False
}


async def main():
    # OpenAI example
    openai_provider = OpenAIProvider(api_key="your-openai-key")
    response = await openai_provider.acomplete(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "What is the capital of France?"}],
        response_schema=response_schema
    )
    result = json.loads(response.main_response.content)
    print(f"OpenAI: {result}")

    # Google Gemini example
    google_provider = GoogleProvider(api_key="your-google-key")
    response = await google_provider.acomplete(
        model="gemini-2.5-flash",
        messages=[{"role": "user", "content": "What is the capital of France?"}],
        response_schema=response_schema
    )
    result = json.loads(response.main_response.content)
    print(f"Gemini: {result}")


asyncio.run(main())
```

Supported providers: OpenAI, Google Gemini, OpenRouter. Anthropic Claude does not support structured outputs.
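Even with provider-side enforcement, it can be worth verifying the parsed payload locally before using it. A minimal stdlib-only check against the schema above (this helper is illustrative and not part of llm_async; it handles only flat object schemas with `string` and `number` types):

```python
import json


def check_response(payload: str, schema: dict) -> dict:
    """Parse a JSON payload and verify it against a flat object schema."""
    data = json.loads(payload)
    type_map = {"string": str, "number": (int, float)}
    for key in schema["required"]:
        if key not in data:
            raise ValueError(f"missing required key: {key}")
    for key, spec in schema["properties"].items():
        if key in data and not isinstance(data[key], type_map[spec["type"]]):
            raise TypeError(f"{key} should be a {spec['type']}")
    return data


schema = {
    "type": "object",
    "properties": {"answer": {"type": "string"}, "confidence": {"type": "number"}},
    "required": ["answer", "confidence"],
}
print(check_response('{"answer": "Paris", "confidence": 0.98}', schema))
```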
## API Reference

### OpenAIProvider

- `__init__(api_key: str, base_url: str = "https://api.openai.com/v1")`
- `acomplete(model: str, messages: list[dict], stream: bool = False, **kwargs) -> Response | AsyncIterator[StreamChunk]`
  Performs a chat completion. When `stream=True`, the method returns an async iterator that yields `StreamChunk` objects as they arrive from the provider.

### OpenRouterProvider

- `__init__(api_key: str, base_url: str = "https://openrouter.ai/api/v1")`
- `acomplete(model: str, messages: list[dict], stream: bool = False, **kwargs) -> Response | AsyncIterator[StreamChunk]`
  Performs a chat completion using OpenRouter's unified API. Supports the same OpenAI-compatible interface with additional optional headers:
  - `http_referer`: Your application's URL (recommended)
  - `x_title`: Your application's name (recommended)

OpenRouter provides access to hundreds of AI models from various providers through a single API.

### GoogleProvider

- `__init__(api_key: str, base_url: str = "https://generativelanguage.googleapis.com/v1beta/models/")`
- `acomplete(model: str, messages: list[dict], stream: bool = False, **kwargs) -> Response | AsyncIterator[StreamChunk]`
  Performs a chat completion using Google's Gemini API. Supports structured outputs and uses camelCase for API keys (e.g., `generationConfig`).
### Streaming

- Usage: iterate with `async for chunk in await provider.acomplete(..., stream=True):` and print or process each `chunk` in real time.
Example output:

```text
--- OpenAI streaming response ---
1. Peel and slice potatoes.
2. Par-cook potatoes briefly.
3. Whisk eggs with salt and pepper.
4. Sauté onions until translucent (optional).
5. Combine potatoes and eggs in a pan and cook until set.
6. Fold and serve.

--- Claude streaming response ---
1. Prepare potatoes by peeling and slicing.
2. Fry or boil until tender.
3. Beat eggs and season.
4. Mix potatoes with eggs and cook gently.
5. Serve warm.
```
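The `async for` consumption pattern above can be sketched with a stand-in async generator in place of a real provider. The chunks here are plain strings for illustration; the library itself yields `StreamChunk` objects:

```python
import asyncio


async def fake_stream():
    """Stand-in for provider.acomplete(..., stream=True): yields text chunks."""
    for piece in ["1. Peel ", "and slice ", "potatoes.\n"]:
        await asyncio.sleep(0)  # simulate waiting on the network
        yield piece


async def consume() -> str:
    # Accumulate streamed chunks into the full response text
    parts = []
    async for chunk in fake_stream():
        parts.append(chunk)
    return "".join(parts)


print(asyncio.run(consume()))  # prints the assembled text
```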
## Development

### Setup

```shell
git clone https://github.com/sonic182/llm_async.git
cd llm_async
poetry install
```

### Running Tests

```shell
poetry run pytest
```

### Building

```shell
poetry build
```
## Roadmap
- Support for additional providers (e.g., Grok, Anthropic direct API)
- More advanced tool features
- Response caching and retry mechanisms
## Contributing
Contributions are welcome! Please open an issue or submit a pull request on GitHub.
## License
MIT License - see the LICENSE file for details.
## Authors
- sonic182