# llm_async

Multi-LLM Provider Library

An async-first Python library for interacting with Large Language Model (LLM) providers.
## Features
- Async-first: Built with asyncio for high-performance, non-blocking operations.
- Provider Support: Supports OpenAI, Anthropic Claude, Google Gemini, and OpenRouter for chat completions.
- Tool Calling: Tool execution with unified tool definitions across providers.
- Structured Outputs: Enforce JSON schema validation on responses (OpenAI, Google, OpenRouter).
- Extensible: Easy to add new providers by inheriting from `BaseProvider`.
- Tested: Comprehensive test suite with high coverage.
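As a sketch of that extension point, a new provider roughly amounts to subclassing the base class and implementing `acomplete`. The `BaseProvider` shape below is a reduced stand-in for illustration, not the library's actual interface; check the source before subclassing:

```python
import abc
import asyncio


class BaseProvider(abc.ABC):
    """Illustrative stand-in for llm_async's base class, reduced to the essentials."""

    def __init__(self, api_key: str, base_url: str = ""):
        self.api_key = api_key
        self.base_url = base_url

    @abc.abstractmethod
    async def acomplete(self, model: str, messages: list, **kwargs):
        """Send a chat completion request and return the parsed response."""


class EchoProvider(BaseProvider):
    """Toy provider that echoes the last user message instead of calling an API."""

    async def acomplete(self, model: str, messages: list, **kwargs):
        return {"content": messages[-1]["content"]}


provider = EchoProvider(api_key="unused")
print(asyncio.run(provider.acomplete("toy", [{"role": "user", "content": "hi"}])))
```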
## Installation

### Using Poetry (Recommended)

```shell
poetry add llm_async
```

### Using pip

```shell
pip install git+https://github.com/sonic182/llm_async.git
```
## Usage

### Basic Chat Completion

#### OpenAI

```python
import asyncio

from llm_async import OpenAIProvider


async def main():
    # Initialize the provider with your API key
    provider = OpenAIProvider(api_key="your-openai-api-key")

    # Perform a chat completion
    response = await provider.acomplete(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello, how are you?"}
        ]
    )
    print(response.main_response.content)  # Output: the assistant's response


# Run the async function
asyncio.run(main())
```
#### OpenRouter

```python
import asyncio
import os

from llm_async import OpenRouterProvider


async def main():
    # Initialize the provider with your API key
    provider = OpenRouterProvider(api_key=os.getenv("OPENROUTER_API_KEY"))

    # Perform a chat completion
    response = await provider.acomplete(
        model="openrouter/auto",  # Let OpenRouter choose the best model
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello, how are you?"}
        ],
        http_referer="https://github.com/your-username/your-app",  # Optional
        x_title="My AI App"  # Optional
    )
    print(response.main_response.content)  # Output: the assistant's response


# Run the async function
asyncio.run(main())
```
#### Google Gemini

```python
import asyncio

from llm_async.providers.google import GoogleProvider


async def main():
    # Initialize the provider with your API key
    provider = GoogleProvider(api_key="your-google-gemini-api-key")

    # Perform a chat completion
    response = await provider.acomplete(
        model="gemini-2.5-flash",
        messages=[
            {"role": "user", "content": "Hello, how are you?"}
        ]
    )
    print(response.main_response.content)  # Output: the assistant's response


# Run the async function
asyncio.run(main())
```
### Custom Base URL

```python
provider = OpenAIProvider(
    api_key="your-api-key",
    base_url="https://custom-openai-endpoint.com/v1"
)
```
### Tool Usage

```python
import asyncio
import os

from llm_async.models import Tool
from llm_async.providers import OpenAIProvider

# Define a calculator tool
calculator_tool = Tool(
    name="calculator",
    description="Perform basic arithmetic operations",
    parameters={
        "type": "object",
        "properties": {
            "operation": {
                "type": "string",
                "enum": ["add", "subtract", "multiply", "divide"]
            },
            "a": {"type": "number"},
            "b": {"type": "number"}
        },
        "required": ["operation", "a", "b"]
    },
    input_schema={
        "type": "object",
        "properties": {
            "operation": {
                "type": "string",
                "enum": ["add", "subtract", "multiply", "divide"]
            },
            "a": {"type": "number"},
            "b": {"type": "number"}
        },
        "required": ["operation", "a", "b"]
    }
)


def calculator(operation: str, a: float, b: float) -> float:
    """Calculator function that can be called by the LLM."""
    if operation == "add":
        return a + b
    elif operation == "subtract":
        return a - b
    elif operation == "multiply":
        return a * b
    elif operation == "divide":
        return a / b
    return 0


async def main():
    # Initialize provider
    provider = OpenAIProvider(api_key=os.getenv("OPENAI_API_KEY"))

    # Tool executor mapping
    tools_map = {"calculator": calculator}

    # Initial user message
    messages = [{"role": "user", "content": "What is 15 + 27?"}]

    # First turn: ask the LLM to perform a calculation
    response = await provider.acomplete(
        model="gpt-4o-mini",
        messages=messages,
        tools=[calculator_tool]
    )

    # Execute the tool call
    tool_call = response.main_response.tool_calls[0]
    tool_result = await provider.execute_tool(tool_call, tools_map)

    # Second turn: send the tool result back to the LLM
    messages_with_tool = messages + [response.main_response.original_data] + [tool_result]
    final_response = await provider.acomplete(
        model="gpt-4o-mini",
        messages=messages_with_tool
    )
    print(final_response.main_response.content)  # Output: the final answer


asyncio.run(main())
```
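Independent of the library, the dispatch step (routing a tool call to a local Python function) can be sketched in isolation. The tool-call shape below, a dict with a `name` and a JSON-encoded `arguments` string, mirrors the OpenAI wire format and is an assumption here, not llm_async's internal representation:

```python
import json


def dispatch_tool_call(call: dict, tools_map: dict) -> str:
    """Look up the named tool and invoke it with the decoded arguments.

    `call` is assumed to follow an OpenAI-style shape:
    {"name": "calculator", "arguments": "{\"operation\": \"add\", ...}"}
    """
    func = tools_map[call["name"]]
    kwargs = json.loads(call["arguments"])
    result = func(**kwargs)
    # Tool results are sent back to the model as strings
    return json.dumps(result)


def calculator(operation: str, a: float, b: float) -> float:
    ops = {"add": a + b, "subtract": a - b, "multiply": a * b,
           "divide": a / b if b else 0.0}
    return ops[operation]


call = {"name": "calculator", "arguments": '{"operation": "add", "a": 15, "b": 27}'}
print(dispatch_tool_call(call, {"calculator": calculator}))  # prints 42
```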
### Structured Outputs

Enforce JSON schema validation on model responses for consistent, type-safe outputs.

```python
import asyncio
import json

from llm_async import OpenAIProvider
from llm_async.providers.google import GoogleProvider

# Define the response schema
response_schema = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "confidence": {"type": "number"}
    },
    "required": ["answer", "confidence"],
    "additionalProperties": False
}


async def main():
    # OpenAI example
    openai_provider = OpenAIProvider(api_key="your-openai-key")
    response = await openai_provider.acomplete(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "What is the capital of France?"}],
        response_schema=response_schema
    )
    result = json.loads(response.main_response.content)
    print(f"OpenAI: {result}")

    # Google Gemini example
    google_provider = GoogleProvider(api_key="your-google-key")
    response = await google_provider.acomplete(
        model="gemini-2.5-flash",
        messages=[{"role": "user", "content": "What is the capital of France?"}],
        response_schema=response_schema
    )
    result = json.loads(response.main_response.content)
    print(f"Gemini: {result}")


asyncio.run(main())
```

Supported providers: OpenAI, Google Gemini, OpenRouter. Anthropic Claude does not support structured outputs.
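Even with provider-side enforcement, it can be worth verifying the parsed payload locally before using it. A minimal stdlib-only check against the schema above (this helper is illustrative and not part of llm_async; it handles only flat object schemas with `string` and `number` types):

```python
import json


def check_response(payload: str, schema: dict) -> dict:
    """Parse a JSON payload and verify it against a flat object schema."""
    data = json.loads(payload)
    type_map = {"string": str, "number": (int, float)}
    for key in schema["required"]:
        if key not in data:
            raise ValueError(f"missing required key: {key}")
    for key, spec in schema["properties"].items():
        if key in data and not isinstance(data[key], type_map[spec["type"]]):
            raise TypeError(f"{key} should be a {spec['type']}")
    return data


schema = {
    "type": "object",
    "properties": {"answer": {"type": "string"}, "confidence": {"type": "number"}},
    "required": ["answer", "confidence"],
}
print(check_response('{"answer": "Paris", "confidence": 0.98}', schema))
```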
## API Reference

### OpenAIProvider

- `__init__(api_key: str, base_url: str = "https://api.openai.com/v1")`
- `acomplete(model: str, messages: list[dict], stream: bool = False, **kwargs) -> Response | AsyncIterator[StreamChunk]`
  Performs a chat completion. When `stream=True`, the method returns an async iterator that yields `StreamChunk` objects as they arrive from the provider.

### OpenRouterProvider

- `__init__(api_key: str, base_url: str = "https://openrouter.ai/api/v1")`
- `acomplete(model: str, messages: list[dict], stream: bool = False, **kwargs) -> Response | AsyncIterator[StreamChunk]`
  Performs a chat completion using OpenRouter's unified API. Supports the same OpenAI-compatible interface with additional optional headers:
  - `http_referer`: Your application's URL (recommended)
  - `x_title`: Your application's name (recommended)

OpenRouter provides access to hundreds of AI models from various providers through a single API.

### GoogleProvider

- `__init__(api_key: str, base_url: str = "https://generativelanguage.googleapis.com/v1beta/models/")`
- `acomplete(model: str, messages: list[dict], stream: bool = False, **kwargs) -> Response | AsyncIterator[StreamChunk]`
  Performs a chat completion using Google's Gemini API. Supports structured outputs and uses camelCase for API keys (e.g., `generationConfig`).
### Streaming

- Usage: iterate with `async for chunk in await provider.acomplete(..., stream=True):` and print or process each `chunk` in real time.
Example output:

```text
--- OpenAI streaming response ---
1. Peel and slice potatoes.
2. Par-cook potatoes briefly.
3. Whisk eggs with salt and pepper.
4. Sauté onions until translucent (optional).
5. Combine potatoes and eggs in a pan and cook until set.
6. Fold and serve.

--- Claude streaming response ---
1. Prepare potatoes by peeling and slicing.
2. Fry or boil until tender.
3. Beat eggs and season.
4. Mix potatoes with eggs and cook gently.
5. Serve warm.
```
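The `async for` consumption pattern above can be sketched with a stand-in async generator in place of a real provider. The chunks here are plain strings for illustration; the library itself yields `StreamChunk` objects:

```python
import asyncio


async def fake_stream():
    """Stand-in for provider.acomplete(..., stream=True): yields text chunks."""
    for piece in ["1. Peel ", "and slice ", "potatoes.\n"]:
        await asyncio.sleep(0)  # simulate waiting on the network
        yield piece


async def consume() -> str:
    # Accumulate streamed chunks into the full response text
    parts = []
    async for chunk in fake_stream():
        parts.append(chunk)
    return "".join(parts)


print(asyncio.run(consume()))  # prints the assembled text
```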
## Development

### Setup

```shell
git clone https://github.com/sonic182/llm_async.git
cd llm_async
poetry install
```

### Running Tests

```shell
poetry run pytest
```

### Building

```shell
poetry build
```
## Roadmap
- Support for additional providers (e.g., Grok, Anthropic direct API)
- More advanced tool features
- Response caching and retry mechanisms
## Contributing
Contributions are welcome! Please open an issue or submit a pull request on GitHub.
## License
MIT License - see the LICENSE file for details.
## Authors
- sonic182